3.5.5

04-02-2014

New languages

  • Korean (all features but auto-categories)
  • Italian (all features but auto-categories)

New Features

  • Option to override native sentiment dictionary
  • Option to flatten all uppercase characters (to avoid uppercase words being recognized as proper nouns)
  • Option to overlap user entities: when a user defines several entities with the same name but different logic, Semantria will report only the last hit
  • Option to use anaphora resolution for named entity extraction
  • Option to stem queries in query-based categorization: Semantria will stem query terms
  • Option to fail on long sentences: Semantria will postpone long sentences (> 1000 words) without punctuation during analysis
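
To illustrate the long-sentence condition above, here is a minimal Python sketch of how a client could pre-check a document before queuing it. It is not Semantria's implementation: the function names, the choice of `.`, `!` and `?` as sentence-ending punctuation, and whitespace-based word counting are all assumptions made for illustration. Only the 1000-word threshold comes from the note above.

```python
import re

# Hypothetical client-side pre-check; not part of the Semantria API.
# Assumption: a "sentence" ends at '.', '!' or '?', and words are
# separated by whitespace.

def longest_unpunctuated_run(text: str) -> int:
    """Return the word count of the longest run without sentence-ending punctuation."""
    runs = re.split(r"[.!?]+", text)
    return max((len(run.split()) for run in runs), default=0)

def exceeds_limit(text: str, limit: int = 1000) -> bool:
    """True if any unpunctuated run is longer than `limit` words."""
    return longest_unpunctuated_run(text) > limit
```

A caller could use `exceeds_limit(document_text)` to decide whether to split or skip a document before sending it.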

Fixes

  • Duplicated API settings (entities, categories, queries, etc.) when there is a large number of resources
  • Wrong offsets reported for mentions in text containing multi-byte characters, across all output types
  • Incorrect values of language_score properties returned by Semantria for language detection
  • Request timeout when queuing large batches containing multi-byte characters
  • Inability to use punctuation marks within query terms during query-driven entity definition
  • Wrong content-type header sent by callback service along with the data
  • Punctuation marks added to the normalized form during query-driven entity definition
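
As background for the multi-byte offset fix above, the following Python sketch shows why character offsets and byte offsets diverge once a text contains multi-byte characters. The sample text is made up, and these notes do not state which unit Semantria's offsets use, so this only illustrates the general pitfall.

```python
# Hypothetical sample text; "\u00e9" (é) is one character but two bytes in UTF-8.
text = "Caf\u00e9 Semantria"
mention = "Semantria"

# Offset counted in characters (Unicode code points).
char_offset = text.index(mention)

# Offset counted in UTF-8 bytes: encode everything before the mention.
byte_offset = len(text[:char_offset].encode("utf-8"))

print(char_offset, byte_offset)
```

The two values differ by one here because of the single two-byte character before the mention. A client that slices a string with an offset expressed in the other unit will cut the mention at the wrong position.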