Configurations

A configuration in Semantria is a combination of language, API settings, and NLP tuning. It represents one way you want documents to be processed. This means you can have different configurations for different industry verticals ("sick" is not a sentiment-bearing word in drug research, for instance) or for different types of documents (tweets can be treated differently from news documents).

Note that you don't need to create a new configuration to process your data. You can use one of the default language or industry configurations included in your account.

Language

The most important configuration setting is language. Each configuration can have only one language specified, and that language cannot be changed once the configuration is created. This is because not all settings and features are supported for every language. You are responsible for determining the language of each document and sending it to a configuration for that language. Although Semantria can detect languages, we do not route documents based on language; we only report back the language we detected for the document.
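Because routing by language is left to the caller, a client typically keeps its own lookup from language to configuration. The following is a minimal sketch of that idea in Python. The configuration IDs, the CONFIG_BY_LANGUAGE table, and both function names are hypothetical placeholders chosen for illustration; they are not part of the Semantria API.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical mapping from language name to the ID of a configuration
# created for that language. These IDs are placeholders, not real values.
CONFIG_BY_LANGUAGE: Dict[str, str] = {
    "English": "config-id-english",
    "French": "config-id-french",
    "Spanish": "config-id-spanish",
}


def pick_config_id(language: str) -> Optional[str]:
    """Return the configuration ID for a language, or None if there is none."""
    return CONFIG_BY_LANGUAGE.get(language.strip().title())


def group_by_config(
    documents: Iterable[Tuple[str, str]]
) -> Dict[Optional[str], List[str]]:
    """Group (language, text) pairs by the configuration they should go to.

    Documents whose language has no configuration are collected under the
    key None so the caller can decide how to handle them.
    """
    batches: Dict[Optional[str], List[str]] = {}
    for language, text in documents:
        batches.setdefault(pick_config_id(language), []).append(text)
    return batches
```

Each batch can then be submitted to the configuration it is keyed by, while the documents grouped under None can be skipped or sent for language detection first.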

One Sentence Mode

The next most important setting in a configuration is one_sentence mode. When a configuration is in one_sentence mode, it adapts to the language commonly used in very short pieces of content such as tweets, Instagram updates, and many other types of status updates. In these types of content, punctuation and capitalization are often missing, and acronyms, emoji, and other types of shorthand are common. The one_sentence mode is designed to deal with these issues. We recommend not turning one_sentence mode on for content longer than 3 sentences.
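The 3-sentence guideline can be checked on the client before choosing which configuration to send a document to. The sketch below uses a deliberately naive sentence splitter based on terminal punctuation, so abbreviations and missing punctuation will throw the count off. The function names and the MAX_SENTENCES constant are assumptions for illustration, not Semantria settings.

```python
import re

# Naive sentence boundary: one or more of . ! ? followed by whitespace
# or the end of the text. This is an approximation, not a real tokenizer.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")

# Recommended upper bound for one_sentence mode, taken from the text above.
MAX_SENTENCES = 3


def count_sentences(text: str) -> int:
    """Count non-empty chunks between naive sentence boundaries."""
    parts = [p for p in _SENTENCE_END.split(text.strip()) if p.strip()]
    return len(parts)


def suits_one_sentence_mode(text: str) -> bool:
    """Return True if the text is short enough for one_sentence mode."""
    return count_sentences(text) <= MAX_SENTENCES
```

Text with no terminal punctuation at all, which is common in tweets, counts as a single sentence here and therefore passes the check.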

Processing Settings

Semantria supports multiple ways of interacting with the API. The Integration Scenarios section describes them in more detail. One important thing to note is that if you plan to use Excel with this configuration, the configuration must be in polling mode, not auto-response or callback mode.

Thresholds

Most of the other settings in a configuration are flags or thresholds. Flags control whether a particular output type (e.g., entities) is returned. Thresholds control how confident we have to be in a match before we return it to you. If you disable unwanted outputs, your documents will process a little bit faster and you won't have to parse the unwanted output.
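To illustrate what a confidence threshold does, the sketch below filters a list of matches on the client side. It only demonstrates the concept. The shape of the match records, the "confidence" key, and the choice to keep matches that are exactly at the threshold are all assumptions, and none of them reflects the actual Semantria response format.

```python
from typing import Dict, List


def apply_threshold(matches: List[Dict], threshold: float) -> List[Dict]:
    """Keep only matches whose confidence is at or above the threshold.

    Each match is assumed to be a dict with a numeric "confidence" key
    (a hypothetical shape used only for this illustration).
    """
    return [m for m in matches if m["confidence"] >= threshold]


# Hypothetical example data, not real service output.
example_matches = [
    {"title": "Acme Corp", "confidence": 0.92},
    {"title": "Acme", "confidence": 0.41},
    {"title": "Springfield", "confidence": 0.70},
]

# Raising the threshold returns fewer, more certain matches.
kept = apply_threshold(example_matches, 0.70)
```

With the example data, a threshold of 0.70 keeps "Acme Corp" and "Springfield" and drops "Acme".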