Three requirements for a multi-language verbatim analytics service

Sep 5, 2016

Many things are easy to do in multiple languages. For example, developing a multi-language website is simple: translate the text fields and add language symbols somewhere on the page. The market is full of platforms that support this process.

However, dealing with multi-language feedback text is more difficult:

Manual categorization of feedback is too slow, inconsistent and expensive;
Your staff might not know all the feedback languages; and
You seldom have the time and money to use human translators.

The good news is that there are automated services that turn difficult-to-interpret multi-language feedback into structured, statistical information in real time.

REQUIREMENTS FOR A MULTI-LANGUAGE FEEDBACK ANALYSIS SERVICE

1. It turns multi-language feedback into statistical information.

It is easy to extract keywords, even in many languages. The problem is that there are too many keywords and their meanings overlap, so raw keyword lists cannot be turned into actionable information. What is needed is semantic categorization of keywords across languages. This means that the analysis service is ontology-based, and preferably the ontologies have industry-specific categorization. Because keywords from every language map to the same categories, ontologies enable multi-language feedback to be reported in a single language, so knowledge of the source language is not necessary.
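To make the idea concrete, here is a minimal sketch of ontology-based categorization. Everything in it is an assumption made for illustration: the tiny `ONTOLOGY` table, the concept IDs, the `LABELS_EN` report labels and the `categorize` function are hypothetical and do not describe any particular service. A real ontology would be far larger and would rely on proper linguistic analysis rather than exact keyword lookup.

```python
from collections import Counter

# Hypothetical toy ontology: keywords in several languages map to one
# language-independent concept ID.
ONTOLOGY = {
    "delivery": "LOGISTICS.DELIVERY",
    "shipping": "LOGISTICS.DELIVERY",
    "toimitus": "LOGISTICS.DELIVERY",   # Finnish
    "lieferung": "LOGISTICS.DELIVERY",  # German
    "price": "PRICING.PRICE",
    "hinta": "PRICING.PRICE",           # Finnish
    "preis": "PRICING.PRICE",           # German
}

# Report labels in a single language (English here), one per concept.
LABELS_EN = {
    "LOGISTICS.DELIVERY": "Delivery",
    "PRICING.PRICE": "Price",
}


def categorize(keywords):
    """Count extracted keywords per ontology category, reported in English.

    Keywords that are not in the ontology are ignored.
    """
    counts = Counter()
    for keyword in keywords:
        concept = ONTOLOGY.get(keyword.lower())
        if concept is not None:
            counts[LABELS_EN[concept]] += 1
    return counts


# Keywords extracted from Finnish, German and English feedback.
report = categorize(["Toimitus", "Lieferung", "shipping", "hinta", "unknown"])
```

In this sketch the Finnish, German and English keywords for delivery all land in the same "Delivery" category, which is what lets the report be read without knowing the source languages.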

2. It detects the multi-language sentiment consistently.

The only method to achieve this goal is Semantic Natural Language Processing (NLP) technology. It is possible to analyze sentiment in one language using pre-defined phrases or mapped concepts and reach over 60% accuracy. But to analyze sentiment efficiently and uniformly across multiple languages, the service needs to understand language like a human does: what is a verb and what is a noun (morphology), what the grammatical relations between words are (syntax), what the semantic relations between words are (ontology) and what the tones of words are (sentiment). This approach consistently reaches over 80% sentiment accuracy.

3. It doesn’t use machine translation.

It is tempting to translate first and then analyze. After all, there are tools that extract keywords and sentiment in English, and there are good machine translation services such as those from Google and Microsoft. But this approach doesn’t work. The nuances required to extract the correct keywords and sentiment are often lost in machine translation, which easily causes a broken telephone effect.

Written by Matti Airas
