
Version 7 of the Rosette Linguistics Platform from Basis Technology may be unable to help you with publishing, but it is a leading candidate to meet your multilingual text analytics and search needs.
How does it work?

- The process begins with unstructured text: email, web pages, legacy databases.
- The language and encoding of the input text are automatically determined. Rosette 7 supports 55 languages.
- The input text is converted to Unicode to ensure correct display of the processed text. Rosette 7 converts 168 legacy encodings to Unicode.
- Unstructured Arabic, Asian and European text is analyzed morphologically and tagged appropriately.
- Names, dates, places and other entities are identified within the input text, a step known as entity extraction.
- English or foreign names are matched against a local database.
- Names from foreign languages are translated into English.
- The process completes with the output of normalized, tagged and structured data ready for publishing or further analysis.
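
To make the encoding-conversion step above more concrete, here is a minimal sketch in Python. It does not use Rosette or reflect how Rosette works internally; it relies only on the Python standard library, and the function name `to_unicode` and the short candidate list are assumptions made for illustration. It tries a few legacy encodings in order and returns normalized Unicode text along with the encoding that worked.

```python
import unicodedata

# Hypothetical, deliberately short candidate list for illustration only.
# A production system would detect among far more encodings.
CANDIDATE_ENCODINGS = ["utf-8", "shift_jis", "cp1256"]

def to_unicode(raw, candidates=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first candidate that decodes `raw`.

    The decoded text is normalized to NFC so that equivalent character
    sequences compare equal in later processing steps.
    """
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes; try the next
        return unicodedata.normalize("NFC", text), enc
    raise ValueError("no candidate encoding could decode the input")

# Example: bytes from a legacy Japanese source
text, encoding = to_unicode("こんにちは".encode("shift_jis"))
```

Trial decoding like this is only a rough heuristic, since many byte sequences are valid in more than one encoding; statistical detection is generally needed for reliable results on real-world text.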
Top 5 Use Cases
1) Search Engines
Rosette 7 integrates natively with Apache Lucene and Solr to enable enterprises to find and retrieve documents and data in multiple languages.
2) Legal e-Discovery
Legal teams can search across multiple languages during identification, processing, review and analysis phases of the electronic discovery reference model (EDRM).
3) Financial Compliance
Financial institutions can screen more accurately and efficiently (e.g. with fewer false positives) in their anti-money laundering and counter-terrorism financing initiatives.
4) Anti-Terrorism
Watch list accuracy improves as documents are screened in their original language rather than in their translated form.
5) Unstructured Data Mining
Businesses, both large and small, can process unstructured data -- which makes up the majority of data within an organization -- looking for trends, issues, and opportunities.
Do I Need Multilingual Search?
If your enterprise manages content in multiple languages, either internally for employees or externally for customers, then the short answer is yes.
Think about how frustrating it is to be unable to locate a document when you are only dealing with a single language. Now multiply that feeling by the number of languages your company supports or plans to support. Getting it now?
Is your company global? If so, how do you handle multilingual search? Let us know in the comments.