closing argument
Text mining and search applications are merging, much to the relief of law offices everywhere. PHOTO: Joe Perales

The worlds of search and text mining are moving closer together and creating a new market sector as a result. And law offices — among other sectors — are reaping the benefits.

Search applications act as signposts to a link in a website or to a document on a server in answer to a person's query — conceptually simple and of great utility. 

Text mining analyzes collections of documents, identifying relationships between terms in order to extract topics. Everything from sentiment analysis to looking for new drug pathways uses this approach.

The core difference between the two is scale: with search, people work within a far smaller set of documents than the hundreds of thousands typically analyzed in text mining.

Avoiding 'Error Through Boredom'

The legal profession, among other sectors, regularly needs to identify differences between multiple lengthy documents. The purpose might be to check a document's version history or to support a due diligence process ahead of an acquisition.

Unless you work in law or procurement, it's hard to appreciate the scale of contract documents. Many court cases involve the presentation of thousands of documents. Sifting through these manually is not only time-consuming but also prone to “error through boredom,” as a colleague of mine once described the process.

Microsoft Word and other word processing applications have offered document comparison for many years, but only for two documents at a time. Of all the Word features, this is the one I find most frustrating: it is so easy to lose track of the process.

Using a search-based application speeds up the process considerably. Many vendors offer applications that work across multiple documents, in multiple formats, completing the analysis and displaying an exception report within seconds. These applications are often SaaS products with free trials available. DocsCorp offers a concise checklist of what to look for in these products.
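To make the idea of an exception report concrete, here is a minimal sketch in Python that compares several versions of a document against one baseline and lists only the lines that differ. It is an illustration under stated assumptions, not a description of how any vendor's product works: the function name `exception_report`, the sample contract text and the line-by-line comparison are all hypothetical, and real applications also have to handle multiple file formats.

```python
import difflib


def exception_report(baseline, versions):
    """Map each version name to the lines that differ from the baseline.

    Hypothetical helper for illustration; versions that match are omitted.
    """
    base_lines = baseline.splitlines()
    report = {}
    for name, text in versions.items():
        diff = difflib.ndiff(base_lines, text.splitlines())
        # ndiff marks removed lines with "- " and added lines with "+ ";
        # unchanged lines ("  ") and hint lines ("? ") are dropped here.
        changes = [line for line in diff if line.startswith(("- ", "+ "))]
        if changes:
            report[name] = changes
    return report


# Made-up sample data: one baseline and two later versions.
baseline = "Term: 12 months\nFee: 100"
versions = {
    "v2": "Term: 12 months\nFee: 100",
    "v3": "Term: 24 months\nFee: 100",
}

for name, changes in exception_report(baseline, versions).items():
    print(name)
    for line in changes:
        print("   ", line)
```

Only the versions that differ appear in the report, which is the point of an exception report: the reviewer reads the differences rather than every document.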

Beyond Document Comparison to Document Analytics

In the early days of computer-based searching, KeyWord in Context (KWIC) indexes presented the titles of research papers with the query keyword aligned in the center of the page, so the reader could assess the position and implication of the keyword. 
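The KWIC layout is simple enough to sketch in a few lines of Python. The example below is an illustration only, assuming a hypothetical `kwic` function, made-up titles and a fixed amount of context on each side; historical KWIC indexes differed in their details.

```python
def kwic(titles, keyword, width=30):
    """Return one line per occurrence, with the keyword in a fixed column.

    Hypothetical helper for illustration. `width` is the number of
    characters of context kept on each side of the keyword.
    """
    key = keyword.lower()
    lines = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            # Ignore surrounding punctuation and case when matching.
            if word.lower().strip(".,;:()") == key:
                left = " ".join(words[:i])[-width:]
                right = " ".join(words[i + 1:])[:width]
                # Right-align the left context so every keyword lines up.
                lines.append(f"{left:>{width}} {word} {right}".rstrip())
    return lines


# Made-up titles for illustration.
titles = [
    "Search in the enterprise",
    "Enterprise search and text mining",
    "Mining legal text",
]

for line in kwic(titles, "search", width=12):
    print(line)
```

Because the left-hand context is padded to a fixed width, the keyword starts in the same column on every line, which is what lets a reader scan down the index.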

Search results today take this a step further, highlighting all occurrences of a word in a document. You've likely seen these results in Google Book Search, but the content is locked down, which prevents further queries. A number of vendors now offer search applications that deconstruct the document and present the query terms, at the user's choice, in a sentence, paragraph or page.

A good example of this approach comes from Nalytics, which indexed the manifestos of the political parties contesting the June 8, 2017 UK election and processed them so that a search for [pensions] presents all occurrences of the term on a sentence-by-sentence basis. Nalytics takes this approach further into what is, in effect, a document workplace, where people can add comments and share the content, supporting collaborative document review and analysis.
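A sentence-level search of this kind can be sketched as follows. This is not how Nalytics implements it; the function name `sentences_containing`, the naive sentence splitter and the sample text (which is invented and not taken from any manifesto) are assumptions made purely for illustration.

```python
import re


def sentences_containing(text, term):
    """Return the sentences in `text` that contain `term` as a whole word.

    Hypothetical helper for illustration. The splitter is deliberately
    naive: it breaks after '.', '!' or '?' followed by whitespace, so
    abbreviations such as "e.g." would be split incorrectly.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Invented sample text, not a quotation from any party.
sample = (
    "We will protect pensions. Taxes will not rise. "
    "Pensions rise each year!"
)

for sentence in sentences_containing(sample, "pensions"):
    print(sentence)
```

Swapping the sentence splitter for a paragraph or page splitter would give the other two presentation units mentioned above.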

'Ravn' Spreads its Wings

The inspiration for this article came from the news that document management application provider iManage acquired Ravn Systems. iManage boasts many customers in the legal business, and Ravn has been in the vanguard of using artificial intelligence and machine learning to provide document-level analytics.

The press release from Ravn about the acquisition includes some interesting case studies, including the UK Serious Fraud Office, which used Ravn to process 30 million documents at the rate of 600,000 a day. While there's definite overlap between this area and e-discovery, e-discovery applications don't share the same tight focus on this type of document-level analysis as companies like Nalytics and Ravn.

Innovation in the Legal Sector

Both law firms and corporate law departments have shown considerable interest in investing in technology that can improve margins and their competitive advantage. 

The April 2017 issue of Legal IT Insider lists 40 start-ups in this burgeoning sector, and Joanna Goodman’s book Robots in Law is well worth reading for its analysis of the role of AI in law. Having worked inside two major law firms in the last few years, I sense there is at last an appreciation of how search technology can make a difference.