AI in investigations
What tools are we using now, and what does the future hold?
Introduction
The range and sophistication of ‘smart’ technological solutions available for deployment in investigations have grown exponentially in recent years, in parallel with (and to a large extent in response to) the rise in data volumes. Thus, while the amount of evidence which potentially falls to be analysed and/or reviewed has grown significantly, so has firms’ ability to deal with it effectively using these evolving technologies. In this piece, we highlight some of the most useful elements of the AI toolkit which are being used in investigations today, and consider what this means for the future detection of criminal conduct.
Latest Tools
Some of the most effective techniques for interrogating data include:
- Continuous Active Learning (CAL) and Technology Assisted Review (TAR)
‘CAL’ and ‘TAR’ are different types of ‘machine learning’ that use coding decisions made by humans to determine the relevance of other similar documents in a data set that have not yet been manually reviewed. The private sector has long applied this technology in large investigations where full linear review is either unrealistic or not cost-effective within the time limits. The key to a successful outcome is the quality of the human input used to train the system. The accuracy and efficiency of the technology are now so widely accepted that, in the civil disclosure pilot scheme, parties are expected to explain why they are not using TAR when dealing with more than 50,000 documents. Widespread application in criminal law practice cannot now be far behind, especially given that we are seeing regular use of these tools by enforcement agencies such as the FCA.
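By way of illustration only, the feedback loop behind CAL can be sketched in a few lines of Python. This is a simplified sketch under stated assumptions, not a description of any commercial review platform: the documents, the `reviewer` function (standing in for a human reviewer's coding decision) and the simple word-weight scoring are all hypothetical, and real tools use far more sophisticated models.

```python
from collections import defaultdict


def tokenize(text):
    """Split a document into lower-case words (deliberately simplistic)."""
    return text.lower().split()


def score(weights, text):
    """Relevance score: sum of the learned weights of the distinct words."""
    return sum(weights[t] for t in set(tokenize(text)))


def update(weights, text, relevant):
    """Nudge word weights up for a relevant document, down otherwise."""
    delta = 1.0 if relevant else -1.0
    for t in set(tokenize(text)):
        weights[t] += delta


def continuous_active_learning(documents, seed_ids, reviewer, budget):
    """Repeatedly serve the highest-scoring unreviewed document to the
    reviewer and learn from each coding decision as it is made."""
    weights = defaultdict(float)
    coded = {}
    for doc_id in seed_ids:  # initial human-coded seed set
        coded[doc_id] = reviewer(doc_id)
        update(weights, documents[doc_id], coded[doc_id])
    order = []
    while len(order) < budget:
        remaining = [d for d in documents if d not in coded]
        if not remaining:
            break
        next_id = max(remaining, key=lambda d: score(weights, documents[d]))
        coded[next_id] = reviewer(next_id)
        update(weights, documents[next_id], coded[next_id])
        order.append(next_id)
    return order, coded


# Hypothetical documents, invented purely for this example.
documents = {
    "d1": "payment routed through offshore account as agreed",
    "d2": "lunch menu for friday team meeting",
    "d3": "offshore account transfer confirmed delete this email",
    "d4": "reminder friday team meeting moved to monday",
    "d5": "invoice for offshore consultancy payment attached",
    "d6": "car park closed on monday",
}
# Stand-in for the human reviewer's relevance decisions.
truth = {"d1": True, "d2": False, "d3": True,
         "d4": False, "d5": True, "d6": False}

order, coded = continuous_active_learning(
    documents, seed_ids=["d1", "d2"], reviewer=truth.get, budget=2)
print(order)
```

In this toy example the two documents served for review after the seed set are the two remaining relevant ones, which is the prioritisation effect that makes the approach attractive where full linear review is impractical. It also shows why the quality of human coding matters: every decision feeds straight back into the ranking.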
- Conceptual Analytics
Conceptual Analytics looks at the text of all the documents in a data set and groups the documents together by the similarity of the concepts that exist within the text. This can be used in conjunction with the above to enable a reviewer quickly to see prevalent themes or concepts within a data set. More sophisticated than simple keyword searching, it allows reviewers to see documents that are similar to other key documents but may not contain the keywords that were applied. This AI looks at how language is deployed in context, allowing the user to see a text’s different meanings or concepts within the individual data set (and to map the usage). This enables reviewers to see groupings of concepts and identify what email participants are discussing both generally and in the same context, revealing patterns and, on occasion, covert communications or code-words.
- Natural Language Processing (NLP)
NLP enables computers to understand and process human language and to get closer to a human-level understanding of the context behind the words being used. It gives us the ability to label pieces of text by category, such as dates, geo-political entities, people or places. It can enable more effective review of non-standard communication channels such as WhatsApp, Bloomberg Chat, WeChat and text messages, as NLP can contextualise casual communications and categorise them accordingly. Advanced applications of the technology allow us to train the NLP model to identify specific clauses or phrases within a document and extract those for priority review.
- Sentiment Analysis
One of the most interesting types of NLP which can now be used in the early stages of reviews is Sentiment Analysis. This categorises text to determine whether the communication has positive, neutral, or negative connotations (including analysis of the use of emojis). At present, we find it useful in identifying key time periods where negative or stressful communications tend to become more prevalent. We anticipate this being of benefit in the future for the detection or investigation of fraud and other criminal conduct, due to the impact of these behaviours on the demeanour of those involved (for example, where it creates stress, nervousness, anger, or elation).
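To make the idea of identifying key time periods concrete, the following is a minimal lexicon-based sketch in Python. It is an illustration under stated assumptions rather than the method used by any particular tool: the word and emoji lists, the messages and the dates are all invented for the example, and production systems typically rely on trained models rather than fixed word lists.

```python
from collections import defaultdict

# Hypothetical sentiment lexicons (words and emojis), for illustration only.
NEGATIVE = {"worried", "problem", "urgent", "angry", "panic", "😟", "😡"}
POSITIVE = {"thanks", "great", "happy", "good", "🙂", "👍"}


def sentiment_score(message):
    """Count positive terms minus negative terms in a message."""
    tokens = [t.strip(".,!?;:") for t in message.lower().split()]
    positives = sum(t in POSITIVE for t in tokens)
    negatives = sum(t in NEGATIVE for t in tokens)
    return positives - negatives


def label(score):
    """Map a numeric score to positive, neutral or negative."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def negative_share_by_month(messages):
    """Return, per month (YYYY-MM), the share of messages labelled negative."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for date, text in messages:
        month = date[:7]  # assumes ISO dates such as "2019-04-02"
        totals[month] += 1
        if label(sentiment_score(text)) == "negative":
            negatives[month] += 1
    return {m: negatives[m] / totals[m] for m in sorted(totals)}


# Invented messages; not drawn from any real matter.
messages = [
    ("2019-03-04", "thanks for the update 👍"),
    ("2019-03-18", "great result, happy with that"),
    ("2019-04-02", "worried about the audit 😟"),
    ("2019-04-09", "urgent problem with the ledger, call me"),
    ("2019-04-20", "meeting at ten"),
]

shares = negative_share_by_month(messages)
peak_month = max(shares, key=shares.get)
print(shares, peak_month)
```

Here the second month stands out because two of its three messages score as negative, which is the kind of signal that might prompt a reviewer to prioritise communications from that period.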
What’s coming down the road
Both clients managing investigations and enforcement agencies now expect that smarter technology will be used on every data set and all avenues explored, in order to increase the speed at which key issues can be identified, increase the quality of and confidence in those findings, and, for clients, to deliver cost savings in increasingly difficult operating environments.
We find that use of data analytics and machine learning in these ways has resulted in cost savings of 87% compared with traditional linear document review in investigations. See some of the results that our eDiscovery Solutions team has secured for our clients here.
Smarter, faster analysis has become the benchmark, rather than the exception. As law firms and technology specialists (such as our market-leading eDiscovery Solutions team, and the Simmons & Simmons – Wavelength partnership, see here for details) work ever more closely together to create innovative and bespoke solutions, we consider the sector is poised at the tip of the next wave of evolution in this area. How long will it be before the next generation of tools is deployed to anticipate criminal conduct in a widespread or systemic manner, rather than merely detect it after the event?











