The value of conceptual clustering: case study

We were able to identify 409 key documents without undertaking a costly and time-intensive manual review of a 57,000-document pool.

06 January 2022

Publication

The challenge

We were asked to help the legal team find the key documents relating to the events and claims at issue in the dispute, so that they could prepare their defence on a construction claim. This had to be done without manually reviewing thousands of documents and under a very tight deadline of only five days.

Our solution

We identified and collected 138,000 documents from the client's systems. Using our custom data processing workflow, we quickly reduced the data volume to 56,000 documents. We then used data analytics, such as communications analysis, conceptual analytics and clustering, to reduce the document population further to 17,000 documents.

Conceptual clustering groups together documents whose text contains similar concepts. Search terms and other analytics can then be run over the clusters to identify those that contain a high volume of potentially relevant documents. After identifying a few key documents, we were able to quickly locate two key clusters that contained 800 documents.
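To illustrate the general idea, the following is a minimal sketch in Python and not the tooling used on this matter. It assumes a simplified stand-in for conceptual analytics: documents are compared by weighted word overlap (TF-IDF with cosine similarity) and grouped with a simple greedy method, whereas commercial review platforms typically use richer concept models. The sample documents, the stop-word list, the similarity threshold and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; real tools use far larger ones.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "on", "the", "to"}


def tokenize(text):
    """Lower-case the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        vec = {t: tf * math.log((1 + n) / (1 + doc_freq[t])) for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cluster(vectors, threshold=0.15):
    """Greedy leader clustering (an assumed, simplified method).

    Each document joins the cluster whose first member it most resembles,
    or starts a new cluster if no similarity reaches the threshold.
    """
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def clusters_with_seeds(clusters, seed_indices):
    """Return the clusters that contain at least one known key document."""
    seeds = set(seed_indices)
    return [members for members in clusters if seeds & set(members)]


# Made-up sample documents for illustration only.
docs = [
    "Delay notice for concrete pour at the east tower",
    "Concrete pour delayed again, revised programme for east tower",
    "Invoice for catering at the staff party",
    "Staff party catering menu and invoice",
    "Variation claim for delay to east tower concrete works",
]

clusters = cluster(tfidf_vectors(docs))
# Document 0 is treated as a key document that has already been identified.
key_clusters = clusters_with_seeds(clusters, seed_indices=[0])
print(clusters)      # [[0, 1, 4], [2, 3]]
print(key_clusters)  # [[0, 1, 4]]
```

In this toy example, one known key document (index 0) is enough to surface the whole construction-delay cluster, while the unrelated catering documents fall into a separate cluster that can be set aside.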

The outcome

Upon review of these two key clusters, we were able to identify 409 key documents needed for the defence without undertaking a costly and time-intensive manual review of the entire 57,000-document pool.
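As a rough illustration of the scale of the saving, the short Python sketch below works through the arithmetic using only the figures stated in this case study. The variable names are ours, and the calculation measures review effort only.

```python
# Figures as stated in this case study.
pool_size = 57_000    # documents in the pool
cluster_docs = 800    # documents in the two key clusters
key_docs = 409        # key documents identified on review

reviewed_share = cluster_docs / pool_size  # share of the pool reviewed
key_rate = key_docs / cluster_docs         # share of reviewed documents that were key

print(f"Reviewed {reviewed_share:.1%} of the pool")             # Reviewed 1.4% of the pool
print(f"{key_rate:.1%} of the reviewed documents were key")     # 51.1% of the reviewed documents were key
```

On these figures, roughly one document in seventy was reviewed, and about half of those reviewed turned out to be key.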

This document (and any information accessed through links in this document) is provided for information purposes only and does not constitute legal advice. Professional legal advice should be obtained before taking or refraining from any action as a result of the contents of this document.