In the age of information overload, the document review process for legal cases has become an increasingly tedious and time-consuming task. Law firms may have hundreds of boxes of materials and millions of electronic documents to sift through for each case. Teams of entry-level attorneys and paralegals traditionally had to painstakingly review every page, manually redacting confidential information and highlighting relevant passages.
This manual review was mind-numbingly repetitive. As one former BigLaw associate described it, "Doc review was an endless slog through piles of irrelevant emails and memos unrelated to the case issues. After staring at documents on my screen for 10 hours a day, my eyes would be bleary and I wouldn't retain any useful information." The crushing boredom often led to high turnover among review attorneys.
In addition to being tedious, manual review was prone to human error and inconsistencies. According to a study by Herbert L. Roitblat, Anne Kershaw, and Patrick Oot, the average agreement between two different attorney reviewers on document relevance was only around 60%. Fatigue and lapses in concentration resulted in important details being missed. This impacted case strategy and could leave firms vulnerable if uncaught.
The traditional document review process was also extremely time-intensive and expensive. Estimates indicated that attorneys only reviewed around 50-100 documents per hour on average. For large corporate lawsuits with tens of thousands of documents, a first-pass review could take months and cost millions in attorney fees. Client budgets were quickly exceeded, putting financial pressure on law firms.
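To make the time estimate concrete, here is a minimal arithmetic sketch in Python. The 50,000-document collection, the single reviewer, and the 8-hour working day are illustrative assumptions, not figures from any cited study; only the 50-100 documents per hour range comes from the text above.

```python
def review_hours(num_documents, docs_per_hour):
    """Hours needed for one first-pass review at a steady reviewing rate."""
    if docs_per_hour <= 0:
        raise ValueError("docs_per_hour must be positive")
    return num_documents / docs_per_hour


def working_days(hours, reviewers=1, hours_per_day=8):
    """Convert total review hours into working days for a team (assumed sizes)."""
    return hours / (reviewers * hours_per_day)


if __name__ == "__main__":
    # Hypothetical collection size; the rates are the 50-100 range quoted above.
    for rate in (50, 100):
        hours = review_hours(50_000, rate)
        print(rate, "docs/hour:", hours, "hours,", working_days(hours), "days for one reviewer")
```

Under these assumptions a single reviewer needs between 500 and 1,000 hours, which is why first-pass review on even a mid-sized matter stretches over months unless the team is large.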
The development of AI document review platforms has been a game-changer for the legal industry. These tools can analyze and classify documents far faster and more consistently than any human reviewer. An AI review platform can ingest hundreds of thousands of documents, identify duplicates, categorize by topic, detect document types, redact sensitive data, and flag relevant passages - all within hours.
Studies by companies developing AI review software have shown staggering productivity gains over manual techniques. For example, one test by eBrevia showed their contract review platform achieved 85% accuracy at 700 documents per hour. Measured against the 50-100 documents per hour cited above for human reviewers, that is roughly 7 to 14 times faster. The RAVN ACE platform claims it can classify 30,000 documents per hour at 93% accuracy. Some studies even indicate AI review is on par with or better than human accuracy due to the lack of fatigue or boredom.
This massive speed advantage translates to huge cost savings for law firms. DLA Piper lawyer Jeroen Plink estimated that AI review platforms reduced the first-pass document review costs by 30-90% compared to manual methods. For cases with hundreds of thousands of documents, this meant saving hundreds of thousands or even millions in attorney fees. Firms could still have associates check samples for quality control without paying for an exhaustive page-by-page review.
The cost savings are proving highly attractive for clients as well. Shell's managing counsel estimated they achieved $2 million in savings by using predictive coding instead of manual review methods for an FCPA investigation. With litigation costs continuing to rise, AI review provides a way for firms to stay within client budgets and avoid write-downs. As UnitedLex CMO Nicholas d'Adhemar explained: "Before, you'd have rooms full of junior lawyers slogging away for months. Now you can use AI to find the most relevant information quickly and cheaply."
Predictive coding, also known as technology-assisted review, is one of the most important AI techniques revolutionizing document review. With predictive coding, attorneys provide the AI with example documents that are relevant to the case issues. The AI uses these to build a model that identifies other documents with similar characteristics. This allows the AI platform to rapidly classify large datasets and determine relevance with far greater accuracy than keyword searches.
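The train-then-classify idea described above can be sketched with a toy text classifier. This is a minimal illustration only: it uses a small multinomial Naive Bayes model written from scratch, which is an assumption chosen for brevity and not the algorithm of any particular review platform. The class name `RelevanceModel` and the example documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class RelevanceModel:
    """Toy multinomial Naive Bayes trained on attorney-coded example documents."""

    def fit(self, docs, labels):
        # labels are booleans: True = relevant, False = not relevant.
        # Both classes must appear at least once in the training set.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)  # class prior
        total_words = sum(self.word_counts[label].values())
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
                count = self.word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(self.vocab)))
        return score

    def probability_relevant(self, text):
        tokens = tokenize(text)
        pos = self._log_score(tokens, True)
        neg = self._log_score(tokens, False)
        top = max(pos, neg)  # subtract the max before exponentiating for stability
        p, n = math.exp(pos - top), math.exp(neg - top)
        return p / (p + n)


if __name__ == "__main__":
    seed_docs = [
        "engineering memo on the brake defect and recall risk",
        "email warning that the brake design is faulty",
        "lunch menu for the office party on friday",
        "reminder to submit your parking permit form",
    ]
    seed_labels = [True, True, False, False]
    model = RelevanceModel().fit(seed_docs, seed_labels)
    print(model.probability_relevant("the brake defect was known"))
```

The point of the sketch is the workflow, not the model: attorneys supply labeled examples, the model learns which vocabulary separates relevant from non-relevant documents, and every unreviewed document then receives a relevance probability.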
Several studies have proven the power of predictive coding to reduce document review costs without sacrificing quality. Maura Grossman and Gordon Cormack's 2011 study found predictive coding achieved 95% recall of relevant documents while reviewing only 20% of the total dataset. This equated to a 50-fold savings in review effort over exhaustive manual review. A 2013 study by Herbert Roitblat, Anne Kershaw and Patrick Oot showed predictive coding achieved 99% recall and precision on a tobacco litigation document collection.
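Recall and precision, the two measures quoted in these studies, have simple definitions: recall is the share of truly relevant documents that the review found, and precision is the share of documents marked relevant that really are relevant. A minimal sketch follows; the function name and the document IDs are hypothetical.

```python
def recall_and_precision(predicted_relevant, truly_relevant):
    """Return (recall, precision) for two collections of document IDs."""
    predicted = set(predicted_relevant)
    truth = set(truly_relevant)
    true_positives = len(predicted & truth)
    # Guard against empty sets so the function never divides by zero.
    recall = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


if __name__ == "__main__":
    # The system flagged documents 1-4; documents 2-6 were actually relevant.
    print(recall_and_precision({1, 2, 3, 4}, {2, 3, 4, 5, 6}))
```

In this made-up example the system found 3 of the 5 relevant documents (recall 0.6) and 3 of its 4 flags were correct (precision 0.75).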
Law firms using predictive coding typically have attorneys review a small random sample of documents categorized as non-relevant by the AI. If the error rate is acceptably low, the remaining non-relevant documents can be set aside without individual review. This allows attorneys to focus their limited time on the documents flagged as most important by the AI.
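The sampling check described above, sometimes called an elusion test, can be sketched as follows. This is an illustration under stated assumptions: the interval uses the simple normal approximation, which is rough when the observed rate is near zero, and the sample size, seed, and function names are hypothetical choices rather than a recommended protocol.

```python
import math
import random


def draw_sample(doc_ids, sample_size, seed=0):
    """Draw a random sample (without replacement) from the set-aside pile."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced later
    pool = list(doc_ids)
    return rng.sample(pool, min(sample_size, len(pool)))


def elusion_estimate(sample_labels, z=1.96):
    """Estimate the share of relevant documents hiding in the set-aside pile.

    sample_labels: booleans, True where an attorney coded the sampled
    document as relevant. Returns (rate, lower bound, upper bound) using a
    normal-approximation interval (z=1.96 is roughly 95% confidence).
    """
    n = len(sample_labels)
    if n == 0:
        raise ValueError("sample is empty")
    rate = sum(sample_labels) / n
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)


if __name__ == "__main__":
    sample_ids = draw_sample(range(100_000), 400)
    # Hypothetical attorney coding: 8 of the 400 sampled documents were relevant.
    labels = [True] * 8 + [False] * 392
    print(len(sample_ids), elusion_estimate(labels))
```

If the upper bound of the estimate is below whatever error rate the parties consider acceptable, the rest of the set-aside pile can be left unreviewed; otherwise the model needs more training.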
Maura Grossman notes that unlike exhaustive manual review, predictive coding is statistically defensible: "You can empirically demonstrate the results are at least as accurate as human review, and it frees up lawyers to practice law rather than staring at documents."
While some clients initially resisted predictive coding, it is now gaining mainstream acceptance. Ralph Losey's survey found adoption of predictive coding more than doubled from 37% of law firms in 2015 to 78% by 2017. In high-profile cases such as Irish Bank Resolution Corporation, parties won court approval to use predictive coding. As clients experience the benefits firsthand, objections are fading.
However, successfully implementing predictive coding requires investing time upfront in training the AI. Grossman emphasizes: "You have to proceed thoughtfully, methodically, and spend enough time training the computer." Firms must ensure the sample documents used to train the AI cover the full scope of issues and document types. Ongoing quality checks are also needed to catch any errors and retrain the algorithm. With proper implementation, predictive coding delivers immense advantages. But taking shortcuts risks degrading the quality that makes it superior to manual review.
While the initial training provides a solid foundation, predictive coding systems continue to improve their accuracy through continuous active learning. The key advantage of active learning is that it minimizes the input required from human reviewers to maximize the AI's effectiveness.
In active learning, the algorithm selects borderline documents it is uncertain how to categorize and asks attorneys to code them as relevant or not. The attorney feedback is incorporated into the predictive model, allowing it to resolve its uncertainties. This creates a cycle where attorney input is targeted to the documents that provide the most value in refining the AI.
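The selection step described above is often implemented as uncertainty sampling: pick the documents whose predicted relevance is closest to 50%. A minimal sketch, assuming the model already produced a probability for each unreviewed document; the function name, document IDs, and scores are hypothetical.

```python
def select_for_review(scores, batch_size):
    """Return the document IDs the model is least sure about.

    scores: dict mapping document ID -> predicted probability of relevance.
    Documents are ordered by distance from 0.5, with the ID as a tie-breaker
    so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda doc_id: (abs(scores[doc_id] - 0.5), doc_id))
    return ordered[:batch_size]


if __name__ == "__main__":
    scores = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.45, "e": 0.70}
    # "b" and "d" sit nearest the decision boundary, so they go to attorneys first.
    print(select_for_review(scores, 2))
```

After attorneys code the selected batch, the new labels are added to the training set, the model is refit, and the scores are recomputed before the next batch is chosen.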
According to Grossman and Cormack's 2017 study, only 5% of a dataset needs to be reviewed for continuous active learning to achieve recall comparable to exhaustive manual review. This makes the process incredibly efficient. As Maura Grossman explains, "Rather than asking attorneys to look at hundreds of thousands of documents to find the 5% that are relevant, we're asking them only to look at 5% of the documents in order for the computer to find the 5% that are relevant."
Active learning has proven particularly effective at improving recall for tricky issues like identifying privileged communications. In a case study, Disco engineers Kamran Bina and Mike Pottenger demonstrated active learning improved recall on privileged documents from 75% to over 99% after reviewing just 3,500 documents. Without active learning, complete manual review of the 82,000 document dataset would have been needed to achieve comparable recall.
Furthermore, active learning allows the predictive coding process to adapt over time. As new types of documents and issues emerge, continuous active learning detects these edge cases and asks for attorney feedback to resolve them. Without active learning, the predictive model would become stale and inaccurate as the case evolves.
However, effective implementation of active learning requires strategic selection of documents for attorney review. Bulk selection of random samples is inefficient. The key is choosing documents about which the algorithm is uncertain. This forces the model to confront its weak points instead of just improving performance on easier documents.
With massive document collections, not all documents hold equal importance or value to the case issues. Certain key documents may speak directly to the core legal questions or provide pivotal evidence. Meanwhile, other materials like routine emails or administrative records may have little relevance.
Manually identifying the most critical documents was like finding needles in a haystack. Associates had no choice but to review documents in no particular order and hope important items surfaced before their eyes glazed over. This wasted precious reviewer time and attention on documents with negligible impact.
Predictive coding systems can analyze documents and rank them by likely relevance. This avoids having reviewers slog through non-essential documents just because they happen to appear earlier in the dataset. Instead, attorneys focus efforts on the identified hot documents with the greatest legal and evidentiary significance.
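Ranking by likely relevance amounts to sorting documents by the model's score and reviewing from the top. A minimal sketch; the 0.8 "hot document" threshold, the function name, and the scores are illustrative assumptions.

```python
def prioritize(scores, hot_threshold=0.8):
    """Order documents from most to least likely relevant.

    scores: dict mapping document ID -> predicted probability of relevance.
    Returns (review_order, hot_documents), where hot_documents are those
    scoring at or above the threshold.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    review_order = [doc_id for doc_id, _ in ranked]
    hot_documents = [doc_id for doc_id, score in ranked if score >= hot_threshold]
    return review_order, hot_documents


if __name__ == "__main__":
    scores = {"memo-17": 0.91, "email-03": 0.12, "email-44": 0.85, "invoice-9": 0.40}
    order, hot = prioritize(scores)
    print(order)  # review queue, best candidates first
    print(hot)    # documents routed to senior attorneys
```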
Joe Looby, Applied Discovery's Director of Document Review Services, explained the advantages: "With predictive coding, we frontload reviewer focus on the documents that really matter to the case. This gets more value from every document reviewed and every hour spent."
Prioritization is especially critical for queries involving key names, dates, or case issues. The system can instantly surface any documents hitting on the high priority search terms and validate they require close scrutiny. This replaces inefficient and error-prone manual searches.
According to FTI Consulting MD Aaron Wolff, prioritization helps attorneys home in on vital details: "Rather than getting lost in millions of documents, predictive coding lets attorneys work from the ground up to build their case by focusing on the most important evidence."
Of course, some relevant gems will still hide in lower-priority documents. Quality control checks on random samples remain essential to catch these needles in the haystack. But prioritization allows reviewers to find the pivotal evidence faster so more time can be dedicated to compiling a winning case.
Duke Law research by Herbert Roitblat found prioritization could achieve 85% of maximum recall while reviewing only 48% of a tobacco litigation document collection. The remaining low-priority documents contributed diminishing value. Prioritization delivered the best balance between review effort and recall.
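The trade-off between review effort and recall can be measured by asking how much of the relevant material is found after reviewing only the top fraction of a ranked list. A minimal sketch with made-up document IDs; it does not reproduce the cited study's data.

```python
import math


def recall_at_depth(ranked_ids, relevant_ids, fraction):
    """Recall achieved by reviewing only the top `fraction` of a ranked list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    cutoff = math.ceil(len(ranked_ids) * fraction)  # number of documents reviewed
    reviewed = set(ranked_ids[:cutoff])
    return len(reviewed & relevant) / len(relevant)


if __name__ == "__main__":
    ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
    relevant = {"d1", "d2", "d4", "d9"}
    for fraction in (0.2, 0.5, 1.0):
        print(fraction, recall_at_depth(ranked, relevant, fraction))
```

Plotting this value for increasing fractions shows the diminishing returns described above: most relevant documents surface early, and the tail of the ranking adds little.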
The key to effective prioritization is selecting representative documents for initial training. Real Advantage VP Craig Carpenter explains: "If you don't sufficiently represent less common but highly relevant document types in your seed sets, you risk missing key evidence buried in the long tail." Prioritization reflects the data it's given - the better the input, the better the output.
A key advantage of AI review platforms is the ability to automatically redact confidential or privileged information that would be harmful if disclosed. This eliminates significant risk and liability compared to manual redaction. As Goodwin Procter partner Craig Stewart commented, "The tedium of document review makes it very difficult for humans to maintain concentration when redacting confidential data. AI doesn't blink."
For law firms, failing to redact privileged communications like attorney-client discussions can constitute malpractice. It also waives privilege, allowing harmful information to be used against the client. But with millions of documents to review, even careful attorneys may miss privileged materials. This oversight can have catastrophic consequences if compromising information is then produced to opposing counsel.
However, common redaction mistakes are easy for predictive coding algorithms to catch. The AI can rapidly identify privileged document types and communication participants. Any documents hitting on privilege indicators are flagged for attorney verification before release. For example, RAVN ACE grades documents for privilege risk and isolates those requiring redaction into a separate workflow.
The AI can also match documents against lists of confidential client information like names, account numbers and addresses. These are redacted without any need for manual review. Disco's Kamran Bina explained that by leveraging the platform's existing document analysis, adding auto-redaction took just two weeks of development work. The exponential increase in speed and accuracy over manual methods delivered huge risk reduction benefits.
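List-based redaction of this kind can be sketched with regular expressions. This is a simplified illustration, not how any named platform works: the term list, the account-number pattern, and the `[REDACTED]` mask are all hypothetical, and exact-match lists miss misspellings and paraphrases, which is one reason attorney verification still matters.

```python
import re


def redact(text, confidential_terms, extra_patterns=(), mask="[REDACTED]"):
    """Mask listed terms and pattern matches in a single pass over the text.

    confidential_terms: literal strings (names, addresses) matched exactly,
    ignoring case. extra_patterns: regular expressions for structured data
    such as account numbers; they are also matched case-insensitively.
    """
    # Longest terms first so a full name wins over a shorter overlapping term.
    literals = sorted((t for t in confidential_terms if t), key=len, reverse=True)
    parts = [re.escape(term) for term in literals]
    parts += [f"(?:{pattern})" for pattern in extra_patterns]
    if not parts:
        return text  # nothing to redact
    return re.sub("|".join(parts), mask, text, flags=re.IGNORECASE)


if __name__ == "__main__":
    text = "Wire $500 from account 1234-5678 for Jane Doe at Acme Corp."
    terms = ["Jane Doe", "Acme Corp"]
    patterns = [r"\b\d{4}-\d{4}\b"]  # hypothetical account-number format
    print(redact(text, terms, patterns))
```

Combining everything into one alternation means the text is scanned once, and the mask itself can never be re-matched by a later term.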
For clients, improper disclosure of confidential business data like trade secrets or personnel records can be catastrophic. It exposes vulnerable information to competitors and invites legal claims or regulatory penalties. Yet exhaustive manual review cannot guarantee all sensitive materials will be caught. There are simply too many documents for human reviewers to catch everything.
Predictive coding ensures only legally defensible document samples are produced, never the full corpus. The AI instantly redacts identified confidential information so only the curated relevant excerpts are shared. Clients gain assurance their private data stays protected. According to LawGeex's Ilan Admon, a legal AI expert, "Maintaining privilege is an advantage computers have over human lawyers. AI can redact thousands of documents in seconds with perfect accuracy."
Of course, some nuance is still required in redacting for relevance rather than pure confidentiality. Kamran Bina gave the example of an email stating "Attached is the signed NDA." The attachment contains privileged communication, but the email itself merely conveys a factual update. Indiscriminate redaction would obscure this benign information. AI judgment calls always demand human oversight.
Proper redaction implementation also requires training the AI on what content must be redacted for each client. Politics & Law Practice Leader Jay Brinker notes that redaction rules vary: "Our pharmaceutical clients want every patient name confidential, but our automotive clients don't require VIN numbers be redacted." The versatility of AI systems allows configuring redactable fields for each client's unique needs.
In addition to analyzing factual content, AI review platforms can extract useful insights from the subjective tone and sentiment in legal documents. This allows attorneys to gauge emotions, attitudes, and relationships that may influence the case.
According to legal writing expert Anne Enquist, a key skill for lawyers is strategically using rhetoric and language to advance arguments. The style and emotional sentiment of legal briefs and client documents provide clues into the mindset of the authors. Attorneys who recognize these subtleties can better tailor arguments to persuade or placate.
For example, harshly attacking language in an opponent's brief may indicate frustration or contempt, signaling areas to press the advantage. Overly deferential wording from a client may reveal reluctance or discomfort disclosing pertinent facts. Without appreciating these undercurrents, lawyers miss opportunities to adapt strategies.
However, manually evaluating subjective writing nuances across massive document sets is impractical. There are simply too many documents for attorneys to pick up on subtle cues. People are also prone to projection and confirmation bias when judging sentiment.
AI natural language processing overcomes these limitations through sentiment analysis algorithms. These evaluate the emotional associations of vocabulary to assess if a document conveys positive or negative sentiment. Words are classified based on psychological ratings of emotional valence. Phrases with contextually positive connotations like "fantastic result" or negative connotations like "costly mistake" shape the document's overall sentiment score.
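The word-valence approach described above can be sketched as a lexicon lookup. The tiny lexicon and its scores below are invented for illustration; real systems draw on much larger rated word lists and account for context and negation, which this sketch ignores.

```python
import re

# Hypothetical valence ratings: positive numbers are pleasant, negative unpleasant.
VALENCE = {
    "fantastic": 2.0,
    "pleased": 1.0,
    "agree": 0.5,
    "costly": -1.5,
    "mistake": -2.0,
    "unacceptable": -2.0,
}


def sentiment_score(text, lexicon=VALENCE):
    """Average valence of the rated words in a document (0.0 if none appear)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    rated = [lexicon[token] for token in tokens if token in lexicon]
    return sum(rated) / len(rated) if rated else 0.0


if __name__ == "__main__":
    for message in ("A fantastic result", "This was a costly mistake", "See attached"):
        print(message, "->", sentiment_score(message))
```

Scoring each message in a thread this way and watching the score over time is one simple way to surface a trend such as increasingly hostile language.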
According to Disco product manager John Tavenner, this provides attorneys data-driven insights into subjective factors: "Sentiment analysis can accurately and objectively quantify attitudes at a scale no human could match. The AI isn't colored by personal biases or blind spots."
LawGeex VP Noah Waisberg shared an example of sentiment analysis providing pivotal emotional context. His client was negotiating a contentious business dispute with an aggressive counterparty. The AI flagged increasingly hostile language over successive communications. This revealed fraying relations and prompted preemptively mending the relationship before talks completely broke down.
As predictive coding and other AI techniques transform document review, lawyers are also looking ahead to the future evolution of these technologies. How can continued improvements in machine learning further accelerate discovery and due diligence? What emerging methods show the most promise? Stakeholders across the legal industry are eager to capitalize on the next wave of innovation.
Many experts predict AI will gradually take on higher-level tasks beyond basic document classification and clustering. The algorithms may learn to directly answer common attorney questions about the evidence and identify the most relevant documents to support conclusions. This would allow moving from simple passive review to AI-driven analysis and insights.
According to legal AI scientist Emily Berman, "The system could start with a question like 'What communication shows the defendants knew about the faulty product design?' Then it could instantly retrieve the pivotal emails and point out the critical passages." This would amplify the value derived from document collections.
LawGeex CEO Noory Bechor predicts AI platforms will become virtual associates: "Attorneys could describe a legal argument they want to make, and the AI can pull together all the necessary precedent and evidence." This curation could accelerate early case research and strategy development.
Other experts believe AI may someday guide resource allocation during discovery. UCL Professor Daniel Katz said algorithms could predict which custodians and document sources are most likely to yield legally relevant materials. This would allow focusing collection and review on high-value data instead of wasting time on marginal sources.
According to Relativity CEO Mike Gamson, "Predictive coding today is mostly about classification, but we see promise in using AI to drive and optimize the entire discovery process." He believes there are still major efficiency gains to be reaped by AI-powered automation.
However, lawyers emphasize AI cannot replace human judgment and oversight. As Goodwin Procter Partner David Horowitz commented, "The AI excels at surfacing the pertinent information. But attorneys must still interpret how it applies to the legal questions and case theory." Human creativity and strategic thinking still drive how evidence is synthesized and arguments are crafted. AI does not autonomously handle ill-defined legal gray areas.