Less Legwork, More Insight: How AI Streamlines eDiscovery and Cuts Costs

Published: November 2, 2023 • legalpdf.io

Automating Document Review Saves Billable Hours

Discovery is one of the most labor-intensive and expensive stages of litigation. Attorneys must pore over massive sets of documents to identify those relevant to the case. Manually reviewing emails, contracts, medical records and other files is extremely time-consuming.

Traditionally, junior associates would conduct first-pass review, flagging potentially relevant items for partners to analyze. With today’s larger data volumes, this full human review model is becoming prohibitively expensive. Clients balk at paying high hourly rates for low-level document classification.

AI-powered tools like predictive coding and analytics now automate parts of discovery, significantly reducing the billable hours required. As one example, an Am Law 200 firm used an AI platform for a case with over 2 million documents. The software automatically flagged 70% of items as non-relevant. By focusing review on the remaining 30%, the firm reduced attorney hours by over 80%.

Leading eDiscovery provider Reveal Data shared that manual review costs $4,000 on average per gigabyte of data. Their AI tools brought these costs down 50-90%, delivering over $1 million in savings for a client managing terabytes of data. The Vice President and Managing Counsel at an insurance company reported the predictive coding they employed reduced expense by 30% compared to exhaustive human review.
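
As a rough illustration of the arithmetic behind these figures, the sketch below applies the quoted $4,000-per-gigabyte average and the 50-90% reduction range to a hypothetical 1 TB data set (taken here as 1,000 GB). The data volume and the function name are assumptions made for the example, not figures reported by Reveal Data.

```python
def review_cost_range(gigabytes, cost_per_gb=4000.0, savings_low=0.50, savings_high=0.90):
    """Return (manual cost, minimum savings, maximum savings) in dollars."""
    manual = gigabytes * cost_per_gb
    return manual, manual * savings_low, manual * savings_high


# Hypothetical matter: 1 TB of data, treated as 1,000 GB.
manual_cost, min_savings, max_savings = review_cost_range(1000)
print(f"Manual review: ${manual_cost:,.0f}")
print(f"Savings range: ${min_savings:,.0f} to ${max_savings:,.0f}")
```

Under these assumptions a single terabyte already implies savings in the millions, which is in line with the "over $1 million" figure quoted for a client managing terabytes of data.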

Lightning-fast AI review also accelerates case timelines. A large construction company faced a tight discovery deadline in a multimillion-dollar case. By deploying predictive coding, their law firm reduced the review timeline from months to just weeks. The expedited process allowed the parties to meet the court-ordered schedule.

While AI excels at high-volume document classification, human expertise is still needed to make subtler judgment calls. The key is combining AI and human effort. Tools flag potentially relevant items, then attorneys review these to make final relevance determinations. This allows humans to focus their specialized skills where they have the most impact.

Predictive Coding Narrows Down Relevant Documents

Predictive coding leverages machine learning to pinpoint the documents most likely to be relevant in discovery. This AI technique is a game-changer for narrowing massive data sets down to manageable subsets.

In predictive coding, attorneys first manually code a small sample set of documents as relevant or not. Algorithms analyze these examples to identify patterns that distinguish the relevant items. The software then applies these patterns to code new documents it hasn’t seen before.

Each round of manual review allows the machine learning model to progressively improve its accuracy. After a few iterations, predictive coding can automatically and efficiently filter data sets down to the most important documents.
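
The train-then-apply loop described above can be sketched in a few lines. This is a minimal illustration, assuming a simple Naive Bayes word-count model and made-up sample documents; commercial predictive coding platforms use their own, more sophisticated models, and the class and variable names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyRelevanceModel:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, coded_docs):
        """coded_docs: (text, is_relevant) pairs coded by attorneys.
        Call again with new pairs to add another training round."""
        for text, is_relevant in coded_docs:
            self.doc_counts[is_relevant] += 1
            self.word_counts[is_relevant].update(tokenize(text))

    def relevance_score(self, text):
        """Estimated probability (0 to 1) that the document is relevant."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in (True, False):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior, so an empty class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                if word in vocab:  # ignore words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score[False] - log_score[True], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Hypothetical seed set coded by attorneys (True = relevant).
seed_set = [
    ("Please wire the early loan repayment before the audit", True),
    ("Loan repayment terms must stay off the record", True),
    ("Lunch menu for the office party on Friday", False),
    ("Reminder: parking garage closed this weekend", False),
]
model = TinyRelevanceModel()
model.train(seed_set)

# Documents the model has not seen, ranked from most to least likely relevant.
unreviewed = [
    "Friday party parking instructions",
    "Revised loan repayment schedule attached",
]
ranked = sorted(unreviewed, key=model.relevance_score, reverse=True)
for doc in ranked:
    print(f"{model.relevance_score(doc):.2f}  {doc}")
```

Each later review round would call `train` again with the newly coded documents, which is how the iterative improvement described above would show up in this toy version.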

A property insurance company faced reviewing over 3 million records for multiple lawsuits related to 2017 California wildfires. Using predictive coding, their outside counsel rapidly narrowed this universe down to the most relevant 500,000 documents.

The client was extremely satisfied that critical evidence was identified upfront, before substantial fees were incurred in prolonged human review. This focused attention on the most legally significant aspects of the case and strengthened their litigation strategy.

In a trade secrets case, one party initially produced over 8 million pages of records. Their law firm leveraged predictive coding to cull this down to 2.5 million potentially relevant documents. This massive reduction allowed them to efficiently direct reviewer attention and proceed to the next stages of discovery.

Predictive coding is especially valuable for filtering large email data sets. In a case between two telecom companies, it reduced 1 million emails down to just 30,000 requiring review. The technology excavated communications with key terms related to the dispute while filtering out irrelevant daily chatter.

Some law firms use predictive coding for early case assessment even before formal discovery begins. One team analyzed a client’s records related to an impending lawsuit. Out of 465,000 documents, the AI coded 4,000 as most likely to be requested. This preview of the opposing party’s needs focused the client’s collection and review efforts.

While predictive coding tools are rapidly gaining adoption, some litigators remain hesitant to entrust review fully to algorithms. The technology continues improving, but human inspection adds an extra layer of assurance.

Machine Learning Continuously Improves Over Time

A key advantage of machine learning algorithms is that they progressively enhance their performance through continuous feedback loops. As predictive coding tools ingest more data, their ability to pinpoint relevant documents improves over time.

Law firms report that the more documents the AI reviews, the more nuanced its coding decisions become. For example, the predictive coding used by Gibson Dunn in the Rio Tinto v. Vale case improved its precision and recall from 75% to over 95% after multiple rounds of training. Having reviewed over 17 million records, the algorithm had sharpened its criteria for identifying hot documents.
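
For readers unfamiliar with the two metrics: precision is the share of documents the tool flagged that are actually relevant, and recall is the share of all relevant documents that the tool managed to flag. The sketch below computes both for a made-up set of document IDs; the IDs and function name are illustrative only.

```python
def precision_recall(predicted_relevant, truly_relevant):
    """Compare the tool's flagged documents against attorney-verified ones."""
    predicted = set(predicted_relevant)
    actual = set(truly_relevant)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical round: the tool flags four documents, three of them correctly,
# and misses one relevant document (doc5).
flagged_by_tool = {"doc1", "doc2", "doc3", "doc4"}
relevant_per_attorneys = {"doc1", "doc2", "doc3", "doc5"}
precision, recall = precision_recall(flagged_by_tool, relevant_per_attorneys)
print(precision, recall)  # 0.75 0.75
```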

The auto-learning capability also allows predictive coding to adapt as the issues within a case evolve. Additional issues often emerge during discovery that shift definitions of relevance. With continuous active learning, attorneys can provide new examples to re-tune the algorithms.

For instance, a predictive coding tool trained to flag documents about “employee retention” was missing records that discussed retention in different terms like “personnel turnover” and “attrition rate.” By coding some of these false negatives, the lawyers enhanced the AI’s criteria for relevance. Within a few learning cycles, its recall strengthened to include these synonym variations.

This ability to continuously feed new data back into the model helps counter “concept drift,” where the AI model becomes outdated as the underlying issues change. Auto-learning instead keeps pace as new facts and issues reshape the scope of relevance.

Ongoing learning also augments the quality of privilege reviews. In some predictive coding tools, attorneys can tag documents that were misclassified as privileged or non-privileged. This retrains the algorithm to avoid repeating similar errors. As it ingests more examples, the AI gets sharper at making close calls around privilege designation.

Continued training likewise helps predictive coding overcome technical terminology that is industry or case-specific. For a products liability case, the AI refined its ability to distinguish between generic engineering terms versus those indicating design flaws. In a patent dispute, it learned to recognize technical jargon critical for invalidity arguments.

Unlike manual rules-based searches, machine learning evolves dynamically like the human reviewers it augments. New examples enhance its pattern recognition, mirroring skills gained through associates’ years of experience. But unlike humans, predictive coding can absorb these lessons across thousands of data points simultaneously.

The compounding knowledge makes AI invaluable for serial litigants managing cross-cutting issues across multiple cases. Training a predictive coding tool on privileged communications in previous lawsuits equips it to flag these swiftly in new matters. The more diverse exposure the algorithms gain, the more dexterously they can reduce datasets and accelerate proceedings.

AI Speeds Up Privilege Reviews and Redaction

Conducting privilege review is one of the most painstaking steps of document discovery. Attorneys must carefully examine all materials to identify those protected by attorney-client privilege or work product doctrine. This traditionally requires manually reading every email, memo and file. With today’s massive datasets, human review of everything is impractical. Privilege designations directly impact what evidence can be used, so errors could profoundly affect case outcomes.

AI tools like predictive coding are optimizing privilege review by rapidly surfacing documents likely to be protected. In a recent patent infringement case, the producing party used predictive coding to classify privileged items within over 3 million documents. This accelerated their production timetable by over a month compared to linear manual review. Opposing counsel appreciated receiving the relevant materials faster so they could sharpen their infringement arguments.

Predictive coding prioritizes documents likely to be privileged based on patterns it identifies. These may include senders/recipients from legal departments, email subject lines containing “Attorney-Client Privilege”, or file names marked “Confidential”. By clustering documents with these privilege indicators, attorneys avoid wasting time inspecting items unlikely to be protected.

Law firms amplify this efficiency by combining AI with two-step privilege review workflows. Predictive coding or keywords first filter to a smaller subset of potentially privileged documents. Attorneys then examine these to validate which are actually immune from discovery. Prioritizing manual review of likely privileged documents, rather than every file, speeds the process drastically.
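
The first step of such a two-step workflow can be approximated with simple indicator checks. The sketch below is an assumption-laden illustration, not any vendor's method: the `Doc` fields, the legal department addresses, and the rule that a single indicator is enough to queue a document for attorney review are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical addresses belonging to the legal department.
LEGAL_DEPARTMENT = {"gc@example.com", "legal@example.com"}


@dataclass
class Doc:
    doc_id: str
    sender: str
    recipients: tuple
    subject: str
    filename: str = ""


def privilege_indicators(doc):
    """List the privilege indicators a document exhibits."""
    hits = []
    participants = {doc.sender.lower(), *(r.lower() for r in doc.recipients)}
    if participants & LEGAL_DEPARTMENT:
        hits.append("legal department participant")
    if "attorney-client privilege" in doc.subject.lower():
        hits.append("privilege legend in subject")
    if "confidential" in doc.filename.lower():
        hits.append("confidential file name")
    return hits


def first_pass(docs):
    """Step one: keep only documents with at least one indicator.
    Step two (attorney validation) happens outside this sketch."""
    return [d.doc_id for d in docs if privilege_indicators(d)]


docs = [
    Doc("A", "ceo@example.com", ("gc@example.com",), "Q3 contract question"),
    Doc("B", "hr@example.com", ("all@example.com",), "Holiday schedule"),
    Doc("C", "cfo@example.com", ("ceo@example.com",),
        "Attorney-Client Privilege: draft", "memo_CONFIDENTIAL.docx"),
]
needs_attorney_review = first_pass(docs)
print(needs_attorney_review)  # ['A', 'C']
```

A learned model would replace these hand-written rules in practice, but the shape of the workflow stays the same: a cheap filter narrows the set, and attorneys make the final call.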

For example, Gibson Dunn used this strategy to review over 1.75 million documents for privilege designation. Automated filtering reduced this to just 4,000 items requiring detailed human examination. By guiding reviewer focus, AI accelerated privilege designation so the case could proceed.

The machine learning capabilities that enhance accuracy over time are invaluable for privilege review. Nuanced human judgment is required to determine if communications involve legal advice or work product. Predictive coding tools continuously integrate attorney privilege designations to sharpen their criteria.

One law firm reported that after reviewing just hundreds of documents, their predictive coding tool’s privilege classifications improved remarkably. With minimal training, the AI recognized occasions when privileged material was buried in larger non-legal discussions. This nuance prevented improper disclosure.

Redacting sensitive text within documents requires similar human discernment. Here too AI assists by automatically flagging texts potentially warranting redaction across massive datasets. This focuses manual reviewer efforts only on documents needing closer inspection. In this way predictive coding speeds the redaction process while preserving contextual human judgment.

AI privilege review tools also excel at surfacing duplicates across data sets. This prevents attorneys from wasting time reviewing identical communications or documents multiple times. Automated de-duplication streamlines datasets into unique items requiring one-time inspection.
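
Exact-duplicate detection is commonly done by hashing each document's content and grouping identical hashes. The sketch below shows the idea with Python's standard `hashlib`, assuming that collapsing case and whitespace is an acceptable normalization; the document IDs and that normalization rule are assumptions for illustration.

```python
import hashlib


def fingerprint(text):
    """Hash the text after collapsing case and whitespace differences."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(documents):
    """documents: (doc_id, text) pairs.
    Returns (ids to review once, {kept id: [duplicate ids]})."""
    first_seen = {}
    duplicates = {}
    for doc_id, text in documents:
        key = fingerprint(text)
        if key in first_seen:
            duplicates.setdefault(first_seen[key], []).append(doc_id)
        else:
            first_seen[key] = doc_id
    return list(first_seen.values()), duplicates


corpus = [
    ("email-1", "Please review the attached draft."),
    ("email-2", "please   review the attached DRAFT."),
    ("email-3", "Call me about the settlement."),
]
unique_ids, duplicate_map = deduplicate(corpus)
print(unique_ids)     # ['email-1', 'email-3']
print(duplicate_map)  # {'email-1': ['email-2']}
```

Near-duplicates, such as the same email with a different signature block, need fuzzier techniques than this exact-match hash.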

Forward-thinking law firms are exploring how AI can go beyond just surfacing privileged documents. Natural language processing can redact privileged content directly within documents deemed responsive. This technique provides usable evidence faster, while protecting exempted materials.

Natural Language Processing Extracts Key Facts

Natural language processing (NLP) extracts critical facts and relationships from unstructured text, unlocking insights from narrative content. For document-intensive litigation, NLP is invaluable for pinpointing salient evidence within oceans of files.

NLP uses linguistic rules and patterns to analyze text meaning. Algorithms parse grammar and semantics to interpret statements, events, opinions, arguments, and conclusions. This capacity is crucial for automatically surfacing relevant nuggets across expansive datasets.

In a recent matter, NLP extracted key details from thousands of insurance claims to reveal fraud patterns. By connecting claims with identical injuries, treatment dates, and policy details, the technology exposed coordinated rings submitting falsified applications. This analysis was instrumental in the government's ensuing fraud prosecution.

An international law firm leveraged NLP in an internal investigation triggered by a whistleblower complaint. The AI parsed terabytes of emails to identify messages discussing improper practices like bribery. By clustering topically related content, NLP revealed extensive misconduct that prompted remediation efforts.

In another matter, a company sued a parts supplier over a factory explosion allegedly caused by defective materials. The plaintiff's law firm utilized NLP to extract key facts from thousands of pages regarding the parts’ shipment dates, engineering specifications, safety testing, and signatures authorizing purchase. This illuminated critical timeline and accountability details fundamental to proving liability.

NLP also clarifies key arguments and relationships within lengthy contracts. By identifying clauses about payment terms, intellectual property rights, and liability limits, NLP extracts deal essence from tangles of legalese. This speeds contractual analysis integral to transaction disputes.

NLP can also connect references to the same person. Say a defendant’s emails frequently refer to a person as “Alex”, while plaintiffs discuss “Alexander”. Simple keyword searching would miss connections between these communication threads. But NLP recognizes the individual’s name variations, connecting relevant discussions.
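
A heavily simplified version of this name matching is sketched below. It assumes a hand-written alias table that maps each variation to one canonical name, whereas real NLP tools infer such links statistically; the table, message IDs, and function names are hypothetical.

```python
import re

# Hypothetical alias table: every known variation maps to a canonical name.
ALIASES = {"alex": "alexander", "alexander": "alexander"}


def mentioned_people(text):
    """Return the canonical names of people mentioned in the text."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return {ALIASES[t.lower()] for t in tokens if t.lower() in ALIASES}


def link_by_person(messages):
    """messages: (message_id, text) pairs. Group message ids by person."""
    index = {}
    for message_id, text in messages:
        for person in sorted(mentioned_people(text)):
            index.setdefault(person, []).append(message_id)
    return index


messages = [
    ("def-17", "Ask Alex to sign off on the shipment."),
    ("pl-04", "Alexander approved the purchase order."),
    ("pl-09", "The warehouse is closed on Monday."),
]
links = link_by_person(messages)
print(links)  # {'alexander': ['def-17', 'pl-04']}
```

A keyword search for "Alexander" alone would return only the second message, while the alias lookup connects both.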

Attorneys amplify insights by combining NLP with visualization tools. Interactive maps of people, events, and relationships unlock new perspectives. Clustering key facts also helps compare and contrast different document sources, exposing inconsistencies.

While tremendously powerful, NLP has limitations litigators should recognize. Subtle sarcasm, implied meanings, or complex technical concepts can trip up algorithms. Human oversight is key to catch misinterpretations, guiding continuous NLP improvement.

Analytics Identify Communication Patterns and Relationships

Email threading tools reconstruct back-and-forth conversations from fragmented messages. This clarifies the timeline and progression of key discussions related to disputed issues. In a contract breach case, analytics reconstructed a CEO’s demands for early loan repayments and the backpedaling replies. This revealed coercive pressure fundamental to showing duress.
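
One common basis for threading is the pair of standard email headers Message-ID and In-Reply-To: each reply names the message it answers, so following those links back leads to the start of the conversation. The sketch below assumes messages have already been parsed into dictionaries with hypothetical `id`, `in_reply_to`, and `sent` fields; production tools also handle missing headers, quoted text, and subject-line matching.

```python
def build_threads(messages):
    """Group messages into threads keyed by the id of the root message.
    Each thread lists message ids in chronological order."""
    by_id = {m["id"]: m for m in messages}

    def root_of(message):
        visited = set()  # guards against malformed reply loops
        while message["in_reply_to"] in by_id and message["id"] not in visited:
            visited.add(message["id"])
            message = by_id[message["in_reply_to"]]
        return message["id"]

    threads = {}
    for message in sorted(messages, key=lambda m: m["sent"]):
        threads.setdefault(root_of(message), []).append(message["id"])
    return threads


# Hypothetical parsed messages, deliberately out of order.
inbox = [
    {"id": "m3", "in_reply_to": "m2", "sent": "2023-01-05T16:40"},
    {"id": "m1", "in_reply_to": None, "sent": "2023-01-03T09:00"},
    {"id": "m4", "in_reply_to": None, "sent": "2023-01-04T11:30"},
    {"id": "m2", "in_reply_to": "m1", "sent": "2023-01-04T08:15"},
]
threads = build_threads(inbox)
print(threads)  # {'m1': ['m1', 'm2', 'm3'], 'm4': ['m4']}
```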

Social network mapping visualizes connections between people, uncovering involvement and hierarchies. For an employment discrimination case, attorneys used analytics to diagram reporting structures and colleagues regularly communicating with plaintiffs. This surfaced potential witnesses with knowledge of events. Opposing counsel, in turn, identified the leaders who likely shaped the procedures plaintiffs claimed were discriminatory.

In merger and acquisition disputes over misrepresented financials, relationship mapping exposes which executives and advisors had access to confidential data. Analytics can also match key dates like earnings calls with spikes in executive communications, suggesting foreknowledge of misstatements.

Email metadata reveals volume and timing patterns, exposing issues and opportunities. A law firm defending against class action fraud allegations used analytics to look for surges in executive emails around an alleged event. The absence of meaningful spikes undercut claims the executives knowingly spread false information.

For internal investigations, metadata helps reconstruct timelines and participation. Analytics identify days with unusually heavy email and document activity by persons of interest. Clustering all data modifications by time and branch office indicates potential evidence tampering.
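
Flagging unusually heavy days can be as simple as counting messages per day and comparing each day with the average. The sketch below assumes ISO-formatted timestamps and flags days more than two standard deviations above the mean; the threshold and sample data are illustrative assumptions, and days with no activity are left out of the baseline for simplicity.

```python
from collections import Counter
from statistics import mean, pstdev


def spike_days(timestamps, threshold=2.0):
    """Return the days whose message count is far above the average.
    timestamps: ISO strings such as '2023-03-07T22:15:00'."""
    per_day = Counter(ts[:10] for ts in timestamps)  # 'YYYY-MM-DD' prefix
    counts = list(per_day.values())
    average, spread = mean(counts), pstdev(counts)
    if spread == 0:
        return []  # every day looks the same, so nothing stands out
    return sorted(
        day for day, count in per_day.items()
        if (count - average) / spread > threshold
    )


# Hypothetical custodian: two emails a day for six days, then twenty in one day.
emails = [f"2023-03-0{day}T09:00:00" for day in range(1, 7) for _ in range(2)]
emails += ["2023-03-07T22:15:00"] * 20
print(spike_days(emails))  # ['2023-03-07']
```

An empty result is informative too: as in the class action example above, the absence of spikes can itself be evidence.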

Network analysis also quantifies influence and expertise by email traffic and document sharing. For a company acquiring a rival’s sales team, this revealed top performers and mentors critical for retention. In a trade secret theft case, influence patterns showed the defendant occupied a central role bridging research and commercial groups. This strengthened infringement claims.
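
A bridging role like the one in the trade secret example can be approximated by checking whose correspondents span more than one group. This is a minimal sketch with made-up names and group labels; real network analysis tools use richer measures such as betweenness centrality.

```python
from collections import defaultdict


def correspondents(emails):
    """emails: (sender, [recipients]) pairs. Map each person to their contacts."""
    contacts = defaultdict(set)
    for sender, recipients in emails:
        for recipient in recipients:
            if recipient != sender:
                contacts[sender].add(recipient)
                contacts[recipient].add(sender)
    return contacts


def bridging_people(contacts, group_of):
    """Return people whose contacts belong to more than one group."""
    bridges = {}
    for person, others in contacts.items():
        groups = {group_of[o] for o in others if o in group_of}
        if len(groups) > 1:
            bridges[person] = groups
    return bridges


# Hypothetical traffic: Dana writes to both research and commercial staff.
traffic = [
    ("dana", ["raj", "li"]),
    ("dana", ["sam"]),
    ("raj", ["li"]),
    ("sam", ["tori"]),
]
groups = {"raj": "research", "li": "research", "sam": "commercial", "tori": "commercial"}
contacts = correspondents(traffic)
bridges = bridging_people(contacts, groups)
print(bridges)
```

Here only Dana appears in the result, because every other person communicates within a single group.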

Beyond communication patterns, concept clustering extracts buried commonalities across disconnected records. By grouping documents and terms by topic similarity, this technique surfaces non-obvious substantive links. Attorneys combine concept clustering with timeline-mapping to trace topics strategically.

Cloud-Based AI Lowers Costs and Increases Access

Law firms are rapidly adopting cloud-based AI tools to slash eDiscovery costs and enable universal access across engagements. Rather than purchasing and maintaining expensive on-premise servers, the cloud model provides eDiscovery capabilities on-demand through software-as-a-service. This eliminates capital investments while allowing litigators to scale capacity dynamically based on caseloads.

According to an ILTA survey, 98% of firms are now using cloud computing in some capacity. The ability to handle document review anytime, anywhere has become essential with remote work. Boutique firm Ragsdale Liggett transitioned fully to the cloud during COVID shutdowns. This enabled staff to keep eDiscovery moving smoothly for clients despite office closures. The reduced hardware costs also helped their lean 13-attorney team better compete with big firms on doc review pricing.

Lawyers report that cloud-based review tools yield cost savings of up to 80% over traditional methods. The cloud model converts what were once massive fixed expenses into flexible operating costs that rise and fall with demand. Firms pay only for actual processing power and storage used each month rather than overprovisioning on-premise infrastructure to handle peak loads. The cloud’s autoscaling eliminates waste, aligning resources tightly to each case's needs.

Some examples illustrate the dramatic savings:

- A regional firm switched 200 TB of eDiscovery data storage to the cloud, reducing costs by 50%

- An international firm's document review tool transition lowered annual pricing from $180,000 on-premise to just $15,000 in the cloud

- A boutique firm found moving eDiscovery to the cloud decreased their spend by $40,000 per case on average

Lower costs let firms offer eDiscovery pricing models that minimize client risk, like contingency and flat fees. This cost predictability encourages clients to utilize eDiscovery more proactively rather than viewing it as an unavoidable expense. Protera Technologies structures fixed-fee pricing around client data volumes, covering all technology and services. This empowers smaller firms to provide certainty around total discovery costs.

AI Assistance Frees Up Lawyers for Higher-Value Work

While AI accelerates eDiscovery document review and research, its greatest value is enabling attorneys to reallocate time to higher-priority responsibilities. Lawyers freed from manual tasks can focus on work only human insight and judgment can execute. This maximizes human capital efficiency and impact.

Partners at Am Law 200 firms estimate they spend over a quarter of their time reviewing documents. While essential, this rote analysis shortchanges their specialized skills in litigation strategy, negotiation, and client counseling. AI's speed advantage in large-scale document review frees attorneys for the nuanced tasks AI cannot match.

Many lawyers view document review as a rite of passage for junior associates to learn case details. But with shrinking budgets, clients balk at paying high billable rates for document coding that algorithms outperform. AI handles bulk document triage at a fraction of the cost. Associates then analyze the subsets most requiring human discernment around privilege, confidentiality and responsiveness. This balances AI efficiency with tailored human judgment.

At multinational firm Reed Smith, AI decreased document review time so attorneys could strengthen pleading arguments. Associates also gained time for deeper legal research, precedent analysis and trial preparation. This improved case outcomes and client satisfaction.

In M&A deals, AI review of stockpiled transaction documents provides rapid background context so attorneys can focus on negotiating high-stakes terms. For secured lending, algorithms handle collateral lien reviews, freeing lenders' counsel for complex intercreditor agreements requiring their expertise. AI also drafts and reviews routine filings like UCC statements, releases, and extensions so attorneys handle only exceptions.

Contract management AI extracts core terms from lengthy contracts and highlights key changes between versions. This enables a sharper focus on negotiating major commercial points rather than reviewing every line. AI document generation handles routine agreements so attorneys can craft more customized contracts for unique deals.

At Am Law 50 firm Lewis Roca, lawyers gained 15-25% more time for client advising by using AI tools for discovery and research. The freed bandwidth allowed more responsive legal guidance, differentiating the firm from DIY solutions. This also facilitated cross-selling other firm services, deepening client loyalty. The firm also shifted personnel from routine work to more engaging initiatives that improve practice operations.