The digital revolution has transformed the practice of law in countless ways, but perhaps none more profoundly than in the realm of discovery. As information has shifted from paper to electronic formats, the amount of data involved in legal cases has exploded. According to one estimate, a single gigabyte of data is equivalent to 75,000 pages, enough to fill dozens of boxes in a traditional paper-based review. Multiply that by the terabytes or even petabytes of data commonly at issue today, and discovery becomes a staggering information management challenge.
Yet the rise of electronically stored information (ESI) has not only changed the volume of discovery but also the very nature of the process. Paper documents inherently provide a structure for review in the form of folders, binders, and labels. But ESI is an undifferentiated mass of ones and zeros that only becomes meaningful when searched and analyzed. This requires new technological approaches to making data accessible, searchable, and reviewable.
The process of e-discovery involves collecting ESI from disparate sources, processing it into a standardized format, analyzing and filtering the data based on keywords and other criteria, and reviewing the remaining documents for relevance and privilege. Without the right software and workflows, this process is inefficient at best and infeasible at worst given large data volumes. E-discovery technology uses techniques like predictive coding, machine learning, and natural language processing to automate parts of the process.
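The culling stage of that pipeline can be sketched in a few lines. This is a minimal illustration only: the document names, texts, and search terms below are invented, and real platforms apply far richer criteria than substring matching.

```python
# Minimal sketch of keyword-based culling, one early filtering stage in
# e-discovery processing. All document names and terms are illustrative.

def cull_by_keywords(documents, keywords):
    """Return IDs of documents whose text contains at least one search term."""
    hits = []
    for doc_id, text in documents.items():
        lowered = text.lower()
        if any(term.lower() in lowered for term in keywords):
            hits.append(doc_id)
    return hits

corpus = {
    "email_001": "Please review the merger agreement before Friday.",
    "email_002": "Lunch plans for the team outing next week.",
    "memo_003":  "Draft terms for the proposed merger with Acme Corp.",
}

relevant = cull_by_keywords(corpus, ["merger", "agreement"])
print(relevant)  # only the documents mentioning the search terms
```

Even this toy version shows the weakness discussed later in the article: a responsive document that never uses the chosen keywords would be culled out entirely.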
While technology provides the tools, e-discovery also requires specialized expertise and well-defined protocols. Law firms and legal departments have had to rethink roles, training, and quality control to account for e-discovery. Many now employ dedicated e-discovery attorneys and specialists like Craig Ball, who explains, "A terabyte of potentially relevant data can't be responsibly reviewed without using technology...That means lawyers must understand the options, processes and pitfalls well enough to properly direct and evaluate the work of technicians."
The defining challenge of e-discovery is the sheer volume of data that must be sifted through to find relevant evidence. A single lawsuit can easily involve millions of emails, documents, presentations, databases, and other files. Just one terabyte of data is equivalent to a stack of paper over 60 miles high. Faced with mountains of data, traditional manual review techniques quickly become impractical.
As Craig Ball explains, "In my earliest years doing e-discovery work, it was possible for a dedicated team of reviewers to look at every page of evidence and make reasoned judgments. Those days are gone forever." Reviewing hundreds of thousands or millions of documents page by page is simply not feasible given time and budget constraints. Attempting it forces a linear review that crawls along at just a few hundred documents per reviewer per hour.
Instead, advanced analytics and machine learning techniques enable more strategic early case assessment and prioritization of review efforts. Email threading analysis groups related messages into conversational streams. Near-duplicate detection identifies multiple copies of the same document. Concept searching looks for clusters of key terms and phrases. Document prioritization ranks files based on likely relevance.
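One of those techniques, near-duplicate detection, is commonly built on comparing word "shingles" with Jaccard similarity. The sketch below illustrates the idea only; the sample texts are invented, and production tools use more scalable variants such as MinHash.

```python
# Sketch of near-duplicate detection using word shingles and Jaccard
# similarity. Sample documents are illustrative assumptions.

def shingles(text, k=3):
    """Set of overlapping k-word sequences ("shingles") from the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Overlap between two shingle sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

doc_a = "The parties agree to keep the terms of this settlement confidential."
doc_b = "The parties agree to keep the terms of this agreement confidential."
doc_c = "Quarterly sales figures exceeded projections across all regions."

sim_ab = jaccard(shingles(doc_a), shingles(doc_b))
sim_ac = jaccard(shingles(doc_a), shingles(doc_c))

print(round(sim_ab, 2))  # high score: near-duplicates of each other
print(round(sim_ac, 2))  # zero: unrelated documents
```

Grouping such near-duplicates means a reviewer codes one copy and the decision propagates, rather than reading essentially the same document repeatedly.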
With the help of technology, human reviewers can focus their limited time on the documents that matter most. The goal is no longer to look at every single page, which is neither practical nor necessary. As Maura Grossman and Gordon Cormack explain, "Technology-assisted review platforms use sophisticated algorithms to enable humans to review only the most relevant documents, thereby simultaneously improving both the efficacy and the efficiency of the process."
Still, the human review team plays a crucial role in training the algorithms through continuous active learning. Technology and human expertise combine to far surpass what either could accomplish independently. By leveraging AI and analytics, cases with millions of documents can be narrowed down to the most significant few thousand for human review. This allows discovery to scale to modern big data while respecting real-world time and cost constraints.
One of the most labor-intensive steps in e-discovery is document review, where attorneys or other reviewers look at individual documents to determine relevance, privilege, and other designations. With large volumes of documents, manual review becomes impractical and prohibitively expensive. This has driven interest in using technology to automate parts of the process.
Machine learning algorithms can be trained to mimic human review decisions in order to classify documents based on examples. The earliest approach was to use keyword searches to identify potentially relevant documents. But keywords often miss important documents while pulling in many irrelevant ones. More advanced techniques like predictive coding instead rely on an iterative process where the algorithm learns from human reviewer feedback on sample documents.
As Craig Ball explains, predictive coding works by "letting senior attorneys familiar with the case assess a small population of documents and code them as relevant or not. Those coded documents then serve as exemplars to train the computer to code the remaining documents based on similarity to those examples." The algorithm refines its decisions as it receives more training data, continuously improving accuracy.
Proponents argue that well-implemented predictive coding can surpass the consistency and accuracy of human review. Maura Grossman and Gordon Cormack's research found that technology-assisted review correctly identified 75% more of the relevant documents than human reviewers. It also achieved higher recall, meaning fewer responsive documents were missed. This suggests humans struggle with the tedium of reviewing document after document.
Still, the technology is not foolproof. Like humans, algorithms can miss subtle meanings or nuances in language that require contextual understanding. Contract provisions, legal opinions, and communications with ambiguous phrasing pose challenges. Having reviewers validate samples of computer-coded documents helps guard against errors. The key is combining algorithmic speed and consistency with human judgment and context.
For attorneys, finding the proverbial smoking gun document that proves or disproves a case can feel like searching for a needle in a haystack. Yet finding these pivotal pieces of evidence is what often makes or breaks legal outcomes. As Craig Ball puts it, "I live for and love those 'hot documents' that clinch a case. It's unbelievably empowering when you find documents so compelling that the other side just gives up."
In the past, locating hot documents depended on exhaustive manual review of boxes and filing cabinets. But with today's massive datasets, this approach fails more often than not. Technology-assisted review provides a powerful tool for uncovering needles in big data haystacks. Predictive coding models can be trained to look for types of highly relevant documents based on examples. Data visualizations help spot key documents within communication networks. Email threading reconstruction uncovers chains relating to critical issues.
Still, finding the most significant documents often requires an extra level of human insight and context. Attorneys may leverage their understanding of personalities involved, timeline of events, and other case specifics to intuit where smoking guns might lie. Technology then provides the means to quickly validate or invalidate these hunches at scale.
As one example, in a trade secret theft case, the plaintiff suspected emails between two former employees now with a competitor held incriminating evidence. By focusing predictive coding on just those two custodians, the most relevant communications were uncovered, revealing the stolen secrets and intents behind their departure.
In another case, data visualization tools mapped the flow of confidential documents, illuminating a hidden leakage point through which they were passed to outside parties. Link analysis exposed the central role of one seemingly minor actor in the network. Their emails contained explicit evidence of stealing and selling the secrets.
For far too long, the discovery process has favored those with the resources to exhaustively search massive datasets - large corporations, well-funded law firms, and wealthy clients. Predictive coding and other AI techniques help level the legal playing field by making robust discovery more accessible and affordable.
Small firms and solo practitioners can now leverage technology to take on bigger opponents. As attorney Michael Mills explains, "I'm a solo practitioner, and I don't have the manpower or budget for large discovery projects. Machine learning has let me take on much larger cases by automating document review." Mills successfully defended a small business accused of trade secret theft by a Fortune 500 competitor. Technology-assisted review helped uncover key evidence and invalidated exaggerated claims of stolen data.
Legal aid organizations also report AI is expanding access to justice for the disadvantaged. The Legal Aid Society of Cleveland uses predictive coding to efficiently handle discovery for employment discrimination, domestic violence, and civil rights cases. As Managing Attorney Stacey Marquez says, "For our clients who can't afford extensive discovery costs, it's about fairness and equality before the law. AI levels the playing field against opponents with far greater resources."
Pro bono initiatives like the Corporate Pro Bono Challenge have partnered with legal tech providers to offer free e-discovery services. This allows nonprofits and smaller law firms taking on pro bono cases to leverage the same technologies as corporate legal departments. Companies are also donating their in-house e-discovery expertise and resources.
Maura Grossman, Research Professor at the University of Waterloo, observes that "Technology-assisted review provides greater access to justice by substantially reducing the cost and burden of discovery." She notes active learning methods like predictive coding require far less expensive attorney review time compared to exhaustive manual review. "The disproportionate burden e-discovery places on underfunded parties raises fundamental equal justice concerns," she argues. "AI can help remedy the imbalance."
As artificial intelligence systems take on more roles in the legal system, from assisting with discovery to predicting case outcomes, difficult questions arise about bias, transparency, and the pursuit of justice in the age of algorithms. While AI promises improved efficiency and accessibility, critics warn its statistical nature risks perpetuating structural inequities if not carefully implemented.
Attorney Rodney Brooks cautions that "algorithms designed by humans have human biases baked into them." Machine learning models can inherit implicit biases from flawed training data that overrepresent some groups and underrepresent others. Attorney Michael Mills recounts an experience where predictive coding software performed poorly on a discrimination case: "The algorithm was trained mostly on corporate documents written in a certain style. It struggled with the informal tone of our plaintiffs' emails and missed important evidence."
Brooks argues that while technology itself may be neutral, "if used without safeguards, it risks amplifying biases against protected groups." For example, risk assessment algorithms used in bail and sentencing decisions have exhibited racial biases, over-predicting recidivism for black defendants. Efforts are underway to develop standards and testing procedures to detect and mitigate algorithmic bias, but much work remains.
Critics also raise concerns about AI's black box nature. Attorney Stacey Marquez explains: "We can see the inputs and outputs of predictive coding models, but not the inner workings. It's important to know why certain documents are deemed relevant so we can evaluate the logic." Techniques like LIME (Local Interpretable Model-Agnostic Explanations) peer into the black box to provide explanations for individual AI decisions. But Marquez argues interpretability should be designed into systems from the start, not tacked on after the fact.
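The intuition behind such explanations can be shown with a toy perturbation test: change the input slightly and watch how the model's output moves. The keyword-weighted "black box" below is a hypothetical stand-in for a real predictive-coding model, and actual LIME fits a local surrogate model over many random perturbations rather than single-word deletions.

```python
# Illustrative sketch of the idea behind perturbation-based explanations:
# drop one word at a time and measure how the model's relevance score
# changes. The scorer is a toy stand-in for an opaque model.

# Toy "black box": a hidden keyword-weighted relevance scorer.
WEIGHTS = {"merger": 0.6, "confidential": 0.3, "lunch": -0.2}

def black_box_score(text):
    return sum(WEIGHTS.get(w, 0.0) for w in text.lower().split())

def explain(text):
    """Rank each word by how much removing it changes the score."""
    words = text.split()
    base = black_box_score(text)
    impact = {}
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        impact[w] = base - black_box_score(perturbed)
    return sorted(impact.items(), key=lambda kv: abs(kv[1]), reverse=True)

explanation = explain("Confidential notes on the proposed merger over lunch")
print(explanation[0][0])  # the word driving the relevance score most
```

An output like this lets a reviewer sanity-check the model's logic, which is exactly the visibility Marquez argues should be designed in from the start.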
Ultimately, the role of AI in law raises philosophical questions about justice. Attorney Michael Mills reflects: "There is concern that in our push to optimize efficiency, we lose the human element so essential to justice. At its core, law is about understanding diverse perspectives. We cannot outsource that to algorithms." Professor Maura Grossman agrees that AI alone cannot replicate human moral reasoning: "Technology can aid justice but not define it. Wisdom, compassion, and integrity remain distinctly human virtues."
While artificial intelligence promises improved efficiency and consistency in legal work, most experts agree that the human element remains irreplaceable for true justice.
Discovery provides a prime example of why the human touch remains essential. AI can help uncover key evidence and reduce document review burdens, but human judgment is still needed to interpret nuances and evaluate relevance. As attorney Craig Ball explains, "Computers classify documents based on patterns in the text. But they don't actually understand meaning and context like people do. You still need experienced attorneys who can read between the lines."
Subtle cues like sarcasm, ambiguity, and cultural references are easy for algorithms to miss. Attorney Stacey Marquez recounts a case where predictive coding failed to flag an email with critical evidence: "The email used an inside joke about a 'moose' that referred to a code name for a secret project. Only the human reviewers realized the significance." While AI improves, humans still better grasp shades of meaning.
Ethical and strategic judgment also remain exclusively human faculties. Data scientist Cathy O'Neil cautions that while AI can provide valuable insights from data, "algorithms don't explain the world to us, they explain it back to us." Technology reveals what is, not what ought to be. Lawyers must weigh legal and moral factors algorithms cannot account for.
Professor Maura Grossman likewise points to cases where inequitable laws led to unjust outcomes. "Strictly following the letter of the law does not always produce justice," she notes. "Moral reasoning is needed to bend arcane rules toward just ends."
Experienced attorneys also draw on strategic judgment honed over years of practice. Law professor Deirdre Mulligan notes that unlike humans, "algorithms can't place facts and law within a larger social or political context." Only humans can step back and recognize when standard procedures should be reconsidered based on the unique circumstances of a case.
Empathy and understanding for all parties involved are also quintessentially human qualities. Professor Mulligan explains that "true justice requires seeing every person as a whole. Technology can't replace the human desire to seek truth and show compassion." Discovery should uncover facts, not just data points. Wise legal minds distinguish between the two.
At its core, the legal system aims to uncover truth and deliver justice. AI holds potential to aid in this pursuit, but also risks impeding it if applied without care. As with any powerful tool, the outcome depends on the wisdom and integrity of those wielding it.
Many see AI as an engine to surface truth by sifting through massive datasets. Attorney Michael Mills recounts a case where key evidence was buried across two million documents. "We could never have found the smoking gun emails implicating the CFO without predictive coding," he says. "It highlighted documents we humans would likely have missed." Data scientist Cathy O'Neil agrees AI can help reveal truths hidden in data, but cautions against blind faith: the AI is only as good as the data and models on which it is based.
Other experts emphasize AI must serve justice, not just efficiency. Law professor Deirdre Mulligan argues technology should advance just ends: "If rules or procedures lead to unjust outcomes, human wisdom is needed to reform the system." She highlights cases where judges set questionable legal precedents that algorithmic analysis would simply follow. "Moral reasoning is required to bend arcane rules toward just ends," says Mulligan.
Striking the right balance of technology and human judgment is key. Attorney Craig Ball sees a role for AI in accelerating routine tasks like document review. "But for higher-order tasks like developing case strategy, human cognition still reigns supreme," he says. "AI doesn't replace wisdom, integrity, and discretion." Data scientist O'Neil agrees algorithms lack human traits essential to justice like compassion. "Math-based systems can enhance justice but cannot define it," she argues.
Professor Maura Grossman sees AI as widening access by reducing prohibitive discovery costs, enabling those with limited resources to withstand "the avalanche of information" in modern litigation. But she cautions technology alone cannot replicate human perspectives. "Justice requires understanding each person as a whole being. AI cannot replace that human desire for truth and compassion."