"Trust me, I know what I'm doing!" - Court Outlines Perils of Custodian Self-Collection and Inadequate Keyword Searches

In a recent ruling, U.S. District Judge Shira Scheindlin of the Southern District of New York, an e-discovery authority of Zubulake and Pension Committee fame, held that various government agencies had failed to adequately design searches for responsive electronically stored information. While the case, National Day Laborer Org. Network et al. v. U.S. Immigration and Customs Enforcement Agency, et al., 2012 U.S. Dist. LEXIS 97863 (S.D.N.Y. July 13, 2012), deals largely with searches in the context of the Freedom of Information Act (“FOIA”), Judge Scheindlin noted that “much of the logic behind . . . e-discovery searches is instructive in the FOIA search context because it educates litigants and the courts about the types of searches that are or are not likely to uncover all responsive documents.”

Plaintiffs sought records from five government agencies concerning the U.S. Immigration and Customs Enforcement Agency’s (“ICE”) “Secure Communities” immigration enforcement program’s “opt-out” provision. In cross-moving for summary judgment, each of the defendant government agencies filed declarations attesting to the sufficiency and level of detail associated with the requested search for records. The Court, however, ordered that additional searches be conducted by most of the defendant agencies because, among other deficiencies, they failed to follow through on obvious leads, search archived records, and adequately describe the extent of their searches; and in one striking example, one defendant “absurd[ly]” interpreted a custodian’s failure to respond to a request for records as proof that no responsive documents existed.

This decision underscores the dangers of custodian self-collection in the context of e-discovery. In particular, the Court emphasized two reasons why it could not “simply trust” assertions by defendants that their custodians “have designed and conducted a reasonable search” that could be “reasonably calculated to uncover all relevant documents.” First, because many of the defendants’ affidavits did not “record and report the search terms that they used, how they combined them, and whether they searched the full text of documents,” the affidavits lacked a “reasonable specificity of detail” and thus failed to establish that an adequate search was conducted. Second, while most custodians are familiar with “[s]earching for an answer on Google (or Westlaw or Lexis),” the Court concluded that defendants’ custodians lacked the skills necessary to “design[] legally sufficient electronic searches in the discovery or FOIA contexts” because it was “not part of their daily responsibilities.”
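
To make the Court's first point concrete, here is a minimal sketch of how a custodian's search might be recorded so that an affidavit can later report it with "reasonable specificity of detail." The record fields, the custodian name, and the sample terms are hypothetical and are not drawn from the opinion; they simply mirror the three items the Court found missing: the terms used, how they were combined, and whether full text was searched.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class SearchRecord:
    """One search run by one custodian (all field names are illustrative)."""
    custodian: str
    terms: List[str]      # the exact search terms used
    combined_with: str    # how the terms were combined, e.g. "AND" or "OR"
    full_text: bool       # True if document bodies were searched, not just subject lines
    sources: List[str]    # mailboxes, shared drives, archives that were searched
    hit_count: int        # number of documents returned


def to_report(records: List[SearchRecord]) -> str:
    """Serialize the search log so it can be attached to a declaration."""
    return json.dumps([asdict(r) for r in records], indent=2)


if __name__ == "__main__":
    log = [
        SearchRecord(
            custodian="J. Doe",  # hypothetical custodian
            terms=["opt-out", "secure communities"],
            combined_with="AND",
            full_text=True,
            sources=["active mailbox", "email archive"],
            hit_count=42,
        )
    ]
    print(to_report(log))
```

A log like this does not make a search adequate by itself, but it gives the supervising attorney something to review and the court something to evaluate.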

Moreover, the Court stressed the inadequacies associated with keyword searches, opining that “[e]ven in the simplest case . . . there is no guarantee that using keywords will always prove sufficient.” Other problems associated with the keyword searches involved the lack of supervision over custodial keyword searches, often performed by laypersons. Because of these limitations, Judge Scheindlin cautioned that courts and parties must move “beyond the use of keyword searches” and “rely on latent semantic indexing, statistical probability models, and . . . iterative learning,” which is generally known as predictive coding, in order to “significantly increase the effectiveness and efficiency of searches.” A recent post on the endorsement by another S.D.N.Y. Court of the concept of predictive coding and other computer assisted review in the context of e-discovery can be found here.

Although the case largely deals with searches in the FOIA context, Judge Scheindlin has sent yet another characteristically direct warning that all parties in discovery must realize the limitations of untested keyword searches, guard against inadequate supervision of custodial searches, and learn to use twenty-first century technologies to perform adequate e-discovery searches.


Paul A. Saso is a Director in the Gibbons Business & Commercial Litigation Department and a member of the Gibbons E-Discovery Task Force.

Hard Drive of a Key Non-Party Witness is Searchable in Response to Subpoena

A key non-party fact witness is fairly the target of a subpoena seeking production of ESI. In Wood v. Town of Warsaw, North Carolina, the United States District Court for the Eastern District of North Carolina held that ESI preserved on a former town manager’s personal computer must be made available for a search by a forensic expert in response to the Plaintiff’s subpoena.

Raymond Wood, the former police chief of the town of Warsaw, North Carolina, alleged that his dismissal by former town manager Jason Burrell was motivated by the town’s desire to have “younger blood in the chief’s office.” Plaintiff sued for age discrimination under the federal Age Discrimination in Employment Act. During discovery, Plaintiff directed a subpoena to non-party Burrell requesting, among other things, a search of Burrell’s personal computer using search terms to be agreed upon.

Resisting the subpoena, Burrell argued that the proposed search would be time-consuming, costly and an invasion of his personal privacy. He further claimed that he did not use his personal computer for work-related purposes, and that if any responsive documents existed on his personal computer, he would produce them since they would be otherwise responsive to the subpoena. In response, Plaintiff argued that the proposed search was reasonably calculated to lead to the discovery of admissible evidence, that he had already agreed to pay for the cost of the proposed search by a forensic expert, that he had submitted proposed search terms to Burrell’s attorney and that the only cost to Burrell would be a privilege review by his personal attorney.

The Court began its analysis with Federal Rule of Civil Procedure 45, which governs requests for discovery, including ESI, from non-parties. Under Rule 45, a Court must weigh (1) the relevance of the discovery sought; (2) the need for the information; and (3) the potential hardship to the non-party. The Court noted requests to non-parties may also be limited if the information sought “is obtainable from another source that is more convenient, less burdensome, or less expensive, or if the burden of the proposed discovery outweighs its likely benefit.” While acknowledging the breadth of the subpoena directed to Burrell, the Court determined that Wood reasonably sought “only those non-privileged documents identified by an electronic search for key words related to the claims and defenses asserted by the parties.” The Court further noted that Burrell was not a disinterested fact witness. Rather, he “is alleged to have been Plaintiff’s supervisor at the time the events at issue occurred and is alleged to have terminated Plaintiff.”

Perhaps the most important holding in Wood was the Court's acknowledgement that employees often transact business outside the workplace using personal electronic devices.

In this age of smart phones and telecommuting, it is increasingly common for work to be conducted outside of the office and through the use of personal electronic devices. Therefore, it is not unreasonable, despite Burrell’s assertions to the contrary, that some relevant information may be found on his personal computer’s hard drive.

Wood clearly supports a party’s effort to obtain ESI from a non-party’s personal electronic devices. (Contrast the New York State Supreme Court's decision in DeRiggi v. Krischen, which you can read about here, where the court refused to order a forensic examination of a plaintiff's personal computer hard drive.) Wood is also another example of a non-party being ordered to comply with a subpoena seeking ESI, even where the non-party may experience some cost or inconvenience in the process. Plaintiff's success in enforcing his subpoena likely resulted, at least in part, from his agreement to pay most of the costs of searching Burrell’s hard drive, as well as Plaintiff’s proactive approach in proposing search terms as part of the meet-and-confer efforts. Such efforts at cooperation are a clear sign of good faith and nearly always impress a court favorably when it resolves an e-discovery dispute of any sort.


Stephen J. Finley, Jr. is an Associate on the Gibbons E-Discovery Task Force.

Not So Fast: 95 Million Reasons to Carefully Select and Limit Search Terms

It has become commonplace for parties engaged in electronic discovery to discuss and agree upon “keyword” searches in an effort to limit the overall scope of discovery. A recent decision in the District of New Jersey, I-Med Pharma, Inc. v. Biomatrix, Civ. No. 03-3677 (DRD) (D.N.J. 2011), demonstrates the pitfalls that arise when the parties too eagerly agree to conduct a search for electronically stored information using an overly broad set of keywords. The case also demonstrates a court’s willingness to engage in proportionality analysis to cabin broad discovery.

Biomatrix involved a dispute over two medical distribution contracts, with the plaintiff alleging that the defendant breached certain exclusivity provisions. In the course of the parties’ meet-and-confer discussions, the plaintiff’s counsel agreed to allow the defendant’s expert to conduct a keyword search of more than 50 terms on the plaintiff’s computer network, servers, and related storage devices. Counsel should have known better than to agree to such keywords -- without limits as to time, custodian, or “active file” status -- that would almost certainly result in millions of hits. In this case, the agreed-upon search yielded more than 64 million hits, approximating 95 million pages of data. Despite having agreed to conduct such a broad search as part of a previous court order, the plaintiff was forced to seek court approval to have the prior discovery order modified to further narrow the discovery inquiry. Not surprisingly, the defendants sought to hold the plaintiff to its initial deal, and also sought costs associated with the search.

Magistrate Judge Shipp found that (1) “good cause” existed for modification because the plaintiff’s privilege review of the documents would be unduly burdensome, (2) the defendants did not demonstrate the relevancy of the documents, and (3) the parties’ overbroad search terms were unlikely to yield relevant, admissible information. He thus amended the pre-existing discovery order, but held that the defendants could seek reimbursement of the costs associated with extracting and searching the data on the plaintiff’s computer system.

District Judge Dickinson R. Debevoise ultimately affirmed Magistrate Judge Michael Shipp’s modification of the existing discovery order -- finding not only “good cause” to do so, but even “manifest injustice” if the order were not modified. See Waldorf v. Shuta, 142 F.3d 601 (3d Cir. 1998). Judge Debevoise also took the opportunity to provide guidance for parties to keep in mind when discussing and proposing search terms. In particular, the Court reasoned that the parties should have been more diligent before agreeing to the broad search terms, and listed the following factors for parties to consider when evaluating proposed search terms:

  1. the scope of the documents searched and whether the search is restricted to specific computers, file systems, or document custodians;
  2. any date restrictions imposed on the search;
  3. whether the search terms contain proper names, uncommon abbreviations, or other terms unlikely to occur in irrelevant documents;
  4. whether operators such as ‘and,’ ‘not,’ or ‘near’ are used to restrict the universe of possible results; and
  5. whether the number of results obtained could be practically reviewed given the economics of the case and the amount of money at issue.
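
Several of these factors can be tried out on a small sample before anyone agrees to a protocol. The sketch below is only an illustration under stated assumptions: the Doc record, the hits and review_cost helpers, the sample documents, and the review rate and hourly cost are all hypothetical and do not come from the opinion. It shows how custodian limits, a date restriction, and “and”/“not” operators shrink the result set, and how a rough review cost can be weighed against the amount at issue.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    custodian: str
    sent: date
    text: str


def hits(docs, all_terms, none_terms=(), custodians=None, start=None, end=None):
    """Return docs containing every term in all_terms (AND) and none of the
    terms in none_terms (NOT), limited to the given custodians and dates."""
    out = []
    for d in docs:
        if custodians is not None and d.custodian not in custodians:
            continue  # factor 1: restrict to specific custodians
        if start is not None and d.sent < start:
            continue  # factor 2: date restriction (lower bound)
        if end is not None and d.sent > end:
            continue  # factor 2: date restriction (upper bound)
        words = set(d.text.lower().split())
        if all(t in words for t in all_terms) and not any(t in words for t in none_terms):
            out.append(d)  # factor 4: AND / NOT operators
    return out


def review_cost(n_docs, docs_per_hour=50, hourly_rate=200.0):
    """Factor 5: rough cost of reviewing n_docs (the rates are assumptions)."""
    return n_docs / docs_per_hour * hourly_rate


docs = [
    Doc("smith", date(2002, 3, 1), "exclusive distribution contract renewal"),
    Doc("smith", date(1998, 5, 1), "contract for office supplies"),
    Doc("jones", date(2002, 6, 1), "lunch contract joke"),
    Doc("lee", date(2002, 7, 1), "distribution contract newsletter"),
]

broad = hits(docs, ["contract"])
narrow = hits(
    docs,
    ["contract", "distribution"],
    none_terms=["newsletter"],
    custodians={"smith", "jones"},
    start=date(2001, 1, 1),
)
print(len(broad), len(narrow))
```

Running the same comparison on a real sample, with real review rates, gives counsel a concrete hit count and cost estimate to bring to the meet and confer instead of discovering the volume after the order is entered.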

As a practical matter, litigants should pay attention to Judge Debevoise’s guideposts before agreeing to a broad set of search terms. An eagerness to agree up front to a discovery plan and avoid a fight may only delay the inevitable if the search terms picked are so broad as to result in an unduly burdensome stack of material to review. Moreover, courts have become more willing to examine “what’s at stake” in a case before ordering broad-based discovery; and seeking a more limited discovery order early in the case can avoid unnecessary expense later on.


Jennifer Marino Thibodaux is an Associate on the Gibbons E-Discovery Task Force.

Ineffective Privilege Review Leads to Inadvertent Waiver in Rolling Document Production

Recently, a federal court in Illinois held in Thorncreek Apartments III, LLC v. Village of Park Forest that a defendant waived the attorney-client privilege when it inadvertently produced 159 documents that it later claimed were privileged. The defendant’s failure to take reasonably adequate measures to prevent such disclosure serves as a lesson for all attorneys, especially those who manage large, rolling document productions with the help of a vendor.

During its collection and production efforts, the defendant used a vendor to produce documents stored on back-up tapes according to a 3-step process: (1) a search of the back-up tapes using agreed-upon “search terms”; (2) a review of the yielded documents on a database accessible to defense counsel only; and (3) the placement of the yielded documents onto an online production database for the plaintiffs to review. The defendant produced over 250,000 pages of documents over a seven-month period. In the meantime, the defendant did not produce a privilege log and advised the plaintiffs that no privileged documents were withheld.

When the production was completed and the plaintiffs attempted to use certain documents at a deposition, the defendant immediately objected, claiming privilege and inadvertent disclosure, which objection counsel reaffirmed after the deposition. Defense counsel provided a privilege log four months after the deposition, which identified 159 previously-produced documents that were allegedly inadvertently disclosed. After resolving the waiver issue as to all but six of the 159 inadvertently produced documents, the plaintiffs sought the Court’s assistance, arguing that the six documents were not privileged or if they were, that the privilege was waived.

The Court concluded that the six documents, or at least portions of them, were privileged. Similarly, the Court found that the production was inadvertent based upon the circumstances surrounding the production, such as the defendant’s belief that the vendor would automatically withhold all documents tagged “privileged” from the online production database, and defense counsel’s immediate objection to the use of the documents at the deposition and again thereafter.

Nonetheless, the Court ultimately found that the defendant failed to take reasonable steps to prevent disclosure, resulting in a waiver of the privilege under Federal Rule of Evidence 502(b) as to those six documents. The Court noted that the screening process was unreasonable because the defendant failed to check the online production database before it was “live online” to confirm that the privileged documents were withheld, which the Court characterized as a “strong [piece of] evidence of the inadequacy of the [defendant’s] precautions.” The Court also determined that the defendant failed to remedy the inadvertent disclosure in a timely fashion. In light of those factors, the Court found the screening process to be “completely ineffective” and without any “reasonable precautions.” For other blog articles related to inadvertent disclosure and inadvertent production under Rule 502(b), click here and here.


Jennifer Marino Thibodaux is an Associate on the Gibbons E-Discovery Task Force.

The Role of Lawyers in the Age of Electronic Discovery -- Don't Hit Delete!

Will developments in technology make lawyers more efficient or will they become extinct? A March 2011 article in The New York Times, entitled “Armies of Expensive Lawyers, Replaced by Cheaper Software,” discussed the significant efficiency and accuracy advantages of e-discovery software over human document review. Although technology has enabled computers to imitate humans’ ability to reason at even higher levels, rest assured that Armageddon is not looming on the legal profession’s horizon.

The New York Times article discusses the development of e-discovery software that can analyze documents more quickly than human counterparts. The “linguistic” approach enables the user to find and sort documents that are deemed relevant by searching specific words or phrases. More sophisticated linguistic software can even search and filter documents based upon a tool analogous to a thesaurus. For example, if “dog” is deemed the relevant search term, the user may be able to locate documents that contain phraseology such as “man’s best friend.” Meanwhile, the “sociological” approach uses deductive reasoning and is more conceptual. For example, if someone suddenly switches their communication from e-mail to telephone after writing “call me,” it may trigger heightened scrutiny if that person is under investigation for something. Similarly, some software can even detect when an e-mail author’s style has switched from slang and abbreviations to a more formal style.

The article further cited law firms’ experiences with e-discovery software. One firm used software to sort and assess 570,000 documents in two days, which, in turn, enabled the firm to identify 3,070 responsive documents in one day. Another firm cited software’s ability to scrutinize and understand how the company it was suing functioned. Lawyers have also used such tools during negotiations, searching their clients’ documents for the key words that the adversary had designated during pretrial proceedings.

Regardless of how the software is utilized, the role of a live attorney is not lost. While document processing and analysis will cull down documents to a substantially smaller review set, best practices suggest that an attorney must ultimately review whatever documents have been sorted or culled before they can be produced. As noted by "The Sedona Conference®, Commentary on Achieving Quality in the E-Discovery Process," May 2009, quality and procedural safeguards must be built into the e-discovery protocol in order to ensure the discoverability of key evidence, accord the proper privilege or work product protections to documents, provide a defensible process, reduce the need to re-do e-discovery because of deficiencies, and avoid motion practice. The Sedona Conference® recommends:

  1. Judgmental Sampling - the selection of sample documents, whether culled by an e-discovery software program or by a reviewer, to determine if the documents are truly responsive or relevant to the issues at hand.
  2. Independent Testing - Tests by a third-party reviewer to confirm a software’s “reported efficacy at completely extracting files from an e-mail container, accurately displaying such files for review, and completely indexing the searchable text in such files.”
  3. Reconciliation Techniques - Comparison of the amount of ESI processed and the resulting review set in order to confirm that the ESI was handled correctly or to identify gaps in the processing.
  4. Inspection to Verify and Report Discrepancies - Attorneys, particularly senior attorneys, should be available to assist reviewers, address issues and to sample review data sets to confirm and to ensure the quality of the review.
  5. Statistical Sampling - Confirm or de-confirm the effectiveness of search terms and other automated tools in identifying responsive information.
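
Two of these safeguards, reconciliation and statistical sampling, reduce to simple arithmetic. The following is a minimal sketch rather than anything prescribed by The Sedona Conference®: the function names, the counts, and the use of a normal-approximation 95% margin of error are assumptions made for illustration.

```python
import math
import random


def reconcile(collected, processed, excepted):
    """Reconciliation: return the number of collected items that are neither
    in the processed set nor logged as processing exceptions."""
    return collected - (processed + excepted)


def sample_estimate(population, is_responsive, sample_size, seed=0):
    """Statistical sampling: draw a simple random sample, have a reviewer
    (modeled here by the is_responsive callback) code each document, and
    return the estimated responsive rate with an approximate 95% margin.

    The margin uses the normal approximation, which is an assumption and
    is unreliable for very small samples or rates near 0 or 1."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(population, sample_size)
    responsive = sum(1 for doc in sample if is_responsive(doc))
    rate = responsive / sample_size
    margin = 1.96 * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, margin


if __name__ == "__main__":
    # Hypothetical counts: 10,000 items collected, 9,950 processed,
    # 40 logged as exceptions, leaving 10 to track down.
    print(reconcile(10_000, 9_950, 40))

    # Hypothetical "not selected by search terms" set in which every
    # tenth document happens to be responsive.
    discard_pile = list(range(10_000))
    rate, margin = sample_estimate(discard_pile, lambda d: d % 10 == 0, 400)
    print(f"estimated responsive rate: {rate:.3f} +/- {margin:.3f}")
```

A non-zero result from the reconciliation, or a meaningful responsive rate in the documents the search terms left behind, is a signal that the processing or the terms need another look before production.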

The logical question is then whether the need for the number of attorney reviewers will greatly diminish as a result of enhancements in e-discovery software. The New York Times article indicates that not everyone agrees on the extent of the impact on the labor force of the legal profession. One commentator stated that advances in technology will reduce the number of jobs in the legal sector. A second commentator, however, stated that while technology may not adversely affect the unemployment rate, the concept of automation would negatively affect job growth and individuals’ abilities to identify better jobs. In other words, despite its title, the article did not conclusively determine that lawyers were in danger of losing their jobs en masse.

So what’s the takeaway? Certainly, computers have made undeniable advances as evidenced by Watson, the computer that recently defeated its human opponents on “Jeopardy,” the popular trivia quiz show. In response to that computer’s performance, however, “It’s elementary, my dear Watson” that technology will never completely replace lawyers. There is no doubt that e-discovery software makes lawyers more efficient and productive, but human knowledge, reaction and intuition as to facts, issues and nuances of legal theories make the role of the live attorney indispensable.