Using Technology and e-Disclosure
This is an Insight article, written by a selected partner as part of GAR's co-published content.
E-disclosure – or the production of electronically stored information (ESI) – is increasingly prevalent in international arbitration. More and more information is being transmitted and stored in electronic and technological form: contracts are exchanged and executed via email; key communications are sent via text message, Slack or Microsoft Teams; and hearings and high-stakes negotiations take place exclusively on Zoom. This has led arbitration practitioners to seek, and tribunals to order, substantial amounts of e-disclosure. The availability and prevalence of e-disclosure in international arbitration have been both confirmed and perhaps encouraged by the International Bar Association’s Rules on the Taking of Evidence in International Arbitration (the IBA Rules), which, starting in 2010, have made express reference to the production of ‘documents maintained in electronic form’.
E-disclosure and the production of ESI present both opportunities and challenges for international arbitration practitioners.
In terms of opportunities, production of ESI can encourage a more thorough search for and examination of relevant evidence in a case; ESI can be searched and managed more efficiently using available technologies; and it often has a reduced ecological impact compared with the mountains of paper documents used in years past.
In terms of challenges, ESI is generally found in significantly greater quantities than paper documents and is likely to exist in duplicate forms across multiple locations. It is more easily manipulated and modified from its original form by technology and can be very burdensome and costly to produce. As the rising costs of international arbitration continue to be a ‘hot topic’, the techniques that underpin e-disclosure will, and should, be increasingly scrutinised.
An additional challenge present in international arbitration arises from cultural differences in the evidentiary traditions of practitioners and tribunal members: those from the United States and the United Kingdom, where e-disclosure practices are more developed, may tend to engage more frequently in complex and extensive production of ESI, whereas those from the civil law, inquisitorial tradition may do so on a more limited scale. This can at times lead to disagreements and disequilibrium about the scope of production of ESI and the steps counsel must take to locate and produce responsive ESI.
As information becomes increasingly electronic, producing some amount of ESI is inevitable in most international arbitrations. The focus of this chapter is threefold:
- to summarise the current state of play in international arbitration on production of ESI;
- to identify potentially useful practices from jurisdictions where e-disclosure is particularly developed; and
- to identify and analyse helpful technological tools and strategies that international arbitration practitioners can use to produce ESI efficiently and effectively.
Current state of play on production of ESI
Unlike domestic courts, which are largely constrained by established evidentiary and procedural rules, international arbitral tribunals have wide discretion over the production of documents in international arbitration. For instance, under the UNCITRAL Rules, ‘[a]t any time during the arbitral proceedings the arbitral tribunal may require the parties to produce documents, exhibits or other evidence within such a period of time as the arbitral tribunal shall determine’. The ICSID Convention likewise provides that ‘[e]xcept as the parties otherwise agree, the Tribunal may, if it deems it necessary at any stage of the proceedings, (a) call upon the parties to produce documents or other evidence’. The 2021 ICC Rules provide that ‘[a]t any time during the proceedings, the arbitral tribunal may summon any party to provide additional evidence’.
As with document production generally, tribunals largely have wide discretion over the use of technology and broad-based e-disclosure during arbitral proceedings, and there has historically been limited guidance on production of ESI. This reflects a key difference between document production in arbitration and US discovery: the former typically occurs after the parties have submitted their first round of pleadings, rather than at the outset of the case. Therefore, by the time document production occurs in arbitration, parties have already submitted their primary evidence to the tribunal, and should, in principle, be less reliant on document production for material in support of their primary case than a party in US-style litigation. This, along with other cultural factors, has historically resulted in a smaller document universe at issue in international arbitration than in US-style litigation, and consequently less need for the efficiency gains and other benefits that technology can bring to the document production process.
More recently, the availability of technology and the proliferation of ESI have, in practice, led to significant increases in the volume of documents at issue in international arbitrations. As international arbitration counsel become more familiar with technological tools used in e-disclosure, those tools often become a normalised part of the document production process, which can contribute to ‘mission creep’ during document production. However, given the different function that document production serves in international arbitration, and other procedural differences between the practice of international arbitration and US-style litigation, it is important to keep in mind the specific function of document production in the case at hand when determining whether and how to utilise e-disclosure tools. Equally, other factors must be taken into account, such as the actual timescale in which document production must occur (where there remains an ongoing duty throughout the arbitration), the scope of the potential document universe at issue, and any relevant budgetary or counsel capacity constraints.
General arbitral practice has neglected full debate on, and engagement with, the topic of e-disclosure. However, the increasing relevance of ESI to any form of document production has led arbitral institutions to begin issuing some limited guidance about best practices for e-disclosure, including reasonable limits on its scope and the application of technological tools.
Leading arbitral institutions that have commented on the subject of e-disclosure and the concurrent increase in the volume of documents note that ‘the advent of electronic documents should not lead to any expansion of the traditional and prevailing approach to document production’ in international arbitration. These institutions, and the relevant commentary, encourage arbitration practitioners to always adopt efficient procedures that safeguard against the more costly and burdensome e-disclosure practices that have been adopted in jurisdictions such as the United States.
Increasingly, updated institutional rules, particularly those applicable to commercial arbitrations, have sought to address this as well. For example, the ICDR Procedures (as amended in 2021) include in their notable features that they ‘[a]llow tribunal[s] to manage the scope of document and electronic document requests and to manage, limit, or avoid U.S. litigation-style discovery practices’. The Commentary to the 2020 IBA Rules similarly notes: ‘Expansive American—or English—style discovery is generally inappropriate in international arbitration.’
Although e-discovery and the increased availability of more advanced review platforms can certainly lead to more efficiencies in arbitration, particularly during the document production phase, this should still be considered through the lens of arbitration and should not be seen as an invitation for more wide-ranging requests for electronic documents. Requests for electronic documents should remain limited in scope and tailored to the circumstances of the case.
For their part, the Rules on the Efficient Conduct of Proceedings in International Arbitration (the Prague Rules) seek to achieve this balance by setting out guidelines for the use of ESI and e-disclosure while taking an overall stance against document production. Pursuant to Article 4.2 of the Prague Rules, ‘[g]enerally, the arbitral tribunal and the parties are encouraged to avoid any form of document production, including e-disclosure’. Nevertheless, in cases where document production is necessary, Article 4.5 provides that a party may request ‘a specific document which: a. is relevant and material to the outcome of the case; b. is not in the public domain; and c. is in the possession of another party or within its power or control’. Under Article 4.7 of the Prague Rules, documents produced ‘shall be submitted or produced in photocopies and/or electronically’.
Under the IBA Rules, the relationship between the admissibility and assessment of evidence and e-disclosure can be seen particularly in the application of Article 9 of the IBA Rules to document production requests for ESI. Article 9 sets out various objections that a party may raise in response to a document production request, including:
- lack of relevance to the case or materiality to its outcome (Article 9(2)(a));
- legal impediment or privilege (Article 9(2)(b));
- unreasonable burden (Article 9(2)(c)); and
- considerations of procedural economy, proportionality, fairness or equality (Article 9(2)(g)).
If a party has the technology and e-disclosure tools (as discussed further below) potentially available to it, the relative salience of these objections may shift. For instance, if a document production request is directed at ESI and contains targeted search terms, time frames or custodians, there may be a greater likelihood of the request identifying material that is relevant and material to the outcome of the case.
As discussed further below, another benefit of deploying e-discovery technology is a potentially significant reduction in attorney review time, which also means reduced costs associated with the e-discovery phase and potentially a reduced overall burden on the producing party. At the same time, running searches on documents that are within your client’s possession, custody or control requires collecting that ESI from the relevant custodians, processing and uploading the ESI to an e-discovery platform and potentially hosting it for several months or even years, all of which can be costly. While it is increasingly standard for states and corporates alike to keep most, if not all, contemporary records electronically, this is not always a given, and even so, particular challenges can arise when a dispute involves events that occurred prior to the switch to electronic records.
Moreover, because the volume of ESI is typically greater than for hard copy documents, the document universe for these searches may be in the tens, if not hundreds, of thousands of documents. Accordingly, although e-disclosure can certainly improve the efficiency of document production, it also has the potential to dramatically increase the financial burden on both the producing party (when identifying documents for production) and the requesting party (when subsequently reviewing the produced documents).
This tension between e-disclosure and burden may also arise with documents that are legally privileged. Larger volumes of ESI can make it more difficult to review and accurately identify legally privileged documents, particularly because privileged content can sometimes be found in metadata or in comment bubbles or tracked changes, which are not always immediately visible on a document review platform. In addition, although review platforms offer various methods of applying partial or full-page redactions to ESI, the act of applying redactions is not without cost, as attorneys will often need to review all or a subset of redactions to ensure that they are being applied correctly, a process that can be both costly and time-consuming. Thus, the presence of ESI and the availability of e-disclosure may well encourage requesting parties to cast a wider net for documents than they otherwise would, increasing the burden on producing parties and accordingly warranting an objection under the IBA Rules. The relative benefits and drawbacks of e-discovery are thus something that parties and tribunals should both keep in mind when crafting and ruling on document production requests.
Overall, the aim of guidance offered by arbitral institutions is to avoid allowing e-disclosure to alter the key principles that make arbitration potentially attractive to businesses, namely, to provide more efficient and cost-effective means to resolve disputes.
E-disclosure practices from other jurisdictions
Although e-disclosure practice is relatively new to international arbitration, it is well developed in other jurisdictions – the United States and the United Kingdom, in particular, have developed numerous strategies for limiting the scope of production and ensuring equality of arms when e-disclosure is involved. Some of these practices address challenges that are unique to a discovery phase that takes place at the outset of a case before the parties have had a chance to narrow the issues in dispute, which, as noted above, would be less relevant to document production that takes place after the first round of pleadings in international arbitration. Others, however, address issues common to e-disclosure in the contexts of both domestic litigation and arbitration.
The scope of e-disclosure in the United States and the United Kingdom is generally far broader than that in international arbitrations. Nevertheless, both jurisdictions consider two factors in their proportionality analysis that are not expressly contemplated by the IBA Rules: namely, that e-disclosure be proportionate to (1) the overall importance of the case as a whole (including the amount in dispute) and (2) the financial resources of the producing party. The US Federal Rules of Civil Procedure, for example, require that any production of ESI be proportionate to, inter alia, ‘the importance of the issues at stake in the action’ and ‘the parties’ resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit’. The UK’s Practice Direction 57AD likewise permits broad-based e-disclosure only in ‘an exceptional case’ and considers, inter alia, ‘the importance of the case, including any non-monetary relief sought’ and ‘the financial position of each party’ when determining whether such discovery is appropriate.
To encourage efficiency and reduce later disputes, both the United States and the United Kingdom also require that counsel meet and confer at the outset of the case regarding how ESI will be produced. Similarly, both jurisdictions encourage parties to present to one another carefully crafted search terms to limit the review burden on the producing party. US courts have ordered parties to agree on search terms for production of ESI. The United Kingdom likewise encourages parties to consider limiting e-disclosure to only ‘documents responsive to specific keyword searches, or other automated searches’. Although, for the reasons stated above, keyword searches may be less appropriate for document production in international arbitration, in which documents must generally be both relevant to the case and material to its outcome, requesting parties may wish to include specific search terms in their document production requests as part of their description of the document or class of document being sought in a particular request. This step, which is expressly anticipated by the IBA Rules, can help to proactively reduce the producing party’s overall burden and thus can be a useful tool for mitigating the risk of an objection on grounds of undue burden.
Both jurisdictions also encourage the use of technology to lessen review burdens. The Sedona Principles, which are frequently cited by US federal courts, identify ‘the potential use of search technology and other methods of reducing the volume of ESI to be preserved or produced’ as a key topic to be discussed during pre-disclosure meetings. The United Kingdom, likewise, requires parties to discuss how to reduce the cost and burden of e-disclosure, including whether to use technology-assisted review (TAR) or coding strategies to reduce duplication. UK courts may also order parties to use software or analytical tools, de-duplication software or data sampling to reduce the burden of review. The suitability of these tools to document production in international arbitration is discussed further below.
Meet and confers about e-disclosure in the United States also involve discussions about the types of documents that counsel are obliged to identify, preserve and produce during a case. This could be particularly relevant to international arbitration, in which there is often variance in the domestic legal ethics requirements for counsel concerning e-disclosure, as this can produce disequilibrium concerning whether and to what extent counsel will feel obligated to produce ESI beyond emails (e.g., whether text messages or chat logs in applications such as Slack, Skype or Microsoft Teams must be preserved and produced).
Finally, the United States has devised several solutions for the increased risk of inadvertent disclosure of privileged information associated with e-disclosure addressed above.
First, US practitioners often agree at the outset that production of ESI made without an intent to waive privilege can be clawed back at the request of the producing party, thereby largely precluding the receiving party from claiming waiver because of inadvertent production of privileged information in e-disclosure. A claw-back agreement could also be used as a ‘sword’ in international arbitration against any party claiming undue burden under Article 9(2)(c) of the IBA Rules in respect of producing ESI that may contain privileged information.
Second, US practitioners also often use search terms or metadata to locate hidden privileged information within documents, and to auto-populate privilege logs for voluminous ESI populations. These tools are discussed further below.
In sum, although many aspects of e-disclosure in the United States and the United Kingdom are inapplicable to document production in international arbitration, practices such as more detailed proportionality analyses, initial meet-and-confers, robust claw-back agreements, and the use of technological solutions to reduce burden on the parties might be useful tools for arbitrations in which there are large volumes of ESI.
Technological solutions and strategies
As noted above, technological tools, such as review platforms and TAR, are often considered a way to lessen the burden of e-disclosure. As e-disclosure becomes an increasingly common feature of international arbitration, and as ESI continues to be generated in larger and larger quantities by clients, it is important for counsel to familiarise themselves with such tools, to conduct their reviews efficiently while still maintaining high standards.
The e-disclosure solutions on the market are generally similar in what they offer. A product, or database, hosts the data set provided by a party, including documents received from a client or productions received from opposing parties. After the data is ingested into the platform, the data processing phase begins and document metadata (such as date, custodian and file type) is extracted to the extent it is available and mapped onto applicable fields to facilitate searching.
At this stage, more powerful tools, such as de-duplication and email threading, can be applied, and the database begins to take shape. If documents are collected and processed in their native form, attachments can then be extracted and relationships between documents can be established. Thereafter, practitioners generally have access to a fully searchable database and are able to search for files using keywords or extracted metadata fields, such as the date. Counsel can then also create their own coding forms that deal with particular themes relevant to a case, as well as responsiveness and privilege or confidentiality issues, thereby creating an organisational structure within the database.
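By way of illustration only, the following Python sketch shows one simplified way in which de-duplication and email threading could work. It assumes that a 'duplicate' means identical text after normalising case and whitespace, and that emails belong to the same thread if their subject lines match once reply and forward prefixes are removed. The field names (id, custodian, subject, text) and function names are hypothetical; commercial platforms typically rely on richer signals, such as file hashes and email header metadata, and their exact methods vary by vendor.

```python
import hashlib
import re
from collections import defaultdict


def content_hash(text):
    """Hash the text after normalising case and whitespace (an assumed notion of 'duplicate')."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def deduplicate(documents):
    """Keep the first copy of each document and record every custodian who held a copy."""
    unique = {}
    for doc in documents:
        key = content_hash(doc["text"])
        if key not in unique:
            unique[key] = {**doc, "custodians": [doc["custodian"]]}
        elif doc["custodian"] not in unique[key]["custodians"]:
            unique[key]["custodians"].append(doc["custodian"])
    return list(unique.values())


# Matches one or more leading "RE:", "FW:" or "FWD:" prefixes, in any case.
_PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)


def thread_key(subject):
    """Strip reply/forward prefixes so messages in one conversation share a key."""
    return _PREFIX.sub("", subject).strip().lower()


def group_threads(emails):
    """Group emails by normalised subject line (a crude stand-in for real threading)."""
    threads = defaultdict(list)
    for email in emails:
        threads[thread_key(email["subject"])].append(email)
    return dict(threads)


# Hypothetical documents, for illustration only.
documents = [
    {"id": "DOC-1", "custodian": "Alice", "subject": "Draft SPA", "text": "Please see the draft."},
    {"id": "DOC-2", "custodian": "Bob", "subject": "RE: Draft SPA", "text": "Please  see the draft."},
    {"id": "DOC-3", "custodian": "Bob", "subject": "Fwd: RE: Draft SPA", "text": "Comments attached."},
]
unique_docs = deduplicate(documents)
threads = group_threads(unique_docs)
```

In this example, DOC-2 is treated as a duplicate of DOC-1 (while Bob is still recorded as a custodian of the surviving copy), and the two remaining documents are grouped into a single thread for review together.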
This section discusses considerations for making the decision to initiate a platform, some of the general pros and cons associated with these platforms, and finally the reliability of analytical tools that can be leveraged.
Initiating a technology solution
At the outset of an arbitration, it can be useful to set up a document review platform. Even if ESI volumes are not significant at this stage, loading available ESI onto a review platform allows practitioners to target the relevant documents more efficiently during initial fact development for the first round of submissions. It also ensures that the platform is set up and organised in advance of receiving additional volumes of ESI from a client during the initial submissions, and well in advance of any discovery phase where a party is required to produce ESI or may be in receipt of a production from opposing counsel that would most efficiently be reviewed on a platform. In assessing whether to deploy a document review platform, it will also be necessary to have some understanding of the client’s data, including how they store and access that data, what applications they use and their backup and archiving policies.
As soon as documents come into counsel’s possession, or even as early as once discussions have taken place with the client regarding the types and volumes of documents expected, the decision-making process on review platform vendors and databases can begin (if in-house solutions are not available). The choice of which platform to introduce is an important strategic consideration in any e-disclosure process and is quite case-specific. A checklist of considerations for selecting a platform – including the format and volume of data, the complexity of the review required and the client’s ability to pay review-related costs – is included at the end of this chapter.
Successful use of a review platform can depend on the client’s and practitioners’ understanding of how to use the platform and the e-disclosure process generally. It is vital that counsel and the support team attend training sessions and spend time familiarising themselves with the review functionalities available, as well as the process by which data is collected, processed and ingested into the platform. By having a better understanding of these processes, counsel can better instruct their clients on collection, facilitating a more focused document set that captures the most relevant information for the case. By familiarising themselves with the platform technology early on, they ensure they are deploying the most effective search and analytical tools it has to offer, and that they are making the necessary requests of their e-discovery project management team. Familiarity with the discovery process, ESI and the technology used to navigate it will help counsel teams to reduce costs and establish more efficient collection and review workflows.
The technology for document analysis has advanced substantially in recent years. An example of this is TAR, which is becoming more common as the quantities of ESI being collected make discovery more voluminous. TAR is:
[a] process for Prioritizing or Coding a Collection of Documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of Documents and then extrapolates those judgments to the remaining Document Collection.
There are two general types of TAR, the first of which is traditionally referred to as TAR 1.0 or predictive coding, where counsel review a sample (or ‘seed’) set of documents for responsiveness determinations, and those determinations are then fed back into a computer algorithm. Using that responsiveness coding, the algorithm then makes responsiveness determinations on additional documents, which are then again reviewed by counsel in an iterative process until the system is accurately predicting responsive documents. TAR 2.0, or continuous active learning, requires no seed set of documents, but rather integrates the review team’s coding of documents on a continuous basis, and then feeds them what it predicts are responsive documents based on their coding.
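The continuous active learning workflow described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of how any commercial TAR product works: the scoring model (an average log-ratio of word counts from documents coded so far), the function names and the sample documents are all hypothetical, and the reviewer is simulated by a simple function standing in for counsel's judgment. In practice a review would usually stop before every document has been coded; the sketch runs to completion only to show the order in which documents are surfaced.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def score(text, responsive_counts, other_counts):
    """Average log-ratio of each word's frequency in responsive versus other coded documents."""
    words = tokens(text)
    if not words:
        return 0.0
    total = 0.0
    for word in words:
        total += math.log((responsive_counts[word] + 1) / (other_counts[word] + 1))
    return total / len(words)


def continuous_active_learning(collection, reviewer, batch_size=2):
    """Repeatedly give reviewers the uncoded documents currently ranked most likely responsive."""
    responsive_counts, other_counts = Counter(), Counter()
    coded = {}          # document id -> reviewer's responsiveness decision
    review_order = []   # order in which documents were put before the reviewer
    while len(coded) < len(collection):
        uncoded = [doc_id for doc_id in collection if doc_id not in coded]
        uncoded.sort(
            key=lambda doc_id: score(collection[doc_id], responsive_counts, other_counts),
            reverse=True,
        )
        for doc_id in uncoded[:batch_size]:
            decision = reviewer(collection[doc_id])
            coded[doc_id] = decision
            review_order.append(doc_id)
            # Each coding decision immediately updates the model used for the next ranking.
            target = responsive_counts if decision else other_counts
            target.update(tokens(collection[doc_id]))
    return review_order, coded


# Hypothetical collection; the lambda stands in for counsel's responsiveness call.
collection = {
    "D1": "lunch menu friday",
    "D2": "pipeline delay penalty notice",
    "D3": "office party friday",
    "D4": "notice of pipeline delay",
    "D5": "penalty for pipeline delay",
    "D6": "parking permit renewal",
}
review_order, coded = continuous_active_learning(collection, lambda text: "pipeline" in text)
```

After the first batch has been coded, the remaining responsive documents (D4 and D5) are ranked ahead of the unrelated ones, which is the prioritisation effect that continuous active learning is intended to achieve.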
In addition to discussing this with their client, counsel that do opt to use TAR to assist in the document production phase may wish to consider engaging with the opposing party on a proposal to use TAR prior to undertaking the disclosure process, particularly if the parties reasonably anticipate large document universes being involved on both sides. This may involve a procedural conference to agree on a protocol addressing issues such as how the technology would be set up to review documents, how the party would verify the accuracy of the search process, and when a party can stop the review process. Otherwise, and depending on the applicable law, a party unilaterally adopting TAR without engaging with the other party at the outset may risk being ordered to redo the review manually if the lack of transparency means that the other party cannot be confident that the review is accurate and comprehensive. As a general practice point, counsel throughout any document production process should keep records of the steps taken, in particular with ESI, to have a defensible record should they be challenged about the quantity and quality of the production.
Pros and cons
As discussed above, most international arbitrations involve some form of ESI and, therefore, may benefit from the use of a document review platform to store and manage this data. Operations such as applying keyword searches and date limiters during the processing phase can assist counsel with understanding and culling the initial data set before it is moved onto a review database, which ultimately reduces platform hosting costs and ensures that the document set on the platform is as relevant as possible.
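As a rough sketch of the kind of culling described above, the following Python example filters a document set by date range, keyword and (optionally) custodian before it is loaded onto a review database. The field names, function name and sample documents are hypothetical, and the keyword test is a simple case-insensitive substring match; review platforms typically offer more sophisticated search syntax.

```python
from datetime import date


def cull(documents, keywords, start, end, custodians=None):
    """Return documents dated within [start, end] that mention at least one keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for doc in documents:
        if not (start <= doc["date"] <= end):
            continue  # outside the agreed time frame
        if custodians is not None and doc["custodian"] not in custodians:
            continue  # not one of the agreed custodians
        text = doc["text"].lower()
        if any(keyword in text for keyword in lowered):
            kept.append(doc)
    return kept


# Hypothetical documents, for illustration only.
documents = [
    {"id": "A", "custodian": "Alice", "date": date(2019, 3, 1), "text": "Notice of delay under the EPC contract"},
    {"id": "B", "custodian": "Bob", "date": date(2017, 6, 5), "text": "Delay in delivery"},
    {"id": "C", "custodian": "Alice", "date": date(2019, 8, 9), "text": "Team lunch on Friday"},
    {"id": "D", "custodian": "Carol", "date": date(2020, 1, 15), "text": "Liquidated damages calculation"},
]
kept = cull(documents, ["delay", "liquidated damages"], date(2018, 1, 1), date(2020, 12, 31))
kept_alice = cull(documents, ["delay"], date(2018, 1, 1), date(2020, 12, 31), custodians={"Alice"})
```

Here only documents A and D survive the first cull: B falls outside the date range and C contains none of the keywords. Restricting the search to a single custodian narrows the set further.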
Once the data set is online, arbitrations that involve large volumes of ESI, complex document sets or significant time constraints might benefit from additional technology solutions, such as TAR or communications mapping, to identify potentially responsive documents. Where data sets are particularly large, tools such as TAR can allow counsel to prioritise the review of documents that are most likely to be responsive to ensure a more efficient and targeted review.
The benefit of standard tools available within any e-discovery platform is that they can minimise recurring obstacles that arise with paper-only review or with saving ESI onto local drives. For example, de-duplication tools available on most review platforms can quickly eliminate duplicate documents that counsel might otherwise spend substantial time reviewing and re-reviewing. Review platforms also provide opportunities for greater collaboration and communication between teams, as reviewers can generally make responsiveness and privilege determinations and provide comments about documents directly within the review platform itself, which other team members can later consult as needed. Tools such as email threading and other analytics that help demonstrate patterns in the document set can be further utilised to increase efficiencies and save costs.
However, there are some drawbacks with this technology. First, these tools are only as good as their users. Historically, counsel have had a distrust of relying on technology in lieu of manual processes, and it can be difficult to convince counsel teams to embrace new technological solutions, particularly in disputes where time is often of the essence and they are unfamiliar with its benefits. If counsel teams do not take the time to learn and master these technological tools, they may become more of a hindrance and financial burden than a help during the course of an arbitration.
As noted above, the fact that these tools facilitate the collection, processing and review of electronic documents can also be a double-edged sword. Because ESI is easier to store and back up, it tends to be more voluminous than paper documents, and is often hosted in multiple locations within a company’s data storage system, across multiple custodians, and even across multiple jurisdictions. Although the tools discussed above can facilitate a review of a commensurately larger universe of documents, this can come at a high cost in vendor fees, which can vary depending on the size of the data set. These fees can become unpredictable when counsel is grappling with a large and diffuse ESI data set, and they must be closely managed to avoid unpleasant surprises. The costs associated with such e-disclosure techniques are likely to attract further attention as scrutiny of costs becomes increasingly prevalent in international arbitration.
Nevertheless, counsel should avoid short-term thinking about the costs of e-disclosure solutions. Although initial ingestion and processing costs for ESI in review platforms can seem high, and the amount of time needed to set up platforms and train TAR processes may seem daunting, they may ultimately pale in comparison to the costs attributable to attorney review time. One study has shown that in large-volume cases, review-related activities accounted for 73 per cent of the total cost of e-disclosure, whereas 19 per cent was associated with processing and only 8 per cent with collection. Taking advantage of culling tools offered by review platforms such as de-duplication and email threading, as well as more advanced TAR software, allows counsel to home in on the relevant documents more quickly, significantly reducing the size of the data to be reviewed and, therefore, the amount of time and money that needs to be spent on review.
Accuracy and predictability
Practitioners are often concerned that analytical tools such as TAR will introduce substantial error into a review process, requiring much attorney time to remedy. This fear is largely misplaced. There is strong evidence to support the position that employing analytical tools such as TAR ‘yield[s] higher recall and/or precision than an exhaustive manual review process’ and with much lower effort, debunking the notion that manual review of large sets of data is the only way to ensure accurate results. Studies have found that TAR software outperforms human review in accurately identifying responsive documents, which ultimately reduces the number of documents an attorney must then review to only a fraction of what is in the collection, leading to obvious benefits in saving costs and time, as well as greater accuracy in the human side of review. These are some of the reasons that, as noted above, UK and US practice encourages the application of TAR.
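For readers less familiar with the two measures mentioned in the quotation, recall is the proportion of truly responsive documents that a review process identifies, and precision is the proportion of documents identified as responsive that are in fact responsive. The short Python sketch below computes both for a hypothetical set of six documents; the function name and the data are illustrative assumptions, and the 'actual' coding stands in for a trusted benchmark such as a senior lawyer's determination.

```python
def precision_recall(predicted, actual):
    """Compare a review's responsiveness calls against a trusted benchmark coding."""
    true_positives = sum(1 for doc_id in predicted if predicted[doc_id] and actual[doc_id])
    flagged = sum(1 for doc_id in predicted if predicted[doc_id])
    responsive = sum(1 for doc_id in predicted if actual[doc_id])
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / responsive if responsive else 0.0
    return precision, recall


# Hypothetical coding: True means 'responsive'.
predicted = {"D1": True, "D2": True, "D3": True, "D4": False, "D5": False, "D6": False}
actual = {"D1": True, "D2": True, "D3": False, "D4": True, "D5": False, "D6": True}
precision, recall = precision_recall(predicted, actual)
```

In this example the review flags three documents, two of which are truly responsive (precision of two-thirds), but finds only two of the four responsive documents in the set (recall of one-half).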
Of course, these tools are not 100 per cent accurate. Rigorous quality control by counsel is important when using tools such as TAR to ensure accuracy in document production. Best practice would be to have sample sets of documents coded by TAR systems reviewed by senior members of the team who are most familiar with the issues in the case and the nuances of the document production requests. Counsel should also check both responsive and non-responsive documents coded by software throughout the course of the iterative review phase and should provide feedback to the vendor when there are significant and recurring errors. Interfacing with TAR in this way can reduce the burden (and inaccuracy) of counsel reviewing large volumes of ESI themselves, while ensuring that counsel have strategic input and the last word on any document production.
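One simple way to implement the sample-based quality control described above is to draw a random sample of software-coded documents for senior review and measure how often the senior reviewer disagrees. The Python sketch below is a minimal illustration under assumed names and data: the functions, the document identifiers and the fixed random seed (used so that the sample can be reproduced and documented) are all hypothetical, and no particular sample size or acceptable error rate is suggested.

```python
import random


def draw_qc_sample(machine_coding, sample_size, seed=0):
    """Draw a reproducible random sample of document ids for senior review."""
    rng = random.Random(seed)
    doc_ids = sorted(machine_coding)
    return rng.sample(doc_ids, min(sample_size, len(doc_ids)))


def overturn_rate(sample_ids, machine_coding, senior_coding):
    """Share of sampled documents where the senior reviewer disagreed with the software."""
    if not sample_ids:
        return 0.0
    overturned = sum(1 for doc_id in sample_ids if machine_coding[doc_id] != senior_coding[doc_id])
    return overturned / len(sample_ids)


# Hypothetical software coding of 100 documents (True means 'responsive').
machine_coding = {f"DOC-{i:03d}": i % 2 == 0 for i in range(100)}
sample_ids = draw_qc_sample(machine_coding, sample_size=10, seed=7)

# Hypothetical senior review of a four-document sample, with one disagreement.
example_rate = overturn_rate(
    ["a", "b", "c", "d"],
    {"a": True, "b": True, "c": False, "d": False},
    {"a": True, "b": False, "c": False, "d": False},
)
```

Because the sample covers both responsive and non-responsive calls, a rising overturn rate in either direction can be reported to the vendor as a sign that the software needs further training.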
Further, when deciding whether to use TAR, counsel should be aware of its more general limitations. Because the current technology relies on text to learn what documents might be responsive, TAR does not work on non-textual documents such as images or audiovisual material. It will also not work if a legible electronic version of a document cannot be generated, or on a population of documents that are in multiple languages. Further, for TAR to train itself, it needs to have a substantial population of responsive documents to learn from. If a population of documents does not contain enough responsive documents (i.e., ‘low-richness’ populations), then TAR may be able to learn what is not responsive but may not be able to identify what is responsive.
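One way to test for the low-richness problem before committing to TAR is to code a random sample by hand and estimate the proportion of responsive documents. The sketch below assumes such a sample is available. The threshold of 1 per cent is purely hypothetical: no fixed figure is given above, and what counts as 'enough' responsive documents will depend on the software and the provider's advice.

```python
# Hypothetical cut-off for illustration only; not an industry standard.
LOW_RICHNESS_THRESHOLD = 0.01


def estimate_richness(sample_codes: list) -> float:
    """Proportion of a hand-coded random sample that was marked responsive.
    Each entry is True (responsive) or False (non-responsive)."""
    if not sample_codes:
        raise ValueError("sample must contain at least one coded document")
    return sum(sample_codes) / len(sample_codes)


# Hypothetical sample: 500 documents reviewed by hand, 4 found responsive.
sample_codes = [True] * 4 + [False] * 496
population_size = 200_000  # hypothetical size of the full collection

richness = estimate_richness(sample_codes)
expected_responsive = round(richness * population_size)
is_low_richness = richness < LOW_RICHNESS_THRESHOLD

print(f"estimated richness {richness:.1%}, "
      f"roughly {expected_responsive} responsive documents expected, "
      f"low richness: {is_low_richness}")
```

A small sample gives only a rough estimate, so a result close to the threshold is a prompt to sample further or to discuss alternative workflows with the provider, not a reason to rule TAR out.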
The following checklist of questions can assist in determining the need for a database and which solution to choose.
Sources and format of ESI
- Where are the documents currently located?
- Who or what are the potential sources (the custodians) of the data?
- Is the ESI collected in its original native format (i.e., the format used by the application that created the document)?
- Are there hard-copy documents that are physically stored, or that have been previously scanned into a new electronic document?
- Is there ESI that requires certain software to view it, or that contains other complexities such as non-searchable text or uncommon file formats?
- Does the ESI extend to communications across social media and mobile communications (e.g., WhatsApp, LinkedIn, Facebook)?
- Are foreign language documents anticipated?
- Does the client have an in-house litigation specialist who is familiar with preservation practices and managing forensic data collection?
- Will the provider need to assist with the collection process, either remotely or on-site?
- What are the potential locations of ESI (i.e., is the data domestic or abroad?) and are there any data protection laws that need to be taken into consideration for the jurisdiction you are collecting from?
- What is the volume of ESI that is expected to be collected, reviewed and produced?
Case analysis and platform capabilities
- How can early case assessment tools assist before documents are promoted to a review database?
- What types of keywords or date ranges might be required to limit the scope of review?
- What kinds of analytical tools are anticipated; for example, will optical character recognition and de-duplication suffice, or will more advanced features, such as email threading and TAR, be required?
- Do the procedural rules of the case require that documents be produced in a certain format or any other specifications regarding production?
- Do the procedural rules set out any requirements about the processing of data and the obligations of parties and tribunals to protect that data?
- Is there a preference for a server or cloud-based platform?
Project management and contract attorney needs
- Does your firm have in-house e-disclosure specialists that can assist with questions from the team regarding crafting searches, for example, and who are equipped to address any potential issues that arise with the document set?
- Is the preference to have the e-disclosure provider staff a project manager who can be available to assist with any questions from the team?
- Is counsel experienced with the e-disclosure platform, or will they require significant training?
- Are there enough resources available to conduct a document review, or will you need to engage with third-party contract attorneys?
- If you are using contract attorneys, what are the protocols in place to protect data?
- Does the client have an existing relationship or reduced rates with a particular provider?
- Does the client have an in-house platform it already uses?
- What is expected to be the general level of participation by the client during the e-disclosure phase?
- Are there any particularly sensitive documents that may require additional restrictions within the database?
- What is the value of the case?
- What is the budget for the e-disclosure phase?
- Does that budget include anticipated translation costs?
- Is there hard-copy documentation?
- Have costs for unitisation and objective coding been factored in?
 Julia Sherman and Himmy Lui are associates, Kelly Renehan is a case manager and Anish Patel is director of practice at Three Crowns LLP.
 ‘eDiscovery’ is generally considered a logical extension of the well-established discovery process for electronically stored information (ESI) that an organisation might possess, including email messages, voicemails, presentations, word processing files, spreadsheets, tweets, Facebook posts and all other relevant communication or information that might be useful in a legal action. See Osterman Research, ‘Why eDiscovery Should be a Top Priority for Your Organization’, October 2013, available at www.legal500.com/wp-content/uploads/assets/legal500/images/sponsors/HP_Why_eDiscovery.pdf.
 International Bar Association, Rules on the Taking of Evidence (IBA Rules), 2010, Article 3(a)(ii); see also IBA Rules, 2020, Article 3(a)(ii). Unless specified, all references to the IBA Rules refer to the 2020 version.
 Arbitration Rules established by the United Nations Commission on International Trade Law (UNCITRAL Rules) (latest version adopted in 2021).
 UNCITRAL Rules, Article 27(3). Article 27(4) further provides that ‘[t]he arbitral tribunal shall determine the admissibility, relevance, materiality and weight of the evidence offered’.
 Convention on the Settlement of Investment Disputes between States and Nationals of Other States established by the International Centre for Settlement of Investment Disputes in 2006 (ICSID Convention).
 ICSID Convention, Article 43.
 Arbitration Rules established by the International Chamber of Commerce (ICC Rules) (latest version adopted in 2021).
 ICC Rules, Article 25(4).
 Of course, there are international arbitrations that have involved the production of tens, if not hundreds, of thousands of documents. The Bilcon v. Canada arbitration, for example, involved a protracted document production phase resulting in the review of 75,000 documents and the production of 50,000.
 E Shirlow, ‘E-Discovery in Investment Treaty Arbitration: Practice, Procedures, Challenges and Opportunities’, 11 Journal of International Dispute Settlement (2020), 561.
 ICC Commission Report, ‘Managing E-Document Production’, July 2016, p. 2.
 See, e.g., ‘Commentary on the revised text of the 2020 IBA Rules on the Taking of Evidence in International Arbitration’, January 2021 (Commentary to 2020 IBA Rules), p. 8. These requirements are intended to ensure that production in international arbitration does not become a ‘fishing expedition’ for documents from which a party might attempt to construct a claim that is otherwise speculative. See id., p. 9.
 International Dispute Resolution Procedures established by the International Centre for Dispute Resolution (ICDR Procedures) (latest version effective as of 1 March 2021).
 2021 ICDR Procedures, p. 8. See also id., Article 24(6).
 Commentary on the revised text of the 2020 IBA Rules on the Taking of Evidence in International Arbitration (Commentary to 2020 IBA Rules), p. 8.
 ICC Commission Report, ‘Managing E-Document Production’, July 2016, p. 3.
 See Commentary to 2020 IBA Rules, pp. 6, 10.
 See Shirlow, op. cit., 550.
 See below for a further discussion of the pros and cons of technological solutions to e-disclosure.
 Metadata typically is embedded information about an electronic document including, for example, the date and time a file was created or modified or the author, date and time an email was sent. E-Discovery Glossary, available at https://uk.practicallaw.thomsonreuters.com/6-617-8070.
 See, e.g., Shirlow, op. cit., pp. 579–80; D R Rizzolo, ‘Legal Privilege and the High Cost of Electronic Discovery in the United States: Should We Be Thinking Like Lawyers?’, 6 Digital Evidence and Electronic Signature Law Review (2009).
 US Federal Rules of Civil Procedure (FRCP), Rule 26(b)(1).
 Practice Direction 57AD, Article 8.3, Model E. This Practice Direction applies to cases in the Business and Property Courts of England and Wales.
 id., Article 6.4.
 See, e.g., Romero v. Allstate Ins. Co., 271 F.R.D. 96, 109–10 (E.D. Pa. 2010) (ordering the parties to confer and come to agreement on future search terms, custodians, date ranges and other essentials to a search methodology).
 Practice Direction 57AD, Article 9.6(1)(e).
 IBA Rules 2020, Article 3(a)(ii) (providing that ‘in the case of Documents maintained in electronic form, the requesting Party may, or the Arbitral Tribunal may order that it shall be required to, identify specific files, search terms, individuals or other means of searching for such Documents in an efficient and economical manner’).
 Best Practices, Recommendations & Principles for Addressing Electronic Document Production (published by The Sedona Conference, currently in its third edition) (Sedona Principles).
 Sedona Principles, Commentary to Principle 3.
 Also referred to as predictive coding, computer-assisted review, or supervised machine learning, technology-assisted review (TAR) is ‘a review process in which humans work with software (“computer”) to train it to identify relevant documents. The process consists of several steps, including collection and analysis of documents, training the computer using software, quality control and testing, and validation.’ Technology Assisted Review (TAR) Guidelines, Duke Law, January 2019, p. 1, available at https://edrm.net/wp-content/uploads/2019/02/TAR-Guidelines-Final.pdf (footnotes omitted).
 Practice Direction 57AD, Article 9.6(3)(a) and (b).
 De-duplication is ‘[a] process to identify and segregate files that possess the same digital fingerprint . . . [and which] reduces the number of documents for lawyer review because it removes redundant documents from the document review process’. E-Discovery Glossary available at https://uk.practicallaw.thomsonreuters.com/6-617-8070. Options for near de-duplication are also available where documents that are almost identical can be grouped together to reduce manual review costs and efforts. ibid.
 Practice Direction 57AD, Article 9.7.
 FRCP, Rule 26(f)(3)(C); see also Sedona Principles, Commentary to Principle 3.
 For more on the effects of varying domestic ethical requirements for arbitration counsel, see Jan Paulsson, ‘Standards of Conduct for Counsel in International Arbitration’, American Review of International Arbitration, Vol. 3, Nos. 1–4, December 1992.
 See Committee Notes on Rules – 2006 Amendment – FRCP, Rule 26(f).
 See Sedona Principles, Comment 10.g.
 Osterman Research, ‘Why eDiscovery Should be a Top Priority for Your Organization’, October 2013, pp. 1–3, available at https://www.legal500.com/wp-content/uploads/assets/legal500/images/sponsors/HP_Why_eDiscovery.pdf.
 There separately exist ‘end-to-end’ solutions that would assist parties and a tribunal throughout the case; see, e.g., Protocol for Online Case Management in International Arbitration, paragraphs 47–50, at https://sites-herbertsmithfreehills.vuturevx.com/20/21553/landing-pages/platforms-protocol---wg-on-legaltech-in-arbitration---november-2020.pdf.
 Databases are e-discovery platforms either provided by third-party vendors or managed on-premises by legal specialists as part of an overall document management scheme. The in-house route is one way by which firms may attempt to reduce costs, particularly vendor costs, associated with an e-discovery platform, but this often requires a substantial investment in software/hardware, as well as specialist staff to manage the platform. In regard to third-party vendors, the past decade has seen a proliferation of companies that have entered the market offering e-disclosure solutions, such that e-disclosure specialists have received their own legal directory recognition; see, e.g., https://chambers.com/legal-rankings/ediscovery-uk-wide-58:2817:11805:1.
 During the data processing phase, searchable text can also be generated for documents such as paper files that were scanned to image, via the optical character recognition (OCR) process, to facilitate full-text search capabilities. See EDRM, Production Guide, 4 November 2010, available at https://edrm.net/resources/frameworks-and-standards/edrm-model/production/. OCR is defined as ‘[t]he process of generating a searchable text file that contains the content of the original document. . . . OCR technology is used to make searchable both scanned paper documents and non-searchable ESI’. OCR technology converts letters, numbers and other characters from image files, including scanned documents and unreadable PDFs, into searchable text data. E-Discovery Glossary available at https://uk.practicallaw.thomsonreuters.com/6-617-8070.
 See footnote 33, above.
 Email threading is the process by which email relationships are identified, including threads and duplicate emails, and grouped together so that email exchanges can be reviewed in a logical way. This can significantly reduce review time as it allows counsel to avoid reviewing the same emails repeatedly as well as the likelihood of inconsistent coding across the same email chains. See ‘Email Threading 101: An Introduction to an Essential e-disclosure Tool’, 19 April 2017, available at www.relativity.com/blog/email-threading-101-an-introduction-to-an-essential-e-discovery-tool/.
 See footnote 69, below.
 During the ingestion and processing phase, data sets tend to expand in terms of ‘megabytes’ and ‘gigabytes’ because of attachments, embedded files and the like. This can make cost estimates and the size of ESI collection and hosting somewhat unpredictable.
 This is often referred to as a parent–child relationship or a document family. These relationships can exist with documents such as emails and their attachments, zip files, word processing files and embedded files. It is important to take into consideration document families during the review process, as one family member’s responsiveness or privilege status might affect counsel’s decision to produce other members of the same family. See ‘Glossary: Parent-Child Relationship’, at https://us.practicallaw.thomsonreuters.com/0-521-0521.
 Many e-disclosure providers are now offering low cost ‘self-service solutions’, which can aid this process. These sorts of platforms typically incur lower costs as they are self-managed and cloud-based, reducing project management and hosting costs, and can be particularly useful during the early case assessment stage. See, e.g., www.logikcull.com/use-cases/early-case-assessment.
 In the ordinary course, counsel make initial document requests of their clients. These initial requests might be informative to the initial case analysis, but in most instances will not scratch the surface of the document repository. Usually, only once a case moves towards an arbitration being launched, and post-launch, will the true extent of the document set become visible. This set of documents will, in all likelihood, further increase as a document production phase begins and concludes.
 See ‘Demystifying Ediscovery Production Formats’, at www.nextpoint.com/ediscovery-blog/ediscovery-production-formats-explained/. Understanding how your client’s information is stored will also assist with instructing them on collection and preservation of ESI. It may also assist with selection of an e-discovery provider at an early stage. For example, if a client has a large volume of hard copy documents that require scanning and manual coding, or a substantial volume of documents that will require third-party contract attorneys to complete the review, you may choose a provider that can also manage those work streams. See also footnote 70, below.
 See ‘Considerations When Selecting an E-Discovery Vendor Checklist’, at https://uk.practicallaw.thomsonreuters.com/4-520-7423.
 M Grossman and Gordon V Cormack, ‘The Grossman-Cormack Glossary of Technology Assisted Review’, 7 Fed. Courts L. Rev. 1 (2013), p. 32. Litigation (or arbitration) support companies typically offer varieties of TAR software and workflows that ‘train’ the algorithms supporting the product, which in turn uses that information to code the unreviewed documents. Platforms continue to develop increasingly intelligent applications in this field, such as Brainspace 6, a form of augmented intelligence technology that introduces continuous multimodal learning. See Brainspace, ‘Continuous Multimodal Learning – Whitepaper’, at www.brainspace.com/documents/BRS-CMML-WHITEPAPER.pdf.
 See K Khamsi, ‘Compliance with document production orders: Traditional paradigm and new questions’, in B Cremades and P Peterson (eds), Rethinking the Paradigms of International Arbitration (ICC, 2023), pp. 90–91.
 T Trew, ‘Ethical Obligations in Technology Assisted Review’, American Bar Association, 7 December 2020, www.americanbar.org/groups/litigation/committees/professional-liability/practice/2020/ethical-obligations-in-technology-assisted-review/.
 See for example, Triumph Controls UK Ltd v. Primus International Holding Co  EWHC 176 (TCC). In this case, Triumph manually reviewed a subset of responsive documents using searches assisted by TAR. A sampling exercise carried out on the remaining documents indicated that a very small percentage would be relevant, so Triumph decided that it would be disproportionate to conduct a manual review on that set. Triumph had not previously discussed this approach with opposing counsel, nor had it disclosed it on a discovery questionnaire. Primus made an application to the court, and the judge agreed that ‘both the [TAR] exercise, and the sampling exercise that it produced, cannot be described as transparent, and cannot be said to be independently verifiable’ and concluded that ‘steps taken by [Triumph] in relation to the balance of the [remaining] documents have not been adequate’. The judge ordered that the parties agree on a methodology by which to manually review a sample of the remaining documents, within a limited time frame. See Fenwick Elliott legal briefing on the case, at www.fenwickelliott.com/research-insight/newsletters/legal-briefing/2018/3. In contrast, under Rule 26 of the FRCP, a producing party does not necessarily have to coordinate with the other party prior to using TAR, as it is generally accepted that ‘a producing party has the right in the first instance to decide how it will produce its documents during discovery’ and, accordingly, ‘that a party can use TAR so long as its use its transparent and timely disclosed’. However, if provisions concerning the use of TAR have been included in an electronic discovery protocol agreed by the parties and memorialised in a court order, the parties will be required to abide by those provisions. See In re Valsartan, Losartan, and Irbesartan Prods. Liab. Litig., 337 F.R.D. 610, 616 (D. N.J. 2020).
 Another standard culling procedure that occurs during this initial processing phase is de-NISTing, which can also significantly reduce data sets and, therefore, the amount of irrelevant ESI, as well as hosting and review costs. De-NISTing is ‘the removal of system files, program files, and other non-user created data from [ESI]’. E-Discovery Glossary, at https://uk.practicallaw.thomsonreuters.com/6-617-8070. These sorts of system files ‘can be numerous and voluminous. . . . Often, more than half of the data captured in a hard drive image is system and software files’. Matthew Verga, ‘Demystifying De-NISTing’, Relativity Blog, 15 January 2016, available at www.relativity.com/blog/demystifying-de-nisting/.
 See ‘Database checklist’, below.
 See footnote 52, above.
 See ‘Technology Resources for Arbitration Practitioners – Document collection, review and production’, at www.ibanet.org/technology-resources-for-arbitration-documents.
 See ‘The True Cost of ediscovery’, 17 November 2009, at www.cmswire.com/cms/enterprise-cms/the-true-cost-of-ediscovery-006060.php (‘Manual review is usually the most expensive aspect of discovery.’); see also C Malinvaud, ‘Will Electronic Evidence and e-disclosure Change the Face of Arbitration’, in T Giovannini and Alexis Mourre (eds), Written Evidence and Discovery in International Arbitration: New Issues and Tendencies (Kluwer, 2009), p. 378 (‘The preponderance of costs involved [in disclosure] therefore relate to the need for a review of the documents to be carried out – by both the producing and receiving party.’).
 ‘Technology-assisted review models and investigative features explained’, Epiq, at www.epiqglobal.com/epiq/media/thinking/ediscovery/tar-models-investigative-features-explained.pdf.
 See ‘Why ediscovery should be a top priority for your organisation’, October 2013, at www.legal500.com/wp-content/uploads/assets/legal500/images/sponsors/HP_Why_eDiscovery.pdf.
 Shirlow, op. cit., 578, referring to Glencore v. Bolivia; see also C Malinvaud, op. cit., p. 378 (‘the availability of search tools can mean that ESI can be cheaper to disclose than paper data (assuming the data is readily accessible and amenable to being searched)’).
 M Grossman and G Cormack, ‘Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient than Exhaustive Manual Review’, 17(3) Richmond Journal of Law and Technology 1 (2011), pp. 3, 48.
 See ‘Myths and facts about technology-assisted review’, at https://legal.thomsonreuters.com/en/insights/articles/myths-and-facts-about-technology-assisted-review. Although the technology available in many platforms can significantly aid and reduce attorney review time, there will always be some level of human review involved. ‘Human review and software programming are vital to the success of TAR . . . [t]he technology is augmenting our own human abilities . . . TAR supports human reviewers, who now only need to review a fraction of the documents collected as opposed to the entire collection.’ ibid.
 See Practice Direction 57AD, Article 9.6(3)(a): ‘[T]he court may give directions, on the following matters with a view to reducing the burden and cost of the disclosure exercise— . . . the use of—software or analytical tools, including technology assisted review software and techniques.’ See also Sedona Principle 10, which recommends counsel use TAR for privilege review.
 See K Khamsi, op. cit., p. 92.
 See also ‘Considerations When Selecting an E-Discovery Vendor Checklist’, at https://uk.practicallaw.thomsonreuters.com/4-520-7423, and Annex 3 in the Protocol for Online Case Management in International Arbitration referred to in footnote 40, above.
 Once scanned in, hard-copy files collect new metadata that does not accurately reflect the information corresponding to the document, such as the actual date of the document. This can create complexities in crafting searches for review and keyword searchability, particularly if the scans are of poor quality. Scanned documents may require further time and monetary investment, as bulk scans could require unitisation (the process whereby a single scan of multiple documents is broken up into individual documents). Scanned documents may also require objective coding (the process of collecting and applying key metadata fields that help identify the documents (e.g., date and author)) to ensure the database is organised as efficiently as possible.
 Globalisation means that, increasingly, foreign language documents are included in businesses’ data stores, which can add complexities for the e-discovery phase. Many vendors now offer functionalities such as specialist translation plug-ins and can prepare other technological workflows to assist with identifying and managing foreign language data. Text analytics can be used early on to identify foreign language documents at the beginning of a review process, and therefore review of those documents can be conducted in parallel with documents in the language of the arbitration, allowing for additional time and cost efficiencies. Some vendors are also able to provide more comprehensive services as they either offer their services entirely in that language or have partners or in-house solutions to assist with translating documents. These factors should all be taken into consideration when deciding which database to use. See John Del Piero, ‘3 Tips for Navigating the World of Foreign Language Data’, 22 July 2016, at www.relativity.com/blog/3-tips-for-navigating-the-world-of-foreign-language-data/.
 It is important that if counsel is intending to produce metadata, whoever collects the data takes necessary steps to preserve the original metadata.
 Many established providers have in-house experts who can assist with collection of ESI.
 Counsel might consider a cost/benefit analysis of number of documents versus time/costs of a database versus number of attorneys staffed on a matter.
 Some providers are now offering both options. There are advantages to a cloud-based platform, such as scalability and potentially lower costs, and while server solutions offer more control over data security, they are more susceptible to being hacked. Cloud-based solutions have more advanced security measures and are continuously updated with the latest software and security patches. See LogikCull, ‘To the Cloud! The Trend Toward Cloud-Based eDiscovery’, at www.logikcull.com/blog/cloud-based-ediscovery.
 This will be an important consideration in choosing between a self-service and full-service option. E-discovery providers charge variable market rates for project management time.
 An important consideration here will be the time frame of the review and production phases of the case. Third-party contract attorneys will be significantly more cost-efficient in conducting a review. Although there may be concerns about using attorneys with whom counsel is not familiar, contract attorney reviewers generally go through a rigorous background check and are continuously given feedback to improve the quality of their work. However, contract attorney review will only be as good as the training provided by counsel, as well as the feedback from counsel that they receive throughout.
 See footnotes 70 and 77, above, for discussions on unitisation and objective coding, and third-party contract attorneys, who can also be used for these sorts of tasks.