While writing this chapter, we asked a number of lawyers we have worked with what they would like to know about econometrics or regression analysis (the two terms are largely synonymous for our purposes). One particularly colourful response captured the general mood: ‘even the words “regression analysis” send a chill down my spine!’
We understand the genesis of these feelings. At its worst, the part of an expert report that presents econometric analysis is dense and hard to follow, while the debate between experts can soon become incomprehensible as terms like ‘heteroscedasticity’, ‘omitted variable bias’, ‘mis-specification’ and ‘failure to reject the null hypothesis’ fly back and forth like guided missiles.
To the extent that our chapter has a message, however, it is that econometric analysis provides a powerful set of tools for handling data. Used correctly and presented clearly, it can be central to estimating damages in a robust and reliable manner. At the highest level, we would suggest that:
- There are some fundamental elements of regression analysis that practitioners of international arbitration should feel comfortable with (just as many antitrust lawyers do). This chapter attempts to lay out some, though certainly not all, of those elements.
- It is true that regression analysis can give rise to complex technical debates. Nonetheless:
  - As with other technical subjects, it is the duty of the expert to present the core issues in a clear, concise fashion, eliminating all unnecessary complexity. The use of graphics, simple examples and plain language can help considerably.
  - In any given context, if the technical concerns are material then it should be possible to explain, in a reasonably intuitive way, each concern and why it matters.
  - Where appropriate, arbitrators can facilitate the debate by requiring opposing experts to use the same data, and to explain how their results would change if they used the other expert’s assumptions.
In this short chapter, we focus on the first point only. We are aware that many of our readers will be lawyers – both counsel and arbitrators – reading this because they are faced with an expert report containing econometric analysis, and they need to understand some more or less complex points contained in that report. We cannot promise that this chapter will address the exact points at issue. However, we are confident that understanding the main points covered by this chapter is a necessary condition for anyone wishing to be a competent consumer of econometrics.
What is econometrics?
Econometrics is usually described as the application of statistical techniques to economic data. Economists use econometrics to quantify economic relationships; for example, to estimate how demand for a product varies with its price, or to estimate trends in a nation’s imports over a given time period. For our purposes though, perhaps the most useful characterisation, albeit a partial one, is that econometrics is one of the principal tools that economists use to construct and quantify counterfactual scenarios. For example, econometric techniques might be used to estimate the level of imports in scenarios with different levels of import duties, so as to assess alternative trade policies; or to assess how the presence of a bidding cartel affects prices in a procurement auction, so as to estimate the impact of the cartel on consumers and profits.
Put in those terms, it is clear why econometric techniques are so important to the estimation of damages, which routinely involves comparing the actual scenario with a ‘but-for’ scenario so as to estimate a relevant measure of value. This measure might be profits, with the difference between the actual and ‘but-for’ scenarios representing lost profits, which is often the metric for damages in relation to alleged breach of contract. In cases of alleged expropriation, the appropriate metric might be the value of an asset, with the difference between scenarios representing the loss in value resulting from the expropriation.
This chapter provides a brief introduction to regression analysis, which is by far the most commonly used econometric tool in the estimation of damages. Our approach is a practical one: we use a single example to explain how one should understand regression output as the reader will typically encounter it in an expert report; how the output can be used to provide a damages estimate; and how to understand some of the basic statistical concepts that will often be invoked by one expert or the other, touching on the validity of the regression analysis itself, the accuracy of the damages estimate, and so-called ‘statistical significance’. A second example provides further illustrations of a number of these points, and is also of interest in itself because it utilises the ‘event study’ methodology, which is particularly useful for estimating damages.
This chapter obviously does not seek to be comprehensive, and there are a number of important topics that we have not touched on, in part for reasons of space. To mention just two, these include:
- Issues around compiling and preparing data sets. The quality of data used is of fundamental importance. Even the most sophisticated econometric techniques will struggle in the face of a poorly constructed, error-filled, or misunderstood dataset: the cruel but often accurate acronym GIGO (Garbage In, Garbage Out) describes the problem succinctly.
- Techniques used to estimate consumer choice and ‘willingness to pay’. A very large body of work over some decades has focused on understanding how consumers choose between discrete alternatives, and this is potentially relevant to damages estimation in many settings.
Example 1: lost profits
Obviously, the basis on which damages should be calculated in a given matter is, in many respects, a legal question. However, in many cases, this basis is conceptually straightforward: namely, damages are intended to provide monetary compensation that puts the claimant in the same position that it would have been in absent the alleged wrongdoing.
As a specific example, consider the case of a large multinational consumer goods company (HugeCo) that produces and sells mouthwash, among many other products, in many countries. In 2006, HugeCo entered a new market, the rapidly emerging economy of Ruritania. To do so, it entered into a joint venture with a local partner (LocalCo). As part of the joint venture agreement, LocalCo made extensive commitments to undertake advertising and other promotional activities (which for various reasons it was uniquely well placed to do). HugeCo’s entry was successful, and by 2008 its mouthwash was an established consumer product in Ruritania. However, relations between HugeCo and LocalCo began to deteriorate in 2013, for reasons unrelated to the mouthwash business. Beginning in January 2014, LocalCo abruptly stopped all the advertising and other promotional activities it had previously undertaken in line with the joint venture agreement. This lasted for 30 months until June 2016, when HugeCo sold its interest in the joint venture, at which point LocalCo resumed its promotional activities. However, the arrangement left unresolved the question of compensation for LocalCo’s alleged failure to comply with its obligations under the joint venture agreement. HugeCo is therefore now seeking damages in the form of compensation for lost profits between January 2014 and June 2016, on the grounds that its mouthwash sales (and consequently its profits) over that period were lower than they would have been if LocalCo had met its obligations.
To quantify damages in this example, it is therefore necessary to estimate what HugeCo’s mouthwash sales would have been over the time period from January 2014 to June 2016 had LocalCo undertaken the promotional activities, and to compare these to the actual sales generated over that period. A simplistic approach to estimating sales in the ‘but-for’ scenario of full promotional activities might be to take the actual monthly sales from 2013 – or maybe an average over, say, 2010 to 2013 – and assume that this level of sales would have persisted in the future. Suppose, however, that 2014 saw a new company entering the mouthwash market and aggressively cutting prices relative to their prior levels. In that case, it seems intuitively obvious that HugeCo’s sales in 2014 would, even in the presence of full promotional activities, have been lower than in previous years as a result of the loss of market share to this new entrant. Regression analysis is the economist’s way of formalising that intuition and of taking into account quantitatively the various factors that might affect HugeCo’s mouthwash sales when estimating what the ‘but-for’ sales in 2014 to mid-2016 would have been.
More specifically, to estimate what HugeCo’s mouthwash sales in 2014 to mid-2016 would have been had LocalCo engaged in full promotional activities, an economic expert would first need to consider what factors might reasonably be expected to affect these sales. This is an important step. It is possible to estimate a regression on any set of variables – for example, we could use regression analysis to estimate the relationship between daily ice-cream sales and the sterling/US dollar exchange rate. However, in the absence of an economic rationale as to why and how these variables are interrelated, the results of such a regression must be interpreted with extreme caution. To give another example, a regression between wholesale electricity prices and the price of crude oil may indicate a strong correlation between the two. However, one cannot necessarily conclude that there is a causal relationship between the two: in some markets there may be such a relationship (e.g., because wholesale power prices reflect the cost of running gas-fired power stations, and natural gas is imported at prices that are indexed to the price of crude oil); in other markets there is not (rather, there is a correlation between the price of different forms of energy, because demand for energy responds to common macroeconomic factors such as GDP growth). As we noted earlier, econometrics can be described as the application of statistical techniques to economic data, i.e., data that is generated by some economic process. Without an understanding of this process, there is a danger that no matter how sophisticated the econometric techniques that are utilised, the exercise becomes one of what is disparagingly referred to as ‘data mining’, where chance correlations are confused with meaningful relations.
Let us return to the example. In reality – and this example is based on a real case, albeit well disguised – the economist would identify a number of variables, such as price, advertising expenditure and the previous month’s sales (technically referred to as the ‘explanatory variables’ or ‘independent variables’), that are economically relevant to determining HugeCo’s mouthwash sales volume (technically referred to as the ‘dependent variable’). Of course, the previous month’s sales do not directly cause sales this month, but they are a good proxy for factors that may cause sales this month: notably, the fact that people have tried the product, know whether they like it or not, and if they do like it, have gotten into the habit of buying it. The regression analysis will then measure how material this relationship is: the estimated coefficient for that variable captures this relationship between the explanatory and the dependent variable, and the size of the estimate indicates its materiality.
For expositional purposes, we now simplify and assume that it is sufficient to focus only on the previous month’s sales, and on the presence or absence of promotional activity. Given that the question at issue is the extent to which, if at all, the alleged failure to engage in promotional activity depressed HugeCo’s mouthwash sales volume, the economist can estimate a regression (e.g., using actual data of monthly sales volume, and which months had promotional activity, covering the 90 months from January 2009 to June 2016) where the dependent variable is sales volume in a given month (measured in units of 1,000 bottles), and the explanatory variables are:
- sales volume in the previous month; and
- a ‘dummy variable’ that is equal to one if it is in a month in which no promotional activity was taking place, and zero otherwise.
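To make the set-up concrete, the following is a minimal Python sketch of how the dummy variable just described could be constructed. The helper `month_range` and the variable names are illustrative assumptions of ours; only the dates come from the HugeCo example.

```python
# Build the monthly sample (January 2009 to June 2016) and the dummy variable.
# Names are hypothetical; the dates are those of the HugeCo example.

def month_range(start_year, start_month, end_year, end_month):
    """Return a list of (year, month) pairs, inclusive of both end points."""
    months = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

sample = month_range(2009, 1, 2016, 6)

# Dummy = 1 in months with no promotional activity (January 2014 to June 2016),
# and 0 in every other month.
no_promotion_dummy = [1 if (2014, 1) <= ym <= (2016, 6) else 0 for ym in sample]

print(len(sample))              # 90 monthly observations
print(sum(no_promotion_dummy))  # 30 months without promotion
```

The second explanatory variable, sales volume in the previous month, would simply be the sales series shifted back by one month.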
Suppose that the results of this regression (presented in the potentially confusing tabular form in which such results often appear in expert reports and which we will explain below) are as follows:
Table 1: Regression output
Prior Month Sales
No Promotion Dummy
Notes: Standard errors in brackets; *p < 0.1; **p < 0.05; ***p < 0.01
Turning the table into something comprehensible
As noted above, this presentation immediately looks confusing. Before any discussion as to meaning, we first explain in a more or less mechanical fashion how to turn the table into something more comprehensible.
Our recommendation here is, as a first step, to ignore large parts of this output: the bottom two rows; the figures in brackets, which the note tells us are ‘standard errors’; the little stars below some of the figures in the rightmost column; and the note beginning ‘*p < 0.1…’. These figures describe various statistical properties of the data that may or may not, depending on the specific issues in contention, be relevant. Ignoring them, for now, gives a much simpler table:
Table 2: Simplified regression output
| Sales | |
| --- | --- |
| Intercept | 7,163 |
| Prior Month Sales | 0.58 |
| No Promotion Dummy | -1,956 |
The numbers in the right-hand column are the ‘coefficients’ of an equation that relates the dependent variable (‘Sales’, in the top left-hand corner) to the explanatory variables. The result can in fact be written in equation form as:
Sales = 7,163 + 0.58 × Prior Month Sales – 1,956 × No Promotion Dummy
The coefficient of 0.58 on ‘Prior Month Sales’ represents the estimated sensitivity of this month’s sales to last month’s sales: each additional 100 units of sales last month gives an additional 58 units this month. Mechanically speaking, the intercept of 7,163 represents the expected sales volume assuming both explanatory variables are equal to zero – i.e., where there were no sales in the prior month and there is promotional activity by LocalCo (however, as we explain later, this interpretation should be handled with extreme care). This leaves the mystery of the ‘No Promotion Dummy’. As explained above, a dummy variable is one that takes on just two values, zero or one. In this instance, it takes on the value one when LocalCo does not engage in promotional activity and zero otherwise. We can most easily explain what that means by suggesting that the equation above be thought of as two equations. In months when LocalCo engages in promotional activity, the value of the dummy variable is zero, so for those months the equation is:
Sales = 7,163 + 0.58 × Prior Month Sales
In months when LocalCo does not engage in promotional activity, the value of the dummy variable is one. This only impacts the intercept and so for those months the equation is:
Sales = 5,207 + 0.58 × Prior Month Sales
Interpretation of the regression output
At its core, regression analysis is essentially a technique for determining the relationship between the dependent and explanatory variables that best explains the data on which the analysis is based, and the equation that results from the analysis is the equation of the ‘line of best fit’. Figure 1 illustrates what this means in the case of a single explanatory variable. The intercept here represents the expected value of the dependent variable when the explanatory variable is equal to zero (again, subject to the ‘health warning’ below).
Given the introductory nature of this article, we do not expand on exactly what is meant by the term ‘best,’ but the basic idea is that given values of the explanatory variables for a given month, plugging these values into the equation above yields the best estimate of the sales volume for that month. So, for example, if the sales volume in the prior month were 12,000, the best estimate of the sales volume in the current month (assuming LocalCo is providing promotional activities) would be 7,163 + 0.58 × 12,000 = 14,123. If the current month were one in which LocalCo failed to provide promotional activities, this estimate would be 1,956 lower at 5,207 + 0.58 × 12,000 = 12,167.
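For readers who find it easier to follow a calculation than an equation, the arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the coefficients are those of the example regression, while the function and variable names are illustrative assumptions.

```python
# Coefficients from the example regression equation.
INTERCEPT = 7_163
PRIOR_SALES_COEF = 0.58
NO_PROMOTION_COEF = -1_956

def predicted_sales(prior_month_sales, no_promotion):
    """Best estimate of this month's sales (in units of 1,000 bottles)."""
    dummy = 1 if no_promotion else 0
    return (INTERCEPT
            + PRIOR_SALES_COEF * prior_month_sales
            + NO_PROMOTION_COEF * dummy)

with_promotion = predicted_sales(12_000, no_promotion=False)
without_promotion = predicted_sales(12_000, no_promotion=True)

print(round(with_promotion))                      # 14123
print(round(without_promotion))                   # 12167
print(round(with_promotion - without_promotion))  # 1956
```

Whatever value is assumed for the prior month's sales, the gap between the two estimates is always 1,956, because the dummy variable shifts only the intercept.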
From this discussion, it is clear why, as explained above, the coefficient on prior month sales (0.58) may be interpreted as a measure of the impact of past sales – an increase of 100 in the prior month sales volume would lead to an increase in the best estimate of the current month sales volume of 58, all else being equal. This should be thought of as a kind of expected average effect: it is unlikely to hold exactly in any given month, but the statistician believes that it will be accurate on average (or more precisely, it is the statistician’s best estimate of what the effect will be on average). The qualifier ‘all else being equal’ (equivalently, ‘ceteris paribus’) is important: each coefficient in the regression output measures the effect of changes in the corresponding variable, holding all the other variables constant.
If we remember that the No Promotion Dummy is equal to one in a month in which LocalCo failed to engage in promotional activities and zero otherwise, the interpretation of the coefficient of -1,956 becomes clear: in a month in which LocalCo fails to promote HugeCo’s mouthwash, HugeCo’s sales volume is, on average, 1,956 units lower than it would otherwise have been.
We therefore derive an estimate of lost profits as follows:
- Take the regression’s estimated monthly effect of the alleged breach: a reduction in sales of 1,956.
- Multiply by the duration of the alleged breach, i.e., 30 months, to get 1,956 × 30 = 58,680.
- Estimate the profit margin on each sale. Suppose (for simplicity) it is $1 per bottle of mouthwash.
- Multiply the volume of lost sales (remembering that each unit of sales in the above analysis is 1,000 bottles) by the profit margin to get lost profits of 58,680 × 1,000 × 1 = $58,680,000.
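The four steps above amount to a short calculation, sketched here in Python. The figures are those of the example (including the simplifying assumption of a $1 margin per bottle); the function name and parameter names are our own illustrative choices.

```python
def lost_profits(monthly_lost_units, breach_months,
                 bottles_per_unit, margin_per_bottle):
    """Lost profits = monthly effect x duration x bottles per unit x margin."""
    lost_units = monthly_lost_units * breach_months
    return lost_units * bottles_per_unit * margin_per_bottle

total_lost_units = 1_956 * 30  # regression effect times 30 months of breach
damages = lost_profits(
    monthly_lost_units=1_956,  # coefficient on the No Promotion Dummy
    breach_months=30,          # January 2014 to June 2016
    bottles_per_unit=1_000,    # sales are measured in units of 1,000 bottles
    margin_per_bottle=1,       # assumed $1 per bottle, for simplicity
)

print(f"{total_lost_units:,}")  # 58,680
print(f"${damages:,}")          # $58,680,000
```

Setting the calculation out in this way makes it easy to see how the damages figure would change if, say, the margin or the duration of the breach were disputed.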
The data on which this example is based is artificial, but a few useful observations can still be made. First, one possible criticism of the analysis might be that it does not accurately estimate sales for certain periods of time. Opposing counsel might argue: ‘this alleged relationship is clearly implausible and does not fit the facts: back in 2006 when HugeCo began to sell mouthwash, its sales in the first month were just 10 units. But according to the other side’s so-called expert, the first month’s sales should have been 7,163’.
However, that criticism is misplaced, because the statistician does not (or at least should not) claim that the relationship they have estimated applies for all levels of sales, and the use of the results of the analysis in the damages exercise certainly does not require that this be the case. Recall that the regression is estimated using 90 months of sales data. Suppose, for example, the observed sales volume over those 90 months ranges from 8,832 to 19,289. Then the regression analysis captures the relationship between the dependent and explanatory variables over this range, and a responsible statistician will be cautious in claiming that the results of the analysis apply outside of this range (and a volume of 10 is almost as far out of that range as it can possibly be in the downward direction). In general, understanding both the data on which the analysis is based and any potential limitations in that data is a crucial part of the interpretation of any regression output.
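The point about staying within the range of the data can be expressed as a simple check, sketched below in Python. The bounds are the supposed minimum and maximum from the paragraph above; the function name is a hypothetical one of our own.

```python
# Supposed range of observed monthly sales volume over the 90-month sample.
OBSERVED_MIN = 8_832
OBSERVED_MAX = 19_289

def within_estimation_range(sales_volume):
    """True if the volume lies inside the range the regression was fitted on."""
    return OBSERVED_MIN <= sales_volume <= OBSERVED_MAX

print(within_estimation_range(12_000))  # True: the equation can reasonably be used
print(within_estimation_range(10))      # False: far outside the fitted range
```

A check of this kind does not prove that the regression is reliable within the range; it merely flags cases, such as the first month of sales in 2006, where the equation is being asked a question that the data cannot answer.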
Second, there is nothing to say that the relationship between the variables in the regression has to be linear – this is just an assumption, and an assumption that the statistician should test against the evidence. There are many ways of doing so, including the simplest of all, which is to visually depict the data on a chart and see if it appears to follow a linear pattern.
Figure 2 below shows two examples, the one on the left where an assumption of a linear relationship appears appropriate, the other on the right where it does not.
In that respect, it is worth noting that a commonly used alternative assumption is a linear relationship between the logarithms of the variables. For example, the statistician might estimate a relationship of the form:
Log [Sales] = 2.67 + 0.60 × Log [Prior Month Sales] – 0.24 × No Promotion Dummy
This may appear daunting. However, it may also be necessary, because it better represents the true relationship between the variables. Moreover, in reality, the use of a logarithm in a regression equation has a fairly intuitive interpretation: it can be understood to represent proportional (i.e., percentage) change. So, the figure of 0.60 in the equation above describes the relationship between a 1 per cent change in Prior Month Sales and the corresponding percentage change in Sales: it means that, on average, a difference of 1 per cent in prior month sales volume corresponds to a difference of 0.60 per cent in the current month sales volume. Similarly, the figure of -0.24 describes the percentage difference in sales between months where the dummy takes the value zero (promotional activity occurs) and where it takes the value 1 (no promotional activity): it means that the statistician expects sales to be lower by (approximately) 24 per cent in the absence of promotions. Again, whether a linear relationship or a relationship involving logarithms is more appropriate depends on the economics of the situation under consideration – representing the data graphically may well be a useful first step in addressing this question.
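The percentage interpretation of the logarithmic coefficients can be checked numerically. The following Python sketch assumes that the logarithms are natural logarithms (the equation does not state the base, but the percentage reading presupposes it); the variable names are illustrative.

```python
import math

PRIOR_SALES_COEF = 0.60    # coefficient on Log [Prior Month Sales]
NO_PROMOTION_COEF = -0.24  # coefficient on the dummy variable

# Effect on sales of prior month sales being 1 per cent higher.
pct_change_sales = (1.01 ** PRIOR_SALES_COEF - 1) * 100

# Exact proportional effect of the dummy switching from zero to one.
exact_dummy_effect = (math.exp(NO_PROMOTION_COEF) - 1) * 100

print(round(pct_change_sales, 2))    # about 0.60 per cent
print(round(exact_dummy_effect, 1))  # about -21.3 per cent
```

The first result confirms the reading of 0.60 as a percentage response. The second shows why the word ‘approximately’ matters: reading -0.24 directly as 24 per cent is a convenient shorthand that becomes less accurate as the coefficient grows, and the exact figure here is a reduction of a little over 21 per cent.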
The complicated stuff: R2, standard errors, etc.
A few pages back we showed a table of regression output, and began by suggesting that the reader could ignore, for the time being, a large part of the table. We imagine this will have been welcome advice. However, it will often be necessary to understand some of the additional output in order to use it correctly and to understand commentary from opposing experts.
‘Coefficient of determination’ (usually written R2, pronounced ‘R-squared’)
In our example, the opposing expert might say ‘the R2 is too low. R2 is a measure of the strength of a regression, and it should be at least 70 per cent for the regression to be reliable. A regression with an R2 of just 53 per cent is not reliable.’ Unfortunately, we have seen comments along these lines in actual ‘expert’ reports. They suggest a fundamental lack of understanding of regression analysis: there is no such thing as ‘strength of a regression’, and there is no ‘threshold value’ for R2. In fact, whether or not the R2 value matters depends on the context and the aim of the exercise.
The R2 is a percentage, between zero and 100 per cent, that indicates the proportion of variation in the dependent variable that can be explained by the observed variation in the explanatory variables. So, in the example of electricity prices and oil prices discussed earlier, a high R2 would indicate that variations in oil prices ‘explain’ a very high proportion of the variation in electricity prices. In our HugeCo example, the R2 of 53 per cent indicates that the explanatory variables used (Prior Month Sales and the No Promotion Dummy) explain 53 per cent of the observed variation in monthly sales. The rest is a result of other factors. However, that is per se not directly relevant to the quantum exercise, because the goal of the quantum exercise is not to identify all of the determinants of sales, or to predict future sales. It is to estimate the impact on sales of LocalCo’s promotions, all else being equal. Recall that we estimate that impact through looking at the coefficient on the No Promotion Dummy, which was -1,956. The relevant question, therefore, is ‘how accurate is the estimate of -1,956?’ and not ‘how well does this equation capture all the factors that explain monthly sales?’
In some instances, a low R2 may also suggest that the chosen form of the equation is incorrect – for example, if the regression uses a linear equation when the underlying relationship is non-linear, then that may show itself in a low R2, because ‘the wrong equation’ will generally not be able to explain or predict well the variation in the dependent variable.
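For readers curious about where the R2 figure comes from, the following Python sketch computes it for a regression with a single explanatory variable. The five data points are hypothetical toy numbers invented purely for illustration (they are not the HugeCo data), and we assume the line is fitted by ordinary least squares, the most common approach.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(y, fitted):
    """Share of the variation in y that is explained by the fitted values."""
    mean_y = sum(y) / len(y)
    total_variation = sum((yi - mean_y) ** 2 for yi in y)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - unexplained / total_variation

# Hypothetical toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 7.0, 5.0, 10.0, 8.0]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
r2 = r_squared(y, fitted)

print(f"R-squared: {r2:.0%}")  # R-squared: 58%
```

In this toy example the line explains a little under three-fifths of the variation in the dependent variable, yet the slope is still the best available estimate of the relationship between the two variables.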
The standard error and confidence intervals
How then does one answer the question ‘how accurate is the estimate of -1,956?’ To answer that, one turns to the ‘standard error’, as reported in Table 1 (which gave a value of 255).
The standard error is a measure of the accuracy of the estimate. As a rule of thumb, the likely range of estimation error can be taken for many purposes to be approximately two standard errors. So the figure of -1,956 can be interpreted as being the middle of a range whose lower bound is -1,956 – 2 × 255 = -2,466 and whose upper bound is -1,956 + 2 × 255 = -1,446. The statistician may say that their best ‘point estimate’ of the impact of the alleged breach is a reduction in monthly sales of 1,956 units, and they are confident, statistically speaking, that the reduction falls within a range of between 1,446 and 2,466.
That range will give rise to a corresponding range of damage estimates. It is sometimes argued that a very wide range should call into question whether the estimate provides sufficient legal certainty to be a basis for awarding damages. That is essentially a legal question. We would observe, however, that the estimate produced by regression analysis (i.e., the ‘point estimate’, in this case the figure of -1,956) is in various senses (discussed very briefly in footnote 10 above) the best available estimate of the effect of the alleged breach.
The ‘rule of thumb’ described above may appear rather mysterious. A more accurate and rigorous elaboration of the rule is the statement that 95 per cent of the time the regression methodology should produce an estimate that is within approximately two standard errors of the true value. The exact number is not two, but depends on the sample size and the number of explanatory variables: in our case with 90 data points and two explanatory variables, the number is 1.9876. The range obtained by taking the coefficient plus or minus (in this case) 1.9876 times the standard error is referred to as a ‘95 per cent confidence interval’.
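The ranges discussed above can be computed as follows. This Python sketch uses the estimate and standard error from the example, together with the two multipliers mentioned in the text (the rule-of-thumb value of two, and 1.9876); the function names are illustrative assumptions.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper): the estimate plus or minus a multiple of its standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width

def contains_zero(interval):
    """True if zero lies within the interval."""
    lower, upper = interval
    return lower <= 0 <= upper

ESTIMATE = -1_956      # coefficient on the No Promotion Dummy
STANDARD_ERROR = 255   # its standard error

rule_of_thumb = confidence_interval(ESTIMATE, STANDARD_ERROR)
ci_95 = confidence_interval(ESTIMATE, STANDARD_ERROR, multiplier=1.9876)

print([round(v) for v in rule_of_thumb])  # [-2466, -1446]
print([round(v) for v in ci_95])          # [-2463, -1449]
print(contains_zero(ci_95))               # False
```

As the output shows, the difference between the rule of thumb and the more precise multiplier is immaterial in this example.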
There is no magic to the figure of 95 per cent. Different confidence intervals can be obtained by using different multiples of the standard error: a bigger multiple means a wider range, which corresponds to a higher degree of confidence, but at the cost of a lower degree of precision. Commonly used values are 90 per cent, 95 per cent and 99 per cent, although these are to some extent merely conventional. For instance, in the example being discussed, the 90 per cent confidence interval is obtained by taking plus or minus approximately 1.66 standard errors.
We note also that in this example the range of error (the 95 per cent confidence interval) for our estimate does not contain zero – in other words, the statistician is confident that the true impact of the alleged breach was not zero. That is the meaning of the commonly heard phrase ‘statistically significant’. Regression analysis estimates coefficients, whose true values are unknown. An estimate is ‘statistically significant’ if the regression analysis indicates that the estimate is sufficiently far from being zero for it to be unlikely that the true value of the coefficient is zero. The finding implies that the variable in question has a genuine impact on the dependent variable, and that the relationship is not just random ‘noise’.
In our opinion, statistical significance is often of little relevance to estimating damages, as this example illustrates. Often, the fact that one variable has an impact on another is obvious, and not a point of contention between the parties. Moreover, if it is a point of contention then the disagreement is more likely to arise in the context of assessing merits rather than quantum. For example, suppose that LocalCo had changed the nature of its promotional activities rather than stopped undertaking them; and that HugeCo claimed that the change had led to a drop in sales, and was a breach of a contractual commitment to make best efforts to promote HugeCo’s mouthwash. LocalCo might then use regression analysis to show that any change in sales following the change in promotional activity was not statistically significant, and argue that this indicated that the change had not been due to a less effective form of promotional activity, and therefore could not be a breach of its best efforts obligation.
‘Little stars’ and p-values
Recall that the regression output in Table 1 also included some ‘little stars’ beneath the coefficients, and a note that referred to the ‘p-value’. These relate to the question of statistical significance, and therefore, as suggested above, are often not central to the dispute. However, we attempt to explain them here, at the risk of entering into a rather detailed discussion.
We said above that the fact that the 95 per cent confidence interval for our estimate of the impact of promotional activities (the coefficient on the No Promotion Dummy) does not contain zero means that the estimate is ‘statistically significant’. However, this depends on the confidence interval used. For example, suppose that the regression analysis showed a standard error of 800 (instead of 255). Then:
- the 95 per cent confidence interval would be given by -1,956 ± 1.9876 × 800 (roughly -3,546 to -366), and still does not contain zero; however
- the 99 per cent confidence interval would be given by -1,956 ± 2.63 × 800 (roughly -4,060 to 148), and does contain zero.
In those circumstances we say that the estimated coefficient is statistically significant at a 5 per cent level of significance, but not at a 1 per cent level of significance. The ‘little stars’ attached to the coefficients in Table 1 (and found in the note ‘*p < 0.1; **p < 0.05; ***p < 0.01’) can now be explained. A single * indicates that the estimate is statistically significant at a level of 10 per cent. In the note, p denotes the level of significance, so ‘p < 0.1’ means statistically significant at 10 per cent (which is the same thing as 0.1). Two stars correspond to 5 per cent (0.05) and three stars correspond to 1 per cent (0.01).
To be more rigorous, the statement p < 0.1 means ‘if the true value of the coefficient being estimated were zero, then the probability that the methodology we have used to estimate it would, as a result of sampling error, give an estimate as far from zero as (or further from zero than) the estimate here, is less than 0.1.’
In the example given three paragraphs above where we assumed that the standard error on the estimate of the coefficient for the variable No Promotion Dummy was 800, we found that the estimate was statistically significant at 5 per cent but not at 1 per cent. In presenting the result, we would therefore report the estimate with two stars: -1,956**.
Remaining with that example, rather than say that the level of significance is less than 5 per cent but more than 1 per cent, one can find exactly what value it has. In this case the answer is that it is statistically significant at 1.6 per cent. That figure is known as the ‘p-value’. Clearly if you know the p-value then the ‘little stars’ are otiose (although it is not unheard of for authors, nonetheless, to present both).
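The link between the estimate, its standard error, the p-value and the ‘little stars’ can be sketched in Python as follows. One important assumption is flagged here and in the code: for simplicity the sketch uses the normal distribution as an approximation, whereas statistical software would ordinarily use the Student's t-distribution, which yields a slightly higher p-value. The function names are our own.

```python
import math

def approximate_p_value(estimate, standard_error):
    """Two-sided p-value using a NORMAL APPROXIMATION (an assumption of this
    sketch); the t-distribution gives a slightly larger figure."""
    ratio = abs(estimate / standard_error)
    return math.erfc(ratio / math.sqrt(2))

def stars(p_value):
    """Translate a p-value into the star convention used in Table 1."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""

p_with_se_800 = approximate_p_value(-1_956, 800)
p_with_se_255 = approximate_p_value(-1_956, 255)

print(f"{p_with_se_800:.3f}", stars(p_with_se_800))  # between 0.01 and 0.05, so **
print(stars(p_with_se_255))                          # ***
```

With a standard error of 800 the approximation gives a p-value of a little under 1.5 per cent, close to but slightly below the figure quoted in the text, and the estimate earns two stars. With the original standard error of 255 the p-value is vanishingly small, so the estimate earns three.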
Other commonly encountered forms of presenting regression output
Unfortunately there is no uniform template for presenting regression output. Some authors present the standard errors, others do not. Often the author will not present the standard error, but will instead present something called the ‘t-statistic’. In our opinion this is almost always inappropriate for a non-technical audience. However, the standard error can be recovered from the t-statistic: for each estimated coefficient, the standard error is the estimate itself, divided by the t-statistic. So for example, if we had reported that the estimated coefficient on the No Promotion Dummy was -1,956 and the t-statistic was -7.67 then the reader could find the standard error to be -1,956 / -7.67 = 255.
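The conversion from t-statistic to standard error is a single division, sketched here in Python with the figures from the example. The function name is an illustrative assumption.

```python
def standard_error_from_t(estimate, t_statistic):
    """Recover the standard error: the estimate divided by its t-statistic."""
    return estimate / t_statistic

standard_error = standard_error_from_t(-1_956, -7.67)

print(round(standard_error))  # 255
```

Once the standard error has been recovered in this way, the confidence intervals discussed earlier can be calculated in the usual manner.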
As an additional source of confusion, the standard error is often reported with different names or abbreviations (or simply unlabelled, with the reader expected to guess that it is the standard error). Common labels include ‘SE Coeff’ (‘Standard Error of Coefficient’), ‘SE’ (‘Standard Error’), ‘StDev’ or ‘Std Dev’ (‘Standard Deviation’).
Example 2: Event studies
As a second example of how regression analysis may be used to construct a ‘but-for’ scenario in the context of a damages assessment, consider the case of firm ZedCo, a large part of whose business comprises long-term contracts with foreign governments to provide pharmaceutical products such as vaccines in bulk. On 11 January 2016, it announced – completely unexpectedly – that one of its major governmental customers had unilaterally decided to terminate with immediate effect all of its contracts with the firm, with a total value of £1 billion. ZedCo has launched a breach of contract case against this government, and is looking for an assessment of damages in the event that it prevails on the merits.
At first glance, this would appear to be a very straightforward question – what else could damages be other than £1 billion? However, a little thought shows that the situation is rather more complex than that. For example, £1 billion is the value of the revenues that the contracts would have generated – what about the costs that ZedCo would presumably save as a result of the contracts’ termination? In any case, is the £1 billion a contractually guaranteed amount, or is it a figure (e.g., an estimate or a maximum) that depends on other factors? What ability does ZedCo have to mitigate its damages – for example, will the termination free up resources that will enable the firm to work on other contracts (perhaps even more lucrative ones)? Will the actions of this one government cause ZedCo’s other customers to follow suit?
Answering these and other similar questions is clearly a challenging exercise. It is often feasible to undertake such an exercise in a robust and objective way, but equally there are instances where it requires a number of somewhat subjective assumptions, to which the assessment of damages may be quite sensitive. However, if the shares of ZedCo are publicly traded, the econometric methodology known as an ‘event study’ offers an alternative way of tackling the damages question. This methodology is based on the simple idea that the change in a firm’s share price following the release of a particular piece of news represents the market’s assessment of the implications of that news for the value of the firm.
To illustrate, suppose that on the day of the announcement, the share price of ZedCo fell by 7 per cent, wiping £1.4 billion off its market capitalisation of £20 billion, as shown in Figure 3.
A comprehensive review of the business press on that day reveals no other news relating to ZedCo. Does this mean that the £1.4 billion reduction in market capitalisation reflects the market’s estimate of how much value has been destroyed as a result of the contract terminations, and that this amount is an appropriate measure of damages? Unfortunately, things are again a little more complex than that. The future profitability of ZedCo is certainly affected by factors specific to the firm itself – but it is also affected by factors relating to the economy and/or stock market as a whole, and to factors relating to the industry in which it operates. So, while there may have been no other news on 11 January 2016 relating to ZedCo specifically, there may well have been market-wide or industry-wide news that was relevant to ZedCo.
We can then ask how much of the 7 per cent (£1.4 billion) fall can be attributed to such news, and how much can be attributed to the news relating to the contract terminations? It is only the latter element that can potentially be considered as damages. To use an extreme example, if it could be shown that all of the fall was driven by an industry-wide piece of news – for example, the imposition of an unexpected excess profits tax on the pharmaceuticals industry – the implication would be that the market perceived the contract terminations to have had no impact on the value of ZedCo. Taken in isolation, this analysis would then suggest zero as the appropriate measure of damages.
This is where the event study methodology and regression analysis come into play. As a concrete illustration, suppose that on that day, the FTSE 350 index fell by 1.14 per cent, while the market capitalisation of a portfolio of firms that operate in the same industry as ZedCo (‘the Pharma Index’) fell by 2.75 per cent, as shown in Figure 4. For the ease of visual comparison, ZedCo, FTSE 350 and the Pharma Index prices are all shown to start at 100 on the first date of the chart, 1 October 2015.
The idea here is that changes in the FTSE 350 are a proxy for the market’s reaction to market-wide news, while changes in the Pharma Index are a proxy for the market’s reaction to industry-wide news. Changes in the share price of ZedCo that are attributable to market-wide or industry-wide news can be explained by changes in the FTSE 350 or the Pharma Index respectively. What remains is the change in the share price that reflects firm-specific news, and as explained above, the only firm-specific news on the day in question was the cancellation of the contracts.
To break down the 7 per cent fall in the share price of ZedCo into market-wide, industry-wide and firm-specific elements, we need to determine the relationship between changes in the share price of ZedCo, changes in the FTSE 350, and changes in the Pharma Index. This is where econometrics comes in. Using regression analysis, we can estimate how, on average, the firm’s share price moves with the FTSE 350 (all else being equal), and with the Pharma Index (again, all else being equal). To do so, we perform a regression using data on the firm’s share price, the FTSE 350 and the Pharma Index. Suppose that the output from our regression is:
Log (ZedCo share price) = 2.4105 + 0.7918 Log (FTSE 350) + 0.4837 Log (Pharma Index)
Recall that the presence of a log term in a regression equation can be read as ‘percentage change’. The equation above therefore tells us that, according to our best estimates (and rounding the coefficients to two decimal places), on average, a 1 per cent change in the FTSE 350 leads to a 0.79 per cent change in the share price of ZedCo; and on average, a 1 per cent change in the Pharma Index leads to a 0.48 per cent change in the share price of ZedCo.
This means that on a day like 11 January 2016, when the FTSE 350 fell by 1.14 per cent and the Pharma Index fell by 2.75 per cent, we would expect the share price of ZedCo to fall by:
0.79 × 1.14% + 0.48 × 2.75% = 2.22%
In other words, of the 7 per cent fall observed on that day, 2.22 percentage points are explained by changes in the market (the FTSE 350) and the industry (the Pharma Index). In a ‘but-for’ scenario with no firm-specific news, we would have expected a 2.22 per cent fall. The additional 4.78 percentage point fall in the actual scenario can only be attributed to the firm-specific news regarding the contract terminations.
Given the initial market capitalisation of £20 billion, this implies that the market has assessed the impact of the terminations on the value of ZedCo to be £0.96 billion (4.78 per cent of £20 billion). That figure is our estimate of the loss in value, and hence of the quantum of damages.
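The arithmetic of the ZedCo example can be reproduced in a few lines. The sketch below is illustrative only: the function and parameter names are our own, the coefficients are the rounded values of 0.79 and 0.48 used above, and all falls are entered as positive percentages.

```python
def event_study_decomposition(actual_fall_pct, market_fall_pct, industry_fall_pct,
                              market_coefficient, industry_coefficient,
                              market_cap_bn):
    """Split an observed share price fall into expected and firm-specific parts.

    Falls are positive percentages; market capitalisation is in GBP billions.
    """
    # Fall that the regression would predict from market and industry moves alone.
    expected_fall_pct = (market_coefficient * market_fall_pct
                         + industry_coefficient * industry_fall_pct)
    # The remainder is attributed to firm-specific news.
    firm_specific_fall_pct = actual_fall_pct - expected_fall_pct
    # Convert the firm-specific fall into a loss of market value.
    loss_bn = firm_specific_fall_pct / 100 * market_cap_bn
    return expected_fall_pct, firm_specific_fall_pct, loss_bn


expected, firm_specific, loss = event_study_decomposition(
    actual_fall_pct=7.0,
    market_fall_pct=1.14,
    industry_fall_pct=2.75,
    market_coefficient=0.79,
    industry_coefficient=0.48,
    market_cap_bn=20.0,
)
print(round(expected, 2), round(firm_specific, 2), round(loss, 2))  # 2.22 4.78 0.96
```

Setting the inputs out in this way also makes it easy to see how sensitive the damages figure is to the estimated coefficients, which are themselves subject to the estimation uncertainty discussed earlier.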
Regression analysis has many applications, and depending on the case the inferences to be drawn from a regression analysis may vary considerably. We hope that this chapter has provided a useful insight into the application of regression analysis in the context of litigation. As indicated earlier, we believe that the fundamentals of regression analysis should be comprehensible to a non-technical user, while it is the role of the expert to guide that user through the more technical details, to explain and illuminate as necessary, and to eschew the presentation of misleading analysis or ill-founded criticism.
- Boaz Moselle is senior vice president and Ronnie Barnes is principal at Cornerstone Research. The views expressed herein are solely those of the authors, who are responsible for the contents of this chapter, and do not necessarily represent the views of Cornerstone Research. We thank the staff of GAR; our Cornerstone colleagues, particularly Tiffany Eu and Lauro Remmler; and Matthew Vinall of Dentons and James Freeman of Allen and Overy for their input and advice. All mistakes, of course, are ours.
- At the risk of appearing self-serving, we would add that we believe that a suitably qualified expert (generally someone with appropriate formal qualifications) is more likely to have the kind of deep understanding that is required in order to simplify without misrepresenting.
- For more on this see our later discussion of ‘R²’, and in particular footnote 13.
- Unfortunately there is no single standard format for presenting regression results. We explain later some alternative forms of output that the reader may encounter in practice.
- Apart from the tabular format, we note two other common sources of confusion. The first arises from the difference between the Anglo-Saxon and continental European uses of commas and periods. In this chapter we use a period to represent a decimal point, and a comma to count off thousands. So 0.5 is one half, and 7,163 is seven thousand one hundred and sixty-three. The second arises from the question of units: the expert should specify clearly the units being used – in this case, HugeCo ships mouthwash to retail outlets in boxes containing 1,000 bottles, and sales are therefore measured in units of 1,000, as noted earlier.
- The value of ‘N’ in the last row is straightforward, however: it is the number of data points used for the regression (recall we have 90 months of data in this example).
- We do not subscribe to the Stephen Hawking view that each equation we include will scare off half of our potential readers. Our preferred philosophy is one that has been attributed to Albert Einstein: ‘make things as simple as possible, but no simpler.’
- The figure of 5,207 comes from taking the overall equation (‘Sales = 7,163 + 0.58 × Prior Month Sales – 1,956 × No Promotion Dummy’), setting ‘No Promotion Dummy’ to one, and noting that 7,163 – 1,956 × 1 = 5,207.
- To give a brief hint: the regression is seeking to estimate the relationship between the dependent and the explanatory variables, represented by the coefficients (i.e., the numbers) in the regression equation. These coefficients are unknown, and we want to estimate them as well as possible. Statisticians think of the ‘best estimate’ as the one that is produced using the best methodology for estimating (which makes sense, since there is no way of judging whether an individual estimate is good or bad, unless one already knows the true value – in which case there is no need to estimate). So statisticians define various criteria for assessing different estimation methodologies, such as requiring the methodology to be ‘unbiased’ (i.e., right on average, and not systematically under- or over-estimating) and ‘minimum variance’ (lowest possible average size of estimation error). The methodology that produces regression analysis meets many of these criteria.
- Recall the equation is Sales = 7,163 + 0.58 × Prior Month Sales, so if Prior Month Sales are zero (as was the case, by definition, for the first month of sales) then the equation predicts Sales of 7,163.
- The figure of 24 per cent is only approximate because things get a little more complex when one relates the logarithm of the dependent variable to something that is not a logarithm (as here, relating log (sales) to the dummy variable). It is still true that we are looking at the proportionate impact of the explanatory variable: in this case, how the presence of promotion affects sales in proportionate (i.e., percentage) terms. However, the effect is actually measured by the exponential of the coefficient: so the more accurate statement is that sales without promotion are predicted to be exp(-0.24) of sales with promotion, which is 78.7 per cent, i.e., lack of promotion reduces sales by 100 per cent minus 78.7 per cent, which is 21.3 per cent, not 24 per cent. Clearly it is the job of the expert to guide a tribunal through such complexities.
- It is often expressed as a decimal number, rather than a percentage. So, for example, an R² of 0.53 is the same thing as an R² of 53 per cent, expressed in a different way.
- Caveat: this is subject to various assumptions about the underlying data. Those assumptions may or may not be reasonable, depending on the specific circumstances; they are also to some extent testable, and the statistician should check them before using the standard error in this way (or in various other ways).
- So the lower bound is -1,956 – 1.67 × 255 = -2,382 and the upper bound is -1,956 + 1.67 × 255 = -1,530. The multiples used (in this case 1.67) are obtained using something called ‘Student’s t-distribution’, which is similar to the well-known ‘normal distribution’ (the ‘bell curve’).
- Confusingly, the level of significance is 100 per cent minus the level of confidence (so a 95 per cent confidence interval means a 5 per cent level of significance, and a 99 per cent confidence interval means a 1 per cent level of significance).
- As with confidence intervals, the figures of 1 per cent, 5 per cent and 10 per cent are conventionally used, but there is no magic to that choice.
- The stars can also often be found written adjacent to, rather than beneath, the number, as shown here.
- Here, the term ‘market’ is used to describe the collective of investors – individual and institutional – who are (or could be) buying and selling the shares in question.
- Market capitalisation means the total market value of a firm’s outstanding ordinary shares.
- Such a portfolio is commonly referred to as an industry index.