The Ivory Tower – With ESG Studies, the Devil’s in the (Data) Details

The body of research on environmental, social, and governance (ESG) scores and stock returns is large and growing. A search on Google Scholar using that phrase shows that since 2018 alone, roughly 11,700 academic articles have been published on what ESG scores reveal about stock returns, whether ESG scores have any consistent relationship with stock returns, and so on. This indicates considerable interest, but when we examine these studies we soon discover that their conclusions may not be valid, or may not offer practical, actionable insights.

Does that mean all of these studies are flawed? Not necessarily (and no, we haven’t read all 11,700 of them), but we think it is important to put this field of research under a magnifying glass. In a nutshell, when we look at studies based on ESG scores, what we find is that the devil is in the data details. Here, we attempt to provide a thought-provoking overview of this issue.

Sources and methods

It is important to consider the data sources used in any study of finance and economics, but ESG data presents some particularly thorny challenges. Studies that use ESG scores to classify companies and study stock returns that way are at the mercy of the way those scores are determined. When we buy into the conclusions a given study presents, we are assuming that those ESG scores are valid, that they actually contain the information the study needs. 

For starters, what underlying sources are used to construct ESG scores? Do the scores rely heavily on reports published by the companies themselves (which are likely to paint as rosy a picture as possible)? Or do the scores draw upon a mixture of third-party sources, each of which may cover only part of the investable universe? How does the mix of information inputs change over time? ESG disclosure requirements differ around the world, so ESG scores for companies in the EU can reflect inputs that may not be available for U.S. companies. Furthermore, the definitions of what each of the E, S, and G pillars encompasses are non-standard, rapidly evolving, and expanding. A definition from just three years ago is probably a bit stale already.

ESG scores – not a stable foundation 

Digging into these details of ESG scores, what they cover, and how they change over time, can lead to cynicism about studies that hang everything on those scores. Those of us with a strong research mindset, who enjoy diving into and analyzing data, realize that a typical approach used in studying stock price returns – a multi-year (at least 10 years) analysis using monthly or just quarterly data – may not be meaningful or even possible in the ESG space. There are many reasons for this – here are just a few: 

  • ESG scores may not go back for 10 years. If they do, the way the scores were defined and the data used to construct them have probably changed over the period studied. That means we’re looking at different metrics at the beginning of the study than at the end. We don’t have this problem in studies that relate stock returns to return on equity, earnings per share, or other clearly defined financial measures. As a result, studies based on ESG scores don’t prove or disprove a link between companies’ ESG practices and stock returns. What they really test is the link between stock returns and the data inputs and methods used to construct ESG scores.
  • Interest in ESG has increased so dramatically over the past 10 years that corporations’ sustainability practices have gone from “not much” to “a whole lot” over that period, but ESG scores wouldn’t necessarily pick that up. Why? Because ESG scores tend to use relative rankings within an industry or peer group. For example, if the scores range from 1 to 10 and the entire peer group gradually improves, a given company might have a score of “5” (average among its peers) for that entire period, even though its ESG practices improved a lot over that time. Would stock prices “react” to a score that never moved? That’s a stretch.
  • Updating ESG scores quarterly? Don’t get us started about inconsistencies in the quantity, quality, and timing of publicly available information about companies’ behaviors related to ESG. A company could do a really bad thing that doesn’t show up in a score for another nine months, but it shows up in the stock price right away.
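
To make the relative-ranking point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the five companies, their "raw" practice levels, the steady five-point annual improvement, and the rank-based mapping onto a 1-to-10 scale are assumptions chosen for illustration, not any vendor's actual methodology.

```python
def relative_scores(raw, low=1.0, high=10.0):
    """Map raw practice levels to a low-to-high scale by rank within the peer group.

    Assumes at least two companies and no ties (hypothetical scoring rule).
    """
    ranked = sorted(raw, key=raw.get)          # worst to best
    step = (high - low) / (len(ranked) - 1)    # equal spacing between ranks
    return {name: low + i * step for i, name in enumerate(ranked)}


# Hypothetical peer group: raw practice level at the start of the study.
start = {"A": 20, "B": 30, "C": 40, "D": 50, "E": 60}

history = []
for year in range(11):
    # Assumption: every company improves by 5 raw points per year.
    raw = {name: level + 5 * year for name, level in start.items()}
    history.append((year, raw["C"], relative_scores(raw)["C"]))

for year, raw_c, score_c in history:
    print(f"year {year:2d}: raw level of C = {raw_c}, relative score of C = {score_c}")
```

In this toy setup, company C more than doubles its raw practice level over the period, yet its peer-relative score stays at the midpoint of the scale every year because its rank never changes. A study that regresses returns on that score would see no variation at all for this company.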

We don’t mean to go off on a rant, but citing results of studies based on ESG scores without acknowledging these limitations does not provide a fair, unbiased picture.

Study results are biased by differences in scores

A big issue that is rarely (ever?) addressed in these studies is why the authors chose the ESG scores they used. The lack of consistency across the scores from the best-known ESG ratings vendors is widely acknowledged. Most studies use scores from just one of these vendors, which means their research results reflect the biases or idiosyncrasies of that vendor. It’s not at all the same as choosing between Moody’s and Standard & Poor’s in a study using bond ratings; there is enough consistency between those two providers that the results would be basically the same either way. Not so with ESG ratings.

Another issue that studies fail to mention is that the ESG scores assigned across a universe of stocks contain certain biases. The article “Integration of ESG in Asset Allocation” describes three types of biases observed in ESG providers’ scores: 

  1. Industry Bias: Companies in mature and heavily regulated industries (such as banks and telecommunications) tend to have higher ESG scores than companies in less mature, unregulated industries. Some of those “biases” may be partly justified, given that some types of regulations may reduce some types of ESG risks, but things like gender pay gaps and inadequate board oversight can happen in any industry. 
  2. Country Bias: Differences in regulations and disclosure requirements around the world lead to significant discrepancies among ESG scores for companies depending upon the region where they operate. As noted above, the EU has been an early adopter of ESG regulations and disclosures, and companies there tend to have higher ESG scores than their counterparts in the U.S. 
  3. Size Bias: Larger companies tend to have higher ESG scores than smaller companies. There are several possible reasons for this. Larger companies tend to receive greater scrutiny from institutional investors, securities analysts, and the media (including social media), and therefore may put more effort into their ESG reporting and disclosures. Larger companies also tend to have more resources available to address ESG issues than smaller companies.

As an ESG data and analytics firm, OWL believes that investors can benefit greatly from looking beyond the summary ESG scores to dig into the details. Taking that approach and mindset allows you to emphasize the components of the E, S, and G pillars that you believe are most critical. For financial advisors, this approach allows you to reflect your clients’ priorities instead of an ESG rating vendor’s point of view. Contact us to learn more about how we make this possible.