Takeaways
“Alternative” data doesn’t remain alternative for long.
Throughout history, sensors, analytical theory, and communications have advanced in an iterative process.
This process has sped up considerably during the Information Age.
Now web scraping, exhaust data, and sensor data combine to provide hedge fund analysts with the tools they need to gather and forecast fundamental information and understand underlying business drivers.
It remains the analysts’ job, though, to find new ways to interpret the data for investors’ benefit.
Nirvana. R.E.M. The Red Hot Chili Peppers. Hard to believe, but these Rock & Roll Hall of Fame inductees all used to be called “alternative”.
It’s a description with a very short shelf life, so we at Facteus don’t much care for the word “alternative” to describe our data. But that’s what everyone else calls it, at least for now.
So rather than relying on one, admittedly misleading label, let’s use plain language to tell you what we provide our investment research clients.
The origin of alternative data
As hedge fund managers already know, the first written records were tallies used to keep track of sales and inventory. Still, it’s hard to draw a straight line from those early Sumerian tablets to the Big Data warehouses of today.
It wasn’t until the 1640s that “datum” was even a word, meaning an elemental fact that can be taken as given. It applied to such predictable events as the ebb and flow of the tides, or such repeatable observations as the relationship between a glass lens’s thickness and its refractive power. And while all that was useful, any solvent enterprise or government could hire a clerk to copy over such tables or create them from scratch. The first use case for alternative data, as we understand it today, might have been weather forecasting. From those days to this, there’s been no competitive advantage greater than the ability to predict the weather better than your rivals.
There’s a difference between climate and weather: Climate is what you expect, and weather is what you get. That is, weather is reality. Climate can only be a model combined with a data set. The trick is to better predict tomorrow’s reality based on today’s available information (i.e., forecasting).
Ancient Greeks and Babylonians made their attempts, according to a Yale University syllabus, but the scientific method wasn’t in place yet. In more recent years, though, data scientists have greatly improved how closely climate models match the weather on the horizon.
Much of that improvement came from advancements in gathering data. The Renaissance brought the invention of thermometers, barometers and rain gauges.
Further improvement came from developments in the processes through which these data could be interpreted. The Renaissance yielded to the Enlightenment, during which thinkers postulated that the morning sun heats the air, making it less dense than the cooler air to the west, and that this difference in pressure causes wind. From there it was a short leap to the theory that storms moved and could thus be tracked.
Next, as the Industrial Revolution took hold, the ability to communicate the findings moved ever faster and more pervasively, allowing a broader array of people to receive the information and benefit from it. Everyone from farmers to umbrella salesmen could come out ahead.
And here we are in the Information Age, and all three of these processes – technological improvement, theoretical advancement, and communication efficiency – continue to iterate. But those iterations come faster than thermometer inventor Galileo or first storm tracker Benjamin Franklin could have ever imagined. The hardware now hovers in geosynchronous orbit. Sensors feed mainframe computers that host artificial intelligences constantly learning the nuances of how the near-chaotic elements on, over, and under the earth interact. And results are projected on the screens of HDMI trading stations and pocket-sized smartphones.
Here comes the hard part, though. As we just noted, neither the man who argued the earth revolves around the sun nor the one who showed lightning is electricity could be expected to understand any of this. And yet, interpreting alternative data inevitably becomes the job of a hedge fund’s analysts – often junior analysts at that.
Alternative data today
Alternative data is now used by hedge funds to supplement the sources of information they use to evaluate investments and companies. Due to regulatory proscriptions, these firms must avoid material, non-public information – but that doesn’t mean they should only look at EDGAR database filings, Investor Day PowerPoint decks, and press releases.
What alternative data does is surface facts that are entirely public, but that require information systems experts who know where to look for them, how to find them, and how they relate to the companies and markets an analyst might be interested in.
A large part of alternative data comes from the breadcrumbs people leave online. When you know which websites people visit, where they spend the most time, what they bought, what they looked at and declined to buy, what they watched, what they listened to, and which influencers they follow, you get a good idea of who they are.
But let the companies worry about what an individual consumer might be leaving in their abandoned cart. For investors, it’s the anonymous, aggregate data that matters. As those aggregates change over time, it becomes clear whether people are poised to buy more big-ticket goods, more staples, or larger homes. This affords the canny analyst additional resources to determine where the economic cycle is heading and at what velocity. It can also signal which political messages are gaining acceptance, which could suggest what policies might soon be put in place, and whether pharmaceutical companies, defense contractors, or financial services firms will benefit the most.
Of course, web scraping isn’t the only source of alternative data. Debit and credit card companies as well as payment portals know about almost every retail transaction made; this is often called “exhaust data” because it’s a byproduct of regular business processes. There’s also data generated by sensors. For example, geolocation services can read enough into foot traffic through retail stores that the concept of “I’m just looking” becomes meaningless. And, of course, there’s the whole infrastructure that’s been built around the sensors used to predict the weather.
What a hedge fund wants to know (a particular sector or data type), and whether it needs raw data, cleansed data, or fully analyzed information, determines how a data insights provider would acquire its materials and deliver its services.
At the close
Just like yesterday’s alternative music became today’s classics, today’s alternative data will soon become tomorrow’s standard feed.
The challenge for funds is to keep their research analysts current with the latest data that is on the cusp of becoming mainstream. Once it becomes mainstream, of course, knowledge of it becomes table stakes, not a competitive distinction.
The alternative data provider’s challenge, then, is to continuously discover new relevant data sources that have not yet achieved mass adoption, then introduce them to analysts and portfolio managers who can use them to pursue alpha for their clients.
So maybe we shouldn’t refer to all this as “alternative” data. It’s innovative. It’s overlooked. It’s non-obvious. But its nature is to not be alternative for long.