“These types of strategy have exploded in recent months and by 2020, the industry is projected to be worth USD 350mn, with total alternative data spending in excess of USD 1.7bn.”

Data has become businesses’ most precious resource

Investment professionals across the industry have begun to understand the utility of alternative data and the way it is permanently changing the investment landscape. But what exactly is it? Alternative data (Alt Data) is information gathered from non-traditional information sources. By analysing it, it is possible to gain insight beyond what traditional data sources are capable of providing. Alt Data sources typically hold non-financial information that can be used to better assess the future price performance of invested assets. Alt Data is found in a company’s internal data, in physical technological installations or, most commonly, via web-scraping tools, an automated method of extracting data from websites. The web-scraped data itself takes many forms, including product pricing, search trends, insights from expert networks and web traffic data of all kinds. Around 2014, a group of largely sophisticated hedge funds started operating in this new data-rich investment world, aggressively seeking information advantages. Since then, many start-ups have jumped into this business, attempting to monetise this extensive availability of data.

Why Alt Data?

We found four reasons. The first reason driving the need for Alt Data is the growth of data availability over the last ten years, thanks to advancements in technology. Roughly 800 datasets across more than 20 categories relevant to the buy side are available today. The five most popular data types are social/sentiment, private company, credit card, supply chain and web. Examples include satellite imagery used to count the number of cars in shopping centre car parks as a metric for retail sales activity; geospatial analysis for identifying the geographical proximity of competitors; and pricing data analysis for insight on everything from bundle pricing to financial rates monitoring.

The second is the drive among buy-side firms to receive non-traditional data with low latency in order to make faster and better investment decisions, enabling them to capitalise on opportunities early and mitigate potential risks. Using Alt Data, investors can monitor how a business is doing on a weekly or even daily basis (rather than waiting for the monthly or quarterly company updates), giving them a considerable edge over other investors. For example, by employing “quantitative overlays”, such as leveraging credit-card swipe payment information, fundamental analysts can now monitor sales data against earnings estimates and forecast potential share price impact well in advance.
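The “quantitative overlay” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any vendor’s or fund’s method: the function name `nowcast_quarter`, the `panel_coverage` parameter (the assumed share of company revenue visible in the card panel) and all figures are hypothetical. The sketch extrapolates partial-quarter card spend to a full-quarter revenue figure and compares it with a consensus estimate.

```python
from dataclasses import dataclass


@dataclass
class NowcastResult:
    projected_revenue: float  # full-quarter revenue implied by the card data
    surprise_pct: float       # projected vs consensus estimate, in percent


def nowcast_quarter(weekly_panel_spend, weeks_in_quarter,
                    panel_coverage, consensus_revenue):
    """Extrapolate partial-quarter card spend to a full-quarter estimate.

    weekly_panel_spend: spend seen in the card panel for each week so far
    weeks_in_quarter:   number of weeks in the reporting quarter (e.g. 13)
    panel_coverage:     ASSUMED share of company revenue the panel captures
    consensus_revenue:  analysts' consensus revenue estimate for the quarter
    """
    if not weekly_panel_spend:
        raise ValueError("need at least one week of data")
    if not 0 < panel_coverage <= 1:
        raise ValueError("panel_coverage must be in (0, 1]")

    # Average observed week, scaled to the full quarter and to the whole
    # company (a naive linear extrapolation, ignoring seasonality).
    avg_weekly = sum(weekly_panel_spend) / len(weekly_panel_spend)
    projected = avg_weekly * weeks_in_quarter / panel_coverage
    surprise = (projected - consensus_revenue) / consensus_revenue * 100.0
    return NowcastResult(projected, surprise)


# Illustrative numbers only (USD millions), not real data.
result = nowcast_quarter(
    weekly_panel_spend=[2.0, 2.2, 2.1, 2.3],
    weeks_in_quarter=13,
    panel_coverage=0.05,
    consensus_revenue=520.0,
)
print(f"projected revenue: {result.projected_revenue:.1f}")
print(f"implied surprise:  {result.surprise_pct:+.1f}%")
```

In practice the coverage ratio is itself an estimate that drifts over time, so a real overlay would calibrate it against previously reported quarters rather than fix it by hand.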

The third reason is based on ROI (return on investment). Alt Data is expensive and failure is often due to spending too much money and time on the wrong data firms. But if you are wise enough in selecting the right vendors, you can get a very good edge and performance ROI from your data spend.

And finally, there is the so-called “fear of missing out”: buy-side firms do not want to be left out of this party and are keen to glean insight on what is happening around the business. All types of investors are now embracing the use of more data, and those who fail to join this revolution are likely to underperform and be left behind.

Implementing Alt Data is not only about benefits

While firms recognise the alpha generation potential of these new datasets, they face challenges like data connectivity, data cleaning, varying quality and ease of use. Some of the top challenges when leveraging Alt Data are lack of workflow integration, short histories, collection systems that are prone to change, information integrity and reliability, as well as data protection policies. Sufficient regulatory protection for individuals remains elusive at this stage, and some of the data types being procured by hedge funds are not anonymous with respect to personal information. The GDPR (General Data Protection Regulation) in the EU, for example, has recently been adopted to strengthen and standardise the protection (anonymity) of its citizens’ personal data. The main driver behind this regulation is the complexity of information management, which makes Big Data difficult to govern.

Data must now be handled appropriately, certified and kept compliant with local laws on both privacy and security management. Despite these challenges, we still think there can be substantial benefits to using alternative data. Very recently, some of the leading reference sources for pricing data and major investment firms have started offering clients a single access point to multiple market-leading alternative data providers for finding and receiving reliable data, which eliminates costly and lengthy procurement processes. This speeds up time to value and enables easy, efficient integration with existing systems or databases: quant investors need only select their preferred programming language (mainly Python). With this access point, professional investors can browse and examine quality metadata online, trial sample datasets prior to acquisition and immediately put them to use within their organisation.

How quants will start using alternative data practically

According to recent studies and surveys, on average over 80% of funds use or are expected to use alternative data. We believe that 2019 will mark the beginning of a more mature phase of this business that will probably last five to seven more years, in which the early majority of quants will start incorporating alternative data in their businesses. We think the hottest category of alternative data to emerge in the coming years could be consumer transaction data, where the buy side gets the most return on investment. The most prominent new development will probably be rising demand for employment data. The first significant challenge for quants dealing with Alt Data is backtesting, that is, having a mechanism to evaluate the effectiveness of a trading strategy by running it against historical data. Today, backtesting on Alt Data is very difficult, simply because we do not yet have good and sufficiently broad historical data.
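To make the backtesting mechanism concrete, here is a minimal sketch in plain Python. It is an assumption-laden toy, not a production framework: the long/flat rule, the function names `backtest_long_flat` and `cumulative_return`, and the six-week history are all invented for illustration. The signal stands in for an alternative-data series, such as the weekly change in a transaction-volume index.

```python
def backtest_long_flat(signal, asset_returns):
    """Run a long/flat strategy over historical data.

    signal[t] is known at the end of period t; the position is held during
    period t + 1, so the signal never sees the return it is trading on.
    Returns the list of per-period strategy returns.
    """
    if len(signal) != len(asset_returns):
        raise ValueError("signal and asset_returns must have equal length")
    strategy_returns = []
    for t in range(1, len(asset_returns)):
        # Fully invested after a positive signal, otherwise in cash.
        position = 1.0 if signal[t - 1] > 0 else 0.0
        strategy_returns.append(position * asset_returns[t])
    return strategy_returns


def cumulative_return(returns):
    """Compound a sequence of simple returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Made-up history: weekly change in a hypothetical transaction-volume index
# (the alternative-data signal) and the stock's weekly returns.
signal = [0.03, -0.01, 0.02, 0.04, -0.02, 0.01]
asset_returns = [0.010, 0.020, -0.015, 0.012, 0.018, -0.030]

strategy = backtest_long_flat(signal, asset_returns)
print(f"strategy:     {cumulative_return(strategy):+.2%}")
print(f"buy and hold: {cumulative_return(asset_returns[1:]):+.2%}")
```

Six observations prove nothing statistically, which is exactly the difficulty with short Alt Data histories: the shorter the sample, the easier it is to mistake noise for an edge.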

Alt Data also creates an urgent need for advanced analytics skills and capabilities to process this vast amount of data. As a result, data teams are growing everywhere: the number of full-time Alt Data employees on the buy side, mainly data scientists and analysts, has grown by roughly 450% in the last five years.

To fully reap the benefits of this new investment world of abundant data, machine learning will play a central role in identifying patterns and correlations, managing risk and transforming this knowledge into actions that allow buy-side firms to gain a competitive advantage. TensorFlow and scikit-learn in Python are the prevailing big data analytics tools used in asset management today. Nearly all major industry players are now filling their quant teams with physicists and data scientists, providing them with access to the data and turning them loose, expecting them to come up with something brilliant. We do not agree with this kind of approach: quants will certainly fail if they do not adopt a more “conservative approach” that keeps all decisions and management in the hands of experienced finance professionals. Discerning valuable information from noise requires extensive real financial experience, not maths and statistics alone.

The goal, then, is not to replace finance professionals with mathematicians, but to take an experienced investor’s hypothesis and test it with machines to obtain superior, explainable and more actionable information. Intelligently mining this data is critical to avoid getting lost in machine-made interpretations. As we said last year, it is not man versus machine, but experienced finance professional with machine.

To conclude, we believe the big data revolution will usher in a new era of investing that will ultimately benefit markets by lowering day-to-day volatility, producing fewer surprises, strengthening investor confidence and enhancing market stability. Big data holds incredible promise to facilitate many investment decisions. Companies capable of extracting value from their data will enjoy a competitive advantage, as long as they distinguish what is of value from what is not.