Quandl assembled 400 buy-side investors, data providers, and sell-side professionals at its second Alternative Data Conference on January 18, an attendance increase of 143% over last year. Presentations focused on five main themes: A.I., resources for success in using alternative data, quantamental analysis, new datasets, and compliance.
Six new datasets were introduced:
- Legal Shield – leverages legal information from a network of 1.7m subscribers, 6.9k broker clients, and 34 law firms to predict macro indicators: consumer confidence, housing starts, total bankruptcies, foreclosure starts, and existing home sales.
- Quandl M&A Insights – uses aviation industry partnerships to track the daily activity of 43k private jets, unmasking aircraft on the FAA block list.
- SimilarWeb – web traffic and app usage data provider. Highly accurate on Williams Sonoma e-commerce sales.
- Dun & Bradstreet – provides metrics that measure the health of private companies.
  - PCC Indicator (monthly) tracks ~400 businesses for each of the 43k zip codes in the US. Highly correlated to US GDP at a regional level.
  - PAYDEX is a dollar-weighted score for how promptly a company pays its bills.
  - The data was collected from a trade program over 30 years.
- S&P Market Intelligence – uses language analytics to understand sentiment from earnings calls. Has data on 8,300 companies with history back to 2004.
- Wolfe Research – partnered with Quandl, using its Dodge US Construction dataset to develop 30+ construction factors with forecasting power on asset returns.
Tools mentioned throughout the conference:
- Jupyter – open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
- Trust.mit.edu – info on creating a secure multi-party system that enables interaction with a protected dataset without sharing it.
- MapD – data integration and visualization platform. Uses a (really fast) GPU database and runs on AWS. Has an open-source offering called MapD Core.
- Gunning Fog Index – open source “readability” test, similar to what is used by language analysis providers.
- Vin.place – database of car ownership, sourced from VIN numbers.
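As a rough illustration of the Gunning Fog Index mentioned in the list above, here is a minimal Python sketch. The formula itself is the standard one: 0.4 × (average words per sentence + percentage of words with three or more syllables). The syllable counter and sentence splitter are simplifying assumptions made here for illustration (the full test also excludes proper nouns, familiar jargon, and common suffixes), so treat the scores as approximate and not as what any language analysis provider actually runs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # A trailing silent "e" usually does not add a syllable ("alternative").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def gunning_fog(text: str) -> float:
    """Approximate Gunning Fog Index for a block of English text."""
    # Naive sentence split on ., ! and ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # "Complex" words are those with three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + percent_complex)


if __name__ == "__main__":
    sample = "Alternative data is complicated."
    print(round(gunning_fog(sample), 1))
```

A higher score means denser text. Applied to earnings call transcripts, a score like this could serve as one simple feature alongside sentiment measures.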
You can predict human behavior better by understanding groups and tribes rather than individuals – Alex Pentland, Professor at MIT.
- 90-95% of human behavior is predictable when analyzing groups. The character of that 5-10% unpredictable behavior can be a leading indicator of trends and performance.
Top safety recommendations for data practitioners from Alex Pentland:
- Do not put all your data in one place – move/share answers, not the data itself.
- Create an auditable Q&A system – use blockchain to log your process.
- Never decrypt the data – use a secure multi-party platform. (More details at trust.mit.edu)
- Data owners should think about charging for access to data and not for the data itself.
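To make the "share answers, not the data" idea concrete, here is a minimal Python sketch of additive secret sharing, one common building block of secure multi-party computation. It is an illustrative assumption and not the system described at trust.mit.edu: the function names, the modulus, and the single-process simulation of several parties are all hypothetical choices made for this example.

```python
import secrets

# Modulus for share arithmetic (an illustrative choice, not a recommendation).
PRIME = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split a value into n random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values: list[int]) -> int:
    """Simulate parties computing a total without revealing their inputs."""
    n = len(private_values)
    # Each owner splits its own value; share j is sent to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Each party adds up the shares it received. Any single share is
    # uniformly random, so it reveals nothing about the underlying value.
    partial_sums = [sum(row[j] for row in all_shares) % PRIME for j in range(n)]
    # Only the partial sums are published; combined, they give the answer.
    return sum(partial_sums) % PRIME


if __name__ == "__main__":
    # Three hypothetical data owners, each holding one private number.
    print(secure_sum([120, 45, 300]))  # prints 465
```

Only the final total leaves the system; no party ever sees another party's raw value. A production setup would also need authenticated channels and protection against dishonest participants, which this sketch omits.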
A new dataset on buy-side alternative data full-time employees (FTEs) was created to analyze how funds are recruiting and compensating their teams. James Moran, Co-founder and President of YipitData, presented the major hiring trends and integration best practices from the dataset. Some highlights:
- Number of alternative data FTEs at funds has doubled every year over the last 5 years.
- Significant demand for data talent is also coming from long-only investors.
- Educational background at funds likely to shift significantly toward STEM majors (41% of alt. data FTEs), primarily Mathematics and Computer Science.
- Tech Companies, Data Providers, and Academia accelerate as FTE sourcing targets.
- An introductory alternative data team is likely to cost more than $3mm, with many funds already spending well over $10mm on alternative data FTEs.
- Full analysis to be published on AlternativeData.org.
Matthew Rothman, MIT Professor, was hired to lead Goldman’s “Data as a Service” offering. He left the details of this offering TBD.
- He’s an “alternative data contrarian” who believes we should focus on addressing 1st order data problems (internal/proprietary data that already exists) instead of 3rd order data challenges from third-party alternative data sources.
- Pushed the audience to use technology and analytics to find value in their own internal data and current processes.
- Provided a couple of examples of data applications:
  - Measuring garbage as a better productivity indicator.
  - Sports car ownership as a sensation-seeking indicator for leadership risk tolerance, habits, and attitudes. (Can use vin.place)
Top-line performance numbers don’t really tell the full story. Analysts should use alternative data as a fundamental snapshot of what is happening under the hood, to understand category- and unit-level trends. – Michael Recce, Chief Data Scientist at Neuberger Berman.
- People don’t realize how much information they leave online. Cookies can be tied to a particular IP address, even after you delete them.
- This data can be aggregated, sold, and used to see what specific companies (e.g. funds) are over-indexing in search terms and content.
- Funds must place engineers and technologists at the center of the investment idea process.
Schroders launched its data insights unit in late 2014 and now has a team of 22 engineers and data scientists. Led by Mark Ainsworth in London, the unit just added its first New York team member.
- They successfully used satellite imagery to predict the outcome of M&A activity in the UK.