Quandl assembled 400 buy-side investors, data providers, and sell-side professionals at its second Alternative Data Conference on January 18, with attendance up 143% from last year. Presentations focused on 5 main themes: A.I., resources for success in using alternative data, quantamental analysis, presenting new datasets, and compliance. Six new datasets were introduced:
  • Legal Shield - leverages legal information from a network of 1.7m subscribers, 6.9k broker clients, and 34 law firms to predict macro indicators: consumer confidence, housing starts, total bankruptcies, foreclosure starts, and existing home sales.
  • Quandl M&A Insights - uses aviation industry partnerships to track daily activity of 43k private jets, unmasking aircraft on the FAA block list.
  • SimilarWeb - web traffic and app usage data provider. Highly accurate on Williams Sonoma e-commerce sales.
  • Dun & Bradstreet - provides metrics that measure the health of private companies.
    • PCC Indicator (monthly) tracks ~400 businesses for each of the 43k zip codes in the US. Highly correlated to US GDP at a regional level.
    • PAYDEX is a dollar-weighted score for how promptly a company pays its bills.
    • This data was collected through a trade program over 30 years.
  • S&P Market Intelligence - using language analytics to understand sentiment from earnings calls. Has data on 8,300 companies with history back to 2004.
  • Wolfe Research - partnered with Quandl, using its Dodge US Construction dataset to develop 30+ construction factors with forecasting power on asset returns.
Tools mentioned throughout the conference:
  • Jupyter - open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
  • Trust.mit.edu - info on creating a secure multi-party system that enables interaction with a protected dataset without sharing it.
  • MapD - data integration and visualization platform. Uses a GPU-based (really fast) database and runs on AWS. Has an open source offering called MapD Core.
  • Gunning Fog Index - open source “readability” test, similar to what is used by language analysis providers.
  • Vin.place - database of car ownership, sourced from VIN numbers.
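As an illustration of the Gunning Fog Index mentioned above, here is a minimal Python sketch. It uses the standard formula, 0.4 × (average words per sentence + percentage of complex words), where a complex word has three or more syllables. The syllable counter is a rough vowel-group heuristic and is an assumption of this sketch, as are the function names `count_syllables` and `gunning_fog`; commercial language analysis providers likely use more careful tokenization and word lists.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting runs of consecutive vowels.

    This is a crude heuristic (an assumption of this sketch), not a
    dictionary-based syllable count.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text: str) -> float:
    """Return the Gunning Fog Index of `text` (0.0 for empty input)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    # "Complex" words: three or more (approximate) syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)


if __name__ == "__main__":
    sample = "Alternative data is increasingly important. Funds hire engineers."
    print(round(gunning_fog(sample), 2))
```

A higher score suggests text that needs more years of formal education to read comfortably; applied to earnings call transcripts, it gives one simple, reproducible readability feature.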
You can predict human behavior better by understanding groups and tribes rather than individuals - Alex Pentland, Professor at MIT.
  • 90-95% of human behavior is predictable when analyzing groups. The character of that 5-10% unpredictable behavior can be a leading indicator of trends and performance.
Top safety recommendations for data practitioners from Alex Pentland:
  • Do not put all your data in one place - move/share answers, not the data itself.
  • Create an auditable Q&A system - use blockchain to log your process.
  • Never decrypt the data - use a secure multi-party platform. (More details at trust.mit.edu)
    • Data owners should think about charging for access to data and not for the data itself.
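To make the "auditable Q&A system" recommendation above more concrete, here is a minimal Python sketch of a tamper-evident query log. It is a simple hash chain rather than a full blockchain (no distribution or consensus), and it is not taken from the talk or from trust.mit.edu; the function names `append_entry` and `verify` and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first entry


def _digest(question: str, answer: str, prev_hash: str) -> str:
    """SHA-256 over a canonical JSON encoding of one log record."""
    body = {"question": question, "answer": answer, "prev_hash": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, question: str, answer: str) -> dict:
    """Append a question/answer record that is chained to the previous one."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "question": question,
        "answer": answer,
        "prev_hash": prev_hash,
        "hash": _digest(question, answer, prev_hash),
    }
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Return True only if no record has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        expected = _digest(entry["question"], entry["answer"], entry["prev_hash"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, "average basket size by region?", "aggregate result only")
    append_entry(log, "weekly store visits, top 10 chains?", "aggregate result only")
    print(verify(log))
```

Because each record stores the hash of the one before it, editing any earlier question or answer changes its hash and breaks every later link, which is what makes the log auditable. Only the answers are logged and shared, in line with the advice to move answers rather than the data itself.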
A new dataset on buy-side alternative data full-time employees (FTEs) was created to analyze how funds are recruiting and compensating their teams. James Moran, Co-founder and President of YipitData, presented the major hiring trends and integration best practices from the dataset. Some highlights:
  • Number of alternative data FTEs at funds has doubled every year over the last 5 years.
  • Significant demand for data talent is also coming from long-only investors.
  • Educational background at funds likely to shift significantly toward STEM majors (41% of alt. data FTEs), primarily Mathematics and Computer Science.
  • Tech Companies, Data Providers, and Academia accelerate as FTE sourcing targets.
  • An introductory alternative data team is likely to cost more than $3mm, with many funds already paying well over $10mm on alternative data FTEs.
  • Full analysis to be published on AlternativeData.org.
Matthew Rothman, MIT Professor, was hired to lead Goldman’s “Data as a Service” offering. He left the details of this offering TBD.
  • He’s an “alternative data contrarian” who believes we should be focused on addressing 1st order data problems (internal/proprietary data that already exists) instead of 3rd order data challenges from third-party alternative data sources.
    • Pushed the audience to use technology and analytics to find value in their own internal data and current processes.
  • Provided a couple of examples of data applications:
    • Measuring garbage as a better productivity indicator.
    • Sports car ownership as a sensation-seeking indicator for leadership risk tolerance, habits, and attitudes. (Can use vin.place)
Top line performance numbers don’t really tell the full story. Analysts should use alternative data as a fundamental snapshot of what is happening underneath the hood, and to understand category- and unit-level trends. - Michael Recce, Chief Data Scientist at Neuberger Berman.
  • People don’t realize how much information they leave online. Cookies can be tied to a particular IP address, even after you delete them.
    • This data can be aggregated, sold, and used to see what specific companies (e.g. funds) are over-indexing in search terms and content.
  • Funds must place engineers and technologists at the center of the investment idea process.
Schroders launched its data insights unit in late 2014 and now has a team of 22 engineers and data scientists. Led by Mark Ainsworth in London, the unit just added its first New York team member.
  • They successfully used satellite imagery to predict the outcome of M&A activity in the UK.