Uncategorized

The Download 21: How Tiger Cubs are using Alt. Data, 4 New Datasets, 2 Jobs

Providers Added to the Database

  • Crimson Hexagon – Data on 1 trillion social media posts dating back to 2008, aggregated from Facebook, Twitter, Instagram, Reddit, Tumblr, and more.
  • Tribe Dynamics – Data on earned media (influencer) performance across beauty and fashion companies.
  • Certify – Data on U.S. business travel and expense spending trends. Covers restaurants, airlines, lodging and car rental.
  • Webhose.io – Web data provider for sentiment tracking across news, blogs, reviews, broadcast, and e-commerce.
  • See full public database of 297 data providers.

Updates

Jobs
News & Insights
  • Viking and Coatue are using alternative data to inform fundamental strategy, while Maverick is tapping the data to implement a systematic strategy on its smaller positions. (Novus)
  • According to a State Street survey, only 60% of institutional investors feel they have the right talent in place for investment data strategy. Five years ago, that figure was 91%. (Asset Servicing Times)
  • Lyft claims to have 35% of ridesharing market share according to email receipt data. (CNBC)
    • Second Measure, a credit card data provider, has Lyft at 27%.
    • Certify, a business expense tracker, puts Lyft at 19% of business-related ridesharing.
  • NPD, a point-of-sale and consumer survey provider, reports that online sales in the automotive aftermarket have doubled in the past three years to 14% market share. (prweb)
  • A UBS analyst claims that satellite data on Walmart parking lots correlates significantly with earnings, suggesting that there is indeed insight obtainable from the most cited example in alternative data history. (Observer Research Foundation)
Other Reading
  • Satellite data of nighttime illumination used as GDP indicator in paper that shows authoritarian regimes tend to inflate their growth rates by a factor of 1.15 – 1.3 compared to strong democracies. (Twitter, Full Paper)

 


The Download 20: Goldman Gears Up to Sell Alt. Data Product, Alt. Data Spend Accelerates, 4 New Datasets

Providers Added to the Database

Updates

  • Avast, the parent company of Jumpshot, a web traffic provider, estimates the addressable market for its Jumpshot business at $2.8b by 2021 (Prospectus, p. 57). Avast IPO’d this week. (Reuters)
  • YipitData launched a web-scraping-based product to track key performance metrics for China Lodging Group (HTHT), including occupancy rates, ADR, and RevPAR. (Request Info)
  • 7Park Data raised $7mm in new funding. (SEC)
  • iSentium, a sentiment provider, hired Ray Tierney III (former head of trading at Morgan Stanley and Bloomberg Tradebook) as president and COO. (efinancialcareers)
Jobs
Events
  • Discovery Day Intrepid – BattleFin (New York, June 20) – AlternativeData.org is a media partner for the event; use discount code “AltData20” to receive 20% off your ticket.
News & Insights
  • Buy-side spend on alternative datasets has significantly accelerated over the last two years (~60% y/y) and is estimated to reach over $1bn by 2020. (Financial Times, AlternativeData.org)
  • Goldman Sachs gearing up to sell internal data. Potential product could include trade and pricing data at high frequencies, leveraging role as over-the-counter dealmaker. (Risk.net, subscription required)
  • Market for geo-location data grows with increasingly inventive use cases, but technical challenges remain.
    • Market for geo-location data expected to grow to $250mm by 2020, according to new report from Opimas, a capital markets consultancy. (marketsmedia, Full Report)
    • Schroders data science unit used geo-location data to assess a pizza-chain expansion plan based on casual dining competition and foot traffic. (Pensions & Investments)
    • Still, the data requires expert validation before it can be put to use. “In one case, there was data that showed peak visitors in a Home Depot in California … (came) at 8 a.m. The store opened at 10 a.m. But that store has a loading dock that abuts a freeway.” – Michael Recce, Chief Data Scientist, Neuberger Berman. (Pensions & Investments)
  • Washington Post op-ed theorizes on a potential regulation for companies to release their data to the public after a few years, in a model similar to drug patents. (Washington Post)

 


FT: Buy-side Alternative Data Spend to Exceed $1bn by 2020

The Financial Times published an article yesterday titled “Asset managers double spending on new data on hunt for edge” featuring AlternativeData.org’s first analysis on buy-side industry spend on alternative data. Our findings show a significant acceleration in spending over the last two years (~60% y/y) that is estimated to reach over $1bn by 2020.

Key takeaways:

  • Buy-side spend on alternative datasets has doubled since 2016, and is expected to hit $1bn by 2020.
  • The FT cited increasing pressure from low-cost index funds as a primary driver for asset managers turning to alternative data in the search for an edge.
  • The increased spend on alternative datasets suggests that the need to build out alternative data teams will continue to accelerate, as suggested in this January feature of AlternativeData.org in the FT (first link).
  • Another consultant is referenced with a higher current estimate of $5bn in annual spend on alternative data, but a more modest growth projection (30% y/y). (Note: Methodology could not be verified but it likely includes a broader industry in their “investor” sample, rather than exclusively buy-side discretionary investors).
  • PDF of article.

Momentum in alternative data use affirmed by industry leaders:

  • “The advanced pace of technological disruption is impacting the traditional investment landscape, providing new ways to identify and originate investment opportunities that generate value for investors.” – Chris Molumphy, CIO, Franklin Templeton fixed income.
  • “We continue to add investment research staff and are working on a more comprehensive data strategy, including enhancements in the ease of doing business for advisers . . . as well as enhanced data availability for our distribution and investment management teams.”  – Benjamin Clouse, CFO, Waddell & Reed.

About the analysis:

  • Our results combine two separate analyses: 1) a spend survey of buy-side AlternativeData.org subscribers, and 2) our buy-side alternative data full-time employee analysis.
  • The survey collected spending by buy-side discretionary investors on alternative datasets over the last two years and estimated spend in the coming two years. Results provided a breakdown of spend across different fund tiers:

  • The employee spend figures looked at growth projections of full-time employees that work with alternative data in the investment process, based on the full-time employee analysis we published in January. We then estimated the total hiring, salary, and overhead costs of maintaining these employees to reach our total spend estimate.


The Download 19: Hedge Funds Compete With Tech for Data Talent, 9 New Datasets, 2 New Jobs

Providers Added to the Database

  • Big Byte Insights – Pricing and supply data on self-storage containers, and retailer exposure data for malls and shopping centers.
  • SuperData – Revenue data and research reports across game platforms, game titles, game categories, and esports markets.
  • Gro Intelligence – Data on the agriculture supply chain: production, consumption, prices, freight, trade flows, etc. Aggregate public (government and trade associations), satellite, and weather data.
  • RepRisk – Data on ESG and business conduct risks for 100k+ companies.
  • Irithmics – Artificial intelligence platform for understanding investor behavior.
  • Yewno – Data on quantity of intellectual property a company has across technologies with high growth expectations (e.g. blockchain, AI).
  • See full public database of 292 data providers.

Updates

  • YipitData combined its web-scraped pricing and bookings database with ARC travel agency settlement data to analyze the competitive dynamics and unit revenue economics of JBLU, ALGT and UAL vs. other U.S. based carriers. (YipitData)
  • Ursa, a satellite analytics provider, partners with ICEYE, a micro-satellite company using weather-agnostic imaging technology, to create global oil storage dataset. (Geospatial World)
  • Quantopian, a data science platform, partners with FactSet to integrate datasets into its interface for building and testing investment algorithms. (CNBC)
  • Quandl, a data broker, launches corporate aviation dataset. (Quandl)
Jobs
RFD
  • Sales and review data direct from a panel of Amazon sellers (i.e. not web scraped or consumer panel data). Contact data@alternativedata.org if you have this data.
News & Insights
  • Hedge fund executives from Marshall Wace, Lansdowne, Man Group, and more weigh in on alternative data and machine learning. “Many hedge fund firms now hire highly quantitative talent — such as mathematicians, physicists, and computer scientists — instead of the traditional business school graduates … this is pushing hedge fund firms into direct competition with the technology industry for the brightest talent.” – Alternative Investment Management Association. (Financial News, subscription required)
  • Kansas City Fed uses satellite data of nighttime lights as indicator for GDP to forecast current-quarter US export growth. (Kansas City Fed)
  • Ship tracking data captures two China-bound American vessels changing course after China announced US supply tariffs while they were en route. (Bloomberg)
  • Jeff Bezos’ flight records used as further evidence by reports suggesting Arlington region is contender for AMZN HQ2. (Fortune)
  • Sentieo, a data interface, calls positive PSTG guidance using data from Google Trends, Alexa web traffic, and Twitter mentions. (Forbes)

 


The Download 18: Data Executive Candidate, 7 New Datasets, Long-Onlies Using Alternative Data

Providers Added to the Database

  • RxData.net – Data on pharmaceutical pricing, reimbursement, formulary placement and regulatory registrations.
  • MariData – Live database of global marine cargo trade activity between over 7,000 ports, including cargo details and trade indices for monitoring macro and micro performance.
  • ktMINE – Intellectual property data, including royalty rates, agreements, patents, assignments, trademarks, corporate trees, IP connections, IP news, and litigation information.
  • AreaMetrics – Geo-location data on 10mm monthly active users over 4.2mm Bluetooth beacons.
  • KD Interactive – Transaction and mobile match key data in mid-market retail, including convenience stores, 3-star hotels, restaurants, and airport concessions.
  • LUX Fund and Technology Solutions – Data infrastructure provider. Raised $6 million from Credit Suisse. (Traders Magazine, free login required)
  • See full public database of 286 data providers.

Updates

  • YipitData combined web traffic data from SimilarWeb with its web-scraped transaction dataset to analyze the competitive dynamics and unit economics of BKNG and EXPE, and the implications for TRIP and TRVG. (Sample Report)
  • Neudata, a data consultant, announces $600k in new funding. (prnewswire)
  • Eagle Alpha, a data aggregator, publishes 105-page report on alternative data use cases. (Eagle Alpha)
  • Caserta, a data consultant, hosts webinar with AlternativeData.org on Building an Alternative Data Analytics Platform. (Watch Now)
Candidates (NEW)
  • Seasoned technology executive is seeking the opportunity to build out alternative data technical infrastructure at a hedge fund. Contact jobs@alternativedata.org for more information.
New Jobs
Events
News & Insights
  • Long-only asset managers make (slow) progress in using alternative data. (Pensions & Investments)
    • The primary challenge is using a rapidly dissipating signal in a large fund with slow investment processes.
      • “Most … signals have an incredibly short half-life — more a matter of how much money can be put to work in four weeks than four years.” – Timothy Bruce, Director of Traditional Research, NEPC LLC.
      • “Asset management firms, if they’re successful, have a very large and profitable business, and they’re run in a certain way … and to make changes to that takes time.” – Richard Dell, Head of Equity Manager Research, Mercer.
    • A survey of data providers on AlternativeData.org found that 30% of providers have a higher share of long-only clients than they did in 2016. Only 10% of providers have a lower share. Full results to be published next week. 
  • Expert network utility will rise with emergence of increasingly novel alternative data sets. “… when data is more ambiguous or less easily interpretable, it may give rise to demand for expert conversations around it.” – Max Cartellieri, Co-CEO of AlphaSights. (Integrity Research, subscription required)
  • Bank of America Merrill Lynch hires Rajesh T. Krishnamachari as Head of Data Science for Equities. Krishnamachari was a senior quantitative strategist at JP Morgan and co-author of the 280-page report on AI and Machine Learning in Investment Strategy. (efinancialcareers)
  • Third Avenue invests $10mm in JBG Smith (Arlington, VA REIT), suggesting that the fund believes Arlington will be the location of Amazon HQ2 based on web traffic data. (ZeroHedge)
  • Schroders’ Data Insights Unit predicted a precise property divestment requirement from regulators a year in advance by analyzing the distance between all physical stores involved in a proposed merger. (Schroders)

 


The Download 17: Favorable Ruling For Web-Scraping, 4 New Datasets

Providers Added to the Database

  • Jettrack.io – Public data on corporate aircraft flight activity used to identify future corporate deals.
  • Datarama – Public data aggregated from specialised media sources on private and public companies in Southeast Asia and Greater China.
  • Scrapehero – Bespoke web scraping solutions.
  • SpaceKnow – Satellite provider with datasets covering autos, airlines, manufacturing, and construction.
  • See full public database of 280 data providers.
New Jobs
Events
News & Insights
  • DC District Court is allowing a constitutional challenge to the CFAA to proceed, issuing language that web-scrapers may have First Amendment protection. (Techdirt, Boing Boing)
  • IHS Markit will incorporate web-scraped data into its AI data initiative. The company has advanced beyond AI for process automation into AI for product improvement, which signals that integration with other alternative data categories is likely. (Integrity Research, subscription required)
  • Study on Fed data leaks using taxi data hints at the wide range of use cases available for data collected by Uber and other ride-sharing services. (UChicago)
  • Alpha Architect summarizes high-level steps for performing sentiment analysis on Twitter. (Alpha Architect)
  • PreData, a social data analytics platform, uses volatility in social media along with other alternative data signals to predict North Korean missile launches and macro market trends alike. (AlleyWatch)
  • Decode project explores how blockchain may one day govern ownership of personal transactions data. Pilot already incubating in Amsterdam and Barcelona. (The Guardian)

 


The Download 16: TSLA Production Tracker, YipitData Adds Email Receipt Panel, 8 New Datasets

New Datasets

  • YipitData integrates one of the largest and fastest growing email receipt panels into its data offering, expanding coverage and increasing product granularity and accuracy. (Full announcement)

Providers Added to the Database

  • Omney Data – Web data tracking retailer pricing and promotional activity/discounting.
  • Bloomberg Tesla Tracker – Tracks Model 3 production by counting VIN sequences across National Highway Traffic Safety Administration registrations, social media, and user submissions.
  • Verbatim Advisory Group – Survey data focused on business services, consumer products, retail & restaurants, industrials & energy, and TMT.
  • Caserta – Builds alternative data implementation systems for hedge funds and tier-1 banks.
  • Moody’s Data Alliance – Data portal tracking commercial and industrial loans to private companies. (Integrity Research, subscription required)
  • AlphaLetters – Research provider that manually reviews and summarizes top academic papers on quant investment strategy.
  • See full public database of 277 data providers.

Updates

  • VisibleAlpha, a research interface, announces investment from HSBC, joining a group of several other sell-side investors. (Integrity Research, subscription required)
  • Neudata, a data consultant, launches a use-case research service for investors to apply alternative datasets for different investment narratives. (Neudata)
  • Autotrader (LON:AUTO) – Tracks listing counts, dealer counts, key product penetration rates, and UK used car transactions. (YipitData)
New Jobs
Consulting Projects (NEW)
  • Consulting project for transaction data experts: Projects for data analysts or scientists with experience making KPI estimates using email receipt and/or credit card data. Requires knowledge of unbiasing and modeling techniques beyond simple growth rate analysis. Competitive compensation. Contact jobs@alternativedata.org for more information.
News & Insights
  • A report from Morgan Stanley and Oliver Wyman estimates that 40% of employees on the buy-side will require fundamental retraining to improve data and analytics capabilities. (Oliver Wyman, long read)
  • SAE, the quant investment arm of BlackRock, uses alternative datasets to turn out tools for the asset manager’s traditional investment groups. (Financial Times, subscription required)
    • Economic gauges used across investment teams are created by SAE using internet searches, online invoices, and traffic patterns.
    • Over 37% of SAE employees are PhDs in computer science, physics, and engineering.
  • Datasets used to track Tesla Model 3 production include Bloomberg’s new VIN tracker, U.S. import records, and imagery of factory lots. (Bloomberg)
  • Earnest Research, a credit card data provider, reports that HelloFresh surpassed Blue Apron in share of the $5bn American meal-kit market. (Recode)
  • Funds that access public filing information see 1.5% higher returns in the following month than funds using no public information. (Bloomberg Brief, Full Paper)
    • The research had its flaws, however, in identifying hedge funds accurately. (Integrity Research, subscription required)
  • Insights from survey at JPMorgan quant conference in Asia. (@RobinWigg)
    • 53% say sentiment data is most promising data type.
    • Lack of talent and high fixed cost are top barriers to entry in using big data.
  • James Rosseau, Chief Commercial Officer of LegalShield, discusses data generation, application, and backtesting of Law Index dataset, which uses legal activity to forecast economic conditions. (SeekingAlpha)

 


The Download 15: Data On YELP, 16 New Datasets, Sector Data Specialist Job

New Datasets

  • YELP Dataset – Tracks Paying Advertiser Accounts, Request-a-Quote penetration, and major accounts by category and geography. (YipitData)
  • Cryptocurrency Dataset – Tracks investor sentiment on nearly 100 cryptocurrencies using post data from over 40,000 monthly investors. (Integrity Research, subscription required)

Providers Added to the Database

  • Bridg – Credit card data provider for restaurant industry.
  • Venpath – Geo-location provider sourcing from 212 apps and 61m unique monthly devices.
  • Safegraph – Geo-location data from over 50m mobile devices, tracked to 15m POI and 1000 brands. Has raised $16m in funding.
  • X-mode – Geo-location data on 30m monthly active users, obtained from 300+ apps.
  • Anonymous Provider – Data on 20% of US household moves, available 4-8 weeks before the move event. Insight into retail (regional demographic shifts), insurance, cable/internet, and banking. Contact data@alternativedata.org for more information.
  • Drillinginfo – Data on exploration & production, oilfield services, midstream, and financial services.
  • Rigdata – Drilling activity data with over 25 years of experience covering the US, Gulf of Mexico, and Western Canada oil & gas industry.
  • MarketCheck – Auto data provider with active inventory for over 35k US car dealers.
  • TVeyes – Data on brand placements in TV and radio, including logo and object recognition.
  • PriceStats – Data on online prices tracking inflation in 22 economies. (Institutional Investor)
  • Optimum Complexity – Risk analysis data for assessing tickers based on organizational complexity.
  • Legis – Prediction data on Congressional bill outcomes. (Economist)
  • Associated Press – Ticker level data on text archives, real time news, and human-curated database of 140k upcoming potentially newsworthy events.
  • Alpha Hat – Visualization platform focused on geolocation data with plans to expand to multiple datasets.
  • See full public database of 270 data providers.

Updates

  • Datastreamx, a data broker, announces blockchain-based network for decentralized data access. Smart contracts will set rules for data usage and payment. Network will launch in April 2018, but no ICO date is set. (Medium)
  • Crux Informatics, a data infrastructure provider, announces investment from Citi, increasing total funding to $21m. (PRnewswire)
New Jobs
News & Insights
  • Funds with in-house alternative data processing capabilities, including State Street and MSCI, are building related ETFs to sell to the competition. They are betting that smaller firms will get broad alternative data exposure through the ETFs, rather than the expensive process of building alternative data capabilities themselves. (Institutional Investor)
  • Schroders data insights team does not believe that wide access to the same alternative data sources arbitrages away alpha. Rather, they claim that the more data sources teams incorporate, the more possible permutations of analyses emerge, suggesting that ubiquitous alternative datasets still have unique value for each firm. (Morningstar)
  • Pledge to enact new cyber security standards for fintech firms may impact how Yodlee can access financial data. (Financial Times, subscription required)
  • Foursquare, a geo-location provider, reports that “mall death” is overstated, and that attendance at high-end luxury malls is actually on the rise. (Yahoo Finance)
  • Two data providers predict probability of bills passing in Congress.
    • Skopos Labs provides both outcome predictions and valuation impact forecasts on the ticker level. (Skopos Labs)
    • Legis has correctly forecasted the outcome for the first 44 bills for which it has issued predictions. (Economist)
  • Key takeaways from the Augvest-hosted geo-location panel:
    • Cell tower data, the most common geo-location data, is far less accurate than GPS or wifi.
    • Geo-location providers are differentiated by the app types that they work with (and the resulting bias of their sample) and whether they provide useful analyses/products on top of the raw data. Reveal and X-Mode provide raw data, while Safegraph and Cuebiq deliver features layered on.
  • Point72 names Kirk McKeown director of proprietary research. He will oversee Aperio, Data Sourcing and Strategy, and Point of the Spear, with the goal of aligning communication between investment managers and data scientists. (Financial Adviser – Private Wealth)
  • Marine Traffic, a ship-tracking provider, gives insight into the whereabouts of a yacht seized by the FBI off the coast of Bali in relation to the 1MDB scandal. (WSJ, subscription required)
Upcoming Events

 

 


The Download 14: Data on Airbnb 2017 Growth, 10 New Datasets, 6 New Jobs, 5 Requests for Data

New Datasets

  • Connotate – Web scraping, data collection, and monitoring services.
  • Mozenda – Web scraping software that integrates with databases and BI toolkits.
  • Business Intelligence Advisors – Founded by CIA employees; analyzes earnings calls and other management commentary.
  • DAR Partners – Sales consultants for alternative data providers.
  • Infotrie – News analytics and sentiment data provider with data on 50k stocks, topics, people, and commodities.
  • RootMetrics – Mobile network performance data; subsidiary of IHS Markit.
  • OpenSignal – Crowdsourced mobile network performance data.

Updates

Requests for Data

The AlternativeData.org network has hundreds of multibillion L/S and Long-only funds looking for very targeted datasets. Below are some specific Requests for Data from a selection of these funds. Please reach out to data@alternativedata.org if you have datasets on any of the following:

  • LinkedIn – Profile data on employees of top ~25k companies.
  • Merchant Acquirers – Market share data of companies involved in credit card routing and exclusivity agreements (e.g. First Data Corp, WorldPay).
  • Amazon – Earnings power, industry expansion, country regulations, digital advertising revenue, etc.
  • Any alternative data on Canon or The New York Times
New Jobs
News & Insights
  • Airbnb grew ~50% in 2017 according to alternative data sources: 
    • YipitData tracked global listings up 40% and room-nights stayed up 50% YoY. (YipitData)
    • 1010Data, a credit card data provider, found that US bookings grew 49% YoY, significantly outpacing the hotel industry average of 28%. (MediaPost)
  • AlternativeData.org published an analysis of alternative data full-time employees (FTEs) on the buy-side. 
    • The number of alternative data FTEs has grown ~450% in the last 5 years.
    • Most alternative data FTEs have 11+ years of experience and do not have graduate degrees.
    • Tech, Academia, and Data Providers are quickly becoming main channels for sourcing alternative data FTEs.
    • Cost of an alternative data team starts at $1.5 – $2.5m.
  • Funds are paying total compensation of nearly $180k for the average engineer/quant role. (efinancialcareers)
  • The Chartered Alternative Data Analyst Institute launches with plans to develop an exam-based curriculum for standardizing best practices in alternative data analytics. (FinAlternatives, Integrity Research)
  • J.P. Morgan attributes climb in Institutional Investor sell-side research rankings to alternative data incorporation. In a statement, Sunil Garg, J.P. Morgan’s head of international equity research in Asia and EMEA, says the bank has worked to ensure that its research coverage footprint is among the largest of all sell-side research houses by using alternative-data analysis techniques. (Institutional Investor)
    • UBS also credited use of alternative data through UBS Evidence Lab for their top research ranking. (Institutional Investor)
  • Sentieo, a data interface, correctly predicts the Twitter, Grubhub, Skechers, and Sodastream earnings beats using data from Google Trends, Alexa, and Twitter mentions. (Sentieo)
  • Schroders incorporates alternative data not to implement quantitative approaches, but to augment its fundamental analyses. Having hired their first data scientist in September 2014, the fund now employs 27 such employees. “The data that does not fit into our analysts’ spreadsheets is the gap that we are trying to fill.” – Mark Ainsworth, Head of Data Insights, Schroders. (MarketsMedia)
  • Automakers explore monetizing the data collected by smarter cars. “Hedge funds probing the health of the economy want anonymized trunk sensor data to see if you bought anything when you went to the mall.” (Bloomberg)

Other reading:

  • T-Mobile claims that RootMetrics rankings, the widely cited reports on mobile coverage, are biased in favor of Verizon due to a dataset of “paid consultants.” T-Mobile points to OpenSignal, a crowdsourced mobile coverage dataset, as an alternative source that provides unbiased data. According to OpenSignal, T-Mobile is on equal ground with Verizon in coverage and speed. (Android Authority)
  • ARLnow.com, an Arlington, VA news source, reports large traffic volume from an “internal Amazon.com page devoted to its HQ2 search,” leading to speculation that Arlington is in the final mix for new Amazon HQ. The report does not disclose the referral URL or identification methods, leaving the possibility that a crawler or bot using Amazon Web Services is being mistaken for employee traffic. (ARLnow, Business Insider)

 

 


Takeaways from Battlefin Miami

Battlefin brought together 107 asset managers ($760bn AUM), 94 data providers, and ~100 other industry professionals in Miami from January 30-31. The format was productive, with packed, short presentations in the morning followed by an afternoon of back-to-back 15-minute one-on-one meetings. AlternativeData.org was a media partner for the event, from which we highlight new datasets, updates, and key takeaways below.

Twenty-four new datasets:

  • Consumer Edge Insights – Credit card transaction panel of over 15mm users from hundreds of US banks. Also has merchant scanner data, Amazon basket tracking with 100k opt-in panel, and survey data.
  • Standard Media Index – Ad spend data sourced directly from booking and invoice systems of media holding partners. Data is aggregated monthly.
  • Epsilon – Marketing company with ~130mm US users’ credit card transaction data.
  • Rystad Energy – Tracks 1,000 companies in the oil and gas industry, providing metrics on exploration, production, oilfield servicing, and North American shale.
  • BizQualify – Tracks company employee benefit plans using IRS and Department of Labor filings.
  • TMT Analysis – Mobile device data provider with metrics tracking unique ad-cookie IDs, IMEI data, and number portability.
  • EPFR – Daily fund flows data, showing the fund origin and destination of moving assets.
  • FeatureX – Satellite analytics provider. API allows for natural language querying.
  • Drawbridge – Data on cross-device consumer attribution.
  • Edison – Real-time data on user purchases and product demand, sourced directly from Edison’s mail app. Covers 11,000 brands. Acquired Return Path’s Consumer Insights business.
  • Dodge – Construction data provider with information on projects and bidding.
  • Linkup – Global job listing provider with 150mm jobs tracked since 2007. Provides both raw data and insights.
  • Sequentum – Web scraping software and solutions.
  • GovSpend – Data on government spending, filterable by products, companies, or people.
  • aWhere – Agriculture data provider with global coverage of key predictors including weather, pest, and disease risk.
  • Vigilant – Public records data provider with real-time alerts across courts, lobbying records, business filings, and campaign financing, among others.
  • Amenity Analytics – Text analytics platform for analyzing unstructured data. Customizes reports for earnings call transcripts, regulatory filings, broker research, news, and more.
  • ListenFirst – Tracks social data across organic & paid channels to create a full picture of a company’s social presence.
  • Shareablee – Aggregates all social pages to assess social presence for brands and companies.
  • MKT Mediastat – Unique signals from company media coverage, including measurements of unexpected news coverage, rate of agreement across media sources, and linkages between companies.
  • QL2 – Public data on travel, retail, and automotive companies. Covers ~150 public and ~150 private companies.
  • Sustainalytics – Environment, social, and governance (ESG) score data provider. Provides the ESG scores shown on Yahoo Finance.
  • Owl Analytics – Data on environment, social, and governance (ESG) metrics. Mission is for investors to be able to maintain strategy but point their capital toward companies that have positive social and environmental impact.
  • ISS Analytics – Data on governance metrics as an indicator of company performance.

Updates:

  • Main difference between top two web-traffic data providers:
    • Jumpshot – Created to monetize the data from antivirus software Avast. Has more reliable cohorts (people don’t uninstall antivirus software often) but has more panel bias.
    • SimilarWeb – Based on browser extensions, manages their panel bias better (given broad distribution of users) but suffers from higher cohort turnover.
  • AppAnnie, a mobile app usage provider, now has a dedicated professional services team that provides custom data and analytics from their dataset.
  • Enigma, a public data and infrastructure provider, uses data to measure new wells and oil production operations. The measure correlates with revenue.
  • Ursa, a satellite data provider, says its China oil storage dataset is its most robust. Ursa provides total storage and flows 2-3 months before government reports.
  • GroundTruth, a geolocation data provider, has a separate company called “Skymap” (200 employees) that is entirely devoted to “geo-fencing”, associating each location with a given place of business and keeping track of changes over time.
  • Cuebiq, a geolocation data provider, has ~72mm MAUs in the US (one-third of smartphones).
  • Reveal Mobile, a geolocation data provider, has just started selling to institutional investors and has 125mm phones in the US.
  • Thinknum, a web data aggregator, tracks FB check-ins. Their customer base is 20% sell-side. They have a tool that correlates a given data point with a stock price.
  • Thasos, a geolocation data provider, has 2.5 years of history and provides weekly delivery of over 400 KPIs. Best KPI to forecast is sales.

Key takeaways from presentations and discussions

A common theme throughout the conference was that access to particular data sources is no longer the main source of alpha. The edge now lies in processing that data well and reaching the best insights fastest.

Nobody has figured out how to automate the data cleaning process. It remains heavily manual and labor-intensive everywhere. Philip Brittain presented the CRUX model to make data “Available, Accurate, and Actionable”. The model focuses on data engineering rather than data analysis, aiming for a process that keeps “data in motion” and provides a stream of answers while handling maintenance and irregularities.

  • Elements of Data Engineering: ingestion, extraction, validation, structuring/storing, cleaning, normalization, mapping/standardizing, tagging/enriching, joining, de-duping.
  • Machine learning should theoretically be able to help automate a lot of this work.
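A few of the elements listed above (validation, normalization, mapping, de-duping) can be sketched in code. This is only an illustration, not the CRUX process or any provider's pipeline: the record fields (`merchant`, `date`, `amount`), the `TICKER_MAP` lookup, and the `clean_records` function are all hypothetical names made up for the example.

```python
from datetime import date

# Hypothetical mapping from raw merchant names to tickers (illustrative only).
TICKER_MAP = {"grubhub": "GRUB", "chipotle": "CMG"}


def clean_records(raw_records):
    """Validate, normalize, map, and de-duplicate raw transaction-like records."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        # Validation + normalization: drop records with missing or malformed fields.
        try:
            name = rec["merchant"].strip().lower()
            day = date.fromisoformat(rec["date"])
            amount = float(rec["amount"])
        except (KeyError, ValueError, AttributeError, TypeError):
            continue
        if amount < 0:
            continue

        # Mapping/standardizing: attach a ticker, skip merchants we cannot map.
        ticker = TICKER_MAP.get(name)
        if ticker is None:
            continue

        # De-duping: assumes identical (ticker, date, amount) means a duplicate,
        # which is a simplification a real pipeline would need to revisit.
        key = (ticker, day, amount)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"ticker": ticker, "date": day, "amount": amount})
    return cleaned
```

Even in this toy version, each rule encodes a judgment call (what counts as a duplicate, which merchants map to which ticker), which is part of why the work stays manual.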

Integrating various different alternative data sources requires a firm grasp of investment questions around a particular ticker. YipitData demonstrated how it created 7 different datasets from 3 data sources to develop a very granular product that addressed key investor questions on GRUB. Here’s how:

  • Start with the key investor questions for a particular name.
  • Search for the data sets that speak specifically to those questions.
    • If a dataset doesn’t address a key investor question, make sure you have confidence in the data provider’s ability to dig into their data and create something new.
  • Focus on one data set first and then build from there.
    • YipitData started scraping just GRUB’s restaurant locations, but as the investment narrative on GRUB evolved, they layered on additional datasets that build upon one another.

Many data providers emphasized that they have received increased attention from quant funds over the past 6 months. The major quants appear to be incorporating more alternative datasets that were traditionally used by fundamental investors. Common quant needs include:

  • High time granularity and delivery frequency (at least weekly).
  • Coverage across many tickers (100+) for a given metric.
  • Long time series (3+ years) for a given metric.
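The three needs above amount to a simple screen that can be applied to a dataset before deeper evaluation. The sketch below is one way to write that down. `DatasetProfile`, its fields, and `meets_quant_needs` are hypothetical names, and the default thresholds simply restate the numbers from the list rather than any fund's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical summary of a dataset's delivery and coverage characteristics."""
    name: str
    delivery_interval_days: int  # days between deliveries (7 = weekly)
    tickers_covered: int
    history_years: float


def meets_quant_needs(
    profile: DatasetProfile,
    max_interval_days: int = 7,      # "at least weekly"
    min_tickers: int = 100,          # "100+" tickers
    min_history_years: float = 3.0,  # "3+ years"
) -> bool:
    """Return True if the dataset passes all three screening thresholds."""
    return (
        profile.delivery_interval_days <= max_interval_days
        and profile.tickers_covered >= min_tickers
        and profile.history_years >= min_history_years
    )


# Usage with made-up datasets:
weekly_broad = DatasetProfile("example_a", 7, 250, 4.0)
monthly_narrow = DatasetProfile("example_b", 30, 40, 6.0)
print(meets_quant_needs(weekly_broad), meets_quant_needs(monthly_narrow))
```

A dataset that fails the screen may still be useful to a fundamental analyst covering a single name, so the thresholds are kept as parameters rather than hard-coded.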

Chris Petrescu, ex Data Strategy at WorldQuant, emphasized the importance of having a dedicated data analysis team with an engineer that is focused on answering the main questions on the data.

  • It can be exciting to work with data owners that have no finance experience and offer a valuable raw product, but analysts often underestimate the amount of work required to turn that into valuable insights.
  • Alpha is found in stitching datasets together and drawing broader conclusions from them, not in looking at a single dataset in isolation.

Challenges for geolocation data providers:

  • Getting a highly specific location (avoiding confusion between a spot and the location next door).
  • Differentiating between customers and employees.
    • Ability to measure “cross visitation” vs. simply aggregate footfall is an advantage over satellite data, but is very hard to attribute.
  • Changes by Apple/Google to their OS (location services APIs) require a lot of oversight and testing to adapt SDKs and ensure consistency.
    • Past few years have shown significant reduction of SDKs that can exist in-app, so data providers using SDKs now need to show clear value to keep high penetration.

Satellite imagery is best suited for the restaurant, home improvement, and specialty store sectors, according to a Wolfe Research backtest of RS Metrics data. The satellite provider evaluation found that industries with more concentrated peak hours of operation have the most success in capturing traffic.

  • Best performing sub-industries: restaurants, home improvement retail, specialty stores, department stores, home furnishing retail.
  • Tickers with highest correlation: LOW, CMG, HD, JCP, BWLD, TGT, ROST, LL, BIG, TSCO.
  • Still, credit card and foot traffic data can be better predictors for these sectors, depending on geographic bias and percent of customers paying with cash.

Observations on satellite data:

  • Frequency and resolution of satellite imagery are expected to improve drastically over the next 5 years as we move toward real-time visual analytics.
  • Satellite data for Asian markets is often less reliable due to the higher cloud cover/air pollution levels.

StockTwits could be used as a source of sentiment data for cryptocurrencies. 25% of all engagement and communications on the 1.5mm-user social network is now cryptocurrency related.
