The Download X – Coding-Free Web Scraping Tool, Tesla Production Data, and 14 New Data Providers

New Dataset Spotlight
  • Zaoshu.io – coding-free web scraping platform focused on Chinese companies. Tracks JD and AMZN.
  • RandomWalk – consumer foot traffic (geolocation) and email receipt data on 15K stores, including BBBY, COH, JCP, HD, FL, KORS, SHOP, and TGT.
  • 7Park Data – launched a transactional data product offering data on B2B spending.
  • BayStreet Research – research on U.S. and European smartphone, tablet, and wearable markets. Has a database of mobile device sell-through data with rolling 6-month forecasts, and retail & wholesale pricing at the SKU level.

Other Additions

  • Narrative.io – data marketplace with 50 data providers, primarily in location data (13), ID mapping (13), and web traffic (10).
  • Arcadia Data – data analytics provider and systems consultant.
  • Alt Hub – new alternative data broker.
  • Solution Loft – builds customized data solutions, including web scraping.
  • Tegus – expert network that transcribes its calls into a searchable database.
  • Tala.co – collects data points on user behavior including geographic patterns and financial transactions.
  • Kayrros – satellite data estimates on oil supply, inventory, and demand. Recently hired ex-GLG exec to push buy-side initiative. (Integrity Research, subscription required)
  • Signal.co – aggregates consumer identity signals and activity across different marketing channels.
  • Civic Science – consumer poll and survey insights.
  • FinScience – data analytics platform. Milan-based team closed €1mm seed round. (FinSmes)


  • Visible Alpha, an analytics platform, acquired Alpha Exchange, a research management platform, expanding its offering into the research distribution space. (Nasdaq)
  • SmartKarma, a research aggregator for the buy-side, raised $13.5mm Series B led by Sequoia India. Model capitalizes on MiFID II rules. (Bloomberg)
  • YipitData published new data quantifying OTAs' pushback on metasearch. The data offers a look into metasearch market share on TRIP and TRVG for OTAs and direct hotel bidders, as well as Google ad appearance rates, average rank, and market share for hotel and vacation rental keywords. (YipitData)
  • See full public database of 205 data providers. 



The Download 09: 21 New Datasets, Moody’s Invests In Property Data Startup



  • EEDAR – data and analytics on over 127K video game products. 
  • WDZJ.com – daily data on various Chinese P2P companies, including Yirendai (YRD), PPDF, and Rapid Finance.
  • Re-analytics – web data analytics on fashion, retail, consumer, and travel companies. 
  • Pitchbook – database of private company and VC information. 
  • Innovata – airline flight and cargo schedule information.
  • Above Data – data consulting/sourcing services started by ex-JPM AM analyst. 
  • Satellite data providers and services:
    • Urthecast (TSX:UR) – satellite imagery and video provider.
    • Mavrx – satellite, aerial, and infrared data for agriculture.
    • Geocento – provides analytical services on satellite and aerial imagery.
    • Earthcube – satellite data focused on defense and infrastructure.
  • Other data sources and providers: 
    • J Capital Research – uses survey data for research and investment ideas on commodities, equities, and property; primarily China-focused.
    • Kinetica – GPU-accelerated (really fast) analytics database.
    • B23 – data ingestion and visualization platform.
    • Dawex – open alternative data marketplace for companies looking to monetize their data.
    • OTAS – visualizes social media data to provide stock sentiment.
    • RunningAlpha – generates trade ideas from financial sentiment data.
    • Eurekahedge – database focused on indices and news on hedge funds.


  • Moody’s acquires minority stake in CompStak, a firm that collects granular information on property listings from brokers and developers. Some of the data points they collect (e.g. landlord concessions) are not publicly available elsewhere. (WSJ, subscription required)
  • YipitData launches a price/load factor curve tool using bookings-weighted data on United (UAL) to help gauge the health of the pricing environment. (YipitData)
  • Wolfe Research partners with RS Metrics, a satellite data provider, to incorporate satellite data into fundamental research. (prweb)
  • Credit Suisse partners with Ravenpack, a sentiment data provider, to create a sentiment index from news analyses. (Credit Suisse)
  • Wall Street Horizon, an event data provider, launches EventBreaks, an alert system that lets you know when and how corporate events change. (Businesswire)

Multibillion-dollar L/S funds are looking for the following. Please reach out to data@alternativedata.org if you have any of these datasets:

  • Datasets on digital advertising, particularly ad inventories and CPCs.
  • Two academics publish their process for using sentiment data to predict iPhone X success. (insideBIGDATA)
  • An analysis by Sentieo, a news analytics platform, shows NFLX management is less bullish than the Street. (Sentieo)
  • YipitData’s web data accurately predicted out-of-consensus room-night growth for EXPE. Also identified drivers of pressure, noting weak trends in the U.S. hotels business, as well as the monetization efforts at HomeAway. (YipitData)
  • Two articles discuss why the alternative data skillset is getting more valuable. 
    • “There are all these big data sets that could be useful to inform the investment decisions that stock pickers are making. But they don’t currently exist in a form that is useful for investment managers.” – Mark Ainsworth, head of data insights at Schroders. (Financial Times)
    • “What will happen is that as data becomes more and more commoditised, the value is going to be on the intellectual capital to turn the data into information for a specific problem. Just because you have got data doesn’t mean you can really perceive how to use the data to solve problems that you have.” – Ken Nickerson, MD at Morgan Stanley. (HFM Technology)
  • Dataminr, a social sentiment data provider, predicted the acquisition of Kite Pharma 5 days before it was announced. (Financial Times)
  • Maverick set to debut two new funds focused on short-term alpha signals from alternative datasets. (Business Insider)
  • Good intro to satellite data applications. Geospatial Insights used high-res aerial photography to assess the impact of hurricanes in Houston and Key West. (The Edge Markets)

The Download 08 – 16 New Datasets, Introducing “Request For Data”



  • One Click Retail – e-commerce data analytics firm. Provides sales figures for items sold on Amazon.com. (WSJ, subscription required)
  • TransCore – provides various transportation industry metrics. 
  • Scoop Analytics – retail sentiment data from Twitter news and events.
  • DataTrek – new capital markets newsletter with a focus on data by a former SAC PM/analyst. (paid)
  • BrandLoyalties – brand loyalty indices based on consumer citations.
  • Other datasets and providers:
    • Acuris – trade ideas network, regulatory and policy developments tracker, and event-driven news feed on capital markets transactions.
    • HFA Group – consolidation, data optimization, and other infrastructure solutions.
    • Stax – data-driven consulting firm, focused on institutional investors.
    • Street Diligence – capital structure analytics and distressed acquisition tools, primarily for credit investors.


Multibillion-dollar L/S funds are looking for the following. Please reach out to data@alternativedata.org if you have any of these datasets:
  • Data estimating advertising revenue for Facebook, Google, and Twitter. 
  • Email receipt, credit card, or app data for the gaming industry (PC or mobile).
  • Flight utilization prior to departure (load factor) by airline (US domestic).
  • Alpha Architect, a small asset manager, published a primer on machine learning for investors. (Alpha Architect)
  • Geo-location startup, Thasos Group, published an analysis of Whole Foods traffic following its acquisition by Amazon. “The following stores experienced the highest rates of customer defection to Whole Foods: Trader Joe’s (10%), Sprouts (8%), Target (3%). Customer Defection Rates remained elevated for all competing stores as of September 16. The new customers Whole Foods attracted with its price reduction were the wealthiest regular customers of the competing stores.” (Thasos Group, pdf)
  • YipitData & MoffettNathanson analysis reveals Netflix has made big gains with in-house TV shows, despite studios pulling some content. (MediaPost, YipitData)
  • UBS Evidence Lab published an analysis showing that local regulations are slowing down Airbnb’s growth. (ValueWalk)
  • Banks may be making it more difficult to access consumer transaction data. “Over the past six to nine months there’s been a sharp uptick in deliberate bank blockages,” a source with direct knowledge of the matter told Yahoo Finance. (Yahoo Finance)
    • Senator Ed Markey (D-Mass.) sent a letter to the CFPB siding with the banks. (ABA Banking Journal)
  • Sentiment analysis provider, Dataminr, accurately predicted the Brexit outcome and other legislative decisions by analyzing Twitter. (ValueWalk)

Top Data Providers: China Equities

The biggest Chinese public companies and the top data providers that help investors find an edge.

We identified the 17 largest equities in China, based on market cap, for which alternative data exists. After interviewing most alternative data providers that cover these names, we compiled a list of the key providers for each company. This article was originally published on Integrity Research.

Email us at data@alternativedata.org with any questions.

Top China Data Sources:

Data Provider Evaluation Criteria:

  1. Buy side feedback: Anecdotes from fundamental buy side investors who have experience using these datasets.
  2. Data source type: Does the data source and analysis closely reflect and clarify company performance, narratives, or key metrics?
  3. Accuracy: How accurate have these providers been historically?
  4. Ease of use: Do the providers have raw data or do they also do their own QA and analysis in-house?

We also mapped all the alternative data providers that have data on these companies in the landscape below.

China Data Landscape




The Download 07 – 13 New Datasets, Point72 Invests in Database Startup



  • Point72 leads $25mm Series A for database startup, FaunaDB. (PRNewswire)
    • Coatue invested in data science infrastructure company, Domino Data Lab, earlier this year. (WSJ, login required)
  • AppAnnie, an app usage data provider, expands Chinese consumer analytics offering. (Reuters)
  • YipitData has combined web data with new email receipt data for GRUB, enabling KPI accuracy and insights not possible with either dataset alone.
  • YipitData launched a dataset for Best Inc. (BSTI), which IPO’d last Wednesday.
  • 1010data launched a Suburban Shopper Panel (SSP), offering data on the consumer behavior and demographics of rural and suburban shoppers. (Business Wire)
  • Prattle, a public sentiment analysis provider, launched an analytics platform of corporate earnings calls. (Prattle Blog)
  • QuadAnalytix and Mobee merged into Wiser Solutions to provide consumer/retail data from online and offline sources. (Business Wire)
  • VisibleAlpha, an analytics platform, partnered with Thomson Reuters. (Reuters)
  • M Science’s 3rd party email receipt panel is 3.5mm and it now has access to an EU credit card panel. (from Learn2Quant Hong Kong)
  • Sandalwood employs 3 main sources of data: UnionPay credit card panel, a partnership with JD.com, and web scraping for Tmall brand data. (from Learn2Quant Hong Kong)


  • Prosper – consumer survey data on 296 tickers (US) and 126 tickers (China), distributed through Consumer Edge Research.
  • Granular.ai – satellite data on industrial sectors and emerging economies.
  • Selbourne Research – data on payments ecosystem.
  • Geotab – fleet GPS tracking and fleet management data.
  • Predata – analytics platform based on market/risk signals and event predictions.
  • Statistical Surveys – data on recreational vehicles.
  • IndexMath – index for predicting UK stock market trends. 
  • 74% of hedge funds plan to increase spending on alternative data, based on a survey of 50 hedge funds by Greenwich Associates and Arcadia Data. (full report for purchase at Greenwich Associates) Other report highlights:
  • Market size for alternative data estimated between $183mm and $200mm, and projected to double in 4 years. (ValueWalk, Quartz)
  • Use of alternative data could lead to a revenue uplift of 15% for asset managers, while additionally cutting costs by another 15%, according to Quinlan and Associates. (The Street)
  • Sentieo, a financial data platform, developed a short thesis on Darden (DRI) based on sub-brand level restaurant unit counts and implied growth rates. (Futures Magazine)
  • State Street launches Quantextual Idea Lab, a quant-based research management platform. (FINalternatives)
  • Private equity and corporates see opportunity in satellite, GPS, IoT, and economic alternative data. (International Business Times)

The Download 06: 14 New Datasets, 3 New Jobs, 21 Data Success Stories

Monday, September 11, 2017

  • PROME – customized web scraping services.
  • Datavore – data analytics and visualization platform.
  • Dataiku – data science platform, recently raised $28mm. (PRNewswire)
  • ExtractAlpha – licenses quant models from social, web, and market sources.
  • SensorTower – app usage and ad performance.
  • Inferess – converts news feed into event-driven analytics.
  • JWN Energy – oil and gas data repository.
  • Seer Aerospace – aircraft usage patterns and reliability datasets.
  • Epsilon – large panel of consumer data and insights.
  • RVIA – recreational vehicle data and trends.
  • WallStreetHorizon – corporate event related datasets. 
  • Eagle Alpha, a data aggregator, published 20 case studies of successful quant and discretionary alternative data applications for multiple data sources. The 80-page paper provides a thorough evaluation of the alternative data space for both quant and discretionary investors. (Eagle Alpha, pdf)
  • Sentiment provider, Prattle, correctly predicted various Fed and international central bank decisions. Quinlan & Associates published a 50-page report on the use of alternative data for alpha generation, including a case study on Prattle. (Quinlan & Associates, pdf)
  • Anonymity concerns in geolocation data continue. (Financial News)
  • Nasdaq accelerates expansion into data analytics, acquiring asset manager research platform eVestment for $705mm. (Nasdaq, Reuters)

The Download 05: 9 New Datasets, Balyasny Invests in Data Company, Challenges for Geo-Location Data

Thursday, August 31, 2017
DATA PROVIDER DATABASE (New Datasets and Updates)
  • Update Balyasny invests in geo-location alternative data provider, Cubiq. (Integrity Research)
  • Update Geo-location data provider, Reveal Mobile, is facing challenges for receiving unauthorized user data from AccuWeather. (ZeroDay, TechCrunch)
    • Removed Tutela, another mobile data company, discontinued selling data to investors.
  • Update YipitData launches new datasets for Camping World (CWH), FedEx (FDX) and China Education (EDU, TAL).
  • Update Meltwater acquires AI news tracker Algo, bolstering its analytical capabilities for investment research. (Integrity Research)
  • Added Skopos Labs uses machine learning to predict federal/state legislation outcomes.
  • Added Tailwind uses aircraft to fly under clouds to collect high-resolution aerial imagery.
  • Other new providers:
  • Web scraped listings help predict a decline in US retail employment. (Financial Times)
  • Verified list of 140 alternative data providers, and counting. (AlternativeData.org)
  • Data “gold rush” continues amidst different concerns for data sources. (Financial Times)
  • NYU is starting a data science Ph.D. program in September. “The university plans to ramp up its Ph.D. program to include 50 to 100 students in five years. Newly minted doctorates stand to make a lot of money — $200,000 plus a bonus — at a hedge fund, estimated Adam Zoia, head of recruiting firm Glocap.” (Information Management)
  • NYU Computer Science Professor, Anasse Bari, shares how he prepares students to become data scientists and how predictive analytics delivers value to hedge funds. (Predictive Analytics World)
  • How learning SQL made me a better analyst. (Forbes)