Uncategorized

Takeaways from Battlefin Miami

Battlefin brought together 107 asset managers ($760bn AUM), 94 data providers, and ~100 other industry professionals in Miami on January 30-31. The format was productive: packed, short presentations in the morning, followed by an afternoon of back-to-back 15-minute one-on-one meetings. AlternativeData.org was a media partner for the event; we highlight new datasets, updates, and key takeaways below.

Twenty-four new datasets:

  • Consumer Edge Insights – Credit card transaction panel of over 15mm users from hundreds of US banks. Also has merchant scanner data, Amazon basket tracking with 100k opt-in panel, and survey data.
  • Standard Media Index – Ad spend data sourced directly from booking and invoice systems of media holding partners. Data is aggregated monthly.
  • Epsilon – Marketing company with ~130mm US users’ credit card transaction data.
  • Rystad Energy – Tracks 1,000 companies in the oil and gas industry, providing metrics on exploration, production, oilfield servicing, and North American shale.
  • BizQualify – Tracks company employee benefit plans using IRS and Department of Labor filings.
  • TMT Analysis – Mobile device data provider with metrics tracking unique ad-cookie IDs, IMEI data, and number portability.
  • EPFR – Daily fund flows data, showing the fund origin and destination of moving assets.
  • FeatureX – Satellite analytics provider. API allows for natural language querying.
  • Drawbridge – Data on cross-device consumer attribution.
  • Edison – Real-time data on user purchases and product demand, sourced directly from Edison’s mail app. Covers 11,000 brands. Acquired Return Path’s Consumer Insights business.
  • Dodge – Construction data provider with information on projects and bidding.
  • Linkup – Global job listing provider with 150mm jobs tracked since 2007. Provides both raw data and insights.
  • Sequentum – Web scraping software and solutions.
  • GovSpend – Data on government spending, filterable by products, companies, or people.
  • aWhere – Agriculture data provider with global coverage of key predictors including weather, pest, and disease risk.
  • Vigilant – Public records data provider with real-time alerts across courts, lobbying records, business filings, and campaign financing, among others.
  • Amenity Analytics – Text analytics platform for analyzing unstructured data. Customizes reports for earnings call transcripts, regulatory filings, broker research, news, and more.
  • ListenFirst – Tracks social data across organic & paid channels to create a full picture of a company’s social presence.
  • Shareablee – Aggregates all social pages to assess social presence for brands and companies.
  • MKT Mediastat – Unique signals from company media coverage, including measurements of unexpected news coverage, rate of agreement across media sources, and linkages between companies.
  • QL2 – Public data on travel, retail, and automotive companies. Covers ~150 public and ~150 private companies.
  • Sustainalytics – Environment, social, and governance (ESG) score data provider. Provides the ESG scores shown on Yahoo Finance.
  • Owl Analytics – Data on environment, social, and governance (ESG) metrics. Mission is for investors to be able to maintain strategy but point their capital toward companies that have positive social and environmental impact.
  • ISS Analytics – Data on governance metrics as an indicator of company performance.

Updates:

  • Main difference between top two web-traffic data providers:
    • Jumpshot – Created to monetize the data from antivirus software Avast. Has more reliable cohorts (people don’t uninstall antivirus software often) but has more panel bias.
    • SimilarWeb – Based on browser extensions. Manages its panel bias better (given a broad distribution of users) but suffers from higher cohort turnover.
  • AppAnnie, a mobile app usage provider, now has a dedicated professional services team that provides custom data and analytics from their dataset.
  • Enigma, a public data and infrastructure provider, uses data to measure new wells and operations of oil production. Correlates with revenue.
  • Ursa, a satellite data provider, says China dataset on oil storage is their most robust dataset. Ursa provides total storage and flows 2-3 months prior to government reports.
  • GroundTruth, a geolocation data provider, has a separate company called “Skymap” (200 employees) that is entirely devoted to “geo-fencing”, associating each location with a given place of business and keeping track of changes over time.
  • Cuebiq, a geolocation data provider, has ~72mm MAUs in US (one-third of smartphones).
  • Reveal Mobile, a geolocation data provider, has just started selling to institutional investors and has 125mm phones in US.
  • Thinknum, a web data aggregator, tracks FB check-ins. Their customer base is 20% sell-side. They have a tool that correlates a given data point with a stock price.
  • Thasos, a geolocation data provider, has 2.5 years of history and provides weekly delivery of over 400 KPIs. Best KPI to forecast is sales.

Key takeaways from presentations and discussions

Common theme throughout the conference was that access to certain data sources is no longer the main source of alpha, but rather the ability to process that data well and reach the best insights the fastest.

 

Nobody has figured out how to automate the data cleaning process. It remains heavily manual and labor-intensive across the industry. Philip Brittan presented the CRUX model to make data “Available, Accurate, and Actionable”. The model focuses on data engineering rather than data analysis, aiming for a process that keeps “data in motion” and provides a stream of answers while handling maintenance and irregularities.

  • Elements of Data Engineering: ingestion, extraction, validation, structuring/storing, cleaning, normalization, mapping/standardizing, tagging/enriching, joining, de-duping.
  • Machine learning should theoretically be able to help automate a lot of this work.
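
A few of the steps listed above (validation, cleaning, normalization, de-duping) can be illustrated with a minimal Python sketch. The record layout, field names, and sample rows are hypothetical and are not taken from any provider or presentation mentioned here; real pipelines involve far more rules and manual review.

```python
from datetime import date

# Hypothetical raw rows, as they might arrive from a vendor feed.
RAW_ROWS = [
    {"ticker": " grub ", "date": "2018-01-30", "value": "1,250"},
    {"ticker": "GRUB", "date": "2018-01-30", "value": "1250"},  # duplicate once cleaned
    {"ticker": "LOW", "date": "2018-01-31", "value": "n/a"},    # fails validation
    {"ticker": "HD", "date": "2018-01-31", "value": "980.5"},
]


def clean_row(row):
    """Normalize one raw row; return None if it fails validation."""
    ticker = row["ticker"].strip().upper()
    try:
        value = float(row["value"].replace(",", ""))
        day = date.fromisoformat(row["date"])
    except ValueError:
        return None
    if not ticker:
        return None
    return {"ticker": ticker, "date": day, "value": value}


def run_pipeline(rows):
    """Validate, normalize, and de-duplicate rows, keeping first occurrences."""
    seen = set()
    cleaned = []
    rejected = 0
    for row in rows:
        record = clean_row(row)
        if record is None:
            rejected += 1
            continue
        key = (record["ticker"], record["date"])
        if key in seen:
            continue  # de-dupe on (ticker, date)
        seen.add(key)
        cleaned.append(record)
    return cleaned, rejected


cleaned, rejected = run_pipeline(RAW_ROWS)
```

Counting rejected rows, rather than silently dropping them, is one simple way to surface the irregularities that still need manual attention.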

 

Integrating different alternative data sources requires a firm grasp of the investment questions around a particular ticker. YipitData demonstrated how it created 7 datasets from 3 data sources to develop a very granular product that addressed key investor questions on GRUB. Here’s how:

  • Start with the key investor questions for a particular name.
  • Search for the data sets that speak specifically to those questions.
    • If a dataset doesn’t address a key investor question, make sure you have confidence in the data provider’s ability to dig into their data and create something new.
  • Focus on one data set first and then build from there.
    • YipitData started scraping just GRUB’s restaurant locations, but as the investment narrative on GRUB evolved, they layered on additional datasets that build upon one another.

 

Many data providers emphasized that they have received increased attention from quant funds over the past 6 months. The major quants appear to be incorporating more traditionally fundamental-oriented alternative datasets. Common quant needs include:

  • High time granularity and delivery frequency (at least weekly).
  • Coverage across many tickers (100+) for a given metric.
  • Long time series (3+ years) for a given metric.
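
The three needs above can be turned into a simple screen. The sketch below is only an illustration: the function name, parameters, and the two example datasets are hypothetical, and the default thresholds simply restate the bullets (weekly delivery, 100+ tickers, 3+ years of history).

```python
from datetime import date


def meets_quant_needs(delivery_interval_days, ticker_count, history_start, as_of,
                      max_interval_days=7, min_tickers=100, min_history_years=3.0):
    """Return True if a dataset passes all three quant screens."""
    history_years = (as_of - history_start).days / 365.25
    return (
        delivery_interval_days <= max_interval_days
        and ticker_count >= min_tickers
        and history_years >= min_history_years
    )


# Hypothetical datasets, for illustration only.
as_of = date(2018, 2, 1)
weekly_broad = meets_quant_needs(7, 450, date(2014, 1, 1), as_of)
monthly_narrow = meets_quant_needs(30, 40, date(2016, 6, 1), as_of)
```

In this sketch a dataset must pass every screen; a fund could just as reasonably score each criterion separately and rank datasets instead.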

 

Chris Petrescu, ex Data Strategy at WorldQuant, emphasized the importance of having a dedicated data analysis team with an engineer that is focused on answering the main questions on the data.

  • It can be exciting to work with data owners that have no finance experience and offer a valuable raw product, but analysts often underestimate the amount of work required to turn that into valuable insights.
  • Alpha is found in stitching datasets together and drawing broader conclusions from them, not in looking at any one dataset on its own.

 

Challenges for geolocation data providers:

  • Pinpointing a highly specific location (e.g. not confusing a spot with the location next door).
  • Differentiating between customers and employees.
    • Ability to measure “cross visitation” vs. simply aggregate footfall is an advantage over satellite data, but is very hard to attribute.
  • Changes by Apple/Google to their OS (location services APIs) require a lot of oversight and testing to adapt SDKs and ensure consistency.
    • Past few years have shown significant reduction of SDKs that can exist in-app, so data providers using SDKs now need to show clear value to keep high penetration.

 

Satellite imagery is best suited for restaurant, home improvement, and specialty store sectors, according to backtest of RS Metrics data from Wolfe Research. The satellite provider evaluation found that industries with more concentrated peak hours of operations have the most success in capturing traffic.

  • Best performing sub-industries: restaurants, home improvement retail, specialty stores, department stores, home furnishing retail.
  • Tickers with highest correlation: LOW, CMG, HD, JCP, BWLD, TGT, ROST, LL, BIG, TSCO.
  • Still, credit card and foot traffic data can be better predictors for these sectors, depending on geographic bias and percent of customers paying with cash.
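
A backtest of this kind typically rests on correlating a traffic signal with a reported company metric. The sketch below computes a plain Pearson correlation; the two series are hypothetical numbers made up for illustration and are not the RS Metrics or Wolfe Research data, whose exact methodology is not described here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical quarterly figures, for illustration only.
car_counts = [1200, 1350, 1280, 1500, 1420, 1610]      # parking-lot car counts
reported_sales = [10.1, 11.0, 10.6, 12.2, 11.5, 12.9]  # sales, arbitrary units

correlation = pearson(car_counts, reported_sales)
```

A high in-sample correlation on a handful of quarters is weak evidence by itself, which is one reason long time series matter to quantitative users.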

 

Observations on satellite data:

  • Frequency and resolution of satellite imagery are expected to improve drastically over the next 5 years as we move toward real-time visual analytics.
  • Satellite data for Asian markets is often less reliable due to the higher cloud cover/air pollution levels.

 

StockTwits could be used as a source of sentiment data for cryptocurrencies. 25% of all engagement and communications on the 1.5mm-user social network is now cryptocurrency related.

 

SUBSCRIBE TO GET THE LATEST ALTERNATIVE DATASETS, JOBS, NEWS, and EVENTS:

 


Buy-side Alternative Data Employee Analysis

We compiled a dataset of alternative data full-time employees (FTEs) on the buy-side to analyze the various recruiting trends impacting institutional investors. As competition for data talent heats up, it is essential to understand the landscape, background, and cost of these professionals.

Key Takeaways:

  • The number of alternative data FTEs has grown ~450% in the last 5 years.
  • Most alternative data FTEs have 11+ years of experience and do not have graduate degrees.
  • Tech, Academia, and Data Providers are quickly becoming main channels for sourcing alternative data FTEs.
  • Cost of an alternative data team starts at $1.5m – $2.5m.

Building the employee database.

Our methodology leveraged LinkedIn, IPREO, and the AlternativeData.org network to scan through the 14k buy-side funds to find individuals that are focused on alternative data initiatives full time. We first identified all data-focused individuals within discretionary funds and then screened for all false positives, including employees that work with traditional datasets (e.g. macro, business, market, etc.). We then reviewed each individual’s profile to confirm their focus on alternative data and arrived at a final database of 163 funds that employ a total of 340 alternative data FTEs (Figure 1).


Figure 1. Building the dataset.

This methodology has some limitations, suggesting that the actual number of data FTEs may be even higher:

  • Various alternative data FTEs are not on LinkedIn, IPREO, or our network
  • People’s titles don’t always reflect their responsibilities/focus
  • Most people don’t highlight “alternative data” in their profiles
  • People don’t update their LinkedIn profiles very often

How quickly is the landscape evolving?

We charted the growth in alternative data FTEs over time, capturing the acceleration of this skillset over the last five years (Figure 2). While both the total number of employees and the total number of funds employing alternative data FTEs are increasing, the total employee count is increasing at a faster rate. This suggests that funds are hiring more data talent and building out entire teams.


Figure 2. 4x Growth of alternative data FTEs in last 5 years.
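
A cumulative multiple such as the 4x in Figure 2 can be converted into an implied constant annual growth rate. The sketch below is a back-of-the-envelope calculation, not part of the original analysis. Reading “~450% growth” as an ending level of 5.5x the starting level is an assumption; the figure may instead have been meant as a 4.5x multiple.

```python
def annualized_growth(multiple, years):
    """Constant annual growth rate that compounds to `multiple` over `years`."""
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1.0 / years) - 1.0


def multiple_from_percent_growth(percent):
    """Convert percent growth (e.g. 450) to an ending/starting multiple (5.5)."""
    return 1.0 + percent / 100.0


# Figure 2's 4x multiple over five years.
rate_4x = annualized_growth(4.0, 5)

# Assumption: "~450% growth" read as ending at 5.5x the starting level.
rate_450 = annualized_growth(multiple_from_percent_growth(450), 5)
```

Under these readings the implied annual growth is roughly 32% for a 4x multiple and roughly 41% for a 5.5x multiple.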

We compared growth in alternative data FTEs to growth in alternative data providers and found that, while providers followed a correlated trend, their inflection point for growth occurred roughly four years earlier than that of FTEs (Figure 3). This suggests that between 2009 and 2012, funds realized that they could no longer outsource (or avoid) the need to analyze and integrate new sources of alternative data.


Figure 3. Funds are playing catch-up building out their alternative data teams.

What is the composition of alternative data FTEs?

We investigated the roles that make up the data FTE sample to understand its composition (Figure 4). We grouped the different job titles into 6 major categories to better identify trends across each function. We found that 59% of FTEs are in Data Analyst and Data Scientist positions. These are also the functions that have been growing the fastest, at 3x the rate of the other data categories (Figure 5).


Figure 4. Majority of Buy-side alternative data FTEs are Data Analysts and Data Scientists. Note: Data Scout refers to roles in which the primary responsibility is data sourcing.


Figure 5. Data Analyst and Data Scientist have the highest growth rate of major alternative data FTE functions.

Not just hedge funds in this game.

An important takeaway from looking at the types of funds in the dataset was that hedge funds are not the only ones that have been adding alternative data FTEs. Long-only funds are adding considerable numbers of these employees. We identified several long-only funds that have built full data teams or have many alternative data FTEs, including Schroders, Fidelity, Capital Group, Neuberger Berman, T. Rowe Price, and Invesco.

Given that long-only investors experience much longer investment cycles (~5 years) than their hedge fund counterparts (quarterly), it is reasonable to assume that they have not yet seen and validated the impact of alternative data in their investment decisions. As a result, we expect that several more long-only funds will commit to building dedicated data teams in the near future as more alternative data is incorporated and its ROI is demonstrated.

Backgrounds of Alternative Data FTEs.

We examined the dataset to identify profile characteristics and trends that would help in recruiting alternative data skill sets. We first looked at the educational concentration and previous employer type of the four main functions to understand the general background of each function. Most functions had relatively high concentrations of STEM backgrounds, except for Heads of Data, who were almost entirely from traditional investment backgrounds (Figure 6). We expect that the background of Heads of Data will diversify with time, as Data Analysts and Data Scientists with STEM backgrounds progress into leadership roles. 


Figure 6. Heads of Data have more traditional buy-side backgrounds than other alternative data functions.

How do these employees differ from typical buy-side talent?
When compared to the average educational background on the buy-side, alternative data FTEs hold significantly more STEM degrees, but fewer Ivy League degrees and MBAs (Figure 7). The industry profile is evidently set to change significantly in the coming years. Recruiters will need to adapt, as few will have extensive networks of referrals for this skillset.

Figure 7. Alternative data FTEs have more STEM degrees than the average buy-side employee. Much lower concentration of Ivy League or MBA degrees for these roles.

Did most alternative data professionals attend graduate school?

We also examined education levels across the different functions and found that graduate degrees are highly concentrated in Data Scientist positions (Figure 8). Over 40% of Data Scientists hold a graduate degree. While we expected this given the technical sophistication of that role, one could also conclude that most of these roles probably do not require hiring a PhD or graduate student.


Figure 8. Only Data Scientists have a high concentration of graduate degrees amongst alternative data FTEs. Most roles don’t require a graduate degree.

Where to find alternative data talent?

In 2012, most talent came from other funds (69%) or the sell side (20%). In 2017, the sell-side share remained largely the same (19%), but sourcing from other funds decreased substantially (to 48%) (Figure 9). Over the last 5 years, funds have substantially increased their hiring from tech companies, academia, and data providers. We expect these channels to continue diversifying and growing as the industry seeks to fill the increasing demand for the alternative data skillset.


Figure 9. Funds are increasingly sourcing Alternative Data FTEs from tech, academia, and data providers.

How experienced are these professionals?

We looked at work experience and found that the majority of funds hired individuals with 11+ years of experience (Figure 10). Few funds are currently building out these teams with recent college graduates. This will change over time, as the alternative data skillset is only around seven years old. As the use cases for alternative data grow, we expect funds to invest more in hiring and training younger talent for roles on their data teams.


Figure 10. The majority of alternative data employees have 11+ years of experience.

How much will this cost?

Finally, we gathered compensation figures to estimate the cost of building a small, but complete, data team at a fund (Figure 11). We estimated that a team comprising each of the functions and three Data Analysts would start at $1.5m – $2.5m at entry level. After accounting for insurance, benefits, overhead, etc., the true cost could likely be twice as much. Moreover, judging from the size of some teams and anecdotal research, several top funds are already spending over $10m on alternative data teams.


Figure 11. Alternative data FTE team compensation starts at $1.5m – $2.5m.
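
The “twice as much” adjustment can be written out as simple arithmetic. In the sketch below, the function name and the 2.0 default multiplier are illustrative assumptions that restate the rough estimate in the text; they are not a precise overhead model.

```python
def loaded_cost_range(base_low, base_high, overhead_multiplier=2.0):
    """Scale a compensation range (in $m) by an overhead multiplier."""
    if base_low > base_high:
        raise ValueError("base_low must not exceed base_high")
    return base_low * overhead_multiplier, base_high * overhead_multiplier


# Entry-level compensation range from the text, in $m.
low, high = loaded_cost_range(1.5, 2.5)
```

With the default multiplier, the $1.5m – $2.5m compensation range becomes a fully loaded range of $3.0m – $5.0m.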

We are just getting started.

Competition for alternative data talent on the buy-side is escalating and has yet to hit full stride. As data sources grow, integration improves, ROI manifests, and more long-only funds begin building their data teams, we expect demand for this skill set to accelerate. One clear takeaway is that there is not enough alternative data talent within the institutional investor industry to sustain the growing demand. We expect to see funds training younger candidates and increasingly competing with tech and other industries to hire top talent. As demand grows, the cost of attracting top talent out of other fields will increase as well.

See our article How to Integrate Data Analysts, Data Engineers, and Research Analysts to learn more.


The Download 13: Using Alternative Data to Win Board Seats, 7 New Datasets

DATA PROVIDER DATABASE

New Datasets

  • Slingshot Aerospace – Data provider with satellite, aerial, and drone capabilities.
  • Dun & Bradstreet (PAYDEX) – Tracks company health with dollar-weighted score for how promptly a business pays its bills.
  • Gyana – UK geolocation data provider.
  • h2o – Machine learning platform allowing insights without data science expertise.
  • Brain Company – Sentiment data provider using public data.
  • Endor – Predictive engine generating results from questions asked in plain language; has applications in consumer research.
  • Bitvore – Identifies ticker-level price inflections from analyzing the news.

Updates

  • Nowcast, a Japanese transaction data provider, teamed up with CCC Marketing to forecast company sales based on T-card data, a popular rewards card generating $63 billion in annual purchases. (Bloomberg)
  • Thasos Group, a geolocation provider, formed an agreement with the Chinese government to assess GDP growth at the district level.
  • ExtractAlpha, a social/sentiment provider, added a new ClosingBell dataset providing crowd-sourced buy/sell ratings from a collaborative trading app.
  • See full public database of 218 data providers.
NEWS & INSIGHTS
  • D.E. Shaw used alternative data to win a seat on the board of Lowe’s. Using satellite, census, and survey data, the fund made their case that an additional $8 billion in annual revenue was left on the table. (WSJ – subscription required)
  • Competition for alternative data analyst talent is escalating across hedge funds and long-onlies. Full analysis to be published on AlternativeData.org. (Financial Times – subscription required)
  • Integrity Research went deep into the history of legal challenges to alternative data usage and best practices for compliance with trading regulations. (Integrity Research – long read)
  • Electronic Frontier Foundation, Internet Archive, and DuckDuckGo filed amicus brief supporting HiQ and attacking LinkedIn’s suggestion that the Computer Fraud and Abuse Act should be used to limit web scraping of publicly available data. (Integrity Research – subscription required)
  • Three articles discussed the challenges faced in using quantitative techniques to make investment decisions.
    • “We have yet to find quantitative techniques that can anticipate human behavior. . . . Our fundamental process exists to . . . understand those changes on the margin. . . . Our quantitative group is really responsible for helping us establish accurate starting points.” – Ryan Caldwell, Co-Founder and Chief Investment Officer, Chiron Investment Management. (Forbes)
    • “To me the biggest challenge is processing — being able to build that pipeline. The actual code to derive the alpha part is so much smaller than the whole wrapper around it which is focused on cleaning and processing, reconciliation and post-trade processes.” – Mansi Singhal, Co-Founder, qplum. (finextra)
    • A case study on Voleon fund showed how quant strategies are harder in practice than in theory. (WSJ – subscription required)
  • Earnest Research, a credit card transaction provider, published findings that AMZN received 89% of all holiday spending across Walmart, Best Buy, Target, and itself. (Bloomberg)
  • Sentieo, a data interface, showed that analyst sentiment moved favorably on utilities last quarter, while manager sentiment trended negatively. (Forbes)

Takeaways from Quandl’s Conference

Quandl assembled 400 buy-side investors, data providers, and sell-side professionals at their second Alternative Data Conference on January 18, up 143% from last year. Presentations focused on 5 main themes: A.I., resources for success in using alternative data, quantamental analysis, presenting new datasets, and compliance.

Six new datasets were introduced:

  • Legal Shield – leverages legal information from a network of 1.7m subscribers, 6.9k broker clients, and 34 law firms to predict macro indicators: consumer confidence, housing starts, total bankruptcies, foreclosure starts, and existing home sales.
  • Quandl M&A Insights – uses aviation industry partnerships to track daily activity of 43k private jets, unmasking FAA block list.
  • SimilarWeb – web traffic and app usage data provider. Highly accurate on Williams Sonoma e-commerce sales.
  • Dun & Bradstreet – provides metrics that measure the health of private companies.
    • PCC Indicator (monthly) tracks ~400 businesses for each of the 43k zip codes in the US. Highly correlated to US GDP at a regional level.
    • PAYDEX is a dollar-weighted score for how promptly a company pays its bills.
    • Collected this data from a trade program over 30 years.
  • S&P Market Intelligence – using language analytics to understand sentiment from earnings calls. Has data on 8,300 companies with history back to 2004.
  • Wolfe Research – partnered with Quandl, using its Dodge US Construction dataset to develop 30+ construction factors with forecasting power on asset returns.
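
To illustrate what “dollar-weighted” means for a payment score such as PAYDEX, the sketch below computes a generic dollar-weighted average. Dun & Bradstreet's actual methodology is not described here; the per-invoice scores, their 0-100 scale, and the sample invoices are hypothetical.

```python
def dollar_weighted_score(invoices):
    """Average of per-invoice scores, weighted by invoice dollar amount.

    `invoices` is a sequence of (amount_in_dollars, score) pairs.
    """
    invoices = list(invoices)
    total = sum(amount for amount, _ in invoices)
    if total <= 0:
        raise ValueError("total invoice amount must be positive")
    return sum(amount * score for amount, score in invoices) / total


# Hypothetical invoices: one large bill paid promptly, two small ones paid late.
# Assumption: higher score means more prompt payment.
invoices = [(10_000, 80), (1_000, 40), (500, 40)]

weighted = dollar_weighted_score(invoices)
unweighted = sum(score for _, score in invoices) / len(invoices)
```

In this example the dollar-weighted score is well above the simple average, because the single large invoice dominates the two small late ones.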

Tools mentioned throughout the conference:

  • Jupyter – open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
  • Trust.mit.edu – info on creating a secure multi-party system that enables interaction with a protected dataset without sharing it.
  • MAPD – data integration and visualization platform. Uses a GPU-accelerated (very fast) database and runs on AWS. Has an open source offering called MapD Core.
  • Gunning Fog Index – open source “readability” test, similar to what is used by language analysis providers.
  • Vin.place – database of car ownership, sourced from VIN numbers.
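
The Gunning Fog Index mentioned above is defined as 0.4 × (average sentence length in words + percentage of complex words), where complex words have three or more syllables. The sketch below is a rough approximation: the vowel-group syllable counter is a crude heuristic, and the usual exclusions (proper nouns, familiar jargon, compound words, common suffixes) are ignored.

```python
import re


def count_syllables(word):
    """Crude syllable estimate: number of vowel groups (at least 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text):
    """Approximate Gunning Fog Index of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + pct_complex)


simple_score = gunning_fog("The cat sat. The dog ran.")
```

Higher scores indicate harder text. A score computed on a filing or transcript could be compared across companies or over time, though that use is only a suggestion here.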

You can predict human behavior better by understanding groups and tribes rather than individuals – Alex Pentland, Professor at MIT.

  • 90-95% of human behavior is predictable when analyzing groups. The character of that 5-10% unpredictable behavior can be a leading indicator of trends and performance.

Top safety recommendations for data practitioners from Alex Pentland:

  • Do not put all your data in one place – move/share answers, not the data itself.
  • Create an auditable Q&A system – use blockchain to log your process.
  • Never decrypt the data – use a secure multi-party platform. (More details at trust.mit.edu)
    • Data owners should think about charging for access to data and not for the data itself.
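
The “auditable Q&A system” recommendation can be illustrated with a hash-chained log, the basic building block of a blockchain. This is a single-machine sketch under stated assumptions, not the system Pentland described: the entry fields and function names are hypothetical, and there is no distribution or consensus layer.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, question, answer):
    """SHA-256 over the previous hash plus this entry's content."""
    payload = json.dumps({"prev": prev_hash, "q": question, "a": answer},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, question, answer):
    """Append a question/answer pair, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({
        "prev": prev_hash,
        "q": question,
        "a": answer,
        "hash": entry_hash(prev_hash, question, answer),
    })


def verify(log):
    """Return True if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        expected = entry_hash(prev_hash, entry["q"], entry["a"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical queries against a protected dataset; only answers are logged.
audit_log = []
append_entry(audit_log, "average basket size, region A", "42.10")
append_entry(audit_log, "store visits, week 4", "18250")
```

Because each hash covers the previous one, editing any logged answer after the fact makes `verify` fail from that entry onward.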

New dataset on buy-side alternative data full-time employees (FTEs) was created to analyze how funds are recruiting and compensating their teams. James Moran, Co-founder and President of YipitData, presented the major hiring trends and integration best practices from the dataset. Some highlights:

  • Number of alternative data FTEs at funds has doubled every year over the last 5 years.
  • Significant demand for data talent also coming from Long-only investors.
  • Educational background at funds likely to shift significantly toward STEM majors (41% of alt. data FTEs), primarily Mathematics and Computer Science.
  • Tech Companies, Data Providers, and Academia accelerate as FTE sourcing targets.
  • An introductory alternative data team likely to cost more than $3mm with many funds already paying well over $10mm on alternative data FTEs.
  • Full analysis to be published on AlternativeData.org.

Matthew Rothman, MIT Professor, was hired to lead Goldman’s “Data as a Service” offering. He left the details of this offering TBD.

  • He’s an “alternative data contrarian” who believes we should be focused on addressing 1st order data problems (internal/proprietary data that already exists) instead of 3rd order data challenges from 3P alternative data sources.
    • Pushed the audience to use technology and analytics to find value in their own internal data and current processes.
  • Provided a couple of examples of data applications:
    • Measuring garbage as a better productivity indicator.
    • Sports car ownership as a sensation-seeking indicator for leadership risk tolerance, habits, and attitudes. (Can use vin.place)

Top line performance numbers don’t tell the full story. Analysts should use alternative data as a fundamental snapshot of what is happening underneath the hood, to understand category- and unit-level trends. – Michael Recce, Chief Data Scientist at Neuberger Berman.

  • People don’t realize how much information they leave online. Cookies can be tied to a particular IP address, even once you delete your cookies.
    • This data can be aggregated, sold, and used to see what specific companies (e.g. funds) are over-indexing in search terms and content.
  • Funds must place engineers and technologists at the center of the investment idea process.

Schroders launched its data insights unit in late 2014 and now has a team of 22 engineers and data scientists. Led by Mark Ainsworth in London, the unit just added its first New York team member.

  • They successfully used satellite imagery to predict the outcome of M&A activity in the UK.

2017 Engagement Highlights

AlternativeData.org saw a lot of interest in 2017. We’d like to share some of the most highly engaged content in case you missed it. Please email us with any comments or content recommendations for the year ahead.

TOP DATA CATEGORIES

Below is a distribution of clicks generated by each data category in our Provider Database during 2017.

[Pie chart: distribution of clicks by data category]

KEY TAKEAWAYS
  • Subscribers are most interested in niche datasets. ‘Emerging Data Categories’ had the highest concentration of clicks and includes niche datasets that are usually industry sector specific and/or private company exhaust data (e.g. transportation, video game, and retail).
  • Weighted by provider count, Credit/Debit Card data had the highest CTR at 90% above the average CTR of other data source categories.
    • Web Data and App Usage both received strong engagement at 20% above average CTR.
  • While Social/Sentiment received a large number of clicks (10% of total), its weighted CTR was 40% lower than average given the large number of providers in the category.
    • Web Traffic and Sell-side were among the lowest weighted CTRs, more than 30% below average CTR.
  • Data Brokers, Infrastructure/Interface, and Consultants received a combined 31% of all clicks, highlighting the challenges of discovery fatigue and data analysis/integration. 
  • These were the newly discovered data providers that received the most clicks:
    • One Click Retail (Emerging/Consumer) – Includes AMZN dataset.
    • Random Walk (Geo-location) – Consumer foot traffic and email receipts.
    • Re-analytics (Web Data) – Fashion, retail, consumer, and travel.
    • Dawex (Data Broker) – Open alternative data marketplace.
    • EEDAR (Emerging/Video Games) – Data on over 127K game products.
    • Broughton Capital (Emerging/Transportation) – Trucking, rail, and airfreight.
    • BayStreet Research (App Usage) – Smartphone, tablet, and wearables.
    • Mavrx (Satellite) – Satellite, aerial, and infrared data for agriculture.
    • FaunaDB (Infrastructure/Interface) – Enterprise data warehousing.
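
The provider-weighted comparison in the takeaways above can be sketched as clicks per provider, expressed relative to the average across categories. This is one plausible reading of “weighted by provider count”, not the documented method behind the figures, and the category names and numbers below are hypothetical.

```python
def clicks_per_provider(stats):
    """Map {category: (clicks, provider_count)} to clicks per provider."""
    rates = {}
    for name, (clicks, providers) in stats.items():
        if providers <= 0:
            raise ValueError(f"category {name!r} has no providers")
        rates[name] = clicks / providers
    return rates


def relative_to_average(stats):
    """Each category's clicks per provider versus the cross-category mean.

    A result of 0.5 means 50% above average; -0.5 means 50% below.
    """
    rates = clicks_per_provider(stats)
    average = sum(rates.values()) / len(rates)
    return {name: rate / average - 1.0 for name, rate in rates.items()}


# Hypothetical categories: (total clicks, number of providers).
stats = {
    "Category A": (300, 10),
    "Category B": (400, 40),  # most clicks overall, but many providers
    "Category C": (200, 10),
}

relative = relative_to_average(stats)
```

Here Category B has the most raw clicks but the lowest weighted result, the same pattern described above for Social/Sentiment.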
HIGHLY ENGAGED ARTICLES
  • When Silicon Valley came to Wall Street. (Financial Times – subscription required)
  • UBS wins Institutional Investor #1 in equities thanks to their investment in data team, Evidence Lab. (Institutional Investor)
  • Two academics publish their process for using sentiment data to predict iPhone X success. (insideBIGDATA)
  • Alpha Architect, a small asset manager, published a primer on machine learning for investors. (Alpha Architect)
  • 74% of hedge funds plan to increase spending on alternative data, based on a survey of 50 hedge funds by Greenwich Associates and Arcadia Data. (Greenwich Associates – subscription required)
  • Market size for alternative data estimated between $183 – $200mm, and projected to double in 4 years. (Value Walk – subscription required; Quartz)
  • Web scraped listings help predict a decline in US retail employment. (Financial Times – subscription required)
  • A federal court ruled against LinkedIn, confirming that a startup can scrape its publicly available data – a potentially precedent-setting ruling in favor of web scraping based analytics. (Ars Technica; WSJ – subscription required)
INDUSTRY STATISTICS

Number of alternative data providers: 212
Discretionary funds using alternative data: 163
Alternative data full-time employees at funds: 340

[Chart: Growth of Alternative Data Providers, 01/25/18]

SUBSCRIBE TO GET THE LATEST ALTERNATIVE DATASETS, JOBS, NEWS, and EVENTS:

The Download 12: 7 New Datasets, Google Maps vs. Apple Maps, BattleFin Discount

DATA PROVIDER DATABASE
New Datasets
  • Jiguang – App usage provider on over 800mm Chinese Android devices.

  • IPqwery – Collects patent and IP ownership data from multiple public records offices.

  • Broughton Capital – Has seven transportation data sets across trucking, rail, and airfreight.

  • FNGO – Provides Korean export data through partnership with Korea Customs Service. High correlation with product revenue for Samsung, Hyundai, and many more.

  • ThinkTopic – Analytics tools for satellite imagery.

  • TruValue Labs – Provides ESG metrics as an indicator of company performance. A recent benchmark of its ESG scores on a group of equities showed outperformance of the S&P 500 by 3-5% over the past five years. (PR Newswire)

NEWS & INSIGHTS
  • Growth in alternative data sources combined with MiFID II consolidation (and the resulting increase in sell-side research competition) likely to create more research products using alternative data in 2018.
    • “The history of Wall Street has been about copying and emulation … Having 35 analysts cover the same stock in the same way is just not going to cut it anymore.” – Barry Hurewitz, Global COO UBS Research. (Integrity Research – subscription required)
    • “The use of alternative data to make investment decisions will only intensify in the coming year.” – Bjørn Sibbern, NASDAQ. (Markets Media)
  • Deloitte published a white paper evaluating risk/reward in alternative data implementation. (Deloitte)
  • CTOs and CIOs among most in-demand roles for asset managers. “Larger houses on the buy side … are looking to hire a CTO or CIO or VP of IT with cloud and data analytics experience and leadership skills, being able to bring strategies to the table and communicate to business leaders effectively.” – Emmeline Kuhn, Leathwaite. (efinancialcareers)
  • Google Maps far superior to Apple Maps because of its combined use of satellite and street-view imagery. (justinobeirne.com – long read)
SPECIAL EVENT ANNOUNCEMENT

Miami Beach, FL  |  January 30-31, 2018
BattleFin’s Discovery Day is a two-day event of pre-arranged one-on-one meetings that connects alternative data companies with investment firms looking to integrate alternative data into their investment process. BattleFin is targeting 100 alternative data companies in the satellite imagery, geolocation, sentiment, web scraping, social and other categories. The event will also cover sourcing, technology and compliance topics. More on the agenda here and event logistics here.

Subscribers receive a 20% discount by using code: AltData20

RSVP Here

To RSVP, fill out the contact form, which will then take you to the ticket purchasing page.

The Download 11: Quantifying Disney-Fox Impact On Netflix, 9 New Datasets, 2 New Skills

DATA PROVIDER DATABASE

New Datasets

  • TrustData – Chinese mobile app usage data provider with over 150mm MAUs.
  • Unacast – Geo-location data provider with ~11mm MAUs in the US, sourced from GPS, WiFi and beacon data.
  • RedTech – Survey data on Chinese companies, including BABA, BIDU, Tencent, JD, CTRP, WB, EDU, TAL, VIPS, Ant Financial, Meituan, Didi, and Toutiao.
  • StockTwits – Tracks social/sentiment data from multiple primary sources to generate signals on various equities.
  • IOTA – Internet-of-things analytics platform that launched a data marketplace. (IOTA marketplace announcement)
  • Dataffirm – London-based data consultant.

Updates

  • SpaceKnow, a geolocation data provider, hired Bloomberg alternative data exec, Jeremy Fand, as VP of Product. (Business Insider)
  • Dawex, a data marketplace, partnered with Mnubo, expanding data sources from IoT companies to ~100. Dawex now has 2,000 companies transacting data onboard. (Dawex)
  • YipitData launched Redfin (RDFN) dataset, tracking lead agent counts, transactions, and brokerage revenue. Data can be further analyzed by agent join year, transaction side, rating, region and zip code. (YipitData)
  • Hyperplane VC led a $2.5mm round in Elsen, a data analytics startup that provides software to ingest and analyze proprietary and third-party data. (Institutional Investor)
  • OneClickRetail, an ecommerce data analytics firm, published a deep dive showing positive trends for the pet product market on Amazon. (OneClickRetail)
  • Estimize, a crowd-sourced estimates platform, hired Coleman Research CFO, Jeffrey Geisenheimer, as its CFO and COO. (Integrity Research – subscription required)
  • See full public database of 212 data providers.
SKILLS

Email us at data@alternativedata.org if there are specific skills you would like to learn:

  • All the elements for a perfect chart. (Datawrapper)
  • A guide to learning Pandas, the most popular Python data science library on StackOverflow. FYI, Pandas enables you to read and write many different data formats, find and fill missing data, apply operations to independent groups within the data, reshape data into different forms, combine multiple datasets, use advanced time-series functionality, and visualize data through matplotlib and seaborn. (Medium)
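
For readers who want a feel for the Pandas capabilities listed above, here is a minimal sketch that touches most of them (reading data, filling gaps, group operations, reshaping, joining, and a simple time-series calculation). It is not taken from the linked guide. The tickers, visit counts, and sector labels are made up for illustration, and build_panel is a hypothetical helper name. Plotting is left out to keep the example short.

```python
import io

import pandas as pd

# Hypothetical daily foot-traffic panel; tickers and counts are invented for illustration.
RAW_CSV = """date,ticker,visits
2018-01-01,AAA,100
2018-01-02,AAA,
2018-01-03,AAA,120
2018-01-01,BBB,50
2018-01-02,BBB,55
2018-01-03,BBB,
"""

# Hypothetical reference table mapping each ticker to a sector.
SECTORS = pd.DataFrame(
    {"ticker": ["AAA", "BBB"], "sector": ["Retail", "Restaurants"]}
)


def build_panel(raw_csv: str) -> pd.DataFrame:
    """Read the CSV text, fill gaps per ticker, and attach sector labels."""
    # Read: parse CSV text and convert the date column to timestamps.
    df = pd.read_csv(io.StringIO(raw_csv), parse_dates=["date"])
    # Find and fill missing data: carry the last known value forward within each ticker.
    df["visits"] = df.groupby("ticker")["visits"].ffill()
    # Combine datasets: left join on ticker to add the sector column.
    return df.merge(SECTORS, on="ticker", how="left")


panel = build_panel(RAW_CSV)

# Group-wise operation: total visits per ticker.
totals = panel.groupby("ticker")["visits"].sum()

# Reshape: long to wide, one column per ticker, indexed by date.
wide = panel.pivot(index="date", columns="ticker", values="visits")

# Time series: day-over-day fractional change for each ticker.
daily_change = wide.pct_change()

print(totals)
print(wide)
print(daily_change)
```

Forward-filling is only one way to handle missing values; whether it is appropriate depends on the dataset.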
NEWS & INSIGHTS
  • Disney-Fox combination poses a greater threat to Netflix than Amazon Prime, according to content ownership data from YipitData. (FT – subscription required; YipitData)
  • UBS wins Institutional Investor #1 in equities thanks to its investment in its data team, Evidence Lab. “We are trying to develop a new culture where both types of research – traditional analyst-driven security evaluation and new innovations in investment research – are deeply respected.” – Barry Hurewitz, Global COO UBS Research. (Institutional Investor)
    • UBS autos analysts stripped a Chevrolet Bolt to do fundamental analysis of its components, discovering it was a lot cheaper than they estimated. (Institutional Investor)
    • Following recent regulations in China, Evidence Lab analysts are using satellite imagery to measure air pollution levels to assess the potential impact to petrochemical companies. (ICIS)
  • AI applications make some headway within sell-side equity research.
    • Wells Fargo’s AIERA, a machine learning equity research analyst, inaccurately downgraded FB in October, although it showed improvements identifying negative news sentiment and making a sell call. (Bloomberg)
    • Morgan Stanley using AI technology to accelerate insight generation for its analysts. (Bloomberg)
  • Ken Griffin hosted a datathon at the NYSE to highlight the need for data talent at funds. (CNBC)
  • Tammer Kamel, CEO of Quandl, shared five data use cases during an interview (CNBC):
    • Using new insurance policy data as a proxy for estimating auto sales.
    • Measuring iron ore production by combining shipping data with satellite data.
    • Tracking private plane transponders/routes to detect potential M&A activity.
    • Scanning government public contracts (e.g. defense contracts) and linking them to the beneficiaries of transactions to get insights into small/mid-size companies.
    • Partnering with a construction intelligence company to get a read into construction companies.
  • Multi-billion dollar fund attributes strong 2017 performance to creative applications of machine learning throughout its investment process. The fund sees hiring top talent as the key to success and is partnering with top universities to source candidates. It has applied techniques similar to those astronomers use to identify supernovae to classify stock analysts by their forecasting power. (Risk.net)