
Parsely
Web traffic data tagged by content category, companies mentioned, and more. Obtained through direct relationships with content publishers.
Social/Sentiment, Web Data, Web Traffic
main data source
2009
Year Company Founded
- 1
Discretionary Asset Manager Customers
5
$12MM
Hedge Fund
All
New York, NY
Raw, Platform Based
$1K - $10K / mo
2016
US, Europe, Asia, Latin America
Narratives move the markets. We know what topics people are paying attention to.
We look beyond social or search interaction data alone: we measure what happens after an article is shared on social media or surfaced by a search query, which shows real consumption. Search and social also account for only 40% of the ways people consume content. We show you the other 60% as well, giving you the full picture of what the market cares about and pays attention to, which is likely a driver of how people invest.
Using the appropriate filter on the enrichments_category field (see section below), it is possible to focus solely on how much attention financial news on a company like Apple is receiving. This is a simple way to use the Parse.ly dataset: with our ticker mapping, it is easy to track attention on financial news for thousands of companies. This type of financial news monitoring, however, only scratches the surface of what is possible.
It is also possible to track key products, issues, or people surrounding a company. In the Apple example, tracking the iPhone and Tim Cook might provide interesting leading indicators for Apple. For Apple, one could also set up a query that tracks attention on all articles that mention both Foxconn AND Apple; for Facebook, one might track content that focuses on Facebook AND privacy. For each company, one could set up a query that tracks how much attention is being paid to articles that focus on that company and the entity ‘lawsuit’ or ‘data breach’.
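The queries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: enrichments_category is the one field name taken from the text, while the other field names (date, entities, pageviews), the category values, and the sample rows are hypothetical and do not reflect the actual Parse.ly schema or real data.

```python
from collections import defaultdict

# Hypothetical rows. Only `enrichments_category` is named in the dataset
# description; all other field names and all values are made up for illustration.
SAMPLE_ROWS = [
    {"date": "2019-01-01", "enrichments_category": "Finance",
     "entities": {"Apple", "Tim Cook"}, "pageviews": 1200},
    {"date": "2019-01-01", "enrichments_category": "Technology",
     "entities": {"Apple", "Foxconn"}, "pageviews": 800},
    {"date": "2019-01-02", "enrichments_category": "Finance",
     "entities": {"Apple", "Foxconn"}, "pageviews": 500},
    {"date": "2019-01-02", "enrichments_category": "Technology",
     "entities": {"Facebook", "privacy"}, "pageviews": 900},
    {"date": "2019-01-02", "enrichments_category": "Finance",
     "entities": {"Apple"}, "pageviews": 300},
]


def attention_by_day(rows, required_entities, category=None):
    """Sum pageviews per day for articles that mention ALL required entities.

    If `category` is given, only rows whose `enrichments_category` equals it
    are counted.
    """
    required = set(required_entities)
    totals = defaultdict(int)
    for row in rows:
        if category is not None and row["enrichments_category"] != category:
            continue
        # Subset test implements the "Foxconn AND Apple" style of query.
        if not required <= row["entities"]:
            continue
        totals[row["date"]] += row["pageviews"]
    return dict(totals)


# Financial news attention on Apple only.
apple_finance = attention_by_day(SAMPLE_ROWS, ["Apple"], category="Finance")

# Attention on articles mentioning both Foxconn AND Apple, in any category.
apple_foxconn = attention_by_day(SAMPLE_ROWS, ["Apple", "Foxconn"])
```

The same function covers the Facebook AND privacy case by passing a different entity list, so one query per company and theme (for example a company plus ‘lawsuit’) yields one daily attention series.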
Full Dataset Description
Parse.ly is a web analytics company that informs enterprise media companies how much attention their content is receiving. At a very basic level, we collect two types of data that are provided within our data set.
Pageviews – We collect data on every pageview that lands on our customers’ sites, allowing publishers to know how often each article is read, how readers discover each article (via search, social, or the publication’s homepage), which device types readers are using, and which geographic regions visitors originate from. We’ve tracked over 480 billion pageviews over the last three years alone.
Articles – We scrape every article page and obtain its full text, as well as other metadata such as title, author, and publication date. We enrich each article by running its full text through state-of-the-art NLP algorithms, which extract fine-grained information on the categories, people, companies, places, etc., that each article focuses on. We’ve scraped, enriched, and tracked pageviews on over 250 million articles.
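To make the relationship between the two data types concrete, here is a minimal sketch of how pageview records might be joined to an article to see how readers discovered it. The class names, field names, join key (url), and sample values are all assumptions made for illustration; the actual dataset layout may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Hypothetical article record: metadata fields mentioned in the description.
    url: str
    title: str
    author: str
    pub_date: str


@dataclass(frozen=True)
class Pageview:
    # Hypothetical pageview record, keyed to an article by URL (an assumption).
    url: str
    referrer_type: str  # e.g. "search", "social", "homepage"
    device: str
    region: str


def referrer_share(pageviews, url):
    """Return the fraction of an article's pageviews per referrer type."""
    counts = Counter(pv.referrer_type for pv in pageviews if pv.url == url)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {referrer: n / total for referrer, n in counts.items()}


# Made-up sample data.
article = Article("https://example.com/a", "Sample title", "Jane Doe", "2019-01-01")
views = [
    Pageview("https://example.com/a", "search", "mobile", "US"),
    Pageview("https://example.com/a", "search", "desktop", "US"),
    Pageview("https://example.com/a", "social", "mobile", "Europe"),
    Pageview("https://example.com/a", "homepage", "desktop", "Asia"),
    Pageview("https://example.com/b", "social", "mobile", "US"),
]
shares = referrer_share(views, article.url)
```

Grouping by device or region instead of referrer_type would follow the same pattern.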