This is a repository of code and data for measuring the news
at MozFest 2013.
data/
contains data files and dictionaries for New York Times content, promotion, and performance data.
code-examples/
provides some code in Python and R for extracting features from text, predicting pageviews, and visualizing relationships.
Since pageviews are a sensitive metric, we've elected to scale this number from 0 to 1. This way we can study the variance across articles without compromising the interests of the New York Times.
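For readers unfamiliar with this kind of rescaling, here is a minimal sketch of one common approach, min-max scaling, in Python. The exact method used to prepare this dataset is not stated here, so treat the choice of min-max scaling as an assumption. The function name `min_max_scale` and the sample numbers are hypothetical and are not taken from the data.

```python
def min_max_scale(values):
    """Rescale a sequence of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to preserve.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw pageview counts (made-up numbers, not from the dataset).
raw_pageviews = [1200, 450, 98000, 3100]
scaled = min_max_scale(raw_pageviews)
print(scaled)
```

A linear rescaling like this keeps the ordering and relative spacing of articles intact, so relative comparisons remain meaningful even though the absolute counts are hidden.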
In addition, while you're free to explore and play around with this data, we ask that you not publish a blog post or article using the data without first running it by Brian Abelson (@brianabelson). Thanks!
Links: