Why Map the News?
Media attention matters, in both quantity and quality. It helps determine what we talk about as a public and how we talk about it. This project tracks where media sources' attention goes and what that attention looks like across different countries, in combination with diverse data sets like population and income.
How does this Work?
We start by pulling articles from the MediaCloud database. We pass those through an augmented version of the CLAVIN geoparsing package to find mentions of places. We also use the Stanford Named Entity Recognizer to pull out mentions of people. Finally, we run TF-IDF on the words in the articles to build the word cloud of "keywords" that are used disproportionately often for each country.
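To make the TF-IDF step concrete, here is a minimal sketch in Python. It is not the project's actual code: the function names, the simple regex tokenizer, the sample headlines, and the choice to treat all of a country's articles as one "document" are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def keywords_by_country(articles_by_country, top_n=3):
    """Rank each country's words by TF-IDF.

    Assumption: every country's articles are pooled into one document,
    so a word scores high when it is frequent for that country and
    rare across the other countries.
    """
    # Term counts per country.
    counts = {
        country: Counter(tok for article in articles for tok in tokenize(article))
        for country, articles in articles_by_country.items()
    }
    n_countries = len(counts)

    # Document frequency: in how many countries does each word appear?
    doc_freq = Counter()
    for country_counts in counts.values():
        doc_freq.update(country_counts.keys())

    result = {}
    for country, country_counts in counts.items():
        total = sum(country_counts.values())
        scores = {
            term: (count / total) * math.log(n_countries / doc_freq[term])
            for term, count in country_counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[country] = ranked[:top_n]
    return result


# Made-up headlines, used only to exercise the function.
sample = {
    "Brazil": ["The election in Brazil", "Election results and the economy"],
    "Japan": ["The earthquake in Japan", "Earthquake relief and the economy"],
    "Kenya": ["The drought in Kenya", "Drought relief and the election"],
}
keywords = keywords_by_country(sample)
for country, words in keywords.items():
    print(country, words)
```

Words that appear for every country, such as "the", get an inverse document frequency of zero and drop out of the ranking, which is why the word cloud highlights terms that are distinctive for a country rather than merely common.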
Where is this Going?
We are building out our ability to algorithmically detect the who, what, and where of online news. We are incrementally improving our entity detection, topic modeling, and geoparsing to add that metadata back to the enormous corpus of news held in the MediaCloud database. With that additional data we'll be able to deliver a whole new level of understanding of how attention works in online news.