Data Inventories for the Modern Age? Using Data Science to Open Government Data

Julia Lane, Ernesto Gimeno, Ekaterina Levitskaya, Zheyuan Zhang, and Alberto Zigoni
Harvard Data Science Review

This article describes how data science techniques, specifically machine learning and natural language processing, can be used to open the black box of government data. It then describes how an incentive structure can be established, using human–computer interaction techniques, to create a new and sustainable data ecosystem. The particular focus is on the United States and on scientific researchers, who are major users of government data. However, the approach can be extended to other use cases, such as data mentions in newspapers and government reports, and to many other countries.

