Every ten years the U.S. conducts a census in an attempt to count every person once, only once, and in the right place. The U.S. Census Bureau does not release or provide information about individual people or households. Instead, it releases tables and counts of population data.
Historically, the Census Bureau has been able to protect information about individuals through a “disclosure avoidance system.” However, with the advent of sophisticated data mining techniques, the Bureau is concerned that its existing system will not prevent information on individuals from being ferreted out of tables when it releases results from the 2020 census.
Because of this concern, the Bureau has been working for several years on a new, elaborate disclosure avoidance system known as “Differential Privacy” (DP), which statistically adjusts data. Differential Privacy is the Bureau’s attempt to strike a balance between two extremes: releasing tables with the highest level of data accuracy, which would provide minimal privacy protection, and providing virtually 100 percent privacy, which would result in data so erroneous that they are not usable.
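To make the trade-off concrete, here is a minimal sketch of the textbook Laplace mechanism, one common way of achieving differential privacy for counts. It is not the Bureau's actual system, which is considerably more elaborate. The block populations and the `epsilon` values below are hypothetical and chosen only for illustration. A smaller `epsilon` means more privacy and larger errors in the published counts.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon (textbook Laplace mechanism).
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_counts, epsilon, rng):
    # Average absolute difference between noisy and true counts.
    errors = [abs(noisy_count(c, epsilon, rng) - c) for c in true_counts]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical block populations, not real census data.
    block_counts = [12, 0, 47, 3, 150] * 200
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, round(mean_abs_error(block_counts, epsilon, rng), 2))
```

In this sketch the average error is the same size for every block regardless of its population, so a block with three residents is distorted far more, in relative terms, than one with 150. That is one reason small-area data are a natural place to look for problems.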
As the Bureau grapples with the privacy vs. accuracy issue, it has been releasing a series of “demonstration products” using data from the 2010 census so census stakeholders can see the kinds of changes DP will make and determine the impact it would have on their work.
To evaluate small-area population data, I looked at Washington state’s census block population data and at the four-county area of Island, San Juan, Skagit, and Whatcom counties. I tested the impacts of DP and found that the errors it introduces are substantial. I believe similar errors will be found in other states as well. Further, I believe that if DP is applied nationwide, it will render the country’s block-level data essentially unusable.