Enhanced Privacy Protections Delay the Release of Census Data

On August 12th, the US Census Bureau will release data that states and localities will use to redraw congressional, state legislative, and other district lines. This process is known as redistricting.  

In 2020, we experienced significant delays in the receipt of census data. This is largely due to an updated privacy protection model, which ensures greater confidentiality for respondents while preserving the data’s accuracy.

The Census and Privacy 

The census is a decennial count of the people living in the country. In carrying out this massive undertaking, the Census Bureau must protect the privacy and confidentiality of respondents in its published data, which is then used to draw community maps.

The Census Bureau aims to ensure that the data it releases cannot be traced back to the source and inadvertently "unmask" residents and their personal information. Without confidence in the privacy of their self-reported information, people may withhold essential details, which would reduce the accuracy and usefulness of census data for funding, resource allocation, map drawing, and more. Privacy protections also keep sensitive data away from bad actors who may use it to target, defraud, or harass respondents. To this end, the Census Bureau decided to use a new approach to privacy protection for the 2020 Census.

The Census Bureau last changed how it protected individual responses in census statistics in 1990. So why change now? Simply put, the Census Bureau believed that the old way of ensuring confidentiality was outdated. The increased accessibility of census data online created the need for the most robust possible methods of protecting privacy.

In 2016, Census Bureau researchers set out to test the strength of the protections used in the previous census. They found that they could all too easily re-identify 52 million people (about twice the population of Texas) by name within the 2010 Census. They also found that a bad actor with more resources could break through the privacy protections of as many as 179 million people. They concluded that the standard method of privacy protection was inadequate.

Delays Due to System Refinements 

In the past, the Census Bureau used a process called data swapping, which exchanged characteristic data between nearby residents. Data swapping aimed to create uncertainty about whom a given piece of information applied to without affecting the accuracy of the overall data.
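To make the idea concrete, here is a minimal sketch of data swapping in Python. It is a toy illustration, not the Census Bureau's actual procedure: the household records, the `swap_attribute` function, and the `swap_rate` parameter are all hypothetical and invented for this example.

```python
import random

def swap_attribute(records, key, swap_rate, rng):
    """Return a copy of `records` with `key` exchanged between random pairs.

    Hypothetical helper: roughly `swap_rate` of the records take part in a swap.
    """
    swapped = [dict(r) for r in records]  # copy so the originals stay untouched
    n_pairs = int(len(swapped) * swap_rate / 2)
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    # Pair up the chosen records and exchange the value of `key` within each pair.
    for i, j in zip(chosen[::2], chosen[1::2]):
        swapped[i][key], swapped[j][key] = swapped[j][key], swapped[i][key]
    return swapped

# Invented households in two neighboring blocks (illustration only).
households = [
    {"household_id": 1, "block": "A", "household_size": 2},
    {"household_id": 2, "block": "A", "household_size": 5},
    {"household_id": 3, "block": "A", "household_size": 1},
    {"household_id": 4, "block": "B", "household_size": 4},
    {"household_id": 5, "block": "B", "household_size": 3},
    {"household_id": 6, "block": "B", "household_size": 6},
]

swapped_households = swap_attribute(
    households, "household_size", swap_rate=0.5, rng=random.Random(0)
)
```

In this sketch the set of household sizes across the whole area is unchanged, so area-wide totals stay the same, but an outsider can no longer be sure which size belongs to which household.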

To replace this system, the Census Bureau turned to a differential privacy model. This is a mathematical method that injects statistical noise, or deliberately inaccurate values, into the data so that the link between the data and specific individuals or groups cannot be worked out. Differential privacy protects respondents by making it unclear which values are statistical noise and which are actual data.
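The general idea of noise injection can be sketched with the textbook Laplace mechanism. This is an assumption made for illustration: the article does not describe the Census Bureau's production algorithm, and this sketch does not reproduce it. The block names, counts, and the `epsilon` values below are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on zero.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_count(true_count, epsilon, rng):
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon. Smaller epsilon means more noise
    # (more privacy); larger epsilon means less noise (more accuracy).
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)

# Invented block populations (illustration only).
block_counts = {"block_a": 12, "block_b": 340, "block_c": 3}

noisy = {
    name: noisy_count(count, epsilon=0.5, rng=rng)
    for name, count in block_counts.items()
}
```

The `epsilon` parameter is where the trade-off between privacy and accuracy shows up: any published count may be off by a little, and no one outside can tell how much of a given number is noise.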

But this model didn’t come without its flaws and needed many attempts at refinement. The Census Bureau made updates after conversations with stakeholders and extensive feedback from multiple outside organizations that found significant problems with how the differential privacy model worked. A comprehensive report from the Mexican American Legal Defense and Educational Fund (MALDEF) and Asian Americans Advancing Justice (AAJC) explains many of those flaws. For example, the report found that the new model often improperly moved population totals from urban census blocks to rural ones, that small counties tended to lose racial and ethnic minority populations, and that the model tended to make counties that already had clear majority race/ethnic populations more homogeneous.

As a result of this and other feedback, the Census Bureau continuously updated its differential privacy model parameters. It has now released a new version of the model that it feels addresses past concerns by focusing on "more accuracy or precision in the data in exchange for a comparable, but limited, reduction in privacy protection." This version is much more accurate than the previous model. To illustrate, when it comes to concerns about population data moving between counties or smaller counties seeing oversized changes, the new model adds an average of 1.75 people to total populations, while earlier models had an 82-person average error. The newer model also has less of an impact on racial population totals than earlier models, and it reduces errors between rural and urban population totals.
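One common way to express an "average error" like the figures above is the mean absolute difference between true and published totals. The sketch below assumes that definition for illustration; the article does not say exactly how the Census Bureau computed its averages, and the county totals here are invented so that the result comes out to 1.75.

```python
def mean_absolute_error(true_counts, published_counts):
    """Average size of the gap between true and published totals."""
    gaps = [abs(p - t) for t, p in zip(true_counts, published_counts)]
    return sum(gaps) / len(gaps)

# Hypothetical county totals (illustration only, not census data).
true_totals = [1200, 560, 15000, 89]
published_totals = [1202, 559, 14998, 91]

error = mean_absolute_error(true_totals, published_totals)  # gaps of 2, 1, 2, 2
```

Note that a gap of two people matters far more for the county of 89 than for the county of 15,000, which is why average errors drew the most concern for smaller counties.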

Processes like these contributed to delays in the census data release because the Census Bureau had to make sure that the model provided sufficient privacy protection while preserving the integrity of the data. After many trials, reforms, and rounds of research, the Census Bureau landed on its final model. Ultimately, this model successfully balances privacy protections with data accuracy.

Where We Are Now 

Now that the privacy model has been set up, Leagues across the country are gearing up for the release of data. Some Leagues will be using the data to draw maps, while others will work to ensure that redistricting hearings are transparent and available to the public. We know that these maps, which will be cemented for another decade once drawn, will decide everything from local education to health care to employment. 

The delays are over, so now is the time to ensure that our communities are represented accurately and fairly through the redistricting process. You can get involved in making sure that your voice and the voices of your loved ones are heard. Learn more about the redistricting process in your state and how you can engage with People Powered Fair Maps™.