In 2016, the U.S. Census Bureau faced a pivotal choice in its plan to digitize the nation’s once-a-decade population count: build a system for collecting and processing data in-house, or buy one from an outside contractor.
The bureau chose Pegasystems Inc, reasoning that outsourcing would be cheaper and more effective.
Three years later, the project faces serious reliability and security problems, according to Reuters interviews with six technology professionals currently or formerly involved in the census digitization effort. And its projected cost has doubled to $167 million — about $40 million more than the bureau’s 2016 cost projection for building the site in-house.
The Pega-built website was hacked from IP addresses in Russia during 2018 testing of census systems, according to two security sources with direct knowledge of the incident. One of the sources said an intruder bypassed a “firewall” and accessed parts of the system that should have been restricted to census developers.
“He got into the network,” one of the sources said. “He got into where the public is not supposed to go.”
In a separate incident during the same test, an IP address affiliated with the census site experienced a domain name service attack, causing a sharp increase in traffic, according to one of the two sources and a third source with direct knowledge of the incident.
Neither incident resulted in system damage or stolen data, the sources said. But both raised alarms among census security staff about the ability of the bureau and its outside security contractor, T-Rex Solutions, to defend the system against more sophisticated cyberattacks, according to five sources who worked on census security, as well as internal messages from security officials that were reviewed by Reuters.
Among the messages, posted on an internal security registry seen by Reuters, was a note observing that T-Rex’s staff lacked adequate forensic capability as recently as June of this year. “In the event of a real-world event such as a significant malware infection,” the team would be “severely limited in its capability to definitively tell the story of what occurred,” the message said.
One of the sources with direct knowledge of the hack involving Russian IP addresses described the internal Census Bureau reaction as a “panic.” The incidents prompted multiple meetings to address security concerns, said the two sources and a third census security source.
Census Bureau spokesman Michael Cook declined to comment on the incidents described to Reuters by census security sources. He said no data was stolen during the 2018 system test and that the bureau’s systems worked as designed.
The work of Pega and T-Rex is part of the bureau’s $5 billion push to modernize the census and move it online for the first time. The project involves scores of technology contractors building dozens of systems for collecting, processing and storing data and training census workers for the once-a-decade count. T-Rex’s security work is projected to cost taxpayers up to $1.4 billion, according to the census budget, making it the largest recipient of the more than $3.1 billion that the bureau set aside for contracts.
The problems with Pega and T-Rex reflect the Census Bureau’s broader struggle to execute the digitization project. The effort has been marred by security mishaps, missed deadlines and cost overruns, according to Reuters interviews over the past several months with more than 30 people involved in the effort.
“The IT is really in jeopardy,” said Kane Baccigalupi, a private security consultant who previously worked on the census project for two years as a member of the federal digital services agency 18F, part of the General Services Administration. “They’ve gone with a really expensive solution that isn’t going to work.”
The potential costs of a hacking incident or a system failure go beyond busted budgets or stolen data. A technological breakdown could compromise the accuracy of the census, which has been a linchpin of American democracy since the founding of the republic more than two centuries ago.
The U.S. Constitution requires a decennial census to determine each state’s representation in Congress and to guide the allocation of as much as $1.5 trillion a year in federal funds. Census data is also crucial to a broad array of research conducted by government agencies, academics and businesses, which rely on accurate demographic statistics to craft marketing plans and choose locations for factories or stores.
In a worst-case scenario, according to security experts, poorly secured data could be accessed by hackers looking to manipulate demographic figures for political purposes. For example, they could add or subtract Congressional seats allocated to states by altering their official population statistics.
The Census Bureau says its information-technology overhaul is on-track. Systems supporting initial census operations – such as creating its address database and hiring workers – are “fully integrated with one another, performance-tested, and deployed on schedule and within budget,” bureau spokesman Cook said.
Cook said that the bureau had conducted a “bug bounty,” a bulletproofing practice in which benevolent hackers are invited to search for vulnerabilities. He called the effort successful but declined to provide details for security reasons.
Lisa Pintchman, a spokeswoman for Cambridge, Massachusetts-based Pega, said the company was selected through a “very rigorous process” and stands by its work. T-Rex, headquartered in Maryland, declined to comment.
The escalating costs and reliability concerns for Pega’s front-end website have prompted the bureau to consider reverting to an in-house system, which remains under construction as a backup, according to three technology professionals involved in the census project. Census spokesman Cook confirmed that the in-house system, called Primus, would be available for use if needed next year.
This exclusive account of the Census Bureau’s technology troubles comes after government oversight agencies have chronicled other security problems, delays and cost overruns.
The Government Accountability Office (GAO), the fiscal watchdog for Congress, has said the 2020 census is at high risk for a breach or system outage that could prevent people from filling out surveys. The GAO has also said the bureau’s information technology systems won’t be fully tested before the census kicks off for almost all Americans on April 1, 2020, and that 15 of the bureau’s systems – including Pega’s data collection mechanism – were at risk of missing development deadlines ahead of the census.
The Inspector General of the Department of Commerce, meanwhile, in October announced plans to audit the bureau’s technology operations, months after identifying mismanagement of its cloud data-storage system that left it vulnerable to hackers.
Cook declined to comment on the audit but said the bureau is poised to “conduct the most automated, modern, and dynamic decennial census in history.”
The effort to move the census online aims to streamline the counting process, improve accuracy, and rein in cost increases as the population rises and survey response rates decline. Adjusting for 2020 dollars, the 1970 census cost $1.1 billion, a figure that rose steadily to $12.3 billion by 2010, the most recent count. The 2020 tally is projected at $15.6 billion, including a $1.5 billion allowance for cost overruns.
The bureau’s technology woes mounted outside the limelight, as Washington focused on the Trump administration’s push to add a question asking census respondents if they were U.S. citizens, part of a larger effort to curb illegal immigration.
The president abandoned that effort in July after the U.S. Supreme Court rejected it, cheering civil rights groups who had worried it would dissuade immigrants from responding and cost their communities political representation and federal dollars. Still, an October 18 study by the nonpartisan Pew Research Center found that more than one-fifth of Hispanics say they may not participate in next year’s census, compared to 12% of whites.
‘SINGLE POINT OF FAILURE’
The census technology overhaul got off to a late start, in part because Congress gave the bureau less funding than it requested for most of the decade. Pressed for time, bureau leadership at times prioritized speed over security, according to four people familiar with the bureau’s security operations.
New technology systems, they said, were tested in settings that were vulnerable to hackers despite carrying unresolved risks that had been identified by the bureau’s in-house security team. The testing was authorized by bureau leadership and supported by T-Rex, over the objections of the in-house security officials, who wanted the vulnerabilities fixed first, three of the people said. It stoked internal tensions that ultimately led one security boss to quit his post, the people said.
The Census Bureau’s Cook declined to comment on whether the testing was done over the objections of in-house security officials but said that the bureau follows a strict protocol to minimize risk.
The bureau began rolling out its technology plans in 2014, promising a technological tour-de-force with 52 separate systems. Twenty-seven of them will be used for collecting census data, which include building the website where respondents submit forms and the tools used by door-knockers tasked with nudging stragglers.
Most of the Census Bureau’s $5 billion in technology spending has gone to seven main contractors, who together have tapped another 41 companies as subcontractors, according to public presentations by the Census Bureau in 2018.
Within months of the rollout, government advisors from two outside agencies – the U.S. Digital Service and 18F – began warning officials off the sprawling approach, according to Baccigalupi and five other people familiar with the discussions. The outside advisers urged a simpler system, one that would be easier to defend against hacks and glitches.
The Digital Service was created in 2014 by President Barack Obama after the troubled launch of Healthcare.gov, the website meant to allow Americans to sign up for health insurance under Obamacare. Design flaws left the site overwhelmed by higher-than-expected traffic and prevented many users from registering for weeks. Digital Service officials saw the 2020 census as a potential repeat of that fiasco, two of the people said.
The General Service Administration’s 18F unit – named for the address of its Washington, D.C. office – functions like a private-sector consultant and is paid by agencies seeking technology help.
18F declined to comment for this story, and the Digital Service did not respond to requests for comment.
The debate between Census Bureau leadership and its advisors from the Digital Service and 18F focused on two broad approaches to software production: monolithic versus modular.
A monolithic framework – like the one envisioned by Census Bureau officials – bundles different functions into one system. In the case of the census, that could mean a system that allows people to answer the survey on a website, translates incoming responses into data and stores it. Monolithic systems can be easier to build, but critics say they become hopelessly complex when something goes wrong. A problem with one function can shutdown the whole process.
“It’s a single point of failure,” Baccigalupi said.
In a modular system, by contrast, engineers build different pieces of software for each function, then write code to allow them to interact. While it’s more challenging to move data through different components, the risk of a system collapse is much smaller. If one function breaks, others can still work while it’s repaired.
Census officials brought in 18F and Digital Service consultants on long-term secondments to help with aspects of the project but largely ignored their recommendations to take a more modular approach, said 18F’s Baccigalupi and Marianne Bellotti, a former agent at the Digital Service who consulted on the project in 2017.
“I told them pretty consistently in 2017: If you suffer a denial-of-service attack, I’m not sure your architecture can withstand it,” Bellotti said.
In a denial-of-service attack, a hacker tries to prevent legitimate users from accessing a program, often by overwhelming it with more connection requests than it can process. Any extended outages during the census would reduce response rates, compromising the accuracy of the data and making it more expensive to collect.
Cook, the Census spokesman, did not comment on why the bureau chose a more monolithic approach but said the consultants recommending against that path did not fully understand its systems.
“18F and USDS looked at portions of our systems and provided recommendations, but neither group had an overall understanding of how those systems integrated or their capabilities,” Cook said.
Bellotti and Baccigalupi say they told the bureau repeatedly in 2016 and 2017 that Pega’s technology wasn’t well-suited to its central tasks – building the self-response website and the mobile applications to be used by census door-knockers. Pega’s code, they argued, would require so much customization that the final product would be slow and prone to glitches.
“If you want to build the fastest car in the world, you build that car from scratch,” Baccigalupi said. “You don’t try to customize a tour bus until it’s the fastest car in the world.”