Typo cited in telephone disruptions

Wrong character effectively disabled computer program


DALLAS -- For want of a "D," the phones were lost.

A typographical error -- a single mistyped character -- in a line of computer code caused major disruptions in telephone service June 26 in Baltimore, according to studies by DSC Communications Corp. and Bell Communications Research.

The error was in code used to make changes last summer in phone system software on the East and West coasts, changes made to correct an earlier software error, the companies said.

It caused a series of phone outages in late June and early July that affected more than 12 million customers in Baltimore, Washington, Pittsburgh, San Francisco and Los Angeles, the two organizations said as a final report nears submission to the Federal Communications Commission.

The typo was contained in a "patch" that was supposed to fix an earlier problem. The compounded error occurred in software that controls signals regulating telephone traffic. In the patch sent to signal transfer sites, a "6" was typed where a "D" should have been. An operator loading the "correction" into a system actually added a crucial error.

As a result, the cure was worse than the original ailment. The new error disabled the systems' ability to deal with the waves of messages constantly traversing telephone networks. The phone companies lost the ability to tell what calls to set up, whether calls could be connected and even what equipment in the network was functioning.

Al Burman, a spokesman for Chesapeake & Potomac Telephone Co., a subsidiary of Bell Atlantic that provides phone service in Maryland, Virginia and Washington, said the revelation about the error merely adds more details to a report DSC made earlier that a software problem caused the outages.

"Steps have been taken to protect against future occurrences," Mr. Burman said.

A spokesman for Bell Atlantic said the company planned no legal action.

"We're very, very embarrassed that it happened," DSC Chairman James L. Donald said in an interview at the company's headquarters in the Dallas suburb of Plano. "It just wipes you out. It's just such a little bitty difference there, but such a large effect upon other things. We just wish it had never happened."

The "large effect" was the disruption of phone service on both coasts. The signal transfer points with the error became dams in what is supposed to be a free-flowing river of communications.

The flow of messages backed up through local telephone networks, flooding other signaling equipment and ultimately swamping the entire system.

"We've installed instruction software hundreds of thousands of times before. It was obviously inopportune," said Allen R. Adams, DSC vice president for advanced marketing and technology.

The employee who committed the typo is known, but his identity has not been revealed because he has been one of the company's star programmers. "He's conscientious, he works hard, he wants everything to be right," Mr. Donald said.

Bell Communications Research concurs with DSC in its final analysis that the outages on both coasts could be traced to the single error.

Using techniques that replicated everything from lightning strikes to massive traffic loads on the network, and tools ranging from minicomputers to a pair of pliers, Bellcore scientists ran tests for three days in July that confirmed the outages traced back to the solitary mistake.

But the typo by itself did not take down the Pacific Bell and Bell Atlantic systems, according to the report for the FCC, a copy of which was obtained by the Dallas Morning News.

In every case, a trigger event led to the generation of messages that eventually overloaded the networks.

Copyright © 2021, The Baltimore Sun, a Baltimore Sun Media Group publication