The summer of 1991 was, in the words of one Bell Atlantic Corp. engineer, "a summer that people around here will never forget. It was disastrous."
Indeed. One year ago today, Bell Atlantic had the first of a summer's worth of high-tech headaches from repeated telephone network outages.
The first outage crippled phone service in Maryland, Washington, D.C., Virginia and parts of West Virginia, leaving as many as 5 million customers without local phone service for most of the day.
Others followed in rapid succession, leaving Bell Atlantic customers in western Pennsylvania, and Pacific Bell customers in Los Angeles and San Francisco, without phone service. The outages, which hit in waves throughout July, left millions of people out of touch for periods of a few minutes to more than eight hours.
The outages were traced to a minor software error -- three bits of errant code buried in 3 million lines of computer coding. By the time the source of the problem had been found, public confidence had been shaken, Congress was alarmed and state and federal regulators were asking, "Could this happen again?"
One year later, the short answer is yes.
But industry reforms make the chances of a replay of last summer more remote, experts say.
"That's not to say it's fail-safe from now on, but the chances of a major outage are somewhat less," says James Spurlock, special assistant to the chief of the Common Carrier Bureau at the Federal Communications Commission in Washington.
In the last year, the industry has made changes aimed at reducing the likelihood of outages and, in the event of a breakdown, ensuring that solutions can be found faster. The changes include:
* Not relying on one software or equipment vendor. The idea is to shore up "redundancy," or backup, in the network.
In last summer's outages, Bell Atlantic and Pacific Bell were using the same hardware and software, provided by the same vendor. The upshot: Both fell victim to the same errant software.
* More testing. Before last summer's failures, the Bells relied heavily on outside vendors to test software and hardware. Now the Bells test extensively themselves, and they scrutinize test data submitted by outside vendors more closely.
* Tightened security. A close review of the Bells' electronic systems revealed weak spots, company officials said. Security on those systems, always a priority, has been tightened even more. Security also has been tightened at buildings that house critical equipment.
* Better communication among telephone companies and their vendors. The outages left some companies and their vendors in the lurch, unable to reach one another after lines went down and calls couldn't be completed. A backup network has been established to prevent a communication blackout in a major outage.
* Better communication within the industry. The FCC now requires companies to report any outage that affects 50,000 people or more and lasts more than 30 minutes. That information is available for public inspection. The FCC has formed an industrywide council to discuss network reliability issues. Both initiatives have forced even die-hard competitors to share information.
The changes are timely.
Raymond Albers, assistant vice president of technical planning for Bell Atlantic, notes that the regional Bells are connecting their networks to those of the long-distance companies. The idea is to allow the sophisticated networks of the Bells and the long-distance companies to communicate with each other more easily.
But the move has drawbacks. Problems that crop up in a regional network could infect and take down long-distance networks, too.
Electronic walls are being established to prevent that. But nobody is offering guarantees that a catastrophic outage couldn't happen, given the right circumstances.
"You can never say never," Mr. Albers said.
Regulators and phone officials say it is too early to tell whether recent changes have made the network safer. One good sign, according to Mr. Spurlock, is the lack of a major outage since last fall. But that, he said, doesn't prove the long-term reliability of the nation's networks.
"We've learned enough to know that you're never going to make them go away for all time," he said. "But you can make them few and far between."
Over the last year, telephone companies have taken steps to prevent outages and correct them when they occur. The measures include:
* Having backups for software and hardware and no longer relying on one piece of software.
* Testing their equipment themselves instead of relying on outside companies.
* Tightening security. Although neither hackers nor terrorists were blamed for recent outages, the mishaps revealed weak spots in security.
* Improving communication between telephone companies and their vendors.
* Improving communication in the industry. The Federal Communications Commission now requires companies to report any outage that affects 50,000 people or more and lasts for more than 30 minutes. That information is available for public inspection.