Bad judgment and a violation of accepted operating practices by AT&T technicians were contributing factors in the massive phone outage that hit New York Tuesday and crippled the nation's three largest airports, Joseph Nacchio, AT&T vice president of business communication services, said yesterday.
As AT&T was attempting to explain the incident, the Federal Aviation Administration said that it planned to launch a full investigation into what has been described as the worst aviation communication breakdown in modern history.
Mr. Nacchio said that AT&T technicians on duty at a switching site in Lower Manhattan ignored alarms throughout the day indicating that the center was operating on emergency battery power. Technicians also failed -- twice -- to make a physical check of equipment during a routine energy-conservation exercise.
Those two judgment errors proved disastrous when AT&T's phone network crashed due to a power shortage. The outage snarled long-distance calling across New York and halted traffic at La Guardia, Kennedy and Newark airports because communication lines between air traffic controllers were down.
Mr. Nacchio said that steps to avert the outage could have been taken with ease had proper procedures been followed.
"There were errors in judgment that didn't allow us to diagnose the problem as it should have been," he said. "Standard practices weren't followed."
AT&T's problem apparently began sometime after 10 a.m. Tuesday, when the switching site cut over from commercial power to internally generated power. The procedure is part of the energy-conservation exercise.
According to Mr. Nacchio, the conservation exercise calls for a switch to diesel generator power, but for some reason that didn't happen. When the system couldn't engage the generator, it automatically cut over to emergency battery power.
At this point, Mr. Nacchio said, details become fuzzy. Early indications suggest alarms sounded and warning lights flashed to alert technicians that the emergency power had been engaged. But for some reason, technicians either didn't see or hear the alarms, or saw and heard them but failed to follow up.
Mr. Nacchio estimated there were about 20 workers in the affected site at the time of the incident.
The system continued to operate off battery power throughout the day, slowly draining the batteries of their juice. The battery backup system, which is supposed to be used only in cases of extreme emergencies, is designed to provide power for a maximum of six hours.
At 4:30 p.m., a shift back to commercial power was attempted. But the cutover didn't take place because the generator -- which had triggered the battery backup in the first place -- still wasn't working. When the system attempted to shift to battery backup power, the drained batteries couldn't deliver enough juice to avert an outage.
At 4:50 p.m., AT&T's network crashed because of a power failure. And there was nothing AT&T could do to stop it.
The power failure crippled AT&T's billion-dollar, state-of-the-art network in New York and caused traffic at the three area airports to grind to a halt. The peripheral effect, however, could be felt as far away as Los Angeles, where eastbound planes had to be held on the ground for hours pending restoration of communication lines with the critical New York air traffic control system.
The story was much the same along the Eastern seaboard, including Baltimore. New York-bound planes had to be called back or diverted to other airports, and international flights connecting through the area had to be turned back.
AT&T restored service across the area by midnight.
According to Mr. Nacchio, AT&T is poised to begin upgrading the FAA's air traffic control lines in New York next month. The upgrade will greatly diminish the likelihood of a repeat of Tuesday's outage, he said.
Even so, the FAA said yesterday that it plans to launch a full-scale investigation into the communication breakdown.
Fred Farrar, an FAA spokesman in Washington, said FAA investigators want to know why AT&T's vaunted "network redundancy," or backup systems, didn't work as they should have.