DELETING A computer file is usually like sweeping dirt under the rug. The file, like the dirt, is still there; it just takes a little extra effort to see it.
This long-standing peculiarity of most operating systems often lets files be resurrected after they have been deleted, as miscreants and errant typists have been surprised or delighted to learn. But there are scarier byproducts, including the unintended transfer of data to friends, enemies and the world at large.
My latest depressing discovery is that files produced by Microsoft Office applications for Windows, including Word, Excel, PowerPoint and Access, often incorporate chunks of data previously deleted from the disk on which the files were saved. The chunks are typically as small as one character (or none) or as large as 4,095, but can be even bigger.
The phantom information is hidden by the program that created the file but can readily be seen by opening the file with a disk utility or text editor. Notepad, which comes with every copy of Windows, reveals that unintended data in my Word and Excel files include telephone numbers, travel reminders, file names and gibberish. Your data will differ.
Much of it, of course, will be innocuous or unreadable. Some will not be.
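For readers who would rather not scroll through a screenful of gibberish in Notepad, the same peek can be roughly automated. What follows is a minimal sketch in Python, not a tool from Microsoft or anyone else: it pulls runs of ordinary printable characters out of a file's raw bytes. The function name `printable_runs`, the four-character minimum and the file name `memo.doc` are all my own assumptions for illustration, and the sketch looks only for plain ASCII text, so anything stored in another encoding will slip past it.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long.

    Assumption: hidden text of interest is stored as plain ASCII.
    """
    # Bytes 0x20 through 0x7e are the printable ASCII characters.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # "memo.doc" is a hypothetical file name; substitute one of your own.
    try:
        with open("memo.doc", "rb") as handle:
            for run in printable_runs(handle.read()):
                print(run)
    except FileNotFoundError:
        print("memo.doc not found; point the script at a real file.")
```

Most of what it prints will be the document's own text and formatting names; the stray chunks are whatever else turns up.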
The likeliest material is something deleted the last time you used your machine before you rebooted it, but even that is not certain. The odds of your nasty joke about the boss appearing in the file of a memo to her may be quite low, but not so the odds of something confidential finding its way into some random file she might get from you.
Does this mean the contents of a particular file will never turn up elsewhere as long as you never delete it? Hardly. Those phone numbers in my Word files were never deleted by me. Programs and the operating system itself often store information in temporary files that get deleted automatically.
One culprit is something known as the compound file. Microsoft Corp. has been urging it upon the industry as part of the standard known as OLE 2, which stands for Object Linking and Embedding, and is pronounced "ole," as at bullfights.
Just as a disk has a directory and files, so, in effect, does a compound file itself. This Chinese-box arrangement offers programmers advantages in creating products that work with each other. But it virtually invites the inclusion of deleted data unless programmers take explicit steps to prevent it.
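The "explicit steps" are not exotic. The sketch below, again in Python, simulates the general pattern in memory: a fixed-size block that once held something else is reused for a shorter piece of new data. It is an illustration under my own assumptions, not a description of how Word or OLE 2 is actually written; the function names and the 32-byte block are hypothetical.

```python
def write_block_careless(block: bytearray, payload: bytes) -> bytes:
    """Copy payload into a reused block and leave the rest untouched."""
    if len(payload) > len(block):
        raise ValueError("payload does not fit in the block")
    block[:len(payload)] = payload
    return bytes(block)


def write_block_safe(block: bytearray, payload: bytes) -> bytes:
    """Zero the whole block first, then copy the payload in."""
    if len(payload) > len(block):
        raise ValueError("payload does not fit in the block")
    block[:] = bytes(len(block))  # overwrite every byte with zero
    block[:len(payload)] = payload
    return bytes(block)


if __name__ == "__main__":
    # A 32-byte block still holding "deleted" data (hypothetical contents).
    leftover = b"secret phone 555-0100".ljust(32, b".")

    print(write_block_careless(bytearray(leftover), b"Hi"))
    print(write_block_safe(bytearray(leftover), b"Hi"))
```

The careless version saves the new data with most of the old contents still trailing behind it; the safe version costs one extra line.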
Steve Sinofsky, group program manager for Microsoft Office, explained that his programmers solved the problem in earlier versions of Microsoft Word but acknowledged that it has recurred in the new version for Windows 95.
But while interim versions of Word may have been fixed, deleted data appear in several of my files created last year with Word version 6.0a. It is reasonable to conclude that millions of Office files contain hidden data that their creators never intended to be there, and more are being created every day.
This is not entirely a new problem. In the past, experts assure me, certain database and other programs regularly created files with similar "holes" of deleted data. But the importance of data security has grown as formatted files fly across local and far-flung networks, and database files are shared far less often than word-processed documents.
In a day when your files typically resided on a disk in your machine, the prime security risk of leftover deletions was the possibility that an unauthorized person might walk up and sneak a peek. Now random chunks of information from that machine may ride in your files via unsecure networks to correspondents around the world.
Microsoft's applications have been rather cavalier about data security. Last year, Fred Langa of Windows magazine pointed out that Word's Fast Save option stores deletions in the file in a similarly hidden but readily readable way. Despite a brief flurry of outrage, Microsoft kept the Fast Save mode as the standard for Word's Windows 95 edition without explaining the risks of sending deletions to your correspondents.
Third-party programmers told me about yet another security lapse. Word offers optional password protection, encrypting a file so that passwordless snoops cannot read it, with Word or without. The encryption, however, does not extend to "objects" within the file. Embed a small spreadsheet in a password-protected Word file, and the spreadsheet's contents will be visible in Notepad. That does not appear to happen with, say, WordPerfect.
Users will find no mention of any of these security issues in any of Microsoft's manuals or help files.
Sinofsky told me, "You're the only person who's ever brought it up to us." But he admitted that Microsoft was aware of the errant-chunk problem and said, "We're going to get it fixed as soon as we can."
In the interim, no real solution is available, and removing extraneous data from existing files may be a business opportunity for some enterprising programmer. Using something like the Speed Disk program in Norton Utilities can overwrite deleted data, but cannot guarantee safety.
For the moment the best security is to use the many programs that do not store their information this way or, if you are firmly committed to Microsoft's line, an industrial-strength operating system like Windows NT. When it deletes information, the information stays deleted and does not turn up where it does not belong.