LOOK IT UP

THE BALTIMORE SUN

A logomachy - n. an argument about words - is brewing on the World Wide Web.

Houghton-Mifflin plans to publish the fourth edition of its American Heritage dictionary this month, the volume's first major overhaul in eight years. The new edition is full of changes sure to arouse lexicographers - color illustrations, notes on slang and a new appendix describing Semitic as well as Indo-European roots.

But the change by the publisher that has stirred the most excitement is happening outside the covers, as Houghton-Mifflin hustles to sell electronic versions of its dictionary for inclusion in other companies' software, Web sites and digital publications.

Houghton-Mifflin is not alone. Its major rivals - most notably Merriam-Webster and Microsoft Corp.'s year-old Encarta dictionary - are stepping up their digital dictionary efforts to tap an increasingly lucrative market, setting up a business contest that philologists say will also have consequences for the way Americans use English.

Electronic novels might be making headlines these days, but electronic dictionaries are making money. At Houghton-Mifflin, licensing of digital dictionaries is expected to account for more than $1 million in profit this year, more than 10 percent of the earnings from the company's trade and reference division, according to Wendy Strothman, the division's publisher.

Stifled for years by low margins and flat sales, dictionary publishers are salivating over digital sales and licensing as a new source of revenue growth, promoting flashy new features such as audible pronunciations. But word scholars worry that the new pressures of the online market might end up favoring well-connected or well-positioned dictionaries - some sniffingly note Microsoft's Encarta - over more authoritative lexicons.

Many lexicographers first saw the Internet as a terrific new tool, especially because it made possible electronic texts of nearly infinite length. That impulse inspired the Oxford University Press to revise its 20-volume Oxford English Dictionary for the first time since its completion in 1928.

A new online version of the OED is available to subscribers for fees starting at $550 a year. Researchers are posting the revisions and additions online in stages, and they expect to finish the alphabet in about 40 volumes around 2010. Oxford University Press has not decided whether it will publish a new printed version, too, said Jesse Sheidlower, its American editor.

Potential to share

The Internet also enables rival dictionary compilers to share a common digital "corpus," or archive of usage samples. Inspired by the British National Corpus that was established in 1993, a group of publishers and linguists based in New York is raising funds and gathering material to build an American National Corpus of 100 million words in texts of all kinds, including transcripts, newspapers and novels.

But the American National Corpus has yet to win help from many of the nation's big dictionary publishers, who would stand to lose the advantage of their own proprietary archives. "We think we have our needs pretty well served," said John Morse, president and publisher of Merriam-Webster, the United States' oldest and best-selling dictionary, with an archive of more than 15 million citations.

The World Wide Web is also a gold mine for linguistic research. For the first time, scholars can trace the infancy of new words as they bubble up from narrow subcultures through online discussion groups and eventually into general use, said Michael Adams, a professor at Albright College in Pennsylvania and editor of the journal Dictionaries.

Adams recently published a study of new coinages from the television show "Buffy the Vampire Slayer" - "slayage" and many other -age formations, for example - tracking their progress from teen-age fans' Web sites to magazines including Mademoiselle. He argues that Buffy has also spawned novel uses of "much," as in "pathetic much?" "morbid much?" or "Having issues much?"

But Microsoft's Encarta dictionary, billed as the first lexicon for the Digital Age, has some lexicographers shaking their heads, partly because they worry that it could indeed be the dictionary of the future.

The idea for the Encarta was born in the early 1990s, when Nigel Newton, chief executive of the British publishing house Bloomsbury, wrote Bill Gates a letter proposing to create a dictionary of "world English."

At the time, Microsoft was paying Houghton-Mifflin to license online versions of its American Heritage dictionary to use in Microsoft's spell-checking software and to bundle with its Encarta digital encyclopedia. Why pay Houghton-Mifflin, Newton suggested, when the two companies could build a wordbook of their own? Bloomsbury developed the dictionary, selling international digital rights to Microsoft and the American rights to Holtzbrinck Publishers' St. Martin's Press.

The new venture faced long odds in bookstores. Most American consumers traditionally want a red dictionary with the name Webster on the cover - as in Merriam-Webster, Random House's Webster's, and IDG Books' Webster's New World, says John Sargent, president of Holtzbrinck's American operations.

But the new dictionary's publishers are betting that Microsoft's commanding position in the software market can make Encarta's name and black cover ubiquitous. "Our thinking was that, given its use in Microsoft software, the Encarta brand would over time become the leading reference brand," Sargent said. The electronic version is available for sale with some Microsoft software or for free at www.encarta.com.

The possibility that Encarta will become the new Webster is what troubles many linguists. In a forthcoming review in the journal Dictionaries, Sidney I. Landau, author of "Dictionaries: The Art & Craft of Lexicography," roundly pans Encarta's "cumbersome, repetitious and inconsistent style" and especially what he sees as its excessive political correctness.

The word "Indian," an example Landau notes, is described in other dictionaries as potentially insensitive but also widely used among Native Americans and inextricably woven into terms like "Indian summer." The Encarta issues a blanket condemnation, calling the term "offensive" several times. In a few cases, the Encarta Web site interrupts the viewer with a "language advisory" before even displaying a potentially offensive word, as if it were a lewd movie.

Such labels, Landau says, reverse most lexicographers' understanding of their job - to report in neutral terms the changing shape of the language.

Adams worries that Encarta will succeed despite its flaws and at the expense of its rivals. "The problem is that if they don't put out the best possible dictionary, because of the access they have through the Microsoft software, they could very well depress the sales of the four major publishers," said Adams, who has worked as a consultant to American Heritage.

"Good dictionaries would disappear and we would be left with an inferior dictionary," he said.

Microsoft and its partners dismiss the criticism as predictable nitpicking. Every new or different dictionary has met a similar response from professional lexicographers, said Holtzbrinck's Sargent.

Deal with Microsoft

Houghton-Mifflin, Microsoft's previous digital dictionary supplier, was the publisher with the most to lose from the Encarta dictionary, which Microsoft this year began using instead of the American Heritage. Houghton-Mifflin's Wendy Strothman said that new digital licensing deals had "more than made up for the loss of that revenue stream."

Strothman says Houghton-Mifflin prepares customized versions of its digital database for a variety of clients, seeking to capitalize on the recent interest in electronic publishing by embedding its dictionary in electronic books or reading software. Readers can look up any word with a click.

When a half-million fans downloaded copies of Stephen King's electronic novella "Riding the Bullet" in March, some of the software programs for displaying it included a digital version of the American Heritage dictionary, and Houghton-Mifflin received a small royalty on each. This fall, the digital publisher netLibrary will begin including American Heritage dictionaries with its e-books, paying a sliding scale fee for its use. (Microsoft's new Reader software, however, includes a version of Encarta.)

A number of Web sites, including www.dictionary.com, have even paid Houghton-Mifflin for use of its digital dictionary to provide free spellings and definitions on the Web, hoping to attract viewers and sell advertising. "They are welcome to do that, but our content costs us money and we want to get paid for it," Strothman said. "What puzzles me is why our competitors put their own dictionaries up on the Web for free."

Houghton-Mifflin sells its dictionary on CD-ROM, but does not put it on a Web site of its own.

Merriam-Webster has taken a radically different tack from American Heritage, giving its dictionary away on its Web site (www.m-w.com) while trying to license it to whomever it can, including America Online and the handheld computer maker Franklin Electronic Publishers. Recently, it struck a deal to display its Web site on Palm devices.

"Unlike Houghton-Mifflin, we are just a dictionary publisher," said Morse, of Merriam-Webster. "We aim mainly to promote the brand."

The main Merriam-Webster Web site and a related site for children offer word games and a free word-of-the-day e-mail with usage and etymology tips. Morse said the site was getting about 20 million page views a month, at a rate of about 50,000 look-ups an hour during the middle of the day.

Merriam-Webster also tracks which words users look up for guidance in making revisions. This month's hot word: "chutzpah," spurred by press coverage of Democratic vice presidential candidate Joseph I. Lieberman.

dic*tion*ar*y (n.)

1. A reference book containing an alphabetical list of words, with information given for each word, usually including meaning, pronunciation, and etymology.

2. A book listing the words of a language with translations in another language.

3. A book listing words or other linguistic items in a particular category or subject with specialized information about them: a medical dictionary.

4. Computer science.

a. A list of words stored in machine-readable form for reference, as by spelling-checking software.

b. An electronic spelling checker.

AMERICAN HERITAGE DICTIONARY

Copyright © 2019, The Baltimore Sun, a Baltimore Sun Media Group publication