Baltimore County's latest reading test scores, while showing promising gains, do not necessarily reflect a dramatic leap in performance, testing experts say.
As county educators celebrate a startling 20-plus-point jump in the percentage of first- and second-graders reading on grade level during the course of the school year, testing specialists caution that such a sharp rise is unusual -- and likely to be partly explained by factors other than better instruction.
Testing authorities who reviewed the county's recent results note that there is clearly improvement in reading scores, which county educators attribute to a new beginning-reading program that emphasizes phonics.
But before holding that program up as a model for other districts to copy, the experts advise watching the scores for several years. They warn that a number of factors might have complicated the data and made for murky comparisons.
"I don't find it strongly persuasive that the program is having this impact," said H. D. Hoover, a nationally known testing expert at the University of Iowa and senior author of the widely used Iowa Tests of Basic Skills.
"There are too many possible other explanations," Hoover said. "Schools don't change that quickly. The most meaningful comparisons will be after two or three years of this program, as you follow kids to third and fourth grade."
The results come as educators around the country feel intense pressure to boost test scores in response to calls for accountability in public schools.
In some districts, the pressure has led to accusations of misrepresentation of scores, which is becoming "an epidemic in the United States," said Tom Haladyna, a testing expert at Arizona State University.
No one has suggested that this is an issue in Baltimore County. Along with outside experts, the county's testing officials have cautioned against drawing overly strong conclusions about instruction from one year's worth of data.
Interpreting scores is complicated, and it is made more so by the widespread practice among school systems and states of switching tests every few years, which makes historical comparisons nearly impossible.
Even comparing school districts on these tests can be frustrating. Most other Maryland districts haven't released their reading scores. When they do, some present their data in different forms.
This spring, 85 percent of county first- and second-graders scored at or above grade level, compared with the national average of 77 percent. In 1997, 78 percent of county first- and second-graders scored at or above grade level.
Higher scores this year might be expected in a county recovering from a scattershot approach to reading instruction that sent scores declining in the early 1990s. The remedy, a new "Word Identification" curriculum that began in 1996, is receiving praise for its strong alignment with current research on reading.
Also, the tests featured here -- unlike the statewide Maryland School Performance Assessment Program -- target phonics and other basic skills taught in that program.
"What we know is that it is remarkable how quickly young kids can recover if they are given the right instruction," said William J. Moloney, former superintendent of Calvert County, now the education commissioner of Colorado. "With a simple, direct approach, you can get those kinds of results."
County educators say the results reflect what they're seeing in the schools: More students appear to be learning to read.
For example, at Grange Elementary School, students have performed better not just on the first- and second-grade exams but also on a series of diagnostic tests given to third-graders by the Sylvan Learning Corp.
"We're seeing steady improvement," said Grange principal Harry Belsinger. "Our teachers can see in their classes that more children are learning to read, and the test results show that."
Experts advise viewing the seemingly huge one-year leap with caution. They raise these concerns:
The scores come from three different tests: the Gates-MacGinitie in grade one, and two versions of the Comprehensive Tests of Basic Skills (CTBS) for grade two, a fall version and a spring version.
Paul Mazza, the county's director of research and data analysis, said that the fall and spring CTBS tests were aligned so they can be compared, and allowances were made for the expected growth of students during the school year.
Michael Kean, vice president for public and governmental affairs at CTB/McGraw-Hill, which publishes the test, said the tests were equated and described the county's testing practice as "a very powerful way to judge student achievement."
But properly linking different tests is one of the most challenging and most debated problems in the testing field, experts say.
What particularly puzzled the experts was the appearance of abnormally low scores in the fall. For example, first-graders -- 76 percent of whom scored at or above grade level in the spring of 1997 -- scored 10 points lower when they returned to school as second-graders last fall. County officials say there is an influx of new students in the fall, and that the second-grade test, the CTBS, is harder.
In any case, some experts suspect that the fall scores could be artificially low, which would make the fall-to-spring gains less meaningful.
"It means it's not real," Hoover said, stressing that he is not suggesting the data was manipulated. "There's something here that I don't understand."
Some experts said spring-to-spring results are more reliable, arguing it's a cleaner comparison to measure two groups of children at the same point in development against a national "norm" group at the same point in development.
By this measure, the county posted a moderate gain from spring 1997 to spring 1998 -- an 8-point increase in the percentage of first-graders scoring at or above grade level and a 6-point increase among second-graders.
Another factor that might have slightly inflated the scores is the county's decision to eliminate from the data children who moved during the school year, in an effort to get a sharper picture of each school's instruction.
That group, representing 8 percent of first-graders and 13 percent of second-graders, is likely to disproportionately include low-income, low-achieving students.
Paul Glovinsky, a testing expert at Kaplan Educational Centers, which prepares students for standardized tests, said that it's obvious there is some overall change in performance; the testing sample is large, and the changes are across the board.
After a cursory review of the data, he said, "I think what happened is two things -- there was a program instituted that was more closely aligned with the skills that are weighted on the test. And there appears to be an abnormally low set of scores in the district in the fall."
'Going in the right direction'
County educators acknowledge that it will take time to draw conclusions about the effectiveness of Word Identification.
"I think that the gains show we're going in the right direction," Mazza said. "But we're going to need another year or two of data to prove it."
State schools Superintendent Nancy Grasmick said the county has taken important steps in the first, crucial phase of reading instruction: teaching children to sound out words.
But she noted that reading also involves comprehension and application of reading to problem-solving, and educators will watch for similar improvements on the MSPAP -- given initially in third grade -- which measures those skills.
Pub Date: 8/30/98