Hopkins code cracks baseball scheduling

In baseball, it turns out, making a perfect schedule can be as elusive as throwing a perfect game.

There are a lot of rules, starting with the basics: For each major league team, 81 games must be on the road and 81 at home, with 13 home weekend series and 19 games against divisional opponents.


With any wiggle room left, a team such as the Orioles might ask for a home game on Father's Day or an away game when a concert is scheduled at nearby M&T Bank Stadium. But any special request from one team can lead to a lot of eraser smudges for the rest. Such scheduling issues also extend to the minor leagues, which have their own idiosyncrasies.

Just as the statistics-heavy "Moneyball" approach changed how teams build rosters, the scheduling solution can lie in using math to simplify the complex task of determining who plays whom, where and when.


A team of Johns Hopkins University researchers has developed a system that uses thousands of lines of computer code to satisfy all of a league's scheduling rules, and as many of the teams' requests and preferences as possible.

The researchers have begun selling their services around the country and hope to be setting schedules for half of the minor leagues affiliated with Major League Baseball by next spring; they say a sale to one of the leagues is imminent. More efficient scheduling can save teams money on travel and help them maximize ticket sales, though the researchers emphasize it's impossible to meet every demand.

"In the absence of a perfect schedule, it's really a question of how elegant of a trade-off can you find that the teams can live with?" said Anton Dahbura, an associate research scientist on the project and an owner of the minor league Hagerstown Suns.

Any league has a long list of conditions to be met, with some needs more important than others. In the majors, rules prohibit any team from playing more than 20 days in a row, for example. In cities such as New York, home to two teams, schedulers try to minimize the instances when both are in town at the same time — and that goes for the Orioles and Washington Nationals, too.
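A rule like the 20-day limit is simple to check once a schedule exists. The sketch below is only an illustration, not the researchers' code: the function names and the one-character-per-day encoding ("H" for a home game, "A" for an away game, "-" for an off day) are assumptions made for the example.

```python
from itertools import groupby

# Limit on consecutive game days cited above for the majors.
MAX_CONSECUTIVE_GAME_DAYS = 20


def longest_playing_streak(days):
    """Return the longest run of consecutive game days.

    `days` holds one entry per calendar day: "H" (home game),
    "A" (away game) or "-" (off day). This encoding is hypothetical.
    """
    longest = 0
    # groupby splits the calendar into alternating runs of game days
    # and off days; only the game-day runs count toward the streak.
    for is_game_day, run in groupby(days, key=lambda d: d != "-"):
        if is_game_day:
            longest = max(longest, sum(1 for _ in run))
    return longest


def violates_streak_rule(days, limit=MAX_CONSECUTIVE_GAME_DAYS):
    """True if the team plays on more than `limit` days in a row."""
    return longest_playing_streak(days) > limit
```

Checking a finished schedule this way is the easy part; the hard part, as the researchers describe, is finding a schedule that passes every such check at once.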

Computer systems are used to produce schedules meeting basic demands, but they might still require shuffling if they include strenuous road trips that send teams back and forth across the country for long stretches, said Katy Feeney, senior vice president for scheduling and club relations for Major League Baseball. Computers also help perform quick checks of schedules to make sure all the rules are met, she said.

"There's still some human work that goes into it," Feeney said.

But most minor leagues build their schedules without computers, a tedious task done in pencil, with eraser in hand.

In the 1990s, Dahbura started wondering if he could combine his passions for computer science and baseball to make a better system. The Hagerstown native and lifelong ballplayer developed a semiautomated system he shared with a friend who worked for the Detroit Tigers, and it was used by Minor League Baseball's South Atlantic League in 1998, he said.


It wasn't until Dahbura, a Johns Hopkins alumnus, returned to his alma mater that he delved into building a robust solution to the challenges. He joined forces with Donniell Fishkind, an associate research professor in Hopkins' applied math and statistics department, to develop a system through which 10,000 schedule constraints and even more variables could be plugged into a supercomputer to spit out a workable schedule.

Their approach uses a concept known as combinatorial optimization — there should be at least one schedule that "best" satisfies every rule and team request, though opinions might differ on which to choose. In other math applications, the problem could be solved more easily, but in the case of baseball games, the answer must be a combination of integers. A team can't play fractions of a game against multiple teams in one day.
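The whole-number requirement can be seen in a toy version of the problem. The sketch below is not the Hopkins system, whose details the researchers have not disclosed; it is a brute-force illustration with made-up team names, a made-up preference and hypothetical function names. It assigns every matchup among four teams to one of three days, keeps only the assignments in which no team plays twice on the same day, and then picks the one that best satisfies a single request.

```python
from itertools import combinations, product


def feasible_schedules(teams, num_days):
    """Yield every assignment of matchups to days in which each pair of
    teams meets exactly once and no team plays twice on the same day."""
    pairs = list(combinations(teams, 2))
    # Each matchup is given a whole-number day index: a game happens
    # on one day or another, never as a fraction spread across days.
    for days in product(range(num_days), repeat=len(pairs)):
        busy = [set() for _ in range(num_days)]
        valid = True
        for (a, b), day in zip(pairs, days):
            if a in busy[day] or b in busy[day]:
                valid = False  # one of the teams already plays that day
                break
            busy[day].update((a, b))
        if valid:
            yield dict(zip(pairs, days))


def best_schedule(teams, num_days, score):
    """Return the feasible schedule with the highest preference score."""
    return max(feasible_schedules(teams, num_days), key=score)


teams = ["A", "B", "C", "D"]


def wants_ab_last(schedule):
    # Hypothetical request: teams A and B would like to meet on the last day.
    return 1 if schedule[("A", "B")] == 2 else 0


best = best_schedule(teams, 3, wants_ab_last)
```

Trying every combination works only at this toy size. Real integer-optimization software relies on far more sophisticated search, which is part of why a full league schedule is so demanding to compute.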

The researchers won't go into detail on how their system works, citing competition as they said they neared their first sales to minor leagues. But it involves creating thousands of lines of code defining a wide range of schedule constraints, said Matt Molisani, a graduate student in computer science who has spent the past two years working on the project, starting as an undergraduate.

Some limitations are simpler than others. For example, any major league team could play in any of 30 stadiums for any given game, so every one of those possibilities starts out as a variable.

But each league has preferences that challenge the programmers. In the International League, a minor league in the eastern United States, teams are relatively spread out, so the scheduling rules require that teams play "sister city" opponents in succession for efficiency of travel. If the Gwinnett Braves are going to travel all the way from Georgia to face the Buffalo Bisons, then they should play the Rochester Red Wings or Syracuse Chiefs, too.

"We had to go back to the drawing board to figure out ways of constraining that," Molisani said. "There's always more than one option. It's just, which one ends up working better?"
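One of many possible ways to state the "sister city" rule is as a check on the order of stops in a road trip. The sketch below is an assumption-laden illustration, not the formulation Molisani's team settled on: the grouping of Buffalo, Rochester and Syracuse is taken from the example above, while the function name and the list-of-cities encoding are hypothetical.

```python
# Sister-city groups, based on the example in the article.
SISTER_CITIES = {
    "Buffalo": {"Rochester", "Syracuse"},
    "Rochester": {"Buffalo", "Syracuse"},
    "Syracuse": {"Buffalo", "Rochester"},
}


def sister_city_violations(road_trip):
    """Return the stops that are not adjacent to a sister-city stop.

    `road_trip` is the ordered list of cities a visiting team plays in
    during one trip (a hypothetical encoding).
    """
    violations = []
    for i, city in enumerate(road_trip):
        sisters = SISTER_CITIES.get(city)
        if not sisters:
            continue  # this city has no sister-city requirement
        # Look at the stop just before and the stop just after.
        neighbors = set(road_trip[max(i - 1, 0):i] + road_trip[i + 1:i + 2])
        if not neighbors & sisters:
            violations.append(city)
    return violations
```

A trip such as `["Buffalo", "Rochester"]` passes, while a trip that visits Buffalo alone would be flagged.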


The supercomputer that processes all the constraints can take a month to produce a solution in a case like the International League's, as it sifts through the exponentially large number of possible schedules to find ones that meet the rules. Simpler cases take a matter of hours or days, but then leagues typically add on requests in search of that perfect schedule for everyone, and that can lengthen the process.
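A rough count shows how fast the possibilities pile up. The sketch below uses a deliberately simplified model that is an assumption for illustration only: every pair of teams meets once, and each meeting can be placed on any one of a fixed number of days, before any rule is checked. Real leagues play many more games under many more rules.

```python
from math import comb


def raw_assignments(num_teams, num_days):
    """Count the ways to place each pairing of teams on one of
    `num_days` days, before any scheduling rule is applied."""
    return num_days ** comb(num_teams, 2)


# Print how many digits the count has as the league grows.
for n in (4, 6, 8, 10):
    print(n, "teams:", len(str(raw_assignments(n, n - 1))), "digits")
```

With four teams and three days there are 729 raw assignments; each added team multiplies the total many times over, which is why larger leagues cannot be handled by simply trying everything.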

While the researchers aren't applying the technology to the major leagues yet, a spokesman for the Orioles acknowledged that competing demands can make it difficult to approach a perfect schedule.

"The more scheduling issues that are presented to us in advance ... the fewer opportunities we have to request dates we want to be home," said Greg Bader, Orioles vice president of communications and marketing.

But that is just the challenge that excites the math and baseball enthusiasts. Molisani will graduate in May, off to a job in health care coding, but said he's going to miss the project.

"I like the fact there's always so many different ways to get to the final solution," Molisani said. "There's never just one correct answer, which is weird because in math there's usually only one correct answer."