How A Baseball Simmer Challenged History
April 12, 2008 by Brian Joseph · 7 Comments
Robert Bofors’ study of caught stealing estimates during the Deadball era proves to be illuminating.
For most of early baseball, times caught stealing were not recorded. The National League did not record the statistic from 1876-1914, 1916-1919 and 1926-1950, and the American League did not record it from 1901-1913 and 1916-1919. Some of the most prolific base stealers, including all-time stolen base leader Billy Hamilton, lack caught stealing data.
For roughly a century, the players of the Deadball era stole with relative impunity. That is, until Robert Bofors had something to say. Taking on the daunting task of developing ten database discs for Diamond Mind Baseball, Bofors had to fill in the gaps of missing statistical data. The largest gap was the missing CS stats for 11,410 player records covering every season from 1894 to 1919. After six months of research, analysis and estimation from December 2006 to May 2007, you could say that Bofors is responsible for gunning down more base runners than Johnny Bench.
The baseball simmer could have easily settled on an average stolen base success rate of 55%, based on known CS stats from 1914-1915 and 1920-1925, but Bofors wasn’t satisfied with that. Applying that average would charge Hamilton with a major-league record 81 times caught stealing during the 1894 season, when he stole 98 bases, nearly double Rickey Henderson’s major-league record of 42 times caught stealing in 1982. Bofors was looking for true accuracy in the simulation.
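The arithmetic behind that Hamilton figure can be sketched in a few lines. This is only an illustration, assuming "success rate" means SB / (SB + CS); the function name is made up for the example, and the reading that 81 comes from rounding the implied total up is an assumption, not something the article spells out.

```python
import math

def implied_caught_stealing(stolen_bases: int, success_rate: float) -> float:
    """Return the CS total implied by SB / (SB + CS) == success_rate."""
    attempts = stolen_bases / success_rate
    return attempts - stolen_bases

# Hamilton's 1894 season: 98 steals at the flat 55% rate.
implied = implied_caught_stealing(98, 0.55)
print(round(implied, 2))   # 80.18
print(math.ceil(implied))  # 81, assuming the estimate is rounded up
```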
Instead, Bofors took a number of different factors into consideration. He took historian Dan Levitt’s work into account; Levitt was responsible for the 55% stolen base rate, based on his research as co-author of Deadball Stars of the American League and Deadball Stars of the National League. Bofors also considered Bill James’s speed score, a formula used to evaluate a player’s speed.
In the end, it took a more in-depth look at the stolen base records of every individual during the seasons being evaluated to determine the overall stolen base success rate. While looking at the known data from 1914-1915 and 1920-1925, Bofors was able to discern that players from those eight years could be segmented by stolen base totals to determine a truer success rate. His conclusion: it’s not a stretch to imagine that a hundred players with 60-plus steals would have a much higher success rate than a hundred players with six steals.
Bofors took the players and broke them down into three categories: regulars, bench players, and pitchers. He used Net-Facing Pitchers (NFPs), a watered-down version of Plate Appearances, because data on sacrifice hits, sacrifice flies and catcher’s interference is lacking. The idea of NFPs comes from Michael Schell’s work in Baseball’s All-Time Best Sluggers; the figure is the sum of At Bats, Walks and Hit Batsmen. After breaking down the groups, he assigned SB success rates based on the data mined from the eight known seasons. The numbers were broken down as follows:
Regulars:
SB | RATE |
---|---|
40+ | .716 |
31-40 | .651 |
24-30 | .615 |
20-23 | .574 |
17-19 | .575 |
14-16 | .542 |
11-13 | .539 |
9-10 | .521 |
8 | .509 |
7 | .489 |
5-6 | .458 |
4 | .409 |
3 | .375 |
2 | .326 |
1 | .193 |
Bench:
SB | RATE |
---|---|
11-20 | .647 |
6-10 | .584 |
4-5 | .563 |
3 | .526 |
2 | .495 |
1 | .476 |
Pitchers:
SB | RATE |
---|---|
2-7 | .707 |
1 | .804 |
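As a rough sketch of how the tables above could be applied, the snippet below looks up a success rate by player category and steal total, then converts it into an unrounded caught stealing estimate. The names `RATES`, `success_rate` and `estimated_cs` are invented for illustration, and the sketch leaves out the Speed Score and era adjustments, so its output should not be read as Bofors's final numbers. The printed tables list both "40+" and "31-40"; treating exactly 40 steals as "40+" is an assumption.

```python
from typing import Optional

# (low, high, rate) brackets copied from the tables above; high=None is open-ended.
RATES = {
    "regular": [
        (40, None, 0.716), (31, 40, 0.651), (24, 30, 0.615), (20, 23, 0.574),
        (17, 19, 0.575), (14, 16, 0.542), (11, 13, 0.539), (9, 10, 0.521),
        (8, 8, 0.509), (7, 7, 0.489), (5, 6, 0.458), (4, 4, 0.409),
        (3, 3, 0.375), (2, 2, 0.326), (1, 1, 0.193),
    ],
    "bench": [
        (11, 20, 0.647), (6, 10, 0.584), (4, 5, 0.563),
        (3, 3, 0.526), (2, 2, 0.495), (1, 1, 0.476),
    ],
    "pitcher": [(2, 7, 0.707), (1, 1, 0.804)],
}

def success_rate(role: str, sb: int) -> Optional[float]:
    """Return the table rate for this role and SB total, or None if no bracket covers it."""
    for low, high, rate in RATES[role]:
        # Brackets are checked top down, so 40 steals matches "40+" first (assumption).
        if sb >= low and (high is None or sb <= high):
            return rate
    return None

def estimated_cs(role: str, sb: int) -> Optional[float]:
    """Unrounded CS implied by SB / (SB + CS) == table rate."""
    rate = success_rate(role, sb)
    if rate is None:
        return None
    return sb * (1 - rate) / rate

print(success_rate("regular", 12))  # 0.539
print(success_rate("bench", 25))    # None: the bench table stops at 20 steals
```

Returning `None` for totals the tables do not cover (zero steals, or a bench player with more than 20) keeps the sketch from guessing at cases the article handles separately or does not describe.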
Finally, for players with zero steals, Bofors randomly distributed caught stealings among those 4,525 players to account for the occasional caught stealing or busted hit-and-run play. He then went back to his modification of Bill James’s Speed Score to further shape the caught stealing estimates. While working the data, Bofors had to break the player groups down by season, using league runs per times on base, into five groups: 1894-1895 NL, 1896-1897 NL, 1898-1900 NL, 1901-1919 NL and 1901-1919 AL. He determined that the nineteenth-century seasons had to be sorted into very small subsets because of league volatility, while the less volatile Deadball years of 1901-1919 could be grouped together. Bofors then worked the data and initially came up with SB rates around .593, much higher than the widely accepted .55 average. So he reworked the numbers and rounded the caught stealing estimates straight up, which lowered the rate to .584, but his numbers remained consistent. The Deadball era saw a slight drop, with the AL from 1901-1919 at .582 and the NL from 1901-1919 at .578.
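To see why rounding the caught stealing numbers straight up pulls the overall rate down, consider a small sketch. The player lines here are hypothetical, chosen only to show the direction of the effect; they are not Bofors's data, and the helper name `aggregate_rate` is made up for the example.

```python
import math

# Hypothetical (SB, assumed success rate) lines; NOT taken from Bofors's database.
players = [(45, 0.716), (22, 0.574), (9, 0.521), (3, 0.375), (1, 0.193)]

def aggregate_rate(sb_cs_pairs):
    """Overall success rate: total SB divided by total attempts (SB + CS)."""
    total_sb = sum(sb for sb, _ in sb_cs_pairs)
    total_cs = sum(cs for _, cs in sb_cs_pairs)
    return total_sb / (total_sb + total_cs)

# Fractional CS implied by each player's rate, then the same values rounded up.
raw = [(sb, sb * (1 - rate) / rate) for sb, rate in players]
rounded_up = [(sb, math.ceil(cs)) for sb, cs in raw]

print(f"unrounded:  {aggregate_rate(raw):.3f}")
print(f"rounded up: {aggregate_rate(rounded_up):.3f}")
```

Every fractional CS estimate gains up to one extra caught stealing when rounded up, while steals stay fixed, so the aggregate rate can only stay the same or fall. The effect is proportionally largest for low-steal players, who make up most of the records.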
What started as an exercise in building a more realistic simulation for a computer game ended up providing some valuable insight on the success of base stealers during the infancy of baseball. Bofors drew a number of conclusions based on his analysis:
- Contrary to expectations, catcher protection and defense didn’t significantly improve after 1894.
- The late 19th century was not an era for hog-wild base stealing.
- The foul strike rule gave players fewer opportunities to steal.
- Good base stealers took advantage of weaker opposing catchers after the nascent American League disrupted National League rosters and left both circuits uneven in talent.
- With the jump in batting averages and runs scored due to the new cork-center ball in 1911 and 1912, SB rates also increased. What Bofors couldn’t determine is whether or not this was a natural consequence of higher on-base averages or an aberration of the formula used.
- By 1913, pitchers had the edge with the introduction of the emery ball. For the rest of the decade, stolen base rates fell.
- Estimated CS values fell into line with the known 1914-1915 and 1920-1925 figures at the tail end of the Deadball era.
The one absolute is that the detailed estimates provided by Robert Bofors supply evidence that the Deadball era had a higher stolen base success rate than most experts have claimed, contrary to the widely accepted belief. Until a deeper study is done, we will not know how close Bofors’ estimates are. In the meantime, these estimates are food for thought for even the casual baseball historian.
Estimated Stolen Base Rates (1894-1925)
That’s an impressive amount of research, and an interesting way of looking at the problem. I am skeptical, however, about his assumptions concerning the success rates of different groups of players. It isn’t necessarily true that more stolen bases = more successful base stealer, percentage-wise. For instance, in the 5-15 steals range, I would expect to find some guys who are smart enough to only steal when it’s a sure thing. They’d have fewer steals, but almost never be caught.
Hmm, actually, having reread your article, I see where he is coming from. Is the original research available online, for more details?
Justin,
No it isn’t, but I have a copy of his original paper as well as the database he created if you’d like to see it.
I would love to see that database. Very interesting.
Nice work. Is there a way to get a copy of the database to kraut2k@gmail.com ? Thanks.
I would love to see that database as well.
Thank you.
crice@morrowco.com
I have viewed this article on and off for a few years now, and I have finally “piqued” my own curiosity. I too would love to see the SB Database of Mr. Bofors. I have read much of his Ballpark Research, so I am guessing that the Database would be of great interest. Can you please eMail me a copy of the Bofors Database when you have a spare moment? Greatly appreciated.
Thank You Much