Introducing the New Negro Leagues Database
December 5, 2016 by Daniel Hirsch
It’s been over five years since we originally launched the Negro Leagues Database. Over that time, there have been significant additions to the database, in terms of new seasons and statistics. But the website and the presentation of these statistics have largely remained the same. In May of 2015, I overhauled the Major League part of The Baseball Gauge, and I’ve wanted to do the same with the Negro Leagues section. Today, we re-launch the award-winning Negro Leagues Database. Here are some of the new features:
Per 162 games
One of the biggest issues with Negro Leagues statistics is that they are incomplete. We don’t have box scores for every game and we currently do not have data for every season and league. Because of this, it’s tough to compare Buck Leonard’s 62 career home runs to Cristóbal Torriente’s 70, the same way we compare Harmon Killebrew (573) to Andre Dawson (438).
To help fix this issue, I’ve included “per 162 games” rates on player and season/career leaderboard pages. Here we’ll see that Buck Leonard averaged 26 home runs per 162 games, while Torriente averaged 11.
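As a rough illustration of the idea, a "per 162 games" rate simply scales a counting stat to a 162-game schedule. The sketch below is an assumption about the arithmetic, not the site's actual code; the function name `per_162` and the sample numbers are made up for the example.

```python
def per_162(total, games, season_length=162):
    """Scale a counting stat (e.g. home runs) to a full-season rate.

    total: the raw counting stat from the games we have box scores for.
    games: number of games those stats cover (must be positive).
    season_length: schedule length to scale to; 162 matches the text.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    return total * season_length / games


# Hypothetical player (not real data): 30 home runs in 180 documented games.
hypothetical_rate = per_162(30, 180)
print(hypothetical_rate)  # 27.0
```

Because the rate depends only on documented games, two players with very different amounts of surviving data can be placed on the same scale.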
Similarity scores
Comparing raw statistics from the Negro Leagues with those from the Major Leagues is far from perfect: it doesn’t account for league quality, park factors, or era. Even so, every player page now has similarity scores that show which Major Leaguer had the most similar career. Because of the incomplete records described above, “per 162 games” statistics are used instead of career totals. You can also limit the comparison to Hall of Famers or active players.
The similarity score tool shows us that Oscar Charleston’s most similar Major Leaguer was Rogers Hornsby.
Defensive Regression Analysis
These fielding statistics have been available on the Major League site for a few years now and they are finally included in The Negro Leagues Database. Defensive Regression Analysis, created by Michael Humphreys, takes basic fielding statistics and estimates how many runs a player has saved (or allowed) compared to average.
Defensive Regression Analysis shows us that Dick Seay, while a lightweight with the bat (career 51 OPS+), saved 67 runs at second base in the seasons for which we have fielding data.
New Wins Above Replacement
The calculation for Wins Above Replacement now matches the Major League site. It uses Base Runs for offense, Defensive Regression Analysis for fielding, and runs allowed (with an adjustment for fielding) for pitching. The replacement level has been set at a .294 winning percentage to be consistent with Baseball-Reference and FanGraphs.
There is also Wins Above Average and Wins Above Greatness if you prefer a different baseline. As with the previous version of the website, Win Shares and Win Shares Above Bench are included.
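To put the .294 replacement level in concrete terms, the short sketch below converts it into team wins over a 162-game schedule and shows the gap between an average (.500) baseline and the replacement baseline. This is only the baseline arithmetic under those stated assumptions, not the site's WAR calculation; the function names are hypothetical.

```python
REPLACEMENT_WIN_PCT = 0.294  # replacement level quoted in the text
AVERAGE_WIN_PCT = 0.500      # a league-average team wins half its games


def replacement_team_wins(games=162):
    """Expected wins for a team made up entirely of replacement-level players."""
    return REPLACEMENT_WIN_PCT * games


def average_vs_replacement_gap(games=162):
    """Team-level wins separating the average and replacement baselines."""
    return (AVERAGE_WIN_PCT - REPLACEMENT_WIN_PCT) * games


print(round(replacement_team_wins(), 1))       # 47.6
print(round(average_vs_replacement_gap(), 1))  # 33.4
```

In other words, a full-season team of replacement players would be expected to win about 48 of 162 games, and roughly 33 team wins separate the Wins Above Average baseline from the Wins Above Replacement baseline.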
The career leaders per 162 games list contains many familiar names.
Roster pages
These are available on team, year, franchise, and all-time pages. They contain vitals, uniform numbers, and birth/death information.
Data Coverage
These pages give the user an idea of which statistics we have and which we are missing.
New Logo
We have a beautiful new logo, which was kindly provided by Gary Cieradkowski, creator of the Infinite Baseball Card Set and author of The League of Outsider Baseball.
Finally, we have all the features that were previously available on The Negro Leagues Database as well as the Major League version of The Baseball Gauge.
What is the latest year you intend to include in the database?
For the Negro Leagues specifically, we will possibly stop at 1948. However, we will likely add the Mexican League through 1955, and the 1950s Mandak League, as they both had many Negro League players.