The Seamheads.com Parkfactors Database (Seamheads Parkfactors 2019_December.accdb)
Release Date:  December 14, 2019

----------------------------------------------------------------------
README CONTENTS
  0.1 Copyright Notice
  0.2 Contact Information
  1.0 Release Contents
  1.1 Introduction
  1.2 What's New
  1.3 Acknowledgements
  1.4 Using this Database
  2.0 Data Tables
  2.1 Home Main Data Tables
  2.2 Visitor Main Data Table
  2.3 ParkConfig Table
  2.4 RH_LH_Data Tables
  2.5 Parks Table
  2.6 Teams Table
  2.7 Leagues Table
  2.8 LeagueTeams Table
  2.9 x-Reference Table
  3.0 Data Issues
  4.0 Online Version of Database

----------------------------------------------------------------------
0.1 Copyright Notice & Limited Use License

This database is copyright 2019 by Kevin D. Johnson. A license is granted
for individual use for research purposes. It may not be re-distributed
without permission. Any commercial use, or other dissemination of the
database in part or in whole is prohibited. Use of this database
constitutes acceptance of these terms.

For licensing information or further information, contact me at
kjokbaseball@yahaoo.com.

----------------------------------------------------------------------
0.2 Contact Information

Groups IO egroup: kjokbaseball
 E-Mail : kjokbaseball@yahoo.com


----------------------------------------------------------------------
1.0  Release Contents



MS Access Versions:
      Seamheads Parkfactors 2019_December.accdb
      Seamheads Parkfactors 2019_December Documentation.txt

Comma Delimited Version:
      Seamheads Parkfactors 2019_December Documentation.txt
      Active Park Events.csv
      Home Main Data.csv
      Home Main Data With Parks Breakout.csv
      Leagues.csv
      LeaguesTeams.csv
      Park Events.csv
      ParkConfig.csv
      Parks.csv
      Retroshet_BBDB_Team_XREF.csv
      RH_LH_ALL_HR.csv
      RH_LH_Boxscore.csv
      RH_LH_Data.csv
      Teams.csv
      Visitor_Main_Data.csv


----------------------------------------------------------------------
1.1 Introduction

This database contains batting statistics by ballpark for
Major League Baseball from 1871 through 2019.  It includes data from
the two current leagues (American and National), the four other "major"
leagues (American Association, Union Association, Players League, and
Federal League), and the National Association of 1871-1875.

This database also contains park configuration data by year for each
major league ballpark used.   This data, however, should be understood
to be based on many reported measurements which may be unreliable and may
conflict with other reported measurements.  Some reported data has been
corrected via the usage of maps and geometric approximations to make
the data as accurate as possible.

If you have any problems or find any errors, please let me know.  Any
feedback is appreciated

----------------------------------------------------------------------
1.2 What's New

2019 December Version data changes (12/14/2019)

The 2019 version now includes 2018, 1906 and 1907 home and away, and LH and RH splits based on
Retrosheet play-by-play and Boxscore data. (note sometimes bathand is unknown for switch-hitters
and for some other batters pre-1926).


Added numerous configuration data updates.



----------------------------------------------------------------------
1.3 Acknowledgements


This database has been built based on the data in many other sources,
and help from many people over the years, including:


The MacMillan Encyclopedia - Original source of Home and Road Runs for 20th century thru 1987
Retrosheet.org - Game Logs source of Home and Road data.  Event Files source of LH/RH H/A by Park Data
The SABR Home Run Log (David Vincent) - source of Home and Road Home Runs
Dan Hirsch - LH/RH H/A breakdowns of Retrosheet.org Event and Boxscore file data
Howard Johnson - LH/RH H/A breakdowns of Retrosheet.org Event and Boxscore File data
Brian Cartwright – LH/RH H/A breakdowns of Retrosheet.org Event File data
Mark Miller - LH/RH H/A breakdowns of Retrosheet.org Event File data
Clem Comley - LH/RH H/A breakdowns of Retrosheet.org Event File data
Victor Wilson - LH/RH H/A HR breakdowns
Charles Saeger - Original source of 19th Century Home and Road Home Runs
Paul Wendt - Various 19th century ballpark usage issues
Green Cathedrals by Phillip Lowry - primary source of Park Configuration data
Ballparks of the Deadball Era by Ron Selter - primary source of Deadball era Park Configuration data
Ballparks.com - Secondary source of Park Configuration data
Ballparksofbaseball.com - Secondary source of Park Configuration data
The Lahman Database - Source of Team/League Data
Tom M. Tango (TangoTiger) - Database Design Assistance
Sean Forman - Database Design Assistance

Eric Jones - Detailed 1914 & 1915 Federal League Home and Road Data

     The information used here was obtained free of
     charge from and is copyrighted by Retrosheet.  Interested
     parties may contact Retrosheet at "www.retrosheet.org".

For the online version of this database at www.seamheads.com, special thanks to Mike Lynch for his leadership and persistence and
Dan Hirsch for his technical expertise in setting up the user interface.

Thanks to all, and if I missed anyone, please let me know.

----------------------------------------------------------------------
1.4 Using this Database

This version of the database is available in Microsoft Access
format or in a generic, comma delimited format. Because this is a
relational database, you will not be able to use the data in a
flat-database application.


Please note that this is not a stand alone application.  It requires
a database application or some other application designed specifically
to interact with the database.


------------------------------------------------------------------------------
2.0 Data Tables

The design follows these general principles.  Each park is assigned a
unique number (ParkID).  All of the information relating to that park
is tagged with its ParkID.  The ParkIDs are linked to Teams and
Years in the main data tables.

The database is comprised of the following main tables:

Home Main Data W Park BreakOUT - Runs and Home Runs for Home and Opponent teams for all seasons by Park by Team By Season.
 	Other offensive events (doubles, triples, etc) for the seasons:
 	1906-2018
	1871, 1872, 1874 (NA)

Home Main Data - Runs and Home Runs for Home and Opponent teams for all seasons by Team by Season.
 	Other offensive events (doubles, triples, etc) for the seasons:
 	1906-2018
	1871, 1872, 1874 (NA)

Visitor Main Data - Runs and Home Runs for Visitor and Opponent teams for all seasons by Team By Season.
 	Other offensive events (doubles, triples, etc) for the seasons:
 	1906-2018
	1871, 1872, 1874 (NA)

ParkConfig - Data on descriptive park items such as foul line distances, fence heights, capacities, etc.

It is supplemented by these tables:

RH_LH_Data - Subset of Main Data Tables broken down by LH/RH for most offensive events within H/A for
 years where available (note some years before 1974 incomplete). Box Score Data was used to supplement the data for seasons prior to 1941.

RH_LH_ALL_HR - Home Runs broken down by LH/RH within H/A for all seasons by Team by Season
(may not match Home Runs in  RH_LH_Data for any years where Retrosheet play by play data is incomplete).

Parks - Park ID Master Table
Teams - Team ID Master Table
Leagues  - League ID Master Table
LeagueTeams - League/Team Valid Combinations Master Table
Retrosheet_BBDB_Team_Xref - Cross reference of team IDs between Retrosheet and Baseball Databank where different
Park Events - Firsts and Lasts, special events per park for selected historical parks (under construction)
No-Hitters - Firsts and lasts, special events per park (under construction)

Sections 2.1 through 2.9 of this document describe each of the tables in
detail and the fields that each contains.

---------------------------------------------------------------------------

2.1 Home Main Data With Parks Breakout

Year		Season
TeamID		Lahman Database Team Identifier
ParkID		Park Code based on Retrosheet Park IDs (NOT in Home_Main_Data_WO_Parks)
LgID		League (NL, AL, AA, etc.)
SEQ		Numeric Code which helps link a team's "main" home park data with the road data for that same year (NOT in Home Main Data W/O Parks)
GP_H		Games Played as Home Team
R_Off_H	Runs 	Scored as Home Team
R_Def_H	Runs 	Allowed as Home Team
HR_Off_H	Home Runs Hit as Home Team
HR_Def_H	Home Runs Allowed as Home Team
AB_Off_H	At Bats as Home Team
H_Off_H		Base Hits as Home Team
D_Off_H		Doubles as Home Team
T_Off_H		Triples as Home Team
RBI_Off_H	RBI's as Home Team
SH_Off_H	Sacrifices as Home Team
SF_Off_H	Sacrifice Flies as Home Team
HBP_OFF_H	Hit By Pitches as Home Team
BB_Off_H	Base on Balls as Home Team
IW_Off_H	Intentional Walks as Home Team
K_Off_H		Strikeouts as Home Team
SB_Off_H	Stolen Bases as Home Team
CS_Off_H	Caught Stealing as Home Team
GDP_Off_H	Grounded Into Double Plays as Home Team
AB_Def_H	At Bats for Opposition when Home Team
H_Def_H		Base Hits Allowed as Home Team
D_Def_H		Doubles Allowed as Home Team
T_Def_H		Triples Allowed as Home Team
RBI_Def_H	RBI's Allowed as Home Team
SH_Def_H	Sacrifices Allowed as Home Team
SF_Def_H	Sacrifice Flies Allowed as Home Team
HBP_Def_H	Hit By Pitches Allowed as Home Team
BB_Def_H	Base on Balls Allowed as Home Team
IW_Def_H	Intentional Walks Given as Home Team
K_Def_H		Strikeouts Made as Home Team
SB_Def_H	Stolen Bases Allowed as Home Team
CS_Def_H	Caught Stealing Made as Home Team
GDP_Def_H	Grounded Into Double Plays Made as Home Team
------------------------------------------------------------------------------

2.11 Home Main Data

Year		Season
TeamID		Lahman Database Team Identifier
LgID		League (NL, AL, AA, etc.)
GP_H		Games Played as Home Team
R_Off_H		Runs Scored as Home Team
R_Def_H		Runs Allowed as Home Team
HR_Off_H	Home Runs Hit as Home Team
HR_Def_H	Home Runs Allowed as Home Team
AB_Off_H	At Bats as Home Team
H_Off_H		Base Hits as Home Team
D_Off_H		Doubles as Home Team
T_Off_H		Triples as Home Team
RBI_Off_H	RBI's as Home Team
SH_Off_H	Sacrifices as Home Team
SF_Off_H	Sacrifice Flies as Home Team
HBP_OFF_H	Hit By Pitches as Home Team
BB_Off_H	Base on Balls as Home Team
IW_Off_H	Intentional Walks as Home Team
K_Off_H		Strikeouts as Home Team
SB_Off_H	Stolen Bases as Home Team
CS_Off_H	Caught Stealing as Home Team
GDP_Off_H	Grounded Into Double Plays as Home Team
AB_Def_H	At Bats for Opposition when Home Team
H_Def_H		Base Hits Allowed as Home Team
D_Def_H		Doubles Allowed as Home Team
T_Def_H		Triples Allowed as Home Team
RBI_Def_H	RBI's Allowed as Home Team
SH_Def_H	Sacrifices Allowed as Home Team
SF_Def_H	Sacrifice Flies Allowed as Home Team
HBP_Def_H	Hit By Pitches Allowed as Home Team
BB_Def_H	Base on Balls Allowed as Home Team
IW_Def_H	Intentional Walks Given as Home Team
K_Def_H		Strikeouts Made as Home Team
SB_Def_H	Stolen Bases Allowed as Home Team
CS_Def_H	Caught Stealing Made as Home Team
GDP_Def_H	Grounded Into Double Plays Made as Home Team
-----------------------------------------------------------------------------

2.2 Visitor Main Data

Year		Season
TeamID		Lahman Database Team Identifier
ParkID		Park Code based on Retrosheet Park IDs
LgID		League (NL, AL, AA, etc.)
SEQ		Numeric Code which helps link a team's "main" home park data with the road data for that same year
GP_A		Games Played as Visiting Team
R_Off_A		Runs Scored as Visiting Team
R_Def_A		Runs Allowed as Visiting Team
HR_Off_A	Home Runs Hit as Visiting Team
HR_Def_A	Home Runs Allowed as Visiting Team
AB_Off_A	At Bats as Visiting Team
H_Off_A		Base Hits as Visiting Team
D_Off_A		Doubles as Visiting Team
T_Off_A		Triples as Visiting Team
RBI_Off_A	RBI's as Visiting Team
SH_Off_A	Sacrifices as Visiting Team
SF_Off_A	Sacrifice Flies as Visiting Team
HBP_OFF_A	Hit By Pitches as Visiting Team
BB_Off_A	Base on Balls as Visiting Team
IW_Off_A	Intentional Walks as Visiting Team
K_Off_A		Strikeouts as Visiting Team
SB_Off_A	Stolen Bases as Visiting Team
CS_Off_A	Caught Stealing as Visiting Team
GDP_Off_A	Grounded Into Double Plays as Visiting Team
AB_Def_A	At Bats for Opposition when Visiting Team
H_Def_A		Base Hits Allowed as Visiting Team
D_Def_A		Doubles Allowed as Visiting Team
T_Def_A		Triples Allowed as Visiting Team
RBI_Def_A	RBI's Allowed as Visiting Team
SH_Def_A	Sacrifices Allowed as Visiting Team
SF_Def_A	Sacrifice Flies Allowed as Visiting Team
HBP_Def_A	Hit By Pitches Allowed as Visiting Team
BB_Def_A	Base on Balls Allowed as Visiting Team
IW_Def_A	Intentional Walks Given as Visiting Team
K_Def_A		Strikeouts Made as Visiting Team
SB_Def_A	Stolen Bases Allowed as Visiting Team
CS_Def_A	Caught Stealing Made as Visiting Team
GDP_Def_A	Grounded Into Double Plays Made as Visiting Team
------------------------------------------------------------------------------
2.3 ParkConfig table

ParkID		Park Code based on Retrosheet Park IDs
Name		Most common name used for park IN THAT SEASON
Year		Season
Capacity	Estimated Normal Maximum Capacity
Surface		Type of Surface (N=Natural, T=Turf)
Area_Fair	Square Feet of Fair Territory estimated in thousands of Square Feet
Cover		Type of Roof (O=Open, D=Dome, R=Retractable)
LF_Dim		Left Field Line Fence Distance in Feet at the Foul Pole
SLF_Dim		Straightaway Left Field Distance in Feet approx. 15 degrees in from foul line
LFA_Dim		Left Field Power Alley Distance in Feet approx. 22.5 degrees in from foul line
LC_Dim		Left Center Field Distance in Feet approx. 30 degrees in from foul line
RCC_Dim		Right Centerfield Corner Distance in Feet between Left Center and CF
CF_Dim		Centerfield (straightway) Fence Distance in feet 45 degrees in from foul lines
LCC_Dim		Left Centerfield Corner Distance in Feet between Right Center and CF
RC_Dim		Right Center Field Distance in Feet approx. 30 degrees in from foul line
RFA_Dim		Right Field Power Alley Distance in Feet approx. 22.5 degrees in from foul line
SRF_Dim		Straightaway Right Field Distance in Feet approx. 15 degrees in from foul line
RF_Dim		Right Field Line Fence Distance in Feet at the Foul Pole
Backstop	Distance from Home Plate to Stands
Foul		Square Feet of Foul Territory estimated in thousands of Square Feet OR (L=Large, N=Normal, S=Small)
LF_W		Left Field Wall Height in Feet
LC_W		Left Center Field Wall Height in Feet
CF_W		Center Field Wall Height in Feet
RC_W		Right Center Field Wall Height in Feet
RF_W		Right Field Wall Height in Feet
Comments	Comments about remodeling, fires, special features, etc.
------------------------------------------------------------------------------
2.4 RH_LH_Data

Year		Season
TeamID		Lahman Database Team Identifier
Park_ID		Park Code based on Retrosheet Park IDs
Off_Def		Indicator of team being on offense or defense
H_A		Indicator of team being Home or Visitors
Bathand		R=Right, L=Left, B=Switch-hitter, bathand unknown, U=bathand unknown
AB		At Bats
1B		Singles
2B		Doubles
3B		Triples
HR		Home Runs
RBI		Runs Batted In
BB		Base on Balls
IBB		Intentional Walks
K		Strikeouts
HBP		Hit By Pitches
SF		Sacrifice Flies
SH		Sacrifices
GDP		Ground into Double Plays
PA		Plate Appearances
R		Runs
G		Games
SB		Stolen Bases (based on batter handedness)
CS		Caught Stealing (based on batter handedness)
ROE		Reached on Error

------------------------------------------------------------------------------
2.41 RH_LH_ALL_HR

Year		Season
TeamID		Lahman Database Team Identifier
Off_Def		Indicator of team being on offense or defense
H_A		Indicator of team being Home or Visitors
Bathand		R=Right and L=Left
HR		Home Runs
------------------------------------------------------------------------------

2.5  Parks

PARKID		Park Code based on Retrosheet Park IDs
NAME		Most Common Name for Park DURING IT's LIFETIME
CITY		City Location of Park
STATE		STATE or Province Location of Park
START		Date of first major league game at Park
END		Date of last major league game at Park
LEAGUE		League that Park was most often used in
NOTES		Various Notes about Park
AKA		Other Names Park may have been known as
EXACT		Latitude and Longitude are exact known coordinates
Latitude	Latitude location of park in degrees decimal
Longitude	Longitude location of park in degrees longitude
Altitude	Altitude of park in thousands of feet

NOTE: The EXACT locations of all parks ever used in MLB games are now the Parks table, except for four parks:
Monumental Park - Baltimore, MD
Ludlow Park - Ludlow, KY
West New York Field - West New York, NJ
Fireworks Park, GLoucester City, NJ
Any location information on these parks would be greatly appreciated.
------------------------------------------------------------------------------

2.6  Teams

TeamID		Lahman Database Team Identifier
League		League
Start Year	First Year of Team
ENd Year	Last Year of Team
City 		City Name only of Team
Nickname	Nickname only of team
------------------------------------------------------------------------------

2.7  Leagues

LgID		League (NL, AL, AA, etc.)
LgName		Name of League
Start_Year	First Major League Season of League
End_Year	Last Major League Season of League
Comments	Comments about league history
------------------------------------------------------------------------------

2.8 LeagueTeams table

LgID		League (NL, AL, AA, etc.)
TeamID		Lahman Database Team Identifier
Year		Season

------------------------------------------------------------------------------

2.9 Retrosheet_BBDB_Team_Xref

Year		Season
RetroID		Retrosheet Team ID
BBDBID		Baseball-Databank Team ID

------------------------------------------------------------------------------

2.10 Park Events

ParkID
First_Game
Score_Winner
First_Attendance
First_Starting_Pitchers
First_Batter
First_Game_Result
First_Hit
First_Result_Inning
First_Run
First_RBI
FIRST_HR
Date_Game_No
First K_Batter
First_Win
First_Loss
First_Grand_Slam
First_Inside_the_Park_HR
First_No_Hitter
Last_Game
Score_Winner
Last-Attendance
Last_Starting_Pitchers
Last_Batter
Last_Game_Result
Last_Hit
Last_Result_Inning
Last_Run
Last_RBI
Last_HR
Last_K_Batter
Last_Win
Last_Loss
Last_Grand_Slam
Last_Inside_the_Park_HR
Last_Most_Recent_No_Hitter
Trivia

2.11 No-Hitters

Under construction


3.0 Data Issues

RH_LH_Data Table MAY not tie exactly to Home_Main_Data Tables and Visitor_Main_Data in some seasons for all data due to incomplete play by play data in Retrosheet Event Files.

RH_LH_ALL_HR_Only Table home runs MAY not tie exactly to RH_LH_Data table home runs due to incomplete play by play data in Retrosheet Event Files.

LH/RH HR data for 1906 thru 1925 seasons have records marked as "U" for "Unknown" as the handedness of some batters is not known.

LH/RH data for 1906 thru 1936 seasons have records marked as "B" for "switch-hitter, handedness unknown".

If anyone knows where to find the exact numbers for these items, please let me know.

------------------------------------------------------------------------------



4.0 Online Version of Database

The online database version includes the following additional data:

Map of park location
Percentage calculations by season of Turf and Roof types.
Averages by season of field dimensions, wall heights, fair territory, and backstop distances.
Totals of games by team and by city.


1 Year Park Factors:

The 1 year park factors are based on UNREGRESSED observed data. There is an 'other parks corrector' calculation
made due to the other road parks' total difference from the league average being offset by the park rating
of the park that is being rated. In other words, if you're calculating factors for Coors Field, then Coors itself
is not part of the 'road' set of parks it is being compared against, so that road set of parks is actually
slightly pitcher friendly (assuming Coors is hitter friendly that year) instead of being 100% neutral, so the
'other parks corrector' makes an adjustment for that fact.  Except for the other parks corrector calculation,
the 1-year park factors are simple rates of components per At Bat at home divided by rate of components per AB on the road.


3 Year Park Factors:

The 3 year park factors are REGRESSED, and meant as an estimate of the park's 'true' impact on batting components.
We calculate this factor slightly different from Total Baseball/Baseball-Reference.com (see http://www.baseball-reference.com/about/parkadjust.shtml
for an explanation of that calculation).  For a given park/season, we use the 1-year factor for that park/season plus
the 1-year factor for the PREVIOUS season plus the 1-year factor for the FOLLOWING season.  We weight each based on the
number of home games for each season, but otherwise we weight them equally (we don't add weight to the current season).
Then we regress that number 25% towards that parks' long-term historical factor for that component.  The long-term
historical factor is the sum of all 1-year factors for the history of the park weighted by home games each season.
We believe this gives us a closest possible approximation of the 'true' park factor without adding more complicating
variables such as modifications to park characteristics, new parks in the league, weighting the long term
factor by number of seasons, etc.

For the very first and very last year of a park, since there is no PREVIOUS or FOLLOWING season in those cases, we
chose to use an additional following season (season +2) for the first park year and an additional previous season
(season - 2) in those calculations.  This results in the first TWO seasons and the last TWO seasons of the 3 year
calculations being the same!  What we're saying is that lacking an adjacent season our best guess of the first
and last seasons of a parks existence is the same guess we use for the 2nd season and the next to last season.
If anyone wants to prove that there is a more accurate way to handle these 'end' seasons, we are very open to ideas.
Mobilize your Site
View Site in Mobile | Classic
Share by: