Hitting Correlations With Runs Scored
To make successful baseball picks, you want to focus on both offensive (run production) and defensive (pitching/fielding) factors. All aspects of the game affect the outcome, but we find that hitting statistics (at least in the modern era) correlate slightly more strongly with winning than the other variables.
The question we are left with is which hitting statistics are most important. That is, which stats are best at predicting a winning team? We pulled the numbers from the 2010 season and calculated how strongly several hitting stats correlate with runs scored. Runs scored is our best predictor of who will win a baseball game, which is why it was the clear choice for these correlations.
Correlation is a statistical tool that measures how closely one set of numbers relates to another set. For our purposes we are going to look at different categories and see how well they track with runs scored. Correlation runs from -1 to 1. The closer you get to either extreme, the more closely the two sets of numbers are tied together. The closer the number is to 0, the less one number tells you about the other. If the number is negative, it means that when one number goes up, the other goes down. If the number is positive, they both rise or fall together.
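To make the idea concrete, here is a minimal sketch of how a correlation like the ones below could be computed. It assumes the standard Pearson correlation coefficient, which the article does not name explicitly, and the function name and sample numbers are made up for illustration. They are not the 2010 data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical team totals, invented for illustration (NOT real 2010 numbers).
on_base_pct = [0.310, 0.320, 0.330, 0.340, 0.350]
runs_scored = [620, 660, 700, 740, 780]

# These made-up runs rise in lockstep with OBP, so r sits at the positive extreme.
r = pearson(on_base_pct, runs_scored)
print(round(r, 2))
```

With real team data the points never line up this neatly, which is why the figures below fall between the extremes.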
Batting Average (0.74)
On Base Percentage (0.89)
Hit by Pitch (0.41)
Intentional Walks (0.18)
Strike Outs (-0.03)
These stats all deal with runners getting on base. Clearly, reaching base is a great predictor of scoring runs. A hit translates the best into run production, but walks are not far behind. Intentional walks have a lower correlation because typically the best hitters are given a free pass in order to face a weaker batter. Strike outs predictably show a negative correlation, although at -0.03 it is a very weak one, because the batter fails to put the ball in play to give himself a chance to get on base or advance any runners already on base.
Home Runs (0.72)
Slugging Percentage (0.89)
These would be considered “power stats”. They all deal primarily with extra base hits, and we can see there is a strong correlation with run production for almost all of them. Triples are much rarer than other extra base hits, meaning there is probably more luck involved than with, say, doubles, which results in a lower correlation. Slugging percentage, calculated as (Singles + (2 x Doubles) + (3 x Triples) + (4 x Home Runs)) / At Bats, is the best indicator in this group because it weights each hit according to how many bases are covered.
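As a quick worked example of the slugging formula, the sketch below computes it for a made-up stat line. The function name and the input numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at bats, weighting each hit by bases covered."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats

# Hypothetical stat line (not a real player or team):
# 100 + 2*30 + 3*5 + 4*20 = 255 total bases over 500 at bats.
slg = slugging_percentage(singles=100, doubles=30, triples=5, home_runs=20, at_bats=500)
print(f"{slg:.3f}")  # prints 0.510
```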
Stolen Bases (-0.13)
Caught Stealing (-0.17)
Any attempted stolen base risks an out, which takes a runner off base and gives the team one fewer attempt at getting another runner on base. Because having a player on base correlates so strongly with run production, it makes sense that risking that out would show a negative correlation.
On-base Plus Slugging (0.95)
OPS (OBP + SLG) has a stronger correlation with run production than either of its underlying stats. Runners get on base (OBP), then are brought home by a big hit (SLG), so teams that do well in this stat category are going to be the best teams at producing runs (read: winning games).
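One way to put this to use is to rank teams by OPS before making a pick. The sketch below is only an illustration. The team names and their OBP and SLG values are invented, and the helper name is an assumption rather than anything from our data set.

```python
def ops(obp, slg):
    """On-base plus slugging: the simple sum of the two stats."""
    return obp + slg

# Hypothetical (OBP, SLG) pairs, invented for illustration only.
teams = {
    "Team A": (0.340, 0.450),
    "Team B": (0.320, 0.400),
    "Team C": (0.330, 0.470),
}

# Sort team names from highest OPS to lowest.
ranked = sorted(teams, key=lambda name: ops(*teams[name]), reverse=True)

for name in ranked:
    print(name, f"{ops(*teams[name]):.3f}")
```

Here Team C comes out on top even though Team A reaches base more often, because its edge in slugging outweighs the gap in OBP.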