While the baseball season is still young, Marlins second baseman Luis Arraez is starting to chase history by flirting with a .400 batting average in June. Arraez, who was hitting .403 after the Marlins’ Wednesday night win, is just the second batter in 15 years to be hitting better than .400 after 62 games, and just the third this century to be hitting at least .400 this deep into a season, after Nomar Garciaparra (91 games in 2000) and Chipper Jones (73 games in 2008).
No batter has broken the hallowed .400 mark over a full season since Ted Williams hit .406 in 1941. A pursuit of Williams would make for a great summer story, but it’s a serious long shot that Arraez will still be batting .400 as the season winds down. In fact, you would have about the same chance of flipping a coin and having it land on heads eight times in a row as Arraez does of batting .400 this season.
The 26-year-old Arraez, of course, is a terrific hitter. He won the American League batting title for the Minnesota Twins last year with a .316 average, and has a lifetime batting average of .326 over five seasons. He was projected to lead the majors in batting average in 2023, per analyst Dan Szymborski’s preseason estimates, yet that forecast of a .311 average was far short of the .400 mark. In fact, Arraez’s true talent level (as derived from Bayes’ Theorem, a mathematical method to update beliefs based on new evidence using prior and current performance data) equates to a .351 batting average. Based on that figure, and forecasting he has about 340 at-bats remaining this season, we can estimate the probability of him hitting .400 or better this season at about 0.46 percent.
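As a rough sanity check on the earlier coin-flip comparison, the arithmetic can be sketched in a few lines of Python. The function name is illustrative only; the 0.46 percent figure is the estimate quoted above and is taken as given, not recomputed.

```python
def streak_probability(n_flips: int, p_heads: float = 0.5) -> float:
    """Probability that a coin lands heads on n_flips consecutive tosses."""
    return p_heads ** n_flips


eight_heads = streak_probability(8)  # 0.5 ** 8 = 1/256
arraez_estimate = 0.0046             # 0.46 percent, as quoted in the article

print(f"Eight heads in a row: {eight_heads:.2%}")
print(f"Arraez hitting .400:  {arraez_estimate:.2%}")
```

Eight straight heads works out to 1 in 256, or roughly 0.39 percent, which is in the same ballpark as the 0.46 percent estimate.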
Why such low odds for an elite hitter? As good as Arraez is, he has some flaws at the plate, and in some ways he’s been fortunate this season despite them. For example, he’s swinging at 34 percent of pitches outside the strike zone, the highest rate of his career. He’s also hitting a larger share of groundballs than he did last season, 46 percent compared with 41 percent. But his batting average on balls in play (BABIP) this season is a whopping .417, the best of his career, far above the major league average of .297 and a key reason for his fabulous numbers. His batting average on groundballs alone is .362, compared with .242 for the rest of Major League Baseball.
Arraez’s extremely low strikeout rate helps his quest. Batting average is simply hits divided by at-bats, but we can also calculate it by multiplying (1 - the player’s strikeout rate) by his ratio of hits to balls put in play. That shows how strikeout rate and batting average on balls in play each influence his performance. At the most obvious level, a player with a zero strikeout rate would need to get a hit 40 percent of the time he puts a ball in play to be a .400 hitter. If he struck out at an average rate of 23 percent, he would need to get a hit 52 percent of the time he puts a ball in play to reach .400. Arraez has a 4.6 percent strikeout rate this season, the lowest in baseball. Still, he would likely need to sustain a BABIP of about .419 to have a shot at .400. That’s mighty difficult to do. Just 10 hitters in baseball history have sustained a batting average on balls in play that high over at least 500 at-bats, and none has done it since 1924.
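The relationship described above can be turned around to ask what hit rate on balls in play a given strikeout rate demands. The following is a minimal sketch, assuming the simplified formula from the paragraph and treating the strikeout rate as a share of at-bats; `required_babip` is a hypothetical helper name, not part of any published method.

```python
def required_babip(target_avg: float, strikeout_rate: float) -> float:
    """Hit rate on balls in play needed to reach target_avg.

    Rearranges the simplification used in the text:
        batting average = (1 - strikeout rate) * (hits / balls in play)
    """
    if not 0 <= strikeout_rate < 1:
        raise ValueError("strikeout_rate must be in [0, 1)")
    return target_avg / (1 - strikeout_rate)


scenarios = [
    ("No strikeouts", 0.0),
    ("League-average strikeout rate (23%)", 0.23),
    ("Arraez's strikeout rate (4.6%)", 0.046),
]

for label, k_rate in scenarios:
    needed = required_babip(0.400, k_rate)
    print(f"{label}: needs {needed:.3f} on balls in play")
```

Under these assumptions the three cases come out to .400, about .519 and about .419, matching the figures in the paragraph.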
The fact is, the current offensive environment in MLB is not conducive to a .400 hitter, and hasn’t been for some time. When Williams hit .406 in 1941, MLB hitters collectively batted .262. This year, the league average is .247 through Wednesday’s games. Incidentally, every other season that featured a .400 hitter had a collective batting average of .265 or higher, and we haven’t seen a collective batting average at that level since 2007, when Matt Holliday won the NL batting title with a .340 average and Magglio Ordóñez won it in the AL with an average of .363.
Dallas Adams, writing in the Society for American Baseball Research’s Baseball Research Journal, outlined a simple approach to determine how likely it was that a .400 hitter would emerge based on MLB’s collective batting average. His method relied on the relative batting average concept, a hitter’s average divided by the league average, laid out by David Shoebotham in the 1976 Baseball Research Journal. For example, a batting average of .400 would be 62 percent higher than the league average this season. The largest margin by which anyone has exceeded the league average en route to a batting title belongs to Ty Cobb, who in 1910 surpassed the AL average of .243 by 58 percent.
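The relative batting average comparison is a single division, sketched below using only figures quoted in this article. The function name is illustrative, and Cobb’s average is the value implied by the 58 percent margin, not a looked-up statistic.

```python
def relative_average(player_avg: float, league_avg: float) -> float:
    """A hitter's batting average expressed as a multiple of the league average."""
    return player_avg / league_avg


league_2023 = 0.247  # MLB average through Wednesday's games, per the article
pct_above = (relative_average(0.400, league_2023) - 1) * 100
print(f"A .400 hitter in 2023 would be {pct_above:.0f} percent above the league average")

# Average implied by beating the 1910 AL average of .243 by 58 percent.
cobb_implied = 0.243 * 1.58
print(f"Implied 1910 average for Cobb: {cobb_implied:.3f}")
```

The first line reproduces the 62 percent figure. The second shows that a 58 percent margin over .243 corresponds to an average of roughly .384, so a .400 season in today’s environment would require a wider margin over the league than the record one.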
Not including the 2020 season, you have to go back to Barry Bonds in 2002 to find a hitter this century who surpassed the league batting average by more than 40 percent. Chipper Jones exceeded the league average by 40 percent in 2008 during his quest for .400. He ultimately ended the season far short of .400, at .364. Since 2006, when baseball adopted a leaguewide drug testing policy, the eventual batting champ has outpaced the league average by between 20 and 40 percent. That would give us a reasonable expectation that this year’s batting champion(s) could hit .349 on the high side, very close to Arraez’s year-end estimate of .360.