Most people try to avoid thinking about statistics, but they’re part of everything we do in investing. And when people do get around to thinking about statistics, they often end up thinking about them the wrong way.

Most people assume the most important numbers of any data set are the average, median, and standard deviation, and they can safely ignore everything else. But it’s not that simple.

Statistics are about trying to figure out the unknown based on what we have seen in the past. It’s all about working with – and dealing with – uncertainty. We can’t know what the stock market’s “true” expected return is going forward, but we can make an estimate.

Whenever we’re working with uncertainty, especially with something as noisy as investing, we always need to be careful about how confident we are in our estimates, even when the statistics seem rock solid. The financial markets are noisy places, and we need to appreciate that.

### Do Stocks *Really* Have Higher Returns than Bonds?

One of my favorite examples of this is simply trying to figure out if stocks have a higher average return than bonds. You wouldn’t think that this would be tough to do – there’s a pretty big difference between the computed average returns for stocks and bonds. And we would have to rethink more than a few things if stocks didn’t have higher returns than bonds – especially since we talk about the ratio of stocks to bonds in your investment portfolio as the most important investment decision you’ll make.

But let’s play this out and put some numbers around it. To make our lives simple, let’s use the S&P 500 Index to represent stocks, and Five Year US Treasury Bonds to represent bonds. When we look at their average annual returns from 1926 through 2023, stocks averaged 12.2% per year, and bonds averaged 5.0% per year – a difference of 7.2 percentage points per year. That’s a pretty big number.

But that’s only one side of the story. We also need to consider how much the returns bounce around. If there’s a lot of noise in stock returns (and there is), who’s to say that stocks aren’t just getting really lucky for a long period of time? So in addition to the average returns, we also need to look at the standard deviations of the returns. The standard deviation of the stocks, from 1926 through 2023, was 19.6%, compared to only 5.7% for the bonds. That’s an even bigger difference.

But even so, it should be pretty clear what’s going on here, right?

Well, let’s say you want to be really confident when you say that stocks have a higher average return than bonds – you want there to be less than a 1% chance that the difference is just down to luck. How long would it take to prove this?

It would take 64 years to prove that stocks have better returns than bonds. Starting in 1926, it would have taken until 1989 to make this determination.

This seems a little counterintuitive at first blush. But it gets at something that is incredibly important. Statistics, like the financial markets, are messy. Because stock returns are bouncing around so much (and even bond returns are bouncing around a bit) it’s really hard to be able to make definitive statements about investment returns purely based on what the data is “telling us.”
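For a rough sense of why the answer is measured in decades, here is a back-of-the-envelope sketch in Python. It is not the calculation behind the figures quoted above, which presumably come from the actual year-by-year returns. It just plugs the summary statistics from this section into a textbook normal approximation, and it assumes that stock and bond returns are independent of each other and that each year is an independent draw. The function name `years_needed` is made up for illustration.

```python
import math
from statistics import NormalDist


def years_needed(mean_diff, sd_a, sd_b, confidence, two_sided=True):
    """Approximate years of data needed before a gap in average
    returns is statistically distinguishable from zero.

    Assumptions (simplifications, not the article's method):
    returns are normal, independent year to year, and the two
    assets are independent of each other.
    """
    alpha = 1.0 - confidence
    tail = alpha / 2 if two_sided else alpha
    z = NormalDist().inv_cdf(1.0 - tail)       # critical value
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)  # sd of the yearly gap
    # Solve mean_diff / (sd_diff / sqrt(n)) >= z for n
    return math.ceil((z * sd_diff / mean_diff) ** 2)


# Summary statistics quoted above (percent per year, 1926-2023)
print(years_needed(7.2, 19.6, 5.7, confidence=0.99))
print(years_needed(7.2, 19.6, 5.7, confidence=0.95))
```

Even under these simplified assumptions, the answer comes out in decades rather than years. The exact figure depends on the test you choose and on the order in which the real returns actually arrived.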

### How Confident Are You?

I also want to home in on one part of that example that is really important – the level of confidence you’re looking for in your analysis. As we’ve said (a lot), the financial markets are noisy. So we always need to consider that any relationship we find in the data might be there just by chance. We need to take that possibility into account, and think about how likely it is that there’s something real happening.

In our earlier example, I wanted to be sure that there was a better than 99% chance that stocks had a higher average return than bonds. Or, put differently, that there was a less than 1% chance that stocks *didn’t* have higher returns than bonds.

Now, admittedly, a 99% confidence threshold is actually a little bit high for finance – typically people want to be 95% sure that a relationship isn’t just there by chance (though even at this level it would take 30 years to be able to say that stocks beat bonds). I just wanted a big, eye-catching number.

But think about what a 95% confidence level means. It means that you are OK with a 5% chance that a relationship is there by chance. In other words, if I run 100 tests on completely random data, I should expect about five of them – about one in twenty – to show up with a “real” relationship that is statistically significant.
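The one-in-twenty idea is easy to check with a small simulation. The sketch below is illustrative only: it generates pure noise with a true average of zero, runs a simple two-sided z-test on each batch, and counts how often the test declares the average "significantly" different from zero. The function name and the sample size of 50 observations per test are arbitrary choices.

```python
import random
from statistics import NormalDist


def false_positive_rate(n_tests, n_obs=50, confidence=0.95, seed=0):
    """Share of tests on pure noise that look 'significant'."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(n_tests):
        # Noise with true mean 0 and standard deviation 1
        sample = [rng.gauss(0, 1) for _ in range(n_obs)]
        # z-statistic for the sample mean (known sd of 1)
        z = (sum(sample) / n_obs) * n_obs ** 0.5
        if abs(z) > z_crit:
            hits += 1
    return hits / n_tests


print(false_positive_rate(20_000))
```

With enough tests the printed share should settle close to 0.05, even though there is nothing real to find in any of the samples.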

### Bad Data is Fun and Easy

It’s not hard to find relationships in big sets of data. If you throw enough stuff at the wall a whole bunch of it will stick. And some of it is pretty amusing – Tyler Vigen’s site Spurious Correlations is a really good example of this – so long as you don’t take it all that seriously. For instance, he “found” that the popularity of the name Thomas in the US is incredibly highly correlated with motor vehicle thefts in Maine – and there was a less than 1% chance that the relationship was random. As someone who lives in Maine, this is pretty concerning – and he even helpfully has an AI-written paper explaining why the relationship exists.

Now, we know that this isn’t real (even though folks named Thomas do seem a little shifty). Even though the numbers are gesticulating wildly that there’s something here, we can step back and apply some critical reasoning.

Does it really make sense that the popularity of the name Thomas has anything to do with the number of cars stolen in a single state? No, of course not. There’s no even remotely plausible relationship between these two things. It’s simply the result of trawling a bunch of data sets and looking for things that line up by chance.
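To see how trawling produces impressive-looking results, here is a small, purely illustrative Python sketch. It creates one random "target" series, compares it against a pile of unrelated random series, and reports the strongest correlation it finds. Everything in it is invented noise, and the function names are made up for this example.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def best_spurious_correlation(n_candidates, n_years, seed=0):
    """Strongest |correlation| between one random series and
    n_candidates other, unrelated random series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_years)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_years)]
        best = max(best, abs(pearson(target, candidate)))
    return best


print(best_spurious_correlation(n_candidates=1, n_years=15))
print(best_spurious_correlation(n_candidates=2000, n_years=15))
```

The more candidate series you search through, the stronger the best "relationship" looks, even though every series here is unrelated noise by construction.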

### Financial Storytelling is a Numbers Game

And we see this all of the time with our investments. How many relationships do you think someone looking to create a new mutual fund, or write an article for the financial media, is going to test before finding one that looks good?

It’s trivially easy to find all sorts of crazy relationships in the data. The trick is finding meaningful relationships. That’s not so easy.

This matters from an investment standpoint because it will affect how you divvy up your risk budget and what your asset allocation looks like. But the bigger and more important point is that you constantly need to be thinking in terms of uncertainty.

We talk a lot about thinking through risk, but it’s important to recognize that everything – including our descriptive statistics – is uncertain.

### So What Does This Mean For You?

It’s one thing to play with numbers and point out interesting relationships. It’s another to actually do something useful with them.

But there’s a big takeaway from this uncertainty: we should be incredibly wary of overconfidence.

Overconfidence is one of the most destructive behavioral biases investors have. We all think we are better than we are – statistically speaking, almost everyone reading this article thinks they are a better than average driver.

In fact, there’s really no such thing as under-confidence. Even people who self-identify as pessimistic are still overconfident, just by slightly less than the average person.

Even when we are primed to think about the overconfidence of investors – or ourselves, specifically – pretty much everyone still acts overconfident. I am overconfident. You are overconfident. Your boss is really overconfident (and that goes double if you’re your own boss).

My mother-in-law, who grew up in Taiwan, used to call this America’s superpower. We just assume we know how to do everything – or at least we’ll figure it out as we go along. But overconfidence is universal – it may vary a little in degree across cultures, but it doesn’t go away.

So we need to be hypervigilant about minimizing the effects of overconfidence. Simply relying on a passive investment philosophy doesn’t cut it. It’s certainly better than trying to beat the markets, but it doesn’t eliminate the dangers of overconfidence, it just changes what we are overconfident about.

The biggest area where passive investors are overconfident (we’ll ignore just how overconfident this statement probably is) is probably our assessment of our own risk tolerance – especially for men. We think we understand the risk and return relationships of the market (and we’ve already seen that’s not the case), so we think we can tough out those periods when the markets turn against us.

This is better than trying to outguess where the markets are going, but it’s still not good. Being overconfident about the level of investment risk that you can tolerate just straps you right back into the fear and greed roller coaster that we’re all trying to get off of.

Investing is about managing risk. And a huge part of doing this well is being humble. Humility allows us to recognize that however smart we are (or think we are), the markets are smarter.

Statistics allow us to understand certain relationships, and we become pretty comfortable making certain statements about why things happen. But we still regularly have to throw up our hands and say the markets are just weird sometimes.

It’s really easy to overextend and think we understand everything that is happening – especially when we have something as seemingly reliable as statistics on our side – but behavioral biases like overconfidence can prevent you from reaching the retirement you’ve always wanted.
