Tuesday, July 24, 2012

What is Apple Worth? Part 1: Regarding Large Numbers

A Big Lottery

“Peter tosses a coin and continues to do so until it should land "heads" when it comes to the ground. He agrees to give Paul one ducat if he gets "heads" on the very first throw, two ducats if he gets it on the second, four if on the third, eight if on the fourth, and so on, so that with each additional throw the number of ducats he must pay is doubled. Suppose we seek to determine the value of Paul's expectation.”

This is how Daniel Bernoulli rephrased a problem that had been nagging a tight circle of mathematicians almost 300 years ago. The issue is that, if you go by the mathematical expected value of the winnings, you end up with an infinite value for the lottery: you have a 1/2 probability of winning one ducat on the first throw (worth 1/2 a ducat in expected value), a 1/4 probability of winning two ducats in 2 throws (which adds another 1/2), a 1/8 probability of winning 4 ducats in 3 throws (adding another 1/2), and so on, adding 1/2 a ducat in value an infinite number of times.
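The running total is easy to check with a few lines of Python. This is only a sketch: the function name and the idea of capping the game at a maximum number of throws are my own illustration, not part of Bernoulli's statement of the problem.

```python
from fractions import Fraction

def truncated_expected_value(max_throws: int) -> Fraction:
    """Expected winnings (in ducats) if the game were capped at max_throws tosses."""
    total = Fraction(0)
    for k in range(1, max_throws + 1):
        probability = Fraction(1, 2 ** k)  # chance the first head shows up on throw k
        payout = 2 ** (k - 1)              # 1, 2, 4, 8, ... ducats
        total += probability * payout      # every term works out to exactly 1/2
    return total

for cap in (1, 2, 3, 10, 100):
    print(cap, truncated_expected_value(cap))
```

Each extra allowed throw adds exactly half a ducat, so the capped value is simply half the cap and grows without bound as the cap is lifted.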

The expected value suggests that a player should be willing to pay any amount asked for a chance to play this lottery. Yet, in real life no one in their right mind would pay more than a few ducats to play this, perhaps somewhere around 5 ducats. Why is it that the mathematical expected value doesn’t reflect the real value placed by real players on this lottery?

The quote above is from the seminal paper where Bernoulli shared his nearly “magical” solution. In it, Daniel all but pulled a rabbit out of a hat (a utility function) and ended up formalizing such central concepts in economics as diminishing marginal utility and risk aversion. As another example, these concepts are behind what's known as the paradox of value: why something as useful and vital as water can be so cheap while something as useless as diamonds can be so expensive.

I’ll return to his solution in a bit, or more likely in part 2. Right, this appears to be another 3-part series of posts. I have several layers of seemingly disjointed ideas that I’d like to share, and I hope in the end it’ll all come together in some sort of rich and congruent whole. And please excuse my wordy exposition, which I’ve tried my best to avoid (but most likely failed).

Apple’s Large Numbers

During the quarter ending in March 2003, Apple traded for an average share price* of $7.28. During the last quarter ended on June 30, the average price was $584.10. This represents an 80-bagger in 9 years (those “average prices” are smoothing out the very extreme low and high prices experienced for a few days or maybe minutes). That is equivalent to achieving a double 6 times, and we’re about a third along the way to the 7th doubling (log2(80)=6.32). The first doubling to almost $15 occurred over 5 quarters by Jun 2004. The next doubling to over $29 happened within the following 2-3 quarters. Then it doubled again within the next 4 quarters (over $58 by end of 2005). The fourth doubling took 6-7 quarters to complete by the summer of 2007 (over $116). And then the fifth, spanning the financial crisis, took the longest: the average quarterly price didn’t top $233 for 3 years until mid-2010. Finally, the sixth doubling to $466 got completed over the last 2 years by last March.

So, on average, it took 38/6.32 = 6 quarters for each doubling. Let me repeat that: that’s 6 consecutive doubles (and a bit more), and each one achieved every year and a half on average. Yes, the last two doublings took 5 years that included a year or two “unfairly” lost during the crisis (earnings kept expanding). So it could be argued that those last two should still have taken 3 years, or the same 1.5-year per doubling.
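For anyone who wants to redo the doubling arithmetic, here is a minimal sketch. The two prices and the 38-quarter count are the figures quoted above; the variable names are just mine.

```python
import math

start_price = 7.28   # average price, quarter ending March 2003 (from the post)
end_price = 584.10   # average price, quarter ended June 30, 2012 (from the post)
quarters = 38        # quarter count used in the post

multiple = end_price / start_price            # the "80-bagger"
doublings = math.log2(multiple)               # how many times the price doubled
quarters_per_doubling = quarters / doublings  # average pace of each doubling

print(f"multiple: {multiple:.1f}x")
print(f"doublings: {doublings:.2f}")
print(f"quarters per doubling: {quarters_per_doubling:.2f}")
```

Using the exact price ratio instead of a round 80 nudges the doubling count up by a hair, but the pace still comes out at about 6 quarters, or a year and a half, per doubling.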

Of course, the 6 or 7 stock price doublings are more than justified by the much faster, more frequent doublings in earnings over the same number of years (earnings mushroomed by a multiple of about 1350x, the equivalent of doubling well beyond 10 times). Revenues, by contrast, have doubled just 4 times and are currently more than halfway along the 5th doubling (a total multiple of 25x). That earnings have doubled more than twice as many times as revenue has been driven by solid increases in net margin percent, from zero to 30% of revenues in the 9 years, or a mind-blowing 330 basis point increase in net margin each and every year for almost a decade.
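The same log2 trick applies to the earnings and revenue multiples. A rough sketch, using only the approximate multiples and margin figures quoted above (the variable names are mine):

```python
import math

# Approximate multiples quoted in the post, over the same 9 years.
earnings_multiple = 1350
revenue_multiple = 25

earnings_doublings = math.log2(earnings_multiple)  # a bit over 10
revenue_doublings = math.log2(revenue_multiple)    # between 4 and 5
doublings_ratio = earnings_doublings / revenue_doublings

# Net margin went from roughly 0% to 30% of revenue in 9 years.
# One percentage point equals 100 basis points.
margin_gain_bps_per_year = (30 - 0) * 100 / 9

print(f"earnings doublings: {earnings_doublings:.2f}")
print(f"revenue doublings: {revenue_doublings:.2f}")
print(f"ratio of doublings: {doublings_ratio:.2f}")
print(f"margin gain: {margin_gain_bps_per_year:.0f} bps per year")
```

The margin gain lands a touch above the rounded 330 bps figure, which is close enough for this kind of back-of-the-envelope check.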

 * The average share price for each quarter was computed as the volume weighted average of the typical price, this in turn being the average of the high, low, and closing prices for each daily session. Quarter end dates were based on Apple’s fiscal calendar.

Big Questions

What are the chances, for any one company, of enjoying such an impressive and consistent run? If we were to follow a frequentist probability approach (or Bayesian), it appears the best estimate of the probability that Apple achieves a double every couple of years would be quite high, I’d say comfortably higher than a coin toss. A double by 2014 implies a price of $932, and another by 2016 gets it over $1800. Surely at some point these doublings must stop, or slow down?

At what point does that 330 bps/yr incremental net margin as a percent of revenue begin to bump into a ceiling? By definition net margins can’t exceed 100% of revenue. I believe it’s highly unlikely that margins would expand much beyond 35%. So the sensible conclusion is that margin expansion must slow down within a couple of years (it has instead accelerated). When net margin stops expanding, earnings will, by definition, expand at the same pace as revenue. After another doubling in revenue from current levels it becomes hard to imagine how to achieve an additional doubling. So to expect the same doubling pace of the stock every 2 years seems questionable. Perhaps it makes sense to double the time to double, to 4-5 years? Then again, this same reasoning would have made perfect sense 2 years and 5 years ago, and here we are 2 further doublings later, with still accelerated expansion in all the business metrics.
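The "by definition" part follows from earnings being revenue times net margin. Here is a small sketch of that identity; the function and the scenarios are hypothetical illustrations, not forecasts.

```python
def earnings_growth(revenue_growth: float, margin_before: float, margin_after: float) -> float:
    """Earnings growth implied by earnings = revenue * net margin.

    All arguments are fractions: 1.0 means +100% (a doubling), 0.30 means a 30% margin.
    """
    return (1 + revenue_growth) * (margin_after / margin_before) - 1

# Hypothetical scenario 1: revenue doubles while margin stays flat at 30%.
flat_margin = earnings_growth(1.0, 0.30, 0.30)

# Hypothetical scenario 2: revenue doubles and margin expands from 30% to 35%.
expanding_margin = earnings_growth(1.0, 0.30, 0.35)

print(f"flat margin: earnings grow {flat_margin:.0%}")
print(f"30% -> 35% margin: earnings grow {expanding_margin:.0%}")
```

With a flat margin, earnings can only grow as fast as revenue does. Even the stretch to a 35% margin adds relatively little on top of a revenue doubling compared with what margin expansion contributed over the past 9 years.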

Looking forward and discounting the future, what is all of this worth, today? Even if the probability seems higher than a coin toss that the stock would double again, what if it doesn’t? Would it then be worth waiting another couple of years in the hope it does double by then? Do we get to toss the coin again? And what happens when the music stops? Will everyone try to just get out at the same time? Will a significant dividend help prevent that? And, wouldn’t that be an even stronger signal to move on to better growth pastures?

What’s with all the questions? Seems this post has become all about these uncertain philosophical musings, and puzzles, and mind experiments, and hypothetical problems, instead of a simple valuation model like 10x earnings plus cash. Please keep on reading, and stay tuned for another part or 2 over the next week or two, because I have a hunch that this is leading to somewhere interesting.

The Other Famous Bernoullis

Almost three centuries ago in 1713, Daniel’s cousin Nicolaus Bernoulli posed the probability problem stated above in a letter to Pierre Rémond de Montmort. Nicolaus had just published his late uncle Jacob Bernoulli’s eagerly expected book on probability titled Ars Conjectandi (Jacob had died 8 years before). The ambitious but unfinished project had been written more than 25 years earlier and included a consolidation of the latest developments on the relatively incipient theory, distilling the best insights from Cardano, to Pascal and Fermat, and Huygens.

But in addition to the formalization of all those hairy combinatorial computations for enumerating outcomes, calculating odds, and determining fair payouts in various games of chance, as the techniques had been mostly developed and applied up to then, Jacob’s plan was to broaden its application to cover much more meaningful decisions, such as in the political, economic, and judicial domains.

For an enlightening take on Jacob Bernoulli’s legacy with insightful historical context as compared to our current times, and an informative account of the Ars publication and reception among his peers, check out this article by Glenn Shafer.

On The Real LLN

Among the many profound insights in Jacob’s Ars was the very first proof of the simplest version of the theorem we now refer to as the Law of Large Numbers (LLN). First stated by Cardano without proof more than a century before Bernoulli, the real LLN or “golden theorem” as Bernoulli referred to it (it was Poisson who coined the LLN nickname, more than a century later), simply states that, in the long run, the average of a random variable empirically observed from independently repeating the same experiment many times converges to the theoretical “expected value” of that variable.
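The binary case Jacob proved is the easiest one to watch in action. The sketch below simulates a fair coin and records the running proportion of heads at a few checkpoints; the function name, the seed, and the checkpoints are arbitrary choices of mine.

```python
import random

CHECKPOINTS = (10, 100, 1_000, 10_000, 100_000)

def running_heads_proportion(num_tosses: int, seed: int = 2012) -> dict:
    """Toss a simulated fair coin and record the share of heads at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = {}
    for toss in range(1, num_tosses + 1):
        if rng.random() < 0.5:  # heads with probability 1/2
            heads += 1
        if toss in CHECKPOINTS:
            proportions[toss] = heads / toss
    return proportions

for tosses, share in running_heads_proportion(100_000).items():
    print(f"{tosses:>7} tosses: {share:.4f} heads")
```

Early checkpoints can sit well away from 0.5, while the later ones should hug it more and more closely, which is all the theorem promises.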

Since in the real world (as opposed to the gaming world of cards and dice) we usually can’t construct a priori theories about the expected values of events, the LLN allows us to estimate them a posteriori: we take long-run observations or large samples and compute an empirical average, which the theorem says should gravitate toward the unknown theoretical value. From then on we can more confidently use those estimates as a priori values.

All of this may seem extremely intuitive and obvious to us today, and indeed the basic concept was also common knowledge 300 years ago. Jacob himself acknowledges this fact: “I would have estimated it as a small merit had I only proved that of which no one is ignorant.” What nobody at the time was prepared to accept was that those same highly tractable payoff rules and perfectly defined expectations in play when cards and dice and coins are concerned could be extrapolated to much less structured physical models (e.g. from astronomy to climatology), and even to social and moral science (although maintaining a certain latitude about the inferential knowledge we could obtain about the true fundamental nature of such things). It was Jacob who first opened that door although no one was prepared to walk into the utter darkness behind it. It would be a century later with Laplace, the first to cross it, who then flipped the lights on, and showed the rest of the world how to search for anything inside that wonderfully assorted epistemic closet.

Despite the intuitiveness of it, rigorously proving the theorem was quite a different matter. It took Jacob more than 20 years to navigate the logic and math to prove the simplest case of a binary-valued random variable, and a couple of centuries’ worth of hard interdisciplinary work by dozens of the most incredibly ingenious math thinkers hacking at the intricate fundamental problems required to finally prove it in full (it was some guy named Khinchin in 1929 who finally got it to work for any arbitrary random variable).

This is why the problem that opened this post is paradoxical. If such a lottery were repeated a large number of times, in the long run the expected average winnings are unbounded, yet no one is willing to risk more than a few coins to play. In people’s empirical valuation of it, the lottery seemingly violates the LLN (which is an indisputable mathematical truth). How can that be? The answer and how it may relate to Apple’s size and valuation will have to wait for part 2. But to end this post, a parenthetical clarification is needed.
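To get a feel for the puzzle, the lottery can be played many times in simulation. This is just a sketch with names and a seed I picked arbitrarily, and the particular averages it prints depend entirely on that seed.

```python
import random

def play_once(rng: random.Random) -> int:
    """One game: 1 ducat for heads on the first throw, doubling with each extra throw."""
    payout = 1
    while rng.random() < 0.5:  # tails: toss again and the prize doubles
        payout *= 2
    return payout

def average_winnings(num_games: int, seed: int = 1738) -> float:
    """Average payout per game over num_games simulated games."""
    rng = random.Random(seed)
    return sum(play_once(rng) for _ in range(num_games)) / num_games

for games in (100, 1_000, 10_000, 100_000):
    print(f"{games:>7} games: average {average_winnings(games):.2f} ducats")
```

Unlike the share of heads in a plain coin toss, there is no finite expected value here for the average to home in on. In runs like this the average is typically just a handful of ducats, pushed around by the occasional rare, very large payout.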

The LLN ≠ Limits to Growth

So, we’ve often heard of the LLN in connection with Apple’s size, right? Actually, that’s something completely unrelated. I prefer a different nickname for that other so-called “law” of large numbers: lol numbers, as coined by an Apple 2.0 commenter. The lol numbers is just a buzzword used by some when referring to the obvious inevitability that nothing can grow exponentially forever and ever. In the sense that it is inevitable it may be called a law, except there’s no true applicable conception of such limits except at some extreme, uninteresting, and impractical levels nowhere near the current situation, like the whole world economy as a limit for sales, or the whole world’s population as a limit for the customer base, or perhaps the whole world’s resources of some particular commodity needed in Apple’s products (e.g. rare metals) for which there is no alternative.

The construction of market/industry penetration models and through them assessing Apple’s remaining growth potential and eventual saturation limits within those markets, as well as the markets' overall growth potential, is not a direct corollary to the lol numbers, but instead an unverified theory or hypothesis that one could propose as an analyst. Apple of course could enter new industries and markets, for which new theoretical models would be required, and there’s no practical limit to how many markets it could enter other than those impractical extremes already mentioned.

I wish all those who brought up the lol numbers shared their growth/market penetration models, but that’s never the case. Instead, these sound-bite driven commentators (usually on TV or some other big mainstream media property) robotically recite the phrase to set up an arbitrary straw-man concern about whatever statistic they think is most likely to elicit doubt in the investor mind, e.g. being the largest among peers or largest ever, or reaching half a trillion dollar market cap, or having experienced a period of extraordinarily high growth that defies their narrow logic. For this last case they’ll use another fallacious buzzword, which is also conflated with the LLN: they’ll talk about mean reversion.

PED - Apple and those LOL numbers

By calling it a “law,” and by there actually existing a rigorous mathematical law of that name, and using technical sounding terms which are incorrectly connected to the real LLN (like mean reversion), and the whole academic/mathematical facade staged as a backdrop, the real motivation gets revealed: to deceive investors into granting those concerns an authority and validity and likelihood that is unwarranted. To reinforce these concerns commentators never fail to remind everyone of all the infamous past examples of companies failing right after becoming really big, like GE, Cisco, Microsoft, Intel, Dell, AOL, or basically any big bank, to name a few. They conveniently ignore the fact that in almost all of those cases the company became huge by eating up practically all the market share in its respective industry, which of course drove it right up against a growth wall within its customer base, and in addition was unable to expand into new markets/industries.

Obviously you’ve reached the limits to growth when you’ve completely dominated and saturated your markets, so you must search for new growth by entering new markets/industries. But finding new turf, especially turf that’s big enough to move the needle if you’re already so big, is quite tough.

None of those conditions apply in Apple’s case.

(part 2 continues here)


JavaJack said...

Brilliant thoughts, Daniel. You wrote all this since the earning report. Wow! I look forward to part 2.

Jack T.

Daniel Tello said...

"You wrote all this since the earning report. Wow!"

You must mean the April report, right? lol

Anonymous said...

Very nice.

Remember though that Apple and AAPL are not coins being tossed.

Anonymous said...

Good article. I think I know where you are going with this. Let's see Part II!

Anonymous said...

A thorough explanation why the nonsense about the "law" of great numbers is, well, nonsense. On the other hand almost everything said by the talking heads on Bloomberg (etc.) about Apple is without value and should be exposed as such, just like you do.

Lou Mannheim

Chris said...

Good and interesting article. Always enjoy your posts.

Apple is an engineering company. And here is the difference between an engineer and a mathematician:

Put a naked lady on one side of a room and both an engineer and mathematician on the other side. The engineer says "We'll walk half the distance to her, then stop. Then half the remaining distance, then stop. Then half the remaining again, etc" The mathematician says "Theoretically we'll never reach her". The engineer says "I'll get close enough".

It's all about tolerances.

Anonymous said...

Outstanding. When do we get Part II?

JavaJack said...

How are you feeling about the new Android share of the cellphone market moving forward? We all know the release of the iPhone 5 will help but they've won a big part of the market recently. It worries me. Will tablet iPad sales begin to decline, too, as more people move to the Android OS?

Anonymous said...

Java jack...

The valuations of companies such as Porsche and BMW depended more on their profit dynamics than on market share.

I will start worrying about AAPL when I see significant numbers of people abandoning the platform.

With the platform being as sticky as it is, I don't foresee that happening any time soon.

Although soon in this industry should be measured in a handful of quarters or less!

vk said...

I am actually not clear what the essential differences are between LLN and Mean Reversion. I agree they do not have anything to do with how the talking heads use it but I want to understand it better on its technical merits. If I toss a coin 5 times and get 4 heads and a tail, mean reversion will tell me that the next tosses may tend to be tails since I got lucky in getting the 4 heads. Meaning, that is always possible by chance but that is not expected to continue in the future. That is mean reversion. LLN will say that over the long term my sample result will tend towards the expected value, namely in this case 50%. Did I get that right? They seem to be talking about pretty much the same thing, hence my confusion.

Daniel Tello said...

vk, what you described is the typical example of a Gambler's fallacy. No matter how many heads you've gotten in the previous tosses, the next toss will still have a 50% chance of landing heads because these are independent trials, i.e. the coin knows nothing about the previous results.

The LLN only says that in the long run, the proportion of heads will tend toward 50%, but this doesn't mean the next toss will try to compensate for previous imbalances. Your 4 heads in 5 tosses will get diluted to nearly nothing after the next 100 tosses, but definitely not on the 6th toss. Even if you get 100 heads in 100 tosses (an extremely unlikely event but not impossible), the 101st toss will still have a 50% chance of landing heads. But after 10 thousand tosses, the heads imbalance will most likely have disappeared, if the coin is fair of course.

On the contrary, a persistent deviation in the long run from the 50% expected probability would suggest the next toss would more likely come up heads again (rather than the fallacious notion that tails must be more likely in order to compensate the prior heads preponderance) simply because it would suggest the coin was not fair to begin with.
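The dilution effect can be put in numbers. The sketch below (the helper name is made up for this reply) takes the 4 heads in 5 tosses as given and assumes every later toss is a fair 50/50, so later tosses add half a head each on average and never "compensate":

```python
from fractions import Fraction

def expected_proportion(heads_so_far: int, tosses_so_far: int, extra_tosses: int) -> Fraction:
    """Expected overall share of heads after extra_tosses more fair tosses."""
    expected_heads = heads_so_far + Fraction(extra_tosses, 2)  # half a head per new toss
    return expected_heads / (tosses_so_far + extra_tosses)

for extra in (0, 1, 100, 10_000):
    share = expected_proportion(4, 5, extra)
    print(f"after {extra:>6} more tosses: {float(share):.4f}")
```

The initial surplus of heads never gets cancelled out. It simply gets swamped by the growing number of tosses in the denominator, which is why the expected share drifts toward 50% without any toss favoring tails.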

vk said...

Daniel: OK, got it about the statelessness/lack of memory and independence of these trials.

But still Mean reversion and LLN seem to be saying the same thing, don't they? LLN says that over the long term the proportion of heads will tend towards 50%, the expected value. What will an equivalent mean reversion statement be?

May be, that is what you are explaining in the last paragraph above and I am not catching on to the import of that ( though I understand what you are saying there )

Sorry for pestering you on this.


Daniel Tello said...

vk, apologies for the late reply.

In finance, what's referred to as mean reversion (e.g. how traders use MACDs and Bollinger Bands, also see Siegel's theories on very long term market returns) is quite different from the well defined statistical effect better known as regression toward the mean (RTM). The finance usage implies negative serial correlations (i.e. extreme outcomes allegedly tend to be followed by opposite extreme outcomes which compensate each other to create a longer-term trend) while the statistical RTM applies under any kind of imperfect (|r|<1) correlation between the events (e.g. intra-subject test/re-test experimental designs), and its maximum effect happens when there is no correlation or complete independence.

It so happens that independence is one of the requisites for the LLN to hold. The other condition for the LLN is that the individual events' probability remains constant, which is not required for the RTM concept.

In the coin tossing case, since the tosses are independent (zero correlation) then you have 100% RTM. And since the coin odds remain constant, the LLN is also in play. So yes, in this case the equivalent RTM statement would be similar to that of the LLN.

But RTM is practically of no use here since the outcome of a coin toss is completely random (nothing worth measuring unless we were trying to identify unfair coins) and yet assessing the outcome involves no error.

RTM is important when repeated measurements are made on the same subject or unit of observation and it happens because values are observed with random error. The effect of RTM is compounded by categorizing subjects into groups based on their baseline measurements (i.e. when trying to discern outperformers from underperformers or improvement/decline of such extreme groups): the top performers on an initial measurement are more likely to do worse (as a group) on a subsequent measurement only due to chance (it's more likely that they had scored better than their true measurement due to luck). Similarly, the worse initial performers will tend to do better on a subsequent measurement. It works both ways: the top performers on the second measurement will have done worse initially, and the worst will have done better at first. It doesn't matter if the group as a whole improved or not, the effects are relative to the overall mean. This is the essence of RTM when there's positive correlation (same thing being measured repeatedly) and the expected real underlying values may change between measurements. The LLN can't be assumed at work in such conditions.
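A quick simulation shows the grouping effect. Everything in this sketch is an assumption for illustration: each subject gets a fixed "true" value, every measurement adds independent random error of the same size, and the groups are the top and bottom tenth on the first measurement. For simplicity the true values are held constant between the two measurements.

```python
import random
import statistics

def simulate_rtm(num_subjects: int = 10_000, seed: int = 42) -> dict:
    """Measure every subject twice with random error, then compare extreme groups."""
    rng = random.Random(seed)
    true_values = [rng.gauss(0, 1) for _ in range(num_subjects)]
    first = [t + rng.gauss(0, 1) for t in true_values]   # first noisy measurement
    second = [t + rng.gauss(0, 1) for t in true_values]  # second noisy measurement

    # Rank subjects by the first measurement and take the extreme tenths.
    ranked = sorted(range(num_subjects), key=lambda i: first[i])
    tenth = num_subjects // 10
    bottom, top = ranked[:tenth], ranked[-tenth:]

    def group_mean(values, group):
        return statistics.fmean([values[i] for i in group])

    return {
        "top_first": group_mean(first, top),
        "top_second": group_mean(second, top),
        "bottom_first": group_mean(first, bottom),
        "bottom_second": group_mean(second, bottom),
    }

for label, value in simulate_rtm().items():
    print(f"{label:>13}: {value:+.2f}")
```

Nobody's true value changed, yet the top group looks worse the second time and the bottom group looks better, purely because part of what put them in those groups was luck.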

Sorry for the long, and perhaps somewhat confusing explanation. And no, you're not pestering at all. Grasping these things is quite tricky and there's a lot of muddling and fallacy going on with all this within conventional wisdom. I'm really glad you asked.

vk said...

Daniel: Awesome explanation. I get it now. You are a great teacher. Your reply made me go back to some topics and read up on things.

Thanks very much.

Are the Central Limit Theorem and LLN the same or different? Or at least, CLT states more than LLN but it does include LLN fully?