Wednesday, June 19, 2019

More winning!!

Where winning = doing something, anything, faster than James.

Last year I discovered why the Lake District is called that. I always thought it was a funny name for a bunch of pretty mountains and lots of cars. But it turns out there are all these big deep cold lakes, and you are allowed to swim in almost all of them! 

Ullswater 500m, 1 mile (1610m - don't ask me why it isn't a sensible 1500m!), and 3.5km swims were last Sunday. The 500m (84 finishers) is perhaps the beginners' event. The 3.5km is pretty much ironman practice distance and the standard was high. However, it was so cold that this event was reduced to 2.5km. The 1 mile was equally cold (11.8C brrrr.) but they made us do the whole thing! 292 people finished this one, including me and James. I got round 6 minutes quicker than James, which makes the difference between us in swimming and running about the same, but the other way round! But somehow James came out more inspired.

My race was a bit of a fist fight. Whereas a week ago at Leeds I was swimming among a wave of elegant, lithe, lightweight, coordinated, fit but middle-aged women, when it comes to pure swimming, the big, the fat, the young and the male tend to trounce lightweight middle-aged elegance! I was completely unprepared for being half overtaken by thrashing behemoths doing front crawl who then collapsed into breaststroke for a few strokes, thus entangling all their kicky limbs among mine. The way out is to kick violently, but this does take quite a lot of energy. Next time! Still, a reaction of annoyance rather than panic is encouraging I suppose. I am still not sure how to overtake these people, however, as it is really hard to get around widely flailing limbs in a packed field, and trying to draft behind them doesn't really work.

Tuesday, June 18, 2019

The sociopaths have taken over the asylum

Just in case anyone was in any doubt about the nature of the swivel-eyed loons who will shortly be picking our new PM....

Saturday, June 15, 2019

[jules' pics] World triathletes

James kindly blogged my amazing triumph in the British Triathlon Championships... the triumph being not dying during the event and also BEATING HIS MARATHON TIME!! (hurrah!)

Here are the real ones. 

Cycling (not the lead group)


Georgia Taylor-Brown won in the end (she is in second in the running pic here). Katie Zaferes was second, and Jess Learmonth was third.

Some men also did it later on.

Friday, June 14, 2019


jules has taken up triathloning. I'm a rubbish swimmer so am not really tempted. The cycling and running bit would be ok but there's not much fun in doing a race where I start by half-drowning myself and giving everyone else a 20 minute head start. Anyway she has done a couple of shorter pool-based events over the last couple of years but enjoys open water swimming so wanted to do one of those, which are more often the full Olympic distance (1500m swim, 40k bike, 10km run).

Leeds of course is the centre of the UK for triathlon, with not just the Brownlees but also the women's team (who are probably better than the men these days) mostly based there. So doing the Leeds triathlon was the obvious choice. As well as the UK age-group championships there was an international elite event following (part of the ITU World Triathlon Series).

We started out with the traditional pizza, which was very good but so small we had to get some more slices.

The morning was bright and sunny but quite cold. Compared to Windermere where we had been practising, the water was apparently not too bad at 15C.

One of these pictures contains jules, the other is the wave in front of hers.

This isn't jules, who had apparently just swum past without me noticing. She didn't want to wave in case she got accidentally rescued! She was a little faster than I'd expected and you really can't tell people apart in the water when they are all wearing wetsuits and hats. So I missed the fun of watching her struggle to get out of her wetsuit in transition.

A massive collection of very high-tech bikes. Together with jules' one. All surrounded by high fences and patrolled by security guards all night as you had to leave your bike there the night before.

Not much evidence from the photo but she was actually running in this pic! (It was uphill to be fair). And having been following her round the course, I didn't quite have time to get into the grandstand proper for the finish, due to the circuitous route and closed roads. But her hat is just visible over the barrier. There was also a live stream on the BBC website...ah here it is with no sound.

jules had worked out that she might be able to beat my marathon time....and sure enough...

She's been wearing the medal non-stop since the weekend! So I've got my work cut out over the winter to win back bragging rights....

Friday, June 07, 2019

The risks of financial managers, part 2

With reference to this post.

Unknown commenter pointed out the issue with portfolio E in particular, that although it had an expected gain of 5% per year, investors who persist with this portfolio over the long term would probably lose more in the bad years than they would gain in good ones. Sounds contradictory? Not quite. If you do the sums, you will see that the expected gain over a long sequence of years is generated from a very small probability of an extremely large gain, together with a very large probability of losing almost all your initial investment. The distribution of wins and losses is binomial (which tends towards Gaussian for a lot of years) but in order to come out ahead the investor needs to get lucky roughly 3 out of 5 years, and the probability of this happening will shrink exponentially (in the long term) as the number of years increases because it's moving further and further into the tail of a Gaussian.
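For anyone who wants to check the sums, here is a minimal sketch in Python of the binomial calculation for portfolio E (+58% or -48% each year with equal probability). The function name prob_ahead and the choice of year counts are just illustrative assumptions, not anything from the questionnaire.

```python
from math import comb, log

def prob_ahead(n_years, up=1.58, down=0.52, p_up=0.5):
    """Probability that wealth after n_years exceeds the starting wealth."""
    total = 0.0
    for wins in range(n_years + 1):
        # Log of final wealth relative to the start, given this many good years.
        log_wealth = wins * log(up) + (n_years - wins) * log(down)
        if log_wealth > 0:
            # Binomial probability of exactly this many good years.
            total += comb(n_years, wins) * p_up**wins * (1 - p_up) ** (n_years - wins)
    return total

# The expected return per year is 0.5*1.58 + 0.5*0.52 = 1.05, yet the
# chance of actually being ahead falls as the horizon lengthens.
for n in (5, 20, 50, 100):
    print(n, round(prob_ahead(n), 4))
```

The break-even fraction of good years is log(1/0.52) / (log(1.58) + log(1/0.52)), a little under 0.59, which is where "roughly 3 out of 5" comes from.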

As an extreme version of this, consider being invited to place a sequence of bets on a coin toss where the result of a T means you lose whatever your stake was, but H means you get back 3 times your stake (ie you win 2x stake, plus get your stake back - odds of 2:1 in betting parlance). This bet clearly has positive expectation: each pound bet has an expected return of £1.50, so if you want to maximise your expected wealth then rationally this bet is a great offer. If you start with a pound in the pot and do this 20 times in a row, betting your entire pot each time, you either end up with 3^20 pounds (with a 1 in a million probability, when you get 20 heads) or else you lose everything (with 999,999 in a million probability, when a tail turns up at any time). (2^20 is actually 1,048,576 which is close enough to a million for many purposes and can be a useful rule of thumb to remember). The expected gain at the end of the 20 bets is about £3300 but the vast majority of players will end up with nothing. Would any of my readers pay £1000 for the right to take part in this game?
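The arithmetic of the all-in game can be checked exactly with a few lines of Python. This is only a sketch of the sums described above; the variable names are my own.

```python
from fractions import Fraction

n_bets = 20

# Staking the whole pot every time, you only survive if every toss is a head.
p_all_heads = Fraction(1, 2) ** n_bets

# Starting from 1 pound, each head triples the pot.
final_pot = 3**n_bets

# Expected pot: the single winning path weighted by its probability, (3/2)^20.
expected_pot = p_all_heads * final_pot

print("chance of winning: 1 in", p_all_heads.denominator)
print("pot if you win:", final_pot)
print("expected pot:", round(float(expected_pot), 2))
```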

In fact, for most people, most of the time, increasing wealth by a factor of 10 doesn't really make life 10 times better, but most people would be very averse to a bet where they could lose everything they own, including their house and the clothes off their back, even if the expected return was positive (eg betting the farm on the coin toss as above). A standard approach to account for this is to evaluate uncertain outcomes in terms of expected utility rather than expected value, and a utility function which is the logarithm of value is a plausible function to use. One typical implication would be that the subject would be indifferent about taking a bet where they might either double or halve their wealth with equal probability. The expected value of the bet is positive of course, but expected utility (compared to the prior situation) is zero. It should be noted that no-one really behaves as a fully rational utility-maximiser in realistic testing, but it's a plausible starting point widely used for rational decision theory.
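As a small illustration of the double-or-halve bet, the sketch below compares expected value with expected log utility. The representation of a bet as (probability, wealth multiplier) pairs is an assumption made purely for this example.

```python
from math import log

def expected_value(outcomes):
    """Expected wealth multiplier for a list of (probability, multiplier) pairs."""
    return sum(p * m for p, m in outcomes)

def expected_log_utility(outcomes):
    """Expected change in log wealth for the same list of outcomes."""
    return sum(p * log(m) for p, m in outcomes)

double_or_halve = [(0.5, 2.0), (0.5, 0.5)]

print("expected value:", expected_value(double_or_halve))
print("expected log utility:", expected_log_utility(double_or_halve))
```

The expected multiplier is above 1, but log(2) and log(1/2) cancel, so in log-utility terms the bet leaves you exactly where you started.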

This logarithmic utility maximisation idea leads naturally to the Kelly Criterion for choosing the size of the stake in betting games like the coin toss above. The point is that by betting a proportion of your wealth (rather than all of it) you can improve your return in terms of expected utility. Note that the log of 0 is infinitely negative, so losing all you own is best avoided! In 1956, Kelly proposed a formula for the stake which gives the maximum expected gain in logarithmic terms. The Kelly formula of (p(b+1)-1)/b, where p is probability of winning and b is odds in the traditional sense, implies a stake of (0.5*3-1)/2 = 0.25, ie you should bet a quarter of your wealth on each of the "triple or nothing" coin tosses. After the first bet, you will have either 0.75 or 1.5 pounds etc, so you either gain 50% or lose 25% and if you were to have an equal number of wins and losses you will more than triple your money in 20 bets. A smaller win in absolute terms, but a much better outcome in terms of expected utility and the majority of players who follow this strategy will make a profit.
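Here is a minimal sketch of the Kelly calculation for the "triple or nothing" coin toss, using the formula quoted above. The function names kelly_fraction and expected_log_growth are illustrative choices rather than anything standard.

```python
from math import log

def kelly_fraction(p, b):
    """Kelly stake as a fraction of wealth: p = win probability, b = odds (b:1)."""
    return (p * (b + 1) - 1) / b

def expected_log_growth(fraction, p, b):
    """Expected change in log wealth per bet when staking this fraction."""
    return p * log(1 + fraction * b) + (1 - p) * log(1 - fraction)

stake = kelly_fraction(0.5, 2)  # coin toss paying 2:1
print("Kelly stake:", stake)

# Ten wins and ten losses in any order, staking the Kelly fraction each time.
wealth = (1 + stake * 2) ** 10 * (1 - stake) ** 10
print("wealth after 10 wins and 10 losses:", round(wealth, 3))

# Staking a bit more or a bit less gives lower expected log growth.
for f in (0.15, 0.25, 0.35):
    print(f, round(expected_log_growth(f, 0.5, 2), 4))
```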

So what does this have to do with the investment portfolios? Returning to the investments, each portfolio can be considered a bet where you stake a proportion of your wealth with a particular odds and 50% chance of winning. Eg with portfolio E the investor is betting 0.48 of their wealth with odds of (1.06/0.48 - 1):1 = 1.21:1. Kelly says that with such odds and a 50% win chance, you should really bet only about 9% of your wealth, which would return either 0.91 or 1.11 which gives a small gain in log terms. Of course the investor doesn't get to choose their stake here, but it still provides an interesting framework for comparison. The 5 investments have the following implied odds, stakes, geometric mean returns and Kelly-optimal stakes respectively:

Portfolio  Odds  Stake  Geometric mean return  Kelly-optimal stake
A          1.6   0.07   1.02                   0.18
B          1.7   0.10   1.03                   0.21
C          1.6   0.16   1.02                   0.18
D          1.4   0.27   1.00                   0.14
E          1.2   0.48   0.91                   0.09

C has a better return than A (having the same odds and a closer to optimal bet) but the rounding conceals it. B is better than either due to having better odds, even though its stake is only about half the Kelly-optimal one. D is useless and E is worse than useless in these terms, implying a massive bet on rather poor odds which means most of the time you'll actually lose money in the long run.
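The figures for the five portfolios can be recomputed from the questionnaire's gains and losses with the sketch below. It assumes, as in the text, that the stake is the bad-year loss, the odds are gain divided by loss, and each outcome has 50% probability; the names PORTFOLIOS and implied_bet are my own.

```python
# Good-year gain and bad-year loss for each portfolio, as fractions of wealth.
PORTFOLIOS = {
    "A": (0.11, 0.07),
    "B": (0.17, 0.10),
    "C": (0.25, 0.16),
    "D": (0.37, 0.27),
    "E": (0.58, 0.48),
}

def implied_bet(gain, loss, p_win=0.5):
    """Return (odds, stake, geometric mean return, Kelly-optimal stake)."""
    odds = gain / loss  # winnings per unit staked
    stake = loss  # fraction of wealth lost in a bad year
    geo_mean = (1 + gain) ** p_win * (1 - loss) ** (1 - p_win)
    kelly = (p_win * (odds + 1) - 1) / odds
    return odds, stake, geo_mean, kelly

for name, (gain, loss) in PORTFOLIOS.items():
    odds, stake, geo_mean, kelly = implied_bet(gain, loss)
    print(f"{name}  {odds:.1f}  {stake:.2f}  {geo_mean:.2f}  {kelly:.2f}")
```

Printing more decimal places for the geometric mean shows the ordering between A, B and C that two-figure rounding hides.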

It is fair to say that not everyone necessarily wants to maximise the expected log of their wealth, but I was surprised to see investment strategies proposed that were actually loss-making in log space. It's also true that investment E has the largest gain in purely expected value terms, but it would require an extraordinary appetite for risk to take it (rather than tolerance or indifference). And this wasn't a single accident: the other similar question had no fewer than 3 out of 6 options having the same property. I actually wonder if it's partly a cognitive error caused by the presentation. One of the questionees said that they wouldn't be bothered by a 40% loss one year if they could expect a 60% gain the next. If that was written as dividing their investment by a factor of 1.7 one year and then multiplying it by 1.6 the next, it might seem less attractive!
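The questionee's example is quick to check. This is just the arithmetic for a 40% loss followed by a 60% gain, with variable names of my own choosing.

```python
# A 40% loss followed by a 60% gain, as multipliers on the investment.
after_loss = 1 - 0.40
after_gain = 1 + 0.60

net_multiplier = after_loss * after_gain
divide_factor = 1 / after_loss  # the loss expressed as "divide by ..."

print("net multiplier over two years:", round(net_multiplier, 2))
print("loss as a division factor:", round(divide_factor, 2))
```

The order of the two years makes no difference, since multiplication commutes.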

Wednesday, June 05, 2019

The risks of financial managers

The following question is a slightly reworded version of a real question in a real financial management company's risk questionnaire that was provided to someone locally. I've tried to be fair to the financial company while making their question a bit less vague. They actually had two similar questions which cover this issue in slightly different ways.

"You have the choice of placing your investment in one of the following 5 portfolios, ranging from low to high risk. For each portfolio, you can assume the return over each consecutive year (edit: was 5 years) takes one of two possible values, with 50% probability of each outcome. Which portfolio would you prefer for your investment?

A: 50% chance of either +11% or -7%
B: 50% chance of either +17% or -10%
C: 50% chance of either +25% or -16%
D: 50% chance of either +37% or -27%
E: 50% chance of either +58% or -48%"

So, which option(s) do you like, and why?

Tuesday, June 04, 2019

How confident should you have been about confidence intervals?

OK, it’s answer time for these questions (also here on this blog). First, a little background. This is the paper, or rather, here it is to download. The questions were asked of over 100 psychology researchers and 400 students and virtually none of them got all the answers right, with more wrong than right answers overall.

The questions were modelled on a paper by Gigerenzer who had done a similar investigation into the misinterpretation of p-values arising in null hypothesis significance testing. Confidence intervals are often recommended as an improvement over p-values, but as this research shows, they are just as prone to misinterpretation.

Some of my commenters argued that one or two of the questions were a bit unclear or otherwise unsatisfactory, but the instructions were quite clear and the point was not whether one might think the statement probably right, but whether it could be deduced as correct from the stated experimental result. I do have my own doubts about statement 5, as I suspect that some scientists would assert that “We can be 95% confident” is exactly synonymous with “I have a 95% confidence interval”. That’s a confidence trick, of course, but that’s what confidence intervals are anyway. No untrained member of the public could ever guess what a confidence interval is.

Anyway, the answer, for those who have not yet guessed, is that all of the statements were false, broadly speaking because they were making probabilistic statements about the parameter of interest, which simply cannot be deduced from a frequentist confidence interval. Under repetition of an experiment, 95% of confidence intervals will contain the parameter of interest (assuming they are correctly constructed and all auxiliary hypotheses are true) but that doesn’t mean that, ONCE YOU HAVE CREATED A SPECIFIC INTERVAL, the parameter has a 95% probability of lying in that specific range.
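The "under repetition" part is easy to see in a simulation. The sketch below assumes a deliberately simple setup that is not from the paper: normally distributed data with a known standard deviation, and the textbook interval of sample mean plus or minus 1.96 standard errors. All parameter values are arbitrary.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=10.0, sigma=2.0, n=25, trials=10_000, seed=0):
    """Fraction of repeated experiments whose 95% interval contains true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    half_width = z * sigma / n**0.5
    hits = 0
    for _ in range(trials):
        # One experiment: draw n observations and build the interval.
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

print("long-run coverage:", coverage())
```

The number printed is a property of the procedure over many repetitions. It says nothing by itself about any one interval that has already been computed.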

In reading around the topic, I found one paper which had an example which is similar to my own favourite. We can generate valid confidence intervals for an unknown parameter with the following procedure: with probability 0.95, say “the whole number line”, otherwise say “the empty set”. If you repeat this many times, the long-run coverage frequency tends to 0.95, as 95% of the intervals do include the true parameter value. However, for a given example, we can state with absolute certainty whether the parameter is either in or outside the interval, so we will never be able to say, once we have generated an interval, that there is 95% probability that the parameter lies inside that interval.
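That procedure is simple enough to write down directly. In this sketch the empty set is represented by None and the parameter value 3.7 is arbitrary; both are just conventions for the example.

```python
import random

def trivial_interval(rng):
    """With probability 0.95 return the whole number line, otherwise the empty set."""
    if rng.random() < 0.95:
        return (float("-inf"), float("inf"))
    return None  # stands for the empty set

def covers(interval, theta):
    """True if theta lies inside the interval (the empty set covers nothing)."""
    return interval is not None and interval[0] <= theta <= interval[1]

rng = random.Random(1)
theta = 3.7  # the "unknown" parameter; its value is irrelevant to the procedure
trials = 100_000
hits = sum(covers(trivial_interval(rng), theta) for _ in range(trials))

print("long-run coverage:", hits / trials)
```

Each individual interval either certainly contains theta or certainly does not, yet the long-run coverage is still 95%.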

(Someone is now going to raise the issue of Schrödinger’s interval, where the interval is calculated automatically, and sealed in a box. Yes, in this situation we can place 95% probability on that specific interval containing the parameter, but it’s not the situation we usually have where someone has published a confidence interval, and it’s not the situation in the quiz).

And how about my readers? These questions were asked on both blogs (here and here) and also on twitter, gleaning a handful of replies in all places. Votes here and on twitter were majority wrong (and no-one got them all right). Interestingly, all three of the commenters on the Empty Blog were basically correct, though two of them gave slightly ambiguous replies; I think their intent was right. Maybe it helps that I’ve been going on about this for years there 🙂