Monday, January 16, 2012

Presidential Race

It looks like we've got a good battle for the presidency coming up. (Odds from Pinnacle Sports)


                Decimal Odds   Minimum Percent to be Profitable   Implied Actual Percent
Democraps       1.714          0.5834305718                       0.5697791165
Republican'ts   2.27           0.4405286344                       0.4302208835
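
The two computed columns follow from the decimal odds alone. Here is a minimal sketch in R; the name impliedTable is made up for illustration, and it assumes the book's margin is spread proportionally across both sides.

```r
# Hypothetical helper: break-even and normalized implied probabilities
impliedTable <- function(odds) {
  breakEven <- 1 / odds                  # minimum win chance for a bet to be profitable
  implied <- breakEven / sum(breakEven)  # rescaled so the probabilities sum to 1
  cbind(odds, breakEven, implied)
}
presRace <- impliedTable(c(1.714, 2.27))
presRace
```

The break-even column sums to more than 1; the excess is the book's margin, and dividing by the sum removes it.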



Saturday, May 28, 2011

Fun with the NBA Finals! and a Positive EV Bet?!?!

(All betting prices taken from pinnaclesports.com on May 28th.)
If you read my sports betting post, you know I wrote an R function called oddsFinderDec to find implied percentages of events occurring given betting prices.  Here I will present how likely each team is to win the Finals based on Pinnacle prices of 2.55 for the Mavericks and 1.581 for the Heat.

> oddsFinderDec(c(2.55,1.581))
[1] 0.3827160 0.6172840

This means the Mavericks have about a 38.27% chance of winning the finals.

What about an exact series results breakdown?   They have prices for each possible result (as in, Mavs 4-0, Mavs 4-1...Heat 4-3).  Let's check out those percentages.

> exactSeriesResult=oddsFinderDec(c(20.72,8.35,8.5,7.85,10.42,7.89,4.47,4.02))
> exactSeriesResult
[1] 0.04354880 0.10806362 0.10615661 0.11494665 0.08659609 0.11436391 0.20186381 0.22446050
Those eight probabilities are for each result.  The first four are for the Mavs winning in order from 4-0 to 4-3, then the Heat winning in order from 4-0 to 4-3.  So, just to be clear there should be about a 20.19% chance the Heat win 4-2.  If I sum up each Mavs winning scenario percent, it should add up to the percent of the Mavs winning the whole series.

> sum(exactSeriesResult[1:4])
[1] 0.3727157
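Similar, but about one percentage point off.  I could force the exact series result percentages to agree, so that the results in which the Mavs win sum to the percent that the Mavs win the series.  (The Heat results below get scaled by the inverse of the Mavs factor, which is only approximate: the eight adjusted values sum to about 0.994 rather than 1.)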

> adjExactSeriesResult=exactSeriesResult
> adjExactSeriesResult[1:4]=exactSeriesResult[1:4]*oddsFinderDec(c(2.55,1.581))[1]/sum(exactSeriesResult[1:4])
> adjExactSeriesResult[5:8]=exactSeriesResult[5:8]*sum(exactSeriesResult[1:4])/oddsFinderDec(c(2.55,1.581))[1]
> adjExactSeriesResult
[1] 0.04471726 0.11096308 0.10900491 0.11803079 0.08433333 0.11137558 0.19658911 0.21859536
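
The adjustment above scales the Heat outcomes by the inverse of the Mavs factor, so the eight adjusted values do not quite sum to 1. A stricter alternative, sketched here as an assumption about what such a version could look like (rescaleToSeries is a made-up name), scales each team's four outcomes to that team's series probability. The rest of the post keeps using adjExactSeriesResult as computed above.

```r
# Values copied from the post
exactSeriesResult <- c(0.04354880, 0.10806362, 0.10615661, 0.11494665,
                       0.08659609, 0.11436391, 0.20186381, 0.22446050)
mavsSeries <- 0.3827160  # oddsFinderDec(c(2.55,1.581))[1]

# Hypothetical helper: rescale each team's four outcomes to match the series price
rescaleToSeries <- function(results, mavsProb) {
  adj <- results
  adj[1:4] <- results[1:4] * mavsProb / sum(results[1:4])        # Mavs outcomes
  adj[5:8] <- results[5:8] * (1 - mavsProb) / sum(results[5:8])  # Heat outcomes
  adj
}
adj2 <- rescaleToSeries(exactSeriesResult, mavsSeries)
sum(adj2)  # sums to 1 by construction
```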


There are also over/unders on the length of the series.

Here's the 4.5 games series length over/under.
> ou4p5=oddsFinderDec(c(1.117,7.02))

> ou4p5
[1] 0.8627258 0.1372742
There's about an 86.27% chance the series lasts more than 4.5 games.

> ou5p5=oddsFinderDec(c(1.467,2.92))
> ou6p5=oddsFinderDec(c(2.77,1.508))
From the over/unders and using the power of subtraction, I can find implied chances of each total game result.
> fourG=ou4p5[2]

> fiveG=ou5p5[2]-ou4p5[2]
> sixG=ou5p5[1]-ou6p5[1]

> sevenG=ou6p5[1]
> exactGames=c(fourG,fiveG,sixG,sevenG)
> exactGames
[1] 0.1372742 0.1971229 0.3131017 0.3525012
I wonder how these match up with those from the exact series result chances.  What I'm doing here is just adding the Mavs 4-0 to the Heat 4-0 to get the chance of a 4 game series and the same for 4-1, 4-2, and 4-3.

> exactGames2=exactSeriesResult[1:4]+exactSeriesResult[5:8]
> exactGames3=adjExactSeriesResult[1:4]+adjExactSeriesResult[5:8]
> exactGames2
[1] 0.1301449 0.2224275 0.3080204 0.3394072
> exactGames3
[1] 0.1290506 0.2223387 0.3055940 0.3366262
There are some large differences for 5 and 7 games.

What about game 1?  The Heat are at home.

> oddsFinderDec(c(2.75,1.513))
[1] 0.3549144 0.6450856
The Mavs have about a 35.49% chance of winning game 1.
If we assume the Heat are just as likely to win game 2 at home as game 1, and that their chances of winning games 3 and 4 on the road are equal to each other, then a Heat sweep has probability heatHome^2 * heatAway^2.  That lets us back out the Heat's chance of winning on the road from the sweep chance we estimated earlier.

> heatHome=oddsFinderDec(c(2.75,1.513))[2]
> heatAway=sqrt(adjExactSeriesResult[5]/heatHome^2)
> heatHome
[1] 0.6450856
> heatAway
[1] 0.4501759
Alright!

Now that I have this info, I might as well simulate the series 100000 times.

> exactSeriesResult4=numeric(8);
> for(i in 1:100000){
+ h=0; m=0; bool=T;
+ while(bool){
+ if(runif(1,0,1)<heatHome){h=h+1} else{m=m+1}
+ if(runif(1,0,1)<heatHome){h=h+1} else{m=m+1}
+ if(runif(1,0,1)<heatAway){h=h+1} else{m=m+1}
+ if(runif(1,0,1)<heatAway){h=h+1} else{m=m+1}
+ if(h==4){exactSeriesResult4[5]=exactSeriesResult4[5]+1; break} else{
+ if(m==4){exactSeriesResult4[1]=exactSeriesResult4[1]+1; break}}
+ if(runif(1,0,1)<heatAway){h=h+1} else{m=m+1}
+ if(h==4){exactSeriesResult4[6]=exactSeriesResult4[6]+1; break} else{
+ if(m==4){exactSeriesResult4[2]=exactSeriesResult4[2]+1; break}}
+ if(runif(1,0,1)<heatHome){h=h+1} else{m=m+1}
+ if(h==4){exactSeriesResult4[7]=exactSeriesResult4[7]+1; break} else{
+ if(m==4){exactSeriesResult4[3]=exactSeriesResult4[3]+1; break}}
+ if(runif(1,0,1)<heatHome){h=h+1} else{m=m+1}
+ if(h==4){exactSeriesResult4[8]=exactSeriesResult4[8]+1; break} else{
+ if(m==4){exactSeriesResult4[4]=exactSeriesResult4[4]+1; break}}
+ }}
> exactSeriesResult4=exactSeriesResult4/sum(exactSeriesResult4)
> exactSeriesResult4
[1] 0.03830 0.10898 0.10671 0.10969 0.08498 0.13295 0.21587 0.20252
Here are the exact series results in the order they appeared before when we looked at the percentages implied by the betting prices.
How do they compare to those percentages?

> exactSeriesResult
[1] 0.04354880 0.10806362 0.10615661 0.11494665 0.08659609 0.11436391 0.20186381 0.22446050

There are a couple modest differences.
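
Under the same assumptions as the simulation (independent games, one Heat home win chance and one Heat road win chance, 2-3-2 format), the eight series results can also be computed exactly by enumerating all 2^7 win/loss sequences. This is a sketch, not the method used in the post; exactSeries is a hypothetical name.

```r
# Hypothetical helper: exact series-result probabilities, same order as the post
# (Mavs 4-0 ... Mavs 4-3, then Heat 4-0 ... Heat 4-3)
exactSeries <- function(heatHome, heatAway) {
  # Heat win chance in games 1-7 (games 1, 2, 6, 7 in Miami)
  pHeat <- c(heatHome, heatHome, heatAway, heatAway, heatAway, heatHome, heatHome)
  res <- numeric(8)
  grid <- as.matrix(expand.grid(rep(list(c(TRUE, FALSE)), 7)))
  for (r in seq_len(nrow(grid))) {
    heatWon <- grid[r, ]
    prob <- prod(ifelse(heatWon, pHeat, 1 - pHeat))
    h <- cumsum(heatWon); m <- cumsum(!heatWon)
    endGame <- min(which(h == 4 | m == 4))   # game in which the series ends
    idx <- (endGame - 3) + if (h[endGame] == 4) 4 else 0
    res[idx] <- res[idx] + prob              # games after the clincher marginalize out
  }
  res
}
exactRes <- exactSeries(0.6450856, 0.4501759)
exactRes
```

Comparing this against the simulated frequencies shows how much of the gap to the betting prices is simulation noise and how much comes from the model itself.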

How often did the Mavs win the series in this sim?
> sum(exactSeriesResult4[1:4])
[1] 0.36368
That's nearly 2 percentage points off from the series prices.  I don't like how big that difference is.  I could do an adjustment similar to the one I did earlier.

> adjExactSeriesResult4=exactSeriesResult4
> adjExactSeriesResult4[1:4]=exactSeriesResult4[1:4]*oddsFinderDec(c(2.55,1.581))[1]/sum(exactSeriesResult4[1:4])
> adjExactSeriesResult4[5:8]=exactSeriesResult4[5:8]*sum(exactSeriesResult4[1:4])/oddsFinderDec(c(2.55,1.581))[1]
> adjExactSeriesResult4
[1] 0.04030473 0.11468432 0.11229551 0.11543149 0.08075315 0.12633715 0.20513277 0.19244679
Of course, this is a stupid way to do it because the chances of each series result should not increase proportionally by making the Mavericks better. Regardless, look at this:
> exactGames4=adjExactSeriesResult4[1:4]+adjExactSeriesResult4[5:8]
> exactGames4
[1] 0.1210579 0.2410215 0.3174283 0.3078783
Now, recall this:

> exactGames
[1] 0.1372742 0.1971229 0.3131017 0.3525012
.3525 vs. .3079
That is huge!  We may have a +EV bet if we bet on the under 6.5 games!  But hold your horses, I'm going to redo the 100000 series simulations with a slightly lower Heat road winning percent (that is, a slightly higher Mavs home winning percent).

> heatAway=sqrt(adjExactSeriesResult4[5]/heatHome^2)
> heatAway
[1] 0.4405167
Here are the results of the simulation using updated Heat road win percent based on new sweep chances.
> exactSeriesResult5
[1] 0.03914 0.11405 0.10642 0.11290 0.08171 0.12988 0.21383 0.20207
Note that the Mavs won more, but not enough more.
> sum(exactSeriesResult5[1:4])
[1] 0.37251
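
The code for this rerun isn't shown; presumably the earlier loop was run again with the new heatAway. One way to make reruns painless is to wrap the simulation in a function. The sketch below is an assumption about how that could look (simulateSeries is a made-up name) and it will not reproduce the exact numbers above, since those depend on the random draws.

```r
# Hypothetical wrapper: simulate n best-of-7 series in the 2-3-2 format
simulateSeries <- function(heatHome, heatAway, n = 100000) {
  pHeat <- c(heatHome, heatHome, heatAway, heatAway, heatAway, heatHome, heatHome)
  # n x 7 matrix, TRUE where the Heat win; column j uses pHeat[j]
  heatWon <- matrix(runif(n * 7) < rep(pHeat, each = n), nrow = n)
  result <- numeric(8)
  for (i in seq_len(n)) {
    h <- cumsum(heatWon[i, ]); m <- cumsum(!heatWon[i, ])
    endGame <- which(h == 4 | m == 4)[1]   # first game where a team reaches 4 wins
    idx <- endGame - 3 + if (h[endGame] == 4) 4 else 0
    result[idx] <- result[idx] + 1
  }
  result / n   # Mavs 4-0..4-3, then Heat 4-0..4-3
}
set.seed(2011)
simRes <- simulateSeries(heatHome = 0.6450856, heatAway = 0.4405167)
sum(simRes[1:4])   # share of series won by the Mavs
```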

I'll do this one more time, but now I will choose the Heat road win percent by linear extrapolation from the first two simulations.  Here is middle school math at its most tedious.
> heatAway=(oddsFinderDec(c(2.55,1.581))[1]-sum(exactSeriesResult[1:4]))/((sum(exactSeriesResult[1:4])-sum(exactSeriesResult4[1:4]))/(heatAway-sqrt(adjExactSeriesResult[5]/heatHome^2)))+heatAway
> heatAway
[1] 0.4298262
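Look at how long that first equation is!  (It plugs in sum(exactSeriesResult[1:4]), the 0.3727 implied by the prices, where the second simulation's sum(exactSeriesResult5[1:4]), 0.37251, looks like the intended value; the two are close enough that heatAway barely changes.)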
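
The long equation is a straight-line extrapolation through two (Heat road win percent, Mavs series win share) points. A small helper makes that easier to read; this is a sketch, secantStep is a made-up name, and the inputs are the two simulation results quoted above.

```r
# Hypothetical helper: find x where the line through (x0,y0) and (x1,y1) hits target
secantStep <- function(x0, y0, x1, y1, target) {
  slope <- (y1 - y0) / (x1 - x0)
  x1 + (target - y1) / slope
}

# x = Heat road win percent, y = Mavs series win share in the sim
newHeatAway <- secantStep(x0 = 0.4501759, y0 = 0.36368,
                          x1 = 0.4405167, y1 = 0.37251,
                          target = 0.3827160)
newHeatAway   # roughly 0.429
```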
Here is the simulation one more time with the new Heat road winning percent.
> exactSeriesResult6
[1] 0.04224 0.12187 0.10968 0.11226 0.07713 0.12150 0.21214 0.20318
> sum(exactSeriesResult6[1:4])
[1] 0.38605
I actually overshot it a little, with the Mavs now winning too often.  
> exactGames6=exactSeriesResult6[1:4]+exactSeriesResult6[5:8]
> exactGames6
[1] 0.11937 0.24337 0.32182 0.31544
Even with me giving the Mavs a little too much credit, the series only went to 7 games 31.54% of the time.  

So should we take the under on 6.5 games for the series?
Here's a new function:
> profOddsDec=function(odds){
+ 1/odds}
It'll tell you what you need the chances to be for a bet to have positive expected value.
Here it is for the under on 6.5 games.
> profOddsDec(1.508)
[1] 0.66313
What did my sim find?
> 1-exactGames6[4]
[1] 0.68456
and that was while giving the Mavs too much credit.

So I say go for it!  Bet your life savings! 
> (1-exactGames6[4])*1.508
[1] 1.032316
You can expect a whopping 3.2% return on investment if you make a bunch of assumptions about game independence and accuracy of betting prices.
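
The same arithmetic as a reusable function: a bet at decimal odds pays odds times the stake with probability p and nothing otherwise. This is a sketch; evDec is a made-up name, not a function from earlier in the post.

```r
# Hypothetical helper: expected profit per $1 staked at decimal odds
evDec <- function(p, odds) {
  p * odds - 1
}
evUnder <- evDec(0.68456, 1.508)   # win chance from the sim, price on under 6.5 games
evUnder
```

At the break-even chance 1/odds the expected profit is zero, which is the number profOddsDec returns.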

It may be that refs (whether intentionally or not) have a slight bias toward the team that is losing the series, which makes series go longer.  That may be the reason the prices imply more seven-game series than my sim does.

I would also like to point out that in the world of sports betting, much like the world of stock market investing,  you should live by the following rule.  If you think something is priced incorrectly, it's not because it's priced wrong, it's because those who made it the price it is know something that you don't.  Of course, there are exceptions to this rule.  



Wednesday, May 18, 2011

1 2 3 Foot

1 2 3 Foot is a great game.  Generally, it is played when there is a group of people who must determine a subset of that group who must do a specific thing.  For example, if there were five dudes hanging out and they decided to go to McDonald's and they wanted one guy to drive everyone, they would play 1 2 3 Foot to determine who drives.  This guy is the winner, regardless of whether he wanted to drive or not.  There does not have to be one winner though, there can be more - you might use it to determine which two guys have to carry a heavy couch or which three guys get to eat the last three pieces of pizza.  You can play the game to determine who must do something that is no fun or who gets to do something that is fun.

how the game works:
First, specify how many winners there will be.  Then, all players stand in a circle and chant, "1 2 3 Foot," while alternating between putting their bad and good foot forward in unison with the words they chant.  By this, I mean when they say "1," they put their bad foot forward, when they say "2," they put their good foot forward, bad foot again for "3," but on "foot," they put their good foot forward in one of four ways: they elevate their heel while keeping their toes down (known as "down"), they elevate their toes while keeping their heel down (known as "up"), they rotate their foot 90 degrees clockwise ("in" if you are a lefty, "out" if you are a righty), or they rotate their foot 90 degrees counter-clockwise ("out" if you are a lefty, "in" if you are a righty).

Next, the players compare what they have each done.  If there is exactly one group of people who did the same thing with their feet, and that group is the same size as the number of winners specified, this group is the winner(s).  That last sentence may have been confusing, so here's an example.  Six players assemble to determine two winners.  One guy puts his foot out, no one puts their feet in, two guys put their feet up, and three guys put their feet down.  The two guys who put their feet up are the winners.

Many times, the match will result in what is known as a "foot".  A foot occurs when no winner is determined.  Consider the previous example.  If instead one guy puts his foot out, one guy puts his foot in, two guys put their feet up, and two guys put their feet down, there were two sets of two.  These guys tied for the win, so that round was a foot.  A foot also occurs when there are no groups of the correct size.  Consider the six-player, two-winner scenario once more.  If three people put their feet up, three people put their feet down, and no one put their feet in or out, there are no groups of size two.  They have achieved a foot.  Achieving a foot may be more good than bad, though.
When a foot occurs, players get to play another round.  What a treat!  This is where things get complicated, though.  Each time there is another round, players switch between counting to three and counting to four.  So if there is a foot in round one, play another round and count to four.  In a count-to-four round, players must remember to stick their good foot forward on "1", so that their good foot will be forward on "foot".
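
The winner rule is easy to get wrong, so here is a minimal sketch of it in R. The function name judgeRound and the move labels are chosen for illustration; it returns the positions of the winners, or an empty vector when the round is a foot.

```r
# Hypothetical helper: decide one round of 1 2 3 Foot
judgeRound <- function(moves, winners = 1) {
  counts <- table(factor(moves, levels = c("up", "down", "in", "out")))
  hit <- names(counts)[counts == winners]   # moves chosen by exactly `winners` players
  # winners exist only if exactly one move has a group of the right size
  if (length(hit) == 1) which(moves == hit) else integer(0)
}

# The three six-player, two-winner examples from the rules
judgeRound(c("out", "up", "up", "down", "down", "down"), winners = 2)  # players 2 and 3
judgeRound(c("out", "in", "up", "up", "down", "down"), winners = 2)    # foot: two groups of two
judgeRound(c("up", "up", "up", "down", "down", "down"), winners = 2)   # foot: no group of two
```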

how often do foots occur:
I made an R function which simulates 1 2 3 Foot games to find probabilities of achieving a foot.
Here's the code:

one23footFootProb=function(players,sim,winners=1){
foot=0;
for(i in 1:sim){
vec=sample(1:4,players,replace=T);
vec2=numeric(4);
for(j in 1:4){
vec2[j]=sum(vec==j)}
if(!sum(vec2==winners)==1){foot=foot+1}}
return(foot/sim)
}
"players" is the number of players, "sim" is the number of simulations and "winners" is the number of winners. The function returns the proportion of times that a foot occurred in the simulation.  Here's an example:

> one23footFootProb(5,10000,2)
[1] 0.6439
In a 5-player match to determine 2 winners, there was a foot in 64.39% of the rounds.  A match ends on the first round that is not a foot, which happens with probability 1 - 0.6439 each round, so the average match lasted about 2.81 rounds.

> 1/(1-.6439)
[1] 2.8082
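
For small games the foot probability can also be computed exactly by listing every combination of moves, assuming (as the simulation does) that each of the four moves is equally likely. This is a sketch with a made-up function name, footProbExact. If rounds are independent, the expected number of rounds is 1/(1 - P(foot)).

```r
# Hypothetical helper: exact chance of a foot by enumerating all 4^players outcomes
footProbExact <- function(players, winners = 1) {
  grid <- as.matrix(expand.grid(rep(list(1:4), players)))
  isFoot <- apply(grid, 1, function(v) {
    sum(tabulate(v, nbins = 4) == winners) != 1   # not exactly one group of the right size
  })
  mean(isFoot)
}
pFoot <- footProbExact(5, 2)
pFoot            # 664/1024
1 / (1 - pFoot)  # expected rounds per match
```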

Here are 10000 trials for matches between one and thirty players to determine one winner.

> foots=numeric(30)
> for(i in 1:30){
+ foots[i]=one23footFootProb(i,10000)}
> foots
 [1] 0.0000 1.0000 0.4417 0.8231 0.5908 0.6310 0.5862 0.5377 0.4925 0.5023
[11] 0.5363 0.5924 0.6364 0.7012 0.7447 0.8009 0.8348 0.8686 0.8986 0.9182
[21] 0.9378 0.9494 0.9587 0.9693 0.9753 0.9826 0.9849 0.9869 0.9882 0.9938
The chance of a foot bounces around for small groups, then rises steadily once there are more than about ten players.  What's interesting is how often foots occur in four-player matches.  That's because a winner is determined only when three people do one thing and one person does another.

Note that this simulation assumes that each player will do each foot move with equal probability in each round.  This is not a correct assumption.
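
To relax that assumption, the simulation could take a vector of move probabilities. The version below is a sketch: the name one23footFootProbW and the prob argument are additions for illustration, and the probabilities in the example are invented, not measured.

```r
# Hypothetical variant: foot probability with unequal move probabilities
one23footFootProbW <- function(players, sim, winners = 1, prob = rep(0.25, 4)) {
  foot <- 0
  for (i in 1:sim) {
    vec <- sample(1:4, players, replace = TRUE, prob = prob)
    counts <- tabulate(vec, nbins = 4)          # how many players chose each move
    if (sum(counts == winners) != 1) foot <- foot + 1
  }
  foot / sim
}
# Made-up example: players favor the first two moves
one23footFootProbW(5, 10000, winners = 2, prob = c(0.4, 0.3, 0.2, 0.1))
```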

Monday, May 16, 2011

Risk Fighting

Have you ever wondered what the chance of each outcome in a Risk battle is?  Here you go.  I wrote R code to sample as many battles as you want, including alternate dice combinations like those in Risk Factions or Risk 2210.  This only applies to battles where each side can lose two guys.


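riskSim=function(att,def,samp){
# attLostFreq[k] counts battles in which the attacker lost k-1 guys
attLostFreq=numeric(3);
for(i in 1:samp){
# roll one die per entry; sample(n,1) draws from 1:n
attRolls=numeric(length(att));
for(j in 1:length(att)){
attRolls[j]=sample(att[j],1)}
defRolls=numeric(length(def));
for(j in 1:length(def)){
defRolls[j]=sample(def[j],1)}
# compare the two highest dice on each side; ties go to the defense
attWins=sum((sort(attRolls,decreasing=T)[1:2])>(sort(defRolls,decreasing=T)[1:2]));
attLostFreq[3-attWins]=attLostFreq[3-attWins]+1}
# last entry is the attack's winning percentage, with a 1-1 split counted as half a win
finalVec=c(attLostFreq/samp,(attLostFreq[1]+attLostFreq[2]/2)/samp);
return(finalVec)}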

You enter the attack dice for att and the defense dice as def.  Here's the syntax - c(8,6,6) means you have one eight-sided die and two six-sided dice.  Samp is the number of trials.  

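I did 10000 trials for every combination in Risk, Risk Factions, and Risk 2210.  Note that even though that seems like a large sample, there can still be some error: a proportion near 50% estimated from 10000 trials has a standard error of about half a percentage point.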

Here is an example of the standard matchup:
> riskSim(c(6,6,6),c(6,6),10000)
[1] 0.3712 0.3380 0.2908 0.5402
What this means is that the attack has three six-sided dice, the defense has two six-sided dice, and there were 10000 trials.  The results say that in those trials, 37.12% of the time, the defense lost 2, 33.80% of the time, each side lost 1, and 29.08% of the time, the attack lost 2.  In total, the attack had a 54.02 winning percentage.
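
Because the number of dice is small, the same probabilities can be computed exactly by enumerating every possible roll instead of sampling. This is a sketch of that alternative, not the code used for the tables; riskExact is a made-up name and it returns values in the same order as riskSim.

```r
# Hypothetical helper: exact battle probabilities by enumerating all rolls
riskExact <- function(att, def) {
  rolls <- as.matrix(expand.grid(lapply(c(att, def), seq_len)))
  nAtt <- length(att)
  attWins <- apply(rolls, 1, function(r) {
    a <- sort(r[1:nAtt], decreasing = TRUE)[1:2]     # attack's two best dice
    d <- sort(r[-(1:nAtt)], decreasing = TRUE)[1:2]  # defense's two best dice
    sum(a > d)                                       # ties go to the defense
  })
  out <- tabulate(3 - attWins, nbins = 3) / nrow(rolls)
  c(out, out[1] + out[2] / 2)
}
standard <- riskExact(c(6, 6, 6), c(6, 6))
standard   # 2890/7776, 2611/7776, 2275/7776, then the attack's winning percentage
```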



Risk 2210:
> riskSim(c(6,6,6),c(6,6),10000)
[1] 0.3716 0.3316 0.2968 0.5374
> riskSim(c(8,6,6),c(6,6),10000)
[1] 0.4745 0.3086 0.2169 0.6288
> riskSim(c(8,8,6),c(6,6),10000)
[1] 0.55440 0.27590 0.16970 0.69235
> riskSim(c(8,8,8),c(6,6),10000)
[1] 0.63380 0.24050 0.12570 0.75405
> riskSim(c(6,6,6),c(8,6),10000)
[1] 0.2707 0.3610 0.3683 0.4512
> riskSim(c(8,6,6),c(8,6),10000)
[1] 0.35520 0.34730 0.29750 0.52885
> riskSim(c(8,8,6),c(8,6),10000)
[1] 0.4525 0.3210 0.2265 0.6130
> riskSim(c(8,8,8),c(8,6),10000)
[1] 0.5112 0.2948 0.1940 0.6586
> riskSim(c(6,6,6),c(8,8),10000)
[1] 0.21300 0.34070 0.44630 0.38335
> riskSim(c(8,6,6),c(8,8),10000)
[1] 0.27880 0.33810 0.38310 0.44785
> riskSim(c(8,8,6),c(8,8),10000)
[1] 0.3481 0.3346 0.3173 0.5154
> riskSim(c(8,8,8),c(8,8),10000)
[1] 0.4066 0.3258 0.2676 0.5695

Risk Factions:
> riskSim(c(6,6,6),c(6,6),10000)
[1] 0.37650 0.33770 0.28580 0.54535
> riskSim(c(6,6,6,6),c(6,6),10000)
[1] 0.4620 0.3288 0.2092 0.6264
> riskSim(c(6,6,6),c(6,6,6),10000)
[1] 0.23220 0.29890 0.46890 0.38165
> riskSim(c(6,6,6,6),c(6,6,6),10000)
[1] 0.3136 0.3062 0.3802 0.4667

Here are some things to note.  Improving dice helps defense more, especially in Factions.  Also, look at how likely the 2-0 and 0-2 sweeps are.  Consider the standard Risk Factions battle.  Attack won 54.535% of the time.  For two independent battles with attack winning 54.535% of the time, the distribution of series results is: 
2-0
> .54535*.54535
[1] 0.2974066
1-1
> .54535*(1-.54535)*2
[1] 0.4958868
0-2
> (1-.54535)*(1-.54535)
[1] 0.2067066
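That's a 49.58868% chance of a 1-1 split for independent battles, compared to 33.77% in Risk, so both sweeps are much more likely in Risk than independence would suggest.  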

The Basics of Sports Betting

First things first, I'll explain what the numbers mean when you visit a sports betting website.  There are three different ways sports betting odds are written.  
American
Here you will see either a plus sign or a minus sign followed by a three digit number.  If there is a plus sign, that means that if you bet one hundred dollars, you will profit whatever is after the plus sign.  For example, +105 means if you bet $100 and you are right, you will profit $105 to have $205.  If there is a minus sign, it means you have to bet whatever is after the minus sign to profit one hundred dollars.  For example, -120 means if you bet $120, you will profit $100 to have $220 total.  No matter what the odds are, if you are wrong, you will lose your bet.
Decimal
This is my favorite form.  Here the decimal means you will have this much if you are right including what you bet if you bet a dollar.  For example, 3.2 means if you bet $1 and you are right, you will profit $2.20 to have $3.20.
Fractional
Here you will see two numbers separated by a slash.  If you bet the second number and are right, you will profit the first number.  For example, 10/11 means that if you bet $11 and are right, you will profit $10 to have $21.
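
Since the rest of the post works in decimal prices, here is a minimal sketch of converting the other two formats to decimal. The function names americanToDec and fractionalToDec are made up for illustration.

```r
# Hypothetical helpers: convert American and fractional odds to decimal odds
americanToDec <- function(american) {
  # +105: profit 105 on a 100 stake; -120: profit 100 on a 120 stake
  ifelse(american > 0, 1 + american / 100, 1 + 100 / abs(american))
}
fractionalToDec <- function(num, den) {
  1 + num / den   # profit num on a stake of den, plus the stake back
}
americanToDec(105)       # 2.05, so $100 returns $205
americanToDec(-120)      # $120 returns $220
fractionalToDec(10, 11)  # $11 returns $21
```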

Each line implies certain odds of an event occurring.  However, sports books try to make a profit, so the odds they offer aren't quite fair.  If the odds they offer on something are off enough that placing a bet has positive expected value, all the rich guys who devote themselves to sports betting would bet huge amounts of money on that side.  There are two ways sports books protect themselves from that.  Way one, get guys who really know their stuff to set the odds.  Way two, offer bad prices on everything.

Let's compare pinnaclesports.com to sportsbook.com.  Pinnacle Sports has a decimal price of 2.68 for the Houston Astros moneyline tomorrow (moneyline means you're betting on them to win rather than betting on them to beat the spread).  Sports Book has a price of 2.65 - you stand to profit less if you are right from betting at Sports Book.  So Sports Book thinks Houston is more likely to win than Pinnacle Sports does, right?  And they're going to give a better price for betting on their opponent Atlanta than Pinnacle Sports does, right?  Wrong!  Pinnacle Sports has a price of 1.559 for Atlanta, while at Sports Book, it's 1.53.  You see, Pinnacle Sports follows way one, while Sports Book follows way two.  You will see especially bad prices at Sports Book or similar sites for futures with a lot of choices, like who is going to win a championship before the season starts.
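
One way to put a number on "bad prices" is the book's margin: add up the break-even percentages for every side of a bet and see how far the total exceeds 1. This is a sketch using the four prices quoted above; overround is a made-up function name.

```r
# Hypothetical helper: how far the break-even percentages exceed 100%
overround <- function(odds) {
  sum(1 / odds) - 1
}
pinnacleMargin <- overround(c(2.68, 1.559))   # Houston, Atlanta at Pinnacle Sports
sportsBookMargin <- overround(c(2.65, 1.53))  # Houston, Atlanta at Sports Book
c(pinnacleMargin, sportsBookMargin)           # roughly 0.015 vs 0.031
```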

Unsure Ursella: Hey Knowshon, before the NBA playoffs started, Sports Book said the Bulls were most likely to win the East, while Pinnacle said the Heat were.  Should I take the average of the odds to determine who is actually most likely?
Knowledgeable Knowshon: No way!  Just look at the Pinnacle odds.  Only doofuses bet at Sports Book since they have bad prices.  There are no hot shots who go there to make significant moves on the lines.

Finally, I'll show you how to find the implied percentages of winning from decimal prices using R.  Here is a function I wrote called oddsFinderDec.  If you put in a vector with the decimal prices, it will return the implied percentages of those events occurring from those prices.

oddsFinderDec=function(vec){
goodBetOdds=1-(vec-1)/vec;
estOdds=goodBetOdds/sum(goodBetOdds);
return(estOdds)}

I'll show you an example.  In tomorrow's Starcraft 2 matches, oGs is playing startale.  oGs's price at Pinnacle is 1.8, while startale's price is 2.07.

In R, I would type:
oddsFinderDec(c(1.8,2.07))
and it would return:
0.5348837 0.4651163
which means oGs has about a 53.5% chance of winning.

Games of Skill and Games of Luck

Welcome to Games of Skill and Games of Luck.  Here I will talk about the statistics of sports, betting, board games, video games, and other games of skill and luck.  Enjoy.