Introduction
Pick N combination pools are popular within the racing & sports betting spheres across the world. They’re frequently populated with ‘outside money’, creating carryover pools/jackpots, with takeout percentages miles below standard numbers [takeouts as high as 30-40 % are fairly standard for non-carryovers], & in some special cases even generating negative vigs. Due to the excellent liquidity offered by these pools, they naturally become great targets for the high-scale, high-volume informed bettor.
Publicly available tools
In a standard Pick N pool, the following tools/information are available/transmitted to the public/bettors:
- Turnover figures.
- Pool size/carryover numbers.
- Race-level stake distributions.
- Batch betting functionality.
If complemented with,
- An abstract, internal set of ‘true’ race-level probability distributions,
a sweet opportunity, at least in theory, to stake [a subset of] the underbet combinations arises.
Limitations
However, to make things more complex there are usually factors that [assuming you’re in possession of perfect fair odds distributions, which in themselves are quite demanding to come up with] increase the difficulty of taking ‘proper’ advantage of the pools:
- Combination-level stake distributions/odds not transmitted to the public, rather hidden internally with the pool provider [PMU, TAB, HKJC, etc.].
- Batch betting and/or staking limitations. Rules governing what’s allowed & what’s not, generally meant to restrict sharper high-volume bettors from ‘cleaning up completely’.
Approaching those limitations
The second one is trivial to handle. You simply adhere to the relevant rules, and/or, if elite, lighten them up by ‘qualifying for special treatments’ from the parimutuel provider/the partner you’re routing your action through.
The first limitation, however, the lack of combination-level granularity, is more complicated & requires the construction of a [mental?] model for how to move from what’s publicly available [race-level stake distributions] to what’s of actual interest [combination-level stake distributions]. A given set of combination-level stake distributions maps uniquely to a specific set of race-level distributions, but the reverse does not hold true [a single set of race-level distributions can be explained by many different combination-level ones], hence the issue.
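To make that non-uniqueness concrete, here’s a minimal toy sketch [Python, a hypothetical two-leg pool with two runners per leg, all numbers made up] in which two very different combination-level stake distributions collapse into identical race-level figures:

```python
# Toy two-leg pool: two combination-level stake distributions (shares of the total
# pool) that produce the exact same race-level distributions.
pool_a = {("A1", "B1"): 0.25, ("A1", "B2"): 0.25,
          ("A2", "B1"): 0.25, ("A2", "B2"): 0.25}
pool_b = {("A1", "B1"): 0.40, ("A1", "B2"): 0.10,
          ("A2", "B1"): 0.10, ("A2", "B2"): 0.40}

def race_level_distributions(pool):
    """Collapse combination-level shares into per-leg (race-level) stake distributions."""
    legs = [{} for _ in next(iter(pool))]
    for combo, share in pool.items():
        for leg, runner in enumerate(combo):
            legs[leg][runner] = legs[leg].get(runner, 0.0) + share
    return legs

print(race_level_distributions(pool_a))  # [{'A1': 0.5, 'A2': 0.5}, {'B1': 0.5, 'B2': 0.5}]
print(race_level_distributions(pool_b))  # identical race-level picture, very different pool
```

Same race-level numbers, yet the combinations in pool_b carry very different shares [& hence payoffs] than those in pool_a → the public figures alone can’t settle the question.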
A standard, yet kind of naive way of handling this is to assume that the combination-level probabilities can be well approximated by a simple multiplication of the race-level probabilities. I.e. if a specific combination consists of 6 horses all wagered at 20 % of total stakes, this naive computation would yield an expected payoff of 1/0.2^6 = 15625. As you’ll swiftly note in practice, this kind of estimation is frequently *off by quite a margin*, though on average [probably?] correct [many such cases in betting]. The key question becomes: is there an order to the chaos/does there exist a way of accurately modelling/predicting the direction & magnitude of this deviation?
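For reference, the naive computation is nothing more than a product over the race-level shares [minimal sketch, ignoring takeout/breakage]:

```python
from math import prod

def naive_combination_share(race_level_shares):
    """Naive estimate of a combination's pool share: the product of its winners' race-level shares."""
    return prod(race_level_shares)

# The worked example above: 6 legs, each winner carrying 20 % of its leg's stakes.
share = naive_combination_share([0.20] * 6)
print(1.0 / share)  # ≈ 15625, the naive expected payoff per unit staked
```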
Representation
Let’s define a discrete function C that takes [a vector of] the N winners [winner of each of the N races] as input, computes what share of the total pool has been wagered on the specified combination, & returns that share.
As gamblers we don’t get to see C a priori [& after the fact we’re generally limited to observing it only for the winning combination [or well, sometimes for a slightly larger subset, but rarely for the full universe of all possible combinations]], hence the interest in developing an estimation routine that predicts it as well as possible [parimutuel is PvP → no need to arrive at the truth, just make sure to be closer to it than your counterparties are].
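For concreteness, a toy representation of C [assuming, counterfactually, that we held the full combination-level stake data] could look something like this:

```python
def make_C(stakes_by_combination):
    """Build C from hypothetical full ticket data: winner-vector -> share of the total pool."""
    total = sum(stakes_by_combination.values())
    def C(winners):
        return stakes_by_combination.get(tuple(winners), 0.0) / total
    return C

# Toy two-leg pool, 1000 units staked in total.
C = make_C({("A1", "B1"): 300.0, ("A1", "B2"): 100.0, ("A2", "B1"): 600.0})
print(C(("A2", "B1")))  # 0.6 -> this combination holds 60 % of the pool
```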
To predict C, we develop a separate routine, X∗, whose sole goal is to return as good an approximation to C [the true combination-level stake shares, & thereby payoffs] as possible. In order to ‘perfect’ this approximation, we use whatever [relevant] parameters are available to us at the time of estimation. This includes but is not limited to measures such as the race-level stake distributions [including changes/deltas in those], any internal/proprietary win probabilities, publicly available information regarding the horses/selections & public/widely known tipster/model figures.
Mapping all of the above information into a reasonable decision framework can, as should be expected, be both difficult & performed in a multitude of ways [your imagination is the only limitation here]. One such route is via the development of a ‘similarity measure’ whose purpose is to gauge the degree of similarity/correlation between the different winners in any provided combination/Pick N-vector. The thinking goes: the larger the similarity, the higher the likelihood of the combination being *overbet* relative to the naive race-level probabilities estimation.
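Purely to illustrate that structure [the ‘favourite-flag’ similarity & the sensitivity parameter k below are hypothetical stand-ins, not the actual measure], one could imagine scaling the naive product estimate by some similarity score:

```python
from math import prod

def toy_similarity(winner_features):
    """Hypothetical stand-in for a similarity score in [0, 1]: share of winners flagged as public favourites."""
    flags = [f["is_favourite"] for f in winner_features]
    return sum(flags) / len(flags)

def adjusted_combination_share(race_level_shares, winner_features, k=0.5):
    """Naive product estimate, scaled up with similarity: 'alike' combinations assumed to be overbet."""
    return prod(race_level_shares) * (1.0 + k * toy_similarity(winner_features))

# Three-leg combination, all public favourites, each at 30 % of its leg's stakes.
features = [{"is_favourite": True}] * 3
print(adjusted_combination_share([0.3, 0.3, 0.3], features))  # 0.027 * 1.5 ≈ 0.0405
```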
A similarity measure
Section removed by LL OPSEC dept.
Effects on parallel/standard markets
Significant carryover pools can have quite interesting effects on the remainder of the marketplace.
- Clear incentives to psyop early markets if, on a relative basis, cheap enough to do so [throws off counterparties that are guided by those numbers]. Even non-psyop-maxis are clever enough to understand that they’re heavily *disincentivized* from providing early markets with any kind of insightful information [assuming they’re pushing significant volume into the jackpot pools].
Effect: Inefficient [fixed odds] early markets [& yes, you can ‘potentially’ use the late steam in those carryovers to pick off parallel market offers].
- Extra effort is put into the study of races included in ‘carryover sets’ → this information propagates into the closing numbers in exchange, parimutuel & fixed odds markets.
Effect: Sharper ‘standard markets’ as soon as the large, attractive pools are closed for betting.
Until next time…