I work in the industry, can confirm, the tactics used to dissuade people for agg...

phillc73 · on Sept 3, 2019

You've hit the nail on the head. It's all about vested interests.

I was involved for a number of years with a UK-based horse racing ratings service (handicapping if in the US). This service used to license their base data from the Press Association[1] and then run algorithms on top to produce the ratings.

There are certain things I can't say due to NDAs which are probably still in effect, but the cost of licensing this basic data was in excess of £10k per annum. So, unless you were a serious bettor or were looking to operate a service of some kind, it's beyond the pocket of most individuals.

Timeform in the UK also license some of their own proprietary data, via an API[2]. They've published some pricing on their website and you're looking at between £6k and £12k per year. That is just to access, via the API, the same data which is available on their website for a subscription fee of £75 per month.

There's even a specific UK organisation which apparently has permission from the British Horseracing Authority to officially license key racing data. This is who sells the data to bookmakers, form guides, racing newspapers etc. They have a rate card published on their website.[3] Private, pro-punter? £8.5k per year please.

It's a bit of a rort really. Most of the data is "freely" available online or in the racing press, but if you want to access it in any usable format, either build a scraper (good luck with staying on top of the website changes) or pay a stack to access things programmatically.

[1] https://pa.media/racing-betting/horseracing/

[2] https://www.timeform.com/commercial/products/api

[3] http://www.racecoursedatacompany.com/

listenallyall · on Sept 3, 2019

As you stated, the vast majority of racing data is collected, measured and entered by hand, by people who are paid to perform this job. It costs enormous amounts of money to employ all these people to watch every race in meticulous detail and gather all the data required to publish the Daily Racing Form. Why would you expect them NOT to protect this proprietary, valuable information?

Almost all tracks publish result charts online for free along with race videos. If you want free, why not compile the data yourself? How long would DRF or Equibase exist if people could access their data for free?

primeradical · on Sept 3, 2019

The DRF relies on Equibase data for program and scratch data for all US and most international tracks. Even Churchill Downs relies on data agreements with Equibase to provide up-to-date information to feed to Totes. Result chart information is also almost exclusively Equibase data, at least in the US. They make closed-door deals with tracks, ADWs and Totes to provide data feeds.

Also, it's important to make the distinction between editorial content (analysis, predictions, subjective descriptions of a horse or jockey performance) and empirical information (horse weights, medication, surface conditions, weather, placements, jockey-horse combo win-rates, etc).

The DRF sells its speed ratings as well as analysis of pedigree and past performances. There's value in that and it definitely justifies the cost of their publication and the other publications that perform similar work.

The critical issue with your stance is that users have no options to aggregate their own data easily. The free PPs Equibase offers have been scraped before, and I know of several specific instances where the creators of those scrapers were sent cease-and-desist letters for collecting the information Equibase otherwise provides for free. Notices were even sent to GitHub to remove the repository that contained the code.

I'm not advocating scraping (please don't scrape sites like that) but there isn't any industry interest in providing modern, consumable data. Wouldn't it be in Equibase's best interest to put that information behind an API and sell access to the public? The industry actively discourages using publicly available data.

phillc73 · on Sept 3, 2019

Charging a lot for the data is self-defeating. In order for the sport to grow, more people need to be interested in the sport. One measure of interest is betting turnover, and a proportion of betting turnover is usually used to fund the industry. In order to increase betting turnover, one strategy could be to make the data free and easily accessible in an automated, machine-readable form.

I really do not care about the likes of DRF or Equibase and how long they will or won't exist. I think it is incumbent upon the industry itself to ensure this data is available free and easily accessible. Look at Hong Kong as the alpha example. Loads of free data, huge betting turnover, well funded industry.

listenallyall · on Sept 3, 2019

You may not care about DRF, but it is the sole source for a typical horseplayer to get reliable information about the horses, without which, these players would have zero guidance and likely abandon the sport.

DRF makes racing data easily accessible. If it were left to the tracks, which are independent entities (unlike NFL/NBA/MLB), a horseplayer would have to compile past performances from dozens of sources. The fields of a single day's race card may have run at 30 or more individual venues, in aggregate. Even if that data were free (well, the result charts and replay videos are already free, so technically this is already possible) it would take a ton of work to assemble it all in a digestible format -- which the DRF does for 6 bucks.

I don't believe HK offers free data that is not available from American tracks. There is no API, and the result charts are less detailed than those of American tracks. If info was so freely available to everyone, how would someone like Bill Benter gain such a huge advantage? Why wouldn't he replicate his methods in the US? Probably because the US makes MORE data available.

JoeAltmaier · on Sept 3, 2019

It's a business, around a game. I guess they can do whatever they want with it.