...Built a dataset by scraping Wikipedia

Kentucky Derby data

The data falls into three categories: general info, the jockey, and the horse

Possible analyses
Time Odds Starts, WPS, WPS ratio Superfecta payout (jockey-specific) No. of derby wins
Year, grouped by decades x x x Avg
Repeated Names (Jockey, trainer, owner) i.e. the trainer's avg time x i.e. the avg winning payout when a certain jockey is riding x
Track condition x x
State horse was bred in x WPS ratio Avg
Sex of racehorse x x x x
Triple Crown status x x i.e. when TRUE, what is the avg WPS ratio? x
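
Most of the analyses above share one shape: group the races by some variable (decade, track condition, Triple Crown status) and average a metric (time, WPS ratio) within each group. A minimal sketch of that pattern follows, using only the Python standard library. The field names (year, time_sec, triple_crown, wps_ratio) and the sample values are hypothetical placeholders, not the scraped dataset's real columns or numbers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: field names and values are illustrative
# placeholders, NOT taken from the scraped Kentucky Derby dataset.
races = [
    {"year": 1991, "time_sec": 123.0, "triple_crown": False, "wps_ratio": 0.50},
    {"year": 1995, "time_sec": 121.0, "triple_crown": True, "wps_ratio": 0.75},
    {"year": 2003, "time_sec": 121.2, "triple_crown": False, "wps_ratio": 0.25},
    {"year": 2008, "time_sec": 121.8, "triple_crown": True, "wps_ratio": 1.00},
]


def avg_by(rows, group_fn, value_key):
    """Group rows by group_fn(row) and average value_key within each group."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[group_fn(row)].append(row[value_key])
    # Sort by group key so the result has a stable, readable order.
    return {group: mean(values) for group, values in sorted(buckets.items())}


def decade_of(row):
    """Map a row's year to the first year of its decade, e.g. 1995 -> 1990."""
    return (row["year"] // 10) * 10


# "Year, grouped by decades" vs. "Time"
time_by_decade = avg_by(races, decade_of, "time_sec")

# "Triple Crown status" vs. "WPS ratio" (when TRUE, what is the avg?)
wps_by_triple_crown = avg_by(races, lambda row: row["triple_crown"], "wps_ratio")
```

The same helper should cover the other grouping variables (track condition, sex, state bred in) by swapping the grouping function, assuming each row carries the corresponding field.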

Other research questions: