How It Works
Box Box uses a machine learning ensemble trained on F1 race data from 2022 to present. Predictions are regenerated after qualifying each Saturday using the actual grid positions.
2026 introduces new aerodynamic and power unit regulations — essentially a new formula. The model now trains on 2026 race results as they come in, weighted 5× more heavily than 2025 data. Early-season predictions carry higher uncertainty until the new competitive order becomes clear.
Data Sources
Jolpica / Ergast
Historical race results, qualifying positions, sprint race results, driver standings, and constructor standings from 2023 onwards. Used for both training and 2026 race result ingestion.
api.jolpi.ca
OpenF1 API
Real-time session data: practice lap times, sprint fastest laps, live race weather (temperature, rainfall). Used for pace delta features and race-day conditions once the session is live.
openf1.org
Open-Meteo
Hourly weather forecast for the race location and start time. Used pre-race to estimate track temperature and rain probability when the session hasn't started yet. Track temp is derived from air temp and cloud cover.
open-meteo.com
All API responses are cached locally. Qualifying and sprint data refresh every 48 hours; race results are cached permanently. Calendar and standings refresh every 6 hours. Weather forecasts refresh every 2 hours.
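The refresh schedule above can be written as a small time-to-live table. This is a minimal sketch, not the site's cache code: the category names and the `is_stale` helper are hypothetical, and only the intervals come from the text.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Refresh intervals taken from the text; None means "cache permanently".
# The category keys are hypothetical names chosen for this sketch.
CACHE_TTL: dict[str, Optional[timedelta]] = {
    "qualifying": timedelta(hours=48),
    "sprint": timedelta(hours=48),
    "race_results": None,
    "calendar": timedelta(hours=6),
    "standings": timedelta(hours=6),
    "weather_forecast": timedelta(hours=2),
}


def is_stale(category: str, fetched_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True when a cached response has reached its refresh interval."""
    ttl = CACHE_TTL[category]
    if ttl is None:  # permanent entries never expire
        return False
    now = now or datetime.now(timezone.utc)
    return now - fetched_at >= ttl
```

A caller would check `is_stale(...)` before deciding whether to hit the API again or reuse the stored response.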
Feature Engineering
Each row in the training dataset represents one driver in one race, and 23 features are computed per driver. Features with no historical baseline fall back to field averages.
Actual grid position from qualifying. If a driver crashed or was excluded, they're assigned last place.
Driver's best qualifying time as a % behind pole. More meaningful than raw lap time across different circuits.
Average finishing position over the last 5 races. Captures current momentum regardless of which team they're on.
Constructor's average finishing position over recent races, exponentially weighted so the latest races count more.
Driver's best practice lap vs the session fastest, as a percentage, taken from the latest session available (FP3, then FP2, then FP1). On sprint weekends, where there is no FP2 or FP3, sprint race fastest laps are used instead.
Driver's average finish at this circuit with their current team only (removes cross-team era contamination). Retired results are excluded. Exponentially weighted so recent visits count more.
Average positions gained from grid to finish over recent races. Captures race-craft vs pure qualifying pace.
Driver's normalised championship position and points from the prior round. Reflects current-season competitive order.
Win rate and podium rate across all races in the training data. Helps correctly rank drivers who recently changed teams (e.g., Hamilton to Ferrari).
Team's normalised constructor standing. Independent signal for overall car performance.
Driver's average finish in their most recent complete season. Reflects current car, not career average.
Whether it is raining (0/1) and the track temperature at race start. Source priority: live OpenF1 session → Open-Meteo forecast (air temp + cloud cover estimate) → same-weekend session → default. Both conditions significantly change tyre behaviour and overtaking rates.
Constructor's DNF rate this season. DNS and DSQ events are excluded from both numerator and denominator — only actual race starts count toward reliability.
Finishing position in the sprint race. Only meaningful on sprint weekends — the has_sprint flag tells the model when this feature contains real data vs. imputed noise.
Sprint finishing position minus qualifying position. Negative = gained positions, so a driver with strong race pace relative to their quali pace scores negative here. One of the top-5 most important features on sprint weekends.
Driver's best sprint lap vs session fastest, as a percentage. Captures outright race pace independent of grid position.
Binary 0/1 flag indicating whether the current weekend has a sprint race. Without this, the model can't distinguish real sprint data from median-imputed values, making sprint features noisy.
Driver's DNF rate over last 10 races. Penalises error-prone or crash-prone drivers.
Average positions gained from grid to finish specifically at this circuit, with current team only. Unlike the general positions-gained average, this captures circuit-specific race-craft — e.g. a driver who always holds position at Monaco vs one who charges through the field at Spa.
How easy it is to overtake at this circuit: higher = more overtaking opportunities (high-speed circuits like Spa/Monza score 0.8), lower = grid position is destiny (Monaco scores 0.05, Singapore 0.10). Per-circuit overrides take priority over the circuit-type default.
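A few of the features described above reduce to short formulas. The sketch below shows one plausible way to compute them; the function names, the `decay` constant, and the status labels are assumptions for illustration, not the site's actual code.

```python
from typing import Optional, Sequence


def quali_gap_pct(best_lap_s: float, pole_lap_s: float) -> float:
    """Qualifying gap to pole as a percentage of the pole time."""
    return (best_lap_s - pole_lap_s) / pole_lap_s * 100.0


def ewm_mean(values: Sequence[float], decay: float = 0.7) -> Optional[float]:
    """Exponentially weighted mean; values are ordered oldest to newest.

    `decay` is a hypothetical constant: each step back in time multiplies
    the weight by `decay`, so the latest race counts most.
    """
    if not values:
        return None  # caller would fall back to the field average
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def dnf_rate(statuses: Sequence[str]) -> Optional[float]:
    """Share of actual race starts that ended in a DNF.

    DNS and DSQ entries are dropped from numerator and denominator.
    The status labels are assumed for this sketch.
    """
    starts = [s for s in statuses if s not in ("DNS", "DSQ")]
    if not starts:
        return None
    return sum(s == "DNF" for s in starts) / len(starts)


def sprint_quali_delta(sprint_pos: int, quali_pos: int) -> int:
    """Sprint finish minus qualifying position; negative means places gained."""
    return sprint_pos - quali_pos
```

For example, a driver who qualifies P7 and finishes the sprint P3 gets `sprint_quali_delta(3, 7) == -4`, matching the "negative = gained positions" convention.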
Model Architecture
XGBoost Win
Binary classifier (position = 1). Uses scale_pos_weight to handle class imbalance — only 1 in 20 drivers wins per race.
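To make the imbalance concrete: with one winner per 20-car grid, the negative-to-positive ratio is 19, which is the usual starting value for `scale_pos_weight`. The helper below is a hypothetical sketch of that calculation; the value actually used by the model is not stated in the text.

```python
from typing import Sequence


def compute_scale_pos_weight(labels: Sequence[int]) -> float:
    """Ratio of negative to positive examples, a common heuristic for scale_pos_weight."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("need at least one positive example")
    return negatives / positives


# Three hypothetical 20-car races: one winner (1) and nineteen non-winners (0) each.
labels = ([1] + [0] * 19) * 3
weight = compute_scale_pos_weight(labels)  # 57 negatives / 3 positives
```

The result would then be passed to the classifier, for example as `xgboost.XGBClassifier(scale_pos_weight=weight)`.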
LightGBM Podium
Gradient boosting with is_unbalance=True. Trained independently from the win model. Output is normalised so all drivers' podium probabilities sum to 3.0.
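One simple way to make podium probabilities sum to 3.0 is proportional rescaling, sketched below. This is an assumption about the normalisation step, and the driver codes are placeholders. Note that plain rescaling can push a heavy favourite above 1.0, so a real implementation may also cap values.

```python
def normalise_podium(raw: dict[str, float], total: float = 3.0) -> dict[str, float]:
    """Rescale raw podium scores proportionally so they sum to `total`."""
    score_sum = sum(raw.values())
    if score_sum <= 0:
        raise ValueError("raw scores must have a positive sum")
    return {driver: p * total / score_sum for driver, p in raw.items()}


# Hypothetical raw classifier outputs for a five-driver toy field.
raw_scores = {"A": 0.5, "B": 0.4, "C": 0.3, "D": 0.2, "E": 0.1}
podium_probs = normalise_podium(raw_scores)
```

Rescaling keeps the ranking of drivers unchanged; only the scale moves so that the expected number of podium finishers is three.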
XGBoost Position
Regression model for full grid ordering. Resolves ties in probability output and determines predicted finishing rank.
Training: TimeSeriesSplit cross-validation (4 folds) ensures the model is never trained on future race data — no leakage. Season weights: 2026 races are weighted 5× more than 2025, which is weighted 2× more than 2024. Trained on ~1,500 driver-race rows across 2022–2026. Models are retrained as 2026 results accumulate.
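The season weighting and the time-ordered validation can be sketched as follows. The relative weights for 2024 to 2026 follow from the text (2 × 5 = 10 for 2026 relative to 2024); the weights for 2022 and 2023 are not stated, so 1.0 is an assumption. The split function is a hand-written stand-in that mirrors the expanding-window scheme of scikit-learn's `TimeSeriesSplit`, applied to whole races so that no race is divided between train and test.

```python
from typing import Iterator, Sequence

# 2025 = 2x 2024 and 2026 = 5x 2025, per the text.
# Weights for 2022 and 2023 are NOT stated; 1.0 is an assumption.
SEASON_WEIGHT = {2022: 1.0, 2023: 1.0, 2024: 1.0, 2025: 2.0, 2026: 10.0}


def sample_weights(seasons: Sequence[int]) -> list[float]:
    """Map each training row's season to its relative weight."""
    return [SEASON_WEIGHT[s] for s in seasons]


def expanding_race_splits(
    race_ids: Sequence[str], n_splits: int = 4
) -> Iterator[tuple[list[str], list[str]]]:
    """Yield (train_races, test_races); every test race is later than all training races.

    `race_ids` must be in chronological order.
    """
    n = len(race_ids)
    test_size = n // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough races for the requested number of folds")
    for start in range(n - n_splits * test_size, n, test_size):
        yield list(race_ids[:start]), list(race_ids[start:start + test_size])
```

Because each fold trains only on races that precede its test block, later results never inform predictions for earlier races.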
Accuracy
Picking any driver from a 20-car grid at random wins 5% of the time.
Historically, the pole-sitter converts to a race win about 30% of the time.
Measured via TimeSeriesSplit on 60 historical races. The model correctly picks the race winner in roughly 1 in 2 races — ~10× better than random.
Average overlap between predicted top-3 and actual top-3. Measured across the same 60 races.
CV figures are from cross-validation on 2023–2025 data. Live 2026 accuracy is tracked on the home page as each race result comes in.
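The two headline metrics can be computed as below. This is a sketch with hypothetical function names and toy data, not the site's evaluation code; it only shows how a winner hit rate and a top-3 overlap would be defined.

```python
from typing import Sequence


def winner_hit_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of races where the predicted winner actually won."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def mean_top3_overlap(
    predicted: Sequence[Sequence[str]], actual: Sequence[Sequence[str]]
) -> float:
    """Average number of drivers shared by the predicted and actual top three."""
    overlaps = [len(set(p[:3]) & set(a[:3])) for p, a in zip(predicted, actual)]
    return sum(overlaps) / len(overlaps)


# Toy data for two races (driver codes are placeholders).
hit_rate = winner_hit_rate(["A", "D"], ["A", "A"])
overlap = mean_top3_overlap(
    [["A", "B", "C"], ["D", "E", "F"]],
    [["A", "C", "E"], ["A", "B", "C"]],
)
# Lift over picking at random from a 20-car grid (baseline 1/20).
lift_vs_random = hit_rate * 20
```

With a hit rate of 0.5, the lift over the 5% random baseline is 10, which is where the "~10× better than random" figure comes from.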
Known Limitations
The model trains on 2022–2026 race results (~1,500 driver-race rows across 4+ seasons). This is a small dataset for ML — confidence intervals are wide, and the model is most reliable when qualifying position already tells a clear story.
New aerodynamic and power unit regulations mean 2026 cars behave differently from anything in the 2023–2025 training data. The model now incorporates 2026 results as they arrive (weighted 5× more than 2025), but the first few races carry high uncertainty until the new competitive order stabilises.
Hamilton moved to Ferrari, Antonelli replaced him at Mercedes, Cadillac is brand new. The model relies on career stats and constructor standings when team-specific history is thin — some drivers are inherently more uncertain than others.
The model cannot predict Q1 crashes, rear-axle failures, or race-day retirements. Verstappen starting P20 in Australia after a Q1 crash is a perfect example — no historical feature could foresee that outcome.
There are only ~6 sprint weekends per season, so sprint features (position, pace delta, quali delta) have less training signal than their real-world value warrants. On sprint weekends the model carries higher uncertainty for drivers whose form diverges significantly between qualifying and sprint.
Starting tyre compound, undercut windows, safety car timing, and pit strategy calls are major determinants of race outcomes, but they aren't available pre-race and aren't modelled.
If qualifying data isn't available from the API yet (for example, earlier in the weekend before qualifying has run), the model uses a driver's historical average grid position. This is less accurate than actual qualifying results and can significantly skew predictions.
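The fallback order can be sketched as a small helper. The function name and signature are hypothetical; the last step (field average when a driver has no history) follows the general fallback rule stated in the Feature Engineering section.

```python
from statistics import mean
from typing import Optional, Sequence


def grid_position_feature(
    qualifying_grid: Optional[int],
    past_grids: Sequence[int],
    field_average: float,
) -> float:
    """Pick the grid-position input, preferring real qualifying data."""
    if qualifying_grid is not None:
        return float(qualifying_grid)
    if past_grids:
        return float(mean(past_grids))  # historical average grid position
    return field_average  # no history at all, e.g. a rookie
```

Once real qualifying results arrive, the first branch takes over and the historical estimate is no longer used.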