Win Rate by Year
Matches by Surface
Ranking Progression
Cumulative Wins vs Losses
Monthly Win Rate (%)
Win Rate by Round
Win Rate by Surface per Year
Surface Win % Summary
Serve Stats by Surface
Break Points by Surface
H2H Match History
Most Played Opponents (Top 20)
Performance by Tournament (Bubble = Matches Played)
Best Tournament Runs
Win Rate by Tournament Level
Titles & Finals
Aces & Double Faults per Match
First Serve % In
Break Points Faced vs Saved
Break Points Converted
Full Match Log
Deep Analytics
Advanced statistical metrics across all analysis dimensions
Select an entity above to compute all statistical metrics for that dimension. All sidebar filters (year, surface, round) apply.
>> Summary Statistics
>> Central Tendency & Spread
Distribution (Histogram + Density)
Central Tendency Metrics
>> Spread, Shape & Quartiles
Box Plot (Quartiles + Outliers)
Quartile & Spread Metrics
>> Correlation & Regression (vs Year)
Scatter + Regression Line (vs Year)
Regression & Correlation Metrics
>> Hypothesis Tests
T-Test: Wins vs Losses
Z-Score Analysis
Chi-Squared: Result vs Surface
>> Year-over-Year Statistical Profile
Statistical Metrics by Year
Predictions & Forecasting
Five forecasting models applied to every performance metric -- select an entity and horizon below
All 5 models run simultaneously. Confidence bands shown at 80% and 95%. Sidebar year/surface/round filters apply to training data.
Model Comparison -- All 5 Forecasts
Individual Model Detail
Model 1 -- Linear Trend (OLS Regression)
Model 2 -- ARIMA (Auto)
Model 3 -- Exponential Smoothing (ETS)
Model 4 -- Holt-Winters Trend
Model 5 -- Moving Average + Rolling Forecast
Model Accuracy Metrics
Forecast Summary
Next-Season Forecast Table
User Guide & Statistics Reference
v3.0 -- Dual-source data: Jeff Sackmann GitHub + Tennis Abstract 2025-2026
Live Data Status
Updated automatically every time the dashboard loads or refreshes.
1. Data Sources & How the Dashboard Stays Current
Two Data Sources, Always Fresh
Every time the dashboard opens or refreshes, it automatically fetches the latest data from two complementary sources and merges them together:
| Source | Coverage | What it provides |
|---|---|---|
| Jeff Sackmann / tennis_wta | 2015 - present | All WTA main-draw match results, rankings, serve stats (aces, double faults, break points, serve percentages). Updated throughout the season. |
| Tennis Abstract | 2025 - 2026 | Supplementary match-level stats for recent matches: Dominance Ratio (DR), Ace %, DF %, 1st Serve % In, 1st Serve Won %, 2nd Serve Won %, BP Saved %, and match duration. |
Automatic Update Detection
On every load the dashboard queries the GitHub API to check whether new commits have been pushed to the Sackmann dataset since the last cached match date. The Live Data Status panel above shows the current sync state. The Sackmann data is always re-downloaded fresh -- no manual refresh needed.
How the Sources Are Merged
Sackmann data is used as the primary source for 2015 onwards. Tennis Abstract rows are appended for any matches dated after the latest Sackmann entry (typically the current season's most recent matches). Where both sources overlap, Tennis Abstract percentage-based serve statistics are joined onto Sackmann rows to fill any gaps.
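The merge rule above can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the field names (`date`, `opponent`, `first_in_pct`, `bp_saved_pct`) and the (date, opponent) join key are assumptions made for the example.

```python
from datetime import date

def merge_sources(sackmann, tennis_abstract,
                  pct_fields=("first_in_pct", "bp_saved_pct")):
    """Combine two lists of match dicts (hypothetical field names)."""
    latest = max(m["date"] for m in sackmann)
    # Index Tennis Abstract rows by an assumed (date, opponent) match key.
    ta_by_key = {(m["date"], m["opponent"]): m for m in tennis_abstract}

    merged = []
    for row in sackmann:
        row = dict(row)  # copy so the caller's data is not mutated
        extra = ta_by_key.get((row["date"], row["opponent"]))
        if extra:
            for field in pct_fields:
                # Fill gaps only: keep any value the Sackmann row already has.
                if row.get(field) is None and extra.get(field) is not None:
                    row[field] = extra[field]
        merged.append(row)

    # Append Tennis Abstract matches dated after the latest Sackmann entry.
    merged.extend(dict(m) for m in tennis_abstract if m["date"] > latest)
    return sorted(merged, key=lambda m: m["date"])

# Made-up rows for illustration only.
sackmann_rows = [{"date": date(2025, 1, 10), "opponent": "A", "first_in_pct": None}]
ta_rows = [
    {"date": date(2025, 1, 10), "opponent": "A", "first_in_pct": 61.0},
    {"date": date(2025, 2, 1), "opponent": "B", "first_in_pct": 58.0},
]
combined = merge_sources(sackmann_rows, ta_rows)
```

In this example the overlapping match gains its missing serve percentage and the later match is appended as a new row.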
Sidebar Filters
Three filters apply to every tab simultaneously:
| Filter | What it does |
|---|---|
| Year Range | Slider to include only matches from selected years (2015-2026). |
| Surface | Show only Hard, Clay, Grass, or Carpet matches. Select All for every surface. |
| Round | Restrict to a specific round: R128 / R64 / R32 / R16 / QF / SF / F / RR / Q1 / Q2 / Q3. |
2. Overview Tab
High-level career summary. Six KPI cards: total matches, wins, losses, win rate, best ranking, tournaments.
- Win Rate by Year: % matches won each year. Hover for exact figures.
- Matches by Surface: Donut chart -- proportion of matches per surface.
- Ranking Progression: WTA ranking over time -- axis inverted (lower = better).
3. Win / Loss Trends Tab
- Cumulative W/L: Two growing lines -- widening gap signals sustained strong form.
- Monthly Win Rate: Bar chart by calendar month, coloured red -> green by win rate.
- Win Rate by Round: Declines in later rounds as opponents strengthen.
4. Surface Analysis Tab
Hard = fast (US Open / Australian Open) | Clay = slow, high bounce (Roland Garros) | Grass = fastest, low bounce (Wimbledon)
- Win Rate by Surface per Year: Four coloured lines showing yearly trends per surface.
- Surface Win % Summary: Overall win rate per surface -- check match count in tooltip.
- Serve Stats by Surface: Avg aces and double faults per match on each surface.
- Break Points by Surface: Break points faced vs saved per match on each surface.
5. Head-to-Head Tab
Select any opponent from the dropdown (sorted by most matches) to see the full H2H record.
- H2H Cards: Wins, losses, total matches, win rate vs that opponent.
- Match Timeline: Each dot = one match. Green = win, pink = loss. Hover for details.
- Most Played: Top 20 opponents by career matches faced.
6. Tournament Stats Tab
- Bubble Chart: Each bubble = one tournament. X = matches, Y = win rate, size = matches, colour = surface.
- Best Runs: Tournaments where she won the most matches in a single visit.
- Tournament Level: Grand Slam, Premier Mandatory (WTA 1000), Premier 5 (WTA 500), Level I/II (WTA 250), Finals.
- Titles & Finals: Every final won -- year, tournament, surface, opponent.
7. Serve & Return Stats Tab
Note: serve stats are not available for all matches.
- Aces & Double Faults: Yearly averages per match.
- 1st Serve % In: WTA average is roughly 60-65%. Below 58% = high-risk serving.
- BP Faced vs Saved: Smaller gap = stronger serving under pressure.
- BP Converted: >45% = excellent return game.
8. Match Log Tab
Searchable table of every match. Use the column search boxes to filter by surface, round, opponent, etc. Sidebar filters also apply.
Column Reference
| Column | Description |
|---|---|
| Date | Date the match was played. |
| Tournament | Tournament name. |
| Surface | Hard, Clay, Grass, or Carpet. |
| Round | Stage of tournament (R128 through to F). |
| Result | Win (green) or Loss (red). |
| Opponent | Player faced. |
| Opp Rank | Opponent WTA ranking at time of match. Lower = higher ranked. |
| Score | Match scoreline set by set. |
| Aces | Aces served by Kudermetova. |
| DF | Double faults by Kudermetova. |
9. Deep Analytics Tab
Comprehensive statistical breakdown of any selected performance dimension. Use the Entity dropdown to choose a metric (e.g. Aces, Win/Loss, Opponent Ranking). All sidebar filters apply.
Summary Statistics & Central Tendency
| Metric | What it means |
|---|---|
| N | Number of matches used in the analysis. |
| Mean | Arithmetic average. Most representative when the distribution is roughly symmetric; sensitive to outliers. |
| Median | Middle value when sorted. More robust than mean when outliers are present. |
| Mode | Most frequently occurring value. |
| Std Deviation | Spread around the mean. Higher = more variable results. |
| Variance | Square of the SD. Measures average squared deviation from mean. |
| Std Error | SD / sqrt(N). Precision of the mean estimate. |
| CV (%) | (SD / Mean) x 100. Relative variability -- lower = more consistent. |
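As a rough illustration of the table above, the following sketch computes the same quantities with Python's standard library. The ace counts are made up, and the use of the sample (n - 1) standard deviation is an assumption about how the dashboard calculates SD.

```python
import math
import statistics

def summary_stats(values):
    """Return the central tendency and spread metrics for one numeric series."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator), an assumption
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),  # first value encountered if several tie
        "sd": sd,
        "variance": statistics.variance(values),
        "std_error": sd / math.sqrt(n),
        "cv_pct": sd / mean * 100,
    }

aces = [2, 4, 4, 5, 7, 8]  # made-up per-match ace counts
stats = summary_stats(aces)
```

Here the mean is 5 and the median 4.5, so the mean sits slightly above the median because of the two higher-ace matches.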
Spread, Shape & Quartiles
| Metric | What it means |
|---|---|
| Q1 | 25th percentile -- lower quartile. |
| Q2 | 50th percentile -- median. |
| Q3 | 75th percentile -- upper quartile. |
| IQR | Q3 - Q1. Middle 50% spread. Robust to outliers. |
| Range | Max - Min. Full spread of observed values. |
| Skewness | ~0 = symmetric; positive = right-skewed; negative = left-skewed. |
| Kurtosis | Tail heaviness, reported as excess kurtosis (normal distribution = 0). Positive = heavy tails / more extremes. |
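A minimal sketch of the quartile and shape metrics, using only the standard library. The "inclusive" quantile method and the moment-based skewness and excess-kurtosis formulas are assumptions; the dashboard may use a different convention.

```python
import statistics

def shape_stats(values):
    """Quartiles, spread and shape metrics for one numeric series."""
    # 'inclusive' interpolates between observed values; this choice is an assumption.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "q1": q1, "q2": q2, "q3": q3,
        "iqr": q3 - q1,
        "range": max(values) - min(values),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }

shape = shape_stats([1, 2, 3, 4, 5, 6, 7, 8, 9])  # evenly spread, made-up values
```

For this evenly spread series the skewness is zero (symmetric) and the excess kurtosis is negative (light tails).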
Correlation & Regression (vs Year)
| Metric | What it means |
|---|---|
| R^2 | Proportion of variance explained by year (0-1). >0.7 = strong fit. |
| Adj. R^2 | R^2 penalised for number of predictors -- more conservative. |
| Pearson r | Strength & direction of linear relationship (-1 to +1). Absolute value of r above 0.7 = strong. |
| P-value | P<0.05 = statistically significant trend. *** <0.001, ** <0.01, * <0.05, ns. |
| Trend | [up] Improving or [down] Declining based on regression slope sign. |
Hypothesis Tests
| Test | What it answers |
|---|---|
| T-Test | Is the metric significantly different in wins vs losses? Reports mean difference, t-statistic, p-value, 95% CI. |
| Z-Score | How many SDs above/below the career mean is each year? Bars outside +/-1.96 SD are statistically unusual. |
| Chi-Squared | Does surface significantly affect match outcome? Reports chi^2, degrees of freedom, p-value. |
| Cramer's V | Effect size for Chi-Squared (0-1). <0.1 negligible, 0.1-0.3 small, >0.3 moderate-large. |
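To make the Chi-Squared and Cramer's V rows concrete, here is a small sketch that computes both from a table of win/loss counts per surface. The counts are invented, and the p-value is left out because it needs a chi-squared distribution function that the standard library does not provide.

```python
import math

def chi_squared(table):
    """table: one row per surface, columns = (wins, losses) counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if surface had no effect on the result.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    k = min(len(table), len(table[0])) - 1
    cramers_v = math.sqrt(chi2 / (total * k))
    return chi2, dof, cramers_v

# Made-up counts: two surfaces, (wins, losses) on each.
chi2, dof, v = chi_squared([[30, 20], [20, 30]])
```

With these counts the statistic is 4.0 on 1 degree of freedom and Cramer's V is 0.2, which the table above would class as a small effect.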
Year-over-Year Statistical Profile Table
Computes all key metrics (N, Mean, Median, Mode, SD, Variance, CV%, Q1, Q3, IQR, Min, Max) for each calendar year. Mean and SD columns include embedded colour bars for quick visual comparison. Use this to pinpoint years with unusual statistical behaviour.
Note: analyses use only matches where the selected metric is available. Missing serve data is excluded automatically.
10. Quick Reference Glossary
Serve & Return
| Term | Meaning |
|---|---|
| Ace | Serve winner -- opponent cannot touch the ball. |
| DF | Double Fault -- two missed serves, point to opponent. |
| 1st Srv % | % of first serves that land in. |
| 1st Won | % of points won when first serve lands in. |
| 2nd Won | % of points won on second serve. |
| BP Saved % | % of break points defended. |
| BP Conv % | % of break opportunities converted. |
Round Codes
| Code | Stage |
|---|---|
| R128 | Round of 128 -- 1st round at Grand Slams. |
| R64 | Round of 64. |
| R32 | Round of 32. |
| R16 | Round of 16. |
| QF | Quarter-final -- last 8 players. |
| SF | Semi-final -- last 4 players. |
| F | Final -- winner takes the title. |
| RR | Round Robin -- WTA Finals group stage. |
| Q1 / Q2 / Q3 | Qualifying rounds 1 to 3. |
Statistics Terms
| Term | Meaning |
|---|---|
| R^2 | Goodness of fit for regression (0-1). |
| P-value | Probability of seeing a result at least this extreme if there were no real effect. <0.05 = significant. |
| T-test | Compares means of two groups. |
| Z-score | SDs from the mean. Absolute value of Z above 1.96 = statistically unusual. |
| Chi-sq | Tests independence between two categorical variables. |
| IQR | Interquartile range (Q3-Q1). |
| CV | Coefficient of Variation -- relative spread. |
| Cramer V | Effect size for chi-squared (0-1). |
Data Sources
Primary: Jeff Sackmann / tennis_wta -- open-source WTA main-draw match results 2015-present, updated throughout the season.
Supplementary: Tennis Abstract (Jeff Sackmann) -- match-level percentage stats for 2025-2026 matches.
Note: DR (Dominance Ratio) and serve percentage columns are only available for Tennis Abstract-sourced matches. Raw ace/DF counts are only available from Sackmann rows. Qualifying round results are included from Tennis Abstract but not from the Sackmann main-draw files.
11. Understanding the Deep Analytics Results
The Deep Analytics tab produces a large number of statistical outputs across six sections. This section explains what each result is actually telling you about Kudermetova's performance, written in plain language without assuming any statistical background.
Summary Statistics
The six headline cards give the fastest possible read on a metric. The Mean is the average outcome across all selected matches -- it answers the question: in a typical match, what value does this metric take? The Median answers the same question differently: it finds the value sitting exactly in the middle when all matches are sorted low to high. When the Mean is noticeably higher than the Median, a handful of exceptional matches are pulling the average up -- the Median tells you what a more ordinary match looks like. The Mode is the single most common value -- the outcome that repeats most often. The Standard Deviation tells you how much results vary: low SD means consistent performance, high SD means results swing widely. The Coefficient of Variation (SD divided by Mean as a percentage) lets you compare consistency across different metrics regardless of scale -- under 20% is very consistent, over 50% is highly erratic.
Distribution Histogram
The histogram shows the full shape of match results. Each bar covers a range of values, and its height shows how many matches fell in that range. A tall narrow cluster means results are concentrated and consistent. A wide spread means scattered, unpredictable performance. The green dashed line marks the Mean and the orange dotted line marks the Median -- when they diverge, the histogram will show a long tail on one side. A right tail (bars stretching right) means a few unusually high-value matches; a left tail means a few notably poor performances. The panel beside the chart adds Variance (the square of the standard deviation -- used in formulas but harder to interpret directly), Standard Error (how precisely the Mean is estimated -- smaller SE means the Mean is more trustworthy), and the CV.
Box Plot and Quartiles
The box plot is a compact visual summary of spread. The box spans from Q1 (25th percentile) to Q3 (75th percentile) -- the middle 50% of all matches fall inside the box. A narrow box means consistency; a wide box means high variability. The line inside the box is the Median. Dots beyond the whiskers are outlier matches -- unusually high or low results. The IQR (Q3 minus Q1) gives the width as a single number. Skewness measures how lopsided the distribution is: near zero means roughly symmetric, positive means a long right tail (a few outstanding high-value matches), negative means a long left tail (a few notably poor ones). Kurtosis measures how extreme the outliers are: positive (leptokurtic) means extreme matches appear more often than typical, negative (platykurtic) means results cluster tightly with few extremes.
Correlation and Regression vs Year
This section asks: has this metric been trending upward, downward, or staying flat across Kudermetova's career? Each dot on the scatter represents one year's average, and the dashed line is the best-fit linear trend. R-squared (R^2) tells you how well that line describes the actual year-to-year pattern: R^2 = 0.8 means the trend explains 80% of the variation across years -- a strong, consistent direction. R^2 below 0.3 means results bounce around without clear direction. Pearson r captures the strength and direction in a single number between -1 and +1: +0.8 is a strong upward trend, -0.8 is a strong downward trend, values near 0 mean no consistent direction. Whether upward counts as improvement depends on the metric: more aces is good, more double faults is not. The P-value on the slope indicates whether the trend is statistically significant: below 0.05 (marked *) means a trend this strong would be unlikely to arise by chance alone. The slope value shows the actual pace -- with win rate expressed as a fraction between 0 and 1, a slope of +0.015 means approximately 1.5 percentage points of improvement per year on average.
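The slope, Pearson r and R^2 described above can be reproduced with a few lines of arithmetic. This sketch uses invented yearly win rates (as fractions) and does not compute the p-value.

```python
import math

def linear_trend(years, values):
    """Ordinary least squares fit of values against year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    return slope, intercept, r, r * r

years = [2019, 2020, 2021, 2022, 2023]
win_rate = [0.50, 0.52, 0.53, 0.54, 0.56]  # made-up yearly win rates
slope, intercept, r, r_squared = linear_trend(years, win_rate)
```

For these numbers the slope is 0.014 (about 1.4 percentage points per year) and R^2 is 0.98, which the guide would read as a strong, consistent upward trend.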
T-Test: Wins vs Losses
The T-Test compares the metric in matches Kudermetova won against matches she lost, asking: is this metric genuinely different between outcomes, or is any apparent gap just random noise? The Mean Difference (wins minus losses) is the raw gap: green means the metric is higher in wins, red means it is higher in losses. Whether that is good or bad depends on the metric -- more aces in wins is a positive sign, whereas more double faults in losses is the expected pattern. The T-Statistic standardises this gap -- as a rule of thumb, values above +2 or below -2 indicate a meaningful difference. The P-value is the key output: below 0.05 means a gap this large would be unlikely if there were no real difference between wins and losses. The 95% Confidence Interval gives a range for the true gap -- if it does not include zero, the difference is significant. For example a 95% CI of [0.4, 1.9] on aces means wins contain, on average, 0.4 to 1.9 more aces than losses.
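A minimal sketch of the mean difference and T-Statistic follows. It assumes the unequal-variance (Welch) form of the test, which the guide does not specify, and uses invented ace counts. The p-value and confidence interval are omitted because they need a t-distribution function; a library routine such as `scipy.stats.ttest_ind(wins, losses, equal_var=False)` could supply the p-value.

```python
import math
import statistics

def welch_t(wins, losses):
    """Mean difference, t-statistic and degrees of freedom (Welch's form, an assumption)."""
    mean_w, mean_l = statistics.fmean(wins), statistics.fmean(losses)
    # Variance of each group's mean (sample variance divided by group size).
    var_w = statistics.variance(wins) / len(wins)
    var_l = statistics.variance(losses) / len(losses)
    t_stat = (mean_w - mean_l) / math.sqrt(var_w + var_l)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_w + var_l) ** 2 / (
        var_w ** 2 / (len(wins) - 1) + var_l ** 2 / (len(losses) - 1)
    )
    return mean_w - mean_l, t_stat, dof

aces_in_wins = [5, 6, 7, 8, 9]    # made-up values
aces_in_losses = [2, 3, 4, 5, 6]  # made-up values
diff, t_stat, dof = welch_t(aces_in_wins, aces_in_losses)
```

Here the gap is 3 aces with a t-statistic of 3.0, above the +2 rule of thumb mentioned above.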
Z-Score Chart
The Z-Score chart reframes each year as a distance from the career mean, measured in standard deviations. A green bar at +1.5 means that year was 1.5 standard deviations above the career average -- a notably strong season. A pink bar at -1.0 means one standard deviation below average -- a weaker year. The orange dotted lines at +1.96 and -1.96 mark the 95% reference band: any year crossing these lines is statistically unusual, performing significantly better or worse than the long-run norm. Years staying inside the band are within the expected range of normal variation and should not be over-interpreted.
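The calculation behind the chart can be sketched as below. The example assumes the career mean and SD are taken over the yearly averages themselves, which may differ from how the dashboard does it, and the win rates are invented.

```python
import statistics

def yearly_z_scores(yearly_means):
    """Map each year to its distance from the overall mean, in SD units."""
    values = list(yearly_means.values())
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD across years, an assumption
    return {year: (value - mean) / sd for year, value in yearly_means.items()}

def unusual_years(z_scores, threshold=1.96):
    """Years whose z-score falls outside the +/- threshold band."""
    return [year for year, z in z_scores.items() if abs(z) > threshold]

# Made-up yearly win rates with one standout season.
win_rate_by_year = {2019: 0.5, 2020: 0.5, 2021: 0.5, 2022: 0.5,
                    2023: 0.5, 2024: 0.5, 2025: 0.9}
z = yearly_z_scores(win_rate_by_year)
flagged = unusual_years(z)
```

Only 2025 crosses the +1.96 line in this example; the other seasons sit inside the band.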
Chi-Squared Test: Result vs Surface
The Chi-Squared test asks whether the court surface (Hard, Clay, Grass) genuinely influences match outcomes, or whether any win-rate differences across surfaces could simply be due to chance. It compares actual win/loss counts on each surface to the counts expected if surface had no effect. The Chi-Squared statistic measures the total discrepancy between actual and expected -- larger means the observed pattern deviates more from random. The P-value is the key output: below 0.05 means surface has a statistically significant effect. However significance alone does not tell you how large the effect is -- that is what Cramer's V is for. Cramer's V runs from 0 to 1 independent of sample size: under 0.1 is negligible, 0.1 to 0.3 is a small but real effect, above 0.3 means surface is a meaningful factor. A significant p-value paired with a moderate Cramer's V confirms surface genuinely matters for this player.
Year-over-Year Statistical Profile Table
The Year-over-Year table is the most granular view in Deep Analytics, computing every metric separately for each season. Start with the N column -- years with fewer than 15 matches (notably 2020 due to COVID) should be treated cautiously as small samples make all statistics unreliable. The Mean column has colour bars embedded -- darker teal highlights better-than-average years. Comparing Mean to Median within a year reveals whether standout individual matches skewed the average. The CV% column is the most useful consistency measure: low CV% means reliable and repeatable performance that season, high CV% means erratic results. The Min and Max columns flag extreme outlier matches -- an unusually low minimum might reflect a retirement or injury match, a high maximum might mark a dominant individual performance.
How to Use the Sections Together
The six sections are designed to be read together. A suggested workflow: start with the Summary Cards for a quick orientation -- is this metric high or low, consistent or variable? Check the Histogram to understand the shape of the data and whether outliers are distorting the mean. Use the Regression chart to see if there is a meaningful career trend and confirm with the p-value. Run the T-Test to find out if the metric actually differs between wins and losses -- if it does not, it may not be a useful predictor of outcomes. Check the Z-Score chart to identify which seasons were genuinely exceptional versus normal variation. Use the Chi-Squared test to determine whether surface is a meaningful factor in match outcomes. Finally use the Year-over-Year table to drill into the specific seasons flagged as unusual by the other outputs.
Note: all Deep Analytics computations automatically exclude matches where the selected metric has no data. This is particularly relevant for serve statistics, which are not available for all matches in the dataset.
12. Understanding the Predictions & Forecasting Results
The Predictions tab applies five statistical forecasting models to whichever metric you select, then projects it forward by however many periods you choose. This section explains what each model is doing, what the outputs mean, and how to use them sensibly.
Controls: Metric, Horizon and Granularity
The Metric dropdown selects which performance dimension to forecast -- win rate, aces, double faults, break points, serve percentages, or ranking. All sidebar filters (year range, surface, round) apply to the historical training data, so you can forecast win rate on clay only, or aces in finals only, simply by adjusting the sidebar. The Horizon sets how many periods ahead to project (1 to 5). Forecasts become less reliable the further ahead you project -- one or two periods is generally trustworthy; five periods should be treated as indicative only. The Granularity switch changes between annual forecasts (one point per year, good for career-level trends) and monthly forecasts (one point per month, better for identifying seasonal patterns but noisier).
Model Comparison Chart
The comparison chart shows all five model forecasts on one plot alongside the historical data. The solid grey line with dots is the actual historical record. The five dashed coloured lines each show a different model's projection from the point labelled 'Forecast -->'. Where the lines agree and cluster together, the forecast is more reliable -- convergence across models is a positive signal. Where the lines diverge widely, there is high uncertainty and no single model should be trusted alone. The vertical dotted divider marks the boundary between historical data and the forecast period.
The Five Models Explained
Model 1 -- Linear Trend (OLS Regression)
Fits a straight line through the historical annual averages and extends it forward.
This is the simplest model and works well when a metric has been moving consistently in one direction over the years --
for example, if win rate has been steadily improving since 2019.
It assumes the trend continues at the same pace indefinitely, which is often unrealistic at long horizons.
Best used when the regression section of Deep Analytics shows a high R-squared and a significant p-value,
confirming there is a genuine linear trend to extrapolate.
The shaded band around the forecast is the 95% prediction interval -- because the model is fitted to annual averages, it describes where a future period's average could fall, not the result of an individual match.
Model 2 -- ARIMA (Auto)
ARIMA stands for AutoRegressive Integrated Moving Average.
It automatically analyses the historical series to find patterns --
does this year's value depend on last year's? Is there a consistent drift upward or downward?
Are recent errors in prediction informative about future values?
The 'Auto' version tests many possible ARIMA configurations and selects the one with the best AIC score
(lower AIC = better fit relative to model complexity).
ARIMA handles non-constant trends and autocorrelation well, making it one of the most reliable models here.
It tends to produce conservative forecasts that mean-revert over time rather than projecting extreme trajectories.
Model 3 -- Exponential Smoothing (ETS)
ETS (Error-Trend-Seasonality) is a family of exponential smoothing methods.
It weights recent observations more heavily than older ones -- last year's win rate matters more than five years ago.
Unlike a simple moving average that weights all past observations equally, ETS applies an exponentially declining weight,
so the most recent season has the strongest influence on the forecast.
This makes ETS particularly useful when a player's recent form differs significantly from their long-run average --
it adapts faster to a genuine change in performance level.
Like ARIMA, it selects the best configuration automatically using AIC.
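The exponentially declining weights can be seen in the simplest member of the family, simple exponential smoothing. This sketch fixes the smoothing factor `alpha` by hand, whereas the guide says the dashboard selects its ETS configuration automatically; the numbers are invented.

```python
def simple_exp_smoothing(series, alpha=0.5):
    """Return the smoothed level after the last observation (the flat forecast)."""
    level = series[0]
    for value in series[1:]:
        # New level = blend of the latest value and the previous level.
        level = alpha * value + (1 - alpha) * level
    return level

def observation_weights(n_obs, alpha=0.5):
    """Weight given to each of the n_obs most recent values, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(n_obs)]

forecast = simple_exp_smoothing([10, 20], alpha=0.5)
weights = observation_weights(3, alpha=0.5)
```

With `alpha = 0.5` the three most recent seasons receive weights 0.5, 0.25 and 0.125, so each step back in time counts for half as much.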
Model 4 -- Holt-Winters Trend
Holt-Winters extends exponential smoothing to model both the current level explicitly
and the direction of change (trend). It maintains two smoothed estimates -- one for the value and one for the slope --
and updates both as new data arrives.
This makes it more responsive to genuine trend changes than a simple linear regression,
which is fixed once fitted. If a player was declining for three years and then started improving,
Holt-Winters would detect and reflect the new upward direction faster than a linear model.
When only a small amount of data is available, the seasonal component is turned off automatically.
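The two smoothed estimates (level and slope) can be sketched as follows. This is the non-seasonal form only, with hand-picked smoothing factors `alpha` and `beta` and a simple initialisation; all three are assumptions rather than the dashboard's settings.

```python
def holt_forecast(series, horizon, alpha=0.8, beta=0.2):
    """Level-plus-trend exponential smoothing without a seasonal component."""
    level = series[0]
    trend = series[1] - series[0]  # simple initial slope, an assumption
    for value in series[1:]:
        previous_level = level
        # Update the level from the new value and the trend-adjusted old level.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Update the slope from the latest change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]

projection = holt_forecast([1, 2, 3, 4], horizon=2)  # made-up steadily rising series
```

For a steadily rising series the projection simply continues the line (approximately 5 and 6 here); after a change of direction the trend estimate is pulled towards the new slope with each new observation.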
Model 5 -- Moving Average + Rolling Forecast
This model computes a rolling average of the last 2 to 3 periods, then projects it forward
using the average rate of change observed in the most recent four seasons.
It is the most naive of the five models -- it has no statistical machinery for detecting complex patterns --
but it is also the most transparent and least likely to overfit.
When the historical data is short or noisy, the moving average forecast provides a sensible baseline.
The confidence intervals are derived from the spread of residuals (how much the actual values differed from the rolling average),
so wider intervals mean the metric has been historically unpredictable.
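A minimal sketch of this model is shown below. It assumes a 3-period window and takes "the most recent four seasons" to mean the last four values (three year-on-year steps); the dashboard may interpret either differently, and the data is invented.

```python
def moving_average_forecast(series, horizon, window=3, change_span=4):
    """Rolling-average baseline projected forward by the recent average change."""
    base = sum(series[-window:]) / window
    recent = series[-change_span:]
    # Period-on-period changes within the recent span.
    steps = [later - earlier for earlier, later in zip(recent, recent[1:])]
    drift = sum(steps) / len(steps)
    return [base + step * drift for step in range(1, horizon + 1)]

win_pct = [50, 52, 54, 56, 58]  # made-up yearly win percentages
ma_forecast = moving_average_forecast(win_pct, horizon=2)
```

The result is 58 then 60: the rolling average (56) lags the latest value, which is typical of moving-average methods on a trending series.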
Confidence Bands (80% and 95%)
Each individual model chart shows two shaded bands around the forecast line. The inner darker band is the 80% interval -- under the model's assumptions, there is an 80% probability the actual future value will fall within this range. The outer lighter band is the 95% interval -- a wider range with a 95% probability of containing the actual value. Narrow bands mean the model is confident; wide bands mean high uncertainty. As a rule, the bands widen as you forecast further into the future -- this is expected and healthy. If bands are very narrow even at a 5-period horizon, the model is likely underestimating uncertainty.
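One common way to build such bands is to take the point forecast plus or minus a multiple of the residual standard deviation. The sketch below assumes normally distributed errors and a single fixed residual SD; each model in the dashboard may derive its bands differently.

```python
from statistics import NormalDist

def forecast_band(point, residual_sd, level):
    """Symmetric interval around a point forecast, assuming normal errors."""
    # Two-sided interval: e.g. level 0.95 uses the 97.5th percentile (about 1.96).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * residual_sd, point + z * residual_sd

# Made-up forecast of 55% win rate with a residual SD of 5 points.
lo80, hi80 = forecast_band(55, 5, 0.80)
lo95, hi95 = forecast_band(55, 5, 0.95)
```

The 95% band (roughly 45.2 to 64.8) is wider than the 80% band (roughly 48.6 to 61.4), matching the lighter outer and darker inner shading.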
Model Accuracy Metrics Table
The accuracy table compares the five models on four metrics:
| Metric | What it measures | How to use it |
|---|---|---|
| RMSE | Root Mean Squared Error -- average size of the model's prediction errors on historical data | Lower is better. The model with the lowest RMSE fitted the historical data most accurately. The Best Model card in the Summary section highlights this. |
| MAE | Mean Absolute Error -- average absolute difference between predicted and actual values | Also lower is better. Less sensitive to large one-off errors than RMSE. If RMSE is much higher than MAE, there were occasional large prediction misses. |
| AIC | Akaike Information Criterion -- balances goodness of fit against model complexity | Lower is better. Available for Linear Trend, ARIMA and ETS. Models without AIC (Holt-Winters, Moving Average) show -- instead. |
| Next Period | The model's forecast value for the very next period | Compare across models -- if all five agree closely, the next-period forecast is reliable. Wide disagreement means high uncertainty. |
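The difference between RMSE and MAE is easiest to see on a small example. The values below are invented: three perfect predictions and one miss of 4.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: squares the errors, so large misses weigh more."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean Absolute Error: every unit of error counts the same."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [10, 10, 10, 10]
predicted = [10, 10, 10, 14]  # one large miss
error_rmse = rmse(actual, predicted)
error_mae = mae(actual, predicted)
```

MAE is 1.0 while RMSE is 2.0: the single large miss doubles the RMSE, which is the pattern the table describes.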
Summary Cards and Ensemble Forecast
The six summary cards above the forecast table give the most important outputs at a glance. Current (Latest) is the most recent actual observed value in the dataset -- the baseline everything is projected from. Ensemble Next Period is the average prediction across all five models for the next period. Ensemble averages often outperform individual models because model errors tend to partially cancel out when averaged. Ensemble End of Horizon is the average across models at the furthest point in the forecast. Expected Change shows the percentage change from current to ensemble next period -- green means improvement, red means decline. Best Model (RMSE) identifies which single model fitted the historical data most accurately -- a useful guide to which individual model chart to trust most.
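The ensemble cards reduce to simple averaging. In this sketch the model names and forecast values are invented, and an unweighted mean is assumed.

```python
def ensemble_summary(current, forecasts):
    """forecasts: model name -> list of forecast values, one per future period."""
    horizon = len(next(iter(forecasts.values())))
    # Unweighted mean across models at each future period.
    ensemble = [
        sum(values[period] for values in forecasts.values()) / len(forecasts)
        for period in range(horizon)
    ]
    return {
        "next_period": ensemble[0],
        "end_of_horizon": ensemble[-1],
        "expected_change_pct": (ensemble[0] - current) / current * 100,
    }

# Made-up win-rate forecasts (%) from three hypothetical models.
cards = ensemble_summary(
    current=50,
    forecasts={"linear": [52, 54], "arima": [54, 56], "ets": [56, 58]},
)
```

Here the ensemble expects 54 next period and 56 at the end of the horizon, an expected change of +8% from the current value of 50.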
Next-Season Forecast Table
The table at the bottom lists the forecast value from each model for every future period, alongside the 95% lower and upper bounds. The final column is the Ensemble Mean -- the average across all models. Use the Lo95 and Hi95 columns to understand the realistic range of outcomes. For example, if the Ensemble Mean forecasts a win rate of 55% but the Lo95 is 38% and Hi95 is 72%, the honest interpretation is that the model expects somewhere between 38% and 72% -- a wide range that reflects genuine uncertainty.
Important Caveats
All forecasting models assume that future patterns will resemble the past. They cannot anticipate injuries, schedule changes, coaching changes, or other disruptions. With only 10 years of annual data points, the models are working with a small sample -- statistical forecasts become more reliable with more data. Monthly granularity gives more data points but introduces more noise. The forecasts here are best used as a quantitative complement to qualitative judgment, not as a definitive prediction. A model saying win rate will reach 65% next year means: if recent trends continue unchanged, this is the statistically expected value -- not a guarantee.
Note: forecasting models require a minimum of 5 data points to fit. If the current filters produce fewer than 5 periods of data, some or all models may not run. Widening the year range or removing surface/round filters will provide more training data.
Surface Performance (Win %)
Serve & Break Point Stats
Strengths
Weaknesses
Shot Distribution
Shot Direction Tendency
Rally Length
Playing Style Radar
Court Zone Distribution
Court Positioning Heatmap
Movement Radar
Tactical Profile
Backend pipeline (yt-dlp + ffmpeg auto-extraction)
How YouTube + Claude Vision works
WTA Women's Psychological Guidance
Key strategies
- Breathing reset: 3 slow breaths between points activates the parasympathetic system and lowers cortisol within 20-30 seconds.
- Towel routine: Use the towel ritual as a deliberate pause -- a physical anchor to break negative emotional chains.
- Self-talk: Replace 'I can't believe I missed that' with 'Next ball. I've trained for this.' Keep language present-tense.
- Body language: Walk tall between points regardless of the score. Confident posture reduces anxiety hormones independently.
The 90-minute window
- Visualisation: 10 minutes mentally rehearsing your best tennis -- specific shots, patterns, and how you want to feel.
- Activation level: Know your ideal arousal zone. Design your warm-up to reach it deliberately.
- Opponent scouting: Review 2-3 tactical patterns to exploit, then let it go. Over-analysis creates paralysis.
- Personal mantra: One phrase representing your competitive identity e.g. 'warrior', 'fight for every point'.
Breathing rate and self-talk quality are the strongest predictors of performance on break points -- not technical ability.
Tiebreak protocol
- Treat each point as a fresh match -- no scoreboard watching
- Return to your service routine exactly -- don't rush
- After every double fault: 4-second exhale, bounce the ball 5 times, recommit
- The '1-0 mindset': You are always only 1 point away from leading
Closing out sets
- Reframe nervousness as excitement -- it means you care
- Raise first-serve % by 5-8% when serving for the set -- margin of safety
- Attack the net more when leading 5-3 or 6-5 -- don't retreat into defence
The clean slate protocol
- Walk to the furthest corner of the baseline -- physical repositioning signals a mental reset.
- Identify one tactical adjustment. Just one. Execute it on the next point.
- Celebrate winning a single big-moment point as loudly as a set -- momentum is psychological.
- Use changeovers to rehydrate, close your eyes for 30 seconds, and review your one adjustment. Not the score.
ATP Men's Psychological Guidance
Key strategies
- First-strike mentality: Commit to attacking the second ball in every rally. Hesitation compounds into defensive patterns.
- Fist pump calibration: Deliberate celebration after points raises subsequent serve speed by an average of 4 km/h.
- Anger management: Accept one racket bounce per match as a release valve. More than that correlates with 67% loss rate in the following game.
- Controlled breathing: Exhale sharply on contact -- synchronises the kinetic chain and reduces muscle tension by 15-20%.
The 25-second window
ATP rules allow 25 seconds between points, enforced by a shot clock. Elite players use this as a structured mental cycle, not a rest period.
Cue words
- 'See it hit it': Reduces cognitive load, promotes instinctive striking
- 'Watch the ball': Refocuses from outcome to process under pressure
- 'Own the court': Spatial awareness cue for maintaining court position
ATP players who perform best in Grand Slam finals treat the final as just another match until the trophy presentation.
Strategies for elevated stakes
- Routine anchoring: Keep every pre-match routine identical to a regular tour match.
- Crowd management: When the crowd is against you, slow your service routine to own the silence.
- Opponent decoupling: Don't adjust strategy in the changeover after losing the first set. Give yourself 3 games in the second set first.
- 5-set fitness: Begin visualising the fifth set from the morning of the match.
Managing the grind
- Selective investment: Identify your 8 priority events. Give those 100% of your emotional weight and the rest 80%.
- Off-court recovery: Sleep is the #1 performance tool -- 9 hours outperforms any training session.
- Social battery: Media and interviews are energy expenditures. Budget them like training loads.
- Identity beyond ranking: Players with strong non-tennis identities outperform their rankings late in long seasons.
Universal mental skills -- WTA & ATP
- Target 9 hours per night during tournament weeks
- No screens 90 minutes before sleep -- blue light suppresses melatonin
- Keep naps to 20 minutes at most when catching up on sleep after a late match
- Cool room (18 °C) improves deep sleep quality by 30%
- 5 minutes of focused breathing before every session
- During drills: narrate what you see, not what you want to happen
- Use missed shots as data, not judgment
- End every practice with 3 things that went well
- Debrief within 2 hours of a match -- memory fades quickly
- Use video in debriefs to separate fact from feeling
- Ask for positives first, then corrections
- Establish a 30-minute post-match zone where you process alone
Scoring system
Points
Points in a game run Love (0), 15, 30, 40, Game. When both players reach 40-40 (Deuce), one player must win two consecutive points -- the first gives Advantage, the second wins the game. If the player with Advantage loses the next point, the score returns to Deuce.
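The point-scoring rule above can be sketched in a few lines of Python. This is an illustration only: the function name `game_score` and the "server"/"receiver" labels are hypothetical and are not taken from the Platform's code. It assumes a standard advantage game.

```python
def game_score(server_points: int, receiver_points: int) -> str:
    """Return the spoken score of a standard (advantage) game from raw point counts."""
    names = ["Love", "15", "30", "40"]

    # A game is won with at least 4 points and a lead of 2 or more.
    if max(server_points, receiver_points) >= 4 and abs(server_points - receiver_points) >= 2:
        return "Game server" if server_points > receiver_points else "Game receiver"

    # From 40-40 onwards only the difference between the players matters.
    if server_points >= 3 and receiver_points >= 3:
        if server_points == receiver_points:
            return "Deuce"
        return "Advantage server" if server_points > receiver_points else "Advantage receiver"

    # Before deuce, each count maps directly to its spoken name.
    return f"{names[server_points]}-{names[receiver_points]}"
```

For example, `game_score(4, 3)` gives "Advantage server", and losing the next point (`game_score(4, 4)`) takes the score back to "Deuce".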
Sets
First to 6 games with a 2-game lead wins the set; from 5-5 a player can still win it 7-5. At 6-6 a tiebreak is played. First to 7 points with a 2-point lead wins the tiebreak and takes the set 7-6.
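As a rough sketch of the set and tiebreak rules described here, the two helpers below check whether a score is final. The names `tiebreak_won` and `set_won` and the `target` parameter are assumptions made for illustration; only the 7-point tiebreak stated above is covered.

```python
def tiebreak_won(points_a: int, points_b: int, target: int = 7) -> bool:
    """True once either player has at least `target` points and leads by 2."""
    return max(points_a, points_b) >= target and abs(points_a - points_b) >= 2


def set_won(games_a: int, games_b: int) -> bool:
    """True if the games score is a finished tiebreak set."""
    high, low = max(games_a, games_b), min(games_a, games_b)
    if high == 6 and low <= 4:
        return True   # 6-0 through 6-4: six games with a 2-game lead
    if high == 7 and low in (5, 6):
        return True   # 7-5, or 7-6 after the tiebreak played at 6-6
    return False      # e.g. 6-5 or 6-6: the set is still live
```

So `set_won(6, 5)` is False because the lead is only one game, while `tiebreak_won(9, 7)` is True because the 2-point lead finally arrived after 7-7.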
Match format
- Best of 3: WTA all events; ATP most events.
- Best of 5: men's singles at Grand Slams only.
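The two formats differ only in how many sets are needed to win. A minimal sketch, with hypothetical function names and player labels "A" and "B":

```python
from typing import Optional, Sequence


def sets_to_win(best_of: int) -> int:
    """Sets needed to win: 2 in best of 3, 3 in best of 5."""
    return best_of // 2 + 1


def match_winner(set_winners: Sequence[str], best_of: int = 3) -> Optional[str]:
    """Return the first player to reach the required number of sets, else None."""
    needed = sets_to_win(best_of)
    tally: dict[str, int] = {}
    for player in set_winners:
        tally[player] = tally.get(player, 0) + 1
        if tally[player] == needed:
            return player
    return None  # match still in progress
```

With `best_of=5`, a player who is two sets up has not won yet: `match_winner(["A", "A"], best_of=5)` returns None.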
The court
Dimensions
- Length: 23.77m (78 feet)
- Singles width: 8.23m (27 feet)
- Doubles width: 10.97m (36 feet)
- Net height (centre): 0.914m (3 feet)
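The metric figures above are the imperial dimensions converted at 0.3048 m per foot. The short check below reproduces them; the dictionary keys and the helper name `to_metres` are illustrative only.

```python
FEET_TO_METRES = 0.3048  # exact, by definition of the international foot

# Court dimensions in feet, as listed above.
COURT_FEET = {
    "length": 78,
    "singles_width": 27,
    "doubles_width": 36,
    "net_height_centre": 3,
}


def to_metres(feet: float, places: int = 2) -> float:
    """Convert feet to metres, rounded to the given number of decimal places."""
    return round(feet * FEET_TO_METRES, places)


# Lengths and widths are quoted to 2 decimals, the net height to 3.
court_metres = {
    name: to_metres(feet, 3 if name == "net_height_centre" else 2)
    for name, feet in COURT_FEET.items()
}
```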
Court zones
Surfaces
Shot types
Rules & key terms
Tournament structure
WTA categories
ATP categories
Nutrition & Recovery
Sport science guidelines for elite tennis players -- WTA & ATP
Medical disclaimer: The information on this page is for general educational purposes only and is based on published sport science literature. It does not constitute medical or nutritional advice. Always consult a qualified sports dietitian, physiotherapist, or physician before making changes to your nutrition, hydration, or recovery protocols.
Carbohydrate & Energy
Hydration Protocol
Protein & Muscle Recovery
Micronutrients & Supplements
Sleep & Physical Recovery
Surface-Specific Considerations
Sample Match-Day Meal Plan
Terms & Conditions
Last updated: 13 May 2026 -- Please read carefully before using this platform
1. Acceptance of Terms
By accessing or using the Tennis Analytics Platform (the 'Platform'), you agree to be bound by these Terms and Conditions. If you do not agree to these terms in full, you must cease use of the Platform immediately. These terms apply to all users of the Platform, including visitors, registered users, coaches, analysts, and sports professionals.
2. Nature of the Platform & Permitted Use
The Tennis Analytics Platform is an analytical research and educational tool that
provides statistical analysis, visualisation, AI-powered frame analysis, performance
forecasting, nutritional guidance, and psychological profiling for tennis.
You may use this Platform solely for:
- Personal research and educational purposes
- Coaching and player development activities
- Non-commercial performance analysis
- Academic or journalistic study of tennis statistics
3. Data Sources & Accuracy
Match statistics are sourced from the Jeff Sackmann WTA/ATP open-source dataset
(github.com/JeffSackmann) and from Tennis Abstract. Supplementary 2025-2026 data
is embedded from Tennis Abstract match records.
Player profile data (rankings 1-250, serve stats, surface win rates) are sourced
from official WTA and ATP rankings as of May 2026, supplemented by statistical modelling.
The Platform makes no warranty
that any data, statistic, forecast, or analysis is accurate, complete, or current.
All outputs should be independently verified before being relied upon for
coaching, medical, or professional decisions.
4. AI Frame Analyzer -- Important Limitations
The AI Frame Analyzer uses Claude Vision (Anthropic) to analyse video frames. You acknowledge and agree that:
- AI coaching suggestions are generated by machine learning models and are not a substitute for qualified human coaching
- Frame analysis accuracy depends on image quality, lighting, and camera angle
- The Platform operators are not liable for any training decisions made based on AI-generated coaching tips
- Video content you upload or link to must be content you have the right to use; you must not upload copyrighted match footage without authorisation
- YouTube embeds are subject to YouTube's Terms of Service; CORS restrictions may prevent frame capture from certain videos
5. Nutrition & Medical Disclaimer
The Nutrition & Recovery section provides general sport science educational content based on published research and guidelines from bodies including:
- International Society of Sports Nutrition (ISSN)
- Australian Institute of Sport (AIS)
- British Dietetic Association (BDA) Sport & Exercise Nutrition register
- WTA & ATP published player welfare guidelines
6. Forecasting & Predictions -- No Guarantee of Outcomes
The Predictions & Forecasting section uses statistical models (Linear Regression,
ARIMA, ETS, Holt-Winters, Moving Average) applied to historical match data.
All forecasts are probabilistic estimates based on historical patterns and assume
conditions remain broadly consistent with the past. They cannot account for injuries,
schedule changes, personal circumstances, or other real-world variables.
Forecasts must not be used for betting, gambling, or any form of wagering.
The Platform operators accept no responsibility for financial or other losses
arising from reliance on forecast outputs.
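To show what "based on historical patterns" means in practice, the sketch below fits a straight-line trend by ordinary least squares and extends it forward. It is not the Platform's implementation: the function name and the sample figures in the usage note are hypothetical, and it leaves out the confidence bands and the other four models.

```python
def linear_trend_forecast(years, values, horizon=1):
    """Fit y = intercept + slope * year by least squares and extrapolate.

    Returns a list of (year, forecast) pairs for the `horizon` years after
    the last observed year. The line simply continues the past trend, so it
    cannot anticipate injuries or schedule changes.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    last_year = max(years)
    return [
        (last_year + step, intercept + slope * (last_year + step))
        for step in range(1, horizon + 1)
    ]
```

With made-up win rates of 50, 55 and 60 for 2021 to 2023, the fitted slope is 5 points per year, so the one-year forecast for 2024 is 65. The model has no way of knowing whether that trend will actually continue.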
7. Intellectual Property
The Platform's code, design, user interface, analytical methodologies, and original
content are the intellectual property of the Platform operators.
Match statistics from the Jeff Sackmann datasets are published under the
Creative Commons CC BY-NC-SA 4.0 licence.
Player data from Tennis Abstract is used with attribution under fair use for
non-commercial research purposes.
You may not reproduce, redistribute, scrape, or commercialise Platform outputs,
analytical results, or any substantial portion of the data presented without
express written permission.
8. Privacy & Data
This Platform does not collect, store, or process personally identifiable
information unless explicitly provided by the user. Video frames uploaded for AI
analysis are transmitted to the Anthropic API for processing and are subject to
Anthropic's Privacy Policy.
If you are accessing this Platform within the European Union or United Kingdom,
you have rights under GDPR/UK GDPR including the right to access, rectify, and
erase any personal data held. Contact the Platform operators to exercise these rights.
Match data downloaded from GitHub is fetched at runtime and not stored permanently
by the Platform beyond the current session.
9. Limitation of Liability
To the fullest extent permitted by applicable law, the Platform operators shall not be liable for any direct, indirect, incidental, special, consequential, or punitive damages arising out of or related to your use of or inability to use the Platform, including but not limited to:
- Loss of data, revenue, or profit
- Coaching or performance decisions made on the basis of Platform outputs
- Health outcomes from nutritional or recovery guidance
- Financial losses from forecast-based decisions
- Interruption of service due to technical failures
- Third-party content accessed via embedded links or YouTube
10. Third-Party Services
This Platform integrates with the following third-party services:
- Anthropic Claude API: AI frame analysis and coaching -- subject to Anthropic's usage policies
- Jeff Sackmann / tennis_wta & tennis_atp: Open-source match data on GitHub
- Tennis Abstract: Supplementary player statistics
- YouTube: Video embedding -- subject to YouTube Terms of Service
- Posit Connect / shinyapps.io: If hosted, subject to Posit's hosting terms
11. Changes to These Terms & Governing Law
The Platform operators reserve the right to modify these Terms and Conditions at
any time. Changes will be reflected by updating the date at the top of this page.
Continued use of the Platform after changes constitutes acceptance of the revised terms.
These Terms and Conditions shall be governed by and construed in accordance with
the laws of Ireland and applicable European Union law, without regard to conflict of law principles.
Any disputes arising from use of this Platform shall be subject to the exclusive
jurisdiction of the Irish courts.
Tennis Analytics Platform v5.0 * Terms effective: 13 May 2026 * For questions contact the platform administrator