2 Data Acquisition
Advanced analytics in Ultimate Frisbee rest on data that is collected manually during games. Collection began in 2021, when the Ultimate Frisbee Association (UFA) launched an initiative to record detailed play-by-play data from every game. That effort underpins the research and insights we’re sharing with the community.
2.1 The UFA’s Data Collection System
Under this system, each team hires a dedicated person to travel with the team and collect data during its matches. This person is equipped with an iPad and a purpose-built app that lets them record events in real time, including details about each play such as the time, the type of event, and the players involved. The data is then sent to a centralized database and made publicly available through an API. The API is located at UFA Stats API, and full documentation on how to interact with it can be found at UFA Stats Documentation.
The API is organized into distinct data categories: Teams, Players, Games, Game Events, Player Game Stats, and Player Stats. Teams, Players, and Games provide metadata about their respective entities, while Player Game Stats aggregate into season-based Player Stats. Game Events contain the most granular and analytically valuable data.
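To illustrate how per-game rows roll up into season totals, here is a minimal sketch. The field names (`playerID`, `year`, `goals`, `assists`) and the sample rows are assumptions made for illustration and may not match the API's actual schema; the UFA Stats Documentation is the authority on the real field names.

```python
from collections import defaultdict

# Hypothetical per-game stat rows. The keys are illustrative assumptions,
# not the API's documented schema.
player_game_stats = [
    {"playerID": "p1", "year": 2023, "goals": 2, "assists": 1},
    {"playerID": "p1", "year": 2023, "goals": 0, "assists": 3},
    {"playerID": "p2", "year": 2023, "goals": 1, "assists": 0},
]


def aggregate_season_stats(rows, stat_fields=("goals", "assists")):
    """Sum per-game stat rows into one total per (playerID, year)."""
    totals = defaultdict(lambda: dict.fromkeys(stat_fields, 0))
    for row in rows:
        key = (row["playerID"], row["year"])
        for field in stat_fields:
            # Treat a missing stat as zero rather than failing.
            totals[key][field] += row.get(field, 0)
    return dict(totals)


season_stats = aggregate_season_stats(player_game_stats)
```

Keying the totals on the pair (player, year) keeps seasons separate for players who appear in more than one year.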
2.1.1 Game Events
Game events in the UFA data record specific on-field actions, each with structured fields describing what happened. The event types are:
- Start D Point
- line: Array of playerIDs that started on the defensive point.
- time: Time (in seconds) into the period when the defensive point started.
- Start O Point
- line: Array of playerIDs that started on the offensive point.
- time: Time (in seconds) into the period when the offensive point started.
- Timeout Events
- Midpoint Timeout - recording team: Array of playerIDs that came onto the field during the timeout.
- Between Point Timeout - recording team: Same structure as the above, for team changes between points.
- Midpoint Timeout - opposing team: Same as above for the opposing team.
- Between Point Timeout - opposing team: Same as above for the opposing team.
- Pull Events
- Pull - inbounds: Details of the pull to start the point, including the player pulling and the coordinates where the pull was brought into play.
- Pull - out of bounds: Same as the above, but when the pull goes out of bounds.
- Offsides: Recorded for both the recording and opposing teams during a pull.
- Block and Turnovers
- Block: Information on a block by a defender.
- Callahan: A turnover where the defender catches the opposing team’s throw in the end zone the defender is attacking, which scores a goal for the defending team.
- Throwaway: A throwaway by the opposing team, including the location where the turnover occurred.
- Score and Penalties
- Score: A goal scored by the opposing team.
- Penalty: A penalty event recorded either on the recording team or the opposing team.
- Pass and Goal Events
- Pass: Details of the throw and catch, including the thrower’s and receiver’s player IDs and coordinates.
- Goal: A score event, detailing the thrower, receiver, and their respective coordinates on the field.
- Miscellaneous Events
- Drop: A drop by the receiver, with details on where it occurred.
- Dropped Pull: When the receiver drops the pull, including location information.
- Injury: Substitutions after player injuries.
- Player Misconduct Foul: Details on any misconduct fouls.
- Player Ejected: If a player is ejected from the game.
- Game Time Events
- End of Periods: Events marking the end of game periods such as the first quarter, halftime, third quarter, regulation, and overtime.
Depending on its type, an event carries fields such as playerIDs, coordinates on the field (X, Y), and timestamps in seconds or milliseconds. Together these give a detailed view of the game that can be used to analyze player performance, game strategy, and more.
Note that some events, such as “Start D Point” or “Start O Point,” only contain basic information about the players involved, while others, like “Pass” or “Goal,” include detailed positional data about both the thrower and the receiver. Because the fields vary by event type, significant processing is needed to engineer useful features and to build tabular data for modelling and other tasks. We cover this more thoroughly in the next chapter, Data Ingestion.
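As a rough illustration of that processing, the sketch below flattens a short event sequence into one row per throw, carrying the point context forward from the preceding “Start O Point” or “Start D Point” event. The dictionary keys (`type`, `line`, `thrower`, `receiver`, `throwerX`, and so on), the sample values, and the function name `flatten_throws` are all assumptions for illustration, not the API's documented schema.

```python
import math

# Hypothetical events for a single point. Keys and values are assumptions.
events = [
    {"type": "Start O Point", "line": ["p1", "p2", "p3"], "time": 0},
    {"type": "Pass", "thrower": "p1", "receiver": "p2",
     "throwerX": 0.0, "throwerY": 20.0, "receiverX": 3.0, "receiverY": 24.0},
    {"type": "Goal", "thrower": "p2", "receiver": "p3",
     "throwerX": 3.0, "throwerY": 24.0, "receiverX": 3.0, "receiverY": 34.0},
]


def flatten_throws(events):
    """Turn a mixed event sequence into one tabular row per throw."""
    rows = []
    point_index = -1  # becomes 0 at the first point-start event
    line = []
    for event in events:
        if event["type"] in ("Start O Point", "Start D Point"):
            point_index += 1
            line = event["line"]
        elif event["type"] in ("Pass", "Goal"):
            dx = event["receiverX"] - event["throwerX"]
            dy = event["receiverY"] - event["throwerY"]
            rows.append({
                "point": point_index,
                "line_size": len(line),
                "thrower": event["thrower"],
                "receiver": event["receiver"],
                # Straight-line distance in whatever units the coordinates use.
                "distance": math.hypot(dx, dy),
                "is_goal": event["type"] == "Goal",
            })
    return rows


throw_rows = flatten_throws(events)
```

Each resulting row can be loaded directly into a table or data frame; other event types would need their own handling.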
2.2 Limitations of Manual Data Collection
While the data collected by the UFA provides a fantastic starting point for analysis, the manual nature of the process introduces several limitations:
Human Error and Misclicking: The data is recorded manually by a single individual for each team. As a result, errors can occur due to misclicking events or incorrectly categorizing plays. For example, a throw might be recorded as a turnover when it was actually a successful pass. These errors can impact the accuracy of the data, and while the data collection team does their best to minimize mistakes, they can never be fully avoided.
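One simple way to surface possible misclicks is a continuity check: within a run of consecutive completed passes in the same possession, each thrower should be the previous receiver. The sketch below assumes each throw is a dictionary with hypothetical `thrower` and `receiver` keys; a flagged index only marks a spot worth reviewing, not a confirmed error.

```python
def find_continuity_breaks(throws):
    """Return indices of throws whose thrower is not the previous receiver."""
    breaks = []
    for i in range(1, len(throws)):
        if throws[i]["thrower"] != throws[i - 1]["receiver"]:
            breaks.append(i)
    return breaks


# Hypothetical possession: the third throw should have come from p3.
possession = [
    {"thrower": "p1", "receiver": "p2"},
    {"thrower": "p2", "receiver": "p3"},
    {"thrower": "p4", "receiver": "p5"},
]
suspect = find_continuity_breaks(possession)
```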
No Tracking of Off-Disc Movement: One of the most significant analytical blind spots is the absence of off-disc player tracking. Because only the disc-related events (throws, catches, turnovers, goals, etc.) are logged, all movement by players not involved in the play is completely unobserved. This omission severely limits our ability to evaluate offensive schemes (e.g., whether a cutter created separation or was simply open by chance) and defensive strategy (e.g., whether a shutdown defender prevented a viable throwing option). Without data on spacing, timing, and movement off the disc, any conclusions about decision-making, defensive pressure, or spatial control are inherently speculative. As a result, advanced metrics like expected throw value, coverage quality, or offensive efficiency relative to available options remain underdeveloped.
Missing Events or Incomplete Data: Because the app is operated by a single person traveling with the team, key events are sometimes missed or not recorded in real time. If a throw is not captured, or if the person collecting the data is distracted during an important play, that event won’t be part of the dataset. Missing data creates gaps in the analysis and may require additional handling during the modeling phase.
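A basic completeness check can reveal some of these gaps: every point that starts should end with a goal, an opponent score, or the end of a period before the next point starts. The sketch below works on a plain list of event-type labels. The label `"End of Period"` and the function name are assumptions for illustration; the real labels for period endings may differ.

```python
POINT_STARTS = {"Start O Point", "Start D Point"}
# Assumed labels for events that close a point.
POINT_ENDS = {"Goal", "Score", "End of Period"}


def find_unterminated_points(event_types):
    """Return indices of point starts with no ending event before the next start."""
    unterminated = []
    open_start = None
    for i, event_type in enumerate(event_types):
        if event_type in POINT_STARTS:
            if open_start is not None:
                unterminated.append(open_start)
            open_start = i
        elif event_type in POINT_ENDS:
            open_start = None
    if open_start is not None:
        unterminated.append(open_start)
    return unterminated


# Hypothetical log: the second point never records a Goal or Score.
log = ["Start O Point", "Pass", "Goal",
       "Start D Point", "Block", "Pass",
       "Start O Point", "Pass", "Score"]
missing = find_unterminated_points(log)
```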
Lack of Timestamp Granularity: Another limitation is that game time is only recorded for point starts and some timeouts, while individual throws and other in-play events have no timestamp information. As a result, analysts cannot assess tempo, pacing, or the impact of time-sensitive situations like end-of-quarter pressure. This restricts our ability to model disc movement speed, identify delay strategies, or simulate realistic gameplay scenarios. The lack of throw-level timestamps reduces the depth of temporal analysis and impairs any modeling that depends on understanding how quickly or slowly plays develop over time. Additionally, the timestamps that are recorded are frequently entered incorrectly.
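Since point-start times are measured in seconds into the period, they should never decrease within a period and should stay within the period's length. The sketch below flags violations of both rules. The input format (a list of `(period, seconds)` pairs) and the default period length of 720 seconds are assumptions for illustration; adjust `period_seconds` to the actual period length.

```python
def find_time_anomalies(point_starts, period_seconds=720):
    """Flag point starts that go backwards in time or fall outside the period."""
    anomalies = []
    last_time = {}  # latest plausible start time seen per period
    for i, (period, seconds) in enumerate(point_starts):
        if not 0 <= seconds <= period_seconds:
            anomalies.append((i, "out of range"))
        elif period in last_time and seconds < last_time[period]:
            anomalies.append((i, "goes backwards"))
        else:
            last_time[period] = seconds
    return anomalies


# Hypothetical (period, seconds into period) pairs for point starts.
starts = [(1, 0), (1, 95), (1, 60), (1, 210), (2, 0), (2, 5000)]
time_issues = find_time_anomalies(starts)
```

Flagged entries are left out of the running maximum so that a single bad value does not cause every later point to be flagged.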
Aligning Home and Away Events: Independent event recording by each team creates significant data alignment challenges that cannot be resolved without game footage review. Common discrepancies include: quarter boundary mismatches where one team records a defensive block while the opponent fails to log the corresponding turnover; inconsistent penalty recording across team datasets; and sequence reconstruction issues where off-disc events (such as defensive team injuries) cannot be precisely positioned within the opposing team’s throw sequence. These alignment problems produce incomplete possession chains, statistical inconsistencies in turnover/block counts, and temporal sequence ambiguities that reduce analytical precision. Analysts working with this data should implement validation protocols to identify misalignment patterns and establish consistent reconciliation rules for handling these inherent discrepancies.
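One possible validation check compares the two teams' logs against each other. Following the event list above, a team's own goals appear as “Goal” in its log and as “Score” in the opponent's log, so the two counts should agree. The sketch below assumes each log is reduced to a list of event-type labels; the function names and sample logs are hypothetical.

```python
from collections import Counter


def reconcile_scores(home_events, away_events):
    """Pair each team's own goal count with the opponent's record of it."""
    home = Counter(home_events)
    away = Counter(away_events)
    return {
        # (count in the scoring team's log, count in the opponent's log)
        "home_goals": (home["Goal"], away["Score"]),
        "away_goals": (away["Goal"], home["Score"]),
    }


def find_mismatches(report):
    """Return the keys whose two counts disagree."""
    return [key for key, (own, opp) in report.items() if own != opp]


# Hypothetical logs: the away recorder missed one of the home team's goals.
home_log = ["Goal", "Score", "Goal", "Goal"]
away_log = ["Score", "Goal", "Score"]
report = reconcile_scores(home_log, away_log)
disagreements = find_mismatches(report)
```

The same pattern could be extended to other mirrored event pairs, with the pairing rules stated explicitly so that reconciliation stays consistent across games.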
Limited Granularity of Play-by-Play Data: While the play-by-play data provides a good overview of events as they unfold, it cannot capture the full picture of every play. For example, in a fast-paced game, multiple actions can occur in rapid succession, and these may not be recorded as separate events in the dataset. The dataset does not hold a precise record of every individual action within a play, such as a defensive cut or other off-disc movement.
2.3 Conclusion
Despite its limitations, the manual data collection system established by the UFA in 2021 has been a game-changer for the sport of Ultimate Frisbee. The data collected through this system has provided a wealth of insight into player performance, team dynamics, and game strategies. While there are challenges to working with this data, such as human error and inaccuracies in location data, the overall value it brings to the field of Ultimate Frisbee analytics cannot be overstated.
As the system continues to evolve and improve, we can expect even more accurate and detailed datasets in the future, leading to more refined analytics and greater insights into the game.