About the SportsDataverse
The SportsDataverse is an open-source organization dedicated to making sports data accessible, tidy, and reproducible. We build and maintain a family of packages that share common data representations and a consistent API design, so that moving between sports — or between R, Python, and Node.js — feels familiar instead of foreign.
Our packages provide clean access to play-by-play, box score, schedule, roster, and betting data across the NBA, WNBA, NFL, MLB, NHL, PWHL, college football, men's and women's college basketball, soccer, and more — sourced from ESPN, league Stats APIs, CollegeFootballData, KenPom, The Odds API, Sports Reference, and other providers.
How it started
The first conversations about the SportsDataverse happened at the Carnegie Mellon Sports Analytics Conference. The paper that our lead engineer, Saiem Gilani, wrote for the conference was selected as the winner of the Data and Software contribution in the Open Track of its reproducible research competition.
What began as a single college-football package has grown into a cross-language ecosystem maintained by a community of contributors.
R packages
The R side is the heart of the project, and most of its core packages are published on CRAN. You can install and load the whole core set at once with the sportsdataverse umbrella package, or grab development builds from the SportsDataverse r-universe.
- cfbfastR — clean, tidy college football play-by-play (CollegeFootballData, ESPN).
- hoopR — men's basketball, NBA + NCAA (NBA Stats API, ESPN, KenPom).
- wehoop — women's basketball, WNBA + NCAA (WNBA Stats API, ESPN).
- baseballr — MLB, MiLB, and college baseball (MLB Stats API, FanGraphs, Baseball Reference, NCAA).
- fastRhockey — NHL and PWHL play-by-play and stats.
- sportyR — scaled, rule-book-accurate ggplot2 playing surfaces.
- oddsapiR — sportsbook odds via The Odds API.
A growing set of companion packages — including cfbplotR, cfb4th, recruitR, usfootballR, and softballR — extends the ecosystem further, alongside community projects such as ggshakeR, mlbplotR, and puntr.
Python packages
- sportsdataverse-py — tidy access to data across NBA, WNBA, NFL, MLB, NHL, MBB, WBB, and CFB.
- sportypy — the Python sibling of sportyR, drawing surfaces with matplotlib.
- collegebaseball — college baseball data and analysis.
Node.js packages
- sportsdataverse.js — sports data for the JavaScript and TypeScript ecosystem.
Game on Paper
Beyond the libraries, we build tools on top of the data. Game on Paper delivers live win probability, expected points, and advanced box scores for college football, powered by the same engine that drives cfbfastR.
Get involved
The SportsDataverse is MIT-licensed and community-driven. Whether you want to report a bug, request a sport, contribute a function, or just explore the data, you are welcome here.
- Organization: sportsdataverse.org
- GitHub: github.com/sportsdataverse
Get in touch
Is there something you would like to work on in the SportsDataverse? Whether it's related to work or just a casual conversation, we are here and ready to listen. Please don't hesitate to reach out to us.