About the SportsDataverse

The SportsDataverse is an open-source organization dedicated to making sports data accessible, tidy, and reproducible. We build and maintain a family of packages that share common data representations and a consistent API design, so that moving between sports — or between R, Python, and Node.js — feels familiar instead of foreign.

Our packages provide clean access to play-by-play, box score, schedule, roster, and betting data across the NBA, WNBA, NFL, MLB, NHL, PWHL, college football, men's and women's college basketball, soccer, and more — sourced from ESPN, league Stats APIs, CollegeFootballData, KenPom, The Odds API, Sports Reference, and other providers.

How it started

The first conversations about the SportsDataverse took place at the Carnegie Mellon Sports Analytics Conference, where a paper by our lead engineer, Saiem Gilani, was selected as the winning Data and Software contribution in the Open Track of the reproducible research competition.

What began as a single college-football package has grown into a cross-language ecosystem maintained by a community of contributors.

R packages

The R side is the heart of the project, and most of its core packages are published on CRAN. You can install and load the whole core set at once with the sportsdataverse umbrella package, or grab development builds from the SportsDataverse r-universe.

  • cfbfastR — clean, tidy college football play-by-play (CollegeFootballData, ESPN).
  • hoopR — men's basketball, NBA + NCAA (NBA Stats API, ESPN, KenPom).
  • wehoop — women's basketball, WNBA + NCAA (WNBA Stats API, ESPN).
  • baseballr — MLB, MiLB, and college baseball (MLB Stats API, FanGraphs, Baseball Reference, NCAA).
  • fastRhockey — NHL and PWHL play-by-play and stats.
  • sportyR — scaled, rule-book-accurate ggplot2 playing surfaces.
  • oddsapiR — sportsbook odds via The Odds API.

A growing set of companion packages — including cfbplotR, cfb4th, recruitR, usfootballR, and softballR — extends the ecosystem further, alongside community projects such as ggshakeR, mlbplotR, and puntr.
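
As a rough sketch of the install path described at the top of this section, the umbrella package could be pulled from the r-universe along these lines. The repository URL is an assumption based on the usual r-universe naming scheme (organization name followed by .r-universe.dev), so check it against the project's own instructions before relying on it.

```r
# Sketch only: the r-universe URL is an assumption, not taken from this page.
install.packages(
  "sportsdataverse",
  repos = c(
    sportsdataverse = "https://sportsdataverse.r-universe.dev",  # assumed URL
    CRAN = "https://cloud.r-project.org"  # fallback for dependencies hosted on CRAN
  )
)

# Attach the umbrella package; per the description above, this is
# expected to load the core set of packages in one step.
library(sportsdataverse)
```

Listing a CRAN mirror alongside the r-universe repository lets R resolve dependencies that are not hosted on the r-universe itself.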

Python packages

  • sportsdataverse-py — tidy access to data across NBA, WNBA, NFL, MLB, NHL, MBB, WBB, and CFB.
  • sportypy — the Python sibling of sportyR, drawing surfaces with matplotlib.
  • collegebaseball — college baseball data and analysis.

Node.js packages

Game on Paper

Beyond the libraries, we build tools on top of the data. Game on Paper delivers live win probability, expected points, and advanced box scores for college football, powered by the same engine that drives cfbfastR.

Get involved

The SportsDataverse is MIT-licensed and community-driven. Whether you want to report a bug, request a sport, contribute a function, or just explore the data, you are welcome here.

Get in touch

Is there something you would like to work on in the SportsDataverse? Whether it is a work project or just a casual conversation, we are ready to listen, so please don't hesitate to reach out.
