This project was a technical deep dive into the world of sports data, focused on uncovering insights from historical betting odds and fantasy football projections. It was designed to serve two core goals: guiding smarter decisions in sports betting and fantasy roster management, and challenging myself by building a complete end-to-end data solution using a unified platform.
The pipeline covers everything from data ingestion and transformation to validation and insight generation, processing both historical and regularly updated datasets. It allowed me to apply scalable, production-ready design patterns while deepening my understanding of modern data workflows.
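To make the ingestion, transformation, and validation stages concrete, here is a minimal plain-Python sketch of that flow for betting odds. It is an illustration only, not the project's actual code: the record fields (`game_id`, `team`, `american_odds`), the function names, and the sample values are all hypothetical, and the real pipeline runs on Databricks with PySpark rather than plain Python lists.

```python
from dataclasses import dataclass


@dataclass
class OddsRow:
    # Hypothetical schema for one betting line; the real API fields may differ.
    game_id: str
    team: str
    american_odds: int


def ingest(raw_records):
    """Convert raw dicts (e.g. parsed API JSON) into typed rows."""
    return [
        OddsRow(r["game_id"], r["team"], int(r["american_odds"]))
        for r in raw_records
    ]


def implied_probability(american_odds):
    """Standard conversion from American odds to implied win probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def transform(rows):
    """Add an implied probability column to each row."""
    return [
        {
            "game_id": row.game_id,
            "team": row.team,
            "american_odds": row.american_odds,
            "implied_prob": implied_probability(row.american_odds),
        }
        for row in rows
    ]


def validate(records):
    """Split records into (valid, rejected) using simple sanity checks."""
    valid, rejected = [], []
    for rec in records:
        ok = (
            bool(rec["team"])                    # team name must be present
            and abs(rec["american_odds"]) >= 100  # American odds are never inside (-100, 100)
            and 0 < rec["implied_prob"] < 1
        )
        (valid if ok else rejected).append(rec)
    return valid, rejected


# Made-up sample input, including one deliberately bad record.
raw = [
    {"game_id": "g1", "team": "A", "american_odds": "-150"},
    {"game_id": "g1", "team": "B", "american_odds": "+130"},
    {"game_id": "g2", "team": "", "american_odds": "50"},
]
valid, rejected = validate(transform(ingest(raw)))
```

Keeping each stage as a separate function mirrors the layered structure of the pipeline. In a Delta Live Tables setup, checks like the ones in `validate` would more likely be declared as expectations on a table than written as a hand-rolled loop.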
While the initial implementation was built on a single platform for learning purposes, future iterations will transition to a more modular architecture. This shift will better align with the project's evolving goals, reduce costs, and offer greater flexibility for long-term development.
My goal is to have this ready before the next season, and I hope others can use it to inform their betting and fantasy decisions!
I don't have a live view set up yet. While that's in progress, enjoy these snapshots of the visuals!
Check out the repo for a deeper dive:
GitHub Repo


- Databricks (Delta Live Tables)
- PySpark (Transformations)
- Delta Lake (Storage & Lineage)
- APIs (Betting Odds & Player Stats)
- Pandas (Prototyping)