Hi Tip-Sheeters,

This week there's big news in Python-land as Starlette nears its official 1.0 release. I also have an interview with a data scientist who is developing and deploying his code out in public. Let's dive in!

Starlette gets the v1.0.0rc release candidate

There's major news in the Python community this week as Marcelo Trylesinski (aka Kludex) announced the v1.0.0rc release candidate of Starlette. This means the full production 1.0 release is on its way.

Starlette is an important web framework that implements the ASGI standard for building asynchronous Python web apps, APIs, and the like. It is one of the core dependencies of FastAPI (along with Pydantic), which makes it critical to modern AI and data science. It's also a dependency of FastMCP (and of the 1.0 version of FastMCP that's in the official MCP SDK).

Why does v1.0 matter so much? Under Semantic Versioning, a 1.0 release generally means the maintainers are committing to a stable major version that users of the library (and other libraries depending on it) can count on to avoid breaking changes going forward.

A frequent question I see is when FastAPI will get to a 1.0 release, since so many people already count on it in production. The FastAPI maintainer Sebastián Ramírez has often said that FastAPI couldn't get to 1.0 until Starlette did, since it depends heavily on Starlette. So I'm guessing the public drumbeat for a FastAPI 1.0 release will begin now -- exciting stuff! Congrats to Marcelo, original creator Tom Christie, and all the Starlette maintainers on this milestone.

(Side note: if you're using open source code in your projects, have you considered putting a little something in the tip jar for maintainers like Marcelo, Sebastián, and others? You can with a few clicks on https://github.com/open-source/sponsors.)
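To make the SemVer point concrete: once a library ships 1.0, downstream projects can pin to the stable major version and safely pick up fixes and features. A hypothetical requirements.txt fragment (illustrative only, not from any project's docs):

```text
# Accept any 1.x release (bug fixes and new features),
# but never an incompatible 2.0, per SemVer rules.
starlette>=1.0,<2.0
```

Before 1.0, no such guarantee exists, which is why libraries that depend on a pre-1.0 package often have to pin to an exact minor version instead.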
Sean Sullivan: Sharing Fantasy Football Analytics with the world

Earlier this month I read a statement in a LinkedIn post that spoke to me: Fantasy football is cruel. If you have spent any time on this hobby/obsession, those words probably speak to you too.

Sean shared a link to a Streamlit data app that pulls fantasy data from the ESPN fantasy football website using the espn-api Python package (a Python wrapper, aka software development kit, that simplifies using the ESPN API). I checked it out with my ESPN league (one of way... too many that I participate in) and the stats were great. I reached out to Sean, and he was gracious enough to answer a few questions via Google Doc (my own version of asynchronous).

A conversation with Sean Sullivan

Sean Sullivan is a senior data scientist from Chicago, Illinois. He is the founder of URAM Analytics, where he shares his original sports analytics research to support the broader sports data community.

Ryan: Thanks for taking time for our virtual chat. I enjoyed checking out your Beneath the Record app with my own ESPN fantasy league stats. (Turns out I've been a bit fortunate in my league's regular seasons, but don't tell my league mates.) Tell us how you came to develop this application and write about it online.

Sean: First off, thank you for checking it out and running your league through the app! It's always nice to know that someone besides myself is using it. I did not start the project thinking I would turn it into an app or write a blog post about it. Like I mentioned in the post, I mostly built it out of curiosity (and a bit of frustration). I felt like my fantasy football team should have finished better if things had broken slightly differently, and I wanted to see whether that belief was even remotely justified. As with most side projects, the scope slowly expanded.
What started as "How many wins would a simulation suggest?" turned into "What does the distribution of possible finishing places look like?" which then turned into "How often did the simulations think I should have finished first over the league's 15-year history?" The more I explored it, the more interesting questions kept popping up. Once I landed on a set of outputs that I found interesting, it clicked that this might be something other fantasy players would enjoy. In my public sports analytics work, I try to focus on building things people can actually use rather than doing analysis purely for its own sake. Given that mindset, and my familiarity with Streamlit, turning it into an app felt like a natural next step.

As for the blog post, that was very intentional. The app by itself produces results (in the form of charts), but interpretation is where people can get stuck. Writing about the project gave me a chance to explain the motivation, clarify what the outputs mean, and guide how someone might think about the results. It also doubles as documentation of the work and helps people discover the tool through the site.

Ryan: At a glance it looks like you're using Streamlit, which I'm a big fan of. What other software, libraries, and hosting services are you using for Beneath the Record?

Sean: The app is built in Python using Streamlit and relies primarily on pandas, numpy, espn_api, and matplotlib. For deployment, I use Docker and Google Cloud Run.

Ryan: You are using Christian Wendt's espn-api Python library to automatically import data from ESPN. I've used similar libraries for other fantasy league APIs, like Joey Greco's pymfl and sleeper. In your experience, why does having a wrapper library or software development kit (SDK) make using an API more convenient for data scientists?
Sean: For me, the biggest advantage of using a wrapper library or SDK like espn_api is that it handles all the tedious stuff that comes with working directly with an API. Instead of dealing with raw JSON responses, the data is exposed through Python objects, which makes it much faster to get to the fun part: the analysis!

I've also found that well-documented libraries and SDKs are especially helpful. Clear examples and guidance around core functionality make it easier to use the resource correctly. Shoutout to espn_api for doing a nice job on that front as well!

Ryan: Are there any features of APIs that you find helpful when building data science applications or models?

Sean: Stability is the key. When an API behaves predictably, I'm not burning time fixing pipelines because a field name changed or a schema shifted. It lets me focus on modeling and analysis instead of constantly cleaning up avoidable data issues.

Ryan: In addition to Beneath the Record, what other side projects have you enjoyed building and sharing online?

Sean: Lately, I have focused on evaluating strategic decision making by Major League Baseball (MLB) managers, specifically around pitching changes. I built a framework that attempts to break pitching change decisions down from a few angles and uses a mix of machine learning and simulations to assess real decisions made by managers. You can learn more about the project in this introductory blog post. I also maintain a pipeline that runs the evaluation process daily throughout the season and surfaces results in a Looker dashboard. In addition, I built a Streamlit app where users can simulate pitching change decisions for themselves. I am currently exploring extensions of this work, particularly questions like whether a pitcher should be used now versus later in a game.
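A quick aside on Sean's point about wrappers: calling espn_api for real requires live league credentials, so here's a tiny self-contained sketch of the pattern he describes, a thin wrapper that turns a raw JSON response into typed Python objects. The JSON shape and team names below are invented for illustration, not ESPN's actual schema.

```python
import json
from dataclasses import dataclass

# Raw JSON the way an API might return it: nested and awkward to query.
raw = json.loads("""
{"teams": [
  {"location": "Chicago", "nickname": "Bears",
   "record": {"overall": {"wins": 9, "losses": 5}}},
  {"location": "Green Bay", "nickname": "Packers",
   "record": {"overall": {"wins": 7, "losses": 7}}}
]}
""")

@dataclass
class Team:
    """What a wrapper/SDK gives you: the nested-dict digging is done once."""
    name: str
    wins: int
    losses: int

    @classmethod
    def from_json(cls, obj):
        rec = obj["record"]["overall"]
        return cls(f'{obj["location"]} {obj["nickname"]}', rec["wins"], rec["losses"])

teams = [Team.from_json(t) for t in raw["teams"]]

# Analysis code now reads like analysis, not JSON spelunking.
best = max(teams, key=lambda t: t.wins)
print(best.name, best.wins)  # prints: Chicago Bears 9
```

Multiply that convenience across dozens of fields and endpoints and you get Sean's "much faster to get to the fun part."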
Beyond that, one of my more recent projects involved developing a data-driven framework for suggesting pitch profile changes for MLB pitchers, aimed at improving outcomes like whiff rate or groundball rate. That project received a very positive response from the baseball analytics community, so it's one I'm especially proud to share.

Ryan: Are there particular cloud platforms or services you rely on for these projects?

Sean: Yes! I lean pretty heavily on my Google Cloud Platform account. Vertex AI, BigQuery, Cloud Run, and Looker Studio are probably the tools I use most often. My apps are not running at a huge scale, so costs stay very reasonable. Having access to that ecosystem is well worth it for the types of projects that I like to build.

Ryan: How are you adding generative AI and LLMs to your tool set for your side projects?

Sean: I think LLMs can be very useful tools, depending on the use case. For my side projects, I am usually trying to explore my own ideas and approaches to sports data, so I stay away from relying on LLMs for ideation or heavy code generation. Part of the value of these projects for me is working through those problems myself. Where I do find LLMs helpful is in more supportive roles. I often use them for things like documenting code, proofreading blog posts for clarity, and recalling the steps needed to deploy Streamlit apps. They're especially useful as a productivity and refinement aid rather than a primary driver of the work.

Ryan: As a working data scientist, what are the benefits of having side projects like these for you professionally?

Sean: I think the biggest professional benefit of side projects is simply reach and visibility. In most jobs (like my own), your work is typically seen by a fairly small group of stakeholders. With side projects, you have the opportunity to put your thinking and problem-solving style in front of a much broader audience. I've seen very direct benefits from that.
Past projects have led to conversations with MLB teams and even gave me the opportunity to work with a college baseball program. More recently, one of my projects caught the attention of another college baseball coaching staff, which led to ongoing conversations about potentially working together. Those are opportunities that would likely never have happened if the work hadn't made it to the internet.

Side projects also act as a very strong proof of concept. They give you something concrete to point to when discussing your skills and interests. Anecdotally, my own projects and writing played a meaningful role in helping me land previous roles. Now that I'm involved in interviewing candidates, I find myself drawn to blogs and GitHub repositories when provided, because they reveal how someone actually thinks and works.

Side projects are also a great space for experimentation. They let you try new ideas, tools, and approaches that you might not have room to pursue in your day job. I've absolutely carried concepts and frameworks from personal projects back into my work.

Ryan: Beyond building in the open and writing about your projects, are there any other methods of skill development that you have benefited from?

Sean: One thing that has been especially valuable for me is building relationships within the sports analytics community. I'm a member of the Chicago Area Sports Analytics group, which hosts meetups throughout the year (reach out if you're interested). Beyond being a great group socially, it's been useful from a technical perspective. Having peers you can exchange ideas with, ask questions of, and get feedback from is incredibly helpful.

At work, there are also many opportunities for learning and exposure. We have a weekly series where teams present their projects, which is a great way to see different problem sets and solution approaches. There are also internal hackathons and rotation programs that encourage experimentation and trying new tools or modeling techniques.
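Circling back to Beneath the Record for a moment: Sean's actual methodology isn't published here, but the "distribution of possible finishing places" question he described is a classic Monte Carlo setup. Here's a minimal sketch of the idea, with invented team names and scoring parameters, and ranking by total points rather than a real head-to-head schedule for simplicity.

```python
import random
from collections import Counter

def simulate_finishes(team_means, team_sds, n_weeks=14, n_sims=2000, seed=42):
    """Re-simulate a fantasy season many times and count how often
    each team ends up in each finishing place."""
    rng = random.Random(seed)
    finishes = {team: Counter() for team in team_means}
    for _ in range(n_sims):
        # Draw a full season of weekly scores for every team.
        totals = {
            team: sum(rng.gauss(team_means[team], team_sds[team])
                      for _ in range(n_weeks))
            for team in team_means
        }
        # Rank teams by simulated season total (1 = best).
        ranked = sorted(totals, key=totals.get, reverse=True)
        for place, team in enumerate(ranked, start=1):
            finishes[team][place] += 1
    return finishes

# Toy league: (mean weekly points, weekly standard deviation) per team.
league = {"Aardvarks": (110, 20), "Bears": (105, 25), "Cheetahs": (95, 15)}
means = {t: m for t, (m, _) in league.items()}
sds = {t: s for t, (_, s) in league.items()}

results = simulate_finishes(means, sds)
for team, counts in results.items():
    print(f"{team}: finished 1st in {100 * counts[1] / 2000:.1f}% of simulations")
```

Comparing a team's simulated finishing distribution to its actual finish is what lets you say things like "the simulations think I should have won more often than I did," which is exactly the itch Sean set out to scratch.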
Ryan: What tips would you give to people who are working as software developers or data scientists and want to continue developing their technical skills?

Sean: Work on problems you find genuinely interesting. That genuine interest really lowers the mental barrier to working on new ideas or pushing through harder concepts. I honestly don't think I'd be nearly as consistent with building projects or learning new techniques if I forced myself to work only on more traditional "industry" problems like search ranking or recommendation systems. And that's not to knock those areas at all. It's just that motivation matters a lot. If you can tie learning to a personal interest, it becomes much easier to stay curious and keep improving over time.

Ryan: What other ideas or thoughts would you like to pass along to my Tip Sheet audience?

Sean: Most of my projects started from a simple curiosity or a small question I wanted to explore. They were not initially designed to become polished apps or long-running research projects. But there is a lot of long-term value in simply following those threads and seeing where they lead.

I also think there is a lot of value in building things that are imperfect. Struggling with messy data, unclear assumptions, or awkward design decisions is where much of the learning actually happens. Waiting until an idea feels "good enough" or impressive enough can easily become a barrier to doing the work at all. It's cliché, but don't let perfect be the enemy of good. If you stay curious, have fun, and keep working, the skill development tends to follow pretty naturally.

You can check out the story and app here: Beneath the Record: Who Should Have Won Your Fantasy Football League. Thanks so much to Sean for taking time to chat with the Tip Sheet.

Keep coding,

Ryan Day

👉 https://tips.handsonapibook.com/ -- no spam, just a short email every week or two.
This is my weekly newsletter where I share some useful tips that I've learned while researching and writing the book Hands-on APIs for AI and Data Science, a #1 New Release from O'Reilly Publishing.