Tip Sheet #65: Data Scientist, Doctor, Football Aficionado


Hi Tip-Sheeters,

This week I'm sharing highlights of a virtual chat I had with Ricardo Heredia Joya, a Madrid-based data scientist with a background as a medical doctor. He is active in the football analytics community and shares his thoughts and tips in the Underdoc Substack.

(If you're enjoying the Tip Sheet, I'm sure you'll get a lot from Ricardo's newsletter.)

You are your best asset, so never stop learning.
- Ricardo Heredia Joya

Q1: Thanks for having a virtual chat with the Tip Sheet subscribers. I have enjoyed the hands-on articles you have shared in your Substack on topics like MCP and football data sources. What has motivated you to take the time to write your newsletter?

Thank you, Ryan, for letting me share my thoughts with your subscribers.

What motivated me to share my thoughts and experience in my newsletter was the chance to offer what I would’ve liked to have when I pivoted to data: real-world testimonies from people in my niches of interest. I believe that helps anyone getting into an exciting field like data and AI.

Q2: You have an interesting backstory about starting as a medical doctor and coming to data science through soccer analytics. What was it about sports analytics that interested you so much?

The possibility of transferring my analytical skills into something I love: soccer. I felt that if I applied my new knowledge about data to a field I love, I’d feel much more motivated to hone these skills.

Q3: Your article mentioned that you “joined a non-profit group of analysts working with NWSL and national teams”. That sounds really interesting. Tell us more about that group.

I was invited to join this group thanks to a LinkedIn message I sent to one of the members about popular data sources for my own projects. He then invited me to join the team and I didn’t hesitate.

The group was made up of data analysts and data scientists from around the world (US, Chile, UK, Argentina, Spain), split into teams such as Scouting, Open Play Analysis, and Corner Analysis. The goal was to help NWSL teams improve their game through analytics, and to lay the foundations of a solid global sports data consultancy.

Q4: It seems like you have done a lot of your learning outside of business hours on side projects. Why are side projects valuable for people learning about data science and growing their skills?

I found out early on in my data career that side projects are one of the best ways to stand out from other candidates. They help you show what you can actually build, and they let others see your way of thinking, how you make decisions, and how that translates into real value (something people can actually use).

Q5: Do you have a core set of technologies or services that you use repeatedly for side projects?

I use agentic coding tools like Claude Code to help me scaffold the project. I mainly use Python along with SQL for ETL pipelines, tools like Dagster for orchestration, MinIO (storage) or DuckDB, and Docker for containerization.

More recently, since I mix data engineering/science with generative AI, I use a mix of open-source models (Llama 3.1) and closed-source models (mainly Anthropic), along with Opik to evaluate prompts and the results of my apps.

Since I’ve been building RAG pipelines, I’ve included embedding models and frameworks like LlamaIndex, and more recently I’ve been learning the foundations of agentic development, like context engineering, in order to improve the information fed to my genAI apps.
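
As a rough illustration of the "Python along with SQL for ETL" pattern Ricardo describes, here is a minimal sketch. It is a toy under stated assumptions, not his actual setup: it uses Python's built-in sqlite3 module as a stand-in for DuckDB so it runs with no installs, and the sample rows, table name, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows, made up for illustration only:
# (team, opponent, goals_for, goals_against), with goals as raw strings.
RAW_MATCHES = [
    ("Team A", "Team B", "2", "1"),
    ("Team B", "Team C", "0", "0"),
    ("Team A", "Team C", "3", "1"),
    ("Team C", "Team A", "1", "1"),
]


def extract():
    """Extract: return the raw rows (a real pipeline would read a file or an API)."""
    return list(RAW_MATCHES)


def transform(rows):
    """Transform: cast the goal columns from strings to integers."""
    return [(team, opp, int(gf), int(ga)) for team, opp, gf, ga in rows]


def load(conn, rows):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches ("
        "team TEXT, opponent TEXT, goals_for INTEGER, goals_against INTEGER)"
    )
    conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def goals_by_team(conn):
    """Query the loaded table with plain SQL: total goals scored per team."""
    cur = conn.execute(
        "SELECT team, SUM(goals_for) FROM matches GROUP BY team ORDER BY team"
    )
    return cur.fetchall()


def run_pipeline():
    # sqlite3 in-memory database: an assumption standing in for DuckDB.
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, transform(extract()))
        return goals_by_team(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline())
```

In a fuller project, each of these functions would be a natural candidate for a step managed by an orchestrator, with the database swapped for whatever storage the project actually uses.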


Q6: Beyond building in the open and writing about your projects, do you have any other methods of skill development that you have benefited from?

I’ve learnt to combine three methods:

  • Structured learning through courses or books
  • Project-based learning, the main one I use to apply all the knowledge and make it stick
  • Unstructured learning, like reading blog posts, newsletters like yours, and social media (having a well-curated timeline helps a lot 😂)

I use all of them to improve my skills. The sweet spot is knowing when to lean in to keep yourself sharp.

Q7: What tips would you give to people who are working as software developers or data scientists and want to continue developing their technical skills?

  • You are your best asset, so never stop learning.
  • Projects > Certifications.
  • If you feel frustrated, keep practicing; it’s completely normal and necessary for developing new skills.
  • Write about what you learn and share it online (blog, X/Twitter, LinkedIn, wherever). It doesn’t matter if you believe it’s a “small project”; it’ll help somebody.

Find some time to rest your mind and body; it helps you enjoy the process.

Q8: What other ideas or thoughts would you like to pass along to my Tip Sheet audience?

We are in the best era to create things.

Use AI as leverage to improve yourself and to understand the foundations of anything you learn, not only to accelerate the learning.

Aim to transfer capacities from different fields and make connections between them. That’s probably where your moat is. It happened to me with data+healthcare and data+sports.

-----

Great stuff from Ricardo -- be sure to check out his newsletter The Underdoc.

Keep coding,

Ryan Day

👉 https://tips.handsonapibook.com/ -- no spam, just a short email every week or so.

This is my weekly newsletter, where I share useful tips that I've learned while researching and writing the book Hands-on APIs for AI and Data Science, a #1 New Release from O'Reilly Publishing.
