14 November 2017 on projects
I've been borderline obsessed with the eephus pitch for some time now. Every time I see a player pull this pitch out of their arsenal I become equal parts excited and bamboozled. Startlingly little research has been done to date on this uncommon pitch, and thus, this post is going to serve as an exploratory analysis of and tribute to the mythical eephus.
14 August 2017 on data-science
For the past three months I have had the exciting opportunity to work as a data scientist at Major League Baseball Advanced Media, the technology arm of MLB. This post gives an overview of what I've been working on and the advice I would give a fellow first-time data scientist on their first day on the job.
27 July 2017 on projects, open-source
Throughout my baseball-facing work at MLB Advanced Media, I came to realize that there was no reliable Python tool available for sabermetric research and advanced baseball statistics. As a response to this, I built pybaseball - a Python package for baseball data analysis.
04 June 2017 on summer, ml, reading
Inspired by a similar project by Chris Albon, I am sharing my day-to-day progress on my summer goals for becoming a better data scientist.
12 January 2017 on projects, coffee, personal
Each cup of coffee I have consumed in the past 5 months has been logged on a spreadsheet. Here's what I've learned by data sciencing my coffee consumption.
01 November 2016 on projects
Literature is a tricky area for data science. Think of your five favorite books. What do they have in common? Some may share an author or genre, but besides that, it is probably hard for you to think of what traits they share. My team and I set out to explore the mysterious components of an individual’s literary taste profile, and in the process built a content-based recommender system for books. This post is a brief overview of the system, the features it uses, and how it was built.
15 May 2016 on personal, reading
A collection of some of my favorite books. Business, popular economics, stats and machine learning, and some literature.
10 January 2016 on projects, ML, data science, field goal
Probabilistic modeling on NFL field goal data. Applying logistic regression, random forests, and neural networks in R to measure contributing factors of field goal success, and then using this model to rate kickers by points added above the expected value. Published in Elements Research Journal Fall 2016, presented at Boston College Big Data Research Symposium Spring 2015.
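To make the "points added above expected" idea concrete, here is a minimal Python sketch. It is only an illustration, not the code from the project (which was written in R): the logistic coefficients, the `make_probability` and `points_added` helpers, and the toy attempts are all invented, and distance is the only feature used.

```python
import math
from collections import defaultdict

# Hypothetical logistic coefficients, chosen for illustration only
# (not fitted to real NFL data): make probability falls with distance.
INTERCEPT = 6.0
SLOPE_PER_YARD = -0.12
POINTS_PER_FIELD_GOAL = 3


def make_probability(distance_yards):
    """Modeled probability that a kick from this distance is good."""
    z = INTERCEPT + SLOPE_PER_YARD * distance_yards
    return 1.0 / (1.0 + math.exp(-z))


def points_added(attempts):
    """Sum (actual points - expected points) per kicker.

    attempts: iterable of (kicker, distance_yards, made) tuples.
    """
    totals = defaultdict(float)
    for kicker, distance, made in attempts:
        expected = POINTS_PER_FIELD_GOAL * make_probability(distance)
        actual = POINTS_PER_FIELD_GOAL if made else 0
        totals[kicker] += actual - expected
    return dict(totals)


# Toy attempts (invented): both kickers make a 30-yarder,
# but only Kicker A converts the 50-yarder.
attempts = [
    ("Kicker A", 30, True),
    ("Kicker A", 50, True),
    ("Kicker B", 30, True),
    ("Kicker B", 50, False),
]

ratings = points_added(attempts)
for kicker, value in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{kicker}: {value:+.2f} points above expected")
```

A kicker earns credit in proportion to how unlikely each made kick was, so a converted long attempt is worth more than a converted short one, and a miss costs the expected points for that distance.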