Introducing pybaseball: an Open Source Package for Baseball Data Analysis

27 July 2017 on projects, open-source

Throughout my baseball-facing work at MLB Advanced Media, I came to realize that there was no reliable Python tool available for sabermetric research and advanced baseball statistics. In response, I built pybaseball, a Python package for baseball data analysis.

Summer of Machine Learning

04 June 2017 on summer, ml, reading

Inspired by a similar project by Chris Albon, I am sharing my day-to-day progress on my summer goals for becoming a better data scientist.

338 Cups of Coffee

12 January 2017 on projects, coffee, personal

Each cup of coffee I have consumed in the past 5 months has been logged on a spreadsheet. Here's what I've learned by data sciencing my coffee consumption.

Building a Content-Based Recommender System for Books: Using Natural Language Processing to Understand Literary Preference

01 November 2016 on projects

Literature is a tricky area for data science. Think of your five favorite books. What do they have in common? Some may share an author or genre, but besides that, it is probably hard for you to think of what traits they share. My team and I set out to explore the mysterious components of an individual’s literary taste profile, and in the process built a content-based recommender system for books. This post is a brief overview of the system, the features it uses, and how it was built.

Bookshelf

15 May 2016 on personal, reading

A collection of some of my favorite books. Business, popular economics, stats and machine learning, and some literature.

Machine Learning and the NFL Field Goal: Using Statistical Learning Techniques to Isolate Placekicker Ability

10 January 2016 on projects, ML, data, science, field, goal

Probabilistic modeling on NFL field goal data. Applying logistic regression, random forests, and neural networks in R to measure the factors that contribute to field goal success, and then using this model to rate kickers by points added above the expected value. Published in Elements Research Journal Fall 2016, presented at Boston College Big Data Research Symposium Spring 2015.