I recently finished a project using the dataset from DrivenData’s Pump It Up: Mining the Water Table competition. I’m going to walk through the Random Forest Classifier, which performed best of the classifiers I tested once I had tuned its hyperparameters.
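For readers who have not tuned a Random Forest before, here is a minimal sketch of the general idea. It assumes scikit-learn, uses synthetic three-class data from `make_classification` as a stand-in for the real dataset, and searches a small illustrative grid. The grid values, split sizes, and variable names are placeholders for illustration, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): three classes, not the real competition data.
X, y = make_classification(
    n_samples=600,
    n_features=12,
    n_informative=6,
    n_classes=3,
    random_state=42,
)

# Hold out a test set so the tuned model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Illustrative grid only; a real search would usually cover more values.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 10],
    "min_samples_leaf": [1, 5],
}

# Try every combination in the grid with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# best_estimator_ is refit on the full training split with the best parameters.
test_accuracy = search.best_estimator_.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print("Held-out accuracy:", round(test_accuracy, 3))
```

A grid search is only one option. For larger search spaces, `RandomizedSearchCV` samples a fixed number of combinations instead of trying them all, which is usually cheaper.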

I won’t go into it here, but there is a significant amount of data cleaning and feature selection to do before the data is ready for a model. …

Mike Erb

Data Scientist with a background in Computer Science and as an Entrepreneur in the Bike industry — Based in Ithaca, NY

