As a data guy, I've been fortunate to work on some pretty cool stuff. At Yahoo, I led the project to collect billions of URL clickstreams from Toolbar and record them in Hadoop; that data was used to launch multiple new features for production web search ranking. I'm proud to have this recommendation from a member of the Yahoo data science team:
"David is an enabler for researchers, and works tirelessly to build systems that can be leveraged in multiple ways. He has the vision and foresight to push for the right things, and the determination to deliver them even when it requires shifting a company's corporate inertia to do so."
At Microsoft, Omar Alonso and I used labeled Twitter data to train a model that detected uninteresting tweets with high accuracy. During this project, we stumbled on the idea of using human computation to create features (instead of just labels) in training sets for supervised learning. Such "faux features" could save engineering time by identifying which features are actually worthy of machine computation. Someday, I'd love to explore this idea further.
At Groupon, I built an elite data science team that trained the first machine-learned models for mobile relevance and created the mobile A/B test framework from scratch. I also worked as an individual contributor, querying Hadoop, Vertica, Teradata and MySQL in my daily work. One of my favorite projects was using log data in Hadoop to train a model that predicts customer inactivity ("churn") for Groupon mobile users. I'm proud to have this recommendation from one of my team members:
"[David] has a deep understanding of predictive modeling and analytics, which allowed our team to make major contributions to bucket testing and machine learned relevance. His vision for the team was excellent and resulted in maximal impact at the company."
At BigML, I'm focused on improving and evangelizing the company's cloud-based machine learning tool. I've spoken on a wide variety of machine learning topics at venues including SVForum, DataBeat, DataWeek, the SF Machine Learning Meetup, the LA Machine Learning Meetup, Predictive Analytics and Data Science 2014, Zipfian Academy and StumbleUpon.
Finally, my firsthand experience with the many forms of data pain has motivated me to invest in several startups, including Platfora, SiSense, Interana, Alation and UPSHOT.
Vice President of Data Science
Jul 1, 2013
Director of Data Science
Jun, 2010 - Jul, 2013
Principal Program Manager
May, 2009 - Jun, 2010
Director of Analytics
Sep, 2007 - Feb, 2009
2009 - Jan, 2009
Principal Technical Yahoo!
2003 - Jan, 2007
Founder and CEO
2000 - Jan, 2001