Primary Role
Vice President of Data Science at BigML
1 Investment in 1 Company
June, 1970

Person Details


As a data guy, I've been fortunate to work on some pretty cool stuff. At Yahoo, I led the project to collect billions of URL clickstreams from Toolbar and record them in Hadoop, data that was used to launch multiple new features for production web search ranking. I'm proud to have this recommendation from a member of the Yahoo data science team:

"David is an enabler for researchers, and works tirelessly to build systems that can be leveraged in multiple ways. He has the vision and foresight to push for the right things, and the determination to deliver them even when it requires shifting a company's corporate inertia to do so."

At Microsoft, Omar Alonso and I used labeled Twitter data to train a model that detected uninteresting tweets with high accuracy. During this project, we stumbled on the idea of using human computation to create features (instead of just labels) in training sets for supervised learning. Such "faux features" could save engineering time by identifying which features are actually worthy of machine computation. Someday, I'd love to explore this idea further.

At Groupon, I built an elite data science team that trained the first machine-learned models for mobile relevance and created the mobile A/B test framework from scratch. I also worked as an individual contributor, querying Hadoop, Vertica, Teradata and MySQL in my daily work. One of my favorite projects was using log data in Hadoop to train a model that predicts customer inactivity ("churn") for Groupon mobile users. I'm proud to have this recommendation from one of my team members:

"[David] has a deep understanding of predictive modeling and analytics, which allowed our team to make major contributions to bucket testing and machine learned relevance. His vision for the team was excellent and resulted in maximal impact at the company."

At BigML, I'm focused on improving and evangelizing the company's cloud-based machine learning tool. I've spoken on a wide variety of machine learning topics at venues including SVForum, DataBeat, DataWeek, the SF Machine Learning Meetup, the LA Machine Learning Meetup, Predictive Analytics and Data Science 2014, Zipfian Academy and StumbleUpon.

Finally, my firsthand experience with the many forms of data pain has motivated me to invest in several startups, including Platfora, SiSense, Interana, Alation and UPSHOT.

Experience (7)

  • Vice President of Data Science
    Jul 1, 2013
  • Director of Data Science
    Jun, 2010 - Jul, 2013
  • Principal Program Manager
    May, 2009 - Jun, 2010
  • Director of Analytics
    Sep, 2007 - Feb, 2009
  • 2009 - Jan, 2009
  • Principal Technical Yahoo!
    2003 - Jan, 2007
  • Founder and CEO
    2000 - Jan, 2001

Education (2)


Investments (1)