Blog

Posts in Apache Spark
Why h2o Sparkling Water?

Despite having an SEO hostile name, h2o.ai is a pretty cool company. They have developed a great open source plug-and-play data science platform in h2o. They some other projects that are noteworthy and of course Sparkling Water, the subject of this post. Sparkling Water is essentially the h2o APIs on top of Spark, allowing the power of h20 to take advantage of Sparks distributed computing model. That being said, is it worth it to load another dependency when Sparks MLLib is adequate for most machine learning needs? I went through this exercise a few weeks ago and this post is mostly my notes with some added illustration and some code.

Read More