
Posts in Apache Spark
Apache Spark: Linear Regression With Stochastic Gradient Descent

When coming to Spark from a background in R or Python pandas, you’ll likely get tripped up on a few things. The most notable of these is the difference between the R and pandas dataframe APIs and the Spark DataFrame API. Furthermore, not all models in Spark are fit with a DataFrame, and the interplay between DataFrames and RDDs (resilient distributed datasets) is not so obvious.
