Jowanza Joseph

Handling Data In Node.js Using Datalib

JavaScript wasn’t designed for scientific computing, but thanks to the web and the popularity of D3 and other visualization frameworks, more and more people want to do analysis in it. Analysis in JavaScript still isn’t great, but modules like Datalib, Math.js and Vectorious make it much more manageable.

Datalib provides a simple API for basic data analysis and manipulation. It can load data from CSV, TSV, DSV, JSON, TopoJSON and TreeJSON, among others. For this post I want to load data from a CSV and compare the three-point shooting of Steph Curry, Kyle Korver and Lou Williams. I chose these three because Lou Williams is seen as a high-variability player, either hot or not, while Korver and Curry are seen as consistent.

Datalib

First we need to load the data and group it by player; then we can calculate a few statistics. Doing this in Datalib is mostly simple, aside from its disregard for camel case :(.
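To make the group-then-summarize step concrete, here is a minimal dependency-free sketch of what it computes. This is plain JavaScript rather than Datalib's API, and everything in it is an assumption for illustration: the helper names (`groupBy`, `mean`, `variance`, `summarize`), the field names, and the rows themselves, which are made-up placeholders and not the real shooting data.

```javascript
// Hypothetical rows: one object per game with made and attempted threes.
// Placeholder values for illustration only, not real shooting data.
const rows = [
  { player: 'Player A', made: 2, attempted: 4 },
  { player: 'Player A', made: 3, attempted: 4 },
  { player: 'Player A', made: 1, attempted: 4 },
  { player: 'Player B', made: 1, attempted: 2 },
  { player: 'Player B', made: 1, attempted: 2 }
];

// Collect rows into a Map keyed by the value of `key`.
function groupBy(data, key) {
  const groups = new Map();
  for (const row of data) {
    const k = row[key];
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(row);
  }
  return groups;
}

// Arithmetic mean of an array of numbers.
function mean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Sample variance (n - 1 denominator); NaN for fewer than two values.
function variance(xs) {
  if (xs.length < 2) return NaN;
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// One summary row per player: game count, mean and variance of the
// per-game three-point percentage.
function summarize(data) {
  const out = [];
  for (const [player, games] of groupBy(data, 'player')) {
    const pct = games.map((g) => g.made / g.attempted);
    out.push({
      player,
      games: games.length,
      meanPct: mean(pct),
      variancePct: variance(pct)
    });
  }
  return out;
}

const summary = summarize(rows);
console.log(summary);
```

The sketch uses the sample variance of per-game percentages; a different choice, such as population variance or pooling all attempts per player, would give different numbers, so it is worth deciding which one matches the question being asked.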



On first pass I really like the .table() method; it reminds me of features that dataframe APIs offer. It takes JSON and puts it in a nice summary for you, without you having to do anything nifty with lodash. From here you can visualize the data pretty easily, or even take this data and dig deeper.

About the data

Interestingly enough, Lou Williams doesn’t have the highest variance among the three; he is just not as good at making threes. Curry and Korver shoot a much higher percentage, and even with their higher variance they are in a different world than Williams shooting-wise.

Future

I’ve got some lofty plans for Datalib. Next up, I want to probe the package’s limitations by loading some huge JSON and CSV files and trying to run some summaries.