Blog — Jowanza Joseph

Distributed Data Querying with Alluxio

This post describes how I used Alluxio to reduce p50 and p99 query latencies and lower overall platform costs for a distributed querying application. I walk through the product and architecture decisions that led to our final architecture, discuss the tradeoffs, share some statistics on the improvements, and outline future improvements to the system.

Architecture | Jowanza Joseph | May 20, 2019 | Architecture

Efficient Stream Processing with Pulsar Functions

A walkthrough of migrating a streaming workflow from AWS Kinesis and Spark Streaming to Apache Pulsar and Pulsar Functions.

Architecture | Jowanza Joseph | March 11, 2019 | Architecture

To The Books I Never Read

Some books I haven’t been able to read yet.

Jowanza Joseph | January 23, 2019

Serverless Model Serving: OpenWhisk, Apache Spark and MLeap

A tutorial on serving Spark MLlib models with Apache OpenWhisk and MLeap.

Serverless | Jowanza Joseph | January 14, 2019 | Data Engineering

Quantified Me

My setup for, and critique of, the quantified self.

Personal | Jowanza Joseph | January 10, 2019 | personal