TRACK: Big Data
Implementing BigPetStore with Apache Spark – use cases of a unified engine
A unified data processing engine empowers Big Data application developers because it makes connections between seemingly unrelated use cases natural. This talk discusses the Spark implementation of BigPetStore, a project that is part of Apache Bigtop. The implementation uses the Spark RDD API to generate transaction data, DataFrames and Spark SQL for ETL and reporting, MLlib to build a recommender system on the transaction data, and Spark Streaming to serve online recommendations.