Modeling and Simulation of Spark Streaming

Abstract

As more and more devices connect to the Internet of Things, they will generate unbounded streams of data that have to be processed “on the fly” in order to trigger automated actions and deliver real-time services. Spark Streaming is a popular real-time stream processing framework. Making efficient use of Spark Streaming and achieving stable stream processing requires a careful interplay between different parameter configurations. Mistakes may lead to significant resource overprovisioning and poor performance. To alleviate such issues, this paper develops an executable and configurable model named SSP (Spark Streaming Processing) to model and simulate Spark Streaming. SSP is written in ABS, a formal, executable, object-oriented language for modeling distributed systems by means of concurrent object groups. SSP allows users to rapidly evaluate and compare different parameter configurations without deploying their applications on a cluster or cloud. The simulation results show that SSP is able to mimic Spark Streaming in different scenarios.

Publication
In Proc. 32nd Intl. Conf. on Advanced Information Networking and Applications (AINA). © IEEE CS Press 2018.
Ingrid Chieh Yu
Associate professor