
International Journal of Leading Research Publication
E-ISSN: 2582-8010
Stream Processing Internals and Usecases
Author(s) | Arjun Reddy Lingala |
---|---|
Country | United States |
Abstract | Batch processing is a widely used concept in data warehousing, where many companies build analytical solutions that derive insights from their systems and develop new products based on that analysis. The exponential growth of real-time data sources such as Internet of Things (IoT) sensors and social media has necessitated systems capable of processing unbounded data streams with low latency, high throughput, and guaranteed correctness. Unlike batch processing, stream processing engines must handle continuous data flows with dynamic arrival patterns, out-of-order events, and variable workloads. The problem with daily batch processes is that changes in the input are reflected in the output only a day later, which is too slow for some use cases. To reduce the delay, we can run the processing more frequently. In the batch processing world, the inputs and outputs of a job are files, which may reside on distributed storage such as HDFS [1] or Amazon S3 [2]. Stream processing has emerged as a critical computational model that enables real-time ingestion, transformation, and analysis of continuous data streams. This paper presents a comprehensive exploration of the need for stream processing, identifying the use cases where it is preferable to batch processing and its suitability for latency-sensitive applications such as financial trading, fraud detection, and IoT systems. We begin by establishing the fundamental motivation behind stream processing, outlining key challenges associated with real-time data analytics, including data velocity, system scalability, and fault tolerance. The discussion highlights the limitations of traditional batch processing frameworks like Apache Hadoop and their inability to efficiently handle continuous data flows. In contrast, we analyze how stream processing frameworks such as Apache Kafka [3], Apache Flink [4], Apache Storm [5], and Spark Streaming [6] address these challenges by enabling near real-time event-driven computations. |
Keywords | Stream Processing, Data warehouse, Windowing, Change Data Capture, Message Queues, Message Brokers, Kafka, Flink, Storm, Real-time |
Field | Engineering |
Published In | Volume 5, Issue 4, April 2024 |
Published On | 2024-04-10 |
Cite This | Stream Processing Internals and Usecases - Arjun Reddy Lingala - IJLRP Volume 5, Issue 4, April 2024. DOI 10.5281/zenodo.14945841 |
DOI | https://doi.org/10.5281/zenodo.14945841 |
Short DOI | https://doi.org/g86pfm |
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
