Introduction

Enterprises are starting to adopt streaming pipelines to provide insights that adapt to new data in real time. This goes beyond traditional approaches that operate on batches of data. Compared to batch processing, streaming has a clear advantage when results are needed as events happen. For example, in manufacturing, analyzing data from various sensors in real time allows a manufacturer to spot problems and correct them before a product leaves the production line. This improves the efficiency of operations and saves money. When a real-time experience is important (or mandatory), a flexible, scalable and robust streaming platform is usually the better fit.

AllegroGraph is often used as an Entity-Event Knowledge Graph platform. Customers use the entity-event approach in diverse settings such as call centers, hospitals, insurance, aviation and even finance. As an entity-event knowledge graph, AllegroGraph accepts incoming events, runs instant queries and analytics on the new data, and then stores both the events and the results.
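
To make the entity-event idea concrete, here is a minimal Python sketch of how a single incoming event might be expressed as RDF triples attached to its core entity. The `http://example.org/` namespace, the `hasEvent` and `startTime` predicates, and the `event_to_ntriples` helper are all assumptions made for illustration; they are not part of AllegroGraph or of any published vocabulary.

```python
from datetime import datetime, timezone

# Hypothetical namespace and predicate names, used only for illustration.
NS = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def event_to_ntriples(entity_id, event_id, event_type, timestamp):
    """Return N-Triples lines that link one event to its core entity."""
    entity = f"<{NS}entity/{entity_id}>"
    event = f"<{NS}event/{event_id}>"
    return [
        # The event is typed, e.g. as an Admission.
        f"{event} <{RDF_TYPE}> <{NS}{event_type}> .",
        # The core entity points at the event.
        f"{entity} <{NS}hasEvent> {event} .",
        # The event carries its own start time as a typed literal.
        f'{event} <{NS}startTime> "{timestamp.isoformat()}"^^<{XSD_DATETIME}> .',
    ]


triples = event_to_ntriples(
    "patient-42",
    "e-1001",
    "Admission",
    datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
for line in triples:
    print(line)
```

Each new event adds only a handful of triples around an existing entity, which is why this shape suits a steady stream of inserts. Loading the resulting lines into a triple store is left out here, since that step depends on the client in use.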

AllegroGraph is geared to high-speed inserts, so in general it can keep up with high business loads. Even so, for several reasons it is advantageous to couple AllegroGraph with a distributed event streaming platform such as Apache Kafka. Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

For a complete data streaming example that connects AllegroGraph to Apache Kafka, see the Kafka streaming example in the AllegroGraph examples GitHub repository.