Apache Kafka and KSQL in Action: Let’s Build a Streaming Data Pipeline!

Wednesday, October 31, 2018 - 11:45 am–12:30 pm

Robin Moffatt, Confluent

Abstract: 

Have you ever thought that you needed to be a programmer to do stream processing and build streaming data pipelines? Think again!

Apache Kafka is a distributed, scalable, and fault-tolerant streaming platform, providing low-latency pub-sub messaging coupled with native storage and stream processing capabilities. Integrating Kafka with RDBMS, NoSQL, and object stores is simple with the Kafka Connect API, which is part of Apache Kafka. KSQL is the open-source SQL streaming engine for Apache Kafka, and makes it possible to build stream processing applications at scale, written using a familiar SQL interface.

In this talk we’ll explain the architectural reasoning for Apache Kafka and the benefits of real-time integration, and we’ll build a streaming data pipeline using nothing but our bare hands, the Kafka Connect API, and KSQL.

Gasp as we filter events in real time! Be amazed at how we can enrich streams of data with data from an RDBMS! Be astonished at the power of streaming aggregates for anomaly detection!
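As a rough illustration of those three operations (filtering, enrichment against reference data, and windowed aggregation for anomaly detection), here is a minimal plain-Python sketch. It is not KSQL and not the pipeline built in the talk: the event tuples, the `USERS` lookup table, the function names, the 60-second window, and the threshold of 3 are all hypothetical, and a real streaming engine would process an unbounded stream incrementally rather than a small in-memory list.

```python
from collections import defaultdict

# Hypothetical events: (timestamp_seconds, user_id, amount)
EVENTS = [
    (0, "u1", 20.0),
    (5, "u2", 5.0),
    (12, "u1", 30.0),
    (40, "u1", 45.0),
    (65, "u2", 15.0),
    (70, "u1", 25.0),
]

# Hypothetical reference table, standing in for rows held in an RDBMS.
USERS = {"u1": "alice", "u2": "bob"}


def filter_events(events, min_amount):
    """Pass through only events whose amount is at least min_amount."""
    for ts, user_id, amount in events:
        if amount >= min_amount:
            yield ts, user_id, amount


def enrich(events, users):
    """Attach the user name from the lookup table to each event."""
    for ts, user_id, amount in events:
        yield ts, user_id, users.get(user_id, "unknown"), amount


def tumbling_counts(events, window_seconds):
    """Count events per (window start, user name) in fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for ts, _user_id, name, _amount in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, name)] += 1
    return dict(counts)


def find_anomalies(counts, threshold):
    """Flag any (window start, user name) whose event count reaches the threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Chain the stages: filter, then enrich, then aggregate per window.
pipeline = enrich(filter_events(EVENTS, min_amount=10.0), USERS)
counts = tumbling_counts(pipeline, window_seconds=60)
anomalies = find_anomalies(counts, threshold=3)
print(counts)
print(anomalies)
```

The generator stages mirror how each step consumes one stream and produces another; only the aggregation step has to keep state, here a per-window counter.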

This will be a practical talk, after which attendees will have a clear idea of the power of stream processing, and how to get started with it using the open-source Apache Kafka and KSQL projects.

Robin Moffatt, Confluent

Robin is a Developer Advocate at Confluent, as well as an Oracle ACE Director and Developer Champion. His career has always involved data, from the old worlds of COBOL and DB2, through the worlds of Oracle and Hadoop, and into the current world with Kafka. His particular interests are analytics, systems architecture, performance testing and optimization. He blogs at http://cnfl.io/rmoff and http://rmoff.net/ (and previously http://ritt.md/rmoff) and can be found tweeting grumpy geek thoughts as @rmoff. Outside of work he enjoys drinking good beer and eating fried breakfasts, although generally not at the same time.

BibTeX
@conference {221772,
author = {Robin Moffatt},
title = {Apache Kafka and {KSQL} in Action: {Let{\textquoteright}s} Build a Streaming Data Pipeline!},
year = {2018},
address = {Nashville, TN},
publisher = {USENIX Association},
month = oct
}
