Track: Stream Processing at Large
The software industry has learned that the world’s data can be represented as unbounded queues of changes. It can be sliced into sliding windows. It can be aggregated, rolled up, and analyzed. We can choose a number of ways to do this work such as using Kafka Streams or Spark Streaming. We can opt for Apache Beam, Storm, Samza, Flume, or Flink. We have a large pool of options on which we can build powerful systems, but there is accidental complexity lurking in any of the choices:
- What if I need to rebuild all the data?
- How do I know when my system is not healthy?
- How do I reason about time in this system?
- What if things arrive out of order?
- How do I know things have arrived?
This track walks through uses of streaming technologies at large, the problems encountered, and how teams are coping with the state of this new world. As we approach maturity in streaming systems the companies using these systems are growing ecosystems and best practices around building and operating them. They are discovering new ways to reason about monitoring, testing, performance, and failure. This track is an opportunity to learn from their experiences.
Michelle Brush is a math geek turned computer geek with 15 years of software development experience. She has developed algorithms and data structures for pathfinding, search, compression, and data mining in embedded as well as distributed systems. In her current role as an Engineering Director for Cerner Corporation, she is responsible for the data ingestion and processing platform for Cerner’s Population Health solutions. She also leads several engineering education programs and culture initiatives including Cerner’s software architect development program and internal developer conference. Outside of Cerner, she is the chapter leader for the Kansas City chapter of Girl Develop It and one of the conference organizers for Midwest.io.
by Anton Gorshkov
Managing Director @GoldmanSachs
How good is your streaming framework at failure? Does it die gracefully telling you exactly at which point it died? Does it tell you why it died? Does it pick-up where it left off afterwards? Can it easily skip the "erroneous" portions of the stream? Do you always know what was processed and what wasn't? Does it even have to die when process, host, data-center fail?
In this talk we focus on "What Ifs" scenarios and how to evaluate and architect a streaming platform that has high level...
by Sean Cribbs
Software Engineer @Comcast
In the midst of building a multi-datacenter, multi-tenant instrumentation and visibility system, we arrived at stream processing as an alternative to storing, forwarding, and post-processing metrics as traditional systems do. However, the streaming paradigm is alien to many engineers and sysadmins who are used to working with "wall-of-graphs" dashboards, predefined aggregates, and point-and-click alert configuration.
Taking inspiration from REPLs, literate programming, and DevOps...
by Gwen Shapira
System Architect @Confluent, PMC Member @Kafka, & Committer Apache Sqoop
In a world of microservices that communicate via unbounded streams of events, schemas are the contracts between the services. Having an agreed contract allows the teams developing those services to move fast, by reducing the risk involved in making changes. Yet delivering events with schema change in mind isn’t the common practice yet.
In this presentation, we’ll discuss patterns of schema design, schema storage and schema evolution that help development teams build better contracts...
by Shriya Arora
Senior Data Engineer @Netflix
Streaming applications have historically been complex to design and implement because of the significant infrastructure investment. However, recent active developments in various streaming platforms provide an easy transition to stream processing, and enable analytics applications/experiments to consume near real-time data without massive development cycles.
This talk will cover the experiences Netflix’s Personalization Data team had in stream processing unbounded datasets. The...
by Michael Hansen
Principal Data Engineer @hbcdigital
“Perfect is the enemy of good” - Voltaire
On the journey through life, we learn and adapt via trial and error - software development is no different. We realize and accept that we won’t build the perfect solution the first time around, it takes many iterations. At Gilt.com, now part of HBC Digital, we started processing and streaming event data nearly 5 years ago. Our initial solution was dramatically different from our current solution - and will likely be different from our...
Tracks
Monday, 26 June
-
Microservices: Patterns & Practices
Practical experiences and lessons with Microservices.
-
Java - Propelling the Ecosystem Forward
Lessons from Java 8, prepping for Java 9, and looking ahead at Java 10. Innovators in Java.
-
High Velocity Dev Teams
Working Smarter as a team. Improving value delivery of engineers. Lean and Agile principles.
-
Modern Browser-Based Apps
Reactive, cross platform, progressive - webapp tech today.
-
Innovations in Fintech
Technology, tools and techniques supporting modern financial services.
Tuesday, 27 June
-
Architectures You've Always Wondered About
Case studies from the most relevant names in software.
-
Developer Experience: Level up Your Engineering Effectiveness
Trends, tools and projects that we're using to maximally empower your developers.
-
Chaos & Resilience
Failures, edge cases and how we're embracing them.
-
Stream Processing at Large
Rapidly moving data at scale.
-
Building Security Infrastructure
How our industry is being attacked and what you can do about it.
Wednesday, 28 June
-
Next Gen APIs: Designs, Protocols, and Evolution
Practical deep-dives into public and internal API design, tooling and techniques for evolving them, and binary and graph-based protocols.
-
Immutable Infrastructures: Orchestration, Serverless, and More
What's next in infrastructure. How cloud function like lambda are making their way into production.
-
Machine Learning 2.0
Machine Learning 2.0, Deep Learning & Deep Learning Datasets.
-
Modern CS in the Real World
Applied, practical, & real-world dive into industry adoption of modern CS.
-
Optimizing Yourself
Maximizing your impact as an engineer, as a leader, and as a person.
-
Ask Me Anything (AMA)