Designing Modern Reliable Architectures

The design of reliable architectures has evolved considerably in the past few years, and assumptions that held for consumer services are no longer sufficient to deliver quality performance for enterprise customers. In the world of globally distributed systems, the most reliable services are those that offer improved observability, allowing for performance troubleshooting, quick outage investigation and fast mitigation. Revenue-critical applications must find the right balance between processing resource cost, performance latency and service availability, often exploring multi-cloud solutions while migrating from a traditional on-prem architecture to a distributed microservice model in the cloud.

In this track, you will see a diverse collection of reliability strategies covering finance applications, gaming platforms and other real-world segments of the tech industry. Join us to hear reliability practitioners from big tech companies and startups explain how they have unlocked improved architectures for their products to meet their customers' availability, consistency and performance needs.

From this track

Session Kafka

How to Build a Reliable Kafka Data Processing Pipeline, Focusing on Contention, Uptime and Latency

Wednesday Jun 14 / 10:35AM EDT

Shifting workloads from synchronous to asynchronous can simplify the operational cost of high-throughput HTTP services. But understanding the evolution of performance metrics in the world of complex, high-concurrency, asynchronous distributed systems can be quite challenging.

Lily Mara

Engineering Manager @OneSignal


Unconference: Designing Modern Reliable Architectures

Wednesday Jun 14 / 11:50AM EDT

What is an unconference? An unconference is a participant-driven meeting. Attendees come together, bringing their challenges and relying on the experience and know-how of their peers for solutions.

Session Architecture

Building an Architecture to Predict Customer Behavior in a Revenue-Critical System

Wednesday Jun 14 / 01:40PM EDT

At Neon, a digital bank in Brazil, we strive to make revenue-impacting predictions based on customer behavior. Building a low-latency, high-availability distributed system that meets this requirement becomes especially challenging.

Yves Junqueira

Distinguished Software Engineer @Neon

Session Architecture

Reliable Architectures Through Observability

Wednesday Jun 14 / 02:55PM EDT

We want our systems to be reliable, but testing alone isn't enough. In a complex, multi-service system, it's impossible to test your way to correctness. That's why we need observability. Observability is the ability to see what our code is doing, in production and in development.

Kent Quirk

Staff Engineer

Session Developer Environment

Architecting a Production Development Environment for Reliability

Wednesday Jun 14 / 04:10PM EDT

At Meta, developers use a combination of development servers, including virtual machines and physical hosts, as well as on-demand containers to perform their daily software engineering work.

Henrique Andrade

Production Engineer @Meta

Session Cloud Architecture

Survival Strategies for the Noisy Neighbor Apocalypse

Wednesday Jun 14 / 05:25PM EDT

Noisy neighbor issues are a common challenge for multi-tenant platforms, leading to resource contention, performance degradation, and costly downtime for other tenants sharing the same resources.

Meenakshi Jindal

Staff Software Engineer @Netflix


Wednesday Jun 14 / 10:30AM EDT


Track Host

Silvia Esparrachiari

Software Engineer @Google

Silvia Esparrachiari has been a software engineer at Google for 12 years, having worked at User Data Privacy, Spam and Abuse Prevention, and most recently in Cloud and Infrastructure/SRE. Her current focus at Google is to build a communication platform to report planned and unplanned service health disruptions to Cloud customers.
