Seeking Observability: Getting Started with Service Mesh on GCP

“Observability”

What is Observability?
Observ/ab/ility

Wikipedia says...
In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.
https://en.wikipedia.org/wiki/Observability

In Software Engineering...
Observability: collecting diagnostics data all across the stack to identify and debug production problems, and also to provide critical signals about usage to our highly adaptive and scalable environment.

What do we need?
We need metrics that matter!
We need logs that matter!
We need to trace what happened!

“Microservices”

Microservices (generally speaking)
Several services, possibly thousands, might be
- written in different languages / frameworks / libraries
- using many protocols
- making distributed system calls

Microservices Observability
Think what happens
- when starting a new service in a new language
- when communicating with a new protocol
- when making a breaking change to network and infrastructure

Microservices Observability
We want to implement something that
- is decoupled from languages, frameworks and libraries
- supports many protocols or other procedures
- decouples applications from the underlying infrastructure

“Service Mesh”

Service Mesh
- is a transparent network between services
  - Decoupled from application
  - Language independent
- provides automated application network functions
  - Observability
  - Service Discovery
  - Policy Enforcement
  - etc...
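The "transparent, decoupled from application" idea above can be sketched in a few lines. This is only a toy model under stated assumptions: `ToySidecar`, `service_b`, and the metric names are hypothetical, and the wrapper runs in-process, whereas a real mesh places a separate proxy next to each service. The point is that the service function contains no observability code at all.

```python
import time
from collections import defaultdict


class ToySidecar:
    """Hypothetical in-process stand-in for a mesh proxy (illustration only)."""

    def __init__(self):
        self.request_count = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def wrap(self, service_name, handler):
        """Return a callable that records metrics around every call to handler."""
        def proxied(request):
            start = time.perf_counter()
            try:
                return handler(request)
            finally:
                # Metrics are recorded here, outside the service's own code.
                self.request_count[service_name] += 1
                self.total_seconds[service_name] += time.perf_counter() - start
        return proxied


def service_b(request):
    # Pure business logic: no metrics, logging, or tracing code.
    return {"status": 200, "body": "hello " + request["name"]}


sidecar = ToySidecar()
call_b = sidecar.wrap("service-b", service_b)
response = call_b({"name": "service-a"})
```

Because the wrapper only sees a request going in and a response coming out, the same recording logic would apply no matter which language or framework the wrapped service used.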
Here’s what’s happening
Let’s say we have two services written in different languages
[Diagram: Service A (Java w/ Spring Boot), Service B (Python w/ Flask)]

Here’s what’s happening
Without a service mesh, one calls the other directly
[Diagram: Service A (Java w/ Spring Boot), Service B (Python w/ Flask)]

Here’s what’s happening
For observability, each service must implement things itself
[Diagram: Metrics / Logs Service; Service A (Java w/ Spring Boot) with Metrics / Tracing Code; Service B (Python w/ Flask) with Metrics / Tracing Code]

Here’s what’s happening
What if another service is deployed...? And with a new runtime or a new protocol...?
[Diagram: Metrics / Logs Service; Service A (Java w/ Spring Boot), Service B (Python w/ Flask), and Service C (Go w/o Framework), each with its own Metrics / Tracing Code]
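To make the burden concrete, here is a minimal sketch of the kind of metrics and tracing code a single service ends up carrying without a mesh. Everything in it is an assumption for illustration: the `x-trace-id` header name, the `METRICS` and `SPANS` stores, and `handle_request` are hypothetical, and a real service would ship this data to the metrics / logs service rather than keep it in memory.

```python
import time
import uuid

# Hypothetical in-memory stores standing in for a metrics / logs service.
METRICS = {"requests_total": 0, "latency_seconds_sum": 0.0}
SPANS = []


def handle_request(headers):
    """Handle one request, with hand-written metrics and tracing mixed in."""
    # Tracing: reuse the caller's trace id, or start a new trace.
    trace_id = headers.get("x-trace-id") or uuid.uuid4().hex
    start = time.perf_counter()
    try:
        body = "hello from service B"  # the only line of business logic
        return {
            "status": 200,
            "body": body,
            "headers": {"x-trace-id": trace_id},  # propagate the trace id
        }
    finally:
        elapsed = time.perf_counter() - start
        # Metrics: count the request and accumulate its latency.
        METRICS["requests_total"] += 1
        METRICS["latency_seconds_sum"] += elapsed
        # Tracing: record a span for this request.
        SPANS.append({"trace_id": trace_id, "name": "handle_request", "seconds": elapsed})


response = handle_request({"x-trace-id": "abc123"})
```

Most of the function is instrumentation rather than business logic, and an equivalent of it would have to be written and maintained again for the Java and Go services.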