Knative: The missing serving layer for Kubernetes

"Kubernetes is a great platform to deploy and run microservices." -- Everyone

"Kubernetes is a generic platform to run any workload, and 'services' deserve better networking, rollout, and monitoring capabilities from the infrastructure they run on." -- me

Kubernetes: the good parts
1. A declarative, goal-state-driven API.
2. Manages a large set of machines (i.e. a cluster).
3. APIs to run container workloads on those machines (Pod, Deployment, StatefulSet, ...).
4. Extensibility to define your own APIs (CRDs) and controllers around them to actuate resources.

Pod          smallest deployment unit (1..N containers)
ReplicaSet   a scalable set of identical stateless Pods
Deployment   a ReplicaSet, but with revisions and rolling updates
StatefulSet  Pods with stable identities and persistent storage
Job          runs a Pod to completion
CronJob      runs a Job periodically

Microservices
noun. A service, but smaller. Usually a twelve-factor app.

What Kubernetes covers:
• serves an API or web page
• stateless replicas
• load balancing
• autoscaling
• rollouts (blue/green)
• rollbacks

Where Kubernetes falls short:
• service discovery
• secure transport (TLS)
• request metrics
• graceful termination
• shielding from spikes/DoS
• concurrency limits
• ...

DIY microservice
[Diagram: a client opens a TCP socket to the microservice; HTTP requests and responses flow over that single connection.]
Kubernetes has no notion of application-layer (L7) requests (HTTP, gRPC, ...).
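To make the L7 point concrete, here is a sketch of a plain Kubernetes Service (the names are illustrative, not from the talk): everything it knows about is ports and protocols, so it balances TCP connections, not the HTTP requests inside them.

```yaml
# A plain Kubernetes Service (illustrative names). It load-balances at the
# connection level (L4): it picks a Pod per TCP connection and has no notion
# of the individual HTTP/gRPC requests flowing over that connection.
apiVersion: v1
kind: Service
metadata:
  name: my-microservice      # hypothetical name
spec:
  selector:
    app: my-microservice
  ports:
    - port: 80               # port clients connect to
      targetPort: 8080       # container port; only ports and protocols, nothing about HTTP
```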
Where Kubernetes falls short: Load balancing
• Per-connection, not per-request
• Causes uneven distribution: a single client establishing too many connections skews the load
• Naturally "sticky sessions": a client keeps getting routed to the same Pod even if it is degraded or faulty
[Diagram: a client's connections all pinned to one of several Pods.]

Where Kubernetes falls short: Autoscaling
• Based only on CPU and memory
• Delayed metrics collection: cannot easily handle spiky traffic patterns; by the time it scales up, it might be too late
[Diagram, built up over several slides: a Pod with 1.5 CPU and an autoscaling target of 1.0 CPU; client load of 0.4, 0.6, and 0.2 CPU adds up past the target, and a second Pod is eventually added.]
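The autoscaling example above corresponds roughly to a HorizontalPodAutoscaler like this sketch (names are illustrative): it scales on average CPU and nothing else, seeing no request counts or latencies.

```yaml
# Sketch of a HorizontalPodAutoscaler matching the example above: a target of
# 1.0 CPU per Pod, on Pods with a 1.5 CPU limit. CPU and memory are the only
# built-in signals, and metrics arrive with a delay, so a traffic spike can
# outrun the scale-up.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-microservice      # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-microservice
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: AverageValue
          averageValue: "1"  # scale out when average CPU per Pod exceeds 1.0
```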
Where Kubernetes falls short: Meat shielding (rapid autoscaling, concurrency controls)
• No support for highly spiky traffic patterns
• Need a proxy or gateway to "front" the requests and "buffer" them
• No "max N requests per container" setting
[Diagram: a "meat shield" of Pods absorbing a spike of requests before it reaches the backing Pods.]

Where Kubernetes falls short: Rollouts
• Can't split traffic per-request, e.g. 95% to v1 and 5% to v2
• Need to implement blue/green rollouts yourself
• The Deployment API gives some options for rolling updates, but not quite blue/green
[Diagram: traffic split 95% to Pod v1, 5% to Pod v2.]

Where Kubernetes falls short: Scale to zero
• Unused replicas keep consuming resources
• Hard to achieve high utilization, because we almost always overprovision in Kubernetes
[Diagram: idle Pods still running.]
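For contrast, per-request traffic splitting is what the Knative Service API from the talk's title provides. A sketch, with hypothetical revision and image names, of a 95/5 split; since Knative routes at L7, the split applies to requests rather than connections, and idle revisions can scale to zero.

```yaml
# Sketch of a Knative Service splitting traffic per-request between two
# revisions (revision and image names are hypothetical). Knative routes at
# the request level, so 95/5 means 95% of requests, not connections, and
# revisions receiving no traffic can scale down to zero Pods.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-microservice
spec:
  template:
    spec:
      containers:
        - image: example.com/my-microservice:v2   # hypothetical image
  traffic:
    - revisionName: my-microservice-v1            # hypothetical revision
      percent: 95
    - revisionName: my-microservice-v2
      percent: 5
```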