Diving deep into the event-driven side of Serverless

How does Serverless work?
Events: User uploads a picture, Customer data updated, Anomaly detected, API call, ...
Functions: Your unique business logic
Fully-managed services: Storage, Databases, Analytics, Machine Learning, ...

What is an “event”?
“something that happens”
Events tell us a fact
Immutable time-series

Time                   What
2019-06-21 08:07:06    CustomerCreated
2019-06-21 08:07:09    OrderCreated
2019-06-21 08:07:13    PaymentSuccessful
2019-06-21 08:07:17    CustomerUpdated
...                    ...

Should you focus on the current status, or what is happening?
Current status: Domain model, Commands, Control (“CreateAccount”, “AddProduct”)
What happens: Domain events, Event-driven, Autonomy (“AccountCreated”, “ProductAdded”)

Commands Vs Events
Command: Has an intent. Directed to a target. Personal communication. (“CreateAccount”, “AddProduct”)
Event: It’s a fact. For others to observe. Broadcast, one to many. (“AccountCreated”, “ProductAdded”)

What can a function do with events?
[Diagram labels: Service, Event, Function, Storage / Database, Event, Service]
React to facts that are coming in
Publish new facts for others to use
Events can bring data and simplify state hydration
For writes/updates, minimize reads

What can more functions do together?
A microservice
[Diagram labels: Service, Event, Function, Function, Function, Microservice, Storage / Database, Event, Service]
Replace commands with events to minimize coupling and increase autonomy
“If this, then that...”

What can more functions do together?
A microservice
[Diagram labels: Amazon API Gateway, Create Customer, Get Customer, Update Customer, Microservice, Amazon DynamoDB, Amazon SNS, Create User Request, User Created]

What can more functions do together?
A better microservice
[Diagram labels: Amazon API Gateway, Create Customer, Get Customer, Update Customer, Microservice, Amazon DynamoDB, Create User Request, User Created, AWS Lambda Invoke]
Using DynamoDB Streams
Using the Lambda HTTP interface (Invoke) as service boundary (Sync/Async + AAA)

What about integration?
[Diagram labels: Create Order, Reserve Item, Process Payment, Start Order Delivery, Payment Service, Product Inventory]

Keep data within the microservice
Databases: Amazon Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon Neptune, Amazon Quantum Ledger Database (QLDB), Amazon RDS, Amazon Timestream, Amazon Elasticsearch Service
Categories: Relational, NoSQL, Graph, Time series, Ledger
Bonus: Polyglot persistence gives database freedom!

So, again, what about integration?
[Diagram labels: Create Order, Reserve Item, Process Payment, Start Order Delivery, Payment Service, Product Inventory]

Introducing sagas
Long lived transactions (LLT)
The same idea can be applied to transactions across multiple microservices

SAGAS
Hector Garcia-Molina, Kenneth Salem
Department of Computer Science, Princeton University, Princeton, N.J. 08544

Abstract
Long lived transactions (LLTs) hold on to database resources for relatively long periods of time, significantly delaying the termination of shorter and more common transactions. To alleviate these problems we propose the notion of a saga. A LLT is a saga if it can be written as a sequence of transactions that can be interleaved with other transactions. The database management system guarantees that either all the transactions in a saga are successfully completed or compensating transactions are run to amend a partial execution. Both the concept of saga and its implementation are relatively simple, but they have the potential to improve performance significantly. We analyze the various implementation issues related to sagas, including how they can be run on an existing system that does not directly support them. We also discuss techniques for database and LLT design that make it feasible to break up LLTs into sagas.

1. INTRODUCTION
As its name indicates, a long lived transaction is a transaction whose execution, even without interference from other transactions, takes a substantial amount of time, possibly on the order of hours or days. A long lived transaction, or LLT, has a long duration compared to the majority of other transactions either because it accesses many database objects, it has lengthy computations, it pauses for inputs from the users, or a combination of these factors. Examples of LLTs are transactions to produce monthly account statements at a bank, transactions to process claims at an insurance company, and transactions to collect statistics over an entire database [Gray81a].

In most cases, LLTs present serious performance problems. Since they are transactions, the system must execute them as atomic actions, thus preserving the consistency of the database [Date81a, Ullm82a]. To make a transaction atomic, the system usually locks the objects accessed by the transaction until it commits, and this typically occurs at the end of the transaction. As a consequence, other transactions wishing to access the LLT’s objects suffer a long locking delay.

If LLTs are long because they access many database objects then other transactions are likely to suffer from an increased blocking rate as well, i.e., they are more likely to conflict with an LLT than with a shorter transaction.

Furthermore, the transaction abort rate can also be increased by LLTs. As discussed in [Gray81b], the frequency of deadlock is very sensitive to the “size” of transactions, that is, to how many objects transactions access. (In the analysis of [Gray81b] the deadlock frequency grows with the fourth power of the transaction size.) Hence, since LLTs access many objects, they may cause many deadlocks, and correspondingly, many abortions. From the point of view of system crashes, LLTs have a higher probability of encountering a failure (because of their duration), and are thus more likely to encounter yet more delays and more likely to be aborted themselves.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1987 ACM 0-89791-236-5/87/0005/0249 75¢

From Long lived transaction (LLT) to saga
Sub-transactions for partial executions: T_i, i = 1...n
Compensating transactions to revert partial executions: C_i, i = 1...n-1
[Diagram labels: T_1, T_2, T_3, T_4; C_1, C_2, C_3]

Sample saga transaction
[Diagram labels: Create Order, Reserve Item, Process Payment, Start Order Delivery, Payment Service, Product Inventory; T_1, T_2]

Sample saga transaction
[Diagram labels: Create Order, Reserve Item, Process Payment, Start Order Delivery, Payment Service, Product Inventory, Unreserve Item; T_1, T_2, C_1]

Event-driven saga transaction
[Diagram labels: Create Order, Reserve Item, Process Payment, Start Order Delivery, Payment Service, Product Inventory, New Order, Payment Confirmed, Item Reserved, Unreserve Item, Item Unreserved, Cancel Order, Error / DLQ]

Distributed Sagas

Distributed Sagas
McCaffrey, Caitie (Sporty Tights, Inc)
Kingsbury, Kyle (The SF Eagle)
Narula, Neha (That’s DOCTOR Narula to you!)
May 20, 2015

1 Introduction
The saga paper outlines a technique for long-lived transactions which provide atomicity and durability without isolation (what about consistency? Preserved outside saga scope, not within, right?). In this work, we generalize sagas to a distributed system, where processes communicate via an asynchronous network, and discover new constraints on saga sub-transactions.
We are especially interested in the problem of writing sagas which interact with third-party services, where we control the Saga Execution Coordinator (SEC) and its storage, but not the downstream Transaction Execution Coordinators (TECs) themselves. Communication between the SEC and TEC(s) takes place over an asynchronous network (e.g. TCP) which is allowed to drop, delay, or reorder messages, but not to duplicate them. We assume a high-availability SEC service running on multiple nodes for fault-tolerance, where multiple SECs may run concurrently. They coordinate their actions through a linearizable data store, which ensures saga transactions proceed sequentially.

Choreography: Event-driven
Orchestration: Commands, Saga Execution Coordinator

Distributed Sagas – Saga Execution Coordinator

2 The Saga Execution Coordinator
[Flowchart: Saga Execution Coordinator]
Start: clean -> Log saga start -> Saga start; incomplete saga -> Log saga abort -> Saga abort; aborted saga -> Saga abort
Saga start: Let i = 0; Log T_i start; Request T_i; Await T_i; ok -> Log T_i done, i++; error -> abort; i = n? more -> next T_i; done -> Log saga done -> Saga done
Saga abort: Let i = last logged value of i; Log C_i start; Request C_i; Await C_i; error -> request C_i again; ok -> Log C_i done, i--; i = 0? more -> next C_i; done -> Saga done
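
The Saga Execution Coordinator loop above can be sketched in a few lines of Python. This is a minimal in-memory sketch under stated assumptions, not the implementation from the Distributed Sagas paper: the names Step and run_saga, the list used as a log, and the order steps are hypothetical, and it assumes a failed T_i leaves nothing to undo, so only completed steps are compensated, in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One saga step: a sub-transaction T_i and its optional compensation C_i."""
    name: str
    action: Callable[[], None]
    compensation: Optional[Callable[[], None]] = None


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run T_1..T_n in order; if one fails, run compensations in reverse."""
    log.append("saga start")
    done: List[Step] = []
    for step in steps:
        log.append(f"T start {step.name}")
        try:
            step.action()
        except Exception:
            log.append(f"T failed {step.name}")
            log.append("saga abort")
            # Assumption: the failed step had no effect, so only the
            # steps that completed earlier need to be compensated.
            for prev in reversed(done):
                if prev.compensation is not None:
                    log.append(f"C start {prev.name}")
                    prev.compensation()
                    log.append(f"C done {prev.name}")
            return False
        log.append(f"T done {step.name}")
        done.append(step)
    log.append("saga done")
    return True


def fail_payment() -> None:
    """Hypothetical payment step that always fails, to trigger an abort."""
    raise RuntimeError("payment declined")


# Hypothetical order saga: the in-memory list stands in for service state.
state: List[str] = []
steps = [
    Step("CreateOrder", lambda: state.append("order"), lambda: state.remove("order")),
    Step("ReserveItem", lambda: state.append("item"), lambda: state.remove("item")),
    Step("ProcessPayment", fail_payment),
]
log: List[str] = []
ok = run_saga(steps, log)
```

In this run the payment step fails, so the sketch unreserves the item and cancels the order, leaving state empty. A real coordinator would write the log to durable storage and resend requests after a crash, which is why the paper discusses constraints on sub-transactions; the sketch leaves all of that out.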