UMA – Uniform Memory Access – access to any RAM takes the same amount of time. NUMA – Non-Uniform Memory Access – some parts of memory may take longer to access than others (local vs. remote). Multiple processors share the computer bus, clock, memory, and peripherals.

Race condition – a situation where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the order in which the accesses take place. We need a way to synchronize the processes so that only one process at a time can manipulate a critical data item (a short code sketch of a race condition appears below, after these notes).

Requirements for a solution to the critical-section problem:
1. Mutual Exclusion – if process Pi is executing in its critical section, then no other process can be executing in its critical section.
2. Progress – if no process is executing in its critical section and some processes wish to enter their critical sections, only those processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely.
3. Bounded Waiting – a bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
Assume that each process executes at a nonzero speed; no assumption is made concerning the relative speed of the n processes.

Areas prone to race conditions in the kernel:
1. Open-files data structure – the list must be modified when a file is opened or closed; if two processes were to open files simultaneously, the separate updates to this structure could cause a race condition.
2. Memory allocation structures. 3. Process list structures. 4. Interrupt handling.

Non-preemptive kernel – a process runs until it exits kernel mode, blocks, or voluntarily yields the CPU; essentially free of race conditions in kernel mode. Preemptive kernel – allows preemption of a process while it is running in kernel mode; hard to design, especially on SMP (Symmetric Multiprocessor) architectures, where two kernel-mode processes can be running simultaneously on different processors. Why choose preemptive? More responsive, since the risk of long-running processes holding up the works is eliminated; more suitable for real-time programming, as real-time processes can preempt as needed.

Priority Inversion – a scheduling problem in which a lower-priority process holds a lock needed by a higher-priority process. Consider processes L, M, and H whose priorities are L < M < H. Assume process H requires resource R, which is currently being accessed by process L; H would have to wait for L to finish with R. BUT M becomes runnable and preempts L. Now M, a process with a lower priority than H, has affected how long H must wait for L to relinquish resource R. Priority inversion is solved via the priority-inheritance protocol: all processes that are accessing resources needed by a higher-priority process inherit the higher priority until they have finished with the resources in question; when they are finished, their priorities revert to their original values.

Process Synchronization: message passing may be either blocking or non-blocking. Blocking is considered synchronous. Blocking send – the sender is blocked until the message is received by the receiving process or mailbox. Blocking receive – the receiver is blocked until a message is available.
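Returning to the race condition defined at the top of these notes, here is a minimal sketch (the counter, loop count, and thread bodies are illustrative assumptions): two threads increment a shared variable with no synchronization, so updates interleave and the final value is usually less than expected.

```c
/* Minimal race-condition sketch: two threads update a shared counter
 * with no synchronization, so read-modify-write updates can be lost. */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;            /* shared data item */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        counter++;                  /* read-modify-write: not atomic */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Expected 2000000, but the race typically yields a smaller value. */
    printf("counter = %ld\n", counter);
    return 0;
}
```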
Non-blocking is considered asynchronous. Non-blocking send – the sender sends the message and resumes operation. Non-blocking receive – the receiver receives either a valid message or a null message. Different combinations are possible; if both send and receive are blocking, we have a rendezvous. The producer-consumer problem becomes trivial: the producer invokes a non-blocking send(), and the consumer invokes a blocking receive() and waits until a message is available.

Buffering – messages exchanged between communicating processes reside in a temporary queue, implemented in one of two ways. No buffering: zero capacity – no messages are queued on a link; the sender must wait for the receiver (rendezvous). Automatic buffering: bounded capacity – finite length of n messages; the sender must wait if the link is full. Unbounded capacity – infinite length; the sender never waits.

Interprocess Communication – processes within a system may be independent or cooperating. A cooperating process can affect or be affected by other processes, including by sharing data. Reasons for cooperating processes: information sharing, computation speedup (requires multiple processing cores), modularity, convenience. Cooperating processes need interprocess communication (IPC). Two models of IPC: shared memory and message passing.

Shared memory – an area of memory shared among the processes that wish to communicate. The communication is under the control of the user processes, not the operating system. The goal is to provide a mechanism that allows the user processes to synchronize their actions when they access shared memory.

Message passing – a mechanism for processes to communicate and to synchronize their actions; particularly useful in distributed systems where processes may reside on different computers connected by a network. In a message system, processes communicate with each other without resorting to shared variables. The IPC facility provides two operations: send and receive message. The message size is either fixed or variable: variable-size messages are harder for the OS programmer to implement but make the life of the application programmer easier, while fixed-size messages are easier for the OS programmer to implement but make the life of the application programmer harder. If processes P and Q wish to communicate, they need to establish a communication link between them and exchange messages via send/receive. Implementation issues: How are links established? Can a link be associated with more than two processes? The communication link may be implemented physically (shared memory, hardware bus, network) or logically (direct or indirect, synchronous or asynchronous, automatic or explicit buffering).

Direct communication – processes must name each other explicitly: send(P, message) – send a message to process P; receive(Q, message) – receive a message from process Q. Properties of the communication link: links are established automatically; a link is associated with exactly one pair of communicating processes; between each pair there exists exactly one link; the link may be unidirectional but is usually bidirectional. Symmetric addressing: both sender and receiver name their partner – send(P, message), receive(Q, message). Asymmetric addressing: only the sender names its partner – send(P, message); receive(id, message) – receive a message from any process, with id set to the id of the sending process. Significant disadvantage: if process names or PID values change, all referring processes must be modified.
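As one concrete flavor of message passing with a blocking receive, here is a minimal sketch using a POSIX pipe between a parent (sender) and child (receiver); the pipe is just an illustrative choice, not the only mechanism these notes describe.

```c
/* Message-passing sketch: parent sends, child performs a blocking receive.
 * read() on the pipe blocks until a message is available. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                 /* child: the receiver */
        close(fd[1]);
        char buf[64];
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);  /* blocks until data arrives */
        if (n > 0) { buf[n] = '\0'; printf("child received: %s\n", buf); }
        close(fd[0]);
    } else {                        /* parent: the sender */
        close(fd[0]);
        const char *msg = "hello from parent";
        write(fd[1], msg, strlen(msg));                 /* send the message */
        close(fd[1]);
        wait(NULL);
    }
    return 0;
}
```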
Indirect communication – messages are directed to and received from mailboxes (also referred to as ports). Each mailbox has a unique id; processes can communicate only if they share a mailbox. Properties of the communication link: 1) a link is established between two processes only if they share a common mailbox; 2) a link may be associated with many processes; 3) each pair of processes may share several communication links, with each link corresponding to one mailbox; 4) a link may be unidirectional or bidirectional. Primitives are defined as: send(A, message) – send a message to mailbox A; receive(A, message) – receive a message from mailbox A.

Mailbox sharing – P1, P2, and P3 share mailbox A. P1 sends; P2 and P3 receive. Who gets the message? (Assumption: the message is consumed when received.) Solutions: allow a link to be associated with at most two processes; allow only one process at a time to execute a receive operation; or allow the system to select the receiver arbitrarily (or use an algorithm, e.g., round-robin), with the sender notified of who the receiver was.

Synchronization hardware – many systems provide hardware support for implementing critical-section code. All of the solutions below are based on the idea of locking, i.e., protecting critical regions via locks. On uniprocessors we could simply disable interrupts: the currently running code would execute without preemption. This is the approach generally taken by non-preemptive kernels, but it is generally too inefficient on multiprocessor systems. Why? The message must be passed to all processors, which delays entry into each critical section and decreases system efficiency; it may also impact clock performance if the clock is updated using interrupts. Modern machines instead provide special atomic hardware instructions (atomic = non-interruptible) that 1) test and modify the contents of a memory word or 2) swap the contents of two memory words. These instructions can be used to solve the critical-section problem.

Mutex locks – OS designers build software tools to solve the critical-section problem. The simplest is the mutex lock (mutual exclusion): 1) a process protects a critical section by first calling acquire() to obtain the lock and then calling release() when it leaves; 2) a Boolean variable available indicates whether the lock is available; 3) usually implemented via hardware atomic instructions. DISADVANTAGES: this solution requires busy waiting. While a process is in its critical section, any other process that tries to enter its critical section must loop continuously in the call to acquire(); such a lock is therefore called a spinlock, because the process spins while waiting for the lock to become available. In a single-CPU multiprogramming environment, busy waiting wastes CPU cycles that could be used by other processes. ADVANTAGES: spinlocks do not require a context switch and are efficient when locks are expected to be held for short times. Multiprocessor systems are good candidates for spinlocks, since one thread can spin on one processor while another thread runs its critical section on a different processor.
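A minimal spinlock sketch built on C11's atomic_flag, whose test-and-set is the kind of atomic "test and modify a memory word" instruction described above; the acquire()/release() names mirror the notes but this is an illustrative implementation, not the textbook's exact API.

```c
/* Spinlock sketch using an atomic test-and-set instruction (C11 atomic_flag). */
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear = lock available */

static void acquire(void) {
    /* Busy wait: spin until test-and-set observes the flag previously clear. */
    while (atomic_flag_test_and_set(&lock))
        ;  /* spinning wastes CPU cycles but avoids a context switch */
}

static void release(void) {
    atomic_flag_clear(&lock);                 /* make the lock available again */
}

/* Usage sketch:
 *   acquire();
 *   ... critical section ...
 *   release();
 */
```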
SEMAPHORES – a synchronization tool that provides more sophisticated ways (than mutex locks) for processes to synchronize their activities. A semaphore S is an integer variable that can only be accessed via two indivisible (atomic) operations, wait() and signal().

Counting semaphore – the integer value can range over an unrestricted domain. Used to control access to a resource consisting of a finite number of instances: the semaphore is initialized to the number of available resources; each process that wishes to use a resource performs a wait() operation, which decrements the semaphore; when a process releases a resource, it performs a signal() operation, which increments the semaphore. When the semaphore reaches 0, all resources are in use, and any process that then wishes to use a resource will block until the count becomes > 0.

Binary semaphore – the integer value can range only between 0 and 1; behaves the same as a mutex lock (and suffers from busy waiting too).

Semaphores can solve various synchronization problems. Consider P1 and P2 that require statement S1 to happen before statement S2. Create a semaphore synch initialized to 0; P1 executes: S1; signal(synch); and P2 executes: wait(synch); S2;. Because synch starts at 0, P2 will execute S2 only after P1 has invoked signal(synch), which is after statement S1 has been executed.

Semaphore implementation – we can modify the definition of wait() and signal() as follows. When a process executes a wait() operation and finds that the semaphore value is not positive, it must wait; rather than engaging in busy waiting, the process can block itself. The block operation puts the process into a waiting queue associated with the semaphore and switches the state of the process to 'waiting'; control is transferred to the CPU scheduler, which selects another process to execute. A blocked process should be restarted by a wakeup() operation when some other process executes signal(); this changes the process state to 'ready', and the process is scheduled according to the CPU scheduler's algorithm.

Semaphore implementation with no busy waiting: 1. each semaphore has an associated waiting queue; 2. each entry in a waiting queue has two data items: a value (of type integer) and a pointer to the next record in the list; 3. two operations: block – place the process invoking the operation on the appropriate waiting queue; wakeup – remove one of the processes from the waiting queue and place it in the ready queue. Semaphore values can be negative; the magnitude then equals the number of processes waiting. We must guarantee that no two processes can execute wait() and signal() on the same semaphore at the same time, so the implementation itself becomes a critical-section problem in which the wait and signal code are placed in a critical section. This could reintroduce busy waiting, but the wait and signal code is short, so there is little busy waiting if the critical section is rarely occupied. Busy waiting is more of a concern for applications that may spend a long time in their critical sections, and for them this is not a good solution.

DEADLOCK AND STARVATION. Deadlock – two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes. Let S and Q be two semaphores initialized to 1: if one process executes wait(S) then wait(Q) while the other executes wait(Q) then wait(S), each can end up holding one semaphore while waiting forever for the other. Starvation – indefinite blocking: a process may never be removed from the semaphore queue in which it is suspended.
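A minimal sketch of the two-semaphore deadlock just described, using POSIX unnamed semaphores and two threads that acquire S and Q in opposite orders; the thread bodies and the sleep() calls are illustrative assumptions added to make the bad interleaving likely.

```c
/* Deadlock sketch: two threads acquire semaphores S and Q in opposite orders,
 * so each may hold one while waiting forever for the other. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static sem_t S, Q;                      /* both initialized to 1 */

static void *p0(void *arg) {
    (void)arg;
    sem_wait(&S);                       /* P0 holds S */
    sleep(1);                           /* encourage the bad interleaving */
    sem_wait(&Q);                       /* ...and now waits for Q, held by P1 */
    printf("P0 in critical section\n");
    sem_post(&Q); sem_post(&S);
    return NULL;
}

static void *p1(void *arg) {
    (void)arg;
    sem_wait(&Q);                       /* P1 holds Q */
    sleep(1);
    sem_wait(&S);                       /* ...and now waits for S, held by P0 */
    printf("P1 in critical section\n");
    sem_post(&S); sem_post(&Q);
    return NULL;
}

int main(void) {
    sem_init(&S, 0, 1);
    sem_init(&Q, 0, 1);
    pthread_t a, b;
    pthread_create(&a, NULL, p0, NULL);
    pthread_create(&b, NULL, p1, NULL);
    pthread_join(a, NULL);              /* with the sleeps, this typically never returns: deadlock */
    pthread_join(b, NULL);
    return 0;
}
```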
FOUR NECESSARY CONDITIONS FOR DEADLOCK to occur: Mutual-exclusion condition – a resource may be acquired by one and only one process at a time. Hold-and-wait (wait-for) condition – a process that has acquired an exclusive resource may hold that resource while it waits to obtain other resources. No-preemption condition – once a process has obtained a resource, the system cannot remove it from the process's control until the process is finished using it. Circular-wait condition – two or more processes are locked in a 'circular chain' in which each process is waiting for one or more resources held by the next process in the chain. FACTS (resource-allocation graph): if the graph contains no cycles, there is no deadlock; if the graph contains a cycle, then with only one instance per resource type there is a deadlock, and with several instances per resource type there is the possibility of deadlock.

Dining-Philosophers Problem – deadlock-handling solutions: allow at most four philosophers to sit simultaneously at the table with five chopsticks available; allow a philosopher to pick up chopsticks only if both are available (the picking must be done in a critical section); or use an asymmetric solution – an odd-numbered philosopher picks up first the left chopstick and then the right, while an even-numbered philosopher picks up first the right chopstick and then the left.

TRANSACTIONAL MEMORY – a memory transaction is a sequence of read-write operations to memory that are performed atomically. Advantages: the system provides atomicity, not the developer; no locks are involved, so no deadlock; the transactional memory system can detect opportunities for concurrency within an atomic block. Implementation: Software Transactional Memory (STM) – code inserted by the compiler to guarantee atomicity and detect concurrency opportunities; Hardware Transactional Memory (HTM) – uses cache hierarchies and cache-coherency protocols to manage shared-data conflicts, with less overhead than STM.

THREADS – multiple simultaneous tasks within an application can be implemented by separate threads: update the display, fetch data, spell checking, answer a network request. Threads can simplify code and increase efficiency; kernels are generally multithreaded. A thread is the basic unit of CPU utilization; it comprises a thread id, program counter, register set, and stack. Threads of a process share the code section, data section, and other OS resources (e.g., open files and signals). Most modern applications are multithreaded; threads run within the application.

Multithreaded server architecture – consider the implications of a single-threaded web server handling 9,000 requests. We could create a new process for each received request, but process creation is time-consuming and resource intensive; in Solaris, process creation is ~30 times slower than thread creation, and context switching is ~5 times slower.

Asynchronous threading – once the parent creates a child thread, the parent resumes its execution; parent and child execute concurrently, each thread runs independently of every other thread, and the parent is agnostic to child terminations. Synchronous threading (fork-join strategy) – once the parent creates child threads, it must wait for all of them to terminate; the child threads work concurrently, but the parent must wait. This typically involves significant data sharing among threads.
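A minimal sketch of the fork-join (synchronous threading) strategy with Pthreads: the parent creates the children and then blocks in pthread_join until each child terminates. The worker body and thread count are illustrative assumptions.

```c
/* Fork-join sketch: the parent creates child threads, then waits (joins) for all of them. */
#include <pthread.h>
#include <stdio.h>

#define NUM_CHILDREN 4

static void *child_work(void *arg) {
    long id = (long)arg;
    printf("child %ld working\n", id);   /* children run concurrently */
    return NULL;
}

int main(void) {
    pthread_t kids[NUM_CHILDREN];

    /* fork phase: create the children */
    for (long i = 0; i < NUM_CHILDREN; i++)
        pthread_create(&kids[i], NULL, child_work, (void *)i);

    /* join phase: the parent must wait for every child to terminate */
    for (int i = 0; i < NUM_CHILDREN; i++)
        pthread_join(kids[i], NULL);

    printf("all children finished; parent continues\n");
    return 0;
}
```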
BENEFITS OF THREADS: 1. Responsiveness – may allow continued execution if part of a process is blocked; especially important for user interfaces. 2. Economy – cheaper than process creation, and thread switching has lower overhead than a full context switch. 3. Scalability – a process can take advantage of multiprocessor architectures.

MULTICORE PROGRAMMING. Concurrency supports more than one task making progress via rapid switching, with only one executing at any given time; in a single-processor/core environment, the scheduler provides concurrency, and concurrency simulates parallelism. Parallelism implies a system can perform more than one task simultaneously (requires a multicore or multiprocessor system). Multithreaded programming takes advantage of multicore or multiprocessor systems to achieve parallelism. Multicore and multiprocessor systems put pressure on both OS and application programmers. Challenges include: 1. dividing activities; 2. balance – ensuring that separate execution is worth the cost; 3. data splitting; 4. data dependency; 5. testing and debugging – concurrent programs are more challenging to debug than single-threaded applications. Logical cores enable a single physical core to do two or more things simultaneously; this grew out of the early Pentium 4 CPU's ability to do what was termed Hyper-Threading.

TYPES OF PARALLELISM: Data parallelism – distributes subsets of the same data across multiple cores, with the same operation on each. Example: add up the elements in an array A using two threads on two cores – subsets of the same data, same operation (see the sketch below). Task parallelism – distributes threads across cores, each thread performing a distinct operation. Example: perform two distinct statistical operations on array A using two threads on two cores – two distinct operations (the data may or may not be the same).

USER THREADS AND KERNEL THREADS. User threads – management is done by a user-level thread library. Three primary thread libraries: POSIX Pthreads (either user level or kernel level), Windows threads (kernel level), and Java threads (implemented using the host system's API). Kernel threads – supported by the kernel. Examples: virtually all general-purpose operating systems, including Windows, Solaris, Linux, Tru64 UNIX, and Mac OS X.

MULTITHREADING MODELS – a relationship must exist between user threads and kernel threads. Many-to-one: many user-level threads mapped to a single kernel thread; thread management is done efficiently in user space by the thread library; one thread blocking causes all to block; multiple threads cannot run in parallel on a multicore system because only one user thread may access the kernel at a time; few systems currently use this model (example: Solaris green threads). One-to-one: each user-level thread maps to a kernel thread; creating a user-level thread creates a kernel thread (a cost issue); more concurrency than many-to-one – another thread can run when a thread makes a blocking system call; more parallelism in a multiprocessor environment; the number of threads per process is sometimes restricted due to overhead. Examples: Windows, Linux, Solaris 9 and later.
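A minimal data-parallelism sketch, as referenced above: two Pthreads each sum half of array A, and the partial sums are combined afterward. On a one-to-one model such as Linux, each of these threads is backed by a kernel thread, so the two halves can run in parallel on separate cores. The array contents and sizes are illustrative assumptions.

```c
/* Data parallelism sketch: each thread applies the same operation (summing)
 * to its own subset of the array. */
#include <pthread.h>
#include <stdio.h>

#define N 1000

static int A[N];

struct range { int lo, hi; long sum; };

static void *sum_range(void *arg) {
    struct range *r = arg;
    r->sum = 0;
    for (int i = r->lo; i < r->hi; i++)
        r->sum += A[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) A[i] = i + 1;          /* fill with 1..N */

    struct range halves[2] = { {0, N / 2, 0}, {N / 2, N, 0} };
    pthread_t t[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, sum_range, &halves[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("total = %ld\n", halves[0].sum + halves[1].sum);  /* 500500 for N = 1000 */
    return 0;
}
```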
Many-to-many: allows many user-level threads to be mapped (multiplexed) onto many kernel threads; allows the operating system to create the appropriate number of kernel threads based on application or hardware requirements. Examples: Solaris prior to version 9, Windows with the ThreadFiber package.

User-to-kernel thread mapping – impact on concurrency: many-to-one allows 'unlimited' user threads but no true concurrency, since only one kernel thread executes at a time; one-to-one allows greater concurrency but sets limits on the number of threads; many-to-many allows 'unlimited' user threads, and the kernel threads can run in parallel on a multiprocessor. Two-level model: similar to many-to-many, except that it also allows a user thread to be bound to a kernel thread. Examples: IRIX, HP-UX, Tru64 UNIX, Solaris 8 and earlier.

Thread library – provides the programmer with an API for creating and managing threads. Two primary ways of implementing a library: entirely in user space (less overhead to create a thread) or as a kernel-level library supported by the OS (requires a system call to the kernel each time a thread is created).

IMPLICIT THREADING – growing in popularity: as the number of threads increases, program correctness becomes more difficult with explicit threads, so the creation and management of threads is done by compilers and run-time libraries rather than by programmers. Three methods for designing multithreaded programs that take advantage of multicore processing through implicit threading: thread pools, OpenMP, and Grand Central Dispatch. Other methods include Intel Threading Building Blocks (TBB) and the java.util.concurrent package.

Thread pools – issues with the create-a-thread-per-request approach: thread creation still takes time (albeit less than process creation); threads are discarded after they complete their tasks; 'unlimited' threads could exhaust CPU or memory. Instead, create a number of threads at process startup in a pool, where they await work. Advantages: it is usually slightly faster to service a request with an existing thread than to create a new one; the number of threads in the application(s) is bound by the size of the pool; and separating the task to be performed from the mechanics of creating the task allows different strategies for running the task, e.g., tasks could be scheduled to run periodically or after a delay. The thread-pool size can be set heuristically based on system resources and expected concurrent client requests, or dynamically adjusted (e.g., Apple's Grand Central Dispatch).

Signals are used in UNIX systems to notify a process that a particular event has occurred, and a signal handler is used to process signals. A signal is generated by a particular event, delivered to a process, and handled by one of two signal handlers: default or user-defined. Every signal has a default handler that the kernel runs when handling the signal; a user-defined signal handler can override the default. For a single-threaded process, the signal is delivered to the process. Where should a signal be delivered for a multithreaded process? Options: deliver the signal to the thread to which the signal applies, to every thread in the process, or to certain threads in the process; or assign a specific thread to receive all signals for the process. Synchronous signals need to be delivered to the thread causing the signal (e.g., a divide-by-0 error); some asynchronous signals should go to all threads (e.g., Control-C). OSs vary in their support: some allow the sender to specify a thread as the destination (Windows APCs), and some multithreaded UNIX systems allow threads to selectively block signals.
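A minimal sketch of installing a user-defined handler that overrides the default action for SIGINT (Control-C), using POSIX sigaction; the flag-based handler body is an illustrative assumption.

```c
/* User-defined signal handler sketch: override the default action for SIGINT. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo) {
    (void)signo;
    got_sigint = 1;            /* only set a flag: keep the handler async-signal-safe */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigint;             /* user-defined handler replaces the default */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);

    printf("press Control-C to deliver SIGINT...\n");
    while (!got_sigint)
        pause();                           /* sleep until a signal is delivered */
    printf("caught SIGINT via user-defined handler\n");
    return 0;
}
```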
Thread cancellation – terminating a thread before it has finished. Examples: multiple threads searching a database – when the data item is found by one thread, the others can be cancelled; pressing the X button on a browser window causes the multiple threads loading content (each image is loaded by a separate thread) to be cancelled. The thread to be cancelled is the target thread. Two general approaches: asynchronous cancellation terminates the target thread immediately (usually, though this is not guaranteed); problems occur when resources have been allocated to the target thread or it is updating data shared with other threads, and the OS may reclaim only system resources, not all resources. Deferred cancellation allows the target thread to periodically check whether it should be cancelled.

Thread-local storage – threads share the data of the process, which reduces the overhead of data sharing, but sometimes threads need their own copy of data; for example, transactions might store their transaction ids in TLS. Thread-local storage (TLS) allows each thread to have its own copy of data (e.g., errno holding a system error code: if not protected, it could get overwritten by another thread). TLS is useful when you do not have control over the thread-creation process (e.g., when using a thread pool). It differs from local variables: local variables are visible only during a single function invocation, while TLS is visible across function invocations (useful for saving state). It is similar to static data, except that TLS is unique to each thread. (A small TLS sketch appears after the scheduling notes below.)

HARDWARE threads (or CPU threads): 1. a single CPU core is represented to the operating system as two cores; 2. the OS schedules two tasks on the two "logical" cores as it would on two physical cores in a multiprocessor system; 3. the single physical CPU core switches between the tasks on the two logical cores as it sees fit; 4. when one task is stalled waiting for data to be loaded, it switches to the other one. Example: a program issues a LOAD instruction; if the content of the requested address isn't in cache, it must be fetched from RAM, and there is a delay. Without hardware CPU threading, the CPU is idle during this fetch time. With hardware threading, the state of multiple threads of computation is kept in internal CPU memory; instead of waiting, the CPU swaps out the current state, swaps in one that is ready to go, and keeps executing. That swap can start the new thread on the very next CPU cycle.

CPU SCHEDULING ALGORITHMS. Maximum CPU utilization is obtained with multiprogramming. CPU-I/O burst cycle – process execution consists of a cycle of CPU execution and I/O wait: a CPU burst followed by an I/O burst. The CPU burst distribution is of main concern. SHORT-TERM SCHEDULER – selects from among the processes in the ready queue and allocates the CPU to one of them; the queue may be ordered in various ways (FIFO, priority, tree, unordered) and usually contains process PCBs. CPU-scheduling decisions may take place when a process: 1. switches from running to waiting state (I/O request, child termination); 2. switches from running to ready state (when an interrupt occurs); 3. switches from waiting to ready (when I/O completes); 4. terminates. DISPATCHER – the module that gives control of the CPU to the process selected by the short-term scheduler. This involves: 1. switching context; 2. switching to user mode; 3. jumping to the proper location in the user program to restart that program. DISPATCH LATENCY – the time it takes for the dispatcher to stop one process and start another running.
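Picking up the thread-local storage note above, a minimal sketch using the GCC/Clang __thread storage class (an assumption; pthread_key_create is the portable POSIX alternative): each thread sees and updates only its own copy of the variable, and the value persists across function calls within that thread.

```c
/* Thread-local storage sketch: each thread gets its own copy of `tls_counter`,
 * visible across function calls within that thread but invisible to other threads. */
#include <pthread.h>
#include <stdio.h>

static __thread int tls_counter = 0;    /* one instance per thread (GCC/Clang extension) */

static void bump(void) {
    tls_counter++;                      /* updates only the calling thread's copy */
}

static void *worker(void *arg) {
    long id = (long)arg;
    for (int i = 0; i < 3; i++)
        bump();
    printf("thread %ld: tls_counter = %d\n", id, tls_counter);  /* always 3 */
    return NULL;
}

int main(void) {
    pthread_t t[2];
    for (long i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```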
Dispatchers should be fast to minimize dispatch latency. There are many criteria for comparing scheduling algorithms: 1. CPU utilization – keep the CPU as busy as possible (40-90%) (maximize). 2. Throughput – number of processes that complete their execution per time unit (maximize). 3. Turnaround time – amount of time to execute a particular process (the sum of the times spent waiting to enter memory, waiting in the ready queue, executing on the CPU, and waiting on I/O); turnaround time is generally a function of the speed of the output device – why? (minimize). 4. Waiting time – amount of time a process has been waiting in the ready queue (minimize). 5. Response time – amount of time from when a request was submitted until the first response is produced, not the time it takes to output the response (for time-sharing environments) (minimize).

First-Come, First-Served (FCFS) Scheduling. (Process, CPU burst time): (P1, 24), (P2, 3), (P3, 3). The Gantt chart for the schedule is: P1 (0-24), P2 (24-27), P3 (27-30). Waiting time for P1 = 0, P2 = 24, P3 = 27; average waiting time = (0 + 24 + 27) / 3 = 17. BUT the average wait time is NOT minimal and is sensitive to variance in the CPU burst times of the processes. Remember: FCFS scheduling is nonpreemptive and not suitable for timesharing (read: cloud-computing) systems.

Shortest-Job-First (SJF) Scheduling: associate with each process the length of its next CPU burst and use these lengths to schedule the process with the shortest time; FCFS breaks the tie if two processes have equal times. SJF is optimal – it gives the minimum average waiting time for a given set of processes. The difficulty is knowing the length of the next CPU request (aka the shortest next CPU burst). SJF is a special case of the general priority-scheduling algorithm. Example: (Process, burst time) = (P1, 6), (P2, 8), (P3, 7), (P4, 3). Schedule: P4 (0-3), P1 (3-9), P3 (9-16), P2 (16-24). Average waiting time = (3 + 16 + 9 + 0) / 4 = 7; the average waiting time under FCFS would have been 10.25 (not good). (A small sketch computing these averages appears after these notes.)

Preemptive SJF: the SJF algorithm can be preemptive or nonpreemptive. The choice arises when a new process arrives at the ready queue while a process is still executing: the next CPU burst of the newly arrived process may be shorter than what is left of the currently executing process. A preemptive SJF algorithm will preempt the currently executing process; a nonpreemptive SJF algorithm will allow the currently executing process to finish its CPU burst. Preemptive SJF is sometimes called Shortest-Remaining-Time-First scheduling. Example:

Process  Arrival Time  Burst Time
P1       0             9
P2       1             3
P3       2             10
P4       3             6

Gantt chart: P1 (0-1), P2 (1-4), P4 (4-10), P1 (10-18), P3 (18-28).
Turnaround Time = Completion Time - Arrival Time; Wait Time = Turnaround Time - Burst Time.

Process  Completion  Turnaround (CT - AT)  Waiting (TAT - Burst)
P1       18          18 - 0 = 18           18 - 9 = 9
P2       4           4 - 1 = 3             3 - 3 = 0
P3       28          28 - 2 = 26           26 - 10 = 16
P4       10          10 - 3 = 7            7 - 6 = 1

Average Wait Time = (sum of WT for each process) / (# of processes) = (9 + 0 + 16 + 1) / 4 = 6.5.
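A small sketch that recomputes the FCFS and nonpreemptive SJF averages above by walking each ready-queue order; the burst values come from the examples, and the function name is an illustrative assumption.

```c
/* Average waiting time for a nonpreemptive schedule:
 * each process waits for the total burst time of everything scheduled before it. */
#include <stdio.h>

static double avg_wait(const int bursts[], int n) {
    int elapsed = 0;
    long total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += elapsed;      /* process i waits until all earlier ones finish */
        elapsed += bursts[i];
    }
    return (double)total_wait / n;
}

int main(void) {
    int fcfs[]  = {24, 3, 3};           /* P1, P2, P3 in arrival order */
    int sjf[]   = {3, 6, 7, 8};         /* P4, P1, P3, P2 sorted by burst length */
    int fcfs2[] = {6, 8, 7, 3};         /* same four jobs, FCFS order */

    printf("FCFS (24,3,3):  %.2f\n", avg_wait(fcfs, 3));    /* 17.00 */
    printf("SJF  (6,8,7,3): %.2f\n", avg_wait(sjf, 4));     /* 7.00  */
    printf("FCFS (6,8,7,3): %.2f\n", avg_wait(fcfs2, 4));   /* 10.25 */
    return 0;
}
```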
PRIORITY SCHEDULING: a priority number (integer) is associated with each process, and the CPU is allocated to the process with the highest priority (smallest integer = highest priority); FCFS breaks ties. It can be 1. preemptive or 2. nonpreemptive. The major problem is indefinite blocking, or starvation – low-priority processes may never execute. Solution: aging – as time progresses, increase the priority of the process (e.g., increment the process priority by 1 every 15 minutes). SJF is priority scheduling where the priority is the inverse of the predicted next CPU burst time. Priority types: Internal – use measurable quantities to compute priority: 1. time limits; 2. memory requirements; 3. number of open files; 4. ratio of average I/O burst to average CPU burst. External – use criteria outside the operating system: 1. importance of the process; 2. type and amount of funds being paid for computer use; 3. department sponsoring the work; 4. organizational politics.

Round Robin (RR): designed for timesharing systems (read: cloud computing). 1. FCFS + preemption. 2. Each process gets a small unit of CPU time (the time quantum q), usually 10-100 milliseconds; after this time has elapsed, the process is preempted and added to the end of the ready (circular) queue. 3. If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once, and no process waits more than (n-1)q time units. 4. A timer interrupts every quantum to schedule the next process. 5. Performance: if q is large, RR degenerates to FIFO/FCFS; if q is small, the context-switch overhead becomes too high, so q must be large with respect to the context-switch time. RR typically has a higher average wait time than SJF, but better response time. q should be large compared to the context-switch time – why? q is usually 10 ms to 100 ms, while a context switch takes < 10 μsec. Average turnaround time does NOT necessarily improve as the time-quantum size increases. (A small RR simulation sketch follows below.)

MULTILEVEL QUEUE: designed for situations where processes fit into different groups. The ready queue is partitioned into separate queues, e.g., foreground (interactive) and background (batch); a process stays permanently in a given queue. Each queue has its own scheduling algorithm, e.g., foreground – RR (why?) and background – FCFS. Scheduling must also be done between the queues: 1. fixed-priority scheduling (i.e., serve all from foreground, then from background – including preemption!), with the possibility of starvation; 2. time slice – each queue gets a certain amount of CPU time which it can schedule among its processes, e.g., 80% to foreground in RR and 20% to background in FCFS.

Multilevel Feedback Queue: processes are separated according to the characteristics of their CPU bursts, and a process can move between the various queues – use too much CPU and you get demoted. Aging can be implemented this way: a process waiting too long in a lower-priority queue may be moved to a higher-priority queue. A multilevel-feedback-queue scheduler is defined by the following parameters: 1. the number of queues; 2. the scheduling algorithm for each queue; 3. the method used to determine when to upgrade a process; 4. the method used to determine when to demote a process; 5. the method used to determine which queue a process will enter when it needs service.

Priority-based scheduling for real-time systems: the scheduler must support preemptive, priority-based scheduling, but that only guarantees soft real-time. What does that mean? For hard real-time, the scheduler must also provide the ability to meet deadlines, and processes must announce their deadlines to the scheduler. Processes have new characteristics: periodic ones require the CPU at constant intervals; each has processing time t, deadline d, and period p, with 0 ≤ t ≤ d ≤ p, and the rate of a periodic task is 1/p. Processes announce their deadline requirements to the scheduler. Using an admission-control algorithm, the scheduler either admits the process, guaranteeing that it will complete on time, or rejects the request as impossible if it cannot guarantee that the task will be serviced by its deadline.

Rate-Monotonic Scheduling (RMS): assign a higher priority to tasks that need the CPU more often. A priority is assigned based on the inverse of a task's period: shorter periods = higher priority, longer periods = lower priority (so if P1 has a shorter period than P2, P1 is assigned a higher priority than P2). RMS is static-priority scheduling with preemption.
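A small sketch simulating round-robin with quantum q for processes that all arrive at time 0 (an illustrative simplification), reusing the burst times from the FCFS example and printing each process's waiting time.

```c
/* Round-robin sketch: all processes arrive at time 0; quantum q.
 * Waiting time = completion time - burst time (arrival is 0). */
#include <stdio.h>

#define NPROC 3

int main(void) {
    int burst[NPROC] = {24, 3, 3};      /* same bursts as the FCFS example */
    int remaining[NPROC], completion[NPROC];
    int q = 4, time = 0, done = 0;

    for (int i = 0; i < NPROC; i++) remaining[i] = burst[i];

    while (done < NPROC) {
        for (int i = 0; i < NPROC; i++) {       /* circular ready queue */
            if (remaining[i] == 0) continue;
            int slice = remaining[i] < q ? remaining[i] : q;
            time += slice;                      /* run this process for up to q units */
            remaining[i] -= slice;
            if (remaining[i] == 0) {            /* process finished during this slice */
                completion[i] = time;
                done++;
            }
        }
    }

    double total_wait = 0;
    for (int i = 0; i < NPROC; i++) {
        int wait = completion[i] - burst[i];
        total_wait += wait;
        printf("P%d: completion %d, waiting %d\n", i + 1, completion[i], wait);
    }
    /* With these inputs and q = 4: P1 waits 6, P2 waits 4, P3 waits 7; average ~5.67 */
    printf("average waiting time = %.2f\n", total_wait / NPROC);
    return 0;
}
```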
SOLARIS: priority-based scheduling with six classes available: 1. time sharing (default) (TS); 2. interactive (IA); 3. real time (RT); 4. system (SYS) – reserved for kernel use; 5. fair share (FSS) – uses shares instead of priorities for scheduling; 6. fixed priority (FP) – priorities are not dynamically adjusted. A given thread can be in only one class at a time, and each class has its own scheduling algorithm. The time-sharing class uses a multilevel feedback queue that dynamically alters priorities and assigns time slices of different lengths, driven by a loadable table configurable by the sysadmin.

1. What is a hardware thread and why is it used? A hardware thread is a single CPU core being represented to the OS as two or more cores. The OS schedules the logical cores as it would physical cores in a multiprocessor system, and the single physical core switches between the tasks. This is used for more efficient utilization of resources during the execution of a program: with hardware threading, a CPU can switch between multiple threads instead of idling while a single thread waits, which helps eliminate wait delays and improves effective utilization of CPU cycles.

2. What is a thread pool and why is it used? A thread pool is a number of threads created at process startup that sit in a pool awaiting work. The advantages are usually slightly faster request servicing (using an existing thread rather than creating a new one), allowing the number of threads in an application to be bound by the size of the pool, and separating the task to be performed from the mechanics of creating the task, which allows different strategies for running the task. A thread pool is used to reduce thread creation, to recycle threads rather than discarding them, and to prevent unlimited threads from exhausting CPU or memory.

3. When two cooperating processes need to share large amounts of data, which of the two methods discussed in class would be the best choice? Justify your answer. Shared memory would be the best choice for cooperating processes that need to share large amounts of data, because they exchange information simply by reading and writing a designated region of memory. An area of memory would be established for the two processes to communicate, and they would synchronize their actions when accessing the shared memory. In message passing, only messages would be exchanged rather than a large shared region, and since there are no shared variables, message passing is less suited to sharing large amounts of data.

4. A server is being overloaded by client requests. Explain how the use of threads could alleviate this problem. Explain why you would choose synchronous or asynchronous threading in your solution. A thread pool would create a number of threads at process startup that wait in a pool to handle client requests. The advantages are usually slightly faster request servicing (using an existing thread rather than creating a new one) and allowing the number of threads in the application to be bound to the size of the pool. Recycling threads rather than creating new ones is less costly and better for CPU and memory. Asynchronous threading would be the better choice so that threads can run simultaneously and process requests as quickly as possible; threads running sequentially under synchronous threading could quickly cause a huge, slowly processed backlog of requests.

5. Describe three general methods used to pass parameters to the operating system during system calls.
1. Pass the parameters in registers. This is insufficient when there are more parameters than registers.
2. If there are more parameters than registers, store the parameters in a block or table in memory and pass the block's address. No limit on the number or length of parameters being passed.
3. Parameters are placed, or pushed, onto the stack by the program and popped off the stack by the OS. No limit on the number or length of parameters being passed.

# CPU Scheduling

**What is CPU Scheduling?**
- It is the basis of multiprogrammed operating systems
- By switching the CPU among processes, the operating system can make the computer more productive
- The objective of multiprogramming is to have some process running at all times, to maximize CPU utilization
- A process is executed until it must wait, typically for the completion of some I/O request
- When one process needs to wait, the OS will take the CPU away from that process and give it to another one until the wait is over

**CPU and I/O Burst Cycles**
- Process execution consists of a cycle of CPU execution and input/output wait; processes alternate between these two states
- CPU burst: when the process is being executed on the CPU
- I/O burst: when the process is waiting for I/O before further execution
- 🔚 The final CPU burst ends with a system request to terminate execution

**CPU Scheduler**
- Short-term scheduler
  - Selects from among the processes in the ready queue, and allocates the CPU to one of them
  - Queue may be ordered in various ways and usually contains process PCBs
- CPU scheduling decisions may take place when a process:
  1. Switches from running to waiting state (I/O request, child termination)
  2. Switches from running to ready state (when an interrupt occurs)
  3. Switches from waiting to ready (when I/O completes)
  4. Terminates

**Dispatcher**
- Dispatcher module gives control of the CPU to the process selected by the short-term scheduler
- This involves:
  - Switching context
  - Switching to user mode
  - Jumping to the proper location in the user program to restart the program
- Dispatch latency: time it takes for the dispatcher to stop one process and start another
- Dispatchers should be fast to minimize dispatch latency

**Scheduling Criteria**
- CPU utilization: keep the CPU as busy as possible (40-90%)
- Throughput: number of processes that complete their execution per time unit
- Turnaround time: amount of time to execute a particular process. Turnaround time is generally a function of the speed of the output device
- Waiting time: amount of time a process has been waiting in the ready queue
- Response time: amount of time from when a request was submitted until the first response is produced, not the time it takes to output the response

**CPU Scheduling Algorithms**
- Optimization criteria: maximize CPU utilization and throughput; minimize turnaround time, waiting time, and response time