Bootstrapping Theory
Boot is short for bootstrap loading and derives from the phrase “to pull oneself up by one's bootstraps”. The usage calls attention to the requirement that, if most software is loaded onto a computer by other software already running there, then some mechanism must exist to have loaded the “initial” software onto the computer. Early computers used a variety of methods to get a small program into memory to solve this initial problem. Up through the 1960s this was performed by a card deck. The processor hardware was wired to read the contents of a single card into memory. The contents of that card contained a program that could then instruct the computer to read additional cards into memory. Those cards in turn contained sufficient programming to direct an OS to be loaded from either more cards, tape or disk devices. After the OS was successfully loaded into low memory it took control of the processor either via an interruption or by loading of the current program status word (PSW).
Modern OSes are initially loaded (“booted”) by supplied firmware mechanisms with such names as:
IPL – Initial Program Loader
IMPL – Initial Microcode Program Loader
BIOS – Basic Input Output System
UEFI – Unified Extensible Firmware Interface
System Services
Major services provided by the OS are:
User Interfaces – Text or Graphical User Interface
Program Loading or Instantiating
Scheduling and Dispatching – Time-of-Day and Event based
Input-Output Operations – Fairly and efficiently scheduling I/O services
Maintain the File System – Providing users and programs with an abstraction of hierarchical directories and files, along with the operations to manipulate them.
Data Communications – Message passing among processes.
Error Detection – Vigilance for user-generated program checks.
Resource Access Protection and Security – Ensure that access to resources is controlled.
Resource Allocation – Ensure that processes are not scheduled unless the resources they need become available.
Accounting for services rendered to Applications and Users – Resource consumption.
System User Interfaces
Command Interpretation – Command text line interface for users via a terminal.
Graphical User Interface – Desktop
Application Programming Interfaces (API) – A supplied set of functions that specifies the parameters that are passed to each function as arguments, and the return values the programmer or user can expect as results. Along with the API is usually a set of return codes that enumerate the possible ways a function can fail.
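For illustration, a minimal sketch in C of how such a function, its parameters, and its return codes might be declared. The names cache_read and the RC_ values are hypothetical, not taken from any particular API:

    /* Hypothetical API header: each function documents its parameters and return codes. */
    #define RC_OK         0   /* request completed successfully          */
    #define RC_NOT_FOUND  4   /* the requested key does not exist        */
    #define RC_BAD_PARAM  8   /* a NULL pointer or zero-length buffer    */

    /* Look up 'key' and copy its value into 'buf' (at most 'buflen' bytes).
       The caller inspects the return code to distinguish the failure modes. */
    int cache_read(const char *key, char *buf, unsigned buflen);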
Calls for Supervisor Services (SVC)
I/O, File and Device manipulation – Examples: OPEN, READ, WRITE, CLOSE (a sketch follows this list)
Information query and maintenance – Examples: What is the TIME? How much storage do I have?
Process Control – Termination
Communications – Domain Name Services, DHCP and other daemon services
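A minimal sketch of the OPEN/READ/WRITE/CLOSE pattern using the POSIX file-service calls; the file name is arbitrary:

    #include <fcntl.h>      /* open                 */
    #include <unistd.h>     /* read, write, close   */
    #include <stdio.h>

    int main(void) {
        char buf[256];
        int fd = open("notes.txt", O_RDONLY);      /* OPEN: ask the OS to reserve the file */
        if (fd < 0) { perror("open"); return 1; }  /* the return code reports the failure  */
        ssize_t n = read(fd, buf, sizeof buf);     /* READ: the OS performs the device I/O */
        if (n > 0)
            write(STDOUT_FILENO, buf, (size_t)n);  /* WRITE: copy what was read to stdout  */
        close(fd);                                 /* CLOSE: release the file              */
        return 0;
    }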
System Programs
A set of system utilities that perform:
File Management
Status Information
Search and Edit Tools
Programming Language Support
Program Loading and Execution
Automation Services
Inter-Task and Inter-network Communications
Common unprivileged Applications – Web browsers
OS Design and Implementation
Design Goals – Convenience
Mechanisms How to accomplish some work.
Policies What will be accomplished, and when.
Good mechanisms provide flexibility by being insensitive to policy changes.
Implementation – Early OSes were written in assembly language. Modern OSes are written with assembly for the first-level interrupt handlers and device drivers, and generally a higher-level language such as C for the remainder of the kernel. System programs are written in languages such as C++ and PL/S. High-Performance Areas of the OS: Interrupt Handlers.
OS Structure
Simple Structure – Microsoft MS-DOS, early AT&T UNIX, IBM TOS and DOS
The OS Service Pyramid – Computer HW Architecture
Layered Structure Approach – The OS is modularized into a number of layers. The bottom layer is the hardware. The highest layer is the user interface. Each layer may interface with only its adjacent layers. Each layer is implemented with operations provided by lower-level layers. Overall functionality is determined and separated into components. This allows the OS to retain much greater control over the computer and the applications.
Loadable Kernel Modules – The kernel provides core services, while other services are implemented and loaded dynamically while the kernel is running. Linking and loading services dynamically is preferable to adding new features directly to the kernel, which would require recompiling the kernel with every change. Examples: device drivers and file systems.
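As an illustration, the skeleton of a Linux loadable kernel module; a sketch only, built against the kernel's own build system rather than compiled standalone, and hello_init/hello_exit are arbitrary names:

    #include <linux/module.h>   /* core module support      */
    #include <linux/kernel.h>   /* printk                   */
    #include <linux/init.h>     /* __init / __exit macros   */

    static int __init hello_init(void)
    {
        printk(KERN_INFO "module loaded into the running kernel\n");
        return 0;                        /* nonzero would abort the load */
    }

    static void __exit hello_exit(void)
    {
        printk(KERN_INFO "module unloaded\n");
    }

    module_init(hello_init);             /* called at insmod time */
    module_exit(hello_exit);             /* called at rmmod time  */
    MODULE_LICENSE("GPL");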
Virtual Machine
What is a Virtual Machine? Virtual Machines are those operating environments that provide emulation methods that present a standard software interface that is identical to the underlying hardware to their "guest" operating systems. Also known as "hypervisors", these operating platforms create multiple instances of the same computer architecture as the one on which the hypervisor is natively running. The hypervisor control program objectifies virtual instances of the computer architecture through the use of software data structures that emulate the hardware components of the architecture.
Guest Operating Systems The “guest” OS is any operating system that can run on that instance of the architecture. The guest runs as an application to the hypervisor. Therefore, it executes in the “problem state”. The guest has no knowledge nor reason to believe that it is running on anything but real hardware. No changes need to be made to the guest to allow it to function correctly when running under a Virtual Machine environment.
History – The CP-67 OS was developed by IBM in 1967. This was the first attempt at VM. Later IBM provided VM/370 as a field developed program for corporate experimentation. In the 1980s IBM released VM/SP as part of its official product line. Today z/VM is the flagship of the VM line of hypervisors. z/VM runs a variety of guest operating systems including MVS/ESA, VSE, z/OS, z/TPF, AIX, and Linux on z (Red Hat and SUSE). VMware developed its hypervisor to emulate Intel architectures in the late 1990s; the company was later acquired by EMC. After 2000 we saw a proliferation of hypervisors developed by other vendors, such as KVM for Linux and VirtualBox from Sun (later Oracle).
Benefits – Unlike distributed hardware-based solutions, virtualization technology allows you to virtualize processor, communications, storage, I/O, and networking resources to help reduce the need to provide duplicate hardware, programming, and data resources. Reduction in hardware costs.
Simulation of Architectures – This is where the native host system and the guest system are of different architectures. The host system presents a simulated machine architecture to the guest, one in which the guest operating system was designed to run. No change to the operating system is necessary to run as a guest. Virtualization allows us to run operating systems requiring older architectures without the need to keep the old real machine. Emulation can increase the life of programming.
Para-Virtualization – This is a variation on the virtualization theme. Rather than trick the guest operating system into believing it has a hardware system to itself, para-virtualization presents the guest with a system that is similar but not identical to the guest's normal operating environment. The guest operating code must be modified to run on paravirtualized hardware. This results in a more efficient use of resources and a smaller virtualization layer.
Implementations Only the native operating system, the hypervisor, is permitted to execute in the Supervisor State (privileged). This means that each of the guests must operate in the Problem State. Execution of instructions in the problem state is limited to the non-privileged instructions. But OSes are designed to make use of the full instruction set including privileged instructions. When a guest OS attempts to execute a privileged instruction then a real PROGRAM CHECK interrupt occurs in the hardware. This check is presented to the hypervisor. The hypervisor will interrogate the source of the interruption. The hypervisor must check the virtual PSW of the guest to determine if the guest was in the “virtual” supervisor state (OS) or virtual problem state (APP). If the instruction was attempted while in the virtual supervisor state then the hypervisor will execute the instruction on the guest OS behalf, then reflect the results and statuses for the completed operation in the fixed storage locations of the guest operating system. The guest is then re-dispatched. Otherwise if the hypervisor determines that the guest was in the virtual problem state when it attempted a privileged instruction, then the hypervisor would fail the attempt, reflect that status back to the guest; perform an emulated program-check interruption to the guest; then re-dispatch the guest. In either case the guest “believes” that it itself performed the operation because all events from its perspective appear to have occurred on real hardware.
Some commercial hypervisors: IBM Virtual Machine (zVM)
Operating System Problem Determination
Debugging – effort to find the cause of faults. Core Dump – snapshot of the contents of memory at some particular point in processing.
The Process Concept
The Process – The process or job is a program in execution. Its current activity is represented by the changing value of the program counter (kept in the PSW or Program Counter register) as it advances through the executable code of the program instance residing in memory.
Executable Code – contains the instructions for the CPU.
Data Section – structures that contain variables and constants used by the process.
Save Areas – memory to save the contents of registers and the PSW.
The Process Stack – local variables.
Recall that the process is the active entity that represents an executing instance of a program. The program is a passive entity representing a "blueprint" from which the process is instantiated.
Process States
New – the process is being created by the OS. This will shortly transition to the Ready-for-Work state.
Running – instructions are being executed by the CPU. Only a single process can be in this state per CPU. Can transition to either Terminated, Waiting or Ready-for-Work.
Waiting – in this state the task does not consume CPU because it is waiting for some event to occur, such as I/O completion or a timer pop. A waiting task's next transition is to Ready-for-Work.
Ready-for-Work – The process is not waiting for any event, but is waiting for access to the CPU. The next transition is to Running.
Terminated – The task has completed execution and is being garbage collected, its resources de-allocated by the OS.
Process Control Block
Each process is represented in the OS by a software data structure called a PROCESS CONTROL BLOCK. It contains information associated with the task:
Current State – Such as the next instruction to be executed, interrupts pending, system mode.
Save Area Information – Program Counter information from the PSW.
CPU Dispatching information – CPU Priority.
Memory management information – Base and Limit registers.
Accounting information – CPU consumption.
I/O device allocation and status information.
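A simplified sketch of what such a control block might look like as a C structure; the field names are illustrative, not those of any particular OS:

    enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

    struct pcb {
        int             pid;            /* process identifier                  */
        enum proc_state state;          /* current state                       */
        unsigned long   psw;            /* saved program counter / PSW         */
        unsigned long   regs[16];       /* save area for general registers     */
        int             priority;       /* CPU dispatching priority            */
        unsigned long   base, limit;    /* memory-management registers         */
        unsigned long   cpu_time_used;  /* accounting: CPU consumption         */
        struct pcb     *next;           /* link on a scheduling queue          */
    };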
Threads or Tasks – A Thread is the basic dispatchable unit of computation. The thread represents an independent instance of execution of a unit of logical program code. Its environment is characterized by a PSW, a set of register values, and a save area stack. The thread is spawned at the direction of a "parent" process. The parent of all processes is the operating system itself. Each thread runs independently of the other threads. Only a single instruction of a thread may be executing at a time, regardless of the number of CPUs.
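A minimal sketch of a parent spawning one thread with POSIX threads (compiled with the -pthread option); the worker name is arbitrary:

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)            /* the thread's independent unit of code */
    {
        printf("thread %d running\n", *(int *)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        int id = 1;
        pthread_create(&tid, NULL, worker, &id);  /* parent spawns the thread  */
        pthread_join(tid, NULL);                  /* wait for it to terminate  */
        return 0;
    }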
Process Scheduling – The goal of scheduling is to keep the CPU as busy as possible with productive work, and to switch the CPU among processes fast enough to give interactive users the illusion that each is simultaneously interacting with the applications and receiving good response times.
Scheduling Queues
Job Queue – This consists of all processes that enter the system.
Dispatch or Ready Queue – The process must be ready to run; not waiting for anything but the CPU. The OS selects the next task to be dispatched and given the CPU for processing.
Eligible-List – The process is ready, but insufficient storage is available to run it without adversely impacting the other processes.
Device Queue – Waiting for the device-end interrupt to be posted for an I/O request.
Schedulers – Migrate processes among the various scheduling queues. The JOB scheduler selects processes from the pool of all submitted jobs and loads their program into memory. The CPU scheduler selects from among the processes that are ready to run and allocates the CPU to one of them. Interactive Type – These exhibit characteristics of requiring small intervals of CPU time very often. Batch Type – These are long running processes that require large increments of CPU time infrequently. I/O Bound – A task that spends more than half of its wall clock time requesting and waiting for the completion of input and output. It tends to release the CPU prior to its timeslice end. CPU Bound – A task that spends more than half of its wall clock time consuming CPU cycles. It tends to consume its entire allocated timeslice. It is important that the JOB scheduler attempt to select a favorable process mix of I/O-bound and CPU-bound tasks to ensure good CPU utilization and throughput. A good process mix will keep both the ready queue and the I/O queue non-empty so that both the devices and the CPU are kept busy.
Medium-Term Scheduler and Swapping Medium-Term processes are not in need of CPU time as frequently as Short-Term tasks but more frequently than Long-Term tasks. Swapping removes the entire address space of a process from memory to auxiliary storage until its next timeslice.
Context Switching – This concerns the process state represented by the Process Control Block (PCB and PSW), the values in registers, the memory management information, and the state of execution.
Operations on Processes
Process Creation – Processes are created by a parent mechanism. Main tasks are created by the OS.
Process Termination – Processes are terminated by self-destruction or by a parent mechanism. Generally, a child process cannot be allowed to exist if its parent has terminated. When a process terminates, its resources are deallocated and garbage collected by the OS.
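A minimal POSIX sketch of parent-driven creation and termination:

    #include <sys/wait.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        pid_t child = fork();               /* parent creates a child process        */
        if (child == 0) {
            printf("child %d running\n", (int)getpid());
            exit(0);                        /* self-termination; resources reclaimed */
        }
        waitpid(child, NULL, 0);            /* parent collects the terminated child  */
        printf("parent observed child exit\n");
        return 0;
    }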
Interprocess Communication Any process that does not share data with any other process is known as independent. A process is cooperating if it can affect or be affected by another process executing in the system. Processes may be sharing a resource or passing messages to each other.
Reasons for providing process cooperation: Information sharing, such as common files read by users. Computation speedup via subtasking work. Modularity, dividing programmed functions into separate subtasks. Convenience, such as using an IDE that checks syntax while the user is entering code into an editor.
Shared-Memory Systems Producer vs. Consumer processes The producer generates information. The consumer accesses and uses the information generated.
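The classic textbook sketch of a shared bounded buffer for one producer and one consumer; it busy-waits and leaves one slot empty to distinguish full from empty, and a production version would add atomics or locks:

    #define BUFFER_SIZE 8
    static int buffer[BUFFER_SIZE];
    static int in = 0, out = 0;        /* producer writes at in, consumer reads at out */

    void produce(int item)
    {
        while ((in + 1) % BUFFER_SIZE == out)
            ;                          /* buffer full: wait for the consumer           */
        buffer[in] = item;
        in = (in + 1) % BUFFER_SIZE;
    }

    int consume(void)
    {
        while (in == out)
            ;                          /* buffer empty: wait for the producer          */
        int item = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        return item;
    }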
Message-Passing Systems
Direct communication
Store-and-Forward
Synchronization – A process is termed blocked when it can no longer make progress because it is waiting for some event to occur or some resource to become available. Blocking send (must wait if the partner is not ready to receive)
Buffering – Zero capacity
Communication in Client-Server Systems Sockets – A socket is defined as an endpoint for station-to-station communication. A socket is identified by an IP Address concatenated with a Port ID, such as 130.171.242.6:2048. There are 65536 possible ports per OS instance. Pairs of processes communicating over a network (inter-machine or intra-machine) employ a pair of sockets, one for each side. Generally, two channels of communication are established, one for sending and the other for receiving. This way both processes can transmit messages simultaneously (full-duplex). Higher-level programming languages allow sending/receiving messages as easily as reading/writing local files.
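A hedged sketch of one side of such a socket pair: a TCP client connecting to 127.0.0.1 on port 2048 (the address and port are arbitrary):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        int sock = socket(AF_INET, SOCK_STREAM, 0);      /* create the endpoint          */
        struct sockaddr_in peer = {0};
        peer.sin_family = AF_INET;
        peer.sin_port   = htons(2048);                   /* port ID                      */
        inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr); /* IP address                   */
        if (connect(sock, (struct sockaddr *)&peer, sizeof peer) < 0) {
            perror("connect");
            return 1;
        }
        send(sock, "hello\n", 6, 0);                     /* full-duplex: may also recv() */
        close(sock);
        return 0;
    }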
Remote Procedure Calls (RPC) Messages exchanged in RPC communication are well-structured and no longer just packets of data. Each message is addressed to an RPC daemon function listening to a particular port on the remote system. Each message contains an identifier specifying the particular function to execute and the parameters needed to be passed to that function. The function is then executed as requested and any output is sent back to the requestor via a separate message.
PIPES A pipe is a communications conduit between two processes. Messages are usually sent one-way. Pipes allow two processes to communicate in standard producer-consumer fashion. The producer process writes output to one end of the pipe. The consumer process reads input from the other end of the pipe. So one process feeds its output as input to the next process. Pipes will reduce I/O wait time and increase CPU utilization because multiple pipe stages are active processing concurrently.
Examples:
UNIX: ls | less
Windows: dir | more
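A minimal sketch of the same producer-consumer pattern with a POSIX pipe between a parent (producer) and a child (consumer):

    #include <unistd.h>
    #include <stdio.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];
        pipe(fd);                                   /* fd[1] = write end, fd[0] = read end */
        if (fork() == 0) {                          /* child: the consumer                 */
            char buf[64];
            close(fd[1]);
            ssize_t n = read(fd[0], buf, sizeof buf - 1);
            if (n > 0) { buf[n] = '\0'; printf("consumer read: %s", buf); }
            return 0;
        }
        close(fd[0]);                               /* parent: the producer                */
        write(fd[1], "one line of output\n", 19);
        close(fd[1]);
        wait(NULL);
        return 0;
    }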
Basic Concepts – Synchronization mechanisms are those protections that are designed to address concurrency-control problems by ensuring the orderly execution of cooperating processes that share a common resource, so that data consistency and integrity are maintained.
Producer and consumer processes that share data can exhibit functional problems when executed concurrently. Either the transmission or the reception of data between the processes may block should data not be ready to transmit or the receiver not be ready to accept.
Race Conditions occur where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which access to the shared data takes place. Problems: Incorrect or inconsistent results.
Mutual Exclusion ensures that a critical section of code is executed by only one process at a time. Other processes are barred from access. Benefits: Process Synchronization and Serialization.
The Critical-Section Problem – A critical section (CS) is a segment of code within each process which may be changing common variables, updating a table, writing a file, etc. When one process has update access to its critical section then no other process may be permitted to execute in the same area. Critical Sections have entry and exit sections. Critical Sections should be as small as possible, self-contained, and fast executing, with no calls. The actual code of Critical Sections may itself be shared among programs.
A transaction is a programmed work unit in the critical section of code that must be fully executed without interruption; that is, either all the operations associated with the function are executed to completion, or none of the operations are attempted.
Solution to the CS problem: 1) Mutual Exclusion (interlocking) occurring prior to the execution of the CS. 2) Progress rapidly through the CS. 3) Bounded waiting; a limit on the number of times that other processes are allowed to enter their CS after a process has made a request to enter and before that request is granted.
Pre-emptive vs. non-preemptive kernel code Non-preemptive code does not release its Critical Section until it normally exits the CS, blocks on a resource, or voluntarily yields control of the CPU. It is free from race conditions. Preemptive code is much more suitable for real-time programming since other processes are not held up. It is more responsive by relinquishing the CPU to other waiting processes. It must be carefully designed to ensure that it is free from possible race conditions.
Synchronization Hardware – Atomic locking instructions allow querying and setting the contents of memory as one uninterruptible unit. Locks provide protection of critical sections. Test-and-Set and similar atomic instructions can be used by CPU tasks sharing common storage areas in either a multiprogramming or multiprocessing environment. A task can safely modify the contents of a storage location even though another process may simultaneously be attempting to update the same location.
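A minimal sketch of a spin lock built on the atomic test-and-set operation that C11 exposes as atomic_flag:

    #include <stdatomic.h>

    static atomic_flag lock = ATOMIC_FLAG_INIT;

    void acquire(void)
    {
        /* atomically set the flag and return its previous value;
           loop (busy wait) until the previous value was clear     */
        while (atomic_flag_test_and_set(&lock))
            ;
    }

    void release(void)
    {
        atomic_flag_clear(&lock);       /* open the lock for the next contender */
    }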
Semaphores and Serialization Mechanisms – Serialization is the employing of mechanisms to ensure that only a single process may access a shared resource at a time. A semaphore is a synchronization variable used to overcome the difficulty of possible simultaneous or overlapped access to a shared resource. When one process modifies the semaphore value, no other process can simultaneously modify the same semaphore value. The semaphore access/modification is uninterruptibly locked under the test-and-set hardware. A mutex (mutual exclusion) is a binary semaphore that provides just two possible states: locked or unlocked.
If a process attempts to acquire a lock and finds it already in use by another, it can either spin on the lock (busy wait) or give up the CPU and wait to be notified when the lock is released (see Semaphore Signalling versus Spin Locks below).
Two mechanisms that employ semaphores to ensure synchronization between threads are Enqueueing (Reserve/Release) and Wait/Post.
Enqueuing mechanisms are program constructs used for reserving and releasing a system resource. The resource can be virtually anything that can be shared between two or more tasks, such as static memory, a device, a data communications channel, etc. Enqueueing is the process of attempting to acquire a lock on a resource, or waiting until the lock is available, then locking the resource, followed by performing some critical section action on the resource, and finally releasing the resource. Only then are other waiting threads notified that the resource is available for reservation. Safe Sequence: Request a Lock; Wait until it is granted; Perform the critical-section work; Release the Lock.
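A sketch of that safe sequence using a POSIX counting semaphore, initialized to 1 so it acts as a mutex guarding one shared resource; the names are illustrative:

    #include <semaphore.h>

    static sem_t resource_sem;              /* one lock protecting the shared resource */

    void init_lock(void) { sem_init(&resource_sem, 0, 1); }

    void use_resource(void)
    {
        sem_wait(&resource_sem);            /* Request the lock; block until available */
        /* ... critical section: act on the shared resource ... */
        sem_post(&resource_sem);            /* Release; a waiting thread is notified   */
    }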
Wait/Post mechanisms are program constructs used for synchronizing two or more threads so that one thread does not pass a particular point in its processing until the other(s) have reached the same agreed upon point in their processing. One example is waiting for all threads to be fully initialized before having them attempt to send messages to each other. Another would be the shutdown processing of an operating system: all tasks must be quiesced and brought down gracefully before the operating system can perform its final steps of withdrawing and issuing a halt to the hardware.
Semaphore Signalling versus Spin Locks – A spinlock is a “busy wait” in which the process executes through a loop while waiting for the lock to become available. The “passive wait” of both Wait/Post and Reserve/Release does not consume CPU cycles; the waiter voluntarily gives up its timeslice to another thread and blocks until the lock becomes available. A notifying signal called a “post” will wake it out of its wait when the resource becomes available.
Locking Protocol
Shared access – Multiple Read and possible Single Write
Exclusive access – Only Single Read-Write / No other Reads
Executable Coding for Shared Critical Sections Serially Reusable – entry to the CS is reserved, then code and data may then be altered during execution of a critical section, but all changed status and data in the CS must be reset to the original state prior to CS release. Re-entrant – entry to the CS is NOT required to be reserved; code is executed but never altered by simultaneous processes; and all state and data references are outboard of the Critical Section of code.
Observance of Reservation Note that all threads using the resource must observe, respect and employ the use of semaphores, otherwise there can be no data integrity.
Deadlock This is a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes, and therefore will never occur.
Priority Inversion – This is a situation where a higher-priority process needs to read or modify kernel data that is currently being accessed by a lower-priority process. The kernel data is typically protected with a lock owned by the lower-priority process. The situation compounds itself should the lower-priority process get preempted in favor of another process with a higher priority than itself. Processes: L < M < H. H requires resource R, which is currently held by L. H would wait for L to release resource R. While H waits, process M preempts process L. Now a lower-priority process M has affected the length of time that H must wait for L to relinquish resource R.
This situation can be avoided by implementing a priority-inheritance protocol. All processes that are accessing resources needed by a higher-priority process inherit the higher priority until such time as they have released the resources in question.
Basic Concepts – On a uniprocessor only one process can run at any time; all other processes must wait in the ready queue until the CPU is free and can be rescheduled. The objective of multiprogramming is to maximize CPU utilization by always having some process performing productive work, so that the CPU is kept running at all times. The CPU operates at either 100 percent of its engineered capacity or at zero percent; there is no partial state. The CPU is either busy with work or idle. A process will execute until it must wait (voluntarily or forced), or until it terminates. When a process must wait, the OS gives the CPU to another process that is ready-for-work. If no ready-for-work processes are available then the CPU goes IDLE. The amount of CPU time given to a task is a timeslice or time quantum.
CPU-I/O Burst Cycle CPU Burst – Any CPU cycles or time consumed prior to waiting for an event. I/O Burst – Amount of time spent waiting prior to needing the CPU for further instruction execution.
CPU Scheduler – also known as the Short-Term Scheduler.
Preemptive Scheduling – CPU scheduling decisions are cooperative or preemptive. Preemptive means that the CPU is taken away from the process by the OS. Cooperative means that the process gives up the CPU voluntarily, either because it cannot use the CPU immediately or because it decides not to. The transitions may take place under the following circumstances:
Transition from running to waiting state. (cooperative)
Transition from running to ready state due to an interrupt such as timeslice end. (preemptive)
Transition from running to ready state via voluntary timeslice end. (cooperative)
Transition from waiting to ready state. (preemptive)
Normal Process Termination. (cooperative)
Abnormal Process Termination. (preemptive)
Cooperative scheduling does not require hardware in the decision process. It is not a model for supporting realtime systems. Preemptive scheduling requires the Interval Timer hardware to cause regular and timely interruptions. Interrupt Masking is the enable/disablement of notification that an interrupt is pending in the hardware. Masking allows the OS to work uninterrupted on a unit of work. The OS will re-enable interruption when it decides that its work is done, or when it dispatches a user task. Interrupt Handlers run disabled. This means that these entry points into the OS are entered with interruptions disabled. The processing of the first-level Interrupt Handlers must be extremely fast.
Dispatcher The dispatcher is the service of the OS that gives control of the CPU to the next process selected by the scheduler. This function involves: Context switching, Switching from Supervisor to Problem state, Branching to the proper location in the application program to commence the program at its next instruction.
Scheduling Criteria – Criteria by which to compare scheduling algorithms:
CPU utilization – How busy can the CPU be driven.
Throughput – How much aggregate work can be accomplished per unit of wallclock time.
Wall Clock Turnaround time – How fast can the work of a process be finished, from submission to completion.
I/O Wait time – What portion of time is spent waiting for I/O to complete.
User Service time – What portion of time is spent waiting for the OS to perform services on a task's behalf.
CPU Wait time – What portion of time is spent by processes waiting for the CPU in the ready queue.
Response time – How quickly are trivial interactive requests answered.
Scheduling Algorithms
First-Come, First-Served scheduling – Simple FIFO queue.
Shortest-Job-First scheduling – Jobs that need the CPU the least go first.
Priority scheduling – Jobs follow FCFS queueing unless marked as a higher priority; those are moved to the appropriate placement in the queue. Low-priority jobs can suffer “starvation” because they are continually being “cut in front of”; their priority is raised over time (aging).
Round-Robin scheduling – Following time-quantum preemption the task goes to the back of the dispatch queue. Performance relies on the size of the timeslice. Too small a timeslice will incur frequent context switching and overhead. Too large a timeslice will affect interactive users with inconsistent response times, and RR degenerates to FCFS.
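A small simulation sketch of round-robin dispatching over remaining CPU-burst times; the burst values and the 4-unit quantum are arbitrary:

    #include <stdio.h>

    int main(void)
    {
        int burst[] = {10, 4, 7};               /* remaining CPU time per process   */
        int n = 3, quantum = 4, clock = 0, left = 3;

        while (left > 0) {
            for (int i = 0; i < n; i++) {       /* walk the ready queue in order    */
                if (burst[i] == 0) continue;
                int slice = burst[i] < quantum ? burst[i] : quantum;
                clock += slice;                 /* process i runs for one timeslice */
                burst[i] -= slice;
                if (burst[i] == 0) {
                    printf("process %d finishes at time %d\n", i, clock);
                    left--;
                }
            }
        }
        return 0;
    }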
Multilevel Queue scheduling – Separate the tasks into different queues by their run characteristics. Queues with smaller CPU requirements are given higher priorities, such as (high to low priority): System Services, Interactive, Batch.
Interactive tasks get the CPU frequently but in very small time quantums. Batch work gets the CPU infrequently, but in larger time quantums, when there are no interactive tasks queued ready. No batch work is permitted to run unless the queues for interactive and system-services tasks are empty. Each queue has associated with it a different time quantum. Each queue is guaranteed a fixed percentage of access to the CPU over time. The tasks of a queue are prioritized within the queue.
Multilevel Feedback Queue scheduling – Like the multilevel queue but with reassignment among the queues in the event that a task either exceeds its time quantum or never comes close to using its time allocation. In the event that the task exhibits the characteristic of consuming its entire timeslice, the task is moved to a more compute-intensive queue. In the event that the task voluntarily yields a significant remainder of its timeslice, the task is moved to a less compute-intensive queue.
Multiple-Processor Scheduling Asymmetric v. Symmetric multiprocessing scheduling In asymmetric processing a single CPU handles all the scheduling decisions and I/O processing. The remaining CPUs execute user application code. In symmetric processing all CPUs schedule from their own queues. Programming at the OS level must ensure that no two CPUs choose the same task, and that no processes are lost from the queues.
Processor Affinity When a process migrates from one CPU to another then the contents of the CPU pipeline and the memory cache for the last CPU must be invalidated. Then, the pipeline and the cache for the next CPU must be repopulated. This is costly and wasteful. Keeping the task on a single CPU is known as processor affinity. This avoids the high cost of migration. A downside may be waiting for a particular desired CPU when another is available.
Load Balancing – Performed automatically when CPUs share a common queue of ready-to-run tasks. Push Migration – An OS service periodically checks the load on each processor. If it finds an imbalance, it evenly distributes the load by moving processes from the overloaded CPU to a less-busy CPU. Pull Migration – An OS service running on an idle CPU will examine the queues of other processors and extract a task for itself. Load balancing will counteract the effects of processor affinity.
Multicore Processors – Hardware multithreading is employed to increase processor utilization by switching to another thread when the current thread is found to be memory-stalled due to a cache miss, waiting, yielding, or incurring any other long-latency event. In this situation the CPU switches to a second hardware thread. This is sometimes referred to as “hyperthreading” or a “dual-threaded” processor core. From the OS perspective, dual-threaded processors appear as two CPUs. This operation is somewhat like timeslicing performed at a hardware level; no OS decision processing is involved.
Virtualization and Scheduling – OS guest machines cannot account accurately for the passage of real time. They have access to a “virtual” clock, not the real clock. The virtual clock does not tick smoothly and continuously, but is updated by the hypervisor with real clock information at dispatch time. Any guest operating system scheduling algorithm that assumes a certain amount of progress in a given amount of time can be negatively impacted. There will be “uncaptured” or “unaccounted for” time as measured by the guest. This will appear to the OS guest as time in which the CPU was in a suspended or stopped state.
Basic Concepts – A deadlock is that state entered by a process when its continued forward progress depends on waiting to acquire access to one or more resources that never become available. A requirement for deadlock is that a process is respecting the lock placed on a shared resource by another process. The sharing processes are attempting to synchronize the use of a resource.
Possible Problems due to Non-Synchronization
Memory Consistency Problems: One thread is writing to a file while the other(s) are reading the same file. The data will be unreliable at best.
Thread Execution Interference:
Necessary Conditions required for Deadlock to occur
Mutual Exclusion – Only one process at a time can access a resource.
Hold and Wait – Holding at least one resource and waiting to acquire another held by some other process.
No Preemption allowed – Resources can be released only voluntarily.
Circular Wait – Process-1 is waiting for a resource held by Process-2, which is waiting for Process-3, ..., which is waiting for Process-N, which is waiting for Process-1.
Possible Problems due to Synchronization – A race condition is an undesirable situation that occurs when multiple program threads attempt to perform operations simultaneously (or concurrently) on the same resource, but, because of the nature of the system, the operations must be performed in a proper sequence to be done correctly. Otherwise the outcome of the execution depends on the particular order in which access to the shared data takes place.
Starvation occurs when one or more threads in your program are blocked from gaining access to a resource and, as a result, cannot make further progress.
Deadlock, the ultimate form of starvation, occurs when two or more threads are waiting on a condition that cannot be satisfied. Deadlock most often occurs when two (or more) threads are each waiting for the other(s) to provide some service or release some resource that is needed to progress further. All resources locked must be later unlocked by the owning thread.
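A sketch of the classic two-lock deadlock: each thread holds one mutex and waits for the other, satisfying hold-and-wait and circular wait. With unlucky timing the joins never return; acquiring both locks in one agreed order would prevent it.

    #include <pthread.h>

    static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

    static void *t1(void *arg)
    {
        pthread_mutex_lock(&A);          /* holds A ...                      */
        pthread_mutex_lock(&B);          /* ... and waits for B (held by t2) */
        pthread_mutex_unlock(&B);
        pthread_mutex_unlock(&A);
        return arg;
    }

    static void *t2(void *arg)
    {
        pthread_mutex_lock(&B);          /* holds B ...                      */
        pthread_mutex_lock(&A);          /* ... and waits for A (held by t1) */
        pthread_mutex_unlock(&A);
        pthread_mutex_unlock(&B);
        return arg;
    }

    int main(void)
    {
        pthread_t x, y;
        pthread_create(&x, 0, t1, 0);
        pthread_create(&y, 0, t2, 0);
        pthread_join(x, 0);
        pthread_join(y, 0);
        return 0;
    }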
Livelock is a form of deadlock. It differs in that the competing threads repeatedly change their conditions in response to each other, without regard to the fact that the same change is being performed in the other threads. No progress is made because they repeatedly meet in deadlock on a new condition that they have synchronized on.
Indefinite Overtaking is a type of race condition: one or more fast tasks access a common resource along with a relatively slow task (or tasks). The slow tasks rarely get access to the resource because they are continually being “overtaken” in processing speed by the faster tasks. By the time a slow task recycles itself and attempts another access, it is yet again beaten to the punch by a faster task.
Deadlock degrades overall system performance to the point that all or parts of the system will stop functioning, and will require manual restart.
Deadlock Prevention – Deadlock can be prevented by ensuring that at least one of the following required conditions does not hold. Mutual Exclusion – Do not employ mutual exclusion semaphores on anything other than common shared resources.
Hold and Wait When a process requests to reserve a common resource, ensure that that process holds no other common resource. Serialize the reservation and release of all shared resources.
No Preemption Allow the OS to force release of resources from holder.
Circular Wait – Impose a total ordering on all resource types, and require that each process requests resources in an increasing order of priority. Require implementing a priority-inheritance protocol: all processes that are accessing resources needed by a higher-priority process inherit the higher priority until they have released the resource.
Deadlock Avoidance – Possible side effects of deadlock prevention by various algorithms are low device utilization and reduced work throughput. A process is said to be deadlock safe if, and only if, there exists a safe sequence in which the system can allocate all required shared resources to each process, in some order, and still avoid a deadlock. Banker's Algorithm: (1) When a new process enters the system, it must declare the maximum number of instances of each shared resource type that it may need. This may not exceed the total number of shared resources in the system. (2) When a process requests a set of shared resources, the system must first determine whether the allocation of these resources will leave the system in a safe state. If it will, the resources are allocated. Otherwise the process must wait without requesting any resources until some other process releases sufficient resources needed by (1).
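A sketch of the safe-state test at the heart of the Banker's Algorithm, reduced to a single resource type; the process counts and values are illustrative:

    #include <stdio.h>
    #include <stdbool.h>

    #define N 4                                   /* number of processes                   */

    /* Returns true if some order exists in which every process can finish. */
    bool is_safe(int available, const int alloc[N], const int max_need[N])
    {
        int  work = available;
        bool finished[N] = {false};

        for (int pass = 0; pass < N; pass++) {
            for (int i = 0; i < N; i++) {
                /* process i can run to completion if its remaining need fits in work */
                if (!finished[i] && max_need[i] - alloc[i] <= work) {
                    work += alloc[i];             /* it finishes and returns its resources */
                    finished[i] = true;
                }
            }
        }
        for (int i = 0; i < N; i++)
            if (!finished[i]) return false;       /* some process can never be satisfied   */
        return true;
    }

    int main(void)
    {
        int alloc[N]    = {1, 2, 2, 1};           /* currently held instances              */
        int max_need[N] = {4, 3, 5, 2};           /* declared maximum claims               */
        printf("state is %s\n", is_safe(3, alloc, max_need) ? "safe" : "unsafe");
        return 0;
    }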
Recovery from Deadlock (possibilities)
Process Termination – Abort all deadlocked processes.
Resource Preemption – Selecting a Victim to cancel.
Memory Management – Main memory and the registers built into the processor itself are the only storage that the CPU can access directly. Memory consists of a large array of addressable bytes; that is, each byte has its own address. The CPU fetches bytes from memory according to the address kept in the PSW or program counter. The CPU then decodes those bytes into (hopefully) instructions. The instructions may cause additional loading from and storing to specific memory addresses. What does the CPU do while it is waiting for data to come over the memory bus?
Supporting Hardware
cache
base register (also known as a relocation register)
Logical (Virtual) vs. Physical (Real) Addressing Logical or Virtual address is the address computed with the aid of the OS, based upon the “view” of the address space in which the program executes. Physical or Absolute address is a real location on the memory chip. It is between 0 and the highest addressable byte of real memory.
Address Binding
Symbolic address – The addresses from the perspective of the programmer.
Relocatable address – Addresses in a format that can be altered, or relocated, at execution time.
Real address – The physical address of real memory.
Address Mapping – Run-time mapping between virtual and physical addresses is performed by a hardware mechanism called dynamic address translation (DAT) or the memory-management unit (MMU). Every address in each memory access is adjusted by the relocation value; that is, the value of the relocation register is added to each accessed address. The application program never sees the real physical addresses. So we have logical addresses in the range 0 to “any logical address”, and physical addresses in the range of real address 0 to the “end physical address”.
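A sketch of what the hardware does conceptually in a relocation-register scheme: validate the logical address against the limit, then add the base. The values are illustrative:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        unsigned long base    = 0x40000;          /* relocation register: partition start */
        unsigned long limit   = 0x10000;          /* partition length                     */
        unsigned long logical = 0x02F4;           /* address as the program sees it       */

        if (logical >= limit) {                   /* out of range: addressing exception   */
            fprintf(stderr, "protection exception\n");
            exit(1);
        }
        unsigned long physical = base + logical;  /* every access is offset by the base   */
        printf("logical %#lx -> physical %#lx\n", logical, physical);
        return 0;
    }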
Swapping – Swapping is a process that temporarily removes the entire address space of a process from memory to a backing store (auxiliary storage). This frees up memory for occupancy by another process. Later, the rolled-out process is brought back into memory (roll-in) and its execution is continued at the point of interruption. The context-switching time is significantly high. Swapping is rarely used, since it provides too little execution time in relation to context-switching time for this method to be a reasonable memory management solution.
Memory Management Allocation Schemes
Single Contiguous Allocation – Before the advent of operating systems we had a simple address space that could accommodate a single application. The program must be able to fit into the address space. Each instruction and data address mapped directly to a real address. The unused portion of storage went wasted. If a program did not fit, the developer had the option to use dynamic loading. With dynamic loading a routine is not loaded until it is called. All routines are kept on some storage media in relocatable form. The main program is loaded into memory and executed. When a routine that is not in memory is needed, the relocatable linking loader is called to load the desired routine into memory, and control is then transferred to that routine. Each program was initialized or “booted” into storage. The logical programmer view was the same as the absolute physical view of storage. Along with CPU and Memory, the process required Unit Record devices to run: Card Reader, Card Punch and Printer. As the program executed it read and punched cards, printed lines and accessed other peripherals in real time. If a unit record device was unavailable or down, the program would not run.
Partitioned Allocation
Static Partitioned – Address space that could accommodate a fixed number of partitions and therefore a fixed number of applications. An operating system was needed; it was loaded beginning at memory location zero. Each application partition, aside from the first, was offset by a fixed displacement from memory location zero.
Since programs are written with the assumption of running relative to address zero, each memory access was manipulated by an offset representing the beginning partition address so that it could address the proper physical address relative to the partition start boundary.
The memory space of partitions was protected by means of a base register and a limit register. Only the OS can load these two registers. Every address generated by a user-mode program is validated. A program can then legally access any real address between the base and (base+limit) range. Remaining unused portions of each partition, called fragments, could not be used for other program executions.
Dynamic Partitioned. Had all the features of Static Partitioned with the exception that the address space could accommodate a variable number of partitions and applications. The variable number was set by the operator at machine startup time.
Both static and dynamic partitioned required that each program must be able to fit in the partition allocated to it.
Again, programs running in partitions other than the first must have their memory addresses and accesses offset by the displacement of the start of the partition. Still, the unused portion of storage in each partition was unusable for any other program. Fragmentation exacerbated with time. If the entire program did not fit in any of the partitions, then it could not be run, or it had to be segmented by the programmer. There was no way to collect the fragments into a single contiguous partition. All tasks shared a single address space. One process malfunctioning could affect all other processes running. You could now run multiple programs at the same time, but the shop needed to have a set of Unit Record Devices allocated to each partition. If you had 4 partitions running you may have been required to have 4 sets of unit record equipment.
Spooling – The IBM Federal Systems Division contractors at NASA invented a program that virtualized card readers, punches and printers (HASP). The SPOOLer (which stood for Simultaneous Peripheral Operations OnLine) would capture the formatted output (unit record work) destined for the printer/punch and hold it on a disk device. The same was done for card reader data destined as input for partitions. Spooling allowed multiple partitions to run at the same time while the shop may have had only a single printer, card reader and card punch. The SPOOLer, which ran in its own partition, would queue, retrieve and drive the output to the appropriate devices in a serial manner at the devices' design rate. Without spooling, most programs would be relegated to patterns of fast CPU processing followed by long waits for the printer/card devices. Use of spooling removed much of the WAIT from the execution system, causing the CPU to be tasked with more frequent work and driven harder. The result was faster throughput and the ability to run more programming in a smaller schedule window. CPU utilization necessarily increased.
Relocatable Partitioned Scheme – This was an attempt to have the benefits of dynamic partitioned allocation with reduced fragmentation. The relocatable scheme had the ability to reclaim the unused fragments of each partition by suspending operation and “percolating” or “compacting” the used portions of each partition up to the top of memory, thereby reducing the partition sizes dynamically and creating a new unused partition at the bottom of memory representing the collection of the fragments. That new partition can then be used to run another application.
Problems were the cost and complexity of relocation. The logical addresses had to be relocated dynamically at execution time. Compaction time was substantial. Computer operations were required to be suspended, then each partition's instructions had to be manipulated to reflect a new displacement, or offset, from the beginning of real storage. There was no guarantee that the new partition would be large enough to run a particular application. Failures required restarting the application and possibly the entire machine.
Paged Memory Management Paging is one of the memory-management schemes by which a computer can store and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. The main advantage of paging over memory segmentation is that it allows the physical address space of a process to be non-contiguous. Before paging came into use, systems had to fit whole programs into storage contiguously, which caused various storage and fragmentation problems. Paging is an important part of virtual memory implementation in most contemporary general-purpose operating systems, allowing them to use disk storage for data that does not fit into physical random-access memory (RAM).
Break the real storage up into small pieces called pages or page frames. Each page, usually of size 4K, would contain a piece of the application. The memory for an application no longer must be physically contiguous; pieces can exist wherever in storage a page frame is available. Logical addresses do not match physical addresses. Each page has its own offset from the beginning of real storage. The pages are linked together to form a logically contiguous address space by way of a Page Map Table. The OS maintained the PMT. From the program's perspective the logical addresses matched the symbolic addresses. All pages still must fit into real physical memory.
Since the pages could be located anywhere in real storage, each instruction that addressed real storage must be manipulated to reference the actual page that contained the true offset into real storage needed by an application. Therefore each instruction had to be modified or translated in CPU processing by using the contents of the Page Map Table to direct operations to the correct real addressable page. Dynamic address translation, or DAT, is the process of translating a virtual address during a storage reference into the corresponding real address.
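A sketch of that translation for 4K pages: split the virtual address into a page number and an offset, look the page number up in the page map table, and recombine. The table contents are illustrative:

    #include <stdio.h>

    #define PAGE_SIZE   4096u
    #define PAGE_SHIFT  12
    #define NUM_PAGES   8

    /* page map table: virtual page number -> real page frame number */
    static unsigned frame_of[NUM_PAGES] = {5, 2, 7, 0, 3, 6, 1, 4};

    unsigned long translate(unsigned long virt)
    {
        unsigned long page   = virt >> PAGE_SHIFT;        /* which virtual page       */
        unsigned long offset = virt & (PAGE_SIZE - 1);    /* displacement within page */
        return ((unsigned long)frame_of[page] << PAGE_SHIFT) | offset;
    }

    int main(void)
    {
        unsigned long v = 0x2ABC;                          /* page 2, offset 0xABC    */
        printf("virtual %#lx -> real %#lx\n", v, translate(v));
        return 0;
    }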
Fragmentation was remedied. Allowed a dynamic number of concurrent partitions. All tasks still shared a single address space and therefore could still adversely affect each other. Programmers still could not run a program that exceeded the size of real memory.
Virtual Memory Management
Virtual Memory allows the execution of processes whose programming cannot fit completely in real memory. A major advantage is that programs can now be larger than the available physical real memory. Because programs can use less physical memory, more programs can run together and thereby increase CPU utilization and work throughput. The virtual address space of a process refers to the logical view or perspective of how a process (or the programmer) “sees” memory.
Demand-Paged Memory Management The motivation here was to address the problem of running an application(s) that exceeded the size of the real memory available. Prior to this time programmers were responsible to manage memory allocation by dividing a program that could not fit in the available memory into segments or overlays. The segments were loaded and unloaded under user program control during execution. The programmer had to ensure that the segments loaded never attempted to exceed the real storage.
Virtual Storage is a technique that uses main real memory as a “cache” for secondary or auxiliary storage. The auxiliary storage is usually an array of disk drives that contains all the pages for all the applications running in the computer. The main real storage is used to cache only the “active” parts of the application. Each cache block is known as a page. Each cache miss, an event that occurs when the accessed page is not present in real memory, is known as a page fault. The virtual address is an address that corresponds to a location in the virtual address space and is translated or re-mapped to a physical address when memory is accessed. The virtual memory address is translated by a combination of hardware and software mechanisms to map to a real physical address. This mapping process is called address translation.
Demand paged memory management had all the advantages of paged memory management, with relief from the problem of “too big” applications. Its disadvantages were increased hardware cost, processor overhead and OS complexity. But all tasks continued to share a single address space and could adversely affect each other.
Basic Concepts – Locality of Reference is the “closeness” in either space or time with which a process' pages are accessed. The Page Map Table is the array in real memory that the OS uses to translate the logical page referenced by an instruction to a physical page frame in memory. The page fault rate is the number of page-ins and page-outs occurring over a unit of time. The Dynamic Page Pool consists of those page frames of real memory that are to be used for non-OS pages. The working-set of a process is the set of pages that exhibit sufficient locality for the OS to keep resident in real memory. Overcommitting Storage is the term used for allocating more virtual storage than the available physical memory resources. The victim page frame is the page selected by the paging algorithm to be removed from real storage to auxiliary storage to make room for an incoming page. Page frames can be allocated on the basis of priority of processes maintained by the OS.
Page Replacement Algorithms First In First Out (FIFO) is a scheme in which the page of storage selected for replacement is the oldest one. Timestamped or replacement of head of queue.
Least-Frequently-Used (LFU) requires that the page with the smallest number of “touches” be replaced. Count of references.
Most-Frequently-Used (MFU) is based on the argument that the page with the smallest count was probably just brought in and has yet to be referenced sufficiently.
Least Recently Used Removal (LRU) is a scheme in which the page of storage selected for replacement is the one that has been unused or un-referenced for the longest time. Supporting hardware: the modify (or change) bit and the reference bit. To aid the operating system in estimating and identifying the LRU pages, a reference bit is set whenever a page is accessed. The operating system periodically clears the reference bits and later records them so it can determine which pages were touched during a particular time period. The OS can then select a page that is among the least recently referenced pages by detecting that its reference bit is turned off.
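A small simulation sketch of the kind used to compare these algorithms: it counts page faults for a reference string under the FIFO scheme described above (three frames; the reference string is arbitrary). The other algorithms differ only in the victim-selection rule.

    #include <stdio.h>

    int main(void)
    {
        int refs[]   = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3};  /* virtual pages touched, in order */
        int nrefs    = 10, nframes = 3;
        int frame[3] = {-1, -1, -1};
        int next = 0, faults = 0;                       /* next = oldest frame (FIFO)      */

        for (int i = 0; i < nrefs; i++) {
            int present = 0;
            for (int f = 0; f < nframes; f++)
                if (frame[f] == refs[i]) present = 1;   /* hit: page already resident      */
            if (!present) {
                frame[next] = refs[i];                  /* fault: evict the oldest page    */
                next = (next + 1) % nframes;
                faults++;
            }
        }
        printf("%d page faults\n", faults);
        return 0;
    }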
Thrashing is the term given to the condition of incurring high paging rate activity relative to actual productive work. A process that is thrashing is spending more wall-clock time paging than executing.
Some Performance Consideration Prepaging or Anticipatory Paging This technique, sometimes called "swap prefetch", preloads a process's non-resident pages that are likely to be referenced in the near future (taking advantage of locality of reference). Such strategies attempt to reduce the number of page faults a process experiences. Some of those strategies are "if a program references one virtual address which causes a page fault, perhaps the next few pages' worth of virtual address space will soon be used".
Page Size considerations – Each page fault generates a large amount of overhead needed for processing the interrupt, saving registers, context switching, replacing a page via a page-out followed by a page-in, I/O queuing for the paging device and updating the page map tables. Smaller pages result in less I/O activity and less total allocated memory, due to less fragmentation and less loading of code and data that may not actually be referenced. Larger pages minimize the number of page faults, but have larger portions that remain unused.
I/O Page Locking Pages that contain read/write buffers should remain fixed in memory. Performance consideration of Page Locking. Fix in real memory the pages with the greatest anticipated locality.
Dynamic Address Translation (Page Mapping) Dynamic Address Translation (DAT) is the process of translating a virtual address during a storage reference into the corresponding real address. If the virtual address is already in central storage, the DAT process may be accelerated through the use of a translation lookaside buffer. If the virtual address is not in central storage, a page fault interrupt occurs, the OS is notified and it brings the page in from auxiliary storage. If, while executing an instruction, the CPU fetches an instruction located at a particular virtual address, or fetches data from a specific virtual address or stores data to a particular virtual address, the virtual address must be translated to the corresponding physical address. This is done by a hardware component which looks up the real address (from the segment and page tables) corresponding to a virtual address and passes the real address to the CPU for execution of the instruction. If the tables indicate that the virtual memory page is not currently in real memory, the hardware raises a page fault exception which invokes the paging subsystem component of the operating system.
DAT is implemented by both hardware and software through the use of page tables, segment tables, region tables and translation lookaside buffers. DAT allows different address spaces to share the same program or other data that is read-only. This is because virtual addresses in different address spaces can be made to translate to the same frame of central storage. Otherwise, there would have to be many copies of the program or data, one for each address space.
Segmented Memory Management Segmented Memory Management allowed for multiple address spaces to be constructed. An address space is a length of memory that is viewed and managed from location zero and extends to some N-address. Up until the invention of segmented memory management all processes shared a single address space. Each process would occupy a partition of memory carved into this address space. The logical view of the process memory space was mapped by a relocation register into a displacement into the address space. It was possible for one partition to adversely affect another partition.
SMM allowed all the benefits of demand paging using more than a single address space. You could run multiple sets of address spaces in which each set of applications was architecturally isolated from programs running in other address spaces. Segmentation supports the user's logical view of memory by allowing each process to run in its own protected address space. DAT led to this capability and forms the basis for memory management on all major computer architectures. When a program in some address space is dispatched, the hardware is loaded via a special control register with a Segment Table Origin (STO) pointer which addresses or points to the unique individual Page Map Table associated with each different address space, thereby providing the capability to context switch among multiple independent address spaces. Now, no process running in some address space can reach and affect the code or data of another process running in some other address space.
Shared Segments These are read-only blocks of pages that are physically shared by multiple address spaces, as opposed to filling memory with duplicate copies of code and data. Generally they house simultaneous reusable programming code (also known as re-entrant code). Re-entrant code never alters the data or instructions inside the shared segment; only registers and data areas outside of the segment are used to manipulate and undergo change. Shared Segments save enormously on storage sizes by attaching seamlessly into the address spaces of each running application using the shared code. Virtual addresses in different address spaces can be made to translate or attach to the same page frame of real storage.
Virtual Machines VM Hypervisor and Guests Virtual Machines are operating environments that provide emulation methods presenting a standard software interface, identical to the underlying hardware, to their "guest" operating systems. Also known as "hypervisors", these operating platforms create multiple instances of the same computer architecture as the one on which the hypervisor is natively running. Because the hypervisor is the native operating system and the owner of the real Page Zero fixed storage locations, the hardware treats the VM hypervisor as the owner of the processor.
The virtual machine hypervisor is the only task that runs in the privileged (or supervisor) instruction state. All guest operating systems run in the problem (or application) state. Therefore, when a guest attempts to execute a privileged instruction such as an I/O, the real hardware raises a Program Check Interruption and invokes the VM hypervisor's interrupt handler for processing. If the hypervisor determines that the guest was in the proper "virtual" privileged state, the hypervisor executes the instruction on behalf of the guest, then "reflects" the signal and completion statuses back to the guest's operating environment, and finally re-dispatches the guest. The guest has no indication that it did not perform the operation itself. Reflection allows the guest to infer, from its perspective, that the instruction was executed directly by the guest itself. A guest operating system generally cannot detect that it is running in a virtual machine, aside from the unaccounted wall clock time as measured by the guest.
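The following is a minimal sketch of this trap-and-emulate flow. The guest_t fields and the emulate_privileged_io() helper are illustrative assumptions, not a real hypervisor's interfaces.

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        int  id;
        bool virtual_supervisor_state;  /* what the guest believes its state is */
    } guest_t;

    /* Hypothetical emulation of a privileged I/O instruction on the guest's
     * behalf; returns the condition code to reflect back to the guest.      */
    static int emulate_privileged_io(guest_t *g) { (void)g; return 0; }

    static void reflect_program_check(guest_t *g) {
        printf("guest %d: reflected privileged-operation exception\n", g->id);
    }

    /* Called by the hypervisor's interrupt handler on a program check. */
    void on_privileged_op_trap(guest_t *g)
    {
        if (g->virtual_supervisor_state) {
            int cc = emulate_privileged_io(g);   /* run it for the guest   */
            printf("guest %d: instruction emulated, cc=%d\n", g->id, cc);
        } else {
            reflect_program_check(g);            /* guest really did fault */
        }
        /* ...re-dispatch the guest; it cannot tell the hypervisor intervened. */
    }

    int main(void)
    {
        guest_t ok = { 1, true }, bad = { 2, false };
        on_privileged_op_trap(&ok);
        on_privileged_op_trap(&bad);
        return 0;
    }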
|
File Concepts
What is a File? A file is an abstract data type defined and implemented by the operating system, consisting of a sequence of logical records. The logical record is made up of data fields, which in turn are made up of strings of bytes.
File Operations Open/Close – Reserving/Releasing access to an existing or a new file to be created. Create – Generate a new file with requested attributes. Write – Write or rewrite an existing file. Read – Read data from an existing file. Reposition – Set the read/write position indicator on a file. Delete – Erase a file directory entry, and release the file space. Truncating – Reducing the file size to zero. Appending – Write new or additional information at the end.
File Open/Close Processing File Pointer – tracks the last read or write location with a file position indicator. File open counter – tracks the number of opens and closes against the file; it reaches zero on the final close, and the OS can then remove the entry. Device Location – the address, within the computing system, of the device on which the file resides. Permissions and Access Rights – used by the OS to allow or deny I/O requests for the file data; these control the ability of various processes to view or change the contents of the filesystem.
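A sketch of what such an open-file table entry might hold; the field names are illustrative, not any particular OS's.

    #include <stdio.h>

    typedef struct open_file {
        long     position;     /* file pointer: last read/write location        */
        int      open_count;   /* opens minus closes; entry removed at zero     */
        int      device_id;    /* device location of the file within the system */
        unsigned permissions;  /* access-right bits used to allow/deny requests */
    } open_file_t;

    /* Close processing: decrement the counter; at zero the OS may free the entry. */
    int close_entry(open_file_t *f)
    {
        if (--f->open_count == 0) {
            printf("last close: entry can be removed\n");
            return 1;
        }
        return 0;
    }

    int main(void)
    {
        open_file_t f = { 0, 2, 3, 0644 };
        close_entry(&f);   /* still open elsewhere */
        close_entry(&f);   /* final close          */
        return 0;
    }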
File Types File Extension Names The OS can use the extension name to indicate the type of file and the type of operations that can be successfully performed on that type of file: TXT, JAVA, PDF, BIN, EXE, CLASS, JPEG, SH, ZIP, etc.
File Attributes Name, Timestamp, Type, Location, File Size, Record Size, Accessibility, Fixed or Variable length records.
File Structure A directory provides a table of contents of the files. File creation – Generate an index entry for a new file. File deletion – Remove the index entry for an existing file. File list – Show each file and its attributes. File System traversal – Creation of a hierarchy or network.
File Access Methods An access method is a function provided by the operating system that exposes an application programming interface allowing read/write manipulation of files. Sequential – Start with the first record and search in linear fashion, forward one record at a time.
Direct, Relative Record or Random – Calculate the precise location of the record inside the file, relative to the beginning of the file (see the sketch after this list).
Indexed Sequential Access Method (ISAM) – The file is divided into partitions. The partition that contains the record is calculated, then searched sequentially within that partition.
Virtual Storage Access Method (VSAM) – VSAM provides both the Relative Record and ISAM methods, with the additional feature of files spanning disk volumes.
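A minimal sketch of direct (relative-record) access for fixed-length records, assuming a hypothetical data set records.dat and an illustrative 80-byte record size: the byte position of record n is simply n times the record size from the start of the file.

    #include <stdio.h>
    #include <stdlib.h>

    #define RECORD_SIZE 80   /* illustrative fixed record length */

    int read_record(FILE *f, long n, char *buf)
    {
        /* Set the file position indicator directly to record n. */
        if (fseek(f, n * (long)RECORD_SIZE, SEEK_SET) != 0)
            return -1;
        return fread(buf, 1, RECORD_SIZE, f) == RECORD_SIZE ? 0 : -1;
    }

    int main(void)
    {
        char buf[RECORD_SIZE];
        FILE *f = fopen("records.dat", "rb");   /* hypothetical data set */
        if (!f) return EXIT_FAILURE;
        if (read_record(f, 41, buf) == 0)       /* fetch record 41 directly */
            printf("record 41 read\n");
        fclose(f);
        return 0;
    }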
File System Mounting The root node is the highest directory from which all other directories are descendants. The mount point is the directory where a file system on a given device is connected to the main OS file system to make its files available for use.
File Protection by OS File protection is provided by access lists, passwords, information hiding and other techniques to control access. Types of Access – Read. Control of Access – Owner/Creator.
|
Mass Storage Concepts
The Data Pyramid Mass Storage is generally a reference to the lower levels of the Data Pyramid, farthest from the CPU at the apex.
Overview Magnetic Disk arrays and Magnetic Tape libraries store large quantities of data. Network-Attached Storage – network-accessed pools of virtualized disks. Storage-Area Network – private, locally accessed pools of virtualized disks.
Disk Structure Disk – The surfaces of the platters, physically arranged in cylinders. Each platter has two working surfaces. Sector – A segment of data that makes up the tracks on a magnetic disk; it is the smallest amount of information that is read or written on a disk. Head – The magnetized read/write mechanism that flies over the disk surface. Cylinder – The collection of all concentric rings or tracks of data that are the same distance from the platters' edge and which can be accessed without the need to reposition the heads. Arm Assembly – The mechanism that moves all the heads simultaneously from one cylinder to another.
Disk Storage Magnetic disks are nonvolatile storage. A track is one of the concentric circles that make up the surface of a magnetic disk. Tracks are accessed by read/write heads that literally "fly" over the track as the platter surface spins. A cylinder consists of all the tracks on the disk platters that can be accessed by the set of heads without the need to reposition the heads. The head is moved back and forth across the cylinders by the disk actuator mechanism.
Disk Scheduling Physical Movement Components Seek is the time required to move the heads from one cylinder to another; this includes acceleration, deceleration, stopping and settling to a stable position. Rotational Latency is the amount of time required for the sector, or record to be accessed, to rotate under the read/write head. Transfer time is the time needed to move the data between the disk surface and the computer memory; it is proportional to the amount of data transferred. Service Time = seek + latency + transfer times. Connect Time is the time that the device is transferring data to/from memory. Disconnect Time is the time that the device is not transferring data between the device and memory. Pending Time is the time that the device request to perform I/O remains queued. Channel Utilization = Connect Time / (Pending Time + Connect Time + Disconnect Time).
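For example (illustrative numbers): with a 4 ms average seek, a 7,200 RPM drive (about 4.2 ms average rotational latency), and a 64 KB transfer at 100 MB/s (about 0.7 ms), the service time is approximately 4 + 4.2 + 0.7 ≈ 8.9 ms.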
Scheduling of disk I/O (a small SSTF sketch follows this list):
First-Come-First-Serve (FCFS) – Choose the track where the I/O was requested first.
Shortest-Seek-Time-First (SSTF) – Choose the track closest to the current head position.
Elevator Algorithm (SCAN) – Choose the track that is next in the direction of the moving head. When an end is reached, reverse direction and proceed to the opposite end.
Circular SCAN (C-SCAN) – Like the SCAN method, but when an end is reached, reset to the beginning without servicing requests along the way back.
Modified SCAN (LOOK) – Like the SCAN method, but move forward only to the furthest currently queued I/O request, then reverse direction.
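A minimal sketch of the SSTF selection rule, using an illustrative request queue and head position:

    #include <stdio.h>
    #include <stdlib.h>

    /* Pick the queued request whose cylinder is closest to the current head. */
    int pick_sstf(const int *queue, int n, int head)
    {
        int best = -1, best_dist = 0;
        for (int i = 0; i < n; i++) {
            int dist = abs(queue[i] - head);
            if (best < 0 || dist < best_dist) { best = i; best_dist = dist; }
        }
        return best;   /* index of the request with the shortest seek */
    }

    int main(void)
    {
        int queue[] = { 98, 183, 37, 122, 14, 124, 65, 67 };   /* illustrative */
        int head = 53;
        int n = sizeof queue / sizeof queue[0];

        /* Service the whole queue in SSTF order. */
        while (n > 0) {
            int i = pick_sstf(queue, n, head);
            printf("seek %d -> %d\n", head, queue[i]);
            head = queue[i];
            queue[i] = queue[--n];   /* remove the serviced request */
        }
        return 0;
    }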
RAID RAID stands for "redundant arrays of inexpensive disks". It was motivated initially by performance; the key reason it is used today is dependability. Dependability, Reliability, Availability and Serviceability Computer system dependability is the quality of delivered service such that reliance can justifiably be placed on this service. Dependability is judged against a reference specification of expected behavior, with two states: 1) Service accomplishment, where the service is delivered as specified. 2) Service interruption, where the delivered service is different from the specified expected service. Transitions from state 1 to state 2 are called service failures. Transitions from state 2 to state 1 are called service restorations.
Reliability is a measure of continuous service accomplishment, or the mean time to failure (MTTF). Service interruption is measured by the mean time to repair (MTTR). Mean time between failures (MTBF) is the sum of the preceding two measures: MTBF = MTTF + MTTR.
Availability is a measure of service accomplishment with respect to the alternation between the two states of accomplishment and interruption. Availability = MTTF / (MTTF + MTTR).
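For example (illustrative numbers): a system with MTTF = 9,990 hours and MTTR = 10 hours has Availability = 9,990 / (9,990 + 10) = 0.999, i.e. it is expected to be in the service-accomplishment state 99.9% of the time.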
Serviceability is a measure of the ease (or difficulty) of applying upgrades or corrective service to the computing system. Can service be performed on a running system, or must the system experience a loss of availability?
RAID Levels
RAID-0
"striping". The RAID controller make multiple disk drives appear as a single large logical drive with great capacity and many R/W heads. Note there is no redundancy of data here. In the event of a single drive failure you must recover all volumes from backup media, such as tape; then journalling files.
RAID-1
"mirroring". Data is striped over many drives and full set is duplicated in another location. Fast switchover in the event of a failure, but slow recovery for the failed set of drives since all volumes may need restoration.
RAID-2 (no longer used.)
RAID-3
"bit-interleaved" Long time to recover by less expense for mirrored storage (less volumes needed).
RAID-3 improves on RAID-1 by reducing the redundancy overhead to a single parity disk, with each byte spread across multiple striped disks.
RAID-4/5/6 improve on RAID-3 by striping at the block level; RAID-5 and RAID-6 further distribute the parity partitions across the data volumes rather than using a single dedicated parity volume. You thereby have "striping" performance with parity-based recovery, without a dedicated parity volume.
(Higher-level RAID schemes progressively employ more sophisticated algorithms applied over the data, approaching error detection and correction, and data redundancy, with fewer extra device volumes.)
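A minimal sketch of the XOR parity idea behind the single-parity RAID levels, using an illustrative block size and drive count: parity is the XOR of the corresponding bytes on the data drives, so any single lost block can be rebuilt by XOR-ing the survivors with the parity block.

    #include <stdio.h>
    #include <string.h>

    #define DRIVES 4
    #define BLOCK  8

    /* Parity block = XOR of the corresponding bytes on every data drive. */
    void compute_parity(unsigned char data[DRIVES][BLOCK], unsigned char parity[BLOCK])
    {
        memset(parity, 0, BLOCK);
        for (int d = 0; d < DRIVES; d++)
            for (int i = 0; i < BLOCK; i++)
                parity[i] ^= data[d][i];
    }

    /* Rebuild the block of the failed drive from the survivors plus parity. */
    void rebuild(unsigned char data[DRIVES][BLOCK], const unsigned char parity[BLOCK],
                 int failed)
    {
        memcpy(data[failed], parity, BLOCK);
        for (int d = 0; d < DRIVES; d++)
            if (d != failed)
                for (int i = 0; i < BLOCK; i++)
                    data[failed][i] ^= data[d][i];
    }

    int main(void)
    {
        unsigned char data[DRIVES][BLOCK] = { "drive-0", "drive-1", "drive-2", "drive-3" };
        unsigned char parity[BLOCK];

        compute_parity(data, parity);
        memset(data[2], 0, BLOCK);              /* simulate losing drive 2 */
        rebuild(data, parity, 2);
        printf("recovered: %s\n", (char *)data[2]);   /* prints "drive-2" */
        return 0;
    }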
|
Overview
Polling Polling or Busy-Waiting is actively sampling the status of an I/O device as a synchronous activity, to the exclusion of all other work, until the device responds. Waiting-for-Interruption allows commencement or continuation of other work while the I/O subsystem asynchronously and simultaneously handles I/O processing. When the I/O has completed or failed, an interrupt is presented to the Operating System.
Interrupts Interrupt-request or Interruption An Interrupt is a signal to the Operating System, emitted by the hardware or by running software, indicating that an event needs immediate attention. The hardware could be a device or a component of the processor; the signal can be a Channel End, Control Unit End or Device End marking a milestone in the I/O read/write request. The software source is usually an application program; in that case the interrupt could be an exception driven by a request for service or a program check.
The Interrupt-Handler is a special OS process that gets control from the computer architecture at interruption time. This process rapidly determines the cause of the interruption and then services the event requiring attention. Then the OS dispatches its next ready-to-run task.
Interrupt-Controller hardware and software mechanisms Interrupt Masking is the disabling or enabling of further interrupts in the PSW or PC register, prior to handling the current interruption and immediately before exiting the interrupt handler. Interrupt Priority designates the importance of interrupts being presented to the processor. Interrupts of greater priority are handled first and may interrupt the processing of lower priority events.
Direct Memory Access (DMA) Transferring data between an I/O Device and Memory Both polling and interrupt-driven I/O transfers place the burden of moving data and managing the transfer on the processor. Polling simply consumes an extraordinary number of CPU cycles, but it is notified of the I/O event within one polling iteration. Interrupt-driven I/O relieves the OS of requiring high-frequency reporting from the I/O subsystem, but the notification of the I/O event may be slightly delayed by the processing of other work by the CPU. Engineers have designed a mechanism for offloading the processor by having a device controller or a channel processor handle the transfer of data directly to and from memory without involving the processor. This mechanism is called Direct Memory Access (DMA). The interrupt mechanism is used by the I/O processing to signal the completion of the I/O transfer, alert when an error occurs, or post milestones to the OS during the I/O. These milestones may be Channel End, Control Unit End and Device End events. In the meantime the processor is free to work on other tasks' computation.
DMA is a specialized controller that transfers data between an I/O device and memory independent of the processor. There may be multiple DMA devices in the computer. The DMA follows a script known as an I/O Program or Channel Program; this program gives the DMA a list of instructions to be followed to complete the I/O request. The DMA can then perform the data transfer without consuming a great number of processor cycles. Both the DMA and the CPU, however, must serialize their access to the common memory.
DMA Memory Access and the Memory System Acting independently from the processor, the DMA has the same difficulties as the operating system when dealing with a virtual memory system. The page frames that will be the sources/targets of a data transfer have both a physical and a virtual address, so the DMA must necessarily perform address translation to locate the page frame. Because of ongoing paging operations, the OS has to "lock" the page(s) that contain I/O data in real storage until the completion of the I/O, which is signaled by the subsequent I/O interrupt. Data transfers cannot cross a page frame boundary, so each transfer element must be constrained to a single page. Large transfers must be broken up into sizes that fit within a page, and the pieces chained together.
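A minimal sketch of building such a page-bounded, chained transfer list. The descriptor layout and the virt_to_real() stub are illustrative assumptions, not a real channel's or DMA controller's format.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096UL

    typedef struct dma_desc {
        uint64_t real_addr;        /* real (translated, locked) frame address */
        uint32_t length;           /* bytes in this element, <= one page      */
        struct dma_desc *next;     /* chain to the next element, or NULL      */
    } dma_desc_t;

    /* Stub: the OS would translate and pin the page before starting the I/O. */
    static uint64_t virt_to_real(uint64_t vaddr) { return vaddr; }

    int build_chain(uint64_t vaddr, uint32_t len, dma_desc_t *descs, int max)
    {
        int n = 0;
        while (len > 0 && n < max) {
            uint32_t room  = PAGE_SIZE - (vaddr & (PAGE_SIZE - 1)); /* to page end */
            uint32_t piece = len < room ? len : room;

            descs[n].real_addr = virt_to_real(vaddr);
            descs[n].length    = piece;
            descs[n].next      = NULL;
            if (n > 0) descs[n - 1].next = &descs[n];

            vaddr += piece; len -= piece; n++;
        }
        return n;                      /* number of chained elements built */
    }

    int main(void)
    {
        dma_desc_t d[8];
        int n = build_chain(0x12345678ULL, 10000, d, 8);   /* illustrative request */
        for (int i = 0; i < n; i++)
            printf("element %d: addr=0x%llx len=%u\n",
                   i, (unsigned long long)d[i].real_addr, d[i].length);
        return 0;
    }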
Advantage of DMA: Frees up processor cycles for computation. Disadvantage of DMA: Increases cost of the I/O system.
Application I/O Interface The I/O Interface is a standardized set of functions used to abstract away the detailed differences among the various I/O devices, providing a standard and uniform way to access, read and write external peripheral devices.
Blocking and Non-Blocking I/O Blocking or synchronous I/O requires that no other processing commence until the I/O transmission has completed or failed. Control does not return until the I/O is finished. A function blocks when it waits for something to happen before returning, thereby leaving the CPU idle. Non-Blocking or asynchronous I/O allows other processing to continue that does not have a dependency on the I/O operation. Generally the OS will have other processing ready to dispatch.
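A minimal sketch of non-blocking I/O using the POSIX O_NONBLOCK flag: if no data is ready, read() returns immediately with EAGAIN instead of leaving the CPU idle, so the process can do other work and try again later.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Put standard input into non-blocking mode (illustrative target). */
        int flags = fcntl(STDIN_FILENO, F_GETFL, 0);
        fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK);

        char buf[128];
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
        if (n >= 0)
            printf("read %zd bytes\n", n);
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            printf("no data yet; doing other work instead of blocking\n");
        else
            perror("read");
        return 0;
    }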
The Operating System's I/O Subsystem I/O Scheduling is determining the order in which to execute I/O requests so that overall system performance can be improved, devices can be accessed fairly among processes, and the average I/O wait time for completion is minimized. The Device Status Table is used by the OS to track I/O requests and maintain a queue; information includes the device type, its address, and its idle or busy state. Buffering is the use of memory to store data being transferred between a device and an application: to cope with a speed mismatch; to provide adaptations for devices that have different data-transfer sizes and whose data may arrive out of the original sequence; and to support copy semantics for application I/O by locking the buffer against further update. A copy is moved into a protected area while the application may continue using the original location.
Caching is using a faster region of memory to hold copies of data that exhibit a high degree of locality.
Spooling is the operating system's process of intercepting all unit record input and output. Each application's print output is spooled to a separate disk file. The unit record work is virtualized for the application process. The OS can then serialize the queued print files to a system printer one at a time.
Device Reservation is the holding of devices to prevent deadlock.
Error Handling allows operating systems to compensate for transient I/O errors by attempting to retry the operation upon sensing a failure.
I/O Protection intervenes when a user process attempts to issue illegal I/O instructions which may disrupt the proper functioning of the system. All I/O instructions are privileged instructions. Programs in the problem state must use a System Service Call to request that the OS perform the I/O on its behalf.
The OS maintains data structures to keep track of the state of each device. I/O Subsystem Summary Management of the namespace for files and devices.
I/O Commands The I/O interrupts may be presented at milestones so that the OS management may periodically check on the status of the work being performed: Test the device. Start the I/O. Halt the I/O operation. Resume the I/O. Check on progress (channel end, control unit end, device end, etc.) and completion milestones.
Hardware-Software Interface The operating system must be able to communicate with the I/O devices directly and also prevent the application programs from performing I/O directly. The OS must be able to give commands to the I/O device: read and write, but also control commands such as SEEK, SUSPEND and HALT. The device must be able to notify the OS when the operation has completed or has encountered an error. Data must be transferred between memory and the I/O device; there are no direct device-to-device transfers. Much of this processing must occur independently of the OS.
An I/O instruction is a command sequence to an I/O device that specifies: the target device address; a small script or program describing the work to be performed; the location in virtual memory of the read/write buffer; and possibly the address of the next I/O instruction.
Transforming I/O Requests to Hardware Operations The application process builds a parameter list for performing a read/write. This is followed by a Supervisor Call to request OS servicing of the actual read/write. The OS checks to see if that file fragment is already in real storage. If not, the OS, via a device driver, allocates buffers and then schedules the DMA (or Channel) to perform the physical I/O. The application is suspended pending completion of the I/O. After the I/O is completed, an interrupt is presented to the processor. The OS interrupt handler processes the event. The OS schedules the application to be re-dispatched now that the I/O request is completed.
I/O Performance I/O places heavy demands on the CPU and the OS Scheduler, thereby greatly impacting system performance. Areas of concern: Interrupt handling to service the I/O devices signaling of milestones. Context Switching among tasks requesting I/O operations. Page Locking of frames containing I/O source/target buffers. Memory Bus saturation with data transfers. Handling of networked I/O
Remediation Techniques to reduce demand load: Front-End processors Terminal Controllers Communication Concentrators Use of the I/O Channel Processors Interrupt handling instead of Polling
I/O interrupts may have a hierarchy and relative priority associated with them thereby relating each type of interruption with a different urgency, and order in which to be handled by the operating system.
The Interrupt mechanism eliminates the need for the processor to poll the device and allows the processor to focus on executing programs' CPU bound work.
|