Professor Cullen Lecture Notes

Operating Systems


Topic 1
Introduction


What is an Operating System

An operating system (OS) is the set of control programs (CP) that manages the computer hardware and the application programs (APP). It is responsible for providing an environment in which the applications can access and equitably share the hardware resources needed for their execution. The operating system also acts as the intermediary between the computer user and the computer hardware, acting as a resource allocator.





What is a Task

A task is a computer program acting upon data to produce some service or result.





What are the Goals of the Operating System

To increase system throughput. Execute as many tasks as possible.

To decrease task turnaround time. Shorten the time between the introduction of a task and its completion.

To provide services to both the tasks and the users. Such as device abstraction, TOD query, communications, etc.





What do we mean by Hardware

The major hardware components consist of:

Central Processing Unit (CPU)

Addressable Memory

Input-Output (I/O) devices





What do we mean by Applications

The applications are programs that provide useful function and value to the computer user: solving problems, accessing information, providing entertainment, transmitting data, communicating, etc.





Abstract View of a Computer System

User View

Interaction with applications via a User Interface (UI) presented on a monitor, as well as input interfaces provided by a mouse, keyboard or touchscreen.



System View

The OS acts as a “control program” that manages the computer hardware resources via allocation of computation time, use of memory, file system storage and performing input/output operations.

The OS must allocate resources in such a way as to efficiently and fairly allow the applications to execute. It must attempt to respond to interactive users quickly, and share I/O devices unobtrusively among the applications.

The OS must maintain logical separation between applications; trap, log and recover from errors committed by the applications; and prevent improper use of the machine.





Types of Programming

Kernel or Nucleus

The core program that is recognized as the owner of the hardware processor (a.k.a. the native operating system).



Systems Programs

Programming that runs in the privileged state. These programs may be called by the kernel or driven directly by the hardware.

The kernel and systems programs make up the Operating System.



Application Programs

Unprivileged programming that is dispatched by the OS. This is all the programming OTHER than the operating system.





Computer System Operation

Bootstrapping is the process by which a loaded OS is given control of the computer hardware.

Firmware or Microcode is the lowest level of programming. The BIOS is an example of firmware. Machine-language instructions can be enhanced or modified by decomposition into microcode.

Interruption is the process by which the hardware mechanisms regain control from application tasks and transfer authority to the OS.

Supervisor Service Call (SVC) or Monitor Call (MC) is a process by which an application can request privileged services of the OS.





Addressable Storage Structure

Registers

Extremely fast access storage areas that are packaged with each CPU. Usually there are 16 General Purpose Registers, numbered in hexadecimal from 0-F. They are directly addressable by the CPU via their number.

Real Memory

Storage that is directly accessible across a bus. Each byte of memory is addressable. An individual bit within a byte is not addressable. The limit of addressability is governed by the width of the register. In 64-bit architecture this means 2**64 bytes are uniquely addressable.

Virtual Memory

Virtual Memory is the aggregate of Real Memory plus Auxiliary Storage used to back real memory. Virtual Memory generally exceeds the amount of real memory, up to the architectural limit. Storage areas that do not fit in real memory are provided by auxiliary storage (usually one or more hard drives). A virtual address will not necessarily match (or even have) a corresponding real address.



The execution cycle implements a von Neumann architecture: an instruction is first fetched from memory, then decoded by the CPU, executed, and any result stored back to memory; then the next instruction is processed.





Input-Output (I/O) Structure

Device Drivers

These are programs that provide the OS with the capability to perform I/O to each of the various peripheral devices. The driver may poll for status and data, or it may have set up a DMA transfer into kernel memory. If the transfer is managed by a DMA controller then an interrupt is generated when the transfer completes. The device driver receives the signal, determines which I/O request has completed, determines the request's status, and signals the kernel I/O subsystem that the request has been completed.



I/O Channels

The channels are independent I/O processors that are provided a script or program from the OS to perform I/O.

A CPU sends relatively small channel programs to the controller via the channel to handle I/O tasks, which the channel and controller can, in many cases, complete without further intervention from the CPU (exception: those channel programs which utilize 'program controlled interrupts', PCIs, to facilitate program loading, demand paging and other essential system tasks).

When I/O transfer is complete or an error is detected, the controller communicates with the CPU through the channel using an interrupt. Since the channel has direct access to the main memory, it is also often referred to as a DMA controller.



Direct Memory Access (DMA)

DMA is a specialized controller that transfers data between an I/O device and memory independent of the processor. There may be multiple DMA devices in the computer.

The DMA follows a script known as an I/O Program or Channel Program. This program gives the DMA a list of instructions to be followed to complete the I/O request.

The DMA can now perform the data transfer without consuming all the processor cycles, leaving the processor free for other tasks.





Computer System Architecture

Uni-Processors

One CPU, including one set of Registers. Only a single instruction may be executing at a time.

Advantages:

Simplicity in that there is no need to serialize access to memory, nor special processing for sharing.





Multi-Processors

More than one CPU, each with its own set of registers. Instructions can execute simultaneously.

Advantages:

Increased throughput

Economy of Scale

Increased Reliability

Symmetric multiprocessing means that all processors are peers and each can perform all tasks, including operating system functions and user processes.



Simultaneous Multi-Threading

Each CPU manages two or more threads of execution. When one thread delays (enters some type of a wait), another waiting thread is switched into the CPU for execution.

Also known as “hyperthreading”.

Appears to OS as multiple real CPUs.

Advantages:

Increased CPU utilization, (keeps it busy), through reduction of wait time.

Increased overall computation throughput.



Cluster Computing

Multiple tightly-coupled processors that work together. Close and frequent communication among the processors is needed for coordinated effort. Signals are used among processors to synchronize their work.





Operating System Structure

Multiprogramming increases CPU utilization by organizing jobs so that the CPU always has work to execute.

Time Sharing is a logical extension of multiprogramming for interactive users in that the CPU executes multiple sessions by rapidly switching among them. Each user shares the system simultaneously (or is given the illusion of doing so).

Interactive vs. Batch processing. When a human is involved, entering requests and then waiting for responses, the processing is interactive. When no human is involved and the processing runs until it is finished, it is referred to as batch work.

A Process is a program loaded into memory and represents a unit of work in the system. Unlike a program which is a passive entity, the process is an active entity that has demands for CPU.

Job Scheduling refers to the Time-of-Day and the processing requisites, both prerequisite and co-requisite, needed to start a particular process.

Virtual Memory = The aggregate of Real Memory + Auxiliary Storage

Paging refers to the activities associated with the overcommitment of main memory and the need to move executable code and information between real and auxiliary storage, as it is needed.







Operating System operations

Interrupt Driven (as opposed to polling)

The OS waits for a signal to perform some requested work or required management, such as a request for service by an application, a completed I/O event, a timer interruption, or an application failure due to a program-check condition.



Waits

Enabled-Wait – The OS has nothing to do; it is waiting for work. The OS is currently suspended, and will be re-animated upon receipt of an interrupt.

Disabled-Wait – The OS has entered a failure state. It cannot work further without intervention. When disabled the OS requires problem determination efforts.

Active-Wait – The OS is actively seeking work. This drives the CPU up to 100 percent activity though no productive work is being performed.



Exceptions

Expected Exceptions – From calls for service by applications (SVC), and timer pops (EXT).

Unexpected Exceptions – Errors generated by application programs attempting to use privileged instructions or using instructions with invalid operands. Machine checks caused by failing hardware.





System States of Operation

Supervisor or Kernel Mode

Privileged Instructions

Non-privileged Instructions

Mode Bit is set to supervisor state.



Problem or Application Mode

Non-privileged Instructions

Supervisor Calls and Monitor Calls for OS service. When a supervisor call is executed, it is typically treated by the hardware as a software interrupt.

Mode Bit is set to problem state.





Timers

Time-of-Day (TOD) Clock – Used to interrogate the year, month, day, hour, minute and second.

Interval Timer – Used for measuring intervals or differences in time.

Clock Comparator – Causes an interruption when the TOD clock has equaled or exceeded a value specified.

CPU Timer – Keeps track of the amount of time that the CPU is active.





Process Management

A Process requires certain resources: CPU time, access to memory, files and I/O devices to accomplish its task. These resources are allocated by the OS when the process is created, or as requested during its run.



Program Status Word (PSW) or Program Counter

The PSW is a hardware register which contains information about program state used by the operating system and the underlying hardware. It will contain a pointer (address) to the next instruction to be executed by the CPU. The program status word typically contains an error status field and condition codes set by the last executed instruction, interrupt enable/disable masking bits and a supervisor/problem state mode bit.





Memory Management

Each BYTE is addressable.

Main memory is the only storage area that the CPU is able to address and access directly.

The OS is responsible for:

Keeping track of which parts of memory are currently being used and which processes are using them.

Deciding which processes and data to shuttle into and out of memory.

Allocating and deallocating memory space as needed.



To improve both user response and utilization of the CPU, the OS attempts to keep several programs in memory.



Dynamic Address Translation

For a program to be executed, it must be mapped to absolute addresses in memory. Dynamic address translation, or DAT, is the process of translating a virtual address during a storage reference, by the CPU, into the corresponding real address.





Storage Management

What is a File?

A computer file is a resource for storing a collection of related information organized into a one-dimensional array of bytes.

Information in a computer file can consist of smaller packets of information (often called “records” or “lines”) that are individually different but share some common traits. Records are composed of related fields. Fields are composed by characters or bytes.

The OS abstracts the physical properties of its storage devices to define the logical storage unit, the file.



File Management

The fundamental methods for creation, naming, organizing and handling files usually in a hierarchical or tree structure.



Mass-Storage Management

The Data Pyramid

(fastest, smallest capacity at the top; slowest, largest capacity at the bottom)

CPU
Registers
Buffers
L1/Lx Cache
Real Memory
Auxiliary Storage:
    DASD Internal Cache
    DASD (hard drive) Storage
    Emulated-Virtual Tape Storage
    Physical Magnetic Tape Storage
Internet / Cloud Servers



Caching

A cache is a smaller, faster memory which stores copies of instructions most recently executed and the data accessed from the most frequently used memory locations.

The most important storage management property that the OS can endeavor to exploit is locality of reference. This comes from the observation that programs tend to reuse data and instructions that they have used recently. It can be predicted with reasonable accuracy which instructions and data a program will use in the near future based upon its history of accesses in the recent past.

The OS will attempt to measure this locality property to ensure that the most useful data is ready to access from the closest storage.

Main memory is a cache for auxiliary storage.





I/O Subsystem

Manages the following:

SPOOLING – This is unit record work such as printing and print images.

Paging Subsystem – Movement of fixed-size pages (commonly 4K) of memory.

Device Drivers – Programs that mediate the I/O of all peripherals. They run as extensions of the OS. Only the device driver knows the peculiarities of the specific device to which it is assigned.





Protection and Security

User Identifiers and Passwords

Resource access control facilities over data and files

Job Submission

Group Identifiers

Privilege Escalation







Virtual Machine

Hypervisors – The nucleus or native operating system that creates and manages guest operating systems running in an emulated computer architecture called a virtual machine

Guest Operating Systems – Any OS that runs under a hypervisor.

Live Guest Relocation – Migrating a running guest operating system's virtual storage, executing threads and disk storage from one locale to another without disruption of processing.





Topic 2
Operating System Structures




Bootstrapping Theory


Boot is short for bootstrap loading and derives from the phrase

“to pull oneself up by one's bootstraps”.

The usage calls attention to the requirement that, if most software is loaded onto a computer by other software already running there, then some mechanism must exist to have loaded the “initial” software onto the computer.

Early computers used a variety of methods to get a small program into memory to solve this initial problem. Up through the 1960s this was performed by a card deck.

The processor hardware was wired to read the contents of a single card into memory. The contents of that card contained a program that could then instruct the computer to read additional cards into memory. Those cards in turn contained sufficient programming to direct an OS to be loaded from either more cards, tape or disk devices.

After the OS was successfully loaded into low memory it then took control of the processor either via interruption or loading of a current program control register (PSW).



Modern OS are initially loaded (“booted”) by supplied firmware mechanisms with such names as:

IPL – Initial Program Loader

IMPL – Initial Microcode Program Loader

BIOS – Basic Input Output System

UEFI – Unified Extensible Firmware Interface







System Services

Major services provided by the OS are:

User Interfaces

Text or Graphical User Interface
Command Line
Script or Batch



Program

Loading or Instantiating
Executing
Termination, Teardown and Cleanup



Scheduling and Dispatching

Time-of-Day and Event based
Context Switching and Timeslicing



Input-Output Operations

Fairly and Efficiently scheduling I/O services



Maintain the File System

Providing users and programs with an abstraction of hierarchical directories and files, along with the operations to manipulate them.



Data Communications

Message Passing among processes.
Socket Processing
Sharing of memory locations



Error Detection

Vigilance for User generated program checks.
I/O errors.
Detection and Alerts.
Recovery.
Recording and Logging.



Resource Access Protection and Security

Ensure that access to resources is controlled.
Secure files, memory and network from unauthorized use.
Protection from other task invasion.
Insulation and Isolation.



Resource Allocation

Ensure that processes are not scheduled unless resources needed become available.
Serialization of resources,
Reserve and Release enqueueing mechanisms,
Wait and Post mechanisms
Memory Allocation and Garbage Collection



Accounting for services rendered to Applications and Users

Resource Consumption,
Supervisor execution time,
Process Time accounting,
Chargeback for services rendered by OS.






System User Interfaces


Command Interpretation

Command text line interface for users via terminal.
Interpretation Shells,
Provide access to files, processes and program invocation.



Graphical User Interface

Desktop
Icons
Mouse
Screen Swipes
Gestures



Application Programming Interfaces (API)

A supplied set of functions that specifies the parameters that are passed to each function as arguments, and the return values the programmer or user can expect as results.

Along with the API are usually a set of return codes that enumerate the possibilities of function failure.





Calls for Supervisor Services (SVC)

I/O, File and Device manipulation

Examples: OPEN, READ, WRITE, CLOSE



Information query and maintenance

Examples: What is the TIME?, How much storage do I have?



Process Control

Termination
Wait for Event, Time Countdown
Enqueue/Dequeue
Forking or subtask attachment
Event Recording, Tracing, Address Stopping and Instruction Stepping
Request to Dump real or virtual storage locations



Communications

Domain Name Services, DHCP and other daemon services
Socket maintenance
Timeout services
Message Passing









System Programs


A set of system utilities that perform:

File Management

Status Information

Search and Edit Tools

Programming Language Support

Program Loading and Execution

Automation Services

Inter-Task and Inter-network Communications

Common unprivileged Applications

Web browsers
Email agents
Compilers
Spreadsheets
Media Players





OS Design and Implementation


Design Goals

Convenience
Easy to learn
Abstraction via APIs
Reliable
Availability
Serviceability
Speed
Cost



Mechanisms

How to accomplish some work.



Policies

What will be accomplished, and when.



Good mechanisms provide flexibility by being insensitive to policy changes.



Implementation

Early OSes were written in assembly language.

Modern OSes are written with assembly-language first-level interrupt handlers and device drivers; a higher-level language such as C is generally used for the remainder of the kernel.

System programs in C++ and PL/S.

High-Performance Areas of the OS:

Interrupt Handlers
Paging and I/O Management
Memory Management
CPU Scheduler






OS Structure


Simple Structure

Microsoft MS-DOS, early AT&T UNIX, IBM TOS and DOS



The OS Service Pyramid

Computer HW Architecture
OS First-Level Interrupt Handlers
OS System Management Programs
OS Provided Abstracted User Services
User Application Programming Interfaces



Layered Structure Approach

The OS is modularized into a number of layers.

The bottom layer is the hardware. The highest layer is the user interface.

Each layer may interface with only its adjacent layers.

Each layer is implemented with operations provided by lower-level layers.

Overall functionality is determined and separated into components.

Allows the OS to retain much greater control over the computer and the applications.



Loadable Kernel Modules

The kernel provides core services while others are implemented dynamically while the kernel is running. Linking and loading services dynamically is preferable to adding new features directly to the kernel, which would require recompiling the kernel with every change.

Examples: device drivers and file systems.







Virtual Machine


What is a Virtual Machine?

Virtual Machines are those operating environments that provide emulation methods that present a standard software interface that is identical to the underlying hardware to their "guest" operating systems. Also known as "hypervisors", these operating platforms create multiple instances of the same computer architecture as the one on which the hypervisor is natively running.

The hypervisor control program objectifies virtual instances of the computer architecture through the use of software data structures that emulate the hardware components of the architecture.



Guest Operating Systems

The “guest” OS is any operating system that can run on that instance of the architecture. The guest runs as an application to the hypervisor. Therefore, it executes in the “problem state”. The guest has no knowledge nor reason to believe that it is running on anything but real hardware. No changes need to be made to the guest to allow it to function correctly when running under a Virtual Machine environment.



History

The CP67 OS was developed by IBM in 1967. This was the first attempt at VM. Later IBM provided VM370 as a field developed program for corporate experimentation. In the 1980s IBM released VM/SP as part of its official product line. Today zVM is the flagship of the VM line of hypervisors. zVM runs a variety of operating systems including: MVS/ESA, VSE, zOS, zTPF, AIX, zLINUX RedHat and SUSE.

VMware (later acquired by EMC) developed VMWARE to virtualize Intel architectures in the late 1990s.

After 2000 we saw a proliferation of hypervisors developed by other vendors; such as KVM for Linux, Virtual Box for Solaris.



Benefits

Unlike distributed hardware-based solutions, virtualization technology allows you to virtualize processor, communications, storage, I/O, and networking resources to help reduce the need to provide duplicate hardware, programming, and data resources.


Reduction in hardware costs.
Increased productive utilization of hardware resources.
Ability to share CPU, memory and I/O devices among operating systems.
Research and Development without affecting production or availability.
Rapid porting and testing of different environments.
Server Consolidation.
Simulation and prototyping of networked environments.



Simulation of Architectures

This is where the native host system and the guest system are of different architectures. The host system presents to the guest a simulated machine architecture, one on which the guest operating system was designed to run.

No change to the operating system is necessary to run as a guest.

Virtualization allows us to run operating systems requiring older architectures without the need to keep the old real machine. Emulation can increase the life of programming.



Para-Virtualization

This is a variation on the virtualization theme. Rather than trick the guest operating system into believing it has a hardware system to itself, para-virtualization presents the guest with a system that is similar but not identical to the guest's normal operating environment. The guest operating code must be modified to run on paravirtualized hardware. This results in a more efficient use of resources and a smaller virtualization layer.



Implementations

Only the native operating system, the hypervisor, is permitted to execute in the Supervisor State (privileged). This means that each of the guests must operate in the Problem State. Execution of instructions in the problem state is limited to the non-privileged instructions. But OSes are designed to make use of the full instruction set including privileged instructions.

When a guest OS attempts to execute a privileged instruction then a real PROGRAM CHECK interrupt occurs in the hardware. This check is presented to the hypervisor. The hypervisor will interrogate the source of the interruption. The hypervisor must check the virtual PSW of the guest to determine if the guest was in the “virtual” supervisor state (OS) or virtual problem state (APP). If the instruction was attempted while in the virtual supervisor state then the hypervisor will execute the instruction on the guest OS behalf, then reflect the results and statuses for the completed operation in the fixed storage locations of the guest operating system. The guest is then re-dispatched.

Otherwise if the hypervisor determines that the guest was in the virtual problem state when it attempted a privileged instruction, then the hypervisor would fail the attempt, reflect that status back to the guest; perform an emulated program-check interruption to the guest; then re-dispatch the guest.

In either case the guest “believes” that it itself performed the operation because all events from its perspective appear to have occurred on real hardware.





Some commercial hypervisors:

IBM Virtual Machine (zVM)
VMWARE
Linux KVM
Oracle Virtual Box
Hercules
JAVA Virtual Machine (JVM)





Operating System Problem Determination


Debugging -- effort to find cause of faults.

Performance Monitoring and Tuning -- effort to measure use of system resources and set controls to make access to these equitable.

Bottlenecks -- resource areas which run short during normal or peak processing periods.

Tracing -- effort which relies on scrutiny of past instructions and events leading to a problem event.

Program Event Recording -- recording of detail events during processing.

Core Dump -- snapshot of the contents of memory at some particular point in processing.


Topic 3
Process Management


The Process Concept

The Process

The process or job is a program in execution. Its current activity is represented by the address of the next instruction to be executed within the in-memory instance of the program; this value is held in the PSW or Program Counter register.

Executable Code – contains the instructions for the CPU.

Data Section – structures that contain variables and constants used by the process.

Save Areas – memory to save the contents of registers and the PSW.

The Process Stack

local variables
function parameters
return branch addresses
(plus a separate heap for dynamically allocated storage)

Recall that the process is the active entity that represents an executing instance of a program. The program is a passive entity representing a "blueprint" from which the process is instantiated.





Process States

New – the process is being created by the OS. This will shortly transition to the Ready-for-Work state.

Running – instructions are being executed by the CPU. Only a single process can be in this state per CPU. Can transition to either Terminated, Waiting or Ready-for-Work.

Waiting – in this state the task does not consume CPU because it is waiting for some event to occur, such as an I/O completion or a timer pop. A waiting task's next transition is to Ready-for-Work.

Ready-for-Work --The process is not waiting for any event, but is waiting for access to the CPU. The next transition is to Running.

Terminated – The task has completed execution and is being garbage collected, its resources de-allocated by the OS.





Process Control Block

Each process is represented in the OS by a software data structure called a PROCESS CONTROL BLOCK. It contains information associated with the task:

Current State

Such as the next instruction to be executed, interrupts pending, system mode.

Save Area Information

Program Counter information from the PSW.
Registers saved contents after time slice end.

CPU Dispatching information

CPU Priority
Priority to I/O device access
Time Slice Interval and Consumption
Dispatch Queue Residency

Memory management information

Base and Limit registers
Page and Segment Tables for process address space.

Accounting information

CPU consumption
Wall-Clock time
CPU OS Service time
I/O Activity time
Memory usage

I/O device allocation and status information





Threads or Tasks

A Thread is the basic dispatchable unit of computation. The thread represents an independent instance of execution of a unit of logical program code. Its environment is characterized by a PSW, a set of register values, and a save area stack.

The thread is spawned at the direction of a "parent" process. The parent of all processes is the operating system itself.

Each thread of code runs independently from its other threads.

Only a single instruction of a thread may be executing at a time, regardless of the number of CPUs.





Process Scheduling

The goal of scheduling is to:

Keep the CPU as busy with productive work as possible.

Switch the CPU among processes fast enough to give interactive users the illusion that each is simultaneously interacting with the applications, and receiving good response times.





Scheduling Queues

Job Queue

This consists of all processes that enter the system.

Dispatch or Ready Queue

Process must be ready to run; not waiting for anything but CPU. The OS selects the next task to be dispatched and given the CPU for processing.

Eligible-List

Process is ready but insufficient storage is available to run without adversely impacting the other processes.

Device Queue

Waiting for the device-end interrupt to be posted for an I/O request.





Schedulers – Migrate processes among the various scheduling queues.

The JOB scheduler selects processes from the pool of all submitted jobs and loads their program into memory.

The CPU scheduler selects from among the processes that are ready to run and allocates the CPU to one of them.

Interactive Type – These exhibit characteristics of requiring small intervals of CPU time very often.

Batch Type – These are long running processes that require large increments of CPU time infrequently.

I/O Bound – A task that spends more than half of its wall clock time requesting and waiting for the completion of input and output. It tends to release the CPU prior to its timeslice end.

CPU Bound – A Task that spends more than half of its wall clock time consuming CPU cycles. It tends to consume its entire allocated timeslice.

It is important that the JOB scheduler attempt to select a favorable process mix of I/O-bound and CPU-bound tasks to ensure good CPU utilization and throughput.

A good process mix will keep both the ready queue and the I/O queue non-empty so that both the devices and the CPU are kept busy.



Medium-Term Scheduler and Swapping

Medium-Term processes are not in need of CPU time as frequently as Short-Term tasks but more frequently than Long-Term tasks.

Swapping removes the entire address space of a process from memory to auxiliary storage until its next timeslice.







Context Switching – This concerns the process state represented by the Process Control Block (PCB) and PSW, the values in registers, the memory management information and the state of execution.

All information that is required to resume processing between dispatches (the context) must be saved and restored between timesliced interruptions.

This switching effort is purely overhead charged to the OS.

The OS attempts to keep the switching as infrequent as possible without significantly impacting multi-tasking or rapid interactive response time.







Operations on Processes

Process Creation

Processes are created by a parent mechanism.
When created they are assigned an identification usually termed a Process-ID.

Main Tasks are created by the OS.
Sub-tasks are created by the Main tasks.



Process Termination

Processes are terminated by self-destruction or by a parent mechanism.

Generally, a child process cannot be allowed to exist if its parent has terminated.

When a process terminates its resources are deallocated and garbage collected by the OS.





Interprocess Communication

Any process that does not share data with any other process is known as independent.

A process is cooperating if it can affect or be affected by another process executing in the system. Processes may be sharing a resource or passing messages to each other.



Reasons for providing process cooperation

Information sharing, such as common files read by users.

Computation speedup via subtasking work.

Modularity, dividing programmed functions into separate subtasks.

Convenience, such as using an IDE that is checking syntax while the user is entering code into an editor.



Shared-Memory Systems

Producer vs. Consumer processes

The producer generates information.

The consumer accesses and uses the information generated.



Message-Passing Systems

Direct communication
In-direct communication

Store-and-Forward

Synchronization

A process is termed as blocked when it can no longer make progress due to waiting for some event to occur or some resource to become available.

Blocking send (must wait if partner is not ready to receive)
Non-Blocking send (never waits)
Blocking receive (must wait for data to become available)
Non-Blocking receive (never waits)



Buffering

Zero capacity
Bounded
Unbounded







Communication in Client-Server Systems

Sockets

A socket is defined as an endpoint for station to station communication.

A socket is identified by an IP Address concatenated with a Port ID.

Such as: 130.171.242.6:2048

There are 65536 possible ports per OS instance.

Pairs of processes communicating over a network (inter-machine or intra-machine) employ a pair of sockets, one for each side.

Generally, two channels of communication are established: one for sending, the other for receiving. This way both processes can transmit messages simultaneously (full-duplex).

Higher level programming languages allow sending/receiving messages as easily as reading/writing local files.





Remote Procedure Calls (RPC)

Messages exchanged in RPC communication are well-structured and no longer just packets of data. Each message is addressed to an RPC daemon function listening to a particular port on the remote system.

Each message contains an identifier specifying the particular function to execute and the parameters needed to be passed to that function.

The function is then executed as requested and any output is sent back to the requestor via a separate message.





PIPES

A pipe is a communications conduit between two processes. Messages are usually sent one-way.

Pipes allow two processes to communicate in standard producer-consumer fashion.

The producer process writes output to one end of the pipe.

The consumer process reads input from the other end of the pipe.

So one process feeds its output as input to the next process.

Pipes will reduce I/O wait time and increase CPU utilization because multiple pipe stages are actively processing concurrently.



Examples:

UNIX: ls | less

Windows: dir | more



Topic 4
Threads and Sub-Tasks


Overview

A Thread is the basic dispatchable unit of computation. The thread represents an independent instance of execution of a unit of logical program code.

The thread is spawned at the direction of a "parent" process. The parent of all processes is the operating system itself.

Each thread of code runs independently from its other threads.

The thread's environment comprises an OS maintained data structure called the Process Control Block which includes:

an IDentification or Process-ID,
a Program Counter,
a set of Register Save Areas,
and a Program Stack containing pointers (addresses) to various data.

Only a single instruction of a thread may be executing at a time, regardless of the number of CPUs.





Single-threaded

Can perform only a single task at a time.



Multi-threaded

Can perform two or more tasks concurrently at a time.

Example: Web Server, Interactive Development Environment







Benefits of Multithreaded programming

Responsiveness – Some threads may continue providing interactive service while others may be blocked or are performing a time-consuming operation.

Work Pipelining – Immediately passing the output stream of one process as the input stream of a next process thereby keeping the CPU busy and productive; as opposed to processing with a single thread of execution against all input, then starting a subsequent thread to process the accumulated output of the first thread.

Resource Sharing – Threads share the memory and resources of the process to which they are subordinate.

Economy – It is less expensive to create and context-switch threads than processes. “Hot threads” can be allocated and pooled, thereby ready to be dispatched when needed.

Scalability – Single-threaded processes cannot scale to multiple CPU environments. Multi-threaded processes can exploit parallelism.

Reduced WAIT Time – threads need not wait behind each other. A string of instructions completing faster decreases wallclock time.





Parallelism vs. Concurrency

Parallel denotes simultaneous processing.

Concurrent denotes inter-leaved execution on a uniprocessor.







Scheduling and Loss of Control

Threads must compete with one another within the same process.


Loss of control of a processor by a thread is classified as either involuntary or voluntary.


Involuntary loss of control, also known as preemption, occurs even though the thread itself does nothing that would cause it to become unable to continue executing.


Voluntary loss of control, which includes blocking and yielding, occurs when the thread performs some function that makes it unable or unwilling to continue to execute. An example of this is issuing a Wait or Pause for some period of time: the thread cannot continue executing until the wait is satisfied.





Multi-Core Programming Challenges using Threads

Dividing activities into separate concurrent tasks that run in parallel on individual CPU cores. Identify which workunits are independent, disjoint and separable from each other.

Balance – Ensure that subtasks perform equal amounts of work.

Data Splitting – Data to be accessed by the separated subtasks must also be divided and “fenced” to the particular tasks manipulating the data.

Data Dependency – Programmers must ensure that any data dependency between two or more subtasks must be synchronized to accommodate the dependency.

Testing and Debugging – Programs running in parallel present many different execution paths. Problem determination is inherently more difficult than for single-threaded tasks. It is difficult to recreate a problem using the same conditions that led to the problem in the first place.





Multithreading Models

User threads – Supported, created and maintained by application programming.

Kernel threads – Supported, created and maintained by the operating system. These are the maintasks or main processes.



Many-to-One model

Many user threads matched to one kernel thread. The downside is that the entire process will suspend if a thread makes a blocking system call. Also, since only one thread can access the kernel at a time, multiple threads are unable to run in parallel on multiprocessors.



One-to-One model

Each user thread is matched to a kernel thread. This allows another thread to run when some thread makes a blocking system call. It also allows multiple threads to run in parallel on multiprocessors. The downside is that creating a new user thread requires creating a new kernel thread with its associated overhead.



Many-to-Many model.

This model multiplexes many user-level threads to a smaller or equal number of kernel threads. This model avoids the shortcomings of the above two models; you can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on a multiprocessor. Also, when a thread performs a blocking system call the kernel can continue to schedule another thread for execution.







Threading Issues

Forking with the fork() call creates a duplicate of the calling process.

The exec() call replaces the entire process, including all of its threads, with a new program.



Thread cancellation

Asynchronous cancellation – The thread is immediately and forcibly terminated. The thread has “the rug pulled out from under it”, which causes problems with cleanup of held resources. This is dangerous because you may not know the extent of the fallout.

Deferred cancellation – The target thread to be canceled will be flagged to get its attention, the target periodically checks whether it is being requested to terminate, upon being interrupted it gracefully terminates itself in an orderly fashion.

Cancellation Points – These are the code points in a program where a thread can safely determine if it is being requested to terminate. The program is at a logical point to release resources and perform the orderly termination.

Often the OS will reclaim system resources from a canceled thread but will not reclaim every resource held by the thread. Therefore, canceling a thread asynchronously may not free a necessary system-wide resource. This could lead to a deadlock condition.





Signalling

A signal is a flag or interrupt used to get the attention of a process and to notify it that a particular event has occurred.

Signals may be synchronous or asynchronous.



Signal Delivery

A signal is generated by the occurrence of a particular event.

A generated signal is delivered to a process via flag or interrupt.

Once delivered, the signal is either handled or ignored.



Signal Handling

The default signal handler defined by the OS.

A user-defined signal handler, such as try/catch blocks in Java or registered signal handlers in C. If a user-defined handler is available then it is attempted prior to the OS default handler.



Handling signals in single-threaded programs vs. multi-threaded.





Thread Pools

A number of ready to go “hot” threads.

These require no “on-the-fly” creation or destruction. They are pre-built and reusable, and waiting for work. Once a thread completes its service it returns to the pool.

The size of the pool can expand or contract as demand changes, using trends or history as a basis.





Hardware Threading

Sometimes referred to as "simultaneous multithreading" or “hyperthreading”, hardware threading is employed to increase processor utilization by switching to another thread when the current thread is found in any way to be stalled, waiting or yielding.

The OS “sees” this as multiple CPUs. The hardware is virtualizing the view of each CPU by processing multiple threads of execution. The benefit is that the utilization of the CPU is increased.


Topic 5
Process Synchronization


Basic Concepts

Synchronization mechanisms are those protections designed to address concurrency-control problems by ensuring the orderly execution of cooperating processes that share a common resource, so that data consistency and integrity is maintained.



Consumer-Producers that share data can exhibit functional problems when executed concurrently. Either the transmission or reception of data between the processes may block should data not be ready to transmit or the receiver not be ready to accept.



A Race Condition occurs when several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which access to the shared data takes place.

Problems:

Incorrect or Inconsistent results.
Errors with data integrity.
Deadlocks (waiting or indefinite loops)





Mutual Exclusion ensures that a critical section of code is executed by only one process at a time. Other processes are barred from access.

Benefits:

Process Synchronization and Serialization
Coordination among cooperating processes





The Critical-Section Problem

A critical section (CS) is a segment of code within each process which may be changing common variables, updating a table, writing a file, etc. When one process has update access to its critical section then no other process may be permitted to execute in the same area.

Critical Sections have entry and exit sections.

Critical Sections should be as small as possible, self-contained, and fast-executing with no calls.

The actual code of Critical Sections may itself be shared among programs.



A transaction is a programmed work unit in the critical section of code that must be fully executed without interruption; that is, either all the operations associated with the function are executed to completion, or none of the operations are attempted.



Solution to the CS problem:

1) Mutual Exclusion (interlocking) occurring prior to the execution of the CS.

2) Progress rapidly through the CS.

3) Bounded waiting; a limit on the number of times that other processes are allowed to enter their CS after a process has made a request to enter and before that request is granted.



Pre-emptive vs. non-preemptive kernel code

Non-preemptive code does not release its Critical Section until it normally exits the CS, blocks on a resource, or voluntarily yields control of the CPU. It is free from race conditions.

Preemptive code is much more suitable for real-time programming since other processes are not held up. It is more responsive by relinquishing the CPU to other waiting processes. It must be carefully designed to ensure that it is free from possible race conditions.





Synchronization Hardware

Atomic Locking allows the querying and setting of the contents of memory as one uninterruptible unit. Locks provide protection of critical sections.

Test-and-Set
and
Compare-and-Swap

Both instructions can be used by CPU tasks sharing common storage areas in either a multiprogramming or multiprocessing environment. A task can safely modify the contents of a storage location even though the possibility exists that it may be interrupted by another process attempting to update the same location.





Semaphores and Serialization Mechanisms

Serialization is the employing of mechanisms to ensure that only a single process may access a shared resource at a time.

A semaphore is a synchronization variable used to overcome the difficulty of possible simultaneous or overlapped access to a shared resource. When one process modifies the semaphore value, no other process can simultaneously modify the same semaphore value. The semaphore access/modification is uninterruptably locked under the test-and-set hardware.

A mutex (mutual exclusion) is a binary semaphore that provides just two possible states: locked and unlocked.



If a process attempts to acquire a lock and it is found already in use by another then it has 3 options:

Spin on the lock (busy wait).
Go perform some other work.
Wait for the lock (non-busy wait, or blocked).





Two mechanisms that employ semaphores that can be used to ensure synchronization between threads are:

Enqueueing (Reserve/Release)
and
Wait and Post



Enqueuing mechanisms are program constructs used for reserving and releasing a system resource. The resource can be virtually anything that can be shared between two or more tasks; such as static memory, a device, a data communications channel, etc.

Enqueueing is the process of attempting to acquire a lock on a resource, or waiting until the lock is available, then locking the resource, followed by performing some critical section action on the resource, and finally releasing the resource. Only then are other waiting threads notified that the resource is available for reservation.

Safe Sequence:

Request a Lock
Reserve the resource
Use of resource via Critical Section
Release the resource
Signal all other waiting threads





Wait/Post mechanisms are program constructs used for synchronizing two or more threads so that one thread does not pass a particular point in its processing until the other(s) have reached the same agreed upon point in their processing.

Such as waiting for all threads to be fully initialized before having them attempt to send messages to each other.

Another would be the shutdown processing of an operating system. All tasks must be quiesced and brought down gracefully before the operating system can perform its final steps of withdrawing and issuing a halt to the hardware.





Semaphore Signalling versus Spin Locks.

A spinlock is a “busy wait” in which the process executes through a loop while waiting for the lock to become available.

The “passive wait” of both Wait-and-Post and Reserve/Release does not consume CPU cycles; the process voluntarily gives up its timeslice to another thread and blocks until the lock becomes available. A notifying signal called a “post” will awaken it out of its wait when the resource becomes available.



Locking Protocol

Shared access – Multiple Read and possible Single Write

Exclusive access – Only Single Read-Write / No other Reads



Executable Coding for Shared Critical Sections

Serially Reusable – entry to the CS is reserved, then code and data may then be altered during execution of a critical section, but all changed status and data in the CS must be reset to the original state prior to CS release.

Re-entrant – entry to the CS is NOT required to be reserved; code is executed but never altered by simultaneous processes; and all state and data references are outboard of the Critical Section of code.



Observance of Reservation

Note that all threads using the resource must observe, respect and employ the use of semaphores, otherwise there can be no data integrity.



Deadlock

This is a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes, and therefore will never occur.



Priority Inversion

This is a situation where a higher-priority process needs to read or modify kernel data that is currently being accessed by a lower-priority process. The kernel data is typically protected with a lock owned by the lower-priority process. The situation compounds itself should the lower-priority process get preempted in favor of another process with a higher priority than itself.

Processes: L < M < H

H requires resource R which is currently held by L.

H would wait for L to release resource R.

While waiting process M preempts process L.

Now a lower priority M has affected the length of the time that H must wait for L to relinquish resource R.



This situation can be avoided by implementing a priority-inheritance protocol. All processes that are accessing resources needed by a higher-priority process inherit the higher priority until such time as they have released the resources in question.






Topic 6
CPU Scheduling and Dispatching


Basic Concepts

On a uniprocessor only one process can run at any time; all other processes must wait in the ready queue until the CPU is free and can be rescheduled.

The objective of multiprogramming is to maximize CPU utilization by having some process performing productive work, keeping the CPU running at all times.

The CPU operates at either 100 percent of its engineered capacity or zero percent; there is no partial state. The CPU is either busy with work or idle.

A process will execute until it must wait (voluntarily or forced), or terminate.

When a process must wait, the OS gives the CPU to another process that is ready-for-work. If no ready-for-work processes are available then the CPU goes IDLE.

The amount of CPU time given to a task is a timeslice or time quantum.





CPU-I/O Burst Cycle

CPU Burst – Any CPU cycles or time consumed prior to waiting for an event.

I/O Burst – Amount of time spent waiting prior to needing the CPU for further instruction execution.



CPU Scheduler

Short-Term or CPU Scheduler



Preemptive Scheduling

CPU scheduling decisions are cooperative or preemptive.

Preemptive means that the CPU is taken away from the process by the OS.

Cooperative implies that the CPU is not taken away from the process; rather, the process gives it up because it cannot or decides not to use the CPU immediately.

The transitions may take place under the following circumstances:

Transition from running to waiting state. (cooperative)

Transition from running to ready state due to an interrupt such as timeslice end. (preemptive)

Transition from running to ready state via voluntary timeslice end. (cooperative)

Transition from waiting to ready state. (preemptive)

Normal Process Termination. (cooperative)

Abnormal Process Termination. (preemptive)



Cooperative scheduling does not require timer hardware in the decision process. It is not a model for supporting realtime systems.

Preemptive scheduling requires the Interval Timer hardware to cause regular and timely interruptions.

Interrupt Masking is the enable/disablement of notification that an interrupt is pending in the hardware. Masking allows the OS to work uninterrupted on a unit of work. The OS will re-enable interruption when it decides that its work is done, or when it dispatches a user task.

Interrupt Handlers run disabled. This means that these entry points into the OS are entered with interruptions disabled. The processing of the first-level Interrupt Handlers must be extremely fast.



Dispatcher

The dispatcher is the service of the OS that gives control of the CPU to the next process selected by the scheduler. This function involves:

Context switching,

Switching from Supervisor to Problem state,

Branching to the proper location in the application program to commence the program at its next instruction.





Scheduling Criteria

Criteria by which to compare scheduling algorithms:

CPU utilization – How busy can we drive the CPU.

Throughput – How much aggregate work can be accomplished per unit of wallclock time.

Wall Clock Turnaround time – How fast can the work of a process be finished. From submission to completion.

I/O Wait time – What portion of time is spent waiting for I/O to complete.

User Service time – What portion of time is spent waiting for the OS to perform services on a task's behalf.

CPU Wait time – What portion of time is spent by processes waiting for the CPU in the ready queue.

Response time – How quickly are trivial human interactions responded to.





Scheduling Algorithms

First-Come, First-Served scheduling

Simple FIFO queue



Shortest-Job-First scheduling

Jobs that need the CPU the least go first.



Priority scheduling

Jobs follow FCFS queueing unless marked as a higher priority. Those are moved to appropriate placement in the queue.

Lower-priority jobs can have problems with “starvation” because they are continually being “cut in front of”.

Priority is raised over time (aging).



Round-Robin scheduling

Following time quantum preemption the task goes to the back of the dispatch queue. Performance relies on the size of the timeslice.

Too small a timeslice will incur frequent context switching and overhead.

Too large a timeslice will affect interactive users with inconsistent response times and RR degenerates to FCFS.



Multilevel Queue scheduling

Separate the tasks into different queues by their run characteristics. Queues with smaller CPU requirements are given higher priorities.

Such as: (high to low priority)

System Services
Interactive Tasks
Batch Processing
Compute Bound



Interactive tasks get the CPU frequently but in very small time quantums.

Batch work gets the CPU infrequently but in larger time quantums, when there are no interactive tasks queued ready. No batch work is permitted to run unless the queues for interactive and system services tasks are empty.

Each queue has associated with it a different time quantum.

Each queue is guaranteed a fixed percentage of access to the CPU over time.

The tasks of a queue are prioritized within the queue.



Multilevel Feedback Queue scheduling.

Like the multilevel queue but with reassignment among the queues in the event that a task either exceeds its time quantum, or never comes close to using its time allocation.

In the event that the task exhibits the characteristic of consuming its entire timeslice then the task is moved to a more compute intensive queue.

In the event that the task voluntarily yields a significant remainder of its timeslice, then the task is moved to a less compute intensive queue.







Multiple-Processor Scheduling

Asymmetric v. Symmetric multiprocessing scheduling

In asymmetric processing a single CPU handles all the scheduling decisions and I/O processing. The remaining CPUs execute user application code.

In symmetric processing all CPUs schedule from their own queues. Programming at the OS level must ensure that no two CPUs choose the same task, and that no processes are lost from the queues.



Processor Affinity

When a process migrates from one CPU to another then the contents of the CPU pipeline and the memory cache for the last CPU must be invalidated. Then, the pipeline and the cache for the next CPU must be repopulated. This is costly and wasteful.

Keeping the task on a single CPU is known as processor affinity. This avoids the high cost of migration.

A downside may be waiting for a particular desired CPU when another is available.





Load Balancing

Performed automatically when CPUs share a common queue of ready to run tasks.

Push Migration – An OS service periodically checks the load on each processor. If it finds an imbalance then it evenly distributes the load by moving processes from the overloaded to a less-busy CPU.

Pull Migration – An OS service running on an idle CPU will examine the queues of other processors and extract one for itself.

Will counteract the effects of processor affinity.





Multicore Processors

Hardware multithreading is employed to increase processor utilization by switching to another thread when the current thread is found to be stalled on memory due to a cache miss, waiting, yielding, or incurring any other long-latency event.

In this situation the CPU switches to a second hardware thread. Sometimes referred to as “hyperthreading” or a “dual-threaded” processor core.

From the OS perspective dual threaded processors appear as two CPUs.

This operation is somewhat like timeslicing performed on a hardware level. No OS decision processing is involved.





Virtualization and Scheduling

OS guest machines cannot account accurately for the passage of real time. They have access to a “virtual” clock, not the real clock. The virtual clock does not tick smoothly and continuously, but is updated by the hypervisor with real clock information at dispatch time.

Any guest operating system scheduling algorithm that assumes a certain amount of progress in a given amount of time can be negatively impacted.

There will be “uncaptured” or “unaccounted for” time as measured by the guest. This will appear to the OS guest as time in which the CPU was in a suspended or stopped state.









Topic 7
Deadlocks


Basic Concepts

A deadlock is the state entered by a process when its continued forward progress depends on waiting to acquire access to one or more resources that never become available.

A requirement for deadlock is that a process respects another process's lock on a shared resource; the sharing processes are attempting to synchronize their use of the resource.





Possible Problems due to Non-Synchronization

Memory Consistency Problems:
Two or more threads attempt to write to the same resource at the same time. The data will be incorrect or corrupted, and cannot be trusted.

One thread is writing to a file while the other(s) are reading the same file. The data will be unreliable at best.



Thread Execution Interference:
Interference happens when two or more operations, running in different threads, execute the same shared code instructions that manipulate common data. The sequence of the steps overlap (or interleave) in time thereby leaving the results unpredictable.





Necessary Conditions required for Deadlock to occur

Mutual Exclusion – Only one process at a time can access a resource.

Hold and Wait – Holding at least one resource and waiting to acquire another held by some other process.

No Preemption allowed – Resources can be released only voluntarily.

Circular Wait – Process-1 is waiting for a resource held by Process-2, which is waiting for a resource held by Process-3, ..., which is waiting for a resource held by Process-N, which is waiting for a resource held by Process-1.







Possible Problems due to Synchronization

A race condition is an undesirable situation in which multiple program threads operate on the same resource simultaneously (or concurrently), but the operations must be performed in a particular sequence to be correct; the outcome of the execution therefore depends on the order in which access to the shared data happens to take place.



Starvation occurs when one or more threads in your program are blocked from gaining access to a resource and, as a result, cannot make further progress.



Deadlock, the ultimate form of starvation, occurs when two or more threads are waiting on a condition that cannot be satisfied. Deadlock most often occurs when two (or more) threads are each waiting for the other(s) to provide some service or release some resource that is needed to progress further. All resources locked must be later unlocked by the owning thread.



Livelock is a form of deadlock. It differs in that the competing threads repeatedly change their conditions in response to each other, without regard for the fact that the other threads are making the same change. No progress is made because they repeatedly meet in deadlock on each new condition they have synchronized on.



Indefinite Overtaking is a type of race condition: one or more fast tasks share a common resource with a relatively slow task (or tasks). The slow tasks rarely get access to the resource because they are continually being “overtaken” by the faster tasks. By the time a slow task recycles itself and attempts another access, it is yet again beaten to the punch by a faster task.



Deadlock degrades overall system performance to the point that all or parts of the system will stop functioning, and will require manual restart.







Deadlock Prevention

Deadlock can be prevented by ensuring that at least one of the following required conditions does not hold.

Mutual Exclusion

Employ mutual exclusion semaphores only on resources that are genuinely shared.



Hold and Wait

When a process requests to reserve a common resource, ensure that that process holds no other common resource. Serialize the reservation and release of all shared resources.



No Preemption

Allow the OS to force release of resources from holder.



Circular Wait

Impose a total ordering on all resource types, and require that each process requests resources in increasing order (a small sketch follows below).

Require implementing a priority-inheritance protocol. All processes that are accessing resources needed by a higher-priority process inherit the higher priority until they have released the resource.
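
A minimal sketch of the resource-ordering rule described above, in Python; the resource names and their ranks are hypothetical, and a real OS would enforce the ordering rather than rely on convention:

import threading

RANK = {"printer": 1, "scanner": 2, "tape": 3}          # hypothetical total ordering
LOCKS = {name: threading.Lock() for name in RANK}

def acquire_in_order(*names):
    # Always lock in ascending rank, so no two processes can ever wait on each other in a cycle.
    ordered = sorted(names, key=RANK.get)
    for name in ordered:
        LOCKS[name].acquire()
    return ordered

def release_all(ordered):
    for name in reversed(ordered):
        LOCKS[name].release()

held = acquire_in_order("tape", "printer")   # locks printer first, then tape, regardless of call order
release_all(held)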



Deadlock Avoidance

Possible side-effects of deadlock prevention by various algorithms are low device utilization and reduced work throughput.

A system is said to be in a safe state if, and only if, there exists a safe sequence in which the system can allocate all required shared resources to each process, in some order, and still avoid deadlock.

Banker's Algorithm

(1) When a new process enters the system, it must declare the maximum number of instances of each shared resource type that it may need. This may not exceed the total number of shared resources in the system.

(2) When a process requests a set of shared resources, the system must first determine whether the allocation of these resources will leave the system in a safe state. If it will, the resources are allocated. Otherwise the process must wait without requesting any resources until some other process releases sufficient resources needed by (1).
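
A compact sketch of the safe-state test at the heart of rule (2), assuming a single resource type for brevity; the numbers are invented, and a full Banker's Algorithm tracks a vector of resource types per process:

def is_safe(available, allocated, maximum):
    # need[i] is how much more process i may still request.
    need = [m - a for m, a in zip(maximum, allocated)]
    finished = [False] * len(allocated)
    work = available
    progress = True
    while progress:
        progress = False
        for i in range(len(allocated)):
            if not finished[i] and need[i] <= work:
                work += allocated[i]        # process i can run to completion and return its resources
                finished[i] = True
                progress = True
    return all(finished)                    # safe only if every process can eventually finish

print(is_safe(available=3, allocated=[5, 2, 2], maximum=[10, 4, 9]))   # True: a safe sequence exists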





Recovery from Deadlock (possibles)

Process Termination

Abort all deadlocked processes.
Abort one process at a time until the deadlock cycle is eliminated.



Resource Preemption

Selecting a Victim to cancel.
Rollback the entire transaction to the state that existed before the attempt.
Starvation – guard against repeatedly selecting the same victim, or that process may never complete.














Topic 8
Main Memory


Memory Management

Main memory and the registers built into the processor itself are the only storage that the CPU can access directly.

Memory consists of a large array of addressable bytes; that is each byte has its own address.

The CPU fetches bytes from memory according to the address kept by the PSW or program counter. The CPU then decodes those bytes into (hopefully) instructions. The instructions may cause additional loading from and storing to specific memory addresses.

What does the CPU do while it is waiting for data to come over the memory bus?



Supporting Hardware

cache

base register (also known as a relocation register)
limit register





Logical (Virtual) vs. Physical (Real) Addressing

Logical or Virtual address is the address computed with the aid of the OS, based upon the “view” of the address space in which the program executes.

Physical or Absolute address is a real location on the memory chip. It is between 0 and the highest addressable byte of real memory.



Address Binding

Symbolic address – The addresses from the perspective of the programmer.

Relocatable address – Addresses in a format that can be altered, or relocated, at execution time.

Real address – The physical address of real memory.



Address Mapping

Run-time mapping between virtual and physical addresses is performed by a hardware mechanism called dynamic address translation (DAT) or the memory-management unit (MMU).

The relocation value is applied to every address in each memory access; that is, the value of the relocation register is added to each address the program generates.

The application program never sees the real physical addresses.

So we have logical addresses in the range 0 to the highest logical address.

And physical addresses in the range of real address 0 to the highest physical address.







Swapping

Swapping is a process that temporarily removes the entire address space of a process from memory to a backing store (auxiliary storage). This frees up memory for occupancy by another process. Later, the rolled-out process is brought back into memory (roll-in) and its execution continues at the point of interruption.

The context switching time is significantly high.

Swapping is rarely used, since the execution time it buys is small relative to the context-switching cost, which makes it unattractive as a memory-management solution on its own.





Memory Management Allocation Schemes

Single Contiguous Allocation

Before the advent of operating systems we had a simple address space that could accommodate a single application. The program had to fit into the address space. Each instruction and data address mapped directly to a real address. The unused portion of storage was wasted.

If a program did not fit, the developer had the option to use dynamic loading. With dynamic loading a routine is not loaded until it is called. All routines are kept on some storage medium in relocatable form. The main program is loaded into memory and executed. When a routine that is not in memory is needed, the relocatable linking loader is called to load the desired routine into memory, and control is then transferred to that routine.

Each program was initialized or “booted” into storage. The logical programmer view was the same as the absolute physical view of storage.

Along with the CPU and memory, the process required unit record devices to run: card reader, card punch, and printer. As the program executed it read and punched cards, printed lines, and accessed other peripherals in real time.

If a unit record device was unavailable or down the program would not run.





Partitioned Allocation

Static Partitioned. Address space that could accommodate a fixed number of partitions and therefore a fixed number of applications. An operating system was needed; it was loaded beginning at memory location zero. Each application partition, aside from the first, was offset by a fixed displacement from memory location zero.

Since programs are written with the assumption of running relative to address zero, each memory access was manipulated by an offset representing the beginning partition address so that it could address the proper physical address relative to the partition start boundary.



The memory space of the partitions was protected by means of a base register and a limit register. Only the OS can load these two registers. Every address generated by a user-mode program is validated. A program can then legally access any real address in the range from the base to (base+limit).
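
A small sketch of that base-and-limit check; the numbers are illustrative, and in real hardware the comparison and the addition happen on every memory access rather than in software:

BASE, LIMIT = 300_000, 120_000        # illustrative partition occupying real addresses 300000..419999

def translate(logical_address):
    # Validate against the limit, then offset by the base (relocation) register.
    if not (0 <= logical_address < LIMIT):
        raise MemoryError("addressing exception: outside the partition")
    return BASE + logical_address

print(translate(42))        # 300042
print(translate(119_999))   # 419999, the last legal byte of the partition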

The remaining unused portions of each partition, called fragments, could not be used by any other program.



Supporting Hardware

base register (also known as a relocation register)
limit register







Dynamic Partitioned. Had all the features of Static Partitioned with the exception that the address space could accommodate a variable number of partitions and applications. The variable number was set by the operator at machine startup time.

Both static and dynamic partitioned required that each program must be able to fit in the partition allocated to it.





Again, programs running in partitions other than the first must have their memory addresses and accesses offset by the displacement of the start of the partition.

Still the unused portion of storage in each partition was unusable for any other program.

Fragmentation worsened over time.

And if the entire program did not fit in any of the partitions, it could not be run, or it had to be segmented by the programmer.

No way to collect the fragments into a single contiguous partition.

All tasks shared a single address space. One process malfunctioning could affect all other processes running.

You could now run multiple programs at the same time but the shop needed to have a set of Unit Record Devices allocated to each partition. If you had 4 partitions running you may have been required to have 4 sets of unit record equipment.





Spooling

The IBM Federal Systems Division contractors at NASA invented a program that virtualized card readers, punches and printers (HASP). The SPOOLer (which stood for Simultaneous Peripheral Operations Online) would capture the formatted output (unit record work) destined for the printer/punch and hold it on a disk device. The same was done for card reader data destined as input for partitions.

Spooling allowed multiple partitions to run at the same time while the shop may have had only a single printer, card reader and card punch. The SPOOLER, which ran in its own partition, would queue, retrieve and drive the output to the appropriate devices in a serial manner at the devices' design rate.

Without spooling, most programs would be relegated to patterns of fast CPU processing followed by long waits for the printer/card. Use of spooling removed much of the WAIT from the execution system, causing the CPU to be tasked with more frequent work and driven harder. The result was faster throughput and the ability to run more programming in a smaller schedule window.

CPU utilization necessarily increased.





Relocatable Partitioned Scheme

This was an attempt to gain the benefits of dynamic partitioned allocation with reduced fragmentation. The relocatable scheme had the ability to reclaim the unused fragments of each partition by suspending operation and “percolating” or “compacting” the used portions of each partition up toward the top of memory, thereby reducing the partition sizes dynamically and creating a new unused partition at the bottom of memory representing the collection of the fragments. That new partition can then be used to run another application.




Problems were the cost and complexity of relocation. The logical addresses had to be relocated dynamically at execution time. Compaction time was substantial. Computer operations had to be suspended, and then each partition's instructions had to be manipulated to reflect a new displacement, or offset, from the beginning of real storage.

There was no guarantee that the new partition would be large enough to run a particular application. Failures required restarting the application and possibly the entire machine.





Paged Memory Management

Paging is one of the memory-management schemes by which a computer can store and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. The main advantage of paging over memory segmentation is that it allows the physical address space of a process to be non-contiguous. Before paging came into use, systems had to fit whole programs into storage contiguously, which caused various storage and fragmentation problems.

Paging is an important part of virtual memory implementation in most contemporary general-purpose operating systems, allowing them to use disk storage for data that does not fit into physical random-access memory (RAM).



Break the real storage up into small pieces called pages or page frames. Each page, usually of size 4K, would contain a piece of the application. The memory for an application no longer must be physically contiguous; pieces can exist wherever in storage a page frame is available. Logical addresses do not match physical addresses.

Each page has its own offset from the beginning of real storage.

The pages are linked together to form a logically contiguous address space by way of a Page Map Table. The OS maintained the PMT.

From the program's perspective the logical addresses matched the symbolic addresses. All pages still must fit into real physical memory.






Since the pages could be located anywhere in real storage, each instruction that addressed real storage must be manipulated to reference the actual page that contained the true offset into real storage needed by an application. Therefore each instruction had to be modified or translated in CPU processing by using the contents of the Page Map Table to direct operations to the correct real addressable page.

Dynamic address translation, or DAT, is the process of translating a virtual address during a storage reference into the corresponding real address.
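
A minimal sketch of that translation, assuming 4K pages and a tiny page map table whose contents are invented for illustration:

PAGE_SIZE = 4096
PAGE_MAP_TABLE = {0: 7, 1: 3, 2: 12}    # logical page number -> real page frame (illustrative)

def dat(logical_address):
    # Split the logical address into a page number and an offset within the page.
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page not in PAGE_MAP_TABLE:
        raise LookupError("page not mapped")
    return PAGE_MAP_TABLE[page] * PAGE_SIZE + offset

print(hex(dat(0x1ABC)))    # logical page 1, offset 0xABC -> frame 3 -> 0x3abc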



Fragmentation was remedied.

Allowed a dynamic number of concurrent partitions.

All tasks still shared a single address space and therefore could still adversely affect each other.

Programmers still could not run a program that exceeded the size of real memory.













Topic 9
Virtual Memory


Virtual Memory Management



Virtual Memory allows the execution of processes whose programming cannot fit completely in real memory.

A major advantage is that programs can now be larger than the available physical real memory.

Because programs can use less physical memory, more programs can run together and thereby increase CPU utilization and work throughput.

The virtual address space of a process refers to the logical view or perspective of how a process (or the programmer) “sees” memory.





Demand-Paged Memory Management

The motivation here was to address the problem of running an application(s) that exceeded the size of the real memory available.

Prior to this time programmers were responsible for managing memory allocation by dividing a program that could not fit in the available memory into segments or overlays. The segments were loaded and unloaded under user-program control during execution. The programmer had to ensure that the loaded segments never attempted to exceed the real storage.



Virtual Storage is a technique that used main real memory as a “cache” for secondary or auxiliary storage. The auxiliary storage is usually an array of disk drives that contains all the pages for all the applications running in the computer.

The main real storage is used to cache only the “active” parts of the application. Each cache block is known as a page. Each cache miss, an event that occurs when the accessed page is not present in real memory, is known as a page fault.

The virtual address is an address that corresponds to a location in the virtual address space and is translated, or re-mapped, to a physical address when memory is accessed. The virtual memory address is translated by a combination of hardware and software mechanisms to map to a real physical address. This mapping process is called address translation.



Demand paged memory management had all the advantages of paged memory management with relief from the problem of “too big” applications.

Its disadvantages were increased hardware cost, processor overhead and OS complexity.

But, all tasks continued to share a single address space and could adversely affect each other.







Basic Concepts

Locality of Reference is the “closeness” in either space or time with which a process' pages are accessed.

The Page Map Table is the array in real memory that the OS uses to translate the logical page referenced by an instruction to a physical page frame in memory.

The page fault rate is the number of page-ins and page-outs occurring over a unit of time.

The Dynamic Page Pool is the set of real-memory page frames available for use by non-OS pages.

The working-set of a process is the pages that exhibit sufficient locality for the OS to keep resident in real memory.

Overcommitting Storage is the term used for allocating more virtual storage than the available physical memory resources.

The victim page frame is the page selected by the paging algorithm to be removed from real storage to auxiliary storage to make room for an incoming page.

Page frames can be allocated on the basis of priority of processes maintained by the OS.





Page Replacement Algorithms

First In First Out (FIFO) is a scheme in which the page of storage selected for replacement is the oldest one.

Timestamped or replacement of head of queue.



Least-Frequently-Used (LFU) requires that the page with the smallest number of “touches” be replaced.

Count of references.



Most-Frequently-Used (MFU) is based on the argument that the page with the smallest count was probably just brought in and has yet to be referenced sufficiently.



Least Recently Used Removal (LRU) is a scheme in which the page of storage selected for replacement is the one that has been unused or un-referenced for the longest time.

modify (or changed) bit

reference bit

To help the operating system estimate and identify the LRU pages, the hardware provides a reference bit which is set whenever a page is accessed. The operating system periodically records and then clears the reference bits so it can determine which pages were touched during a particular time period. The OS can then select as a victim a page whose reference bit is off, since it is among the least recently referenced pages.
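
A rough sketch of that reference-bit approximation; the page names, the access pattern, and the single-period bookkeeping are all invented for illustration:

reference_bit = {page: 0 for page in ("A", "B", "C", "D")}

def touch(page):
    reference_bit[page] = 1               # in reality set by the hardware on every access

def pick_victim():
    # Prefer a page whose reference bit is off: it was not touched during this period.
    for page, bit in reference_bit.items():
        if bit == 0:
            return page
    return next(iter(reference_bit))      # every page was touched: fall back to any page

def end_of_period():
    for page in reference_bit:            # the OS records, then clears, the bits
        reference_bit[page] = 0

touch("A"); touch("C")
print(pick_victim())                      # "B": not referenced recently, so a reasonable victim
end_of_period()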





Thrashing

Thrashing is the term given to the condition of incurring high paging rate activity relative to actual productive work. A process that is thrashing is spending more wall-clock time paging than executing.





Some Performance Consideration

Prepaging or Anticipatory Paging

This technique, sometimes called "swap prefetch", preloads a process's non-resident pages that are likely to be referenced in the near future (taking advantage of locality of reference). Such strategies attempt to reduce the number of page faults a process experiences. Some of those strategies are "if a program references one virtual address which causes a page fault, perhaps the next few pages' worth of virtual address space will soon be used".



Page Size considerations

Each page fault generates a large amount of overhead needed for processing the interrupt, saving registers, context switching, replacing a page via a page-out, followed by a page-in, I/O queuing for the paging device and updating the page map tables.

Smaller pages result in less I/O activity and less total allocated memory, due to less fragmentation and less loading of code and data that may never actually be referenced.

Larger pages minimize the number of page faults, but have larger portions that remain unused.



I/O Page Locking

Pages that contain read/write buffers should remain fixed in memory.

Performance consideration of Page Locking.

Fix in real memory the pages with the greatest anticipated locality.





Dynamic Address Translation (Page Mapping)

Dynamic Address Translation (DAT) is the process of translating a virtual address during a storage reference into the corresponding real address.

If the page containing the virtual address is already in central storage, the DAT process may be accelerated through the use of a translation lookaside buffer. If the page is not in central storage, a page fault interrupt occurs, the OS is notified, and it brings the page in from auxiliary storage.

If, while executing an instruction, the CPU fetches an instruction located at a particular virtual address, or fetches data from a specific virtual address or stores data to a particular virtual address, the virtual address must be translated to the corresponding physical address. This is done by a hardware component which looks up the real address (from the segment and page tables) corresponding to a virtual address and passes the real address to the CPU for execution of the instruction. If the tables indicate that the virtual memory page is not currently in real memory, the hardware raises a page fault exception which invokes the paging subsystem component of the operating system.



DAT is implemented by both hardware and software through the use of page tables, segment tables, region tables and translation lookaside buffers. DAT allows different address spaces to share the same program or other data that is read-only. This is because virtual addresses in different address spaces can be made to translate to the same frame of central storage. Otherwise, there would have to be many copies of the program or data, one for each address space.





Segmented Memory Management

Segmented Memory Management allowed for multiple address spaces to be constructed.

An address space is a range of memory that is viewed and managed from location zero up to some maximum address N.

Up until the invention of segmented memory management all processes shared a single address space. Each process would occupy a partition of memory carved into this address space. The logical view of the process memory space was mapped by a relocation register into a displacement into the address space.

It was possible for one partition to adversely affect another partition.



SMM allowed all the benefits of demand paging using more than a single address space. You could run multiple sets of address spaces in which each set of applications were architecturally isolated from programs running in other address spaces.

Segmentation supports the user's logical view of memory by allowing each process to run in its own protected address space.

DAT led to this capability and forms the basis for memory management on all major computer architectures.

When a program in some address space is dispatched, the hardware is loaded, via a special control register, with a Segment Table Origin (STO) pointer that addresses the unique Page Map Table associated with that address space, thereby providing the capability to context switch among multiple independent address spaces.

Now, no process running in some address space can reach and affect the code or data of another process running in some other address space.





Shared Segments

These are read-only blocks of pages that are physically shared by multiple address spaces, as opposed to filling memory with duplicate copies of code and data. Generally they house simultaneously reusable programming code (also known as re-entrant code). Re-entrant code never alters the data or instructions inside the shared segment; only registers and data areas outside of the segment are used to manipulate and undergo change.

Shared Segments save enormously on storage sizes by attaching seamlessly into the address spaces of each running application using the shared code. Virtual addresses in different address spaces can be made to translate or attach to the same page frame of real storage.









Virtual Machines

VM Hypervisor and Guests

Virtual Machines are those operating environments that provide emulation methods that present a standard software interface that is identical to the underlying hardware, to their "guest" operating systems. Also known as "hypervisors", these operating platforms create multiple instances of the same computer architecture as the one on which the hypervisor is natively running.

Because the hypervisor is the native operating system, and the owner of the real Page Zero fixed storage locations, the hardware respects the VM hypervisor as the owner of the processor.



The virtual machine hypervisor is the only task(s) that runs in the privileged (or supervisor) instruction state. All other guest operating systems run in the problem (or application) state.

Therefore, when a guest attempts to execute a privileged instruction such as an I/O, the real hardware will raise a Program Check Interruption and invoke the VM hypervisor's interrupt handler for processing.

If the hypervisor determines that the guest was in the proper "virtual" privileged state, the hypervisor will execute the instruction on behalf of the guest, then "reflect" the signal and completion statuses back to the guest's operating environment. Next it will re-dispatch the guest. The guest will have no indication that it did not perform the operation itself. Reflection allows the guest to infer, from its perspective, that the instruction was executed directly by the guest itself.

A guest operating system generally cannot detect that it is running in a virtual machine, aside from the unaccounted-for wall clock time as measured by the guest.















Topic 10
File System


File Concepts



What is a File?

A file is an abstract data type defined and implemented by the operating system, consisting of a sequence of logical records. The logical record is made up of data fields, which in turn are made up of strings of bytes.



File Operations

Open/Close – Reserving/Releasing access to an existing or a new file to be created.

Create – Generate a new file with requested attributes.

Write – Write or rewrite an existing file.

Read – Read data from an existing file.

Reposition – Set the read/write position indicator on a file.

Delete – Erase a file directory entry, and release the file space.

Truncating – Reducing the file size to zero.

Appending – Write new or additional information at the end.



File Open/Close Processing

File Pointer – tracks the last read or write location with a file position indicator.

File open counter – tracks the number of opens and closes against the file. It reaches zero on the final close. The OS can then remove the entry.

Device Location of the file is the address of the device within the computing system.

Permissions and Access Rights – used by the OS to allow or deny I/O requests for the file data. These control the ability to view or make changes to the contents of the filesystem by various processes.



File Types

File Extensions Names

The OS can use the extension name to indicate the type of file and the type of operations that can be successfully performed on that type of file.

TXT, JAVA, PDF, BIN, EXE, CLASS, JPEG, SH, ZIP, etc.



File Attributes

Name, Timestamp, Type, Location, File Size, Record Size, Accessibility, Fixed or Variable length records.



File Structure

A directory provides a table of contents of the files.

File creation – Generate an index entry for a new file.

File deletion – Removal of an index entry for an existing file.

File list – Show each file and its attributes.

File System traversal – Creation of a hierarchy or network.



File Access Methods

An access method is a function provided by the operating system that provides an application programming interface that allows read/write manipulation against files.

Sequential

Start with the first record and search in linear fashion forward one record at a time.



Direct, Relative Record or Random

Calculate precise location of record inside of the file, relative to the beginning of the file.
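
A small sketch of that calculation, assuming fixed-length records so the byte offset is simply the record number times the record size; the file name and record size are hypothetical:

RECORD_SIZE = 80    # hypothetical fixed-length record size in bytes

def read_record(path, record_number):
    # Seek directly to the record: offset = record_number * RECORD_SIZE.
    with open(path, "rb") as f:
        f.seek(record_number * RECORD_SIZE)
        return f.read(RECORD_SIZE)

# e.g. read_record("payroll.dat", 5) returns bytes 400..479 of the file.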



Indexed Sequential Access Method (ISAM)

The file is divided into partitions. The partition that contains the record is calculated, and then that partition is searched sequentially.



Virtual Storage Access Method (VSAM)

VSAM offers both the Relative Record and ISAM methods, with the additional feature of files spanning disk volumes.





File System Mounting

The root node is the highest directory from which all other directories are descendants.

The mount point is the directory where a file system on a given device is connected to the main OS file system to make its files available for use.





File Protection by OS

File protection is provided by access lists, passwords, information hiding and other techniques to control access.

Types of Access

Read
Write
Execute



Control of Access

Owner/Creator
Group
Other









Topic 11
Mass Storage


Mass Storage Concepts



The Data Pyramid

(levels listed from the top of the pyramid, fastest and smallest capacity, down to the bottom, slowest and largest capacity)

CPU
Buffers
Registers
Ln Cache
Real Memory
Auxiliary Storage
DASD Internal Cache
DASD (hard drive) Storage
Emulated-Virtual Tape Storage
Physical Magnetic TAPE Storage
The INTERNET and CLOUD Servers

Mass Storage is generally a reference to the lower levels of the Data Pyramid.



Overview

Magnetic Disk arrays

Magnetic Tape libraries

Store large quantities of data
Inexpensive
Slow access times
Sequential access

Network-Attached Storage

Network access pools of virtualized disks
Access remotely via remote procedure call API (RPC)
Data transferred via TCP/IP protocol
Bound by the speed of your Internet connection

Storage-Area Network

Private local access pools of virtualized disks
Access via storage protocols
Multiple hosts can connect directly
High bandwidth with fast transfers





Disk Structure

Disk – The surfaces of the platters physically arranged in cylinders. Each platter has two working surfaces.

Sector – A segment of data that makes up the tracks on a magnetic disk. It is the smallest amount of information that is read or written on a disk.

Head – The magnetized read/write mechanism that flies over the disk surface.

Cylinder – The collection of all concentric rings or tracks of data that are the same distance from the platters' edge and which can be accessed without the need to reposition the heads.

Arm Assembly – The mechanism that moves all the heads simultaneously from one cylinder to another.





Disk Storage

Magnetic disks are nonvolatile storage.

A track is one of the concentric circles that make up the surface of a magnetic disk.

Tracks are accessed by read/write heads that literally "fly" over the track as the platter surface spins.

A cylinder consists of all the tracks on the disk platters that can be accessed by the set of heads without the need to reposition the heads.

The head is moved back and forth across the cylinders by the disk actuator mechanism.



Disk Scheduling

Physical Movement Components

Seek is the time required to move the heads from one cylinder to another. This includes acceleration, deceleration, stopping and settling to a stable position.

Rotational Latency is the amount of time required for the sector, or record to be accessed, to rotate under the read/write head.

Transfer time is the time needed to move the data between the disk surface and the computer memory. Transfer time is proportional to the amount of data transferred.

Service Time = seek + latency + transfer times

Connect Time is the time that the device is transferring data to/from memory.

Disconnect Time is the time that the device is not transferring data between the device and memory.

Pending Time is the time that the device request to perform I/O remains queued.

Channel Utilization =
(Connect time + Disconnect time + Pending time) / Wall Clock Interval
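
A tiny worked example of the two formulas above, with invented timings (component times in milliseconds over a one-second observation interval):

seek, latency, transfer = 4.0, 2.0, 0.5            # illustrative component times in ms
service_time = seek + latency + transfer           # 6.5 ms per request

connect, disconnect, pending, interval = 120.0, 60.0, 20.0, 1000.0
channel_utilization = (connect + disconnect + pending) / interval
print(service_time, channel_utilization)           # 6.5 0.2, i.e. the channel is 20% utilized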





Scheduling of disk I/O
We can improve on channel utilization by managing the order in which disk I/O requests are serviced. Provide a queue. Possibly sort the queue of requests by their cylinder address. Considerations are priority and performance.

First-Come-First-Serve (FCFS)

Choose the track where the I/O was requested first.
Disadvantage: Wide seek swings can occur.

Shortest-Seek-Time-First (SSTF)

Choose the track closest to the current head position.
Disadvantage: A continuing arrival of requests for tracks in the neighborhood of the head position may cause starvation of further away tracks that have been pending.

Elevator Algorithm (SCAN)

Choose the track that is next in the direction of the moving head. When an end is reached, reverse direction and process toward the opposite end.
Disadvantage: A request for the track just behind the head position must wait until the head movement reverses.

Circular (C-SCAN)

Like the SCAN method, but when an end is reached, return to the beginning without servicing requests along the way back.
Provides more uniform wait time than SCAN.

Modified SCAN (LOOK)

Like the SCAN method but move forward only to the furthest currently queued I/O request, then reverse direction.
Improvement over SCAN and C-SCAN.
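
A small sketch contrasting two of the orderings above; the request queue and starting head position are invented. For brevity the SCAN variant stops at the furthest pending request rather than the physical end of the disk, so it is really the LOOK behavior:

def sstf(head, requests):
    # Shortest-Seek-Time-First: repeatedly pick the pending request closest to the head.
    order, pending = [], list(requests)
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

def scan_up_then_down(head, requests):
    # Elevator: service everything at or above the head going up, then sweep back down.
    up = sorted(c for c in requests if c >= head)
    down = sorted((c for c in requests if c < head), reverse=True)
    return up + down

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(sstf(53, queue))               # [65, 67, 37, 14, 98, 122, 124, 183]
print(scan_up_then_down(53, queue))  # [65, 67, 98, 122, 124, 183, 37, 14]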









RAID

RAID stands for "redundant arrays of inexpensive disks".

Motivated initially by performance. The key reason it is used today is for dependability.

Dependability, Reliability, Availability and Serviceability

Computer system dependability is the quality of delivered service such that reliance can justifiably be placed on this service.

Reference specification of expected behavior so that one can determine dependability.

1) Service accomplishment, where the service is delivered as specified.

2) Service interruption, where the delivered service is different from the specified expected service.

Transitions from state-1 to state-2 are called service failures.

Transitions from state-2 to state-1 are called service restorations.





Reliability is a measure of the continuous service accomplishment or the mean time to failure (MTTF).

Service interruption is a measure of the mean time to repair (MTTR).

Mean time between failures is the sum of the preceding two measures. (MTTF + MTTR)



Availability is a measure of service accomplishment with respect to the alternation between the two states of accomplishment and interruption.

Availability = MTTF / (MTTF + MTTR)
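
A short worked example of the formula, with invented figures:

mttf = 10_000.0    # hypothetical mean hours of service accomplishment before a failure
mttr = 4.0         # hypothetical mean hours to repair
availability = mttf / (mttf + mttr)
print(availability)    # about 0.9996, i.e. roughly "three nines" of availability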




Serviceability is a measure of the ease (or difficulty) of applying upgrades or corrective service to the computing system. Can service be performed on a running system, or must the system experience a loss of availability?







RAID Levels

RAID-0 "striping".
Allocation of logically sequential blocks to separate disks to allow a higher level of performance than a single disk drive can deliver. More R/W heads are brought to bear on the problem, and thereby greater throughput is achieved.

The RAID controller makes multiple disk drives appear as a single large logical drive with great capacity and many R/W heads.

Note there is no redundancy of data here. In the event of a single drive failure you must recover all volumes from backup media, such as tape, and then apply the journaling files.



RAID-1 "mirroring".
Scheme for tolerating a disk failure. Uses twice as many disks as does the scheme employed by RAID-0.

Data is striped over many drives and the full set is duplicated in another location. Switchover in the event of a failure is fast, but recovery of the failed set of drives is slow since all volumes may need restoration.



RAID-2 (no longer used.)



RAID-3 "bit-interleaved"
A group of data disks that share a common checksum disk. Rather than have a complete copy of the original data for each disk, we need only add enough redundant information to restore the lost information in the event of a failure. This scheme relies on the fact that disk controllers can detect whether a sector has been read correctly, so a single parity bit can be used for error correction as well as for error detection. If one of the sectors is damaged, we know exactly which sector it is and then can figure out whether any bit in the sector is a 1 or a 0 by computing the parity of the corresponding bits from sectors in the other disks.

Recovery takes a long time, but at less expense than mirrored storage (fewer volumes needed).
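
A tiny sketch of the parity idea, assuming three data disks and one dedicated parity disk holding the XOR of the corresponding data bytes; the byte values are invented:

from functools import reduce
from operator import xor

data = [0b10110010, 0b01101100, 0b11000001]   # one byte from each of three data disks
parity = reduce(xor, data)                    # the byte stored on the parity disk

# Disk 1 fails: its byte is rebuilt by XOR-ing the parity with the surviving data bytes.
rebuilt = parity ^ data[0] ^ data[2]
print(rebuilt == data[1])                     # True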



RAID-3 improves on RAID-1 by requiring fewer disks (a single parity disk per group rather than a full duplicate set), plus each byte is spread across multiple striped disks.



RAID-4/5/6 improve on RAID-3 by replacing the single parity volume by placing parity partitions on each of the data volumes. You thereby have “striping” performance with “bit-interleaved” recovery, and no need for additional volumes.



(Higher level RAID schemes progressively employ sophisticated algorithms to apply over the data such that one can begin to approach error detection and correction, and data redundancy with less extra device volumes.)







Topic 12
Input-Output


Overview



Polling

Polling or Busy-Waiting is actively sampling the status of an I/O device as a synchronous activity, to the exclusion of all other work, until the device responds.

Waiting-for-Interruption allows commencement or continuation of other work while the I/O subsystem asynchronously and simultaneously handles I/O processing. When the I/O has completed or failed, an interrupt is presented to the Operating System.





Interrupts

Interrupt-request or Interruption

An Interrupt is a signal to the Operating System emitted by the hardware or running software indicating that an event needs immediate attention.

The hardware could be a device or a component of the processor. This can be a Channel End, Control Unit End or Device End signalling a milestone in the I/O read/write request.

The software is usually an application program. In the event of the software this could be an exception driven by a request for service or a program check.



The Interrupt-Handler is a special OS process that gets control from the computer architecture at interruption time. This process rapidly determines the cause of the interruption and then services the event requiring attention. Then the OS dispatches its next ready-to-run task.



Interrupt-Controller hardware and software mechanisms

Interrupt Masking is the disabling or enabling of further interrupts in the PSW or PC register, prior to handling the current interruption and immediately before exiting the interrupt handler.

Interrupt Priority designates the importance of interrupts being presented to the processor. Interrupts of greater priority are handled first and may interrupt the processing of lower priority events.







Direct Memory Access (DMA)

Transferring data between an I/O Device and Memory

Both polling and interrupt-driven I/O transfers place the burden of moving data and managing the transfer on the processor.

Polling simply consumes an extraordinary number of CPU cycles, but the OS is notified of the I/O event within one polling iteration.

Interrupt-driven I/O relieves the OS of requiring high frequency reporting from the I/O subsystem, but the notification of the I/O event may be slightly delayed by the processing of other work by the CPU.

Engineers have designed a mechanism for offloading the processor and having a device controller or a channel processor handle the transfer of data directly to and from memory without involving the processor.

This mechanism is called Direct Memory Access (DMA).

The interrupt mechanism is used by the I/O processing to signal the completion of the I/O transfer, to alert the OS when an error occurs, or to post milestones to the OS during the I/O. These milestones may be Channel End, Control Unit End and Device End events. In the meantime the processor is free to work on other tasks' computation.



DMA is a specialized controller that transfers data between an I/O device and memory independent of the processor. There may be multiple DMA devices in the computer.

The DMA follows a script known as an I/O Program or Channel Program. This program gives the DMA a list of instructions to be followed to complete the I/O request.

The DMA can now perform the data transfer without consuming a great number of the processor cycles.

Both the DMA and the CPU however must serialize their access to the common memory.





DMA Memory Access and the Memory System

Acting independently of the processor, the DMA has the same difficulties as the operating system when dealing with a virtual memory system. The page frames that will be the sources/targets of a data transfer have both a physical and a virtual address.

The DMA will have to necessarily perform address translation to locate the page frame.

Because of ongoing paging operations, the OS will have to "lock" the page(s) that contain I/O data in real storage until the completion of the I/O. The I/O completion is signaled by the subsequent I/O interrupt.

Data transfers cannot cross a page frame boundary, so each transfer must be constrained to a single page. Large transfers must be broken up into sizes that will fit on a page, and the transfers then chained together.



Advantage of DMA: Frees up processor cycles for computation.

Disadvantage of DMA: Increases cost of the I/O system.







Application I/O Interface

The I/O Interface is a standardized set of functions used to abstract away the detailed differences among the various I/O devices and to provide a standard, uniform way to access, read and write external peripheral devices.



Blocking and Non-Blocking I/O

Blocking or synchronous I/O requires that no other processing commence until the I/O transmission has completed or failed. Control does not return until the I/O is finished. A function blocks when it waits for something to happen before returning, thereby leaving the CPU idle.

Non-Blocking or asynchronous I/O allows other processing to continue that does not have a dependency on the I/O operation. Generally the OS will have other processing ready to dispatch.
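
A small sketch of the difference on a POSIX-style pipe; os.set_blocking toggles the mode, and the point illustrated is that a non-blocking read returns control immediately (here by raising BlockingIOError) instead of leaving the caller idle:

import os

r, w = os.pipe()

os.set_blocking(r, False)          # switch the read end to non-blocking mode
try:
    os.read(r, 64)                 # nothing written yet: control returns immediately
except BlockingIOError:
    print("no data yet; the process is free to do other work")

os.set_blocking(r, True)           # back to blocking mode
os.write(w, b"ready")
print(os.read(r, 64))              # waits until data is available, then returns b'ready'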





The Operating System's I/O Subsystem

I/O Scheduling is determining the order in which to execute I/O requests so that overall system performance can be improved, devices can be accessed fairly among processes and the average I/O wait time for completion is minimized.

The Device Status Table is used by the OS to track I/O requests and maintain a queue. Information includes the device type, its address, and its idle or busy state.

Buffering is the use of memory to store data being transferred between a device and an application:

To cope with a speed mismatch.

To provide adaptations for devices that have different data-transfer sizes, and may arrive out of original sequence.

To support copy semantics for application I/O, thereby protecting the data against further update: a copy is moved into a protected area while the application may continue using the original location.



Caching is using a faster region of memory to hold copies of data that exhibit a high degree of locality.



Spooling is the operating system's process of intercepting all unit record input and output. Each application's print output is spooled to a separate disk file. The unit record work is virtualized for the application process. The OS can then serialize the queued print files to a system printer one at a time.



Device Reservation is the holding of devices to prevent deadlock.



Error Handling allows operating systems to compensate for transient I/O errors by attempting to retry the operation upon sensing a failure.



I/O Protection intervenes when a user process attempts to issue illegal I/O instructions which may disrupt the proper functioning of the system. All I/O instructions are privileged instructions. Programs in the problem state must use a System Service Call to request that the OS perform the I/O on its behalf.



OS maintains data structures to keep track of the state of each device.

I/O Subsystem Summary

Management of the namespace for files and devices.
Access control to files and devices.
Operation control
File system space allocation.
Device allocation
Buffering, caching and spooling.
I/O scheduling.
Device-status monitoring, error handling and failure recovering.
Device-driver configuration and initialization.



I/O Commands

The I/O interrupts may be presented at milestones so that the OS management may periodically check on the status of the work being performed.

Test for device

Start the I/O

Halt the I/O operation

Resume the I/O

Check on progress, (channel end, control unit end, device end, etc.), completion milestones.





Hardware-Software Interface

The operating system must be able to communicate with the I/O devices directly and also prevent the application programs from performing I/O directly.

The OS must be able to give commands to the I/O device. These include read and write, but also control commands such as SEEK, SUSPEND and HALT.

The device must be able to notify the OS when the operation has completed or has encountered an error.

Data must be transferred between memory and the I/O device. No direct I/O device to I/O device transfers.

Much of this processing must occur independent of the OS.





An I/O instruction is a command sequence to an I/O device that specifies:

the target device address

a small script or program describing the work to be performed,

the location in virtual memory of the read/write buffer,

and possibly the address of the next I/O instruction.







Transforming I/O Requests to Hardware Operations

Application process builds a parameter list for performing a read/write.

This is followed with a Supervisor Call to request OS servicing of the actual read/write.

The OS checks to see if that file fragment is already in real storage.

If not, the OS via a device driver, allocates buffers then schedules the DMA (or Channel) to perform the physical I/O.

The application is suspended pending return of the I/O.

After the I/O is completed an interrupt is presented to the processor.

The OS interrupt handler processes the event.

The OS schedules the application to be re-dispatched now that I/O request is completed.





I/O Performance

I/O places heavy demands on the CPU and the OS Scheduler, thereby greatly impacting system performance. Areas of concern:

Interrupt handling to service the I/O devices' signaling of milestones.

Context Switching among tasks requesting I/O operations.

Page Locking of frames containing I/O source/target buffers.

Memory Bus saturation with data transfers.

Handling of networked I/O



Remediation Techniques to reduce demand load:

Front-End processors

Terminal Controllers

Communication Concentrators

Use of the I/O Channel Processors

Interrupt handling instead of Polling



I/O interrupts may have a hierarchy and relative priority associated with them thereby relating each type of interruption with a different urgency, and order in which to be handled by the operating system.



The Interrupt mechanism eliminates the need for the processor to poll the device and allows the processor to focus on executing programs' CPU bound work.






