==> c/notes-02.txt <==

20180821
*******************************************************

Example: Hello, world!

// Lines starting with # are processed by the preprocessor
// No semicolon after an #include statement
// C is case-sensitive
#include <stdio.h>

// int: an integer type
int main(){
    printf("Hello, world!\n");  // sends the message to STDOUT
                                // \n = newline; \t = tab; \ is the escape character
    return 0;                   // returns the value to the OS
}

Java:
            javac              JVM
   x.java -------> X.class ---------> output

Perl/Python:
             perl/python
   x.pl/x.py ---------------> output

C:
        gcc                execute
   x.c ------------> a.out ------------> output
      compiled

GCC Compiler
---------------------------------
gcc invokes the C compiler; it translates C code into an executable

Example:
   gcc helloWorld.c
   ./a.out

Two-stage Compilation
----------------------
1) pre-process & compile:  gcc -c helloWorld.c   (produces helloWorld.o)
2) link:                   gcc -o hello helloWorld.o

Linking several modules example:
   gcc -c a.c   (produces a.o)
   gcc -c b.c   (produces b.o)
   gcc -o solveWorldHunger a.o b.o

Use the math library:
   gcc -o calc calc.c -lm

gcc flags:
----------------------
-o file   output file for object or executable
-Wall     display all warnings
-c        compile single module
-g        insert debugging code
-p        insert profiling code
-l file   library
-L dir    library path
-I dir    include path

Common errors using gcc:
----------------------
preprocessor: missing include file
parser:       syntax errors
linker:       missing libraries & undefined symbols

20180823
*******************************************************

#include <stdio.h>

int main(){
    printf("Go Cougars!\n");
    return 0;
}

gcc cougars.c -o csu

C Preprocessor (cpp)
*********************************
macro-processor
textual changes
Preprocessor directives start with #
gcc -E shows the output of the preprocessor

#if #else #endif
~~~~~~~~~~~~~~~~~~~~~~
#if expression
   code segment 1
#else
   code segment 2
#endif

Example:
#if OS == linux
#include <...>
#else
#include <...>
#endif

#ifdef
~~~~~~~~~~~~~~~~~~~~~~
#ifdef name
   code segment 1
#else
   code segment 2
#endif

#define
~~~~~~~~~~~~~~~~~~~~~~
#define name const-expression
#undef name

Example:
   #define MAXVALUE 100

   if( i < MAXVALUE ){ ... }

   becomes

   if( i < 100 ){ ... }

#include
~~~~~~~~~~~~~~~~~~~~~~
#include <filename.h>   // looks in the include directories
#include "filename.h"   // looks first in the current directory, then in the include directories

Include Guards
~~~~~~~~~~~~~~~~~~~~~~
Example: includeGuardsExample/main-noIncludeGuards.c
Example: includeGuardsExample/main.c

Comments
*********************************
1) Sections (maybe multi-line): /* any text until */
   Example:
      for( int i = 1 /* start at 1 instead of 0 b/c .... */; i < 100; i++){

      /*
       * FunctionName()
       * Description of function
       */
2) Rest of line: // C++ style comment

Numeric Data Types
*********************************
type        bytes   range
------      ------  ------------
char        1       -128 to 127                      (-2^7 to 2^7 - 1)
int         4       -2,147,483,648 to 2,147,483,647  (-2^31 to 2^31 - 1)
long long   8       -2^63 to 2^63 - 1
float       4       3.4E+/-38   (7 digits of precision)
double      8       1.7E+/-308  (15 digits of precision)

Sizes (and therefore ranges) vary slightly per system
unsigned versions of the whole-number types (changes the range to start at 0)

What's missing?
1. strings (C-style strings)
2.
boolean (use char or int, with 1 and 0)

Most Common:
~~~~~~~~~~~~~~~~~~~~~~
char
int
double

20180828
*******************************************************

Variables (Data objects)
---------------------------------
Uninitialized variables have "random" values
Always initialize your variables

Every variable in C has:
1) a name
2) a data type (size)
3) an address in memory (location)
4) visibility (which parts of the program can refer to it)
5) lifetime (the period during which it exists)

The size of an object [think variable for now] is determined when it's created:
1) global data objects at compile time (data)
2) local data objects at run-time (stack)
3) dynamic data objects by the programmer (heap)

Scoping
----------------------
Variables declared in a { } block are usable (visible) only within that block
Variables declared outside of all { } blocks are global. Use sparingly and with caution
Example: variablesExample.c.html

Exercise: What will the output of the following be (if any)?
   int solution;
   solution++;
   printf("solution: %d\n", solution);

Type Conversion
---------------------------------

Explicit Type Conversion:
----------------------
Syntax: (type)
Example:
   int count = 99;
   double cost = (double) count * 5.34;
Example: explicitTypeConversionExample.c.html

Implicit:
----------------------
Exercise: Discuss with your neighbor: what will the output of the following be (if any)?
   int i = 100000;
   char c;
   c = i;
   printf("i: %d, c: %d\n", i, c);

   A: i: 100000, c: -96

Exercise: What is the value of each of the following expressions?
   int age;
   A. age = 8
   B. -age
   C. 5 + 8
   D. 5 / 8
   E. 5 / 8.0
   F. (float)( 7/3 )
   G.
(float)( 7 ) / 3

Arrays
---------------------------------
Examples:
-----------
   int vec[100];     // holds 100 elements, indexed 0..99
   char str[30];     // holds 30 elements, indexed 0..29
   float m[12][31];  // holds 12*31 elements

On the stack

array[ index ]

   printf("%d\n", vec[ 999999 ]);  // out of bounds -- C does not check!

sizeof( )
----------------------
sizeof( data type ) returns the number of bytes for a data type

Example:
-----------
   sizeof( int )   // parentheses are required for type names

Example:
-----------
   int simpleNum = -1;
   sizeof( simpleNum );   // or: sizeof simpleNum

   int vec[100];
   sizeof( vec );

   int arraySize = sizeof( vec );
   int elementSize = sizeof( vec[0] );
   printf( "vec has %d elements\n", arraySize / elementSize );

Exercise: Write code to sum the first 50 elements of an int array named receiptTotals.

Pointers
---------------------------------

Review
~~~~~~~~~~~~~~~~~~~~~~
   int counter = 1;

Pointers are a data type
A pointer holds a memory address

Example:
-----------
   int* myIntPtr;
   double* ptrToGradient;

pointers: "variables that tell you where to find something else" (Dale & Weems)
Analogy: PO Boxes

   int secretBankAccountNumber = 456789;
   int* myIntPtr = secretBankAccountNumber;  // bad idea: assigns an int to a pointer

& operator
----------------------
& (address-of operator): returns the memory address of the first byte in memory of a variable

   myIntPtr = &secretBankAccountNumber;

Example:
   int answer = 42;
   int* answerPtr = &answer;
   printf("answer's value: %d\n", answer);
   printf("answerPtr's value (base-10): %ld\n", (long) answerPtr);
   printf("answerPtr's value (hexadecimal): %p\n", answerPtr);

   Memory Contents      Type  Name       Address
   ------------------   ----  ---------  ---------
   +----------------+
   | 0x7ffeea7fe2e8 |   int*  answerPtr  0x7ffeea7fe2e0
   +----------------+
   |       42       |   int   answer     0x7ffeea7fe2e8
   +----------------+

20180830
*******************************************************

Example: addressOfExample.c (or *.c.html)

* (dereference) operator
----------------------
* dereferences a variable (it interprets the contents of that variable as an address, of a
particular type)

Example:
   printf("*answerPtr's value: %d\n", *answerPtr);

Example: pointerExample.c (or *.c.html)

Exercise: Given the following code, determine the data type for each expression:
   int x;
   int* ptr;

   Expression  Data type  Possible Values
   ----------  ---------  ---------------
   x           int        123456789, -123456789, 23456789, 0, -1
   ptr         int*       0x0, 0x7fffebcafe34 (addresses)
   *ptr        int
   &ptr        int**      (addresses)
   &x          int*       (addresses)

Example:

   int x;    // at 0x10
   int* p;   // at 0x14
   int* q;   // at 0x18

   +------+
   | ???? | 0x10  int x
   +------+
   | ???? | 0x14  int* p
   +------+
   | ???? | 0x18  int* q
   +------+

   p = &x;

   +------+
   | ???? | 0x10  int x
   +------+
   | 0x10 | 0x14  int* p
   +------+
   | ???? | 0x18  int* q
   +------+

   *p = 7;

   +------+
   |  7   | 0x10  int x
   +------+
   | 0x10 | 0x14  int* p
   +------+
   | ???? | 0x18  int* q
   +------+

   q = p;

   +------+
   |  7   | 0x10  int x
   +------+
   | 0x10 | 0x14  int* p
   +------+
   | 0x10 | 0x18  int* q
   +------+

   *q = 11;

   +------+
   |  11  | 0x10  int x
   +------+
   | 0x10 | 0x14  int* p
   +------+
   | 0x10 | 0x18  int* q
   +------+

   p = NULL;

   +------+
   |  11  | 0x10  int x
   +------+
   | NULL | 0x14  int* p  (0)
   +------+
   | 0x10 | 0x18  int* q
   +------+

Exercise: Diagram the elements on the stack and write the output of the following
(adapted from HERE):

#include <stdio.h>

int main (){
    int numbers[5];
    int * p;
    p = numbers;      *p = 10;
    p++;              *p = 20;
    p = &numbers[2];  *p = 30;
    p = numbers + 3;  *p = 40;
    p = numbers;      *(p+4) = 50;
    for( int n = 0; n < 5; n++){
        printf("%d, ", numbers[n]);
    }
    return 0;
}

20180904
*******************************************************

Exercise: Write code for each of the following and make sure that it compiles:
a. Declare ex0 as a pointer to a char
b. Declare ex1 as a pointer to a pointer to a char
c. Declare ex2 as an array of 10 elements
d. Declare ex3 as an array of 10 elements that are pointers to chars
e.
Declare ex4 as a pointer to an array of 30 char elements
f. Declare ex5 as an array of 10 pointers to arrays of 500 char elements
g. Declare ex7 as a pointer to a const int
h. Write a function ex8 that takes a char pointer named pc, increments the address that
   it points to and returns that value
i. Write a function ex9 that takes an int pointer named num, and prints out what num
   points to and a message saying if what num points to is odd or even

C-style Strings
=================================
https://i.imgur.com/NwNPU.png (or https://imgur.com/NwNPU)

Example: cStyleStrings.c

Executing a C program: Command-line arguments
=================================
// Function parameters:
//   argc: number of arguments
//   argv: array (notice the []) of char*s (which are pointers to strings)
int main( int argc, char *argv[] ){
    // ...
    return 0;
}

Command-line Arguments Example
-----------
Syntax: <executable> [<arguments>]

Example: a.out 1 23 awesome

   argc: 4

   argv:
   +---+---+---+---+
   | . | . | . | . |
   +-+-+-+-+-+-+-+-+
     |   |   |   |
     |   |   |   v
     |   |   |  "awesome"
     |   |   v
     |   |  "23"
     |   v
     |  "1"
     v
    "a.out"

Libraries
---------------------------------
stdio.h    I/O
string.h   character strings
ctype.h    character types
math.h     numerical math functions
stdlib.h   C Standard General Utilities Library
           (ex: atof(), malloc(), NULL, rand(), exit(), etc.)

Memory Allocation
*********************************
1)
2)

A program's memory footprint:

   +--------------+
   |   Headers    |
   +--------------+
   |              |
   |     Heap     |  dynamically allocated memory
   |              |
   +--------------+
   |      |       |  grows down
   |      v       |
   |              |
   |      ^       |
   |      |       |  grows up
   +--------------+
   |              |
   |    Stack     |  "normal" variables & function calls
   |              |
   +--------------+

malloc()
~~~~~~~~~~~~~~~~~~~~~~
#include <stdlib.h>

Function prototype:
   void* malloc( size_t size );

The lifetime of the block is until the memory is freed, with free().
free()
~~~~~~~~~~~~~~~~~~~~~~
#include <stdlib.h>

Function prototype:
   void free( void *ptr );

Memory leak --
Example: memoryAllocationExample.c (or memoryAllocationExample.c.html)

Other function prototypes in stdlib.h:
   void* calloc( size_t n, size_t elementSize );  // allocates memory and initializes it to 0
   void* realloc( void* ptr, size_t size );       // returns a pointer to a larger or
                                                  // smaller memory block of size bytes

20180906
*******************************************************

Exercise: Add code as directed by the comments to pointersExercise.c and execute to see
the values update: pointersExercise-02.c

Control structures
---------------------------------
grouping: {...}   Same as Java

if statements
----------------------
Same as Java

Syntax:
   if( condition1 ){
       statements1
   } else if( condition2 ){
       statements2
   } else if( condition3 ){
       statements3
   } else {
       statementsN
   }

Be careful when omitting { }s:
Example: ifExample.c (or ifExample.c.html)

#include <stdio.h>

int main(){
    int x = 7, y = 42;
    if( x > 0 )
        printf("x > 0!\n");
    if( y > 0 )
        printf("x and y > 0!");
    return 0;
}

switch Statements
----------------------
Syntax:
   switch( expression ){
       case const1:
           statements1;
           break;
       case const2:
           statements2;
           break;
       ...
       default:
           statementsN;  // optional
   }

Loops
----------------------
while( condition ){ ... }
   Loop body executes 0 or more times, while condition is != 0

do{ ... } while( condition );
   Loop body executes 1 or more times, while condition is != 0

for( init; condition; update ){ ... }
   init is executed once
   Loop body executes 0 or more times, while condition is != 0
   update is executed after every loop body

Beware:
   int n = 0;
   if( n = 42 ){  // assignment, not comparison!
       printf("Something is very, very wrong\n");
   }

Structured data objects
*********************************

structs
~~~~~~~~~~~~~~~~~~~~~~
A developer-defined collection of other data types
Access the members of a struct using the member-of operator "."
Syntax: structVariable.field

Example: structsExample1.c (or structsExample1.c.html)
Example: structsExample2.c (or structsExample2.c.html)
Example: structsExample3.c (or structsExample3.c.html)
Example with pointers: structsAndPointersExample.c (or structsAndPointersExample.c.html)

Functions
*********************************

Function Prototypes
~~~~~~~~~~~~~~~~~~~~~~
A function prototype specifies the return type, name and argument list
A prototype is needed if a call to the function appears before the definition of the function
Often found in header (.h) files

We can pass arguments to functions
1) By value (e.g., int, double)
2) By reference (e.g., int*, double*)

We can return values from functions
1) By value (e.g., int, double)
2) By reference (e.g., int*, double*)

const arguments
-----------
Indicates that an argument can't be changed
Meaningful with pointers

Example:
   void exFunc( const char* str, const int x) {
       printf("x = %d\n", x);
       str[5] = '!';  // error: read-only variable is not assignable
   }

Note: C does NOT support overloading functions (although C++ does)
Example: overloadedFunctions.c

Programs with multiple files
~~~~~~~~~~~~~~~~~~~~~~
main.c
solutions.h
solutions.c

I/O
=================================
Formatted strings:
Example: printf("%d", age);

   c  - character
   d  - integer
   f  - floating-point number (float)
   s  - C-style string
   p  - pointer (hexadecimal)
   ld - long int
   lf - long floating-point number (double)

printf() - display to stdout (defaults to the terminal)
scanf()  - read from stdin (defaults to the keyboard)

Example:
-----------
madlibs-main.c
madlibs.h
madlibs.c

20180911
*******************************************************

Other library functions:
-----------
sscanf()   - scans a C-style string (instead of stdin)
asprintf() - formats a string (and allocates just enough memory for it)

File I/O
=================================
The stdio library (stdio.h) has types and functions for file streams

File stream type: FILE*

Input files:
   fopen()
   fgets() or
   fscanf() or getline()
   fclose()

Output files:
   fopen()
   fprintf()
   fclose()

Example: fileIOExample.c
scanf Example: processStockData.c

==> intro/notes-02.txt <==

Introduction to Operating Systems
*******************************************************
Most operating systems have 2 modes:
1) User mode
2) Kernel mode

Computer Hardware
=================================

Processor
-----------
1) Fetch
2) Decode
3) Execute

How many processes can a CPU execute at once? 1
Virtualization

Memory Hierarchy
-----------
                Access Time
                (seconds)     Capacity
   -----------  ------------  --------
   Registers    .000000001    <1 KB
   Cache        .000000002    4 MB     (L1, L2, L3)
   Main memory  .00000001     4-32 GB  (RAM)
   Disk         .01           1-4 TB

20180913
*******************************************************
Demo: mem-simple.c
Demo of threads code

Concurrency
Persistence

==> processes/notes-02.txt <==

20180913
*******************************************************

Processes
=================================
What is a process? A running program, including its machine state

What's part of a process's machine state?
   Process's state
   Open files
   Registers + program counter + stack pointer + frame pointer
   Address space - the memory for the process

With time sharing (multiprogramming), the OS switches between processes every 10-100
milliseconds
   Pseudo-parallelism
   True parallelism requires a multi-processor system

What is the API in the OS for a process:
-----------
Create
Destroy
Wait
Control
   + Suspend
   + Kill
   + Etc.
Status

Process Creation
~~~~~~~~~~~~~~~~~~~~~~
Processes are created when:
1) A user requests a new process (opens an app)
2) A running process wants help (e.g., using fork())
3) System initialization
4) Starting a batch job (only on mainframes)

OS steps to run a program
-----------
Load (from disk) the code and any static data (initialized variables) into memory
Allocate memory for the stack
Initialize the stack (with argc and argv)
Allocate some memory for the heap
Set up the file descriptors (stdin, stdout, stderr)
Start executing main

Process Hierarchies
-----------
UNIX: init starts running after booting, with a new process for each terminal, building
a process tree
Windows: No process hierarchy, but the parent process is given a special token (handle)
to control the child process

Process Termination
~~~~~~~~~~~~~~~~~~~~~~
Processes are terminated when:
1) Normal completion
   exit() or ExitProcess()
2) Error
3) Fatal error (involuntary)
4) Killed (involuntary)
   kill() or TerminateProcess()

Process States
~~~~~~~~~~~~~~~~~~~~~~
Main process states
   Blocked
   Running
   Ready

Possible process state transitions
   Exercise

Implementation of Processes
~~~~~~~~~~~~~~~~~~~~~~

Process Table
-----------
The OS tracks processes in the process table (aka the Process Control Block (PCB))
Stores a process's:
   + program counter
   + stack pointer
   + memory allocation
   + open files
   + scheduling information
   + etc.
20180918
*******************************************************

Process API
=================================

fork()
~~~~~~~~~~~~~~~~~~~~~~

UNIX:
-----------
Makes a clone of the current process (with the same memory image, environment strings
and open files)
Memory is mapped so that a copy only happens when the child writes to (changes) the memory
The child generally calls exec() and replaces the memory
2 steps: needed for redirecting standard output and error

Windows
-----------
1 step: CreateProcess() (so we have a different address space to begin with)

man fork

Example: fork-example-02.c

Possible orderings:
1) [pid: 33070] Original process
   [pid: 33070] Parent created process with PID 33071
   [pid: 33070] Final statement
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33071] Final statement

2) [pid: 33070] Original process
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33071] Final statement
   [pid: 33070] Parent created process with PID 33071
   [pid: 33070] Final statement

3) [pid: 33070] Original process
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33070] Parent created process with PID 33071
   [pid: 33071] Final statement
   [pid: 33070] Final statement

4) [pid: 33070] Original process
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33070] Parent created process with PID 33071
   [pid: 33070] Final statement
   [pid: 33071] Final statement

5) [pid: 33070] Original process
   [pid: 33070] Parent created process with PID 33071
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33070] Final statement
   [pid: 33071] Final statement

6) [pid: 33070] Original process
   [pid: 33070] Parent created process with PID 33071
   [pid: 33071] Hello, world (from the child process)!
   [pid: 33071] Final statement
   [pid: 33070] Final statement

wait()
~~~~~~~~~~~~~~~~~~~~~~
Suspends execution until a child process completes

exec()
~~~~~~~~~~~~~~~~~~~~~~
Overwrites the current process
Example: fork-exec-example.c

UNIX Commands
=================================

20180920
*******************************************************

UNIX Commands
=================================

Limited Direct Execution
=================================
Direct execution: Let the process run on the CPU
   The OS creates processes and transfers control to the starting point (main())

Problems with direct execution
   + Process could do something restricted
   + Process could run forever
   + Process could do something slow (I/O)

Solution: Limited Direct Execution
   OS and hardware mechanisms to regain control of the CPU

processes/modeSwitch.pdf

User & Kernel Modes
~~~~~~~~~~~~~~~~~~~~~~
Parts of the OS run in kernel mode (privileged mode)
User processes run in user mode (restricted mode)
   + User programs use system calls to request privileged actions (that require kernel mode)
     * System call: a function call implemented by the OS
       - Uses the trap instruction to change the privilege level

Context Switches
~~~~~~~~~~~~~~~~~~~~~~
The OS regains control by setting a timer
An interrupt handler processes the timer
The OS scheduler is run to determine which process should run

processes/contextSwitch.pdf

How long do context switches take? ~< 1 microsecond

20180925
*******************************************************

Scheduling
=================================
Typical process behaviors: CPU or I/O bound

Events that trigger the scheduler:
   + A new process is created (either the parent or the child is in the ready state)
   + The running process exits
   + The running process is blocked (I/O, semaphore, etc.)
   + I/O interrupt
   + Hardware clock interrupt (10-100/s)

Scheduling Metrics
~~~~~~~~~~~~~~~~~~~~~~

Turnaround Time
-----------
Turnaround Time = (time of completion) - (arrival time)

Response Time
-----------
Response Time = (time of first run) - (arrival time)

Scheduling Categories
~~~~~~~~~~~~~~~~~~~~~~

Non-preemptive
-----------
The scheduler chooses a process to run and lets it run as long as it likes

Preemptive
-----------
The scheduler moves the Running process to the Ready state on every clock interrupt

Scheduling Environments
-----------
1) Batch
   Non-preemptive, or preemptive with long run times
   Goals: throughput, turnaround time & CPU utilization
2) Interactive
   Preemptive
   Goals: response time
3) Real-time
   Preemption is sometimes not needed b/c generally it's not a general-purpose processor
   Goals: meeting deadlines, predictability

All-systems goals
-----------
Fairness
Policy enforcement
Balance

Scheduling in Batch Systems (non-preemptive)
~~~~~~~~~~~~~~~~~~~~~~

First-Come First-Served (FIFO)
-----------
Simplest

Shortest Job First
-----------
Best turnaround time

Scheduling in Interactive Systems
~~~~~~~~~~~~~~~~~~~~~~
Examples: desktops, servers, etc.
Shortest Time-to-Completion (aka shortest remaining time next)
-----------
The preemptive version of shortest job first

Round-Robin
-----------
The scheduler has a list of runnable processes
Each process gets run for a time interval, a quantum
Setting the quantum too short causes a lot of extra context switches
Setting the quantum too long is bad for interactive requests

Priority Scheduling
-----------
Processes can have different priorities
   + Example: streaming video/an on-line multi-player game versus a compile
The ready process with the highest priority is scheduled first
Low-priority jobs can starve, so adjustments need to be made over time
Often, jobs are grouped by priority and run round-robin

Others
-----------
Aging estimates for execution time
Guaranteed scheduling
Lottery scheduling
User-based scheduling

Scheduling in Real-Time Systems
~~~~~~~~~~~~~~~~~~~~~~
Hard real-time systems can NOT miss deadlines
Soft real-time systems can occasionally miss deadlines

Static: all information is known ahead of time
Dynamic: decisions are made at runtime

Multi-level Feedback Queues
~~~~~~~~~~~~~~~~~~~~~~
Prioritizes interactive jobs while still giving time to CPU-intense jobs
Balances response time and turnaround time

Show cpu-mlfq-eg.eps
Figure 8.1

Rules
-----------
Rule 1: Run a process from the highest priority queue that's not empty (if the queue has
        a single process)
Rule 2: Use Round Robin to run the processes in the highest priority queue that's not empty
Rule 3: When a job enters the system, it is placed at the highest priority (the topmost
        queue).
Rule 4: Once a job uses up its time allotment at a given level (regardless of how many
        times it has given up the CPU), its priority is reduced (i.e., it moves down one
        queue).
Rule 5: After some time period S, move all the jobs in the system to the topmost queue.
processes/scheduling-multiLevelFeedbackQueue-oneLongJob.pdf
processes/scheduling-multiLevelFeedbackQueue-addInteractiveProcess.pdf

Starvation - A process is not scheduled to run for a very long time
processes/scheduling-multiLevelFeedbackQueue-fig8.5.pdf

Gaming the scheduler -
processes/scheduling-multiLevelFeedbackQueue-fig8.6.pdf

Lottery Scheduling
~~~~~~~~~~~~~~~~~~~~~~
Goal: Proportional (fair) share
Approach:
   + Give each process lottery tickets
   + Higher priority gets more tickets
   + Whoever wins, runs (executes)

Scheduling Pseudocode:
-----------
   int counter = 0;
   int winner = getrandom(0, totaltickets);
   node_t *current = ticketsListHead;
   while( current ){
       counter += current->tickets;
       if( counter > winner ){
           break;
       }
       current = current->next;
   }
   // current is the winner

Example: (402 total tickets)
-----------
Process A:   1 ticket
Process B:   1 ticket
Process C: 100 tickets
Process D: 200 tickets
Process E: 100 tickets

winner = 50
winner = 350
winner = 0

20180927
*******************************************************

Address Spaces
=================================
Historical view of memory: Figure 13.1
One process runs at a time
Disadvantages:
1) One process runs at a time
2) Processes can overwrite the OS :(
3) Slow

Multiprogramming Goals
~~~~~~~~~~~~~~~~~~~~~~
Virtual RAM: the illusion of private memory

Transparency (Translucency)
-----------
Processes are not aware that memory is shared
Works regardless of the number and/or location of processes

Protection
-----------
Cannot corrupt the OS nor other processes
Privacy: cannot read the data of other processes

Efficiency
-----------
Do not waste memory resources (minimize fragmentation)
Minimize the overhead to provide virtualization

Figure 13.3
   +--------------+
   |     Code     |
   +--------------+
   |              |
   |     Heap     |  dynamically allocated memory
   |              |
   +--------------+
   |      |       |  grows down
   |      v       |
   |              |
   |      ^       |
   |      |       |  grows up
   +--------------+
   |              |
   |    Stack     |  "normal" variables & function calls
   |              |
   +--------------+

Figure 13.2

Address Space:
The memory locations that a process can access

20181002
*******************************************************

Memory API
=================================
Exercise: What are the function prototypes for malloc() and free()?
   void* malloc( size_t size );
   void free( void* ptr );

Common Errors
~~~~~~~~~~~~~~~~~~~~~~

Forgetting to allocate memory
-----------
Example:
   char* src = "Go Cougars!";
   char* dst;                             // bad idea!!!
   strncpy(dst, src, strlen( src ) + 1);

How do we fix this?
   char* src = "Go Cougars!";
   char* dst = NULL;
   dst = (char*)malloc(256);
   strncpy(dst, src, strlen( src ) + 1);

Not allocating enough memory
-----------
   char* src = "Go Cougars!";
   char dst[SIZE];
   strncpy(dst, src, SIZE + 99 );  // bad idea!!!

   int SIZE=16
   char* src = "Go Cougars!1234567890-=!@#$%^&*()_++QWERTYUIOP{}QWERTYUIOP{}ASDFGHJKL:"
               "asdfghjkl;'zxcvbnm,./ZXCVBNM<>?";
   char dst[SIZE];
   strncpy(dst, src, SIZE );  // bad idea!!! (may not NUL-terminate dst)
   strcpy(dst, src );         // bad idea!!!

Forgetting to initialize allocated memory
-----------
Example: A (dynamically allocated) linked-list node. Generally set to NULL

Forgetting to free memory
-----------
Memory leak
Especially dangerous in memory-intense and/or long-running programs

Freeing memory before you are done with it
-----------
Dangling pointer
Example:
   int* ptr = malloc( sizeof( int ));
   ...
   free( ptr );
   ptr = NULL;
   ...
   finalSolution( *ptr );  // dereferences freed/NULL memory

Freeing memory repeatedly
-----------
Double free

Calling free() incorrectly
-----------
Example:
   char dst[SIZE];
   ...
   free( dst );  // bad idea: dst was not allocated with malloc()

20181004
*******************************************************

Address Translation (Chapter 15)
=================================
3 options to manage RAM:
1) No memory abstraction
2) Swapping
3) Virtual Memory

Memory Management Option 1) No memory abstraction
~~~~~~~~~~~~~~~~~~~~~~
Mainframes before 1960 & personal computers before 1980
Simple
Example: mov Register1, 1000

Memory Management Option 2) Swapping
~~~~~~~~~~~~~~~~~~~~~~
To allow multiple processes to run at the same time (both in memory at the same time),
we need:
1) Relocation
2) Protection

Dynamic Relocation
-----------
Base & bound registers
When a process is loaded, its starting address is stored in the base register
Each time an address is used, the CPU hardware
1) Adds the starting address (base) to it
2) Checks that it's less than the value in the limit/bound register

Example: addressTranslation/baseAndBoundExample.pdf

Disadvantages:
   + Slows things down
   + Programs need to fit in memory

Swapping
~~~~~~~~~~~~~~~~~~~~~~
Copy all of a process into RAM, run it, and write it back to the disk
Typically, processes grow in size (pushing onto the stack & allocating memory on the heap)
What if more memory is needed?

Segmentation (Chapter 16)
~~~~~~~~~~~~~~~~~~~~~~
Base and bound pairs for each logical segment
   + code
   + heap
   + stack

Example:
   Segment  Base  Size
   -------  ----  ----
   Code     32K   2K
   Heap     34K   2K
   Stack    28K   2K

Translate virtual address 200
   In the code segment
   200 < 2K? Yes
   32K + 200

Translate 5000:
   Heap
   Offset = 5000 - the start of the heap = 5000 - 4K = 904
   904 < 2K? Yes
   physical address = base + offset = 34K + 904

Example: segmentation/hardwareAddressTranslation.c

Protection
~~~~~~~~~~~~~~~~~~~~~~

Fragmentation
~~~~~~~~~~~~~~~~~~~~~~

Internal Fragmentation
-----------
Unused/not needed memory within an address space

External Fragmentation
-----------
Chunks of free (available) memory are small and scattered throughout main memory instead
of being in large chunks.
Solutions:
1) Compact the free memory chunks to be contiguous
   Takes a lot of CPU cycles
   But processes are free to use allocated memory however they want
2) Free-list management algorithms
   a) Best-fit  - returns the smallest chunk that will work
   b) First-fit - returns the first chunk (from a list) that will work
   c) Many others

Free-Space Management (Chapter 17)
=================================
Purpose: Understand 1) low-level memory allocation and deallocation and 2) a cool
technique to store a linked list (of available chunks) without allocating memory

Example
~~~~~~~~~~~~~~~~~~~~~~
Address 0x10000  10400  11400  11440  12440  124C0  134C0  135C0  145C0
        +-------+------+-------+------+-------+------+-------+------+------+
Size    | 1024  | 4096 | 64    | 4096 | 128   | 4096 | 256   | 4096 | 32   |
Next    | 11400 |      | 12440 |      | 145C0 |      | 10000 |      | NULL |
        +-------+------+-------+------+-------+------+-------+------+------+
        (free)         (free)         (free)         (free)         (free)

Head: 134C0
Free List: head -> 256 -> 1024 -> 64 -> 128 -> 32 -> NULL

Free Space Management Exercises
~~~~~~~~~~~~~~~~~~~~~~
1. The OS manages the Free List (a data structure that tracks available chunks of memory
   in the heap). Discuss with your neighbor: is there a Non-free List, and if so, who
   manages it?

Splitting (Unallocated) Memory Chunks
~~~~~~~~~~~~~~~~~~~~~~
What does the OS do when process B calls malloc( 200 )?
It applies an allocator algorithm to find an acceptable memory chunk.

Assume that we've found an acceptable unallocated memory chunk, but it's not the exact
size of the request?
The OS (allocator algorithm) splits that chunk into allocated and unallocated parts.
(Really, it updates (resizes) the unallocated chunk's information and returns a pointer
to the requested portion.)

Let's say that we choose the 256 block to split.
Address 0x10000  10400  11400  11440  12440  124C0  134C0  13500  135C0  145C0
        +-------+------+-------+------+-------+------+-------+------+------+------+
Size    | 1024  | 4096 | 64    | 4096 | 128   | 4096 | 56    | 200  | 4096 | 32   |
Next    | 11400 |      | 12440 |      | 145C0 |      | 10000 |      |      | NULL |
        +-------+------+-------+------+-------+------+-------+------+------+------+
        (free)         (free)         (free)         (free)                 (free)

Head: 134C0
Free List: head -> 56 -> 1024 -> 64 -> 128 -> 32 -> NULL
(Not shown are the magic numbers in the allocated blocks)

Allocated header: (BEFORE the void* returned to the user)
-----------
struct header_t {
    int size;   // number of bytes (in the payload, or as seen by the user)
    int magic;  // for integrity checks
};
Figure 17.2

Free header: (EMBEDDED IN the unallocated chunk)
-----------
struct node_t {
    int size;             // available bytes (does not include this header)
    struct node_t* next;  // next free chunk
};

20181018
*******************************************************

heapInit(size){
    ...
    void* ptr = mmap(...)
    head = (node_t*) ptr;
    // head->size = size - sizeof(node_t);
    (*head).size = size - sizeof(node_t);
    head->next = NULL;

    /* After heapInit( 17888 )

    Address 0x10000
            +------------------------+
    Size    | 17888 - sizeof(node_t) |
    Next    | NULL                   |
            +------------------------+
            (free)
    */
}

Coalescing
~~~~~~~~~~~~~~~~~~~~~~
Combining adjacent unallocated memory chunks

free( 0x11440 )

Address 0x10000  10400  11400  11440  12440  124C0  134C0  13500  135C0  145C0
        +-------+------+-------+------+-------+------+-------+------+------+------+
Size    | 1024  | 4096 | 64    | 4096 | 128   | 4096 | 56    | 200  | 4096 | 32   |
Next    | 11400 |      | 12440 |      | 145C0 |      | 10000 |      |      | NULL |
        +-------+------+-------+------+-------+------+-------+------+------+------+
        (free)         (free)  (free) (free)         (free)                 (free)

After coalescing:

Address 0x10000  10400  11400  124C0  134C0  13500  135C0  145C0
+-------+------+-------+------+-------+------+------+------+ Size | 1024 | 4096 | 4288 | 4096 | 56 | 200 | 4096 | 32 | Next | 11400 | | 145C0 | | 10000 | | | NULL | +-------+------+-------+------+-------+------+------+------+ (free) (free) (free) (free) Head: 134C0 Free List: head -> 56 -> 1024 -> 4288 -> 32 -> NULL Four scenarios when deallocating memory Figure 17.3 Figure 17.4 Growing the Heap (What happens when the heap is full?) ~~~~~~~~~~~~~~~~~~~~~~ The OS reserves a large chunk of memory and adds it to the Free List Free List Management Algorithms ~~~~~~~~~~~~~~~~~~~~~~ How to choose which free chunk to give to the user? First fit - selects the first chunk that is large enough Next fit - selects the first chunk that is large enough, starting from where it left off during its last search Best fit - selects the smallest chunk that is large enough (after searching the entire list) Binary Buddy - Splits chunks in half Paging Intro ******************************************************* Segmentation allows segments to be different sizes -> making implementation difficult Paging allows for fixed-size chunks to be moved in and out of memory (RAM) + Simple + Flexible Memory Management Option: 3) Virtual Memory ================================= Allows processes to run even if only part of them is loaded in RAM It makes RAM appear larger than it really is MMU (Memory Management Unit) maps a virtual (memory) address to a physical (memory) address Virtual addresses are grouped together into a VIRTUAL PAGE Physical addresses are grouped together into a PAGE FRAME If a virtual address is needed that references a page frame that is not in RAM the CPU executes a TRAP instruction to let the OS handle it (PAGE FAULT) The OS chooses a page frame that is in RAM, writes it to disk (if necessary) Reads in the page requested Updates the mapping (in the page table) Re-starts the instruction Page Tables ~~~~~~~~~~~~~~~~~~~~~~ Map virtual pages to page frames Virtual address is
divided into: 1) Upper bits + Used for an index into the page table 2) Lower bits + Offset (into both the virtual page and the page frame) + Copied directly into the physical address How many entries are in the page table? Addressable memory size = (size of the page table) * (page frame size) Exercise: For 8 GB (2^33 bytes) of addressable memory with 4 KB (2^12 byte) page frames, the page table needs to have how many entries? Addressable memory size = (size of the page table) * (page frame size) (size of the page table) = (Addressable memory size) / (page frame size) (size of the page table) = 2^33 / 2^12 = 2^(33-12) = 2^21 entries Virtual Address +----------------+------------------------------------+ | upper bits | lower bits | +----------------+------------------------------------+ n <-------------------------------------------------> 0 x <--------------------------------> 0 Addressable memory = 2^n Page frame size = 2^x Page table size = 2^(n-x) Exercise: ----------- How big is the address space in the page table figure? Addressable memory = 2^16 How big is the page frame size? Page frame size = 2^12 How many entries in the page table? Page table size = 2^4 Exercise ----------- Given virtual address, 0xBEEF, what is its physical address?
0xBEEF = 1011 1110 1110 1111_2 Lower bits: 0xEEF = 1110 1110 1111_2 Page frame offset Upper bits: 0xB = 1011_2 = 11 Page frame number = 111_2 (the value from the page table at index 11 (from the diagram)) Physical address = 111 1110 1110 1111_2 20181023 ******************************************************* Page Table Entries: ----------- Page frame number: The upper bits for the physical address Present/absent (bit): Whether the page frame is in RAM Protection: Read, write and execution permissions for the page frame Modified (bit): 1 if the page frame is different in RAM than on disk, 0 otherwise (aka dirty bit) Referenced (bit): Page frame is currently referenced or not (during some time interval) Caching disabled: Allow for page frame to not be cached Page Size ---------------------- Small page size to minimize internal fragmentation Large page size to + minimize the number of disk transfers + utilize the TLB 4 KB is common TLB (Translation Lookaside Buffer) ----------- Cache of page table entries ~64 entries Multilevel Page Tables ----------- Goal: Avoid keeping all of the page tables in memory at the same time Addressable memory size = (size of the page table) * (page frame size) becomes Addressable memory size = (size of the page table #1 (page dictionary)) * (size of the page table #2) * (page frame size) Exercise: What's the addressable memory size for a 2-level page table with 1024 (2^10) entries each and 4 MB (2^22 bytes) page frames?
Inverted Page Tables ----------- Instead of indexing based on the virtual address, index by the page frame Use a hash function for faster lookup Swap Space ~~~~~~~~~~~~~~~~~~~~~~ Reserved space on disk for swapping out pages Page Fault ~~~~~~~~~~~~~~~~~~~~~~ Page Table Miss OS is in charge of swapping in the requested page (from disk) + On most systems, the disk location is actually stored in the page frame number part of the page table + It requests the page from disk + Update the page table entry + Update the TLB + Restart the instruction (that caused the page fault) Pseudocode for accessing memory ----------- Get virtual page number from the address (by masking and shifting) if( virtual page is in TLB && valid bit set ){ // TLB hit if( ! protection bits allow access ){ protection fault } // few instruction cycles ~ 0.000000001 seconds }else{ // TLB miss page table lookup with the virtual page number if( ! valid address ){ seg fault! }else{ if( ! protection bits allow access ){ protection fault }else if( page is in memory ){ // soft miss // 10-20 instruction cycles ~0.000000005 seconds }else{ // hard miss (page fault) get it from disk // ~.01 seconds } } evict a TLB entry add page table entry as a new TLB entry } Page Replacement ======================================================= RAM is a cache for disk When a request for a cache entry is not found: cache miss, otherwise, cache hit Average Memory Access Time (AMAT) ~~~~~~~~~~~~~~~~~~~~~~ AMAT = T_M + (P_Miss * T_D) T_M = cost to access memory (0.00000001 seconds) T_D = cost to access disk (0.01 seconds) P_Miss = probability of a cache miss Example (95% hit rate): P_Miss = 0.05 (or 5%) AMAT = 0.00000001 + (0.05 * 0.01) = 0.00050001 seconds Example (99.5% hit rate): P_Miss = 0.005 (or 0.5%) AMAT = 0.00000001 + (0.005 * 0.01) = 0.000050001 seconds Page Replacement Algorithms ~~~~~~~~~~~~~~~~~~~~~~ Several page replacement algorithms exist: Optimal Not Recently Used First-In, First-Out (FIFO) Second-Chance Least
Recently Used (LRU) Least Frequently Used (LFU) Clock Aging ... In general: + pages not used recently + Un-modified pages (so that we don't have to wait forever for the contents to be written to disk) Each clock interrupt, clear (set to 0) the reference/use bit for all the entries in the page table Set the reference/use bit each time it's accessed Set the modified/dirty bit for each write to that page Working Set ----------- The set of pages that the process is currently using thrashing Optimal ----------- 1) How the algorithm works Evict the page that will be referenced the furthest in the future Record, for each page, the instruction count of its next reference Evict the page with the highest instruction count 2) How easy or hard is it to implement Impossible to implement 3) Is the execution fast or slow Cannot be executed (requires knowledge of the future) 4) What is its page fault performance Optimal :) Not possible to implement, only useful for comparison (as a benchmark) Exercise: ----------- In small groups, prepare a 3-minute explanation about your assigned page replacement algorithm. Include: 1) How the algorithm works 2) How easy or hard is it to implement 3) Is the execution fast or slow 4) What is its page fault performance Groups: switch( Last digit of your 909-number % 4 ){ case 0: FIFO; break; case 1: Random; break; case 2: LRU; break; case 3: Clock; break; } FIFO (First-In, First-Out) ----------- 1) First pages referenced are the first ones evicted 2) Easy, use a queue 3) Fast, tail queue 4) Terrible / horrible / the worst (doesn't take into account locality) Random ----------- 1) Picks a random page 2) Easy 3) Fast 4) Random LRU (Least Recently Used) ----------- 1) Keep a queue of pages, based on when it was last used. Dequeue to choose which page to evict.
2) Easy 3) Can have a long queue 4) Decent Clock ----------- 1) Sees the page table as a circular list, looks if reference/used bit set; starts up again where it left off (an approximation of LRU) 2) Implementation: About the same as LRU, but no queue 3) Execution: Faster than LRU 4) About the same as LRU Summary of Page Replacement Algorithms ----------- Optimal: FIFO: Random: LRU: Clock: Local vs. Global Allocation Policies ~~~~~~~~~~~~~~~~~~~~~~ Local: Choose a page to evict only from the pages of the current process Global: Choose a page to evict from all of the pages Global does better, especially if the working set grows and/or shrinks Cleaning Policy ~~~~~~~~~~~~~~~~~~~~~~ For better performance (not having to wait for a dirty page to be written to disk), have a paging daemon that applies a page replacement algorithm to ensure a pool of free pages Implementation Issues ~~~~~~~~~~~~~~~~~~~~~~ The OS is involved with paging: 1) process creation 2) process execution 3) page faults 4) process termination The OS can "pin" a page in memory to prevent it from being evicted. + page dictionary + kernel's page + pages that are in the middle of I/O transfers Concurrency ======================================================= Processing at the same time Threads --------------------------------- Process Review ~~~~~~~~~~~~~~~~~~~~~~ OS tracks processes in the process table (aka Process Control Block (PCB)) Per-process: ----------- Address space Global variables Open files Child processes Pending alarms Signals and signal handlers Scheduling information Per-thread: Thread Control Block (TCB) ----------- Program counter (location of memory for the next instruction) Registers Stack A process can have multiple threads Multiple threads of the same process share the process's address space, but they have different states Multi-threaded Address Space: +--------------+ | Code 1 & 2 | (separate program counters) +--------------+ | Heap 1 & 2 | +--------------+ | | ... (free) ... 
| | +--------------+ | Stack 2 | +--------------+ | | ... (free) ... | | +--------------+ | Stack 1 | +--------------+ Why have multiple threads in a single process? + an application doing several activities "simultaneously" and some of them can be blocked (both CPU and I/O) + parallelism (multi-core environment) + Creating a thread is 10-100 times faster than creating a process Threads, not processes, are what a CPU actually executes Lightweight processes Protection - we don't worry about protection because they're considered friendly - non-combative Goal: Have a set of threads that can work closely and effectively together to perform a task Example: threads/helloWorld-pthreads.c How many threads are running? 11 (created 10 + the main/parent thread) Changing Single-threaded code to Multi-threaded ~~~~~~~~~~~~~~~~~~~~~~ Issues: 1) Global variables Solution: routines for creating, reading and writing to "private" global variables 2) Non-reentrant library procedures Non-reentrant means they (probably) can't handle multiple simultaneous calls Example: 3 general issues with InterProcess Communication (IPC): 1) Communication 2) Race conditions 3) Dependencies (Process 1 produces A, and process 2 needs A) Race Conditions ~~~~~~~~~~~~~~~~~~~~~~ Example 1: What should happen: ----------- Task A Task B Memory Value Read value 0 Flip value 1 Read value 1 Flip value 0 Race condition manifested: ----------- Task A Task B Memory Value Read value 0 Read value 0 Flip value 1 Flip value 1 Example 2: Demo with 2 volunteers Example: threads/raceConditionsExample.c /* Description: Race conditions example * Some material inspired from https://computing.llnl.gov/tutorials/pthreads/#Joining * Date: September 19, 2017 * Author: Hyrum Carroll */ #include <stdio.h> #include <stdlib.h> #include <pthread.h> const int NUM_THREADS = 10; const long INCREMENT_VALUE = 10000000; int sum = 0; void* incrementSum(void* tID){ for( int i = 0; i < INCREMENT_VALUE; i++){ sum = sum + 1; } printf( "Thread ID %lu done (sum = %d)\n", (unsigned long)
tID, sum); pthread_exit( NULL ); } int main(int argc, char* argv[]){ /* Create NUM_THREADS then print the sum of their increments */ pthread_t thread[ NUM_THREADS ]; // child thread structures int status = -1; // return value for( unsigned long i = 0; i < NUM_THREADS; i++){ status = pthread_create( &thread[i], NULL, incrementSum, (void*) i); if( status != 0) { /* error occurred */ fprintf(stderr, "Error creating thread index %lu! (status: %d)\n", i, status); exit(-1); } } printf( "%ld * %d = %d\n", INCREMENT_VALUE, NUM_THREADS, sum); pthread_exit(NULL); // the last thing main() will do return 0; } Critical Regions ---------------------- Solution to race conditions for shared memory / files / etc.: prohibit multiple processes from reading and writing to/from it at the same time Need MUTUAL EXCLUSION (two events cannot happen at the same time) Critical regions are the areas of code that involve shared resources Make it so that only 1 thread is in the critical region at a time 20181030 ******************************************************* Pthreads API ================================= Pthreads (POSIX Threads) ----------- man pthread_create man pthread_exit man pthread_join Example: threads/raceConditionsExample-join.c pthread_create() ~~~~~~~~~~~~~~~~~~~~~~ pthread_exit() ~~~~~~~~~~~~~~~~~~~~~~ pthread_join() ~~~~~~~~~~~~~~~~~~~~~~ Demo: threads/raceConditionsExample-join.c Implementations to achieve mutual exclusion ---------------------- Priority inversion problem: Assume the scheduler has the rule that high priority processes always take precedence over lower priority processes Producer-Consumer Problem Analogy: 2 people doing dishes: 1 washes the other rinses / loads into the dishwasher Demo: threads/producerConsumer.c Locks ================================= Locks provide mutual exclusion How to use a lock: ----------- 1) Obtain a lock before entering a critical region 2) Execute the critical region (the least amount necessary) 3) Release the lock How NOT to use
locks: threads/raceConditionsExample-join-lockVariable.c Pthreads Locks ~~~~~~~~~~~~~~~~~~~~~~ Initialization or Creation & Deletion ----------- pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; or int rc = pthread_mutex_init( &lock, NULL ); assert( rc == 0 ); ... rc = pthread_mutex_destroy( &lock ); assert( rc == 0 ); Locking ----------- Option 1 - Blocking: int rc = pthread_mutex_lock( &lock ); assert( rc == 0 ); Option 2 - Non-blocking: int rc = pthread_mutex_trylock( &lock ); assert( rc == 0 ); Unlocking ----------- int rc = pthread_mutex_unlock( &lock ); assert( rc == 0 ); Example: threads/raceConditionsExample-join-pthreadsLock.c Evaluating Locks ~~~~~~~~~~~~~~~~~~~~~~ 1) Correctness - does it provide mutual exclusion? 2) Fairness - could a thread starve while trying to obtain the lock? 3) Performance - how many cycles are typically required? Lock Implementations ~~~~~~~~~~~~~~~~~~~~~~ Disable interrupts ----------- 1) slow 2) too much trust for a process 3) doesn't work on multiprocessor Spin-wait (How NOT to example above) ----------- 1) slow 2) not correct - interrupt right after the while loop Atomic Test & Set with Yield ----------- Example: int flag = 0; // global variable; 1 in critical region; 0 available void init(){ flag = 0; // available } void lock(){ while( TestAndSet( &flag, 1 ) == 1){ yield(); // suspend me } } void unlock(){ flag = 0; } Queues ----------- Add queue to track the order of requests 20181108 ******************************************************* Semaphores ================================= Analogy: Street light protecting the (critical region of the) intersection Definition: An object with an integer value Two main routines: 1) sem_wait() 2) sem_post() Implemented with system calls Kernel disables interrupts while executing these system calls Initialization: ~~~~~~~~~~~~~~~~~~~~~~ Function prototype: 1) return type 2) function 3) parameter list types 4) ; // it does not have the body int sem_init( sem_t *sem, int pshared, 
unsigned int value); ----------- #include <semaphore.h> Link with -pthread sem: semaphore pshared: 0 for shared just among threads in a process; 1 for between processes value: initial value of sem (common values: 0, 1, MAX) Example: ----------- sem_t sem; // declaration sem_init( &sem, 0, 1); // initialize sem to the value 1 (0 means threads in this process) int sem_wait( sem_t *sem ); ~~~~~~~~~~~~~~~~~~~~~~ P() or down() or sem_wait() Pseudocode ----------- sem -= 1 if( sem < 0){ block / suspend // the magnitude of the (negative) value of sem equals the number of waiting threads } int sem_post( sem_t *sem ); ~~~~~~~~~~~~~~~~~~~~~~ V() or up() or sem_post() Pseudocode ----------- sem += 1 if( sem <= 0){ unblock one of the waiting threads } Two types of semaphores: 1) For mutual exclusion (binary semaphores or locks or mutexes) 2) For synchronization of events (counting semaphores) Binary Semaphores (Locks or Mutexes) ~~~~~~~~~~~~~~~~~~~~~~ Binary semaphore is a special semaphore with just 2 states: locked and unlocked (0 and 1) Used to protect critical sections Checking, setting and possibly going to sleep are atomic Typical usage of semaphores to protect a critical section: sem_t sem; sem_init( &sem, 0, 1); while( TRUE ){ ... sem_wait( &sem ) // critical section sem_post( &sem ) ... } Counting Semaphore ~~~~~~~~~~~~~~~~~~~~~~ Initialize semaphore to the count available and then start Producer Consumer Problem: threads/producerConsumer-semaphoreA.c threads/producerConsumer-semaphoreB.c threads/producerConsumer-semaphoreC.c threads/producerConsumer-semaphore.c 20181113 ******************************************************* Reader Writer Locks ~~~~~~~~~~~~~~~~~~~~~~ How many threads can successfully read from a file or database at a time? N How many threads can successfully write to a file or database at a time? 1 threads/readersWriters-02.c ----------- /* Strategy: If a reader has the semaphore/lock for the critical section, allow other readers to enter the critical section.
If a writer has the lock, don't let anyone else in the critical section. Source: Operating Systems: Three Easy Pieces (now version 1.00), Chapter 31, page 11. */ typedef struct { sem_t lock; // binary semaphore (basic lock) sem_t writelock; // used to allow ONE writer or MANY readers int readers; // count of readers reading in critical section } rwlock_t; void rwlock_init(rwlock_t *rw){ rw->readers = 0; sem_init(&rw->lock, 0, 1); sem_init(&rw->writelock, 0, 1); } void rwlock_acquire_writelock(rwlock_t *rw){ sem_wait( &(rw->writelock) ); //sem_wait( &((*rw).writelock) ); } void rwlock_release_writelock(rwlock_t *rw){ sem_post( &(rw->writelock) ); } void rwlock_acquire_readlock(rwlock_t *rw){ sem_wait( &(rw->lock) ); rw->readers++; if( rw->readers == 1 ){ // first, so prevent a writer from getting the lock sem_wait( &(rw->writelock) ); } sem_post( &(rw->lock) ); } void rwlock_release_readlock(rwlock_t *rw){ sem_wait( &(rw->lock) ); rw->readers--; if( rw->readers == 0 ){ // last reader is done, so allow a writer to get the lock sem_post( &(rw->writelock) ); } sem_post( &(rw->lock) ); } Dining Philosophers ~~~~~~~~~~~~~~~~~~~~~~ Classic concurrency problem Introduced and solved by Dijkstra while( 1 ){ think(); getForks(); eat(); putForks(); } Demo: ----------- Instructions: 1) Take left fork 2) Take right fork 3) Eat 4) Put left fork on table 5) Put right fork on table 6) Repeat at step #1 Solution? Semaphore Implementation ~~~~~~~~~~~~~~~~~~~~~~ Requires locks and condition variables :) Common Concurrency Problems ================================= Deadlock ~~~~~~~~~~~~~~~~~~~~~~ Operating systems have resources that more than one process would like to use at the same time. Examples: Printers Kernel data structures Database records etc. What is deadlock? With two locks: one process holds the first lock and wants the second, while the other process holds the second lock and wants the first.
A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause (Modern Operating Systems, 4th Ed.) Example: Two processes want to use a scanner and a Blu-ray recorder. ProcessA has (exclusive) access to the scanner and wants (exclusive) access to the Blu-ray recorder, but processB has (exclusive) access to the Blu-ray recorder and wants (exclusive) access to the scanner. General steps to using a resource: 1) Request the resource 2) Use the resource 3) Release the resource If the resource is not available, the process must either 1) wait (be blocked) 2) wait (poll to check availability) Remember the typical usage of semaphores to protect a critical section: ... sem_wait( sem ) // critical section (use the resource) sem_post( sem ) ... What if there are 2 resources? ... sem_wait( sem1 ) sem_wait( sem2 ) // critical section (use the resources) sem_post( sem2 ) sem_post( sem1 ) ... Examples: concurrency/twoResourcesExampleA.c concurrency/twoResourcesExampleB.c Kinds of deadlock: + Resource deadlock + Communication deadlock Conditions for Resource Deadlocks ---------------------- 4 Conditions for deadlock (Coffman et al.
(1971)): 1) Mutual Exclusion: Each resource is either assigned to a process or is available 2) Hold-and-wait: Processes that already have a resource can request new resources 3) No-preemption: Resources cannot be preempted from the process that they're assigned to 4) Circular wait: There's a circular list of processes with each process waiting on the resource of another process Deadlock Modeling ---------------------- Use directed graphs to model deadlock: Square vertices are resources Circle vertices are processes Edge from a resource to a process indicates that the process (exclusively) has that resource Edge from a process to a resource indicates that the process has requested that resource 4 strategies for dealing with deadlocks: 1) Ignore the problem 2) Detection and recovery 3) Dynamic avoidance 4) Prevention The Ostrich Algorithm --------------------------------- Simplest and most common approach: don't do anything If the probability of a deadlock occurring is very low, then why do anything? Deadlock Detection and Recovery --------------------------------- Detection with One Resource of Each Type ---------------------- Construct a resource graph and check for a cycle Example: ----------- Process A holds R and wants S Process B (holds nothing) and wants T Process C (holds nothing) and wants S Process D holds U and wants S & T Process E holds T and wants V Process F holds W and wants S Process G holds V and wants U Detecting cycles ----------- Depth-First Search starting from each node + Mark edges as visited Terminate early if a node is visited twice (a cycle is detected) Detection with Multiple Resources of Each Type ---------------------- 4 data structures are necessary: E: Existing resources vector (an index for each resource) A: Available resource vector (an index for each resource) C: Current allocation matrix (rows are processes, columns are resources) R: Request matrix (rows are processes, columns are resources) When to check for deadlock?
+ Every resource request? + Every k minutes? + Low CPU utilization? Recovery from Deadlock ---------------------- Recovery through Preemption ----------- (Forcibly) Take away a resource Recovery through Rollback ----------- Periodically checkpoint processes (record their state including resource requests) Roll back a process that has a needed resource Recovery through Killing Process(es) ----------- Easiest (except for cleaning up the mess) Hopefully the cycle will be broken Not all processes can be restarted Deadlock Avoidance --------------------------------- Safe State ----------- Some scheduling order exists that lets all processes complete Unsafe State ----------- No guarantee that a scheduling order exists that lets all processes complete Not deadlock Resource trajectory for two processes: A & B + Shaded regions are impossible (due to mutual exclusion) (both processes need the same resource) + Paths that include impossible regions will deadlock 20181115 ******************************************************* Deadlock Prevention --------------------------------- 4 Conditions for deadlock (Coffman et al.
(1971)): 1) Mutual exclusion: Each resource is either assigned to a process or is available 2) Hold-and-wait: Processes that already have a resource can request new resources 3) No-preemption: Resources cannot be preempted from the process that they're assigned to 4) Circular wait: There's a circular list of processes with each process waiting on the resource of another process Mutual Exclusion ---------------------- Avoid assigning a resource unless absolutely necessary (or at least to as few processes as possible) + Data: Make data read only + Printers: Allow a process to add its print request to a job pool (then only the print daemon requests the actual printer) Hold & Wait ---------------------- Prevent processes from holding resources while waiting for others Have jobs list all resources necessary up front + Don't start a job until everything can be allocated + Allocates resources for longer than is necessary, but avoids deadlock No-preemption ---------------------- Allow resources to be taken away Good luck Circular Wait ---------------------- Steps: 1) Number all resources 2) Only allocate a resource to a process if its number is greater than the numbers of all of that process's allocated resources Other Issues --------------------------------- Communication Deadlock ---------------------- Process A sends a message to process B and blocks until it replies Process B sends the reply, but it is lost. Is this deadlock? Yes Solution: Timeouts - if the expected action has not occurred by a predetermined time, then assume that it was lost or that it's never going to happen Livelock ---------------------- A process is not deadlocked, but it can't get any real work done Example: Limited number of open files. Starvation ---------------------- Starvation occurs when a system is busy and a request for a resource continually goes unfulfilled due to the allocation policy. Example: CPU/printer for short jobs first. With a constant stream of small jobs, a long job will never run.
Solution: First-Come, First-Served allocation Input and Output ++++++++++++++++++++++++++++++++++++++++++++ I/O Hardware ================================= I/O Devices ~~~~~~~~~~~~~~~~~~~~~~ Block devices: store information in a block Character devices: send/receive a stream of data Device Controllers ~~~~~~~~~~~~~~~~~~~~~~ Memory-Mapped I/O ~~~~~~~~~~~~~~~~~~~~~~ Controllers have some registers (and many have a buffer) that the CPU can write to and read from 2 Options 1) I/O Ports Requires assembly instructions 2) Memory-mapped I/O Add the controller's registers to the memory (address) space Direct Memory Access (DMA) ~~~~~~~~~~~~~~~~~~~~~~ Most systems have a single DMA controller CPU delegates memory transfers to the DMA Steps for a memory transfer using DMA: 1) CPU programs the DMA (what to transfer and where) CPU issues a read command to disk 2) DMA issues a read command to the device 3) Device writes to memory 4) Device sends acknowledgment signal to DMA If more bytes, change where and how much and repeat 2) & 3) Interrupts ~~~~~~~~~~~~~~~~~~~~~~ Steps to process an interrupt 1) Device signals to the interrupt controller 2) Interrupt controller interrupts CPU Save state of previously executing process (registers) CPU uses the interrupt number to look-up interrupt-service procedure in the interrupt vector 3) CPU executes the interrupt-service procedure Interrupt-Service procedure sends acknowledgment to the controller Principles of I/O Software ================================= Goals: Device independence Error handling Simulate synchronous I/O Buffering 3 Ways to perform I/O: 1) Programmed I/O 2) Interrupt-driven I/O 3) I/O using DMA 1) Programmed I/O ~~~~~~~~~~~~~~~~~~~~~~ CPU does all the work User code calls a system call to write to a device CPU copies data from user space into kernel space Kernel waits until the device is available Writes a character Kernel waits until the device is available Writes a character Kernel waits until the device is available Writes a
character Kernel waits until the device is available Writes a character Kernel waits until the device is available Writes a character ... return control to the user code Advantage: Simple Disadvantage: CPU is busy during the entire time 2) Interrupt-driven I/O ~~~~~~~~~~~~~~~~~~~~~~ Instead of continuously polling the device to see if it's ready Have the CPU do other work, waiting for the device to interrupt it Advantage: Higher CPU utilization Disadvantage: Frequent interrupts 3) I/O using DMA ~~~~~~~~~~~~~~~~~~~~~~ Instead of frequent interrupts, use DMA Advantage: Single interrupt (when completed) Disadvantage: Only if the CPU didn't have other work I/O Software Layers ================================= 4 layers: ----------- User-level I/O software Device-independent OS software Device drivers Interrupt handlers (hardware) Interrupt Handlers ~~~~~~~~~~~~~~~~~~~~~~ Deal with interrupts, because they're complicated Device Drivers ~~~~~~~~~~~~~~~~~~~~~~ Code to control the device Device drivers typically: Check commands sent to them Then, check if the device is ready Then wait for the controller Reentrant (able to handle being called a second time, before it's done with the first call) Device-independent OS software ~~~~~~~~~~~~~~~~~~~~~~ Uniform interfacing for device drivers ----------- Make all interactions with I/O devices and drivers the same OS defines a set of functions for drivers Buffers ----------- Prevent data loss due to real time constraints Buffer in kernel in case the user's page is not in memory Use a double buffer or circular buffer to prevent data loss while transferring the first buffer Too much buffering can slow down transfers User-level I/O software ~~~~~~~~~~~~~~~~~~~~~~ 20181127 ******************************************************* Hard Disk Drives ++++++++++++++++++++++++++++++++++++++++++++ 7200 rotations/minute: (7200 rotations / 1 minute) * (1 minute / 60 seconds) = 120 rotations / second Hard Disk Parts
================================= Tracks Sectors Cylinders Platters Read/write heads Think of a hard disk as an array of sectors This is the address space of the drive Hard Disk Operations ================================= Rotational Delay ~~~~~~~~~~~~~~~~~~~~~~ Waiting for the disk to rotate so that the requested sector is under the read/write head (.5 rotation) * (1 second / 120 rotations) = 4.1 ms Seek Time ~~~~~~~~~~~~~~~~~~~~~~ The amount of time to move the read/write head to the new track Track skew Transfer Time ~~~~~~~~~~~~~~~~~~~~~~ Reading or writing the data to or from the disk I/O Timing ================================= T_I/O = T_seek + T_rotation + T_transfer Which is usually the smallest for 4 KB requests? Which is usually the largest? 1) T_seek 2) T_rotation 3) T_transfer transfer rate = Size_transfer / T_I/O Example Drives ~~~~~~~~~~~~~~~~~~~~~~ Cheetah 15K.5 Barracuda ------------- --------- Capacity 300 GB 1 TB RPM 15,000 7,200 Avg Seek 4 ms 9 ms Max Trans 125 MB/s 105 MB/s Platters 4 4 Cache 16 MB 16/32 MB Interface SCSI SATA Random Workload ~~~~~~~~~~~~~~~~~~~~~~ 4 KB blocks Cheetah ----------- T_I/O = T_seek + T_rotation + T_transfer T_seek: 4 ms (from the manufacturer) T_rotational: (.5 rotation) * (1 minute / 15000 rotations) * (60 seconds / 1 minute) * (1000 milliseconds / 1 second) = 2 ms T_transfer: 4 KB * (1 second / 125 MB) * (1 MB / 1024 KB) * (1000 ms / 1 second) = 0.03125 ms T_I/O = 4 + 2 + 0.03 ms ~= 6 ms Transfer rate = 4 KB / 6 ms * 1 MB / 1024 KB * 1000 ms / 1 second = .65 MB/s Barracuda ----------- T_I/O = T_seek + T_rotation + T_transfer = 9 ms + 4 ms + 0.037 ms ~= 13 ms Sequential Workload (100 MB transfers) ~~~~~~~~~~~~~~~~~~~~~~ Cheetah ----------- T_I/O = T_seek + T_rotation + T_transfer T_seek: 4 ms T_rotational: 2 ms T_transfer: 800 ms T_I/O = 806 ms Transfer rate = 124 MB/s Barracuda ----------- T_I/O = T_seek + T_rotation + T_transfer Disk Scheduling
=================================
Goals:
1) Fast access time
   Minimize seek time
   Seek time ~= seek distance
2) High disk bandwidth
   Disk bandwidth is the total number of bytes transferred, divided
   by the total time between the first request for service and the
   completion of the last transfer

First-Come, First-Served (FCFS)
~~~~~~~~~~~~~~~~~~~~~~

Shortest Seek Time First (SSTF)
~~~~~~~~~~~~~~~~~~~~~~
Re-orders requests to minimize seek time from the current head position

C-LOOK
~~~~~~~~~~~~~~~~~~~~~~
Sweep from one end toward the other, servicing requests along the way
(only going as far as the most extreme request), then jump back to the
beginning without servicing requests on the return trip

Selecting a Disk-Scheduling Algorithm
=================================
SSTF is common
Heavy disk loads: C-LOOK

20181129 *******************************************************
Redundant Arrays of Inexpensive Disks (RAID)
++++++++++++++++++++++++++++++++++++++++++++
Use multiple disk drives to provide reliability (by creating
redundancy) and/or improve performance

RAID is arranged into standard levels (0 through 6):

RAID 0
=================================
No redundancy
Stripe chunks across all disks
Maximum performance
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_0.svg

RAID 1
=================================
Mirroring
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_1.svg

RAID 2
=================================
Stripe data by bits with error correction bits stored on extra drives
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID2_arch.svg

RAID 3
=================================
Stripe data by bytes with one parity drive
Good for long sequential reads or writes
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_3.svg

RAID 4
=================================
Stripe data by blocks with one parity drive
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_4.svg

RAID 5
=================================
Stripe data by blocks with one
parity block distributed among the drives
If one drive fails, the data is still usable
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_5.svg

RAID 6
=================================
Stripe data by blocks with two parity blocks distributed among the drives
If two drives fail, the data is still usable
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_6.svg

File Systems
++++++++++++++++++++++++++++++++++++++++++++
3 goals for file systems:
1) Store very large amounts of information
2) Information must survive after the process ends
3) Multiple processes need to be able to use it

Magnetic disks and solid-state drives provide 2 basic functions:
1) Read block k
2) Write block k

Files
---------------------------------
Logical units of information created by a process

File Naming
----------------------

File Structure
----------------------
UNIX and Windows represent a file as an unstructured sequence of bytes
Some large mainframes give each record a key, and a file is composed
of a linked list of records

File Types
----------------------
Regular files
Directories (system files)

ASCII regular files
-----------
Lines end in \r\n (Windows) or \n (UNIX)

Binary regular files
-----------
"Magic number" prevents a non-executable file from being executed

File Access
----------------------
Sequential access: Read all of the bytes in order (can rewind)
Random access: Read bytes in any order

File Attributes
----------------------
Each file has a name and data
The file system also tracks file attributes (or metadata):
+ file modification time
+ file size
+ permissions
+ etc.
Common File System Calls
----------------------
create
delete
open
close
read
write
seek
    + Changes the file pointer for random access
get
    + get attributes
set
    + set attributes
rename

Example: files/copy.c (Program to copy a file, illustrating several
system calls)

Directories
---------------------------------
Hierarchical Directory System
----------------------

Path Names
----------------------
1) Absolute path names: Start at the root directory
   (separator -- UNIX: /, Windows: \)
2) Relative path names: Start from the working/current directory

Special directory names:
-----------
.   Current directory
..  Parent directory

Examples:
./a.out
    (executes the a.out that is in the working directory; sometimes
    necessary because of PATH restrictions)
cp /tmp/studyGuide.txt .
    (copies studyGuide.txt from the /tmp directory to the working
    directory; same as cp /tmp/studyGuide.txt studyGuide.txt)
cp /tmp/studyGuide.txt myFavoriteStudyGuides/
cp ../project2/proj2.c proj3.c
    (copies proj2.c from the project2 directory, which is in the
    parent directory, to proj3.c)

20181204 *******************************************************
Common Directory System Calls
----------------------
create
delete
opendir
closedir
readdir
rename
link
unlink

File System Implementation
---------------------------------
File System Layout
----------------------
Sector 0 holds the Master Boot Record (MBR)
After the MBR is the partition table (with the starting address of
each partition)
One of the partitions is marked as active
Each partition starts with a boot block.
After booting, the BIOS reads the MBR and looks for the active
partition
It then loads the OS in the active partition

Implementing Files
----------------------
Contiguous Allocation
-----------
Store each file as contiguous disk blocks
Advantages:
    Simple
    Low overhead (starting block and number of blocks)
Disadvantages:
    Search time
    Fragmented disk space (compacting is expensive)
        1) holes left by removed files
        2) partially used last block
Used on early systems, CD-ROMs, and DVDs

Linked List Allocation
-----------
Make a linked list of disk blocks, with the first word of each block
pointing to the next block.
Advantages:
    Avoids fragmentation (except in the last block)
Disadvantages:
    Random access is very slow
    Data per block is no longer a power of 2 (the pointer takes space)
Solution to the disadvantages: keep the pointers in a table in memory
(a file allocation table, FAT)

I-nodes (index-nodes)
-----------
Store the attributes and disk addresses of the file's blocks
Similar to Windows' NTFS and UNIX's file systems

Implementing Directories
----------------------
Directories are files that list their contents (files and/or
directories)
They store file names
They store (or provide the location of) file attributes
Variable-length filenames:
    Option 1: Store the size of the filename, then the filename
              (in-line)
    Option 2: Store the filenames in a heap

Shared Files
----------------------
Allowing a file system to share a file changes the structure from a
tree to a directed acyclic graph (DAG)
Some complications can arise from shared files:
1) All changes to the contents of the file need to be visible at
   each location it's shared
2) Having a file in your directory whose owner is another user
3) Having a symbolic link to a file that does not exist
4) Backup systems need to give the option to not back up
   files/directories that are linked

Journaling File Systems
----------------------
Logs the requests for modifications to data on the disk
Removes log entries when the modification is completed
Needs all requests to be "idempotent" (can be repeated multiple
times without harm)
Fault tolerant because if the system crashes, it
just repeats the modification it was in the middle of
Windows' NTFS and Linux's ext3 use journaling

File System Management and Optimization
---------------------------------
Disk-space Management
----------------------
Layout
-----------
Files can be stored contiguously or split into blocks
What about when a file grows? If stored contiguously, it often must
be moved
Most common: split files into fixed-size blocks

Block Size
-----------
Small blocks increase disk utilization (less space wasted in the
last block)
Large blocks increase disk performance (fewer seeks per byte)

Tracking Free Blocks
-----------
Option 1: a linked list of free blocks (each block in the list holds
          just the disk addresses of free blocks)
Option 2: a bitmap (one bit per disk block)

SSDs
++++++++++++++++++++++++++++++++++++++++++++
Read a page at a time
To write one or more bytes, the containing block has to be erased
first
With no moving parts, more reliable than HDDs
Disadvantage: manufacturers rate each block for only
10,000 - 100,000 writes

SSD Performance And Cost
=================================
                              Random          Sequential
                           Reads  Writes    Reads  Writes
Device                     (MB/s) (MB/s)    (MB/s) (MB/s)
------------------------   ------ ------    ------ ------
Samsung 840 Pro SSD           103    287       421    384
Seagate 600 SSD                84    252       424    374
Intel SSD 335 SSD              39    222       344    354
Seagate Savvio 15K.3 HDD        2      2       223    223