==> c/notes-01.txt <==

20180820 *******************************************************

// Lines starting with # are processed by the preprocessor
// No semi-colon
// C is case-sensitive
#include <stdio.h>

// int is the return type
int main(){                       // execution starts here
    printf("Hello, world!\n");    // sends a message to STDOUT
    // \n \t
    // \\ \"
    // 'h'
    return 0;                     // 0 for success
}

Java:
           javac            JVM
  x.java ------------> X.class ---------> output

Perl/Python:
             perl x.pl / python x.py
  x.pl/x.py -------------> output

C:
       gcc x.c
  x.c -----------------> a.out
       compiled

GCC Compile
---------------------------------
Two-stage Compilation
----------------------
1) pre-processing & compiling: gcc -c hello.c   (produces hello.o)
2) link:                       gcc -o hello hello.o

Linking several modules example:
  gcc -c a.c   (produces a.o)
  gcc -c b.c   (produces b.o)
  gcc -o solveWorldHunger a.o b.o

Using the math library:
  gcc calc.c -o calc -lm

gcc flags
---------------------------------
-c       compile single module
-o       output file for object or executable
-Wall    display all warnings
-g       insert debugging code
-p       insert profiling code
-l file  library
-L dir   library path
-I dir   include path

C Preprocessor (cpp)
---------------------------------
Macro-processor
Changes the code
gcc -E shows the output of the preprocessor

#define
-----------
#define name const-expression
#define name(param1[,param2[,...]]) const-expression
#undef name

Example:
  #define MAXVALUE 100

  if( i < MAXVALUE ){ ... }

  becomes

  if( i < 100 ){ ...
  }

Can take arguments:
  #define CHECK(x) ((x) < MAXVALUE)

#if #else #endif
-----------
#if expression
  code segment 1
#else
  code segment 2
#endif

Example:
  #define OS linux

  #if OS == linux
  #include <...>
  #else
  #include <...>
  #endif

#ifdef / #ifndef
-----------
#ifdef name
  code segment 1
#else
  code segment 2
#endif

Example: DEBUG

#include
-----------
#include <filename.h>   // looks in the include directories
#include "filename.h"   // looks in the current directory (first)

Include Guards
-----------
Example: includeGuardsExample/main-noIncludeGuards.c

Comments
*********************************
1) Sections (maybe multi-line):  /* any text until */
2) Rest of line:                 // C++-style (can be used in C)

20180827 #######################################################

Numeric Data Types
*********************************
type       bytes  range
-----      -----  ------
char       1      -128 to 127 (-2^7 to 2^7 - 1)
int        4      about -2.1 billion to 2.1 billion (-2^31 to 2^31 - 1)
long long  8      -2^63 to 2^63 - 1
float      4      3.4E+/-38 (7 digits of precision)
double     8      1.7E+/-308 (15 digits of precision)

unsigned
  Example: unsigned int u = 0;

Missing?
  Boolean
  Strings (wait for C-style strings)

Most Common:
~~~~~~~~~~~~~~~~~~~~~~
char
int
double

20180829 #######################################################

Variables (Data objects)
---------------------------------
Uninitialized variables have "random" values
Always initialize your variables

Every variable in C has:
1) a name
2) a data type
3) an address (memory location)
4) visibility (which parts of the program can refer to it)
5) lifetime (period during which it exists)

Size of an object [think variable for now] is determined when it's
created:
1) global data objects  at compile time (data section)
2) local data objects   at run-time (on the stack)
3) dynamic data objects by the programmer (in the heap)

Scoping
----------------------
Variables declared in a { } block are usable (visible) only within
that block
Variables declared outside of a { } block are global.
Use sparingly and with caution

Example: variablesExample.c.html

Exercise: What will the output be of the following (if any)?
  int solution;
  solution++;
  printf("solution: %d\n", solution);

Type Conversion
---------------------------------
Explicit Type Conversion:
----------------------
Syntax: (type)

Example:
  int count = 99;
  double cost = (double) count * 5.34;

Example: explicitTypeConversionExample.c.html

Implicit Type Conversion:
----------------------
Exercise: Discuss with your neighbor what the output of the following
will be (if any)?
  int i = 100000;
  char c;
  c = i;
  printf("i: %d, c: %d\n", i, c);

Output: Funky

Types are "promoted": char -> int -> long ...
If one operand is a double, the other is promoted to a double
If either is a float, the other is promoted to a float

Exercise: What is the value of the following expressions:
  int age;
  A. age = 8      if( age = 8 ){  ->  if( 8 ){
  B. -age
  C. 5 + 8
  D. 5 / 8
  E. 5 / 8.0
  F. (float)( 7/3 )
  G. (float) 7 / 3

Arrays
---------------------------------
Examples:
-----------
  int vec[100];     // holds 100 elements, indexed 0..99
  char str[30];     // holds 30 elements, indexed 0..29
  float m[12][31];  // holds 12*31 elements

  vec[55]
  printf( "%d\n", vec[ 999999 ]);
  printf( "%d\n", vec[ 101 ]);

20180831 *******************************************************

sizeof( )
----------------------
sizeof( data type ) returns the number of bytes for a data type
Example:
-----------
  sizeof( int )   or   sizeof int

sizeof( variableName ) returns the number of bytes for variableName
Example:
-----------
  char simpleChar = '!';
  sizeof( simpleChar );  // or sizeof simpleChar;

  int vec[100];
  sizeof( vec );

  int arraySize = sizeof( vec );
  int elementSize = sizeof( vec[0] );
  printf( "vec has %d elements\n", arraySize / elementSize );

Exercise: Write code to sum the first 50 elements of an int array
named receiptTotals.
  #define ARRAY_SIZE 50   // sum the first 50 elements

  int receiptTotals[ ARRAY_SIZE ];
  int sum = 0;
  for( int i = 0; i < ARRAY_SIZE; i++ ){
      sum += receiptTotals[i];
  }

Pointers
---------------------------------
Review
~~~~~~~~~~~~~~~~~~~~~~
  int counter = 1;

Pointers are a data type
Pointers hold a memory address

Example:
-----------
  int* myIntPtr;
  double* ptrToGradient;

pointers: "variables that tell you where to find something else"
(Dale & Weems)

Analogy: PO Boxes

  int secretBankAccountNumber = 456789;
  int* myIntPtr = secretBankAccountNumber;   // bad idea

& operator
----------------------
& (address-of operator): returns the memory address of the first byte
in memory of a variable

  myIntPtr = &secretBankAccountNumber;

Example:
  int answer = 42;
  int* answerPtr = &answer;

  printf("answer's value: %d\n", answer);
  printf("answerPtr's value (base-10): %ld\n", (long) answerPtr);
  printf("answerPtr's value (hexadecimal): %p\n", answerPtr);

  Memory Contents      Type  Name       Address
  ------------------   ----  ---------  ---------
  |                |
  +----------------+
  | 0x7ffee348d2e8 |   int*  answerPtr  0x7ffee348d2e0
  +----------------+
  |       42       |   int   answer     0x7ffee348d2e8
  +----------------+
  |                |

Example: addressOfExample.c (or *.c.html) (left to individual review)

* (dereference) operator
----------------------
* dereferences a variable (it interprets the contents of that variable
as an address, of a particular type)
Allows you to get at the contents of memory, to read or write
(use or assign)

Example:
  printf("*answerPtr's value: %d\n", *answerPtr);

Example: pointerExample.c (or *.c.html)

Exercise: Given the following code, determine the data type for each
expression:
  int x;
  int* ptr;

  Expression  Data type  Possible Values
  ----------  ---------  ---------------
  x           int        whole numbers <= +/-2^31
  ptr         int*       a memory location
  *ptr        int
  &ptr        int**      a memory location (to a memory location)
  &x          int*       a memory location

20180905 #######################################################

Example:
  +------+
  | ???? | 0x10
  |      |
  +------+
  | ???? | 0x14
  |      |
  +------+
  | ????
         | 0x18
  |      |
  +------+

  int x;    // at 0x10
  int* p;   // at 0x14
  int* q;   // at 0x18

  +------+
  | ???? | 0x10
  |      |        int x
  +------+
  | ???? | 0x14
  |      |        int* p
  +------+
  | ???? | 0x18
  |      |        int* q
  +------+

p = &x;
  +------+
  | ???? | 0x10
  |      |        int x
  +------+
  | 0x10 | 0x14
  |      |        int* p
  +------+
  | ???? | 0x18
  |      |        int* q
  +------+

*p = 7;
  +------+
  |  7   | 0x10
  |      |        int x
  +------+
  | 0x10 | 0x14
  |      |        int* p
  +------+
  | ???? | 0x18
  |      |        int* q
  +------+

q = &p;
  +------+
  |  7   | 0x10
  |      |        int x
  +------+
  | 0x10 | 0x14
  |      |        int* p
  +------+
  | 0x14 | 0x18
  |      |        int* q
  +------+

q = p;
  +------+
  |  7   | 0x10
  |      |        int x
  +------+
  | 0x10 | 0x14
  |      |        int* p
  +------+
  | 0x10 | 0x18
  |      |        int* q
  +------+

*q = 11;
  +------+
  |  11  | 0x10
  |      |        int x
  +------+
  | 0x10 | 0x14
  |      |        int* p
  +------+
  | 0x10 | 0x18
  |      |        int* q
  +------+

p = NULL;
  +------+
  |  11  | 0x10
  |      |        int x
  +------+
  | NULL | 0x14
  | (0)  |        int* p
  +------+
  | 0x10 | 0x18
  |      |        int* q
  +------+

Exercise: Diagram the elements on the stack and write the output of
the following (adapted from HERE):

  #include <stdio.h>

  int main (){
      int numbers[5];
      int * p;
      p = numbers;      *p = 10;
      p++;              *p = 20;
      p = &numbers[2];  *p = 30;
      p = numbers + 3;  *p = 40;
      p = numbers;      *(p+4) = 50;
      for( int n = 0; n < 5; n++){
          printf("%d, ", numbers[n]);
      }
      return 0;
  }

Exercise: Write code for each of the following and make sure that it
compiles:
a. Declare ex0 as a pointer to a char
b. Declare ex1 as a pointer to a pointer to a char
c. Declare ex2 as an array of 10 elements
d. Declare ex3 as an array of 10 elements that are pointers to chars
e. Declare ex4 as a pointer to an array of 30 char elements
f. Declare ex5 as an array of 10 pointers to arrays of 500 char
   elements
g. Declare ex7 as a pointer to const int
h. Write a function ex8 that takes a char pointer named pc, increments
   the address that it points to and returns that value
i.
Write a function ex9 that takes an int pointer named num, and prints
   out what num points to and a message saying if what num points to
   is odd or even

C-style Strings
---------------------------------
https://i.imgur.com/JdXxHqH.jpg
(Portrait: https://i.imgur.com/NwNPU.png (or https://imgur.com/NwNPU))

Examples: cStyleStrings.c

20180907 *******************************************************

Executing a C program: Command-line arguments
----------------------
  // Function parameters:
  //   argc: number of arguments
  //   argv: array (notice the []) of char *s (which are pointers to
  //         strings)
  int main( int argc, char *argv[] ){
      // ...
      return 0;
  }

Command-line Example
-----------
Syntax: executable [arguments]

Example: a.out 1 23 awesome

  argc: 4

  argv:
    +---+---+---+---+
    | * | * | * | * |
    +-|-+-|-+-|-+-|-+
      |   |   |   |
      |   |   |   +-> "awesome"
      |   |   +-----> "23"
      |   +---------> "1"
      +-------------> "a.out"

Libraries
---------------------------------
stdio.h    I/O
string.h   character strings
ctype.h    character types
math.h     numerical math functions
stdlib.h   C Standard General Utilities Library
           (ex atof(), malloc(), NULL, rand(), exit(), etc.)

Memory Allocation
*********************************
1)
2)

A program's memory footprint:
  +--------------+
  |   Headers    |
  +--------------+
  |              |
  |     Heap     |  dynamically allocated memory
  |              |
  +--------------+
  |      |       |  grows down
  |      v       |
  |              |
  |      ^       |
  |      |       |  grows up
  +--------------+
  |              |
  |    Stack     |  "normal" variables & function calls
  |              |
  +--------------+

malloc()
~~~~~~~~~~~~~~~~~~~~~~
#include <stdlib.h>

Function prototype:
  void* malloc( size_t size );

Lifetime of the block is until the memory is freed, with free().
free()
~~~~~~~~~~~~~~~~~~~~~~
#include <stdlib.h>

Function prototype:
  void free( void *ptr );

Memory leak

Example: memoryAllocationExample.c (or memoryAllocationExample.c.html)

Other function prototypes in stdlib.h:
  void* calloc( size_t n, size_t elementSize );  // allocates memory and
                                                 // initializes it to 0
  void* realloc( void* ptr, size_t size );  // returns pointer to larger
                                            // or smaller memory block of
                                            // size bytes

Exercise: pointersExercise-01.c
Add code as directed by the comments to pointersExercise.c and execute
to see values update:

20180910 *******************************************************

Control structures
---------------------------------
grouping: {...}  Same as Java

if statements
----------------------
Same as Java

Syntax:
  if( condition1 ){
      statements1
  } else if( condition2 ){
      statements2
  } else if( condition3 ){
      statements3
  }else{
      statementsN
  }

Be careful when omitting { }s:

Example: ifExample.c (or ifExample.c.html)
  #include <stdio.h>

  int main(){
      int x = 7, y = 42;

      if( x > 0 )
          printf("x > 0!\n");

      if( y > 0 )
          printf("x and y > 0!");

      return 0;
  }

switch Statements
----------------------
Syntax:
  switch( expression ){
      case const1:
          statements1;
          break;
      case const2:
          statements2;
          break;
      ...
      default:
          statementsN;  // optional
  }

Loops
----------------------
  while( condition ){ ... }
    Loop body executes 0 or more times, while condition is != 0

  do{ ... } while( condition );
    Loop body executes 1 or more times, while condition is != 0

  for( init; condition; update ){ ... }
    init is executed once
    Loop body executes 0 or more times, while condition is != 0
    update is executed after every loop body

Beware:
  int n = 0;
  ...
  if( n = 42 ){
      printf("Something is very, very wrong\n");
  }

Structured data objects
*********************************
structs
~~~~~~~~~~~~~~~~~~~~~~
A developer-defined data type that is a collection of other data types

Access members of the struct using the member-of operator, "."
Syntax: structVariable.field

Example: structsExample1.c (or structsExample1.c.html)
Example: structsExample2.c (or structsExample2.c.html)
Example: structsExample3.c (or structsExample3.c.html)
Example with pointers: structsAndPointersExample.c
  (or structsAndPointersExample.c.html)

Functions
*********************************
Function Prototypes
~~~~~~~~~~~~~~~~~~~~~~
A function prototype specifies the return type, name and the argument
list
A prototype is needed if a call to the function appears before the
definition of the function
Often found in header (.h) files

We can pass arguments to functions
1) By value (e.g., int, double)
2) By reference (e.g., int*, double*, struct joeysStruct*)

We can return values from functions
1) By value (e.g., int, double)
2) By reference (e.g., int*, double*, struct joeysStruct*)
   Don't return addresses that are on the stack (in the function)

20180912 *******************************************************

const arguments
-----------
Example:
  void exFunc( const char* str, const int x) {
      printf("x = %d\n", x);
      str[5] = '!';  // error: read-only variable is not assignable
  }

Note: C does NOT support overloading functions (although C++ does)
Example: overloadedFunctions.c

I/O
=================================
Formatted strings:
  Example: printf("%d", age);

  c  - character
  d  - integer
  f  - floating-point number (float)
  s  - C-style string
  p  - pointer (hexadecimal)
  ld - long int
  lf - long floating-point number (double)

printf() - display to stdout (defaults to the terminal)
scanf()  - reads from stdin (defaults to the keyboard)

Example:
-----------
  madlibs-main.c
  madlibs.h
  madlibs.c

Other library functions:
-----------
sscanf()   - scans a C-style string
asprintf() - formats a string (and allocates just enough memory for
             it)

File I/O
=================================
The stdio library (stdio.h) has types and functions for file streams

File stream type: FILE*

Input files:
  fopen()
  fscanf() or getline() or fgets()
  fclose()

Output files:
  fopen()
  fprintf()
  fclose()

Example: fileIOExample.c
scanf Example: processStockData.c

==> intro/notes-01.txt <==

20180914 *******************************************************

What's an operating system?

Most OSs have 2 modes:
1) User mode
2) Kernel mode

Processors
~~~~~~~~~~~~~~~~~~~~~~
Fetch
Decode
Execute
Repeat

How many processes can a CPU execute at once? 1

OS Role: virtual machine

Special Registers
+ Program counter (next instruction to execute)
+ Stack pointer (current top of the stack)
+ Program Status Word (condition codes)
+ etc.

Memory Hierarchy
~~~~~~~~~~~~~~~~~~~~~~
                Access Time
                (seconds)     Capacity
  -----------   -----------   --------
  Registers     .000000001    <1 KB
  Cache         .000000002    4 MB (L1, L2, L3)
  Main memory   .00000001     4-16 GB (RAM)
  Disk          .01           1-4 TB

==> processes/notes-01.txt <==

20180917 *******************************************************

What is a process? A running program, including its machine state

What's part of a process's machine state?
  Process state
  PID
  Address space - the memory the process can use
  Registers
  * Program counter
  * Stack pointer
  * Frame pointer

With time sharing (multiprogramming), the OS switches between
processes every 10-100 milliseconds
  Pseudo-parallelism
  True parallelism requires a multiprocessor system

What is the API in the OS for a process:
-----------
Create
Destroy
Wait
Control
+ Kill
+ Suspend
+ Etc.
Status

Process Creation
~~~~~~~~~~~~~~~~~~~~~~
Processes are created when:
1) System initialization
2) The user requests a new process
3) A running process wants help (e.g., using fork())
4) Starting a batch job (only on mainframes)

OS steps to run a program
-----------
Load (from disk) the code and any static data (initialized variables)
into memory
Allocate memory for the stack
Initialize the stack (with argc and argv)
Allocate some memory for the heap
Set up file descriptors (stdin, stdout, stderr)
Start executing main()

Process Hierarchies
-----------
UNIX: init starts running after booting, a new process for each
terminal, builds a process tree
Windows: No process hierarchy, but a parent process is given a special
token (handle) to control the child process

Process Termination
-----------
Processes are terminated when:
1) Normal completion
   exit() or ExitProcess()
2) Error
3) Fatal error (involuntary)
4) Killed (involuntary)
   kill() or TerminateProcess()

Process States
~~~~~~~~~~~~~~~~~~~~~~
Exercise

Process Table
~~~~~~~~~~~~~~~~~~~~~~
OS tracks processes in the process table (aka Process Control Block
(PCB))
Stores a process's
+ Program counter
+ Stack pointer
+ Memory allocation
+ Open files
+ Scheduling information
+ etc.

Process API
~~~~~~~~~~~~~~~~~~~~~~
How many lines will be printed?
  #include <stdio.h>
  #include <unistd.h>

  int main(){
      fork();
      fork();
      fork();
      fprintf(stderr, "All done!\n");
      return 0;
  }

20180919 *******************************************************

UNIX Commands: ../unixCommands.txt

20180921 *******************************************************

Process API
~~~~~~~~~~~~~~~~~~~~~~
fork()
-----------
UNIX: Makes a clone of the current process (with the same memory
image, environment strings and open files)
  Memory is mapped so that it only makes a copy when the child writes
  to (changes) its memory
  The child generally calls exec() and replaces the memory
Windows: 1-step process: CreateProcess()

Example: processes/fork-example-01.c
  What are all of the possible output sets?

wait(), waitpid()
~~~~~~~~~~~~~~~~~~~~~~
Suspends execution until a child process completes

exec()
~~~~~~~~~~~~~~~~~~~~~~
Replaces the current process image with a new one
man -s 3 exec

Example: processes/fork-exec-example.c

Limited Direct Execution
*******************************************************
What does direct execution mean?
  Let the process run on the CPU
  OS creates the process and transfers control to its starting point
  (i.e., main())

Problems with direct execution?
+ Process could do something restricted
  * Could read/write other processes' data (disk or memory)
+ Process could run forever (slow, buggy, or malicious)
  * OS needs to be able to switch between processes
+ Process could do something slow (like I/O)
  * OS wants to use resources efficiently and switch the CPU to
    another process

Solution: Limited direct execution

What does limited direct execution mean?
  OS and hardware mechanisms to regain control of the CPU

How can we ensure a user process can't harm others?
Solution: privilege levels supported by hardware (bit of status)

User & Kernel Modes
~~~~~~~~~~~~~~~~~~~~~~
Parts of the OS run in kernel mode (privileged mode)
+ Instructions for interacting with devices
User processes run in user mode (restricted mode)
+ User programs use SYSTEM CALLS to request privileged actions (that
  require kernel mode)
  * System call: function call implemented by the OS
    - Uses a trap instruction to change the privilege level

processes/modeSwitch.pdf

Exercise: processes/processesExercises.html
With your neighbor, name as many operations that require kernel mode
as you can.
  I/O
  + writing/reading to/from disk
  + screen
  + keyboard
  + printer
  + network
  + hologram
  Setting timers (and responding to them)

20180924 *******************************************************

Context Switches
~~~~~~~~~~~~~~~~~~~~~~
The OS regains control by setting a timer
An interrupt handler processes the timer
The OS scheduler is run to determine which process should run
(for multicore, multiple processes)

processes/contextSwitch.pdf

How long do context switches take?
  Context switch: <1 microsecond

Exercise: processes/processesExercises.html
With your neighbor, discuss if all or part of the OS executes in
kernel mode.

Scheduling
=================================
Typical process behaviors: CPU or I/O bound

Events that trigger the scheduler:
+ New process created (either parent or child is in the ready state)
+ The running process exits
+ The running process is blocked (I/O, semaphore, etc.)
+ I/O interrupt
+ Hardware clock interrupt (10-100/s)

Scheduling Metrics
~~~~~~~~~~~~~~~~~~~~~~
Turnaround Time
-----------
Turnaround Time = (time of completion) - (arrival time)

Response Time
-----------
Response Time = (time of first run) - (arrival time)

Scheduling Categories
~~~~~~~~~~~~~~~~~~~~~~
Non-preemptive
-----------
Scheduler chooses a process to run and lets it run as long as it
likes

Preemptive
-----------
Scheduler moves the Running process to the Ready state every clock
interrupt

Scheduling Environments
-----------
1) Batch
   Non-preemptive or preemptive with long run times
   Goals: throughput, turnaround time & CPU utilization
2) Interactive
   Preemptive
   Goals: Response time
3) Real-time
   Preemption is sometimes not needed b/c generally it's not a
   general-purpose processor
   Goals: Meeting deadlines, predictability

All systems goals
-----------
Fairness
Policy enforcement
Balance

Scheduling in Batch Systems (non-preemptive)
~~~~~~~~~~~~~~~~~~~~~~
First-Come First-Served (FIFO)
-----------
Simplest
Fair

Shortest Job First
-----------
Best turnaround time

Scheduling in Interactive Systems
~~~~~~~~~~~~~~~~~~~~~~
Examples: desktops, servers, etc.
Shortest Time-to-Completion (aka shortest remaining time next)
-----------
Preemptive version of shortest job first

Round-Robin
-----------
Scheduler has a list of runnable processes
Each process gets run for a time interval, a quantum
Setting the quantum too short causes a lot of extra context switches
Setting the quantum too long is bad for interactive requests

Priority Scheduling
-----------
Processes can have different priorities
+ Example: streaming video/on-line multi-player game versus a compile
The ready process with the highest priority is scheduled first
Low priority jobs can starve, so adjustments need to be made over
time
Often, jobs are grouped by priority and run round-robin

Others
-----------
Aging estimates for execution time
Guaranteed scheduling
Lottery scheduling
User-based scheduling

Scheduling in Real-Time Systems
~~~~~~~~~~~~~~~~~~~~~~
Hard real-time systems can NOT miss deadlines
Soft real-time systems can occasionally miss deadlines

Static: all information is known ahead of time
Dynamic: decisions are made at runtime

Multi-level Feedback Queues
~~~~~~~~~~~~~~~~~~~~~~
Prioritizes interactive jobs while still giving time to CPU-intense
jobs
Balances response time and turnaround time

Show cpu-mlfq-eg.eps Figure 8.1

Rules
-----------
Rule 1: Run a process from the highest priority queue that's not
        empty (if the queue has a single process)
Rule 2: Use Round Robin to run the processes in the highest priority
        queue that's not empty
Rule 3: When a job enters the system, it is placed at the highest
        priority (the topmost queue).
Rule 4: Once a job uses up its time allotment at a given level
        (regardless of how many times it has given up the CPU), its
        priority is reduced (i.e., it moves down one queue).
Rule 5: After some time period S, move all the jobs in the system to
        the topmost queue.
processes/scheduling-multiLevelFeedbackQueue-oneLongJob.pdf
processes/scheduling-multiLevelFeedbackQueue-addInteractiveProcess.pdf

Starvation - A process is not scheduled to run for a very long time
processes/scheduling-multiLevelFeedbackQueue-fig8.5.pdf

Gaming the scheduler -
processes/scheduling-multiLevelFeedbackQueue-fig8.6.pdf

Lottery Scheduling
~~~~~~~~~~~~~~~~~~~~~~
Goal: Proportional (fair) share
Approach:
+ Processes receive "lottery" tickets
+ Higher priority processes receive more tickets
+ Whoever wins, executes
Amazingly simple to implement
Avoids corner cases

Scheduling Pseudocode:
-----------
  int counter = 0;
  int winner = getrandom(0, totaltickets);
  node_t *current = ticketsListHead;
  while( current ){
      counter += current->tickets;
      if( counter > winner ){ break; }
      current = current->next;
  }
  // current is the winner

Example: (402 total tickets)
-----------
  Process A:   1 ticket
  Process B:   1 ticket
  Process C: 100 tickets
  Process D: 200 tickets
  Process E: 100 tickets

  winner = 50  -> process C
  winner = 350 -> process E
  winner = 0   -> process A

20181001 *******************************************************

Address Spaces
=================================
Historical view of memory: Figure 13.1
One process runs at a time
Disadvantages:
1) One process runs at a time
2) Process can overwrite the OS :(

Multiprogramming Goals
~~~~~~~~~~~~~~~~~~~~~~
Virtual RAM: Illusion of private memory

Transparency (Translucency)
-----------
Processes are not aware that memory is shared
Works regardless of the number and/or location of the process(es)

Protection
-----------
Cannot corrupt the OS or other processes
Privacy: Cannot read data from other processes

Efficiency
-----------
Do not waste memory resources (minimize fragmentation)
Minimize the overhead to provide virtualization

Figure 13.3
  +--------------+
  |     Code     |
  +--------------+
  |              |
  |     Heap     |  dynamically allocated memory
  |              |
  +--------------+
  |      |       |  grows down
  |      v       |
  |              |
  |      ^       |
  |      |       |  grows up
  +--------------+
  |              |
  |    Stack     |
"normal" variables & function calls | | +--------------+ +--------------+ Figure 13.2 Address Space: The memory state of a process 20181003 ******************************************************* int* ptr; int* array; Memory API - Chapter 14 ================================= What are the function prototypes for malloc() and free()? void* malloc( size_t size ); void free( void* ); Common Errors ~~~~~~~~~~~~~~~~~~~~~~ Forgetting to allocate memory ----------- char* src = "Go Cougars!"; char* dst = NULL; // bad idea!!! strncpy( dst, src, strlen( src ) + 1); Solution 1: char* src = "Go Cougars!"; char* dst = (char*)malloc( sizeof(char) * strlen(src) ); strncpy( dst, src, strlen( src ) + 1); Solution 2: char* src = "Go Cougars!"; char dst[1024]; strncpy( dst, src, strlen( src ) + 1); Not allocating enough memory ~~~~~~~~~~~~~~~~~~~~~~ char* src = "Go Cougars!"; char dst[2]; strncpy( dst, src, strlen( src ) + 1); const int SIZE = 20; char* src = "Go Cougars!!@#$%^&*()_++1234567890-=QWERTYUIOP{}|qwertyuiop[]\\ASDFGHJKL:"asdfghjkl;'ZXCVBNM<>?zxcvbnm,./"; char dst[SIZE]; //strncpy( dst, src, SIZE); strcpy( dst, src ); Forgetting to initialize allocated memory ~~~~~~~~~~~~~~~~~~~~~~ Forgetting to free memory ~~~~~~~~~~~~~~~~~~~~~~ Memory leak Especially dangerous in memory-intense and/or long running programs Freeing memory before you are done with it ~~~~~~~~~~~~~~~~~~~~~~ Dangling pointer Example: int* ptr = (int*)malloc( sizeof(int) ); ... free( ptr ); ptr = NULL; ... 
  finalSolution( *ptr );

20181005 #######################################################

Address Translation (Chapter 15)
=================================
3 Options to manage RAM:
1) No memory abstraction
   < 1 process in memory at a time
2) Swapping
3) Virtual memory

Memory Management Option 2) Swapping
~~~~~~~~~~~~~~~~~~~~~~
To allow for multiple processes to be in memory at the same time we
need:
1) Relocation
2) Protection

Dynamic Relocation
-----------
Base & bounds/limit registers
Each time an address is used, the CPU hardware
1) Adds the starting address to it
2) Checks that the address is within bounds

Example: addressTranslation/baseAndBoundExample.pdf

Disadvantages:
+ Slows things down
+ Program needs to fit in memory

Swapping
-----------
Copy all of a process into RAM, run it and write it back to disk
Processes grow (in memory) as they run
+ the stack
+ the heap

20181008 *******************************************************

  void free( void * ptr );

Segmentation (Chapter 16)
=================================
Figure 16.1
Segmentation - base and bound pair for each logical segment

Example:
-----------
  Segment  Base  Size  Grows Positive?
  -------  ----  ----  ---------------
  Code     32K   2K    1
  Heap     34K   2K    1
  Stack    28K   2K    0

Translate virtual address 200:
1) Bounds check: 200 < 2K (2048)? Yes
2) base + offset = physical address
   32K + 200 = 32*1024 + 200 = 32968

Translate virtual address 5000:
1) offset = address - segment start
   904 = 5000 - 4K
   offset < segment size?
   904 < 2K? Yes
2) base + offset = physical address
   34K + 904 = 34816 + 904 = 35720

Free List: 256 -> 1024 -> 64 -> 128 -> 32 -> NULL

Free Space Management Exercises
~~~~~~~~~~~~~~~~~~~~~~
1. The OS manages the Free List (a data structure that tracks
   available chunks of memory in the heap). Discuss with your
   neighbor if there's a Non-free List and if so, who manages it?

Splitting (Unallocated) Memory Chunks
~~~~~~~~~~~~~~~~~~~~~~
What does the OS do when process B calls malloc( 200 )?
It applies an allocator algorithm to find an acceptable memory chunk.
Assume that we've found an acceptable unallocated memory chunk, but
it's not the exact size of the request?
The OS (allocator algorithm) splits that chunk into allocated and
unallocated parts.
(Really, it updates (resizes) the unallocated chunk's information and
returns a pointer to the requested portion.)

Let's say that we choose the 256 block to split.

  Address 0x10000  10400  11400  11440  12440  124C0  134C0  13500  135C0  145C0
          +-------+------+-------+------+-------+------+-------+------+------+------+
  Size    | 1024  | 4096 | 64    | 4096 | 128   | 4096 | 56    | 200  | 4096 | 32   |
  Next    | 11400 |      | 12440 |      | 145C0 |      | 10000 |      |      | NULL |
          +-------+------+-------+------+-------+------+-------+------+------+------+
          (free)         (free)         (free)         (free)                (free)

  Head: 134C0
  Free List: head -> 56 -> 1024 -> 64 -> 128 -> 32 -> NULL

(Not shown are the magic numbers in the allocated blocks)

Allocated header: (BEFORE the void* returned to the user)
-----------
  struct header_t {
      int size;   // number of bytes (as seen by the user)
      int magic;  // for integrity checks
  };

Figure 17.2

Free header: (EMBEDDED IN the unallocated chunk)
-----------
  struct node_t {
      int size;             // available bytes (does not include this
                            // header)
      struct node_t* next;  // next free chunk
  };

20181017 *******************************************************

What is the data type for our head pointer?
  node_t* head;
  node_t* curr = head;  // find the first available free chunk
                        // (First Fit)

heapInit( 17888 )

  /*
  Address 0x10000
          +--------------------------+
  Size    | 17888 - sizeof( node_t ) |
  Next    | NULL                     |
          +--------------------------+
          (free)

  head: 0x10000
  Free List: head -> 17888 -> NULL
  */

Coalescing
~~~~~~~~~~~~~~~~~~~~~~
Combining adjacent unallocated memory chunks

free( 0x11440 )

  Address 0x10000  10400  11400  11440  12440  124C0  134C0  13500  135C0  145C0
          +-------+------+-------+-------+-------+------+-------+------+------+------+
  Size    | 1024  | 4096 | 64    | 4096  | 128   | 4096 | 56    | 200  | 4096 | 32   |
  Next    | 11400 |      | 12440 |       | 145C0 |      | 10000 |      |      | NULL |
          +-------+------+-------+-------+-------+------+-------+------+------+------+
          (free)         (free)  (free)  (free)         (free)               (free)

After coalescing:

  Address 0x10000  10400  11400  124C0  134C0  13500  135C0  145C0
          +-------+------+-------+------+-------+------+------+------+
  Size    | 1024  | 4096 | 4288  | 4096 | 56    | 200  | 4096 | 32   |
  Next    | 11400 |      | 145C0 |      | 10000 |      |      | NULL |
          +-------+------+-------+------+-------+------+------+------+
          (free)         (free)         (free)                (free)

  Head: 134C0
  Free List: head -> 56 -> 1024 -> 4288 -> 32 -> NULL

Four scenarios when deallocating memory
Figure 17.3
Figure 17.4

Growing the Heap (What happens when the heap is full?)
~~~~~~~~~~~~~~~~~~~~~~
The OS reserves a large chunk of memory and adds it to the Free List

Free List Management Algorithms
~~~~~~~~~~~~~~~~~~~~~~
How to choose which free chunk to give to the user?
Best fit  - search the whole list, and return the smallest chunk
            (that will work)
First fit (use this for your project 2) - search the list, and return
            the first chunk (that will work)
Next fit  - search the list, and return the first chunk (that will
            work) (starting from where the last call left off)
Binary Buddy - splits chunks in half

Paging Intro
=================================
Segmentation allows segments to be different sizes -> making
implementation difficult
Paging allows for fixed-size chunks to be moved in and out of memory
(RAM)
+ Simple
+ Flexible

Memory Management Option: 3) Virtual Memory
=================================
Allows processes to run even if only part of the process is loaded
into RAM
It acts like the RAM is larger than it really is

The MMU (Memory Management Unit) maps a virtual (memory) address to a
physical (memory) address

Virtual addresses are grouped together into a VIRTUAL PAGE
Physical addresses are grouped together into a PAGE FRAME

20181017 *******************************************************

If a virtual address is needed that references a page frame that is
not in RAM, the CPU executes a TRAP instruction to let the OS handle
it (PAGE FAULT)
The OS chooses a page frame that is in RAM, writes it to disk (if
necessary)
Reads in the requested page
Updates the mapping (in the page table)
Re-starts the instruction

Page Tables
~~~~~~~~~~~~~~~~~~~~~~
Maps virtual pages to page frames

Virtual address is divided into:
1) Upper bits
   + Used as an index into the page table (values in the page table
     are the upper bits for page frames)
2) Lowest bits
   + Offset
   + Copied directly into the physical address

How many entries are in a page table?
  Addressable memory size = (size of the page table) * (page frame size)

Exercise: For 8 GB (2^33 bytes) of addressable memory with 4 KB
(2^12 byte) page frames, the page table needs to have how many
entries?
Addressable memory size = (size of the page table) * (page frame size)
(size of the page table) = (Addressable memory size) / (page frame size)
                         = 2^33 / 2^12
                         = 2^(33-12) = 2^21 entries

Virtual Address
+----------------+------------------------------------+
|   upper bits   |             lower bits             |
+----------------+------------------------------------+
n <-------------------------------------------------> 0
                 x <--------------------------------> 0
Addressable memory = 2^n
Page frame size = 2^x
Page table size = 2^(n-x)

Exercise:
-----------
How big is the address space in the page table figure?
    Addressable memory = 2^16
How big is the page frame size?
    Page frame size = 2^12
How many entries in the page table?
    Page table size = 2^4 (= 16)

Exercise
-----------
Given the virtual address 0xBEEF, what is its physical address?
Lower bits (offset): 0xEEF = 1110 1110 1111_2
Upper bits (index): 0xB = 1011_2 = 11, which selects page table entry 11 (in the figure, that entry holds page frame 111_2)
Physical address = 111 1110 1110 1111_2

Page Table Entries:
-----------
Page frame number: The upper bits for the physical address
Present/absent (bit): Whether the page frame is in RAM
Protection: Read, write and execute permissions for the page frame
Modified (bit): 1 if the page frame is different in RAM and on disk, 0 otherwise (also known as the dirty bit)
Referenced (bit): Whether the page frame has been referenced recently
Caching disabled: Allows page frames to not be cached

20181022 *******************************************************

Page Size
----------------------
Small page size to minimize internal fragmentation
Large page size to
+ minimize the number of to/from disk transfers
+ utilize the TLB
4 KB is common

TLB (Translation Lookaside Buffer)
-----------
Cached page table entries for faster lookup
~64 entries

Multilevel Page Tables
-----------
Goal: Avoid keeping all of the page tables in memory at the same time
Addressable memory size = (size of the page table) * (page frame size)
becomes
Addressable memory size = (size of the page dictionary (page
table #1)) * (size of the page table #2) * (page frame size)

Exercise: What's the addressable memory size for a 2-level page table with 1024 (2^10) entries each and 4 MB (2^22 byte) page frames?
Answer: 2^10 * 2^10 * 2^22 = 2^42 bytes (4 TB)

Inverted Page Tables
-----------
Instead of indexing based on the virtual address, index by the page frame
Use a hash function for lookup

Swap Space
~~~~~~~~~~~~~~~~~~~~~~
Reserved space on disk for swapping out pages

Page Fault
~~~~~~~~~~~~~~~~~~~~~~
Page Table Miss
OS in charge of swapping in the requested page (from disk)
+ Most systems: the disk location is actually stored in the page table entry
+ It requests the page from disk
+ Updates the page table entry
+ Updates the TLB
+ Restarts the instruction (that caused the page fault)

20181024 *******************************************************

Pseudocode for accessing memory
-----------
Get virtual page number from the address (by masking and shifting to isolate the upper bits)
if( virtual page in the TLB && valid bit set ){
    // TLB hit
    if( ! protection bits allow access ){ protection fault }
    // few cycles ~0.000000001 seconds
}else{
    // TLB miss
    page table lookup with the virtual page number
    if( ! valid address ){
        seg fault!
    }else{
        if( ! protection bits allow access ){
            protection fault
        }else if( in the page table ){
            // soft miss
            // 10-20 cycles ~0.000000005 seconds
        }else{
            // hard miss (page fault)
            get from disk
            // ~0.01 seconds
        }
    }
    evict a TLB entry
    add the virtual page number as a new TLB entry
}
memory access (with the physical address)

Page Replacement
=======================================================
RAM is a cache for disk
When a request for a cache entry is not found: cache miss; otherwise, cache hit

Average Memory Access Time (AMAT)
~~~~~~~~~~~~~~~~~~~~~~
AMAT = T_M + (P_Miss * T_D)
T_M = cost to access memory (0.00000001 seconds)
T_D = cost to access disk (.01 seconds)
P_Miss = probability of a cache miss

Example (95% hit rate):
P_Miss = 0.05 (or 5%)
AMAT = 0.00000001 + (0.05 * 0.01) = 0.00050001 seconds

Example (99.5% hit rate):
P_Miss = 0.005 (or 0.5%)
AMAT = 0.00000001 + (0.005 * 0.01) = 0.000050001 seconds

Page Replacement Algorithms
~~~~~~~~~~~~~~~~~~~~~~
Several page replacement algorithms exist:
Optimal
Not Recently Used
First-In, First-Out (FIFO)
Second-Chance
Least Recently Used (LRU)
Least Frequently Used (LFU)
Clock
Aging
...

In general, prefer to evict:
+ Pages not used recently
+ Un-modified pages (so that we don't have to wait forever for the contents to be written to disk)

Each clock interrupt, clear (set to 0) the reference bit for all entries in the page table
Set the reference bit each time the page is accessed
Set the modified bit for each write to that page

Working Set
-----------
The set of pages that the process is currently using
Thrashing: the working set doesn't fit in RAM, so the system spends most of its time paging

Optimal
-----------
1) How the algorithm works
   Evicts the page that will be referenced the furthest in the future
2) How easy or hard is it to implement
   Impossible
3) Is the execution fast or slow
   Slow
4) What is its page fault performance
   Optimal
Not possible to implement, only useful for comparisons (as a benchmark)

Exercise:
-----------
In small groups, prepare a 3-minute explanation about your assigned page replacement algorithm.
Include:
1) How the algorithm works
2) How easy or hard is it to implement
3) Is the execution fast or slow
4) What is its page fault performance

Groups:
switch( Last digit of your 909-number % 4 ){
    case 0: FIFO; break;
    case 1: Random; break;
    case 2: LRU; break;
    case 3: Clock; break;
}

FIFO (First-In, First-Out)
-----------
1) How the algorithm works
   Linked list; when a page is added to RAM, it's put at the end of the list, and the page at the front of the list is evicted
2) How easy or hard is it to implement
   Easy
3) Is the execution fast or slow
   Fast to choose a page to evict
4) What is its page fault performance
   Bad

Random
-----------
1) How the algorithm works
   Randomly choose a page to evict
2) How easy or hard is it to implement
   Easy
3) Is the execution fast or slow
   Fast
4) What is its page fault performance
   Random

LRU (Least Recently Used)
-----------
1) How the algorithm works
   The OS keeps track of each page in a list (or data structure), ordered by use. When a page is referenced, it goes to the end of the list. The page at the beginning of the list is evicted.
2) How easy or hard is it to implement
   Moderate, need to find the page in a data structure
3) Is the execution fast or slow
   Slow to update, fast to choose a page to evict
4) What is its page fault performance
   Better than FIFO and Random, and a little better than Clock

Clock
-----------
1) How the algorithm works
   Views the page table as a circular list. Looks for a page that does not have the referenced bit set, clearing (setting to 0) the referenced bit as it goes around. Worst case, it will go through the entire page table.
2) How easy or hard is it to implement
   Moderate, history-based algorithm
3) Is the execution fast or slow
   Moderate
4) What is its page fault performance
   Similar to LRU's performance

Summary of Page Replacement Algorithms
-----------
Optimal:
FIFO:
Random:
LRU:
Clock:

Local vs.
Global Allocation Policies
~~~~~~~~~~~~~~~~~~~~~~
Local: Choose a page to evict only from the pages of the current process
Global: Choose a page to evict from all of the pages
Global does better, especially if the working set grows and/or shrinks

Cleaning Policy
~~~~~~~~~~~~~~~~~~~~~~
For better performance (not having to wait for a dirty page to be written to disk), have a paging daemon that applies a page replacement algorithm to ensure a pool of free page frames

Implementation Issues
~~~~~~~~~~~~~~~~~~~~~~
The OS is involved with paging at:
1) process creation
2) process execution
3) page faults
4) process termination

The OS can "pin" a page in memory to prevent it from being evicted.
+ kernel's pages
+ page dictionary
+ pages that are in the middle of I/O transfers

Concurrency
=======================================================
Things running at the "same" time

Threads
---------------------------------
Process Review
~~~~~~~~~~~~~~~~~~~~~~
OS tracks processes in the process table (each entry is aka a Process Control Block (PCB))

Per-process:
-----------
Address space
Global variables
Open files
Child processes
Pending alarms
Signals and signal handlers
Scheduling information

Per-thread: Thread Control Block (TCB)
-----------
Program counter (the next instruction to execute)
Registers
Stack

Multi-threaded Address Space:
+--------------+
|  Code 1 & 2  | (separate program counters)
+--------------+
|  Heap 1 & 2  |
+--------------+
|              |
 ... (free) ...
|              |
+--------------+
|   Stack 2    |
+--------------+
|              |
 ... (free) ...
|              |
+--------------+
|   Stack 1    |
+--------------+

Why have multiple threads in a single process?
+ Do more things at the same time - I/O & CPU
+ Creating a thread is 10-100 times faster than creating a process
+ Multi-core environments

20181026 *******************************************************

Threads, not processes, are actually what is executed by a CPU
Threads are sometimes called lightweight processes because they share a lot of common characteristics with processes
We don't need protection between threads in the same process because they can be considered friendly, not combative
Goal: Have a set of threads that can work closely and effectively together to perform a task

Example: threads/helloWorld-pthreads.c

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>  // sleep()

const int NUM_THREADS = 10;

void* helloWorld( void* tid ){
    // print the thread ID and exit
    sleep( 5 );
    printf( "Hello, world (thread ID %lu)!\n", (unsigned long) tid);
    pthread_exit( NULL );
}

int main(int argc, char* argv[]){
    // create NUM_THREADS and exit
    pthread_t thread[ NUM_THREADS ];  // child thread structures
    int status = -1;  // return value
    for( unsigned long i = 0; i < NUM_THREADS; i++){
        printf( "Creating thread index %lu (parent thread)\n", i);
        status = pthread_create( &thread[i], NULL, helloWorld, (void*)i);
        if( status != 0 ){
            /* error occurred */
            fprintf(stderr, "Error creating thread index %lu! (status: %d)\n", i, status);
            exit(-1);
        }
    }
    pthread_exit(NULL);  // the last thing main() will do
    return 0;
}

How many threads are running?
11: 10 worker/child threads + 1 main/parent thread

Changing Single-threaded code to Multi-threaded
~~~~~~~~~~~~~~~~~~~~~~
Issues:
1) Global variables
   Solution: routines for creating, reading and writing to "private" global variables
2) Non-reentrant library procedures
   Non-reentrant means they (probably) can't handle multiple simultaneous calls

3 general issues with InterProcess Communication (IPC):
1) Communication
2) Race Conditions
3) Dependencies

Race Conditions
~~~~~~~~~~~~~~~~~~~~~~
Example 1:

What should happen:
-----------
Task A        Task B        Memory Value
Read value                  0
Flip value                  1
              Read value    1
              Flip value    0

Race condition manifested:
-----------
Task A        Task B        Memory Value
Read value                  0
              Read value    0
Flip value                  1
              Flip value    1

20181029 *******************************************************

Example 2: Demo with 2 volunteers

Example: threads/raceConditionsExample.c

/* Description: Race conditions example
 * Some material inspired from https://computing.llnl.gov/tutorials/pthreads/#Joining
 * Date: September 19, 2017
 * Author: Hyrum Carroll
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

const int NUM_THREADS = 10;
const long INCREMENT_VALUE = 10000000;

int sum = 0;

void* incrementSum(void* tID){
    for( int i = 0; i < INCREMENT_VALUE; i++){
        sum = sum + 1;
    }
    printf( "Thread ID %lu done (sum = %d)\n", (unsigned long) tID, sum);
    pthread_exit( NULL );
}

int main(int argc, char* argv[]){
    /* Create NUM_THREADS then print the sum of their increments */
    pthread_t thread[ NUM_THREADS ];  // child thread structures
    int status = -1;  // return value
    for( unsigned long i = 0; i < NUM_THREADS; i++){
        status = pthread_create( &thread[i], NULL, incrementSum, (void*) i);
        if( status != 0) {
            /* error occurred */
            fprintf(stderr, "Error creating thread index %lu! (status: %d)\n", i, status);
            exit(-1);
        }
    }
    printf( "%ld * %d = %d\n", INCREMENT_VALUE, NUM_THREADS, sum);
    pthread_exit(NULL);  // the last thing main() will do
    return 0;
}

Critical Regions
----------------------
Solution to race conditions for shared memory / files / etc.: prohibit multiple processes from reading and writing to/from it at the same time
Need MUTUAL EXCLUSION (two events cannot happen at the same time)
Critical regions are the area(s) of code that involve shared resources
Make it so that only 1 process is in the critical regions at a time

Pthreads API
=================================
Pthreads (POSIX Threads)
-----------
man pthread_create
man pthread_exit
man pthread_join

Example: threads/raceConditionsExample-join.c

pthread_create()
~~~~~~~~~~~~~~~~~~~~~~
pthread_exit()
~~~~~~~~~~~~~~~~~~~~~~
pthread_join()
~~~~~~~~~~~~~~~~~~~~~~
Demo: threads/raceConditionsExample-join.c

Implementations to achieve mutual exclusion
----------------------
Priority inversion problem:

Producer-Consumer Problem
Analogy: 2 people doing dishes: 1 washes, the other rinses / loads into the dishwasher
Demo: threads/producerConsumer.c

Locks
=================================
Locks provide mutual exclusion

How to use a lock:
-----------
1) Obtain a lock before entering a critical region
2) Perform the smallest amount of work necessary to get out of the critical region
3) Release the lock

How NOT to use locks: threads/raceConditionsExample-join-lockVariable.c

Pthreads Locks
~~~~~~~~~~~~~~~~~~~~~~
Initialization or Creation & Deletion
-----------
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
or
int rc = pthread_mutex_init( &lock, NULL );
assert( rc == 0 );
...
rc = pthread_mutex_destroy( &lock );  // note: takes only the mutex pointer
assert( rc == 0 );

Locking
-----------
Option 1 - Blocking:
Option 2 - Non-blocking:

Unlocking
-----------
Example: threads/raceConditionsExample-join-pthreadsLock.c

Evaluating Locks
~~~~~~~~~~~~~~~~~~~~~~
1)
2)
3)

Lock Implementations
~~~~~~~~~~~~~~~~~~~~~~
Disable interrupts
-----------
1)
2)
3)

Spin-wait (How NOT to example above)
-----------
1)
2)

Atomic Test & Set with Yield
-----------
Example:
int flag = -1;  // global variable; 1 in critical region; 0 available
void init(){ }
void lock(){ }
void unlock(){ }

Queues
-----------

20181107 *******************************************************

Condition Variables
=================================
What is a condition variable?
~~~~~~~~~~~~~~~~~~~~~~
Used to communicate between two threads/processes
A queue on which a thread can place itself to wait for some condition
pthread_cond_wait( cond, mutex )
pthread_cond_signal( cond )

20181109 *******************************************************

Semaphores
=================================
Analogy: Street light protecting the (critical region of the) intersection
Definition: An object with an integer value
Two main routines:
1) sem_wait()
2) sem_post()
Implemented with system calls
Kernel disables interrupts while executing these system calls

Initialization:
~~~~~~~~~~~~~~~~~~~~~~
Review: Function Prototype:
1) return type
2) name
3) parameter list
4) ;

int sem_init( sem_t *sem, int pshared, unsigned int value );
-----------
#include <semaphore.h>
Link with -pthread
sem: semaphore
pshared: 0 for sharing between the threads of a process; nonzero for sharing between processes
value: initial value of sem (common values: 0, 1, MAX)

Example:
-----------
sem_t sem;  // declaration
sem_init( &sem, 0, 1);  // initialize sem to the value 1 (0 means threads in this process)

int sem_wait( sem_t *sem );
~~~~~~~~~~~~~~~~~~~~~~
P() or down() or sem_wait()
Pseudocode
-----------
sem -= 1
if( sem < 0 ){
    block / suspend
    // the absolute value of sem is equal to the number of
    waiting threads
}
Checking, setting and possibly going to sleep are atomic

int sem_post( sem_t *sem );
~~~~~~~~~~~~~~~~~~~~~~
V() or up() or sem_post()
Pseudocode
-----------
sem += 1
if( sem <= 0 ){
    unblock one of the waiting threads
}
Checking and possibly waking up a process are atomic

Two types of semaphores:
1) For mutual exclusion (binary semaphores or locks)
2) For synchronization of events (counting semaphores)

Binary Semaphores (Locks or Mutexes)
~~~~~~~~~~~~~~~~~~~~~~
A special semaphore with just 2 states: locked and unlocked (0 and 1)
Used to protect critical sections, etc.

Typical usage of semaphores to protect a critical section:
sem_t sem;
sem_init( &sem, 0, 1);

// thread worker function
while( TRUE ){
    ...
    sem_wait( &sem );
    // critical section
    sem_post( &sem );
    ...
}

Counting Semaphore
~~~~~~~~~~~~~~~~~~~~~~
Initialize the semaphore to the count available and then start

Producer Consumer Problem:
threads/producerConsumer-semaphoreA.c
threads/producerConsumer-semaphoreB.c
threads/producerConsumer-semaphoreC.c
threads/producerConsumer-semaphore.c - complete solution

20181112 *******************************************************

Reader Writer Locks
~~~~~~~~~~~~~~~~~~~~~~
How many threads can successfully read from a file or database at a time?
How many threads can successfully write to a file or database at a time?

threads/readersWriters-01.c
-----------
/* Strategy: If a reader has the semaphore/lock for the critical section, allow other readers to enter the critical section. If a writer has the lock, don't let anyone else in the critical section.
   Source: Operating Systems: Three Easy Pieces (now version 1.00), Chapter 31, page 11.
*/
typedef struct rwLocks {
    sem_t lock;       // binary semaphore (basic lock)
    sem_t writelock;  // used to allow ONE writer or MANY readers
    int readers;      // count of readers reading in critical section
} rwlock_t;

void rwlock_init(rwlock_t *rw){
    rw->readers = 0;
    sem_init(&rw->lock, 0, 1);
    sem_init(&rw->writelock, 0, 1);
}

void rwlock_acquire_writelock( rwlock_t* rw ){
    sem_wait( &rw->writelock );
    //sem_wait( &(rw->writelock) );
    //sem_wait( &((*rw).writelock) );
}

void rwlock_release_writelock(rwlock_t *rw){
    sem_post( &rw->writelock );
}

void rwlock_acquire_readlock(rwlock_t *rw){
    sem_wait( &(rw->lock) );
    rw->readers++;
    if( rw->readers == 1 ){
        sem_wait( &rw->writelock );  // first reader acquires writelock
    }
    sem_post( &(rw->lock) );
}

void rwlock_release_readlock(rwlock_t *rw){
    sem_wait( &(rw->lock) );
    rw->readers--;
    if( rw->readers == 0 ){
        sem_post( &rw->writelock );  // last reader releases writelock
    }
    sem_post( &(rw->lock) );
}

Dining Philosophers
~~~~~~~~~~~~~~~~~~~~~~
Classic concurrency problem
Introduced and solved by Dijkstra

while( 1 ){
    think();
    getForks();
    eat();
    putForks();
}

Demo:
-----------
Instructions:
1) Take left fork
2) Take right fork
3) Eat
4) Put left fork on table
5) Put right fork on table
6) Repeat at step #1

Solution?

Semaphore Implementation
~~~~~~~~~~~~~~~~~~~~~~
Requires locks and condition variables :)

20181114 *******************************************************

Common Concurrency Problems
=================================
Deadlock
~~~~~~~~~~~~~~~~~~~~~~
Operating systems have resources that more than one process would like to use at the same time.
Examples:
Printers
Kernel data structures
Database records
etc.

What is deadlock?
Two threads waiting on each other to release a lock
A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause. (Modern Operating Systems, 4th Ed.)

Example: Two processes want to use a scanner and a Blu-ray recorder.
If processA has (exclusive) access to the scanner and wants (exclusive) access to the Blu-ray recorder, but processB has (exclusive) access to the Blu-ray recorder and wants (exclusive) access to the scanner.

General steps to using a resource:
1) Request the resource
2) Use the resource
3) Release the resource

If the resource is not available, the process must either
1) wait (be blocked)
2) wait (poll to check availability)

Remember the typical usage of semaphores to protect a critical section:
...
sem_wait( sem )
// critical section (use the resource)
sem_post( sem )
...

What if there are 2 resources?
...
sem_wait( sem1 )
sem_wait( sem2 )
// critical section (use the resources)
sem_post( sem2 )
sem_post( sem1 )
...

Examples:
concurrency/twoResourcesExampleA.c
concurrency/twoResourcesExampleB.c

Kinds of deadlock:
+ Resource deadlock
+ Communication deadlock

Conditions for Resource Deadlocks
----------------------
4 Conditions for deadlock (Coffman et al. (1971)):
1) Mutual exclusion: Each resource is either assigned to a process or is available
2) Hold-and-wait: Processes that already have a resource can request new resources
3) No-preemption: Resources cannot be preempted from the process that they're assigned to
4) Circular wait: There's a circular list of processes with each process waiting on the resource of another process

Deadlock Modeling
----------------------
Use directed graphs to model deadlock:
Square vertices are resources
Circle vertices are processes
Edge from a resource to a process indicates that the process (exclusively) has that resource
Edge from a process to a resource indicates that the process has requested that resource

4 strategies for dealing with deadlocks:
1) Ignore the problem
2) Detection and recovery
3) Dynamic avoidance
4) Prevention

The Ostrich Algorithm
---------------------------------
Simplest and most common approach: don't do anything
If the probability of a deadlock occurring is very low, then why do anything?
Deadlock Detection and Recovery --------------------------------- Detection with One Resource of Each Type ---------------------- Construct a resource graph and check for a cycle Example: ----------- Process A holds R and wants S Process B (holds nothing) and wants T Process C (holds nothing) and wants S Process D holds U and wants S & T Process E holds T and wants V Process F holds W and wants S Process G holds V and wants U Detecting cycles ----------- Depth-First Search starting from each node + Mark edges as visited Terminate early if a node is visited twice (a cycle is detected) Detection with Multiple Resources of Each Type ---------------------- 4 data structures are necessary: E: Existing resources vector (an index for each resource) A: Available resource vector (an index for each resource) C: Current allocation matrix (rows are processes, columns are resources) R: Request matrix (rows are processes, columns are resources) When to check for deadlock? + Every resource request? + Every k minutes? + Low CPU utilization? 
Recovery from Deadlock
----------------------
Recovery through Preemption
-----------
(Forcibly) Take away a resource

Recovery through Rollback
-----------
Periodically checkpoint processes (record their state, including resource requests)
Roll back a process that has a needed resource

Recovery through Killing Process(es)
-----------
Easiest (except for cleaning up the mess)
Hopefully the cycle will be broken
Not all processes can be restarted

20181116 *******************************************************

Deadlock Avoidance
---------------------------------
Safe State
-----------
Some scheduling order exists that lets all processes complete

Unsafe State
-----------
No guarantee that a scheduling order exists that lets all processes complete
Not deadlock

Resource trajectory for two processes: A & B
+ Shaded regions are impossible (due to mutual exclusion) (both processes need the same resource)
+ Paths that include impossible regions will deadlock

Deadlock Prevention
---------------------------------
4 Conditions for deadlock (Coffman et al.
(1971)):
1) Mutual exclusion: Each resource is either assigned to a process or is available
2) Hold-and-wait: Processes that already have a resource can request new resources
3) No-preemption: Resources cannot be preempted from the process that they're assigned to
4) Circular wait: There's a circular list of processes with each process waiting on the resource of another process

Mutual Exclusion
----------------------
Avoid assigning a resource unless absolutely necessary (or at least to as few processes as possible)
+ Data: Make data read only
+ Printers: Allow a process to add its print request to a job pool (then only the print daemon requests the actual printer)

Hold & Wait
----------------------
Prevent processes from holding resources
Have jobs list all resources necessary up front
+ Don't start a job until everything can be allocated
+ Allocates resources for longer than is necessary, but avoids deadlock

No-preemption
----------------------
Allow resources to be taken away
Good luck

Circular Wait
----------------------
1) Number all resources
2) Only allocate a resource to a process if its number is greater than the numbers of all of that process's already-allocated resources

Other Issues
---------------------------------
Communication Deadlock
----------------------
Process A sends a message to process B and blocks until it replies
Process B sends the reply, but it is lost.
Is this deadlock? Yes
Solution: Timeouts - if the expected action has not occurred by a predetermined time, then assume that it was lost or that it's never going to happen

Livelock
----------------------
A process is not deadlocked, but it can't get any real work done
Example: Limited number of open files.

Starvation
----------------------
Starvation occurs when a system is busy and a request for a resource continually goes unfulfilled due to the allocation policy
Example: CPU/printer for shortest jobs first. With a constant stream of small jobs, a long job will never run.
Solution: First-Come, First-Served allocation

Persistence
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Input/Output
++++++++++++++++++++++++++++++++++++++++++++
I/O Hardware
=================================
I/O Devices
~~~~~~~~~~~~~~~~~~~~~~
Block devices: store information in (addressable) blocks
Character devices: send/receive a stream of data

Device Controllers
~~~~~~~~~~~~~~~~~~~~~~
Standard interfaces: SATAe, SCSI, USB, Thunderbolt, FireWire
The controller is a level of abstraction to hide the ugliness of the device
Disk controllers read the data from the drive, then need to perform error correction on it before writing it to memory

Memory-Mapped I/O
~~~~~~~~~~~~~~~~~~~~~~
Controllers have some registers (and many have a buffer) that the CPU can write to and read from
2 Options for how the CPU reads/writes to/from a controller:
1) I/O Ports
   Requires assembly instructions
2) Memory-mapped I/O
   Add the controller's registers to the memory (address) space

Direct Memory Access (DMA)
~~~~~~~~~~~~~~~~~~~~~~
Most systems have a single DMA controller
CPU delegates memory transfers to the DMA
Steps for a memory transfer using DMA:
1) CPU programs the DMA (what to transfer where)
   CPU issues read command to disk (so that it'll be ready)
2) DMA issues read command to disk
3) Device writes to memory
4) Device sends acknowledgment signal to DMA
   If more bytes, change where and how much and repeat 2) & 3)
5) DMA interrupts CPU

Interrupts (revisited)
~~~~~~~~~~~~~~~~~~~~~~
Steps to processing an interrupt
1) Device
   If no other interrupts are pending, takes care of the interrupt
   else the interrupt is pending
2) Interrupt controller
   Save state of the previously executing process (registers)
   Set up the context of the interrupt-service procedure
   CPU uses the interrupt number to look up the interrupt-service procedure in the interrupt vector
3) CPU
   Interrupt-service procedure sends acknowledgment to the controller

Principles of I/O Software
=================================
Goals of the I/O Software
~~~~~~~~~~~~~~~~~~~~~~
Device independence: uniform interface for reading from/writing to any device
Example:
ls /Volume/flashDrive
ls /Volume/networkDrive
ls /   # hard drive

Uniform naming: Device names should be strings
Error handling: Handle errors as close to the hardware as possible
Example: Speck of dust causing a bad read on a disk drive. Have the drive repeat the read.
Simulating synchronous I/O: Most I/O is asynchronous (because it's interrupt driven), but code is much easier to write if it assumes it's synchronous (or blocking). Make I/O calls appear to be blocking.
Buffering: Store streams of data in a buffer
Efficiency / performance
Avoid data loss

3 ways to perform I/O:
1)
2)
3)

Programmed I/O
~~~~~~~~~~~~~~~~~~~~~~
User code calls a system call to write to the device
CPU copies data from user space into kernel space
Kernel waits until the device is available
Writes a character
Waits until the device updates its status register
Returns control to user code
Advantage:
Disadvantage:

Interrupt-Driven I/O
~~~~~~~~~~~~~~~~~~~~~~
Instead of continuously polling the device to see if it's ready,
Advantage:
Disadvantage:

I/O Using DMA
~~~~~~~~~~~~~~~~~~~~~~
Instead of frequent interrupts, use DMA
Advantage:
Disadvantage:

I/O Software Layers
=================================
Usually, software is divided into 4 layers, each one interfacing with its neighboring layer:
(hardware)

Interrupt Handlers
~~~~~~~~~~~~~~~~~~~~~~
Dealing with interrupts is complicated and therefore is often buried in device drivers
Several steps to process an interrupt

Device Drivers
~~~~~~~~~~~~~~~~~~~~~~
Code to control the device
Current practice is to add or link device drivers to the kernel (instead of user mode)
Device drivers typically:
Check commands sent to it
Then, check if the device is ready
Then wait for the controller
Drivers need to be reentrant (able to handle being called a second time, before they're done with the first call)

Device-Independent I/O Software
~~~~~~~~~~~~~~~~~~~~~~
Uniform Interfacing for Device Drivers
-----------
Make all interactions with I/O devices and drivers the same
OS defines a set of functions for drivers

Buffers
-----------
Prevent data loss due to real-time constraints (sending and receiving)
Buffer in the kernel in case the user's page is not in memory
Use a double buffer or circular buffer to prevent data loss while transferring the first buffer
Too much buffering can slow down transfers

User Interfaces: Keyboard, Mouse, and Displays
=================================
Keyboards
~~~~~~~~~~~~~~~~~~~~~~
Contain an embedded microprocessor to communicate which key was pressed and released
These events generate an interrupt

Mice
~~~~~~~~~~~~~~~~~~~~~~
Sends the change in x and y positions and which button was pressed

20181126 *******************************************************

Hard Disk Drives
++++++++++++++++++++++++++++++++++++++++++++
Hard Disk Parts
=================================
Tracks
Sectors
Cylinders
Platters
Read/write heads

Think of a hard disk as an array of sectors
This is the address space of the drive

Hard Disk Operations
=================================
Rotational Delay
~~~~~~~~~~~~~~~~~~~~~~
Waiting for the disk to rotate so that the requested sector is under the read/write head
.5 rotations / 7200 rotations/minute
(.5 rotations) * (1 minute/7200 rotations) * (60 seconds/1 minute) * (1000 milliseconds / 1 second) = 4.2 ms average rotational delay

Seek Time
~~~~~~~~~~~~~~~~~~~~~~
The amount of time to move the read/write head to the new track
Track skew

Transfer
~~~~~~~~~~~~~~~~~~~~~~
Reading or writing the data to or from the disk

I/O Timing
=================================
T_I/O = T_seek + T_rotation + T_transfer
transfer rate = Size_transfer / T_I/O
Which is usually the smallest for 4 KB requests? Which is usually the largest?
1) T_seek
2) T_rotation
3) T_transfer

Example Drives
~~~~~~~~~~~~~~~~~~~~~~
              Cheetah 15K.5   Barracuda
              -------------   ---------
Capacity      300 GB          1 TB
RPM           15,000          7,200
Aver Seek     4 ms            9 ms
Max Trans     125 MB/s        105 MB/s
Platters      4               4
Cache         16 MB           16/32 MB
Interface     SCSI            SATA

Random Workload
~~~~~~~~~~~~~~~~~~~~~~
4 KB blocks

Cheetah
-----------
T_I/O = T_seek + T_rotation + T_transfer
T_seek: 4 ms (from the manufacturer)
T_rotational: (.5 rotations) * (1 minute/15000 rotations) * (60 seconds/1 minute) * (1000 milliseconds / 1 second) = 2 ms
T_transfer: 1 second/125 MB * 1 MB / 1024 KB * 4 KB * 1000 ms / 1 second = 0.03125 ms
T_I/O = 4 + 2 + 0.03 = 6.03 ~= 6 ms
Transfer rate = 4 KB / 6 ms = .66 MB/s

Barracuda
-----------
T_I/O = T_seek + T_rotation + T_transfer

Sequential Workload (100 MB transfers)
~~~~~~~~~~~~~~~~~~~~~~
Cheetah
-----------
T_I/O = T_seek + T_rotation + T_transfer
T_seek: 4 ms
T_rotational: 2 ms
T_transfer: 100 MB * 1 second / 125 MB = 800 ms
T_I/O = 806 ms
Transfer rate = 124 MB/s

Barracuda
-----------
T_I/O = T_seek + T_rotation + T_transfer

20181128 *******************************************************

Disk Scheduling
=================================
Goals:
1) Fast access time
   Minimize seek time
   Seek time ~= seek distance
2) High disk bandwidth
   Disk bandwidth is the total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer

First-Come, First-Served (FCFS)
~~~~~~~~~~~~~~~~~~~~~~
640 cylinders

Shortest Seek Time First (SSTF)
~~~~~~~~~~~~~~~~~~~~~~
Re-orders requests to minimize seek time from the current head position
Starvation

C-LOOK
~~~~~~~~~~~~~~~~~~~~~~
Move to one end and sweep to the other end (servicing requests) (only goes as far as the most extreme request), then jump back to the beginning without servicing requests on the return

Selecting a Disk-Scheduling Algorithm
=================================
SSTF is common
Heavy disk loads: C-LOOK

Redundant Arrays of Inexpensive Disks (RAID)
++++++++++++++++++++++++++++++++++++++++++++
Use multiple disk drives to provide reliability (by creating redundancy)
and/or improve performance
RAID is arranged into standard levels, 0 through 6:

RAID 0
=================================
No redundancy
Stripe chunks across all disks
Maximum performance
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_0.svg

RAID 1
=================================
Mirroring
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_1.svg

RAID 2
=================================
Stripe data by bits, with error-correction bits stored on extra drives
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID2_arch.svg

RAID 3
=================================
Stripe data by bytes, with one parity drive
Good for long sequential reads or writes
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_3.svg

RAID 4
=================================
Stripe data by blocks, with one parity drive
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_4.svg

RAID 5
=================================
Stripe data by blocks, with one parity block per stripe distributed among the drives
If one drive fails, the data is still usable
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_5.svg

RAID 6
=================================
Stripe data by blocks, with two parity blocks per stripe distributed among the drives
If two drives fail, the data is still usable
https://en.wikipedia.org/wiki/Standard_RAID_levels#/media/File:RAID_6.svg

20181130 *******************************************************

Test question: Disk scheduling: Rotational delay

File Systems
++++++++++++++++++++++++++++++++++++++++++++
3 goals for file systems:
1) Store very large amounts of information
2) Information must survive after the process ends
3) Multiple processes need to be able to use it

Magnetic disks and solid-state drives provide 2 basic functions:
1) Read block k
2) Write block k

Files
---------------------------------
Logical units of information created
by a process that are stored and retrievable

File Naming
----------------------

File Structure
----------------------
UNIX and Windows represent a file as an unstructured sequence of bytes
Some large mainframes have a key for each record, and the file is composed
of a linked list of nodes

File Types
----------------------
Regular files
Directories (system files)

ASCII regular files
-----------
Lines end in \r\n (Windows) or \n (UNIX)

Binary regular files
-----------
"Magic number" prevents a non-executable file from being executed

File Access
----------------------
Sequential access: Read all of the bytes in order (can rewind)
Random access: Read bytes in any order

File Attributes
----------------------
Each file has a name and data
The file system also tracks file attributes (or metadata):
+ file modification time
+ file creation time
+ file open/accessed time
+ owner
+ creator
+ executable
+ file size
+ etc.

Common File System Calls
----------------------
create
delete
open
close
read
write
seek
  + Move the file pointer to a different location
get
  + Get attributes
set
  + Set attributes
rename

Example: files/copy.c (program to copy a file, illustrating several system calls)

20181203 *******************************************************

Directories
---------------------------------

Hierarchical Directory System
----------------------

Path Names
----------------------
1) Absolute path names: Start with the separator
   (Windows: \, for example: C:\...; UNIX: /, for example: /usr/bin/perl)
2) Relative path names: Start from the working directory

Special directory names:
-----------
.   Current directory
..  Parent directory

Examples:
./a.out
(executes the a.out that is in the working directory, sometimes necessary
because of PATH restrictions)

cp /tmp/studyGuide.txt .
(copy studyGuide.txt from the /tmp directory to the working directory;
same as cp /tmp/studyGuide.txt studyGuide.txt)

cp ../project2/proj2.c proj3.c
(copy proj2.c from the project2 directory, which is in the parent
directory, to proj3.c)

Common Directory System Calls
----------------------
create
  + Just creates . and ..
delete
  + Only if it's empty (just . and ..)
opendir
closedir
readdir
rename
link
  + Adds a directory entry for an existing file and increments that
    file's link counter in its i-node
unlink

File System Implementation
---------------------------------

File System Layout
----------------------
At sector 0 is the Master Boot Record (MBR)
After the MBR is the partition table (with the starting address of each
partition)
One of the partitions is marked as active
Each partition starts with a boot block
After booting, the BIOS reads the MBR and looks for the active partition
It then loads the OS from the active partition

Implementing Files
----------------------

Contiguous Allocation
-----------
Store each file as contiguous disk blocks
Advantages:
  Simple
  Low overhead (start address and number of blocks)
Disadvantages:
  Search time
  Fragmented disk space (compacting is expensive)
  1) holes from removed files
  2) last block
Used on early systems, CD-ROMs, and DVDs

Linked List Allocation
-----------
Make a linked list of disk blocks, with the first word of each block the
pointer to the next block
Advantages:
  Avoids fragmented disk space (except for the last block)
Disadvantages:
  Random access is very slow
  Data per block is no longer a power of 2 (the pointer uses part of
  each block)
Solution to the disadvantages: put the pointers in a table, the
File Allocation Table (FAT)
Example: MS-DOS file system

I-nodes (index-nodes)
-----------
Store the attributes and the disk addresses of the file's blocks
Similar to Windows' NTFS and UNIX's file systems

Implementing Directories
----------------------
Directories are files that list their contents (files and/or directories)
They store file names
They store (or provide the location of) file attributes
Variable-length filenames:
  Option 1: Store the size of the filename, then the filename (in-line)
  Option 2: Store the filenames in a heap

Shared Files
----------------------
Allowing a file system to share a file changes the structure from a tree
to a directed acyclic graph (DAG)
Some complications can arise from shared files:
1) All changes to the contents of the file need to be visible at each
   location where it's shared
2) Having a file in your directory whose owner is another user
3) Having a symbolic link to a file that does not exist
4) Backup systems need to give the option to not back up files/directories
   that are linked

Journaling File Systems
----------------------
Logs requests for modifications to data on the disk
Removes log entries when the modification is completed
Needs all requests to be "idempotent" (can be repeated multiple times
without harm)
Fault tolerant: if the system crashes, it just repeats the modification
it was in the middle of
Windows' NTFS and Linux's ext3 use journaling

20181205 *******************************************************

File System Management and Optimization
---------------------------------

Disk-space Management
----------------------

Layout
-----------
Files can be stored contiguously or in blocks
What about when a file grows?
It often must be moved
Most common: blocks

Block Size
-----------
Small blocks increase disk utilization
Large blocks increase disk performance

Tracking Free Blocks
-----------
Option 1: Linked list of blocks, each filled with the disk addresses of
          free blocks
Option 2:

SSDs
++++++++++++++++++++++++++++++++++++++++++++
Read a page at a time
To write one or more bytes, the containing block has to be erased first
With no moving parts, more reliable than HDDs
Disadvantage: manufacturers rate each block for only 10,000 - 100,000
writes

SSD Performance And Cost
=================================
                          Random            Sequential
                          Reads   Writes    Reads   Writes
Device                    (MB/s)  (MB/s)    (MB/s)  (MB/s)
------------------------  ------  ------    ------  ------
Samsung 840 Pro SSD       103     287       421     384
Seagate 600 SSD           84      252       424     374
Intel SSD 335 SSD         39      222       344     354
Seagate Savvio 15K.3 HDD  2       2         223     223