From b5f0874cd96ee2a62aabc645b9626c2749cb6a01 Mon Sep 17 00:00:00 2001 From: manuel Date: Mon, 26 Mar 2012 12:54:45 +0200 Subject: initial pintos checkin --- doc/pintos_2.html | 1734 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1734 insertions(+) create mode 100644 doc/pintos_2.html (limited to 'doc/pintos_2.html') diff --git a/doc/pintos_2.html b/doc/pintos_2.html new file mode 100644 index 0000000..ae51974 --- /dev/null +++ b/doc/pintos_2.html @@ -0,0 +1,1734 @@ + + + + + +Pintos Projects: Project 0--Introducing Pintos + + + + + + + + + + + + + + + + + + + +

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

+ +

2. Project 0: Introducing Pintos

+ +

+ +In this assignment, you will learn about the existing functionality +in Pintos, and add two small features to the system: a more efficient +implementation of sleep, and the ability to pass command line +arguments to user programs. +

+ +You will be working in the threads directory for the first part +of this assignment (with some work in the devices directory on +the side), and modify the file userprog/process.c in the second part. +

+ +The tests for Project 0 are executed by changing the working directory +to intro and running make followed by make check. +

+ +Before you read the description of this project, you should read all of +the following sections: 1. Introduction, B. Coding Standards, +D. Debugging Tools, and E. Development Tools. You should at least +skim the material from A.1 Loading through A.5 Memory Allocation, especially A.3 Synchronization. +

+ + +

+ +

2.1 Understanding Threads

+ +

+ +The first step is to read and understand the code for the initial thread +system. +Pintos already implements thread creation and thread completion, +a simple scheduler to switch between threads, and synchronization +primitives (semaphores, locks, condition variables, and optimization +barriers). +

+ +Some of this code might seem slightly mysterious. If +you haven't already compiled and run the base system, as described in +the introduction (see section 1. Introduction), you should do so now. You +can read through parts of the source code to see what's going +on. If you like, you can add calls to printf() almost +anywhere, then recompile and run to see what happens and in what +order. You can also run the kernel in a debugger and set breakpoints +at interesting spots, single-step through code and examine data, and +so on. +

+ +When a thread is created, you are creating a new context to be +scheduled. You provide a function to be run in this context as an +argument to thread_create(). The first time the thread is +scheduled and runs, it starts from the beginning of that function +and executes in that context. When the function returns, the thread +terminates. Each thread, therefore, acts like a mini-program running +inside Pintos, with the function passed to thread_create() +acting like main(). +

+ +At any given time, exactly one thread runs and the rest, if any, +become inactive. The scheduler decides which thread to +run next. (If no thread is ready to run +at any given time, then the special "idle" thread, implemented in +idle(), runs.) +Synchronization primitives can force context switches when one +thread needs to wait for another thread to do something. +

+ +The mechanics of a context switch are +in threads/switch.S, which is 80x86 +assembly code. (You don't have to understand it.) It saves the +state of the currently running thread and restores the state of the +thread we're switching to. +

+ +Using the GDB debugger, slowly trace through a context +switch to see what happens (see section D.5 GDB). You can set a +breakpoint on schedule() to start out, and then +single-step from there.(1) Be sure +to keep track of each thread's address +and state, and what procedures are on the call stack for each thread. +You will notice that when one thread calls switch_threads(), +another thread starts running, and the first thing the new thread does +is to return from switch_threads(). You will understand the thread +system once you understand why and how the switch_threads() that +gets called is different from the switch_threads() that returns. +See section A.2.3 Thread Switching, for more information. +

+ +Warning: In Pintos, each thread is assigned a small, +fixed-size execution stack just under 4 kB in size. The kernel +tries to detect stack overflow, but it cannot do so perfectly. You +may cause bizarre problems, such as mysterious kernel panics, if you +declare large data structures as non-static local variables, +e.g. int buf[1000];. Alternatives to stack allocation include +the page allocator and the block allocator (see section A.5 Memory Allocation). +

+ + +

+ +

2.1.1 Source Files

+ +

+ +Here is a brief overview of the files in the threads +directory. You will not need to modify most of this code, but the +hope is that presenting this overview will give you a start on what +code to look at. +

+ +

loader.S +
loader.h +: The kernel loader. Assembles to 512 bytes of code and data that the +PC BIOS loads into memory and which in turn finds the kernel on disk, +loads it into memory, and jumps to start() in start.S. +See section A.1.1 The Loader, for details. You should not need to look at +this code or modify it. +
+ +
+
start.S +: Does basic setup needed for memory protection and 32-bit +operation on 80x86 CPUs. Unlike the loader, this code is +actually part of the kernel. See section A.1.2 Low-Level Kernel Initialization, +for details. +
+ +
+
kernel.lds.S +: The linker script used to link the kernel. Sets the load address of +the kernel and arranges for start.S to be near the beginning +of the kernel image. See section A.1.1 The Loader, for details. Again, you +should not need to look at this code +or modify it, but it's here in case you're curious. +
+ +
+
init.c +
init.h +: Kernel initialization, including main(), the kernel's "main +program." You should look over main() at least to see what +gets initialized. You might want to add your own initialization code +here. See section A.1.3 High-Level Kernel Initialization, for details. +
+ +
+
thread.c +
thread.h +: Basic thread support. Much of your work will take place in these files. +thread.h defines struct thread, which you are likely to modify +in all three projects. See A.2.1 struct thread and A.2 Threads for +more information. +
+ +
+
switch.S +
switch.h +: Assembly language routine for switching threads. Already discussed +above. See section A.2.2 Thread Functions, for more information. +
+ +
+
palloc.c +
palloc.h +: Page allocator, which hands out system memory in multiples of 4 kB +pages. See section A.5.1 Page Allocator, for more information. +
+ +
+
malloc.c +
malloc.h +: A simple implementation of malloc() and free() for +the kernel. See section A.5.2 Block Allocator, for more information. +
+ +
+
interrupt.c +
interrupt.h +: Basic interrupt handling and functions for turning interrupts on and +off. See section A.4 Interrupt Handling, for more information. +
+ +
+
intr-stubs.S +
intr-stubs.h +: Assembly code for low-level interrupt handling. See section A.4.1 Interrupt Infrastructure, for more information. +
+ +
+
synch.c +
synch.h +: Basic synchronization primitives: semaphores, locks, condition +variables, and optimization barriers. You will need to use these for +synchronization in all +four projects. See section A.3 Synchronization, for more information. +
+ +
+
io.h +: Functions for I/O port access. This is mostly used by source code in +the devices directory that you won't have to touch. +
+ +
+
vaddr.h +
pte.h +: Functions and macros for working with virtual addresses and page table +entries. These will be more important to you in project 2. For now, +you can ignore them. +
+ +
+
flags.h +: Macros that define a few bits in the 80x86 "flags" register. +Probably of no interest. See [ IA32-v1], section 3.4.3, "EFLAGS +Register," for more information. +

+ + +

+ +

2.1.1.1 `devices` code

+ +

+ +The basic threaded kernel also includes these files in the +devices directory: +

+ +

timer.c +
timer.h +: System timer that ticks, by default, 100 times per second. You will +modify this code in this project. +
+ +
+
vga.c +
vga.h +: VGA display driver. Responsible for writing text to the screen. +You should have no need to look at this code. printf() +calls into the VGA display driver for you, so there's little reason to +call this code yourself. +
+ +
+
serial.c +
serial.h +: Serial port driver. Again, printf() calls this code for you, +so you don't need to do so yourself. +It handles serial input by passing it to the input layer (see below). +
+ +
+
block.c +
block.h +: An abstraction layer for block devices, that is, random-access, +disk-like devices that are organized as arrays of fixed-size blocks. +Out of the box, Pintos supports two types of block devices: IDE disks +and partitions. +
+ +
+
ide.c +
ide.h +: Supports reading and writing sectors on up to 4 IDE disks. +
+ +
+
partition.c +
partition.h +: Understands the structure of partitions on disks, allowing a single +disk to be carved up into multiple regions (partitions) for +independent use. +
+ +
+
kbd.c +
kbd.h +: Keyboard driver. Handles keystrokes passing them to the input layer +(see below). +
+ +
+
input.c +
input.h +: Input layer. Queues input characters passed along by the keyboard or +serial drivers. +
+ +
+
intq.c +
intq.h +: Interrupt queue, for managing a circular queue that both kernel +threads and interrupt handlers want to access. Used by the keyboard +and serial drivers. +
+ +
+
rtc.c +
rtc.h +: Real-time clock driver, to enable the kernel to determine the current +date and time. By default, this is only used by thread/init.c +to choose an initial seed for the random number generator. +
+ +
+
speaker.c +
speaker.h +: Driver that can produce tones on the PC speaker. +
+ +
+
pit.c +
pit.h +: Code to configure the 8254 Programmable Interrupt Timer. This code is +used by both devices/timer.c and devices/speaker.c +because each device uses one of the PIT's output channel. +

+ + +

+ +

2.1.1.2 `lib` files

+ +

+ +Finally, lib and lib/kernel contain useful library +routines. (lib/user will be used by user programs, starting in +project 2, but it is not part of the kernel.) Here's a few more +details: +

+ +

ctype.h +
inttypes.h +
limits.h +
stdarg.h +
stdbool.h +
stddef.h +
stdint.h +
stdio.c +
stdio.h +
stdlib.c +
stdlib.h +
string.c +
string.h +: A subset of the standard C library. See section B.2 C99, for +information +on a few recently introduced pieces of the C library that you might +not have encountered before. See section B.3 Unsafe String Functions, for +information on what's been intentionally left out for safety. +
+ +
+
debug.c +
debug.h +: Functions and macros to aid debugging. See section D. Debugging Tools, for +more information. +
+ +
+
random.c +
random.h +: Pseudo-random number generator. The actual sequence of random values +will not vary from one Pintos run to another, unless you do one of +three things: specify a new random seed value on the -rs +kernel command-line option on each run, or use a simulator other than +Bochs, or specify the -r option to pintos. +
+ +
+
round.h +: Macros for rounding. +
+ +
+
syscall-nr.h +: System call numbers. Not used until project 2. +
+ +
+
kernel/list.c +
kernel/list.h +: Doubly linked list implementation. Used all over the Pintos code, and +you'll probably want to use it a few places yourself. +
+ +
+
kernel/bitmap.c +
kernel/bitmap.h +: Bitmap implementation. You can use this in your code if you like, but +you probably won't have any need for it in project 0 and project 1. +
+ +
+
kernel/hash.c +
kernel/hash.h +: Hash table implementation. +
+ +
+
kernel/console.c +
kernel/console.h +
kernel/stdio.h +: Implements printf() and a few other functions. +

+ + +

+ +

2.1.2 Synchronization

+ +

+ +Proper synchronization is an important part of the solutions to these +problems. Any synchronization problem can be easily solved by turning +interrupts off: while interrupts are off, there is no concurrency, so +there's no possibility for race conditions. Therefore, it's tempting to +solve all synchronization problems this way, but don't. +Instead, use semaphores, locks, and condition variables to solve the +bulk of your synchronization problems. Read the tour section on +synchronization (see section A.3 Synchronization) or the comments in +threads/synch.c if you're unsure what synchronization primitives +may be used in what situations. +

+ +In the Pintos projects, the only class of problem best solved by +disabling interrupts is coordinating data shared between a kernel thread +and an interrupt handler. Because interrupt handlers can't sleep, they +can't acquire locks. This means that data shared between kernel threads +and an interrupt handler must be protected within a kernel thread by +turning off interrupts. +

+ +This project only requires accessing a little bit of thread state from +interrupt handlers. For the alarm clock, the timer interrupt needs to +wake up sleeping threads. In the advanced scheduler, the timer +interrupt needs to access a few global and per-thread variables. When +you access these variables from kernel threads, you will need to disable +interrupts to prevent the timer interrupt from interfering. +

+ +When you do turn off interrupts, take care to do so for the least amount +of code possible, or you can end up losing important things such as +timer ticks or input events. Turning off interrupts also increases the +interrupt handling latency, which can make a machine feel sluggish if +taken too far. +

+ +The synchronization primitives themselves in synch.c are +implemented by disabling interrupts. You may need to increase the +amount of code that runs with interrupts disabled here, but you should +still try to keep it to a minimum. +

+ +Disabling interrupts can be useful for debugging, if you want to make +sure that a section of code is not interrupted. You should remove +debugging code before turning in your project. (Don't just comment it +out, because that can make the code difficult to read.) +

+ +There should be no busy waiting in your submission. A tight loop that +calls thread_yield() is one form of busy waiting. +

+ + +

+ +

2.1.3 Development Suggestions

+ +

+ +In the past, many groups divided the assignment into pieces, then each +group member worked on his or her piece until just before the +deadline, at which time the group reconvened to combine their code and +submit. This is a bad idea. We do not recommend this +approach. Groups that do this often find that two changes conflict +with each other, requiring lots of last-minute debugging. Some groups +who have done this have turned in code that did not even compile or +boot, much less pass any tests. +

+ +Instead, we recommend integrating your team's changes early and often, +using the source code control system git. +This is less likely to produce surprises, because everyone can see +everyone else's code as it is written, instead of just when it is +finished. These systems also make it possible to review changes and, +when a change introduces a bug, drop back to working versions of code. +

+ +You should expect to run into bugs that you simply don't understand +while working on this and subsequent projects. When you do, +reread the appendix on debugging tools, which is filled with +useful debugging tips that should help you to get back up to speed +(see section D. Debugging Tools). Be sure to read the section on backtraces +(see section D.4 Backtraces), which will help you to get the most out of every +kernel panic or assertion failure. +

+ + +

+ +

2.2 Understanding User Programs

+ +

+ +The tests for both the alarm clock assignment in Project 0, and the +priority scheduler in Project 1, run as part of the operating system +kernel, with full access to privileged parts of the system. +Once we start running user programs on top of the operating system, this +is no longer true. +

+ +We allow more than one process to run at a time. Each process has one +thread (multithreaded processes are not supported). User programs are +written under the illusion that they have the entire machine. This +means that when you load and run multiple processes at a time, you must +manage memory, scheduling, and other state correctly to maintain this +illusion. +

+ +In Project 2, we will test your operating system by running +user programs. This gives you much greater freedom. You must make sure +that the user program interface meets the specifications described here, +but given that constraint you are free to restructure or rewrite kernel +code however you wish. +

+ + +

+ +

2.2.1 Source Files

+ +

process.c +
process.h +: Loads ELF binaries and starts processes. +
+ +
+
pagedir.c +
pagedir.h +: A simple manager for 80x86 hardware page tables. +Although you probably won't want to modify this code for this project, +you may want to call some of its functions. +See Page Tables, for more information. +
+ +
+
syscall.c +
syscall.h +: Whenever a user process wants to access some kernel functionality, it +invokes a system call. +
+ +
+
exception.c +
exception.h +: When a user process performs a privileged or prohibited operation, it +traps into the kernel as an "exception" or "fault."(2) These files handle +exceptions. In project 2, you will need to modify the page fault +handler to support lazy page loading and stack growth. +
+ +
+
gdt.c +
gdt.h +: The 80x86 is a segmented architecture. The Global Descriptor +Table (GDT) is a table that describes the segments in use. These +files set up the GDT. You should not need to modify these +files for any of the projects. You can read the code if +you're interested in how the GDT works. +
+ +
+
tss.c +
tss.h +: The Task-State Segment (TSS) is used for 80x86 architectural +task switching. Pintos uses the TSS only for switching stacks when a +user process enters an interrupt handler, as does Linux. You +should not need to modify these files for any of the projects. +You can read the code if you're interested in how the TSS +works. +

+ + +

+ +

2.2.2 Using the File System

+ +

+ +You will need to interface to the file system code, because +user programs are loaded from the file system and most of the +system calls you must implement deal with the file system. +You will want to look over the filesys.h and file.h +interfaces to understand how to use the file system, and especially +its many limitations. +

+ +There is no need to modify the file system code in this course, and so +we recommend that you do not. Working on the file system is likely to +distract you from the project's foci. +

+ +You will have to tolerate the following limitations, however: +

+ +

+No internal synchronization. Concurrent accesses will interfere with one +another. You should use synchronization to ensure that only one process at a +time is executing file system code. +
+ +
+
+File size is fixed at creation time. The root directory is +represented as a file, so the number of files that may be created is also +limited. +
+ +
+
+File data is allocated as a single extent, that is, data in a single +file must occupy a contiguous range of sectors on disk. External +fragmentation can therefore become a serious problem as a file system is +used over time. +
+ +
+
+No subdirectories. +
+ +
+
+File names are limited to 14 characters. +
+ +
+
+A system crash mid-operation may corrupt the disk in a way +that cannot be repaired automatically. There is no file system repair +tool anyway. +

+ +One important feature is included: +

+ +

+Unix-like semantics for filesys_remove() are implemented. +That is, if a file is open when it is removed, its blocks +are not deallocated and it may still be accessed by any +threads that have it open, until the last one closes it. See Removing an Open File, for more information. +

+ +You need to be able to create a simulated disk with a file system +partition. The pintos-mkdisk program provides this +functionality. From the userprog/build directory, execute +pintos-mkdisk filesys.dsk --filesys-size=2. This command +creates a simulated disk named filesys.dsk that contains a 2 +MB Pintos file system partition. Then format the file system +partition by passing -f -q on the kernel's command line: +pintos -f -q. The -f option causes the file system to +be formatted, and -q causes Pintos to exit as soon as the +format is done. +

+ +You'll need a way to copy files in and out of the simulated file system. +The pintos -p ("put") and -g ("get") +options do this. To copy file into the +Pintos file system, use the command pintos -p file -- -q. +(The -- is needed because -p is for the pintos +script, not for the simulated kernel.) To copy it to the Pintos file +system under the name newname, add -a +newname: pintos -p file -a newname -- -q. The +commands for copying files out of a VM are similar, but substitute +-g for -p. +

+ +Incidentally, these commands work by passing special commands +extract and append on the kernel's command line and copying +to and from a special simulated "scratch" partition. If you're very +curious, you can look at the pintos script as well as +filesys/fsutil.c to learn the implementation details. +

+ +Here's a summary of how to create a disk with a file system partition, +format the file system, copy the echo program into the new +disk, and then run echo, passing argument x. +(Argument passing won't work until you implemented it.) It assumes +that you've already built the examples in examples and that the +current directory is userprog/build: +

+ +

pintos-mkdisk filesys.dsk --filesys-size=2 +pintos -f -q +pintos -p ../../examples/echo -a echo -- -q +pintos -q run 'echo x' +

+ +The three final steps can actually be combined into a single command: +

+ +

pintos-mkdisk filesys.dsk --filesys-size=2 +pintos -p ../../examples/echo -a echo -- -f -q run 'echo x' +

+ +If you don't want to keep the file system disk around for later use or +inspection, you can even combine all four steps into a single command. +The --filesys-size=n option creates a temporary file +system partition +approximately n megabytes in size just for the duration of the +pintos run. The Pintos automatic test suite makes extensive +use of this syntax: +

+ +

pintos --filesys-size=2 -p ../../examples/echo -a echo -- -f -q run 'echo x' +

+ +You can delete a file from the Pintos file system using the rm +file kernel action, e.g. pintos -q rm file. Also, +ls lists the files in the file system and cat +file prints a file's contents to the display. +

+ + +

+ +

2.2.3 How User Programs Work

+ +

+ +Pintos can run normal C programs, as long as they fit into memory and use +only the system calls you implement. Notably, malloc() cannot be +implemented because none of the system calls required for this project +allow for memory allocation. Pintos also can't run programs that use +floating point operations, since the kernel doesn't save and restore the +processor's floating-point unit when switching threads. +

+ +The src/examples directory contains a few sample user +programs. The Makefile in this directory +compiles the provided examples, and you can edit it +compile your own programs as well. Some of the example programs will +not work with the current implementation of Pintos. +

+ +Pintos can load ELF executables with the loader provided for you +in userprog/process.c. ELF is a file format used by Linux, +Solaris, and many other operating systems for object files, +shared libraries, and executables. You can actually use any compiler +and linker that output 80x86 ELF executables to produce programs +for Pintos. (We've provided compilers and linkers that should do just +fine.) +

+ +You should realize immediately that, until you copy a +test program to the simulated file system, Pintos will be unable to do +useful work. You won't be able to do +interesting things until you copy a variety of programs to the file system. +You might want to create a clean reference file system disk and copy that +over whenever you trash your filesys.dsk beyond a useful state, +which may happen occasionally while debugging. +

+ + +

+ +

2.2.4 Virtual Memory Layout

+ +

+ +Virtual memory in Pintos is divided into two regions: user virtual +memory and kernel virtual memory. User virtual memory ranges from +virtual address 0 up to PHYS_BASE, which is defined in +threads/vaddr.h and defaults to 0xc0000000 (3 GB). Kernel +virtual memory occupies the rest of the virtual address space, from +PHYS_BASE up to 4 GB. +

+ +User virtual memory is per-process. +When the kernel switches from one process to another, it +also switches user virtual address spaces by changing the processor's +page directory base register (see pagedir_activate() in +userprog/pagedir.c). struct thread contains a pointer to a +process's page table. +

+ +Kernel virtual memory is global. It is always mapped the same way, +regardless of what user process or kernel thread is running. In +Pintos, kernel virtual memory is mapped one-to-one to physical +memory, starting at PHYS_BASE. That is, virtual address +PHYS_BASE accesses physical +address 0, virtual address PHYS_BASE + 0x1234 accesses +physical address 0x1234, and so on up to the size of the machine's +physical memory. +

+ +A user program can only access its own user virtual memory. An attempt to +access kernel virtual memory causes a page fault, handled by +page_fault() in userprog/exception.c, and the process +will be terminated. Kernel threads can access both kernel virtual +memory and, if a user process is running, the user virtual memory of +the running process. However, even in the kernel, an attempt to +access memory at an unmapped user virtual address +will cause a page fault. +

+ + +

+ +

2.2.4.1 Typical Memory Layout

+ +

+ +Conceptually, each process is +free to lay out its own user virtual memory however it +chooses. In practice, user virtual memory is laid out like this: +

+ +

   PHYS_BASE +----------------------------------+
+             |            user stack            |
+             |                 |                |
+             |                 |                |
+             |                 V                |
+             |          grows downward          |
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+             |           grows upward           |
+             |                 ^                |
+             |                 |                |
+             |                 |                |
+             +----------------------------------+
+             | uninitialized data segment (BSS) |
+             +----------------------------------+
+             |     initialized data segment     |
+             +----------------------------------+
+             |           code segment           |
+  0x08048000 +----------------------------------+
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+           0 +----------------------------------+
+

+ +In this project, the user stack is fixed in size, but in project 2 it +will be allowed to grow. Traditionally, the size of the uninitialized +data segment can be adjusted with a system call, but you will not have +to implement this. +

+ +The code segment in Pintos starts at user virtual address +0x08084000, approximately 128 MB from the bottom of the address +space. This value is specified in [ SysV-i386] and has no deep +significance. +

+ +The linker sets the layout of a user program in memory, as directed by a +"linker script" that tells it the names and locations of the various +program segments. You can learn more about linker scripts by reading +the "Scripts" chapter in the linker manual, accessible via info +ld. +

+ +To view the layout of a particular executable, run objdump +(80x86) or i386-elf-objdump (SPARC) with the -p +option. +

+ + +

+ +

2.2.5 Accessing User Memory

+ +

+ +As part of a system +call, the kernel must often access memory through pointers provided by a user +program. The kernel must be very careful about doing so, because +the user can pass a null pointer, a pointer to +unmapped virtual memory, or a pointer to kernel virtual address space +(above PHYS_BASE). All of these types of invalid pointers must +be rejected without harm to the kernel or other running processes, by +terminating the offending process and freeing its resources. +

+ +There are at least two reasonable ways to do this correctly. The +first method is to verify +the validity of a user-provided pointer, then dereference it. +The second method is to check only that a user +pointer points below PHYS_BASE, then dereference it. +An invalid user pointer will cause a "page fault" that you can +handle by modifying the code for page_fault() in +userprog/exception.c. This technique is normally faster +because it takes advantage of the processor's MMU, so it tends to be +used in real kernels (including Linux). It is also the way +access to user pointers is implemented in the Pintos version provided. +

+ +In either case, one needs to make sure not to "leak" resources. For +example, suppose that your system call has acquired a lock or +allocated memory with malloc(). If you encounter an invalid user pointer +afterward, you must still be sure to release the lock or free the page +of memory. If you choose to verify user pointers before dereferencing +them, this should be straightforward. It's more difficult to handle +if an invalid pointer causes a page fault, +because there's no way to return an error code from a memory access. +

+ + +

+ +

2.3 Requirements

+ +

+ + +

+ +

2.3.1 Design Document

+ +

+ +Before you turn in your project, you must copy the +project 0 design document template into your source tree under the name +pintos/src/intro/DESIGNDOC and fill it in. We recommend that +you read the design document template before you start working on the +project. See section C. Project Documentation, for a sample design document +that goes along with a fictitious project. +

+ + +

+ +

2.3.2 Alarm Clock

+ +

+ +Reimplement timer_sleep(), defined in devices/timer.c. +Although a working implementation is provided, it "busy waits," that +is, it spins in a loop checking the current time and calling +thread_yield() until enough time has gone by. Reimplement it to +avoid busy waiting. +

+ + +

Function: void timer_sleep (int64_t ticks) +

Suspends execution of the calling thread until time has advanced by at +least x timer ticks. Unless the system is otherwise idle, the +thread need not wake up after exactly x ticks. Just put it on +the ready queue after they have waited for the right amount of time. +

+ +timer_sleep() is useful for threads that operate in real-time, +e.g. for blinking the cursor once per second. +

+ +The argument to timer_sleep() is expressed in timer ticks, not in +milliseconds or any another unit. There are TIMER_FREQ timer +ticks per second, where TIMER_FREQ is a macro defined in +devices/timer.h. The default value is 100. We don't recommend +changing this value, because any change is likely to cause many of +the tests to fail. +

+ +Separate functions timer_msleep(), timer_usleep(), and +timer_nsleep() do exist for sleeping a specific number of +milliseconds, microseconds, or nanoseconds, respectively, but these will +call timer_sleep() automatically when necessary. You do not need +to modify them. +

+ +If your delays seem too short or too long, reread the explanation of the +-r option to pintos (see section 1.1.4 Debugging versus Testing). +

+ +The tests for the 2.3.2 Alarm Clock assignment are executed by changing the +working directory intro. The run make to build the +Pintos kernel. Finally run make check to run the tests, +followed by make grade to obtain your score. +

+ +The alarm clock implementation is not needed for later projects. +

+ + +

+ +

2.3.3 Argument Passing

+ +

+ +Currently, process_execute() does not support passing arguments to +new processes. Implement this functionality, by extending +process_execute() so that instead of simply taking a program file +name as its argument, it divides it into words at spaces. The first +word is the program name, the second word is the first argument, and so +on. That is, process_execute("grep foo bar") should run +grep passing two arguments foo and bar. +

+ +Within a command line, multiple spaces are equivalent to a single +space, so that process_execute("grep foo bar") +is equivalent to our original example. You can impose a reasonable +limit on the length of the command line arguments. For example, you +could limit the arguments to those that will fit in a single page (4 +kB). (There is an unrelated limit of 128 bytes on command-line +arguments that the pintos utility can pass to the kernel.) +

+ +You can parse argument strings any way you like. If you're lost, +look at strtok_r(), prototyped in lib/string.h and +implemented with thorough comments in lib/string.c. You can +find more about it by looking at the man page (run man strtok_r +at the prompt). +

+ +See section 2.5.1 Program Startup Details, for information on exactly how you +need to set up the stack. +

+ + +

+ +

2.4 FAQ

+ +

How much code will I need to write? +

+ +Here's a summary of our reference solution, produced by the +diffstat program. The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. +

+ +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. +

+ +

devices/timer.c | 40 +++++++++++- + threads/thread.h | 3 + + userprog/process.c | 148 ++++++++++++++++++++++++++++++----------- + 3 files changed, 150 insertions(+), 41 deletions(-) +

+ + +

+ +

2.4.1 Threads FAQ

+ +

How do I update the Makefiles when I add a new source file? +

+ + +To add a .c file, edit the top-level Makefile.build. +Add the new file to variable dir_SRC, where +dir is the directory where you added the file. For this +project, that means you should add it to threads_SRC or +devices_SRC. Then run make. If your new file +doesn't get +compiled, run make clean and then try again. +

+ +When you modify the top-level Makefile.build and re-run +make, the modified +version should be automatically copied to +threads/build/Makefile. The converse is +not true, so any changes will be lost the next time you run make +clean from the threads directory. Unless your changes are +truly temporary, you should prefer to edit Makefile.build. +

+ +A new .h file does not require editing the Makefiles. +

+ +

What does warning: no previous prototype for `func' mean? +

+ +It means that you defined a non-static function without +preceding it by a prototype. Because non-static functions are +intended for use by other .c files, for safety they should be +prototyped in a header file included before their definition. To fix +the problem, add a prototype in a header file that you include, or, if +the function isn't actually used by other .c files, make it +static. +

+ +

What is the interval between timer interrupts? +

+ +Timer interrupts occur TIMER_FREQ times per second. You can +adjust this value by editing devices/timer.h. The default is +100 Hz. +

+ +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. +

+ +

How long is a time slice? +

+ +There are TIME_SLICE ticks per time slice. This macro is +declared in threads/thread.c. The default is 4 ticks. +

+ +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. +

+ +

How do I run the tests? +

+ +See section 1.2.1 Testing. +

+ +See section D.4 Backtraces, for more information. +

+ + +

+ +

2.4.2 Alarm Clock FAQ

+ +

Do I need to account for timer values overflowing? +: + +Don't worry about the possibility of timer values overflowing. Timer +values are expressed as signed 64-bit numbers, which at 100 ticks per +second should be good for almost 2,924,712,087 years. By then, we +expect Pintos to have been phased out of the curriculum. +

+ + +

+ +

2.4.3 Userprog FAQ

+ +

The kernel always panics when I run pintos -p file -- -q. +

+ +Did you format the file system (with pintos -f)? +

+ +Is your file name too long? The file system limits file names to 14 +characters. A command like pintos -p ../../examples/echo -- -q +will exceed the limit. Use pintos -p ../../examples/echo -a echo +-- -q to put the file under the name echo instead. +

+ +Is the file system full? +

+ +Does the file system already contain 16 files? The base Pintos file +system has a 16-file limit. +

+ +The file system may be so fragmented that there's not enough contiguous +space for your file. +

+ +

When I run pintos -p ../file --, file isn't copied. +

+ +Files are written under the name you refer to them, by default, so in +this case the file copied in would be named ../file. You +probably want to run pintos -p ../file -a file -- instead. +

+ +You can list the files in your file system with pintos -q ls. +

+ +

All my user programs die with page faults. +

+ +This will happen if you haven't implemented argument passing +(or haven't done so correctly). The basic C library for user programs tries +to read argc and argv off the stack. If the stack +isn't properly set up, this causes a page fault. +

+ +

How can I disassemble user programs? +

+ +The objdump (80x86) or i386-elf-objdump +(SPARC) utility can disassemble entire user +programs or object files. Invoke it as objdump -d +file. You can use GDB's +disassemble command to disassemble individual functions +(see section D.5 GDB). +

+ +

Why do many C include files not work in Pintos programs? +

Can I use libfoo in my Pintos programs? +

+ +The C library we provide is very limited. It does not include many of +the features that are expected of a real operating system's C library. +The C library must be built specifically for the operating system (and +architecture), since it must make system calls for I/O and memory +allocation. (Not all functions do, of course, but usually the library +is compiled as a unit.) +

+ +The chances are good that the library you want uses parts of the C library +that Pintos doesn't implement. It will probably take at least some +porting effort to make it work under Pintos. Notably, the Pintos +user program C library does not have a malloc() implementation. +

+ +

How do I compile new user programs? +

+ +Modify src/examples/Makefile, then run make. +

+ +

Can I run user programs under a debugger? +

+ +Yes, with some limitations. See section D.5 GDB. +

+ +

How can I run user programs that need more than 4 kB stack space? +

+ +You may modify the stack setup code to allocate more than one page of +stack space for each process. In project 2, you will implement a better +solution. +

+ +

What happens when an open file is removed? +

+ +You should implement the standard Unix semantics for files. That is, when +a file is removed any process which has a file descriptor for that file +may continue to use that descriptor. This means that +they can read and write from the file. The file will not have a name, +and no other processes will be able to open it, but it will continue +to exist until all file descriptors referring to the file are closed +or the machine shuts down. +

+ +

+ + +

+ +

2.4.4 Argument Passing FAQ

+ +

Isn't the top of stack in kernel virtual memory? +

+ +The top of stack is at PHYS_BASE, typically 0xc0000000, which +is also where kernel virtual memory starts. +But before the processor pushes data on the stack, it decrements the stack +pointer. Thus, the first (4-byte) value pushed on the stack +will be at address 0xbffffffc. +

+ +

Is PHYS_BASE fixed? +

+ +No. You should be able to support PHYS_BASE values that are +any multiple of 0x10000000 from 0x80000000 to 0xf0000000, +simply via recompilation. +

+ + +

+ +

2.5 80`x`86 Calling Convention

+ +

+ +This section summarizes important points of the convention used for +normal function calls on 32-bit 80x86 implementations of Unix. +Some details are omitted for brevity. If you do want all the details, +refer to [ SysV-i386]. +

+ +The calling convention works like this: +

+ +

+The caller pushes each of the function's arguments on the stack one by +one, normally using the PUSH assembly language instruction. +Arguments are pushed in right-to-left order. +
+ +The stack grows downward: each push decrements the stack pointer, then +stores into the location it now points to, like the C expression +*--sp = value. +
+
+ +
+
+The caller pushes the address of its next instruction (the return +address) on the stack and jumps to the first instruction of the callee. +A single 80x86 instruction, CALL, does both. +
+ +
+
+The callee executes. When it takes control, the stack pointer points to +the return address, the first argument is just above it, the second +argument is just above the first argument, and so on. +
+ +
+
+If the callee has a return value, it stores it into register EAX. +
+ +
+
+The callee returns by popping the return address from the stack and +jumping to the location it specifies, using the 80x86 RET +instruction. +
+ +
+
+The caller pops the arguments off the stack. +

+ +Consider a function f() that takes three int arguments. +This diagram shows a sample stack frame as seen by the callee at the +beginning of step 3 above, supposing that f() is invoked as +f(1, 2, 3). The initial stack address is arbitrary: +

+ +

                             +----------------+
+                  0xbffffe7c |        3       |
+                  0xbffffe78 |        2       |
+                  0xbffffe74 |        1       |
+stack pointer --> 0xbffffe70 | return address |
+                             +----------------+
+

+ + +

+ +

2.5.1 Program Startup Details

+ +

+ +The Pintos C library for user programs designates _start(), in +lib/user/entry.c, as the entry point for user programs. This +function is a wrapper around main() that calls exit() if +main() returns: +

+ +

void +_start (int argc, char *argv[]) +{ + exit (main (argc, argv)); +} +

+ +The kernel must put the arguments for the initial function on the stack +before it allows the user program to begin executing. The arguments are +passed in the same way as the normal calling convention (see section 2.5 80x86 Calling Convention). +

+ +Consider how to handle arguments for the following example command: +/bin/ls -l foo bar. +First, break the command into words: /bin/ls, +-l, foo, bar. Place the words at the top of the +stack. Order doesn't matter, because they will be referenced through +pointers. +

+ +Then, push the address of each string plus a null pointer sentinel, on +the stack, in right-to-left order. These are the elements of +argv. The null pointer sentinel ensures that argv[argc] +is a null pointer, as required by the C standard. The order ensures +that argv[0] is at the lowest virtual address. Word-aligned +accesses are faster than unaligned accesses, so for best performance +round the stack pointer down to a multiple of 4 before the first push. +

+ +Then, push argv (the address of argv[0]) and argc, +in that order. Finally, push a fake "return address": although the +entry function will never return, its stack frame must have the same +structure as any other. +

+ +The table below shows the state of the stack and the relevant registers +right before the beginning of the user program, assuming +PHYS_BASE is 0xc0000000: +

+ +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Address	Name	Data	Type
`0xbffffffc`	`argv[3][...]`	`bar\0`	`char[4]`
`0xbffffff8`	`argv[2][...]`	`foo\0`	`char[4]`
`0xbffffff5`	`argv[1][...]`	`-l\0`	`char[3]`
`0xbfffffed`	`argv[0][...]`	`/bin/ls\0`	`char[8]`
`0xbfffffec`	word-align	0	`uint8_t`
`0xbfffffe8`	`argv[4]`	`0`	`char *`
`0xbfffffe4`	`argv[3]`	`0xbffffffc`	`char *`
`0xbfffffe0`	`argv[2]`	`0xbffffff8`	`char *`
`0xbfffffdc`	`argv[1]`	`0xbffffff5`	`char *`
`0xbfffffd8`	`argv[0]`	`0xbfffffed`	`char *`
`0xbfffffd4`	`argv`	`0xbfffffd8`	`char **`
`0xbfffffd0`	`argc`	4	`int`
`0xbfffffcc`	return address	0	`void (*) ()`

+ +

+ +In this example, the stack pointer would be initialized to +0xbfffffcc. +

+ +As shown above, your code should start the stack at the very top of +the user virtual address space, in the page just below virtual address +PHYS_BASE (defined in threads/vaddr.h). +

+ +You may find the non-standard hex_dump() function, declared in +<stdio.h>, useful for debugging your argument passing code. +Here's what it would show in the above example: +

+ +

bfffffc0 00 00 00 00 | ....| +bfffffd0 04 00 00 00 d8 ff ff bf-ed ff ff bf f5 ff ff bf |................| +bfffffe0 f8 ff ff bf fc ff ff bf-00 00 00 00 00 2f 62 69 |............./bi| +bffffff0 6e 2f 6c 73 00 2d 6c 00-66 6f 6f 00 62 61 72 00 |n/ls.-l.foo.bar.| +

+ + +

+ +

2.5.2 System Call Details

+ +

+ +We already know one way that the operating system +can regain control from a user program: interrupts from timers and I/O +devices. These are "external" interrupts, because they are caused +by entities outside the CPU (see section A.4.3 External Interrupt Handling). +

+ +The operating system also deals with software exceptions, which are +events that occur in program code (see section A.4.2 Internal Interrupt Handling). These can be errors such as a page fault or division by +zero. Exceptions are also the means by which a user program +can request services ("system calls") from the operating system. +

+ +In the 80x86 architecture, the int instruction is the +most commonly used means for invoking system calls. This instruction +is handled in the same way as other software exceptions. In Pintos, +user programs invoke int $0x30 to make a system call. The +system call number and any additional arguments are expected to be +pushed on the stack in the normal fashion before invoking the +interrupt (see section 2.5 80x86 Calling Convention). +

+ +Thus, when the system call handler syscall_handler() gets control, +the system call number is in the 32-bit word at the caller's stack +pointer, the first argument is in the 32-bit word at the next higher +address, and so on. The caller's stack pointer is accessible to +syscall_handler() as the esp member of the +struct intr_frame passed to it. (struct intr_frame is on the kernel +stack.) +

+ +The 80x86 convention for function return values is to place them +in the EAX register. System calls that return a value can do +so by modifying the eax member of struct intr_frame. +

+ +You should try to avoid writing large amounts of repetitive code for +implementing system calls. Each system call argument, whether an +integer or a pointer, takes up 4 bytes on the stack. You should be able +to take advantage of this to avoid writing much near-identical code for +retrieving each system call's arguments from the stack. + +

+ + + + + + + +

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + -- cgit v1.2.3

2. Project 0: Introducing Pintos

2.1 Understanding Threads

2.1.1 Source Files

2.1.1.1 devices code

2.1.1.2 lib files

2.1.2 Synchronization

2.1.3 Development Suggestions

2.2 Understanding User Programs

2.2.1 Source Files

2.2.2 Using the File System

2.2.3 How User Programs Work

2.2.4 Virtual Memory Layout

2.2.4.1 Typical Memory Layout

2.2.5 Accessing User Memory

2.3 Requirements

2.3.1 Design Document

2.3.2 Alarm Clock

2.3.3 Argument Passing

2.4 FAQ

2.4.1 Threads FAQ

2.4.2 Alarm Clock FAQ

2.4.3 Userprog FAQ

2.4.4 Argument Passing FAQ

2.5 80x86 Calling Convention

2.5.1 Program Startup Details

2.5.2 System Call Details

2.1.1.1 `devices` code

2.1.1.2 `lib` files

2.5 80`x`86 Calling Convention