From b5f0874cd96ee2a62aabc645b9626c2749cb6a01 Mon Sep 17 00:00:00 2001 From: manuel Date: Mon, 26 Mar 2012 12:54:45 +0200 Subject: initial pintos checkin --- doc/pintos.css | 76 ++ doc/pintos.html | 342 ++++++ doc/pintos.pdf | Bin 0 -> 514054 bytes doc/pintos_1.html | 788 ++++++++++++ doc/pintos_10.html | 286 +++++ doc/pintos_11.html | 137 +++ doc/pintos_2.html | 1734 ++++++++++++++++++++++++++ doc/pintos_3.html | 375 ++++++ doc/pintos_4.html | 83 ++ doc/pintos_5.html | 3343 +++++++++++++++++++++++++++++++++++++++++++++++++++ doc/pintos_6.html | 314 +++++ doc/pintos_7.html | 238 ++++ doc/pintos_8.html | 1041 ++++++++++++++++ doc/pintos_9.html | 143 +++ doc/pintos_abt.html | 205 ++++ doc/pintos_fot.html | 79 ++ doc/pintos_ovr.html | 69 ++ doc/pintos_tour.pdf | Bin 0 -> 257402 bytes doc/sample.tmpl | 104 ++ doc/start.tmpl | 101 ++ doc/threads.tmpl | 82 ++ doc/vm.tmpl | 81 ++ 22 files changed, 9621 insertions(+) create mode 100644 doc/pintos.css create mode 100644 doc/pintos.html create mode 100644 doc/pintos.pdf create mode 100644 doc/pintos_1.html create mode 100644 doc/pintos_10.html create mode 100644 doc/pintos_11.html create mode 100644 doc/pintos_2.html create mode 100644 doc/pintos_3.html create mode 100644 doc/pintos_4.html create mode 100644 doc/pintos_5.html create mode 100644 doc/pintos_6.html create mode 100644 doc/pintos_7.html create mode 100644 doc/pintos_8.html create mode 100644 doc/pintos_9.html create mode 100644 doc/pintos_abt.html create mode 100644 doc/pintos_fot.html create mode 100644 doc/pintos_ovr.html create mode 100644 doc/pintos_tour.pdf create mode 100644 doc/sample.tmpl create mode 100644 doc/start.tmpl create mode 100644 doc/threads.tmpl create mode 100644 doc/vm.tmpl (limited to 'doc') diff --git a/doc/pintos.css b/doc/pintos.css new file mode 100644 index 0000000..019a33d --- /dev/null +++ b/doc/pintos.css @@ -0,0 +1,76 @@ +body { + background: white; + color: black; + padding: 0em 1em 0em 3em; + margin: 0; + margin-left: auto; + margin-right: auto; + max-width: 8in; + text-align: justify +} +body>p { + margin: 0pt 0pt 0pt 0em; + text-align: justify +} +body>p + p { + margin: .75em 0pt 0pt 0pt +} +H1 { + font-size: 150%; + margin-left: -1.33em +} +H2 { + font-size: 125%; + font-weight: bold; + margin-left: -.8em +} +H3 { + font-size: 100%; + font-weight: bold; + margin-left: -.5em } +H4 { + font-size: 100%; + margin-left: 0em +} +H1, H2, H3, H4, H5, H6 { + font-family: sans-serif; + color: blue +} +H1, H2 { + text-decoration: underline +} +html { + margin: 0; + font-weight: lighter +} +tt, code { + font-family: sans-serif +} +b, strong { + font-weight: bold +} + +a:link { + color: blue; + text-decoration: none; +} +a:visited { + color: gray; + text-decoration: none; +} +a:active { + color: black; + text-decoration: none; +} +a:hover { + text-decoration: underline +} + +address { + font-size: 90%; + font-style: normal +} + +HR { + display: none +} diff --git a/doc/pintos.html b/doc/pintos.html new file mode 100644 index 0000000..8060c46 --- /dev/null +++ b/doc/pintos.html @@ -0,0 +1,342 @@ + + + + + +Pintos Projects: Table of Contents + + + + + + + + + + + + + + + + + +
[Top][Contents][Index][ ? ]
+

Table of Contents

+
+1. Introduction +
+
+1.1 Getting Started +
+
+1.1.1 Source Tree Overview +
+1.1.2 Building Pintos +
+1.1.3 Running Pintos +
+1.1.4 Debugging versus Testing +
+
+1.2 Grading +
+
+1.2.1 Testing +
+1.2.2 Design +
+
+1.2.2.1 Design Document +
+1.2.2.2 Source Code +
+
+
+1.3 Legal and Ethical Issues +
+1.4 Acknowledgements +
+1.5 Trivia +
+
+2. Project 0: Introducing Pintos +
+
+2.1 Understanding Threads +
+
+2.1.1 Source Files +
+
+2.1.1.1 devices code +
+2.1.1.2 lib files +
+
+2.1.2 Synchronization +
+2.1.3 Development Suggestions +
+
+2.2 Understanding User Programs +
+
+2.2.1 Source Files +
+2.2.2 Using the File System +
+2.2.3 How User Programs Work +
+2.2.4 Virtual Memory Layout +
+
+2.2.4.1 Typical Memory Layout +
+
+2.2.5 Accessing User Memory +
+
+2.3 Requirements +
+
+2.3.1 Design Document +
+2.3.2 Alarm Clock +
+2.3.3 Argument Passing +
+
+2.4 FAQ +
+
+2.4.1 Threads FAQ +
+2.4.2 Alarm Clock FAQ +
+2.4.3 Userprog FAQ +
+2.4.4 Argument Passing FAQ +
+
+2.5 80x86 Calling Convention +
+
+2.5.1 Program Startup Details +
+2.5.2 System Call Details +
+
+
+3. Project 1: Threads +
+
+3.1 Background +
+3.2 Requirements +
+
+3.2.1 Design Document +
+3.2.2 Priority Scheduling +
+
+3.3 FAQ +
+
+4. Project 2: Virtual Memory +
+A. Reference Guide +
+
+A.1 Loading +
+
+A.1.1 The Loader +
+A.1.2 Low-Level Kernel Initialization +
+A.1.3 High-Level Kernel Initialization +
+A.1.4 Physical Memory Map +
+
+A.2 Threads +
+
+A.2.1 struct thread +
+A.2.2 Thread Functions +
+A.2.3 Thread Switching +
+
+A.3 Synchronization +
+
+A.3.1 Disabling Interrupts +
+A.3.2 Semaphores +
+A.3.3 Locks +
+A.3.4 Monitors +
+
+A.3.4.1 Monitor Example +
+
+A.3.5 Optimization Barriers +
+
+A.4 Interrupt Handling +
+
+A.4.1 Interrupt Infrastructure +
+A.4.2 Internal Interrupt Handling +
+A.4.3 External Interrupt Handling +
+
+A.5 Memory Allocation +
+
+A.5.1 Page Allocator +
+A.5.2 Block Allocator +
+
+A.6 Virtual Addresses +
+A.7 Page Table +
+
+A.7.1 Creation, Destruction, and Activation +
+A.7.2 Inspection and Updates +
+A.7.3 Accessed and Dirty Bits +
+A.7.4 Page Table Details +
+
+A.7.4.1 Structure +
+A.7.4.2 Page Table Entry Format +
+A.7.4.3 Page Directory Entry Format +
+
+
+A.8 Hash Table +
+
+A.8.1 Data Types +
+A.8.2 Basic Functions +
+A.8.3 Search Functions +
+A.8.4 Iteration Functions +
+A.8.5 Hash Table Example +
+A.8.6 Auxiliary Data +
+A.8.7 Synchronization +
+
+
+B. Coding Standards +
+
+B.1 Style +
+B.2 C99 +
+B.3 Unsafe String Functions +
+
+C. Project Documentation +
+
+C.1 Sample Assignment +
+C.2 Sample Design Document +
+
+D. Debugging Tools +
+
+D.1 printf() +
+D.2 ASSERT +
+D.3 Function and Parameter Attributes +
+D.4 Backtraces +
+
+D.4.1 Example +
+
+D.5 GDB +
+
+D.5.1 Using GDB +
+D.5.2 Example GDB Session +
+D.5.3 FAQ +
+
+D.6 Triple Faults +
+D.7 Modifying Bochs +
+D.8 Tips +
+
+E. Development Tools +
+
+E.1 Tags +
+E.2 cscope +
+E.3 git +
+
+Bibliography +
+
+E.4 Hardware References +
+E.5 Software References +
+E.6 Operating System Design References +
+
+License +
+
+
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos.pdf b/doc/pintos.pdf new file mode 100644 index 0000000..3549982 Binary files /dev/null and b/doc/pintos.pdf differ diff --git a/doc/pintos_1.html b/doc/pintos_1.html new file mode 100644 index 0000000..e921154 --- /dev/null +++ b/doc/pintos_1.html @@ -0,0 +1,788 @@ + + + + + +Pintos Projects: Introduction + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+ +

1. Introduction

+ +Welcome to the Operating System Development course at Vienna. In this +course, you will be working with an adapted version of the Pintos +operating system, which was written by Ben Pfaff (See section 1.4 Acknowledgements.) +

+ +So ... Welcome to Pintos. Pintos is a simple operating system framework for +the 80x86 architecture. It supports kernel threads, loading and +running user programs, and a file system, but it implements all of +these in a very simple way. In the Pintos projects, you and your +project team will strengthen its support in some of these areas. +

+

+ +Pintos could, theoretically, run on a regular IBM-compatible PC. +For simplicity, we will run Pintos projects in a system simulator, that is, +a program that simulates an 80x86 CPU and its peripheral devices accurately +enough that unmodified operating systems and software can run under it. +In class we will use the +Bochs and +QEMU simulators. Pintos has also been tested with +VMware Player. +

+

+ +The course at the University of Stanford, where pintos originated, has a +reputation of taking a lot of time -- we suppose deservedly so. +We will do what we can to reduce the workload, such as providing a lot +of support material, but there is plenty of hard work that needs to be done. +We welcome your feedback. If you have suggestions on how we can reduce the +unnecessary overhead of assignments, cutting them down to the important +underlying issues, please let us know. +

+

+ +This chapter explains how to get started working with Pintos. You +should read the entire chapter before you start work on any of the +projects. +

+

+ + +


+ +

1.1 Getting Started

+ +

+ +To get started, you'll have to log into a machine that Pintos can be +built on. There are two possibilities: Either you work on one of the +machines in the TILAB (ssh.tilab.tuwien.ac.at), or you use +a virtual machine such as KVM, VMWare Player or VirtualBox. You may +also setup your own machine to build pintos, but in this case we +cannot provide support for problems you might encounter. +We will test your submission in the TILAB environment, and thus you should +ensure that your code works there. +

+

+ +Once you've logged into one of these machines, either locally or +remotely, start out by extracting the pintos tarball, and adding +our binaries directory to your PATH environment variable. +

+

+ +Assuming that you extracted the pintos tarball to $HOME/pintos, +you need to add pintos' utils directory to your PATH +environment variable. +
 
PATH=$HOME/pintos/src/utils:$PATH
+
It is a good idea to add this line to your $HOME/.bash_profile +startup script (or an equivalent script, if you do not happen to use bash). +Otherwise, you'll have to type it every time you log in. +

+ + +


+ +

1.1.1 Source Tree Overview

+ +

+ +Here's the directory structure that you should see in pintos/src: +

+

+ +

+
+ +
intro/ +
This directory is used to build the system and run tests for project 0. +It only contains a Makefile, which describes how to build the system +and run tests. +

+ +

+
threads/ +
Source code for the base kernel, which is the focus of project 1. +

+ +

+
userprog/ +
Source code for the user programs. You will complete the user program +loader in project 0, and rely on the code in this directory in +project 2. +

+ +

+
vm/ +
An almost empty directory. You will implement virtual memory here in +project 2. +

+ +

+
filesys/ +
Source code for a basic file system. You will use this file system, +but do not need to modify it in this course. +

+ +

+
devices/ +
Source code for I/O device interfacing: keyboard, timer, disk, etc. +You will modify the timer implementation in project 0. Otherwise +you should have no need to change this code. +

+ +

+
lib/ +
An implementation of a subset of the standard C library. The code in +this directory is compiled into both the Pintos kernel and user +programs that run under it. In both kernel code +and user programs, headers in this directory can be included using the +#include <...> notation. You should have little need to +modify this code. +

+ +

+
lib/kernel/ +
Parts of the C library that are included only in the Pintos kernel. +This also includes implementations of some data types that you are +free to use in your kernel code: bitmaps, doubly linked lists, and +hash tables. In the kernel, headers in this +directory can be included using the #include <...> +notation. +

+ +

+
lib/user/ +
Parts of the C library that are included only in Pintos user programs. +In user programs, headers in this directory can be included using the +#include <...> notation. +

+ +

+
tests/ +
Tests for each project. You can modify this code if it helps you test +your submission, but we will replace it with the originals before we run +the tests. +

+ +

+
examples/ +
Example user programs for use in project 0, and also project 2. +

+ +

+
misc/ +
utils/ +
These files may come in handy if you decide to try working with Pintos +on your own machine. Otherwise, you can ignore them. +
+

+ + +


+ +

1.1.2 Building Pintos

+ +

+ +As the next step, build the source code supplied for +the first project. First, cd into the intro +directory. Then, issue the make command. This will create a +build directory under intro, populate it with a +Makefile and a few subdirectories, and then build the kernel +inside. The entire build should take less than 30 seconds. +

+

+ +Following the build, the following are the interesting files in the +build directory: +

+

+ +

+
+
Makefile +
A copy of pintos/src/Makefile.build. It describes how to build +the kernel. See Adding Source Files, for more information. +

+ +

+
kernel.o +
Object file for the entire kernel. This is the result of linking +object files compiled from each individual kernel source file into a +single object file. It contains debug information, so you can run +GDB (see section D.5 GDB) or backtrace (see section D.4 Backtraces) on it. +

+ +

+
kernel.bin +
Memory image of the kernel, that is, the exact bytes loaded into +memory to run the Pintos kernel. This is just kernel.o with +debug information stripped out, which saves a lot of space, which in +turn keeps the kernel from bumping up against a 512 kB size limit +imposed by the kernel loader's design. +

+ +

+
loader.bin +
Memory image for the kernel loader, a small chunk of code written in +assembly language that reads the kernel from disk into memory and +starts it up. It is exactly 512 bytes long, a size fixed by the +PC BIOS. +
+

+ +Subdirectories of build contain object files (.o) and +dependency files (.d), both produced by the compiler. The +dependency files tell make which source files need to be +recompiled when other source or header files are changed. +

+

+ + +


+ +

1.1.3 Running Pintos

+ +

+ +We've supplied a program for conveniently running Pintos in a simulator, +called pintos. In the simplest case, you can invoke +pintos as pintos argument.... Each +argument is passed to the Pintos kernel for it to act on. +

+

+ +Try it out. First cd into the newly created build +directory. Then issue the command +pintos -kernel-test run alarm-multiple, +which passes the arguments run alarm-multiple to the Pintos +kernel. In these arguments, run, together with the kernel +option -kernel-test, instructs the kernel to run a test and +alarm-multiple is the test to run. +

+

+ +This command creates a bochsrc.txt file, which is needed for +running Bochs, and then invoke Bochs. Bochs opens a new window that +represents the simulated machine's display, and a BIOS message briefly +flashes. Then Pintos boots and runs the alarm-multiple test +program, which outputs a few screenfuls of text. When it's done, you +can close Bochs by clicking on the "Power" button in the window's top +right corner, or rerun the whole process by clicking on the "Reset" +button just to its left. The other buttons are not very useful for our +purposes. +

+

+ +(If no window appeared at all, then you're probably logged in remotely and X +forwarding is not set up correctly. In this case, you can fix your X +setup, or you can use the -v option to disable X output: +pintos -v -- -kernel-test run alarm-multiple.) +

+

+ +The text printed by Pintos inside Bochs probably went by too quickly to +read. However, you've probably noticed by now that the same text was +displayed in the terminal you used to run pintos. This is +because Pintos sends all output both to the VGA display and to the first +serial port, and by default the serial port is connected to Bochs's +stdin and stdout. You can log serial output to a file by +redirecting at the +command line, e.g. pintos run alarm-multiple > logfile. +

+

+ +The pintos program offers several options for configuring the +simulator or the virtual hardware. If you specify any options, they +must precede the commands passed to the Pintos kernel and be separated +from them by --, so that the whole command looks like +pintos option... -- argument.... Invoke +pintos without any arguments to see a list of available options. +Options can select a simulator to use: the default is Bochs, but +--qemu selects QEMU. You can run the simulator +with a debugger (see section D.5 GDB). You can set the amount of memory to give +the VM. Finally, you can select how you want VM output to be displayed: +use -v to turn off the VGA display, -t to use your +terminal window as the VGA display instead of opening a new window +(Bochs only), or -s to suppress serial input from stdin +and output to stdout. +

+

+ +The Pintos kernel has commands and options other than run. +These are not very interesting for now, but you can see a list of them +using -h, e.g. pintos -h. +

+

+ + +


+ +

1.1.4 Debugging versus Testing

+ +

+ +When you're debugging code, it's useful to be able to run a +program twice and have it do exactly the same thing. On second and +later runs, you can make new observations without having to discard or +verify your old observations. This property is called +"reproducibility." One of the simulators that Pintos supports, Bochs, +can be set up for +reproducibility, and that's the way that pintos invokes it +by default. +

+

+ +Of course, a simulation can only be reproducible from one run to the +next if its input is the same each time. For simulating an entire +computer, as we do, this means that every part of the computer must be +the same. For example, you must use the same command-line argument, the +same disks, the same version +of Bochs, and you must not hit any keys on the keyboard (because you +could not be sure to hit them at exactly the same point each time) +during the runs. +

+

+ +While reproducibility is useful for debugging, it is a problem for +testing thread synchronization, an important part of most of the projects. In +particular, when Bochs is set up for reproducibility, timer interrupts +will come at perfectly reproducible points, and therefore so will +thread switches. That means that running the same test several times +doesn't give you any greater confidence in your code's correctness +than does running it only once. +

+

+ +So, to make your code easier to test, we've added a feature, called +"jitter," to Bochs, that makes timer interrupts come at random +intervals, but in a perfectly predictable way. In particular, if you +invoke pintos with the option -j seed, timer +interrupts will come at irregularly spaced intervals. Within a single +seed value, execution will still be reproducible, but timer +behavior will change as seed is varied. Thus, for the highest +degree of confidence you should test your code with many seed values. +

+

+ +On the other hand, when Bochs runs in reproducible mode, timings are not +realistic, meaning that a "one-second" delay may be much shorter or +even much longer than one second. You can invoke pintos with +a different option, -r, to set up Bochs for realistic +timings, in which a one-second delay should take approximately one +second of real time. Simulation in real-time mode is not reproducible, +and options -j and -r are mutually exclusive. +

+

+ +The QEMU simulator is available as an +alternative to Bochs (use --qemu when invoking +pintos). The QEMU simulator is much faster than Bochs, but it +only supports real-time simulation and does not have a reproducible +mode. +

+

+ + +


+ +

1.2 Grading

+ +

+ +We will grade your assignments based on test results and design quality, +inspecting both your implementation and your design documents. +

+

+ + +


+ +

1.2.1 Testing

+ +

+ +Your test result grade will be based on our tests. Each project has +several tests, each of which has a name beginning with tests. +To completely test your submission, invoke make check from the +project build directory. This will build and run each test and +print a "pass" or "fail" message for each one. When a test fails, +make check also prints some details of the reason for failure. +After running all the tests, make check also prints a summary +of the test results. +

+

+ +For project 1, the tests will probably run faster in Bochs. For the +other projects, they will run much faster in QEMU. +make check will select the faster simulator by default, but +you can override its choice by specifying SIMULATOR=--bochs or +SIMULATOR=--qemu on the make command line. +

+

+ +You can also run individual tests one at a time. A given test t +writes its output to t.output, then a script scores the +output as "pass" or "fail" and writes the verdict to +t.result. To run and grade a single test, make +the .result file explicitly from the build directory, e.g. +make tests/threads/alarm-multiple.result. If make says +that the test result is up-to-date, but you want to re-run it anyway, +either run make clean or delete the .output file by hand. +

+

+ +By default, each test provides feedback only at completion, not during +its run. If you prefer, you can observe the progress of each test by +specifying VERBOSE=1 on the make command line, as in +make check VERBOSE=1. You can also provide arbitrary options to the +pintos run by the tests with PINTOSOPTS='...', +e.g. make check PINTOSOPTS='-j 1' to select a jitter value of 1 +(see section 1.1.4 Debugging versus Testing). +

+

+ +All of the tests and related files are in pintos/src/tests. +Before we test your submission, we will replace the contents of that +directory by a pristine, unmodified copy, to ensure that the correct +tests are used. Thus, you can modify some of the tests if that helps in +debugging, but we will run the originals. +

+

+ +All software has bugs, so some of our tests may be flawed. If you think +a test failure is a bug in the test, not a bug in your code, +please point it out. We will look at it and fix it if necessary. +

+

+ +Please don't try to take advantage of our generosity in giving out our +test suite. Your code has to work properly in the general case, not +just for the test cases we supply. For example, it would be unacceptable +to explicitly base the kernel's behavior on the name of the running +test case. Such attempts to side-step the test cases will receive no +credit. If you think your solution may be in a gray area here, please +ask us about it. +

+

+ + +


+ +

1.2.2 Design

+ +

+ +We will judge your design based on the design document and the source +code that you submit. We will read your entire design document and much +of your source code. +

+

+ +Don't forget that design quality, including the design document, is 30% +of your project grade. It +is better to spend one or two hours writing a good design document than +it is to spend that time getting the last 5% of the points for tests and +then trying to rush through writing the design document in the last 15 +minutes. +

+

+ + +


+ +

1.2.2.1 Design Document

+ +

+ +We provide a design document template for each project. For each +significant part of a project, the template asks questions in four +areas: +

+

+ +

+
+
Data Structures +

+ +The instructions for this section are always the same: +

+

+ +

+Copy here the declaration of each new or changed struct or +struct member, global or static variable, typedef, or +enumeration. Identify the purpose of each in 25 words or less. +
+

+ +The first part is mechanical. Just copy new or modified declarations +into the design document, to highlight for us the actual changes to data +structures. Each declaration should include the comment that should +accompany it in the source code (see below). +

+

+ +We also ask for a very brief description of the purpose of each new or +changed data structure. The limit of 25 words or less is a guideline +intended to save your time and avoid duplication with later areas. +

+

+ +

+
Algorithms +

+ +This is where you tell us how your code works, through questions that +probe your understanding of your code. We might not be able to easily +figure it out from the code, because many creative solutions exist for +most OS problems. Help us out a little. +

+

+ +Your answers should be at a level below the high level description of +requirements given in the assignment. We have read the assignment too, +so it is unnecessary to repeat or rephrase what is stated there. On the +other hand, your answers should be at a level above the low level of the +code itself. Don't give a line-by-line run-down of what your code does. +Instead, use your answers to explain how your code works to implement +the requirements. +

+

+ +

+
Synchronization +

+ +An operating system kernel is a complex, multithreaded program, in which +synchronizing multiple threads can be difficult. This section asks +about how you chose to synchronize this particular type of activity. +

+

+ +

+
Rationale +

+ +Whereas the other sections primarily ask "what" and "how," the +rationale section concentrates on "why." This is where we ask you to +justify some design decisions, by explaining why the choices you made +are better than alternatives. You may be able to state these in terms +of time and space complexity, which can be made as rough or informal +arguments (formal language or proofs are unnecessary). +

+

+ +An incomplete, evasive, or non-responsive design document or one that +strays from the template without good reason may be penalized. +Incorrect capitalization, punctuation, spelling, or grammar can also +cost points. See section C. Project Documentation, for a sample design document +for a fictitious project. +

+

+ + +


+ +

1.2.2.2 Source Code

+ +

+ +Your design will also be judged by looking at your source code. We will +typically look at the differences between the original Pintos source +tree and your submission, based on the output of a command like +diff -urpb pintos.orig pintos.submitted. We will try to match up your +description of the design with the code submitted. Important +discrepancies between the description and the actual code will be +penalized, as will be any bugs we find by spot checks. +

+

+ +The most important aspects of source code design are those that specifically +relate to the operating system issues at stake in the project. For +example, the organization of the supplemental page table is an important +part of virtual memory design, so in the virtual memory project a poorly designed +pagetable would lose points. Other issues are much less important. For +example, multiple Pintos design problems call for a "priority +queue," that is, a dynamic collection from which the minimum (or +maximum) item can quickly be extracted. Fast priority queues can be +implemented many ways, but we do not expect you to build a fancy data +structure even if it might improve performance. Instead, you are +welcome to use a linked list (and Pintos even provides one with +convenient functions for sorting and finding minimums and maximums). +

+

+ +Pintos is written in a consistent style. Make your additions and +modifications in existing Pintos source files blend in, not stick out. +In new source files, adopt the existing Pintos style by preference, but +make your code self-consistent at the very least. There should not be a +patchwork of different styles that makes it obvious that three different +people wrote the code. Use horizontal and vertical white space to make +code readable. Add a brief comment on every structure, structure +member, global or static variable, typedef, enumeration, and function +definition. Update +existing comments as you modify code. Don't comment out or use the +preprocessor to ignore blocks of code (instead, remove it entirely). +Use assertions to document key invariants. Decompose code into +functions for clarity. Code that is difficult to understand because it +violates these or other "common sense" software engineering practices +will be penalized. +

+

+ +In the end, remember your audience. Code is written primarily to be +read by humans. It has to be acceptable to the compiler too, but the +compiler doesn't care about how it looks or how well it is written. +

+

+ + +


+ +

1.3 Legal and Ethical Issues

+ +

+ +Pintos is distributed under a liberal license that allows free use, +modification, and distribution. Students and others who work on Pintos +own the code that they write and may use it for any purpose. +Pintos comes with NO WARRANTY, not even for MERCHANTABILITY or FITNESS +FOR A PARTICULAR PURPOSE. +See section License, for details of the license and lack of warranty. +

+

+ +In the context of the Operating System Development at Vienna University +of Technology, please respect the spirit and the letter of the honor code +by refraining from reading any homework solutions available online or +elsewhere. Reading the source code for other operating system kernels, +such as Linux or FreeBSD, is allowed, but do not copy code from them literally. +Please cite the code that inspired your own in your design documentation. +Additionally, please do not redistribute the modified Pintos environment +used in this course. It contains partial solutions which might spoil the +fun for people at other universities. +

+

+ + +


+ +

1.4 Acknowledgements

+ +The Pintos core and this documentation were originally written by Ben +Pfaff blp@cs.stanford.edu. +

+ +Additional features were contributed by Anthony Romano +chz@vt.edu. +

+

+ +The GDB macros supplied with Pintos were written by Godmar Back +gback@cs.vt.edu, and their documentation is adapted from his +work. +

+

+ +The original structure and form of Pintos was inspired by the Nachos +instructional operating system from the University of California, +Berkeley ([ Christopher]). +

+

+ +The Pintos projects and documentation originated with those designed for +Nachos by current and former CS 140 teaching assistants at Stanford +University, including at least Yu Ping, Greg Hutchins, Kelly Shaw, Paul +Twohey, Sameer Qureshi, and John Rector. +

+

+ +Example code for monitors (see section A.3.4 Monitors) is +from classroom slides originally by Dawson Engler and updated by Mendel +Rosenblum. +

+

+ +For the undergraduate OS Development course at Vienna UT, Rene Freingruber +renefreing@yahoo.de evaluated Pintos, and provided information on +expected work hours and typical pitfalls. Benedikt Huber +benedikt@vmars.tuwien.ac.at adapted Pintos and its documentation to +meet the requirements of the course; Roland Kammerer +kammerer@vmars.tuwien.ac.at created the virtual machine environments +to simplify working outside the lab. +

+

+ + +


+ +

1.5 Trivia

+ +

+ +Pintos originated as a replacement for Nachos with a similar design. +Since then Pintos has greatly diverged from the Nachos design. Pintos +differs from Nachos in two important ways. First, Pintos runs on real +or simulated 80x86 hardware, but Nachos runs as a process on a +host operating system. Second, Pintos is written in C like most +real-world operating systems, but Nachos is written in C++. +

+

+ +Why the name "Pintos"? First, like nachos, pinto beans are a common +Mexican food. Second, Pintos is small and a "pint" is a small amount. +Third, like drivers of the eponymous car, students are likely to have +trouble with blow-ups. + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_10.html b/doc/pintos_10.html new file mode 100644 index 0000000..8bd02bc --- /dev/null +++ b/doc/pintos_10.html @@ -0,0 +1,286 @@ + + + + + +Pintos Projects: Bibliography + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

Bibliography

+ +

+ + +


+ +

E.4 Hardware References

+ +

+ + +[IA32-v1]. +IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic +Architecture. Basic 80x86 architecture and programming +environment. Available via developer.intel.com. Section numbers +in this document refer to revision 18. +

+

+ + +[IA32-v2a]. +IA-32 Intel Architecture Software Developer's Manual +Volume 2A: Instruction Set Reference A-M. 80x86 instructions +whose names begin with A through M. Available via +developer.intel.com. Section numbers in this document refer to +revision 18. +

+

+ + +[IA32-v2b]. +IA-32 Intel Architecture Software Developer's Manual Volume 2B: +Instruction Set Reference N-Z. 80x86 instructions whose names +begin with N through Z. Available via developer.intel.com. +Section numbers in this document refer to revision 18. +

+

+ + +[IA32-v3a]. +IA-32 Intel Architecture Software Developer's Manual Volume 3A: System +Programming Guide. Operating system support, including segmentation, +paging, tasks, interrupt and exception handling. Available via +developer.intel.com. Section numbers in this document refer to +revision 18. +

+

+ + +[FreeVGA]. +FreeVGA Project. Documents the VGA video +hardware used in PCs. +

+

+ + +[kbd]. +Keyboard scancodes. Documents PC keyboard +interface. +

+

+ + +[ATA-3]. +AT Attachment-3 Interface (ATA-3) Working +Draft. Draft of an old version of the ATA aka IDE interface for the +disks used in most desktop PCs. +

+

+ + +[PC16550D]. +National Semiconductor PC16550D Universal +Asynchronous Receiver/Transmitter with FIFOs. Datasheet for a chip +used for PC serial ports. +

+

+ + +[8254]. +Intel 8254 Programmable Interval Timer. +Datasheet for PC timer chip. +

+

+ + +[8259A]. +Intel 8259A Programmable Interrupt Controller +(8259A/8259A-2). Datasheet for PC interrupt controller chip. +

+

+ + +[MC146818A]. +Motorola MC146818A Real Time Clock Plus +Ram (RTC). Datasheet for PC real-time clock chip. +

+

+ + +


+ +

E.5 Software References

+ +

+ + +[ELF1]. +Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book I: Executable and +Linking Format. The ubiquitous format for executables in modern Unix +systems. +

+

+ + +[ELF2]. +Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book II: Processor +Specific (Intel Architecture). 80x86-specific parts of ELF. +

+

+ + +[ELF3]. +Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book III: Operating +System Specific (UNIX System V Release 4). Unix-specific parts of +ELF. +

+

+ + +[SysV-ABI]. +System V Application Binary Interface: +Edition 4.1. Specifies how applications interface with the OS under +Unix. +

+

+ + +[SysV-i386]. +System V Application Binary +Interface: Intel386 Architecture Processor Supplement: Fourth +Edition. 80x86-specific parts of the Unix interface. +

+

+ + +[SysV-ABI-update]. +System V Application Binary +Interface--DRAFT--24 April 2001. A draft of a revised version of +[ SysV-ABI] which was never completed. +

+

+ + +[SUSv3]. +The Open Group, Single UNIX Specification V3, 2001. +

+

+ + +[Partitions]. +A. E. Brouwer, Minimal partition table specification, 1999. +

+

+ + +[IntrList]. +R. Brown, Ralf Brown's +Interrupt List, 2000. +

+

+ + +


+ +

E.6 Operating System Design References

+ +

+ + +[Christopher]. +W. A. Christopher, S. J. Procter, T. E. Anderson, +The Nachos instructional operating system. +Proceedings of the USENIX Winter 1993 Conference. +http://portal.acm.org/citation.cfm?id=1267307. +

+

+ + +[Dijkstra]. +E. W. Dijkstra, The structure of the "THE" +multiprogramming system. Communications of the ACM 11(5):341--346, +1968. http://doi.acm.org/10.1145/363095.363143. +

+

+ + +[Hoare]. +C. A. R. Hoare, Monitors: An Operating System +Structuring Concept. Communications of the ACM, 17(10):549--557, +1974. http://www.acm.org/classics/feb96/. +

+

+ + +[Lampson]. +B. W. Lampson, D. D. Redell, Experience with processes and +monitors in Mesa. Communications of the ACM, 23(2):105--117, 1980. +http://doi.acm.org/10.1145/358818.358824. +

+

+ + +[McKusick]. +M. K. McKusick, K. Bostic, M. J. Karels, J. S. Quarterman, +The Design and Implementation of the 4.4BSD Operating +System. Addison-Wesley, 1996. +

+

+ + +[Wilson]. +P. R. Wilson, M. S. Johnstone, M. Neely, D. Boles, +Dynamic Storage Allocation: A Survey and Critical Review. +International Workshop on Memory Management, 1995. +http://www.cs.utexas.edu/users/oops/papers.html#allocsrv. + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_11.html b/doc/pintos_11.html new file mode 100644 index 0000000..d34b895 --- /dev/null +++ b/doc/pintos_11.html @@ -0,0 +1,137 @@ + + + + + +Pintos Projects: License + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

License

+ +

+ +Pintos, including its documentation, is subject to the following +license: +

+

+ +

+Copyright © 2004, 2005, 2006 Board of Trustees, Leland +Stanford Jr. University. All rights reserved. +

+ +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: +

+

+ +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. +

+

+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +

+

+ +A few individual files in Pintos were originally derived from other +projects, but they have been extensively modified for use in Pintos. +The original code falls under the original license, and modifications +for Pintos are additionally covered by the Pintos license above. +

+

+ +In particular, code derived from Nachos is subject to the following +license: +

+

+ +

+Copyright © 1992-1996 The Regents of the University of California. +All rights reserved. +

+ +Permission to use, copy, modify, and distribute this software +and its documentation for any purpose, without fee, and +without written agreement is hereby granted, provided that the +above copyright notice and the following two paragraphs appear +in all copies of this software. +

+

+ +IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO +ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE +AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA +HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +

+

+ +THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" +BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO +PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR +MODIFICATIONS. +

+

+ +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_2.html b/doc/pintos_2.html new file mode 100644 index 0000000..ae51974 --- /dev/null +++ b/doc/pintos_2.html @@ -0,0 +1,1734 @@ + + + + + +Pintos Projects: Project 0--Introducing Pintos + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

2. Project 0: Introducing Pintos

+ +

+ +In this assignment, you will learn about the existing functionality +in Pintos, and add two small features to the system: a more efficient +implementation of sleep, and the ability to pass command line +arguments to user programs. +

+

+ +You will be working in the threads directory for the first part +of this assignment (with some work in the devices directory on +the side), and modify the file userprog/process.c in the second part. +

+

+ +The tests for Project 0 are executed by changing the working directory +to intro and running make followed by make check. +

+

+ +Before you read the description of this project, you should read all of +the following sections: 1. Introduction, B. Coding Standards, +D. Debugging Tools, and E. Development Tools. You should at least +skim the material from A.1 Loading through A.5 Memory Allocation, especially A.3 Synchronization. +

+

+ + +


+ +

2.1 Understanding Threads

+ +

+ +The first step is to read and understand the code for the initial thread +system. +Pintos already implements thread creation and thread completion, +a simple scheduler to switch between threads, and synchronization +primitives (semaphores, locks, condition variables, and optimization +barriers). +

+

+ +Some of this code might seem slightly mysterious. If +you haven't already compiled and run the base system, as described in +the introduction (see section 1. Introduction), you should do so now. You +can read through parts of the source code to see what's going +on. If you like, you can add calls to printf() almost +anywhere, then recompile and run to see what happens and in what +order. You can also run the kernel in a debugger and set breakpoints +at interesting spots, single-step through code and examine data, and +so on. +

+

+ +When a thread is created, you are creating a new context to be +scheduled. You provide a function to be run in this context as an +argument to thread_create(). The first time the thread is +scheduled and runs, it starts from the beginning of that function +and executes in that context. When the function returns, the thread +terminates. Each thread, therefore, acts like a mini-program running +inside Pintos, with the function passed to thread_create() +acting like main(). +

+

+ +At any given time, exactly one thread runs and the rest, if any, +become inactive. The scheduler decides which thread to +run next. (If no thread is ready to run +at any given time, then the special "idle" thread, implemented in +idle(), runs.) +Synchronization primitives can force context switches when one +thread needs to wait for another thread to do something. +

+

+ +The mechanics of a context switch are +in threads/switch.S, which is 80x86 +assembly code. (You don't have to understand it.) It saves the +state of the currently running thread and restores the state of the +thread we're switching to. +

+

+ +Using the GDB debugger, slowly trace through a context +switch to see what happens (see section D.5 GDB). You can set a +breakpoint on schedule() to start out, and then +single-step from there.(1) Be sure +to keep track of each thread's address +and state, and what procedures are on the call stack for each thread. +You will notice that when one thread calls switch_threads(), +another thread starts running, and the first thing the new thread does +is to return from switch_threads(). You will understand the thread +system once you understand why and how the switch_threads() that +gets called is different from the switch_threads() that returns. +See section A.2.3 Thread Switching, for more information. +

+

+ +Warning: In Pintos, each thread is assigned a small, +fixed-size execution stack just under 4 kB in size. The kernel +tries to detect stack overflow, but it cannot do so perfectly. You +may cause bizarre problems, such as mysterious kernel panics, if you +declare large data structures as non-static local variables, +e.g. int buf[1000];. Alternatives to stack allocation include +the page allocator and the block allocator (see section A.5 Memory Allocation). +

+

+ + +


+ +

2.1.1 Source Files

+ +

+ +Here is a brief overview of the files in the threads +directory. You will not need to modify most of this code, but the +hope is that presenting this overview will give you a start on what +code to look at. +

+

+ +

+
+
loader.S +
loader.h +
The kernel loader. Assembles to 512 bytes of code and data that the +PC BIOS loads into memory and which in turn finds the kernel on disk, +loads it into memory, and jumps to start() in start.S. +See section A.1.1 The Loader, for details. You should not need to look at +this code or modify it. +

+ +

+
start.S +
Does basic setup needed for memory protection and 32-bit +operation on 80x86 CPUs. Unlike the loader, this code is +actually part of the kernel. See section A.1.2 Low-Level Kernel Initialization, +for details. +

+ +

+
kernel.lds.S +
The linker script used to link the kernel. Sets the load address of +the kernel and arranges for start.S to be near the beginning +of the kernel image. See section A.1.1 The Loader, for details. Again, you +should not need to look at this code +or modify it, but it's here in case you're curious. +

+ +

+
init.c +
init.h +
Kernel initialization, including main(), the kernel's "main +program." You should look over main() at least to see what +gets initialized. You might want to add your own initialization code +here. See section A.1.3 High-Level Kernel Initialization, for details. +

+ +

+
thread.c +
thread.h +
Basic thread support. Much of your work will take place in these files. +thread.h defines struct thread, which you are likely to modify +in all three projects. See A.2.1 struct thread and A.2 Threads for +more information. +

+ +

+
switch.S +
switch.h +
Assembly language routine for switching threads. Already discussed +above. See section A.2.2 Thread Functions, for more information. +

+ +

+
palloc.c +
palloc.h +
Page allocator, which hands out system memory in multiples of 4 kB +pages. See section A.5.1 Page Allocator, for more information. +

+ +

+
malloc.c +
malloc.h +
A simple implementation of malloc() and free() for +the kernel. See section A.5.2 Block Allocator, for more information. +

+ +

+
interrupt.c +
interrupt.h +
Basic interrupt handling and functions for turning interrupts on and +off. See section A.4 Interrupt Handling, for more information. +

+ +

+
intr-stubs.S +
intr-stubs.h +
Assembly code for low-level interrupt handling. See section A.4.1 Interrupt Infrastructure, for more information. +

+ +

+
synch.c +
synch.h +
Basic synchronization primitives: semaphores, locks, condition +variables, and optimization barriers. You will need to use these for +synchronization in all +four projects. See section A.3 Synchronization, for more information. +

+ +

+
io.h +
Functions for I/O port access. This is mostly used by source code in +the devices directory that you won't have to touch. +

+ +

+
vaddr.h +
pte.h +
Functions and macros for working with virtual addresses and page table +entries. These will be more important to you in project 2. For now, +you can ignore them. +

+ +

+
flags.h +
Macros that define a few bits in the 80x86 "flags" register. +Probably of no interest. See [ IA32-v1], section 3.4.3, "EFLAGS +Register," for more information. +
+

+ + +


+ +

2.1.1.1 devices code

+ +

+ +The basic threaded kernel also includes these files in the +devices directory: +

+

+ +

+
+
timer.c +
timer.h +
System timer that ticks, by default, 100 times per second. You will +modify this code in this project. +

+ +

+
vga.c +
vga.h +
VGA display driver. Responsible for writing text to the screen. +You should have no need to look at this code. printf() +calls into the VGA display driver for you, so there's little reason to +call this code yourself. +

+ +

+
serial.c +
serial.h +
Serial port driver. Again, printf() calls this code for you, +so you don't need to do so yourself. +It handles serial input by passing it to the input layer (see below). +

+ +

+
block.c +
block.h +
An abstraction layer for block devices, that is, random-access, +disk-like devices that are organized as arrays of fixed-size blocks. +Out of the box, Pintos supports two types of block devices: IDE disks +and partitions. +

+ +

+
ide.c +
ide.h +
Supports reading and writing sectors on up to 4 IDE disks. +

+ +

+
partition.c +
partition.h +
Understands the structure of partitions on disks, allowing a single +disk to be carved up into multiple regions (partitions) for +independent use. +

+ +

+
kbd.c +
kbd.h +
Keyboard driver. Handles keystrokes passing them to the input layer +(see below). +

+ +

+
input.c +
input.h +
Input layer. Queues input characters passed along by the keyboard or +serial drivers. +

+ +

+
intq.c +
intq.h +
Interrupt queue, for managing a circular queue that both kernel +threads and interrupt handlers want to access. Used by the keyboard +and serial drivers. +

+ +

+
rtc.c +
rtc.h +
Real-time clock driver, to enable the kernel to determine the current +date and time. By default, this is only used by thread/init.c +to choose an initial seed for the random number generator. +

+ +

+
speaker.c +
speaker.h +
Driver that can produce tones on the PC speaker. +

+ +

+
pit.c +
pit.h +
Code to configure the 8254 Programmable Interrupt Timer. This code is +used by both devices/timer.c and devices/speaker.c +because each device uses one of the PIT's output channel. +
+

+ + +


+ +

2.1.1.2 lib files

+ +

+ +Finally, lib and lib/kernel contain useful library +routines. (lib/user will be used by user programs, starting in +project 2, but it is not part of the kernel.) Here's a few more +details: +

+

+ +

+
+
ctype.h +
inttypes.h +
limits.h +
stdarg.h +
stdbool.h +
stddef.h +
stdint.h +
stdio.c +
stdio.h +
stdlib.c +
stdlib.h +
string.c +
string.h +
A subset of the standard C library. See section B.2 C99, for +information +on a few recently introduced pieces of the C library that you might +not have encountered before. See section B.3 Unsafe String Functions, for +information on what's been intentionally left out for safety. +

+ +

+
debug.c +
debug.h +
Functions and macros to aid debugging. See section D. Debugging Tools, for +more information. +

+ +

+
random.c +
random.h +
Pseudo-random number generator. The actual sequence of random values +will not vary from one Pintos run to another, unless you do one of +three things: specify a new random seed value on the -rs +kernel command-line option on each run, or use a simulator other than +Bochs, or specify the -r option to pintos. +

+ +

+
round.h +
Macros for rounding. +

+ +

+
syscall-nr.h +
System call numbers. Not used until project 2. +

+ +

+
kernel/list.c +
kernel/list.h +
Doubly linked list implementation. Used all over the Pintos code, and +you'll probably want to use it a few places yourself. +

+ +

+
kernel/bitmap.c +
kernel/bitmap.h +
Bitmap implementation. You can use this in your code if you like, but +you probably won't have any need for it in project 0 and project 1. +

+ +

+
kernel/hash.c +
kernel/hash.h +
Hash table implementation. +

+ +

+
kernel/console.c +
kernel/console.h +
kernel/stdio.h +
Implements printf() and a few other functions. +
+

+ + +


+ +

2.1.2 Synchronization

+ +

+ +Proper synchronization is an important part of the solutions to these +problems. Any synchronization problem can be easily solved by turning +interrupts off: while interrupts are off, there is no concurrency, so +there's no possibility for race conditions. Therefore, it's tempting to +solve all synchronization problems this way, but don't. +Instead, use semaphores, locks, and condition variables to solve the +bulk of your synchronization problems. Read the tour section on +synchronization (see section A.3 Synchronization) or the comments in +threads/synch.c if you're unsure what synchronization primitives +may be used in what situations. +

+

+ +In the Pintos projects, the only class of problem best solved by +disabling interrupts is coordinating data shared between a kernel thread +and an interrupt handler. Because interrupt handlers can't sleep, they +can't acquire locks. This means that data shared between kernel threads +and an interrupt handler must be protected within a kernel thread by +turning off interrupts. +

+

+ +This project only requires accessing a little bit of thread state from +interrupt handlers. For the alarm clock, the timer interrupt needs to +wake up sleeping threads. In the advanced scheduler, the timer +interrupt needs to access a few global and per-thread variables. When +you access these variables from kernel threads, you will need to disable +interrupts to prevent the timer interrupt from interfering. +

+

+ +When you do turn off interrupts, take care to do so for the least amount +of code possible, or you can end up losing important things such as +timer ticks or input events. Turning off interrupts also increases the +interrupt handling latency, which can make a machine feel sluggish if +taken too far. +

+

+ +The synchronization primitives themselves in synch.c are +implemented by disabling interrupts. You may need to increase the +amount of code that runs with interrupts disabled here, but you should +still try to keep it to a minimum. +

+

+ +Disabling interrupts can be useful for debugging, if you want to make +sure that a section of code is not interrupted. You should remove +debugging code before turning in your project. (Don't just comment it +out, because that can make the code difficult to read.) +

+

+ +There should be no busy waiting in your submission. A tight loop that +calls thread_yield() is one form of busy waiting. +

+

+ + +


+ +

2.1.3 Development Suggestions

+ +

+ +In the past, many groups divided the assignment into pieces, then each +group member worked on his or her piece until just before the +deadline, at which time the group reconvened to combine their code and +submit. This is a bad idea. We do not recommend this +approach. Groups that do this often find that two changes conflict +with each other, requiring lots of last-minute debugging. Some groups +who have done this have turned in code that did not even compile or +boot, much less pass any tests. +

+

+ +Instead, we recommend integrating your team's changes early and often, +using the source code control system git. +This is less likely to produce surprises, because everyone can see +everyone else's code as it is written, instead of just when it is +finished. These systems also make it possible to review changes and, +when a change introduces a bug, drop back to working versions of code. +

+

+ +You should expect to run into bugs that you simply don't understand +while working on this and subsequent projects. When you do, +reread the appendix on debugging tools, which is filled with +useful debugging tips that should help you to get back up to speed +(see section D. Debugging Tools). Be sure to read the section on backtraces +(see section D.4 Backtraces), which will help you to get the most out of every +kernel panic or assertion failure. +

+

+ + +


+ +

2.2 Understanding User Programs

+ +

+ +The tests for both the alarm clock assignment in Project 0, and the +priority scheduler in Project 1, run as part of the operating system +kernel, with full access to privileged parts of the system. +Once we start running user programs on top of the operating system, this +is no longer true. +

+

+ +We allow more than one process to run at a time. Each process has one +thread (multithreaded processes are not supported). User programs are +written under the illusion that they have the entire machine. This +means that when you load and run multiple processes at a time, you must +manage memory, scheduling, and other state correctly to maintain this +illusion. +

+

+ +In Project 2, we will test your operating system by running +user programs. This gives you much greater freedom. You must make sure +that the user program interface meets the specifications described here, +but given that constraint you are free to restructure or rewrite kernel +code however you wish. +

+

+ + +


+ +

2.2.1 Source Files

+ +

+ +

+
+
process.c +
process.h +
Loads ELF binaries and starts processes. +

+ +

+
pagedir.c +
pagedir.h +
A simple manager for 80x86 hardware page tables. +Although you probably won't want to modify this code for this project, +you may want to call some of its functions. +See Page Tables, for more information. +

+ +

+
syscall.c +
syscall.h +
Whenever a user process wants to access some kernel functionality, it +invokes a system call. +

+ +

+
exception.c +
exception.h +
When a user process performs a privileged or prohibited operation, it +traps into the kernel as an "exception" or "fault."(2) These files handle +exceptions. In project 2, you will need to modify the page fault +handler to support lazy page loading and stack growth. +

+ +

+
gdt.c +
gdt.h +
The 80x86 is a segmented architecture. The Global Descriptor +Table (GDT) is a table that describes the segments in use. These +files set up the GDT. You should not need to modify these +files for any of the projects. You can read the code if +you're interested in how the GDT works. +

+ +

+
tss.c +
tss.h +
The Task-State Segment (TSS) is used for 80x86 architectural +task switching. Pintos uses the TSS only for switching stacks when a +user process enters an interrupt handler, as does Linux. You +should not need to modify these files for any of the projects. +You can read the code if you're interested in how the TSS +works. +
+

+ + +


+ +

2.2.2 Using the File System

+ +

+ +You will need to interface to the file system code, because +user programs are loaded from the file system and most of the +system calls you must implement deal with the file system. +You will want to look over the filesys.h and file.h +interfaces to understand how to use the file system, and especially +its many limitations. +

+

+ +There is no need to modify the file system code in this course, and so +we recommend that you do not. Working on the file system is likely to +distract you from the project's foci. +

+

+ +You will have to tolerate the following limitations, however: +

+

+ +

+

+ +One important feature is included: +

+

+ +

+

+ +You need to be able to create a simulated disk with a file system +partition. The pintos-mkdisk program provides this +functionality. From the userprog/build directory, execute +pintos-mkdisk filesys.dsk --filesys-size=2. This command +creates a simulated disk named filesys.dsk that contains a 2 +MB Pintos file system partition. Then format the file system +partition by passing -f -q on the kernel's command line: +pintos -f -q. The -f option causes the file system to +be formatted, and -q causes Pintos to exit as soon as the +format is done. +

+

+ +You'll need a way to copy files in and out of the simulated file system. +The pintos -p ("put") and -g ("get") +options do this. To copy file into the +Pintos file system, use the command pintos -p file -- -q. +(The -- is needed because -p is for the pintos +script, not for the simulated kernel.) To copy it to the Pintos file +system under the name newname, add -a +newname: pintos -p file -a newname -- -q. The +commands for copying files out of a VM are similar, but substitute +-g for -p. +

+

+ +Incidentally, these commands work by passing special commands +extract and append on the kernel's command line and copying +to and from a special simulated "scratch" partition. If you're very +curious, you can look at the pintos script as well as +filesys/fsutil.c to learn the implementation details. +

+

+ +Here's a summary of how to create a disk with a file system partition, +format the file system, copy the echo program into the new +disk, and then run echo, passing argument x. +(Argument passing won't work until you implemented it.) It assumes +that you've already built the examples in examples and that the +current directory is userprog/build: +

+

+ +
 
pintos-mkdisk filesys.dsk --filesys-size=2
+pintos -f -q
+pintos -p ../../examples/echo -a echo -- -q
+pintos -q run 'echo x'
+

+ +The three final steps can actually be combined into a single command: +

+

+ +
 
pintos-mkdisk filesys.dsk --filesys-size=2
+pintos -p ../../examples/echo -a echo -- -f -q run 'echo x'
+

+ +If you don't want to keep the file system disk around for later use or +inspection, you can even combine all four steps into a single command. +The --filesys-size=n option creates a temporary file +system partition +approximately n megabytes in size just for the duration of the +pintos run. The Pintos automatic test suite makes extensive +use of this syntax: +

+

+ +
 
pintos --filesys-size=2 -p ../../examples/echo -a echo -- -f -q run 'echo x'
+

+ +You can delete a file from the Pintos file system using the rm +file kernel action, e.g. pintos -q rm file. Also, +ls lists the files in the file system and cat +file prints a file's contents to the display. +

+

+ + +


+ +

2.2.3 How User Programs Work

+ +

+ +Pintos can run normal C programs, as long as they fit into memory and use +only the system calls you implement. Notably, malloc() cannot be +implemented because none of the system calls required for this project +allow for memory allocation. Pintos also can't run programs that use +floating point operations, since the kernel doesn't save and restore the +processor's floating-point unit when switching threads. +

+

+ +The src/examples directory contains a few sample user +programs. The Makefile in this directory +compiles the provided examples, and you can edit it +compile your own programs as well. Some of the example programs will +not work with the current implementation of Pintos. +

+

+ +Pintos can load ELF executables with the loader provided for you +in userprog/process.c. ELF is a file format used by Linux, +Solaris, and many other operating systems for object files, +shared libraries, and executables. You can actually use any compiler +and linker that output 80x86 ELF executables to produce programs +for Pintos. (We've provided compilers and linkers that should do just +fine.) +

+

+ +You should realize immediately that, until you copy a +test program to the simulated file system, Pintos will be unable to do +useful work. You won't be able to do +interesting things until you copy a variety of programs to the file system. +You might want to create a clean reference file system disk and copy that +over whenever you trash your filesys.dsk beyond a useful state, +which may happen occasionally while debugging. +

+

+ + +


+ +

2.2.4 Virtual Memory Layout

+ +

+ +Virtual memory in Pintos is divided into two regions: user virtual +memory and kernel virtual memory. User virtual memory ranges from +virtual address 0 up to PHYS_BASE, which is defined in +threads/vaddr.h and defaults to 0xc0000000 (3 GB). Kernel +virtual memory occupies the rest of the virtual address space, from +PHYS_BASE up to 4 GB. +

+

+ +User virtual memory is per-process. +When the kernel switches from one process to another, it +also switches user virtual address spaces by changing the processor's +page directory base register (see pagedir_activate() in +userprog/pagedir.c). struct thread contains a pointer to a +process's page table. +

+

+ +Kernel virtual memory is global. It is always mapped the same way, +regardless of what user process or kernel thread is running. In +Pintos, kernel virtual memory is mapped one-to-one to physical +memory, starting at PHYS_BASE. That is, virtual address +PHYS_BASE accesses physical +address 0, virtual address PHYS_BASE + 0x1234 accesses +physical address 0x1234, and so on up to the size of the machine's +physical memory. +

+

+ +A user program can only access its own user virtual memory. An attempt to +access kernel virtual memory causes a page fault, handled by +page_fault() in userprog/exception.c, and the process +will be terminated. Kernel threads can access both kernel virtual +memory and, if a user process is running, the user virtual memory of +the running process. However, even in the kernel, an attempt to +access memory at an unmapped user virtual address +will cause a page fault. +

+

+ + +


+ +

2.2.4.1 Typical Memory Layout

+ +

+ +Conceptually, each process is +free to lay out its own user virtual memory however it +chooses. In practice, user virtual memory is laid out like this: +

+

+ +

+
 
   PHYS_BASE +----------------------------------+
+             |            user stack            |
+             |                 |                |
+             |                 |                |
+             |                 V                |
+             |          grows downward          |
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+             |           grows upward           |
+             |                 ^                |
+             |                 |                |
+             |                 |                |
+             +----------------------------------+
+             | uninitialized data segment (BSS) |
+             +----------------------------------+
+             |     initialized data segment     |
+             +----------------------------------+
+             |           code segment           |
+  0x08048000 +----------------------------------+
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+             |                                  |
+           0 +----------------------------------+
+
+

+ +In this project, the user stack is fixed in size, but in project 2 it +will be allowed to grow. Traditionally, the size of the uninitialized +data segment can be adjusted with a system call, but you will not have +to implement this. +

+

+ +The code segment in Pintos starts at user virtual address +0x08084000, approximately 128 MB from the bottom of the address +space. This value is specified in [ SysV-i386] and has no deep +significance. +

+

+ +The linker sets the layout of a user program in memory, as directed by a +"linker script" that tells it the names and locations of the various +program segments. You can learn more about linker scripts by reading +the "Scripts" chapter in the linker manual, accessible via info +ld. +

+

+ +To view the layout of a particular executable, run objdump +(80x86) or i386-elf-objdump (SPARC) with the -p +option. +

+

+ + +


+ +

2.2.5 Accessing User Memory

+ +

+ +As part of a system +call, the kernel must often access memory through pointers provided by a user +program. The kernel must be very careful about doing so, because +the user can pass a null pointer, a pointer to +unmapped virtual memory, or a pointer to kernel virtual address space +(above PHYS_BASE). All of these types of invalid pointers must +be rejected without harm to the kernel or other running processes, by +terminating the offending process and freeing its resources. +

+

+ +There are at least two reasonable ways to do this correctly. The +first method is to verify +the validity of a user-provided pointer, then dereference it. +The second method is to check only that a user +pointer points below PHYS_BASE, then dereference it. +An invalid user pointer will cause a "page fault" that you can +handle by modifying the code for page_fault() in +userprog/exception.c. This technique is normally faster +because it takes advantage of the processor's MMU, so it tends to be +used in real kernels (including Linux). It is also the way +access to user pointers is implemented in the Pintos version provided. +

+

+ +In either case, one needs to make sure not to "leak" resources. For +example, suppose that your system call has acquired a lock or +allocated memory with malloc(). If you encounter an invalid user pointer +afterward, you must still be sure to release the lock or free the page +of memory. If you choose to verify user pointers before dereferencing +them, this should be straightforward. It's more difficult to handle +if an invalid pointer causes a page fault, +because there's no way to return an error code from a memory access. +

+

+ + +


+ +

2.3 Requirements

+ +

+ + +


+ +

2.3.1 Design Document

+ +

+ +Before you turn in your project, you must copy the +project 0 design document template into your source tree under the name +pintos/src/intro/DESIGNDOC and fill it in. We recommend that +you read the design document template before you start working on the +project. See section C. Project Documentation, for a sample design document +that goes along with a fictitious project. +

+

+ + +


+ +

2.3.2 Alarm Clock

+ +

+ +Reimplement timer_sleep(), defined in devices/timer.c. +Although a working implementation is provided, it "busy waits," that +is, it spins in a loop checking the current time and calling +thread_yield() until enough time has gone by. Reimplement it to +avoid busy waiting. +

+

+ + +

+
+
Function: void timer_sleep (int64_t ticks) +
Suspends execution of the calling thread until time has advanced by at +least x timer ticks. Unless the system is otherwise idle, the +thread need not wake up after exactly x ticks. Just put it on +the ready queue after they have waited for the right amount of time. +

+ +timer_sleep() is useful for threads that operate in real-time, +e.g. for blinking the cursor once per second. +

+

+ +The argument to timer_sleep() is expressed in timer ticks, not in +milliseconds or any another unit. There are TIMER_FREQ timer +ticks per second, where TIMER_FREQ is a macro defined in +devices/timer.h. The default value is 100. We don't recommend +changing this value, because any change is likely to cause many of +the tests to fail. +

+
+

+ +Separate functions timer_msleep(), timer_usleep(), and +timer_nsleep() do exist for sleeping a specific number of +milliseconds, microseconds, or nanoseconds, respectively, but these will +call timer_sleep() automatically when necessary. You do not need +to modify them. +

+

+ +If your delays seem too short or too long, reread the explanation of the +-r option to pintos (see section 1.1.4 Debugging versus Testing). +

+

+ +The tests for the 2.3.2 Alarm Clock assignment are executed by changing the +working directory intro. The run make to build the +Pintos kernel. Finally run make check to run the tests, +followed by make grade to obtain your score. +

+

+ +The alarm clock implementation is not needed for later projects. +

+

+ + +


+ +

2.3.3 Argument Passing

+ +

+ +Currently, process_execute() does not support passing arguments to +new processes. Implement this functionality, by extending +process_execute() so that instead of simply taking a program file +name as its argument, it divides it into words at spaces. The first +word is the program name, the second word is the first argument, and so +on. That is, process_execute("grep foo bar") should run +grep passing two arguments foo and bar. +

+

+ +Within a command line, multiple spaces are equivalent to a single +space, so that process_execute("grep foo bar") +is equivalent to our original example. You can impose a reasonable +limit on the length of the command line arguments. For example, you +could limit the arguments to those that will fit in a single page (4 +kB). (There is an unrelated limit of 128 bytes on command-line +arguments that the pintos utility can pass to the kernel.) +

+

+ +You can parse argument strings any way you like. If you're lost, +look at strtok_r(), prototyped in lib/string.h and +implemented with thorough comments in lib/string.c. You can +find more about it by looking at the man page (run man strtok_r +at the prompt). +

+

+ +See section 2.5.1 Program Startup Details, for information on exactly how you +need to set up the stack. +

+

+ + +


+ +

2.4 FAQ

+ +

+ +

+
+
How much code will I need to write? +

+ +Here's a summary of our reference solution, produced by the +diffstat program. The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. +

+

+ +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. +

+

+ +
 
 devices/timer.c    |   40 +++++++++++-
+ threads/thread.h   |    3 +
+ userprog/process.c |  148 ++++++++++++++++++++++++++++++-----------
+ 3 files changed, 150 insertions(+), 41 deletions(-)
+

+

+ + +


+ +

2.4.1 Threads FAQ

+ +

+ +

+
+
How do I update the Makefiles when I add a new source file? +

+ + +To add a .c file, edit the top-level Makefile.build. +Add the new file to variable dir_SRC, where +dir is the directory where you added the file. For this +project, that means you should add it to threads_SRC or +devices_SRC. Then run make. If your new file +doesn't get +compiled, run make clean and then try again. +

+

+ +When you modify the top-level Makefile.build and re-run +make, the modified +version should be automatically copied to +threads/build/Makefile. The converse is +not true, so any changes will be lost the next time you run make +clean from the threads directory. Unless your changes are +truly temporary, you should prefer to edit Makefile.build. +

+

+ +A new .h file does not require editing the Makefiles. +

+

+ +

+
What does warning: no previous prototype for `func' mean? +

+ +It means that you defined a non-static function without +preceding it by a prototype. Because non-static functions are +intended for use by other .c files, for safety they should be +prototyped in a header file included before their definition. To fix +the problem, add a prototype in a header file that you include, or, if +the function isn't actually used by other .c files, make it +static. +

+

+ +

+
What is the interval between timer interrupts? +

+ +Timer interrupts occur TIMER_FREQ times per second. You can +adjust this value by editing devices/timer.h. The default is +100 Hz. +

+

+ +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. +

+

+ +

+
How long is a time slice? +

+ +There are TIME_SLICE ticks per time slice. This macro is +declared in threads/thread.c. The default is 4 ticks. +

+

+ +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. +

+

+ +

+
How do I run the tests? +

+ +See section 1.2.1 Testing. +

+

+ +See section D.4 Backtraces, for more information. +

+

+ + +


+ +

2.4.2 Alarm Clock FAQ

+ +

+ +

+
+
Do I need to account for timer values overflowing? +

+ +Don't worry about the possibility of timer values overflowing. Timer +values are expressed as signed 64-bit numbers, which at 100 ticks per +second should be good for almost 2,924,712,087 years. By then, we +expect Pintos to have been phased out of the curriculum. +

+

+ + +


+ +

2.4.3 Userprog FAQ

+ +

+ +

+
+
The kernel always panics when I run pintos -p file -- -q. +

+ +Did you format the file system (with pintos -f)? +

+

+ +Is your file name too long? The file system limits file names to 14 +characters. A command like pintos -p ../../examples/echo -- -q +will exceed the limit. Use pintos -p ../../examples/echo -a echo +-- -q to put the file under the name echo instead. +

+

+ +Is the file system full? +

+

+ +Does the file system already contain 16 files? The base Pintos file +system has a 16-file limit. +

+

+ +The file system may be so fragmented that there's not enough contiguous +space for your file. +

+

+ +

+
When I run pintos -p ../file --, file isn't copied. +

+ +Files are written under the name you refer to them, by default, so in +this case the file copied in would be named ../file. You +probably want to run pintos -p ../file -a file -- instead. +

+

+ +You can list the files in your file system with pintos -q ls. +

+

+ +

+
All my user programs die with page faults. +

+ +This will happen if you haven't implemented argument passing +(or haven't done so correctly). The basic C library for user programs tries +to read argc and argv off the stack. If the stack +isn't properly set up, this causes a page fault. +

+

+ +

+
How can I disassemble user programs? +

+ +The objdump (80x86) or i386-elf-objdump +(SPARC) utility can disassemble entire user +programs or object files. Invoke it as objdump -d +file. You can use GDB's +disassemble command to disassemble individual functions +(see section D.5 GDB). +

+

+ +

+
Why do many C include files not work in Pintos programs? +
Can I use libfoo in my Pintos programs? +

+ +The C library we provide is very limited. It does not include many of +the features that are expected of a real operating system's C library. +The C library must be built specifically for the operating system (and +architecture), since it must make system calls for I/O and memory +allocation. (Not all functions do, of course, but usually the library +is compiled as a unit.) +

+

+ +The chances are good that the library you want uses parts of the C library +that Pintos doesn't implement. It will probably take at least some +porting effort to make it work under Pintos. Notably, the Pintos +user program C library does not have a malloc() implementation. +

+

+ +

+
How do I compile new user programs? +

+ +Modify src/examples/Makefile, then run make. +

+

+ +

+
Can I run user programs under a debugger? +

+ +Yes, with some limitations. See section D.5 GDB. +

+

+ +

+
How can I run user programs that need more than 4 kB stack space? +

+ +You may modify the stack setup code to allocate more than one page of +stack space for each process. In project 2, you will implement a better +solution. +

+

+ +

+
What happens when an open file is removed? +
+

+ +You should implement the standard Unix semantics for files. That is, when +a file is removed any process which has a file descriptor for that file +may continue to use that descriptor. This means that +they can read and write from the file. The file will not have a name, +and no other processes will be able to open it, but it will continue +to exist until all file descriptors referring to the file are closed +or the machine shuts down. +

+

+ +

+

+ + +


+ +

2.4.4 Argument Passing FAQ

+ +

+ +

+
+
Isn't the top of stack in kernel virtual memory? +

+ +The top of stack is at PHYS_BASE, typically 0xc0000000, which +is also where kernel virtual memory starts. +But before the processor pushes data on the stack, it decrements the stack +pointer. Thus, the first (4-byte) value pushed on the stack +will be at address 0xbffffffc. +

+

+ +

+
Is PHYS_BASE fixed? +

+ +No. You should be able to support PHYS_BASE values that are +any multiple of 0x10000000 from 0x80000000 to 0xf0000000, +simply via recompilation. +

+

+ + +


+ +

2.5 80x86 Calling Convention

+ +

+ +This section summarizes important points of the convention used for +normal function calls on 32-bit 80x86 implementations of Unix. +Some details are omitted for brevity. If you do want all the details, +refer to [ SysV-i386]. +

+

+ +The calling convention works like this: +

+

+ +

    +
  1. +The caller pushes each of the function's arguments on the stack one by +one, normally using the PUSH assembly language instruction. +Arguments are pushed in right-to-left order. +

    + +The stack grows downward: each push decrements the stack pointer, then +stores into the location it now points to, like the C expression +*--sp = value. +

    +

    + +

    +
  2. +The caller pushes the address of its next instruction (the return +address) on the stack and jumps to the first instruction of the callee. +A single 80x86 instruction, CALL, does both. +

    + +

    +
  3. +The callee executes. When it takes control, the stack pointer points to +the return address, the first argument is just above it, the second +argument is just above the first argument, and so on. +

    + +

    +
  4. +If the callee has a return value, it stores it into register EAX. +

    + +

    +
  5. +The callee returns by popping the return address from the stack and +jumping to the location it specifies, using the 80x86 RET +instruction. +

    + +

    +
  6. +The caller pops the arguments off the stack. +
+

+ +Consider a function f() that takes three int arguments. +This diagram shows a sample stack frame as seen by the callee at the +beginning of step 3 above, supposing that f() is invoked as +f(1, 2, 3). The initial stack address is arbitrary: +

+

+ +

+
 
                             +----------------+
+                  0xbffffe7c |        3       |
+                  0xbffffe78 |        2       |
+                  0xbffffe74 |        1       |
+stack pointer --> 0xbffffe70 | return address |
+                             +----------------+
+
+

+ + +


+ +

2.5.1 Program Startup Details

+ +

+ +The Pintos C library for user programs designates _start(), in +lib/user/entry.c, as the entry point for user programs. This +function is a wrapper around main() that calls exit() if +main() returns: +

+

+ +
 
void
+_start (int argc, char *argv[])
+{
+  exit (main (argc, argv));
+}
+

+ +The kernel must put the arguments for the initial function on the stack +before it allows the user program to begin executing. The arguments are +passed in the same way as the normal calling convention (see section 2.5 80x86 Calling Convention). +

+

+ +Consider how to handle arguments for the following example command: +/bin/ls -l foo bar. +First, break the command into words: /bin/ls, +-l, foo, bar. Place the words at the top of the +stack. Order doesn't matter, because they will be referenced through +pointers. +

+

+ +Then, push the address of each string plus a null pointer sentinel, on +the stack, in right-to-left order. These are the elements of +argv. The null pointer sentinel ensures that argv[argc] +is a null pointer, as required by the C standard. The order ensures +that argv[0] is at the lowest virtual address. Word-aligned +accesses are faster than unaligned accesses, so for best performance +round the stack pointer down to a multiple of 4 before the first push. +

+

+ +Then, push argv (the address of argv[0]) and argc, +in that order. Finally, push a fake "return address": although the +entry function will never return, its stack frame must have the same +structure as any other. +

+

+ +The table below shows the state of the stack and the relevant registers +right before the beginning of the user program, assuming +PHYS_BASE is 0xc0000000: +

+

+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Address Name Data Type
0xbffffffc argv[3][...] bar\0 char[4]
0xbffffff8 argv[2][...] foo\0 char[4]
0xbffffff5 argv[1][...] -l\0 char[3]
0xbfffffed argv[0][...] /bin/ls\0 char[8]
0xbfffffec word-align 0 uint8_t
0xbfffffe8 argv[4] 0 char *
0xbfffffe4 argv[3] 0xbffffffc char *
0xbfffffe0 argv[2] 0xbffffff8 char *
0xbfffffdc argv[1] 0xbffffff5 char *
0xbfffffd8 argv[0] 0xbfffffed char *
0xbfffffd4 argv 0xbfffffd8 char **
0xbfffffd0 argc 4 int
0xbfffffcc return address 0 void (*) ()
+
+

+ +In this example, the stack pointer would be initialized to +0xbfffffcc. +

+

+ +As shown above, your code should start the stack at the very top of +the user virtual address space, in the page just below virtual address +PHYS_BASE (defined in threads/vaddr.h). +

+

+ +You may find the non-standard hex_dump() function, declared in +<stdio.h>, useful for debugging your argument passing code. +Here's what it would show in the above example: +

+

+ +
 
bfffffc0                                      00 00 00 00 |            ....|
+bfffffd0  04 00 00 00 d8 ff ff bf-ed ff ff bf f5 ff ff bf |................|
+bfffffe0  f8 ff ff bf fc ff ff bf-00 00 00 00 00 2f 62 69 |............./bi|
+bffffff0  6e 2f 6c 73 00 2d 6c 00-66 6f 6f 00 62 61 72 00 |n/ls.-l.foo.bar.|
+

+ + +


+ +

2.5.2 System Call Details

+ +

+ +We already know one way that the operating system +can regain control from a user program: interrupts from timers and I/O +devices. These are "external" interrupts, because they are caused +by entities outside the CPU (see section A.4.3 External Interrupt Handling). +

+

+ +The operating system also deals with software exceptions, which are +events that occur in program code (see section A.4.2 Internal Interrupt Handling). These can be errors such as a page fault or division by +zero. Exceptions are also the means by which a user program +can request services ("system calls") from the operating system. +

+

+ +In the 80x86 architecture, the int instruction is the +most commonly used means for invoking system calls. This instruction +is handled in the same way as other software exceptions. In Pintos, +user programs invoke int $0x30 to make a system call. The +system call number and any additional arguments are expected to be +pushed on the stack in the normal fashion before invoking the +interrupt (see section 2.5 80x86 Calling Convention). +

+

+ +Thus, when the system call handler syscall_handler() gets control, +the system call number is in the 32-bit word at the caller's stack +pointer, the first argument is in the 32-bit word at the next higher +address, and so on. The caller's stack pointer is accessible to +syscall_handler() as the esp member of the +struct intr_frame passed to it. (struct intr_frame is on the kernel +stack.) +

+

+ +The 80x86 convention for function return values is to place them +in the EAX register. System calls that return a value can do +so by modifying the eax member of struct intr_frame. +

+

+ +You should try to avoid writing large amounts of repetitive code for +implementing system calls. Each system call argument, whether an +integer or a pointer, takes up 4 bytes on the stack. You should be able +to take advantage of this to avoid writing much near-identical code for +retrieving each system call's arguments from the stack. + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_3.html b/doc/pintos_3.html new file mode 100644 index 0000000..c90148d --- /dev/null +++ b/doc/pintos_3.html @@ -0,0 +1,375 @@ + + + + + +Pintos Projects: Project 1--Threads + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

3. Project 1: Threads

+ +

+ +In this assignment, we give you a minimally functional thread system. +Your job is to extend the functionality of this system to gain a +better understanding of synchronization problems. +

+

+ +You will be working primarily in the threads directory for +this assignment, with some work in the devices directory on the +side. Compilation should be done in the threads directory. +

+

+ +Before you read the description of this project, you should read all of +the following sections: 1. Introduction, B. Coding Standards, +D. Debugging Tools, and E. Development Tools. You should at least +skim the material from A.1 Loading through A.5 Memory Allocation, especially A.3 Synchronization. +

+

+ + +


+ +

3.1 Background

+ +

+ +Before you start with project 1, be sure to refresh your knowledge +on the thread subsystem introduced in the last project +(2.1 Understanding Threads). +

+

+ + +


+ +

3.2 Requirements

+ +

+ + +


+ +

3.2.1 Design Document

+ +

+ +Before you turn in your project, you must copy the +project 1 design document template into your source tree under the name +pintos/src/threads/DESIGNDOC and fill it in. We recommend that +you read the design document template before you start working on the +project. +

+

+ + +


+ +

3.2.2 Priority Scheduling

+ +

+ +Implement priority scheduling in Pintos. +When a thread is added to the ready list that has a higher priority +than the currently running thread, the current thread should +immediately yield the processor to the new thread. Similarly, when +threads are waiting for a lock, semaphore, or condition variable, the +highest priority waiting thread should be awakened first. A thread +may raise or lower its own priority at any time, but lowering its +priority such that it no longer has the highest priority must cause it +to immediately yield the CPU. +

+

+ +Thread priorities range from PRI_MIN (0) to PRI_MAX (63). +Lower numbers correspond to lower priorities, so that priority 0 +is the lowest priority and priority 63 is the highest. +The initial thread priority is passed as an argument to +thread_create(). If there's no reason to choose another +priority, use PRI_DEFAULT (31). The PRI_ macros are +defined in threads/thread.h, and you should not change their +values. +

+

+ +One issue with priority scheduling is "priority inversion". Consider +high, medium, and low priority threads H, M, and L, +respectively. If H needs to wait for L (for instance, for a +lock held by L), and M is on the ready list, then H +will never get the CPU because the low priority thread will not get any +CPU time. A partial fix for this problem is for H to "donate" +its priority to L while L is holding the lock, then recall +the donation once L releases (and thus H acquires) the lock. +

+

+ +Implement priority donation. You will need to account for all different +situations in which priority donation is required. Be sure to handle +multiple donations, in which multiple priorities are donated to a single +thread. You must also handle nested donation: if H is waiting on +a lock that M holds and M is waiting on a lock that L +holds, then both M and L should be boosted to H's +priority. If necessary, you may impose a reasonable limit on depth of +nested priority donation, such as 8 levels. +

+

+ +You must implement priority donation for locks. You need not +implement priority donation for the other Pintos synchronization +constructs. You do need to implement priority scheduling in all +cases. +

+

+ +Finally, implement the following functions that allow a thread to +examine and modify its own priority. Skeletons for these functions are +provided in threads/thread.c. +

+

+ + +

+
+
Function: void thread_set_priority (int new_priority) +
Sets the current thread's priority to new_priority. If the +current thread no longer has the highest priority, yields. +
+

+ + +

+
+
Function: int thread_get_priority (void) +
Returns the current thread's priority. In the presence of priority +donation, returns the higher (donated) priority. +
+

+ +You need not provide any interface to allow a thread to directly modify +other threads' priorities. +

+

+ +The priority scheduler is not a necessary for project 2. +

+

+ + +


+ +

3.3 FAQ

+ +

+ +

+
+
How much code will I need to write? +

+ +Here's a summary of our reference solution, produced by the +diffstat program. The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. +

+

+ +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. +

+

+ +
 
 threads/interrupt.c |    3 +-
+ threads/synch.c     |   55 ++++++++++++++++++--
+ threads/synch.h     |    2 +
+ threads/thread.c    |  111 +++++++++++++++++++++++++++++++++++-----
+ threads/thread.h    |   10 +++-
+ 5 files changed, 160 insertions(+), 21 deletions(-)
+

+ +

+
Doesn't priority scheduling lead to starvation? +

+ +Yes, strict priority scheduling can lead to starvation +because a thread will not run if any higher-priority thread is runnable. +The advanced scheduler introduces a mechanism for dynamically +changing thread priorities. +

+

+ +Strict priority scheduling is valuable in real-time systems because it +offers the programmer more control over which jobs get processing +time. High priorities are generally reserved for time-critical +tasks. It's not "fair," but it addresses other concerns not +applicable to a general-purpose operating system. +

+

+ +

+
What thread should run after a lock has been released? +

+ +When a lock is released, the highest priority thread waiting for that +lock should be unblocked and put on the list of ready threads. The +scheduler should then run the highest priority thread on the ready +list. +

+

+ +

+
If the highest-priority thread yields, does it continue running? +

+ +Yes. If there is a single highest-priority thread, it continues +running until it blocks or finishes, even if it calls +thread_yield(). +If multiple threads have the same highest priority, +thread_yield() should switch among them in "round robin" order. +

+

+ +

+
What happens to the priority of a donating thread? +

+ +Priority donation only changes the priority of the donee +thread. The donor thread's priority is unchanged. +Priority donation is not additive: if thread A (with priority 5) donates +to thread B (with priority 3), then B's new priority is 5, not 8. +

+

+ +

+
Can a thread's priority change while it is on the ready queue? +

+ +Yes. Consider a ready, low-priority thread L that holds a lock. +High-priority thread H attempts to acquire the lock and blocks, +thereby donating its priority to ready thread L. +

+

+ +

+
Can a thread's priority change while it is blocked? +

+ +Yes. While a thread that has acquired lock L is blocked for any +reason, its priority can increase by priority donation if a +higher-priority thread attempts to acquire L. This case is +checked by the priority-donate-sema test. +

+

+ +

+
Can a thread added to the ready list preempt the processor? +

+ +Yes. If a thread added to the ready list has higher priority than the +running thread, the correct behavior is to immediately yield the +processor. It is not acceptable to wait for the next timer interrupt. +The highest priority thread should run as soon as it is runnable, +preempting whatever thread is currently running. +

+

+ +

+
How does thread_set_priority() affect a thread receiving donations? +

+ +It sets the thread's base priority. The thread's effective priority +becomes the higher of the newly set priority or the highest donated +priority. When the donations are released, the thread's priority +becomes the one set through the function call. This behavior is checked +by the priority-donate-lower test. +

+

+ +

+
Doubled test names in output make them fail. +

+ +Suppose you are seeing output in which some test names are doubled, +like this: +

+

+ +
 
(alarm-priority) begin
+(alarm-priority) (alarm-priority) Thread priority 30 woke up.
+Thread priority 29 woke up.
+(alarm-priority) Thread priority 28 woke up.
+

+ +What is happening is that output from two threads is being +interleaved. That is, one thread is printing "(alarm-priority) +Thread priority 29 woke up.\n" and another thread is printing +"(alarm-priority) Thread priority 30 woke up.\n", but the first +thread is being preempted by the second in the middle of its output. +

+

+ +This problem indicates a bug in your priority scheduler. After all, a +thread with priority 29 should not be able to run while a thread with +priority 30 has work to do. +

+

+ +Normally, the implementation of the printf() function in the +Pintos kernel attempts to prevent such interleaved output by acquiring +a console lock during the duration of the printf call and +releasing it afterwards. However, the output of the test name, +e.g., (alarm-priority), and the message following it is output +using two calls to printf, resulting in the console lock being +acquired and released twice. +

+ +
+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_4.html b/doc/pintos_4.html new file mode 100644 index 0000000..70be2de --- /dev/null +++ b/doc/pintos_4.html @@ -0,0 +1,83 @@ + + + + + +Pintos Projects: Project 2--Virtual Memory + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

4. Project 2: Virtual Memory

+ +

+ +By now you should have some familiarity with the inner workings of +Pintos. Pintos can properly handle multiple threads of execution with proper +synchronization, and can load multiple user programs at once. In this assignment, +you will improve the memory management of Pintos. +

+

+ +This assignment requires user programs (and in particular argument passing, +which you implemented in project 0) to work. You will continue to handle +Pintos disks and file systems the same way you did before +(see section 2.2.2 Using the File System). +

+

+ +The documentation for this assignment will be released when Project 2 +is about to start. +

+

+ + + + + + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_5.html b/doc/pintos_5.html new file mode 100644 index 0000000..670f351 --- /dev/null +++ b/doc/pintos_5.html @@ -0,0 +1,3343 @@ + + + + + +Pintos Projects: Reference Guide + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

A. Reference Guide

+ +

+ +This chapter is a reference for the Pintos code. The reference guide +does not cover all of the code in Pintos, but it does cover those +pieces that students most often find troublesome. You may find that +you want to read each part of the reference guide as you work on the +project where it becomes important. +

+

+ +We recommend using "tags" to follow along with references to function +and variable names (see section E.1 Tags). +

+

+ + +


+ +

A.1 Loading

+ +

+ +This section covers the Pintos loader and basic kernel +initialization. +

+

+ + +


+ +

A.1.1 The Loader

+ +

+ +The first part of Pintos that runs is the loader, in +threads/loader.S. The PC BIOS loads the loader into memory. +The loader, in turn, is responsible for finding the kernel on disk, +loading it into memory, and then jumping to its start. It's +not important to understand exactly how the loader works, but if +you're interested, read on. You should probably read along with the +loader's source. You should also understand the basics of the +80x86 architecture as described by chapter 3, "Basic Execution +Environment," of [ IA32-v1]. +

+

+ +The PC BIOS loads the loader from the first sector of the first hard +disk, called the master boot record (MBR). PC conventions +reserve 64 bytes of the MBR for the partition table, and Pintos uses +about 128 additional bytes for kernel command-line arguments. This +leaves a little over 300 bytes for the loader's own code. This is a +severe restriction that means, practically speaking, the loader must +be written in assembly language. +

+

+ +The Pintos loader and kernel don't have to be on the same disk, nor +does is the kernel required to be in any particular location on a +given disk. The loader's first job, then, is to find the kernel by +reading the partition table on each hard disk, looking for a bootable +partition of the type used for a Pintos kernel. +

+

+ +When the loader finds a bootable kernel partition, it reads the +partition's contents into memory at physical address 128 kB. The +kernel is at the beginning of the partition, which might be larger +than necessary due to partition boundary alignment conventions, so the +loader reads no more than 512 kB (and the Pintos build process +will refuse to produce kernels larger than that). Reading more data +than this would cross into the region from 640 kB to 1 MB that +the PC architecture reserves for hardware and the BIOS, and a standard +PC BIOS does not provide any means to load the kernel above 1 MB. +

+

+ +The loader's final job is to extract the entry point from the loaded +kernel image and transfer control to it. The entry point is not at a +predictable location, but the kernel's ELF header contains a pointer +to it. The loader extracts the pointer and jumps to the location it +points to. +

+

+ +The Pintos kernel command line +is stored in the boot loader. The pintos program actually +modifies a copy of the boot loader on disk each time it runs the kernel, +inserting whatever command-line arguments the user supplies to the kernel, +and then the kernel at boot time reads those arguments out of the boot +loader in memory. This is not an elegant solution, but it is simple +and effective. +

+

+ + +


+ +

A.1.2 Low-Level Kernel Initialization

+ +

+ +The loader's last action is to transfer control to the kernel's entry +point, which is start() in threads/start.S. The job of +this code is to switch the CPU from legacy 16-bit "real mode" into +the 32-bit "protected mode" used by all modern 80x86 operating +systems. +

+

+ +The startup code's first task is actually to obtain the machine's +memory size, by asking the BIOS for the PC's memory size. The +simplest BIOS function to do this can only detect up to 64 MB of RAM, +so that's the practical limit that Pintos can support. The function +stores the memory size, in pages, in global variable +init_ram_pages. +

+

+ +The first part of CPU initialization is to enable the A20 line, that +is, the CPU's address line numbered 20. For historical reasons, PCs +boot with this address line fixed at 0, which means that attempts to +access memory beyond the first 1 MB (2 raised to the 20th power) will +fail. Pintos wants to access more memory than this, so we have to +enable it. +

+

+ +Next, the loader creates a basic page table. This page table maps +the 64 MB at the base of virtual memory (starting at virtual address +0) directly to the identical physical addresses. It also maps the +same physical memory starting at virtual address +LOADER_PHYS_BASE, which defaults to 0xc0000000 (3 GB). The +Pintos kernel only wants the latter mapping, but there's a +chicken-and-egg problem if we don't include the former: our current +virtual address is roughly 0x20000, the location where the loader +put us, and we can't jump to 0xc0020000 until we turn on the +page table, but if we turn on the page table without jumping there, +then we've just pulled the rug out from under ourselves. +

+

+ +After the page table is initialized, we load the CPU's control +registers to turn on protected mode and paging, and set up the segment +registers. We aren't yet equipped to handle interrupts in protected +mode, so we disable interrupts. The final step is to call main(). +

+

+ + +


+ +

A.1.3 High-Level Kernel Initialization

+ +

+ +The kernel proper starts with the main() function. The +main() function is written in C, as will be most of the code we +encounter in Pintos from here on out. +

+

+ +When main() starts, the system is in a pretty raw state. We're +in 32-bit protected mode with paging enabled, but hardly anything else is +ready. Thus, the main() function consists primarily of calls +into other Pintos modules' initialization functions. +These are usually named module_init(), where +module is the module's name, module.c is the +module's source code, and module.h is the module's +header. +

+

+ +The first step in main() is to call bss_init(), which clears +out the kernel's "BSS", which is the traditional name for a +segment that should be initialized to all zeros. In most C +implementations, whenever you +declare a variable outside a function without providing an +initializer, that variable goes into the BSS. Because it's all zeros, the +BSS isn't stored in the image that the loader brought into memory. We +just use memset() to zero it out. +

+

+ +Next, main() calls read_command_line() to break the kernel command +line into arguments, then parse_options() to read any options at +the beginning of the command line. (Actions specified on the +command line execute later.) +

+

+ +thread_init() initializes the thread system. We will defer full +discussion to our discussion of Pintos threads below. It is called so +early in initialization because a valid thread structure is a +prerequisite for acquiring a lock, and lock acquisition in turn is +important to other Pintos subsystems. Then we initialize the console +and print a startup message to the console. +

+

+ +The next block of functions we call initializes the kernel's memory +system. palloc_init() sets up the kernel page allocator, which +doles out memory one or more pages at a time (see section A.5.1 Page Allocator). +malloc_init() sets +up the allocator that handles allocations of arbitrary-size blocks of +memory (see section A.5.2 Block Allocator). +paging_init() sets up a page table for the kernel (see section A.7 Page Table). +

+

+ +In projects 2 and later, main() also calls tss_init() and +gdt_init(). +

+

+ +The next set of calls initializes the interrupt system. +intr_init() sets up the CPU's interrupt descriptor table +(IDT) to ready it for interrupt handling (see section A.4.1 Interrupt Infrastructure), then timer_init() and kbd_init() prepare for +handling timer interrupts and keyboard interrupts, respectively. +input_init() sets up to merge serial and keyboard input into one +stream. In +projects 2 and later, we also prepare to handle interrupts caused by +user programs using exception_init() and syscall_init(). +

+

+ +Now that interrupts are set up, we can start the scheduler +with thread_start(), which creates the idle thread and enables +interrupts. +With interrupts enabled, interrupt-driven serial port I/O becomes +possible, so we use +serial_init_queue() to switch to that mode. Finally, +timer_calibrate() calibrates the timer for accurate short delays. +

+

+ +If the file system is compiled in, as it will starting in project 2, we +initialize the IDE disks with ide_init(), then the +file system with filesys_init(). +

+

+ +Boot is complete, so we print a message. +

+

+ +Function run_actions() now parses and executes actions specified on +the kernel command line, such as run to run a test (in project +1) or a user program (in later projects). +

+

+ +Finally, if -q was specified on the kernel command line, we +call shutdown_power_off() to terminate the machine simulator. Otherwise, +main() calls thread_exit(), which allows any other running +threads to continue running. +

+

+ + +


+ +

A.1.4 Physical Memory Map

+ +

+ +

+ +@headitem Memory Range + + + + + + + + + + + + + + + + + + + + + + + +
Owner + Contents + +
00000000--000003ff CPU Real mode interrupt table.
00000400--000005ff BIOS Miscellaneous data area.
00000600--00007bff -- ---
00007c00--00007dff Pintos Loader.
0000e000--0000efff Pintos Stack for loader; kernel stack and struct thread for initial +kernel thread. +
0000f000--0000ffff Pintos Page directory for startup code. +
00010000--00020000 Pintos Page tables for startup code. +
00020000--0009ffff Pintos Kernel code, data, and uninitialized data segments. +
000a0000--000bffff Video VGA display memory.
000c0000--000effff Hardware Reserved for expansion card RAM and ROM. +
000f0000--000fffff BIOS ROM BIOS.
00100000--03ffffff Pintos Dynamic memory allocation.
+

+ + +


+ +

A.2 Threads

+ +

+ + +


+ +

A.2.1 struct thread

+ +

+ +The main Pintos data structure for threads is struct thread, +declared in threads/thread.h. +

+

+ + +

+
+
Structure: struct thread +
Represents a thread or a user process. In the projects, you will have +to add your own members to struct thread. You may also change or +delete the definitions of existing members. +

+ +Every struct thread occupies the beginning of its own page of +memory. The rest of the page is used for the thread's stack, which +grows downward from the end of the page. It looks like this: +

+

+ +
 
                  4 kB +---------------------------------+
+                       |          kernel stack           |
+                       |                |                |
+                       |                |                |
+                       |                V                |
+                       |         grows downward          |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+                       |                                 |
+sizeof (struct thread) +---------------------------------+
+                       |              magic              |
+                       |                :                |
+                       |                :                |
+                       |              status             |
+                       |               tid               |
+                  0 kB +---------------------------------+
+

+ +This has two consequences. First, struct thread must not be allowed +to grow too big. If it does, then there will not be enough room for the +kernel stack. The base struct thread is only a few bytes in size. It +probably should stay well under 1 kB. +

+

+ +Second, kernel stacks must not be allowed to grow too large. If a stack +overflows, it will corrupt the thread state. Thus, kernel functions +should not allocate large structures or arrays as non-static local +variables. Use dynamic allocation with malloc() or +palloc_get_page() instead (see section A.5 Memory Allocation). +

+
+

+ + +

+
+
Member of struct thread: tid_t tid +
The thread's thread identifier or tid. Every thread must have a +tid that is unique over the entire lifetime of the kernel. By +default, tid_t is a typedef for int and each new +thread receives the numerically next higher tid, starting from 1 for +the initial process. You can change the type and the numbering scheme +if you like. +
+

+ + +

+
+
Member of struct thread: enum thread_status status +
+The thread's state, one of the following: +

+ + +

+
+
Thread State: THREAD_RUNNING +
The thread is running. Exactly one thread is running at a given time. +thread_current() returns the running thread. +
+

+ + +

+
+
Thread State: THREAD_READY +
The thread is ready to run, but it's not running right now. The +thread could be selected to run the next time the scheduler is +invoked. Ready threads are kept in a doubly linked list called +ready_list. +
+

+ + +

+
+
Thread State: THREAD_BLOCKED +
The thread is waiting for something, e.g. a lock to become +available, an interrupt to be invoked. The thread won't be scheduled +again until it transitions to the THREAD_READY state with a +call to thread_unblock(). This is most conveniently done +indirectly, using one of the Pintos synchronization primitives that +block and unblock threads automatically (see section A.3 Synchronization). +

+ +There is no a priori way to tell what a blocked thread is waiting +for, but a backtrace can help (see section D.4 Backtraces). +

+
+

+ + +

+
+
Thread State: THREAD_DYING +
The thread will be destroyed by the scheduler after switching to the +next thread. +
+
+

+ + +

+
+
Member of struct thread: char name[16] +
The thread's name as a string, or at least the first few characters of +it. +
+

+ + +

+
+
Member of struct thread: uint8_t *stack +
Every thread has its own stack to keep track of its state. When the +thread is running, the CPU's stack pointer register tracks the top of +the stack and this member is unused. But when the CPU switches to +another thread, this member saves the thread's stack pointer. No +other members are needed to save the thread's registers, because the +other registers that must be saved are saved on the stack. +

+ +When an interrupt occurs, whether in the kernel or a user program, an +struct intr_frame is pushed onto the stack. When the interrupt occurs +in a user program, the struct intr_frame is always at the very top of +the page. See section A.4 Interrupt Handling, for more information. +

+
+

+ + +

+
+
Member of struct thread: int priority +
A thread priority, ranging from PRI_MIN (0) to PRI_MAX +(63). Lower numbers correspond to lower priorities, so that +priority 0 is the lowest priority and priority 63 is the highest. +Pintos as provided ignores thread priorities, but you will implement +priority scheduling in project 1 (see section 3.2.2 Priority Scheduling). +
+

+ + +

+
+
Member of struct thread: struct list_elem allelem +
This "list element" is used to link the thread into the list of all +threads. Each thread is inserted into this list when it is created +and removed when it exits. The thread_foreach() function should +be used to iterate over all threads. +
+

+ + +

+
+
Member of struct thread: struct list_elem elem +
A "list element" used to put the thread into doubly linked lists, +either ready_list (the list of threads ready to run) or a list of +threads waiting on a semaphore in sema_down(). It can do double +duty because a thread waiting on a semaphore is not ready, and vice +versa. +
+

+ + +

+
+
Member of struct thread: uint32_t *pagedir +
Only present in project 2 and later. See Page Tables. +
+

+ + +

+
+
Member of struct thread: unsigned magic +
Always set to THREAD_MAGIC, which is just an arbitrary number defined +in threads/thread.c, and used to detect stack overflow. +thread_current() checks that the magic member of the running +thread's struct thread is set to THREAD_MAGIC. Stack overflow +tends to change this value, triggering the assertion. For greatest +benefit, as you add members to struct thread, leave magic at +the end. +
+

+ + +


+ +

A.2.2 Thread Functions

+ +

+ +threads/thread.c implements several public functions for thread +support. Let's take a look at the most useful: +

+

+ + +

+
+
Function: void thread_init (void) +
Called by main() to initialize the thread system. Its main +purpose is to create a struct thread for Pintos's initial thread. +This is possible because the Pintos loader puts the initial +thread's stack at the top of a page, in the same position as any other +Pintos thread. +

+ +Before thread_init() runs, +thread_current() will fail because the running thread's +magic value is incorrect. Lots of functions call +thread_current() directly or indirectly, including +lock_acquire() for locking a lock, so thread_init() is +called early in Pintos initialization. +

+
+

+ + +

+
+
Function: void thread_start (void) +
Called by main() to start the scheduler. Creates the idle +thread, that is, the thread that is scheduled when no other thread is +ready. Then enables interrupts, which as a side effect enables the +scheduler because the scheduler runs on return from the timer interrupt, using +intr_yield_on_return() (see section A.4.3 External Interrupt Handling). +
+

+ + +

+
+
Function: void thread_tick (void) +
Called by the timer interrupt at each timer tick. It keeps track of +thread statistics and triggers the scheduler when a time slice expires. +
+

+ + +

+
+
Function: void thread_print_stats (void) +
Called during Pintos shutdown to print thread statistics. +
+

+ + +

+
+
Function: tid_t thread_create (const char *name, int priority, thread_func *func, void *aux) +
Creates and starts a new thread named name with the given +priority, returning the new thread's tid. The thread executes +func, passing aux as the function's single argument. +

+ +thread_create() allocates a page for the thread's +struct thread and stack and initializes its members, then it sets +up a set of fake stack frames for it (see section A.2.3 Thread Switching). The +thread is initialized in the blocked state, then unblocked just before +returning, which allows the new thread to +be scheduled (see Thread States). +

+

+ + +

+
+
Type: void thread_func (void *aux) +
This is the type of the function passed to thread_create(), whose +aux argument is passed along as the function's argument. +
+
+

+ + +

+
+
Function: void thread_block (void) +
Transitions the running thread from the running state to the blocked +state (see Thread States). The thread will not run again until +thread_unblock() is +called on it, so you'd better have some way arranged for that to happen. +Because thread_block() is so low-level, you should prefer to use +one of the synchronization primitives instead (see section A.3 Synchronization). +
+

+ + +

+
+
Function: void thread_unblock (struct thread *thread) +
Transitions thread, which must be in the blocked state, to the +ready state, allowing it to resume running (see Thread States). +This is called when the event that the thread is waiting for occurs, +e.g. when the lock that +the thread is waiting on becomes available. +
+

+ + +

+
+
Function: struct thread *thread_current (void) +
Returns the running thread. +
+

+ + +

+
+
Function: tid_t thread_tid (void) +
Returns the running thread's thread id. Equivalent to +thread_current ()->tid. +
+

+ + +

+
+
Function: const char *thread_name (void) +
Returns the running thread's name. Equivalent to thread_current +()->name. +
+

+ + +

+
+
Function: void thread_exit (void) NO_RETURN +
Causes the current thread to exit. Never returns, hence +NO_RETURN (see section D.3 Function and Parameter Attributes). +
+

+ + +

+
+
Function: void thread_yield (void) +
Yields the CPU to the scheduler, which picks a new thread to run. The +new thread might be the current thread, so you can't depend on this +function to keep this thread from running for any particular length of +time. +
+

+ + +

+
+
Function: void thread_foreach (thread_action_func *action, void *aux) +
Iterates over all threads t and invokes action(t, aux) on each. +action must refer to a function that matches the signature +given by thread_action_func(): +

+ + +

+
+
Type: void thread_action_func (struct thread *thread, void *aux) +
Performs some action on a thread, given aux. +
+
+

+ + +

+
+
Function: int thread_get_priority (void) +
+
Function: void thread_set_priority (int new_priority) +
Stub to set and get thread priority. See section 3.2.2 Priority Scheduling. +
+

+ + +

+
+
Function: int thread_get_nice (void) +
+
Function: void thread_set_nice (int new_nice) +
+
Function: int thread_get_recent_cpu (void) +
+
Function: int thread_get_load_avg (void) +
Stubs for the advanced scheduler (not used in this course). +
+

+ + +


+ +

A.2.3 Thread Switching

+ +

+ +schedule() is responsible for switching threads. It +is internal to threads/thread.c and called only by the three +public thread functions that need to switch threads: +thread_block(), thread_exit(), and thread_yield(). +Before any of these functions call schedule(), they disable +interrupts (or ensure that they are already disabled) and then change +the running thread's state to something other than running. +

+

+ +schedule() is short but tricky. It records the +current thread in local variable cur, determines the next thread +to run as local variable next (by calling +next_thread_to_run()), and then calls switch_threads() to do +the actual thread switch. The thread we switched to was also running +inside switch_threads(), as are all the threads not currently +running, so the new thread now returns out of +switch_threads(), returning the previously running thread. +

+

+ +switch_threads() is an assembly language routine in +threads/switch.S. It saves registers on the stack, saves the +CPU's current stack pointer in the current struct thread's stack +member, restores the new thread's stack into the CPU's stack +pointer, restores registers from the stack, and returns. +

+

+ +The rest of the scheduler is implemented in thread_schedule_tail(). It +marks the new thread as running. If the thread we just switched from +is in the dying state, then it also frees the page that contained the +dying thread's struct thread and stack. These couldn't be freed +prior to the thread switch because the switch needed to use it. +

+

+ +Running a thread for the first time is a special case. When +thread_create() creates a new thread, it goes through a fair +amount of trouble to get it started properly. In particular, the new +thread hasn't started running yet, so there's no way for it to be +running inside switch_threads() as the scheduler expects. To +solve the problem, thread_create() creates some fake stack frames +in the new thread's stack: +

+

+ +

+

+ + +


+ +

A.3 Synchronization

+ +

+ +If sharing of resources between threads is not handled in a careful, +controlled fashion, the result is usually a big mess. +This is especially the case in operating system kernels, where +faulty sharing can crash the entire machine. Pintos provides several +synchronization primitives to help out. +

+

+ + +


+ +

A.3.1 Disabling Interrupts

+ +

+ +The crudest way to do synchronization is to disable interrupts, that +is, to temporarily prevent the CPU from responding to interrupts. If +interrupts are off, no other thread will preempt the running thread, +because thread preemption is driven by the timer interrupt. If +interrupts are on, as they normally are, then the running thread may +be preempted by another at any time, whether between two C statements +or even within the execution of one. +

+

+ +Incidentally, this means that Pintos is a "preemptible kernel," that +is, kernel threads can be preempted at any time. Traditional Unix +systems are "nonpreemptible," that is, kernel threads can only be +preempted at points where they explicitly call into the scheduler. +(User programs can be preempted at any time in both models.) As you +might imagine, preemptible kernels require more explicit +synchronization. +

+

+ +You should have little need to set the interrupt state directly. Most +of the time you should use the other synchronization primitives +described in the following sections. The main reason to disable +interrupts is to synchronize kernel threads with external interrupt +handlers, which cannot sleep and thus cannot use most other forms of +synchronization (see section A.4.3 External Interrupt Handling). +

+

+ +Some external interrupts cannot be postponed, even by disabling +interrupts. These interrupts, called non-maskable interrupts +(NMIs), are supposed to be used only in emergencies, e.g. when the +computer is on fire. Pintos does not handle non-maskable interrupts. +

+

+ +Types and functions for disabling and enabling interrupts are in +threads/interrupt.h. +

+

+ + +

+
+
Type: enum intr_level +
One of INTR_OFF or INTR_ON, denoting that interrupts are +disabled or enabled, respectively. +
+

+ + +

+
+
Function: enum intr_level intr_get_level (void) +
Returns the current interrupt state. +
+

+ + +

+
+
Function: enum intr_level intr_set_level (enum intr_level level) +
Turns interrupts on or off according to level. Returns the +previous interrupt state. +
+

+ + +

+
+
Function: enum intr_level intr_enable (void) +
Turns interrupts on. Returns the previous interrupt state. +
+

+ + +

+
+
Function: enum intr_level intr_disable (void) +
Turns interrupts off. Returns the previous interrupt state. +
+

+ + +


+ +

A.3.2 Semaphores

+ +

+ +A semaphore is a nonnegative integer together with two operators +that manipulate it atomically, which are: +

+

+ +

+

+ +A semaphore initialized to 0 may be used to wait for an event +that will happen exactly once. For example, suppose thread A +starts another thread B and wants to wait for B to signal +that some activity is complete. A can create a semaphore +initialized to 0, pass it to B as it starts it, and then +"down" the semaphore. When B finishes its activity, it +"ups" the semaphore. This works regardless of whether A +"downs" the semaphore or B "ups" it first. +

+

+ +A semaphore initialized to 1 is typically used for controlling access +to a resource. Before a block of code starts using the resource, it +"downs" the semaphore, then after it is done with the resource it +"ups" the resource. In such a case a lock, described below, may be +more appropriate. +

+

+ +Semaphores can also be initialized to values larger than 1. These are +rarely used. +

+

+ +Semaphores were invented by Edsger Dijkstra and first used in the THE +operating system ([ Dijkstra]). +

+

+ +Pintos' semaphore type and operations are declared in +threads/synch.h. +

+

+ + +

+
+
Type: struct semaphore +
Represents a semaphore. +
+

+ + +

+
+
Function: void sema_init (struct semaphore *sema, unsigned value) +
Initializes sema as a new semaphore with the given initial +value. +
+

+ + +

+
+
Function: void sema_down (struct semaphore *sema) +
Executes the "down" or "P" operation on sema, waiting for +its value to become positive and then decrementing it by one. +
+

+ + +

+
+
Function: bool sema_try_down (struct semaphore *sema) +
Tries to execute the "down" or "P" operation on sema, +without waiting. Returns true if sema +was successfully decremented, or false if it was already +zero and thus could not be decremented without waiting. Calling this +function in a +tight loop wastes CPU time, so use sema_down() or find a +different approach instead. +
+

+ + +

+
+
Function: void sema_up (struct semaphore *sema) +
Executes the "up" or "V" operation on sema, +incrementing its value. If any threads are waiting on +sema, wakes one of them up. +

+ +Unlike most synchronization primitives, sema_up() may be called +inside an external interrupt handler (see section A.4.3 External Interrupt Handling). +

+
+

+ +Semaphores are internally built out of disabling interrupt +(see section A.3.1 Disabling Interrupts) and thread blocking and unblocking +(thread_block() and thread_unblock()). Each semaphore maintains +a list of waiting threads, using the linked list +implementation in lib/kernel/list.c. +

+

+ + +


+ +

A.3.3 Locks

+ +

+ +A lock is like a semaphore with an initial value of 1 +(see section A.3.2 Semaphores). A lock's equivalent of "up" is called +"release", and the "down" operation is called "acquire". +

+

+ +Compared to a semaphore, a lock has one added restriction: only the +thread that acquires a lock, called the lock's "owner", is allowed to +release it. If this restriction is a problem, it's a good sign that a +semaphore should be used, instead of a lock. +

+

+ +Locks in Pintos are not "recursive," that is, it is an error for the +thread currently holding a lock to try to acquire that lock. +

+

+ +Lock types and functions are declared in threads/synch.h. +

+

+ + +

+
+
Type: struct lock +
Represents a lock. +
+

+ + +

+
+
Function: void lock_init (struct lock *lock) +
Initializes lock as a new lock. +The lock is not initially owned by any thread. +
+

+ + +

+
+
Function: void lock_acquire (struct lock *lock) +
Acquires lock for the current thread, first waiting for +any current owner to release it if necessary. +
+

+ + +

+
+
Function: bool lock_try_acquire (struct lock *lock) +
Tries to acquire lock for use by the current thread, without +waiting. Returns true if successful, false if the lock is already +owned. Calling this function in a tight loop is a bad idea because it +wastes CPU time, so use lock_acquire() instead. +
+

+ + +

+
+
Function: void lock_release (struct lock *lock) +
Releases lock, which the current thread must own. +
+

+ + +

+
+
Function: bool lock_held_by_current_thread (const struct lock *lock) +
Returns true if the running thread owns lock, +false otherwise. +There is no function to test whether an arbitrary thread owns a lock, +because the answer could change before the caller could act on it. +
+

+ + +


+ +

A.3.4 Monitors

+ +

+ +A monitor is a higher-level form of synchronization than a +semaphore or a lock. A monitor consists of data being synchronized, +plus a lock, called the monitor lock, and one or more +condition variables. Before it accesses the protected data, a +thread first acquires the monitor lock. It is then said to be "in the +monitor". While in the monitor, the thread has control over all the +protected data, which it may freely examine or modify. When access to +the protected data is complete, it releases the monitor lock. +

+

+ +Condition variables allow code in the monitor to wait for a condition to +become true. Each condition variable is associated with an abstract +condition, e.g. "some data has arrived for processing" or "over 10 +seconds has passed since the user's last keystroke". When code in the +monitor needs to wait for a condition to become true, it "waits" on +the associated condition variable, which releases the lock and waits for +the condition to be signaled. If, on the other hand, it has caused one +of these conditions to become true, it "signals" the condition to wake +up one waiter, or "broadcasts" the condition to wake all of them. +

+

+ +The theoretical framework for monitors was laid out by C. A. R. +Hoare ([ Hoare]). Their practical usage was later elaborated in a +paper on the Mesa operating system ([ Lampson]). +

+

+ +Condition variable types and functions are declared in +threads/synch.h. +

+

+ + +

+
+
Type: struct condition +
Represents a condition variable. +
+

+ + +

+
+
Function: void cond_init (struct condition *cond) +
Initializes cond as a new condition variable. +
+

+ + +

+
+
Function: void cond_wait (struct condition *cond, struct lock *lock) +
Atomically releases lock (the monitor lock) and waits for +cond to be signaled by some other piece of code. After +cond is signaled, reacquires lock before returning. +lock must be held before calling this function. +

+ +Sending a signal and waking up from a wait are not an atomic operation. +Thus, typically cond_wait()'s caller must recheck the condition +after the wait completes and, if necessary, wait again. See the next +section for an example. +

+
+

+ + +

+
+
Function: void cond_signal (struct condition *cond, struct lock *lock) +
If any threads are waiting on cond (protected by monitor lock +lock), then this function wakes up one of them. If no threads are +waiting, returns without performing any action. +lock must be held before calling this function. +
+

+ + +

+
+
Function: void cond_broadcast (struct condition *cond, struct lock *lock) +
Wakes up all threads, if any, waiting on cond (protected by +monitor lock lock). lock must be held before calling this +function. +
+

+ +


+ +

A.3.4.1 Monitor Example

+ +

+ +The classical example of a monitor is handling a buffer into which one +or more +"producer" threads write characters and out of which one or more +"consumer" threads read characters. To implement this we need, +besides the monitor lock, two condition variables which we will call +not_full and not_empty: +

+

+ +
 
char buf[BUF_SIZE];     /* Buffer. */
+size_t n = 0;           /* 0 <= n <= BUF_SIZE: # of characters in buffer. */
+size_t head = 0;        /* buf index of next char to write (mod BUF_SIZE). */
+size_t tail = 0;        /* buf index of next char to read (mod BUF_SIZE). */
+struct lock lock;       /* Monitor lock. */
+struct condition not_empty; /* Signaled when the buffer is not empty. */
+struct condition not_full; /* Signaled when the buffer is not full. */
+
+...initialize the locks and condition variables...
+
+void put (char ch) {
+  lock_acquire (&lock);
+  while (n == BUF_SIZE)            /* Can't add to buf as long as it's full. */
+    cond_wait (&not_full, &lock);
+  buf[head++ % BUF_SIZE] = ch;     /* Add ch to buf. */
+  n++;
+  cond_signal (&not_empty, &lock); /* buf can't be empty anymore. */
+  lock_release (&lock);
+}
+
+char get (void) {
+  char ch;
+  lock_acquire (&lock);
+  while (n == 0)                  /* Can't read buf as long as it's empty. */
+    cond_wait (&not_empty, &lock);
+  ch = buf[tail++ % BUF_SIZE];    /* Get ch from buf. */
+  n--;
+  cond_signal (&not_full, &lock); /* buf can't be full anymore. */
+  lock_release (&lock);
+}
+

+ +Note that BUF_SIZE must divide evenly into SIZE_MAX + 1 +for the above code to be completely correct. Otherwise, it will fail +the first time head wraps around to 0. In practice, +BUF_SIZE would ordinarily be a power of 2. +

+

+ + +


+ +

A.3.5 Optimization Barriers

+ +

+ +An optimization barrier is a special statement that prevents the +compiler from making assumptions about the state of memory across the +barrier. The compiler will not reorder reads or writes of variables +across the barrier or assume that a variable's value is unmodified +across the barrier, except for local variables whose address is never +taken. In Pintos, threads/synch.h defines the barrier() +macro as an optimization barrier. +

+

+ +One reason to use an optimization barrier is when data can change +asynchronously, without the compiler's knowledge, e.g. by another +thread or an interrupt handler. The too_many_loops() function in +devices/timer.c is an example. This function starts out by +busy-waiting in a loop until a timer tick occurs: +

+

+ +
 
/* Wait for a timer tick. */
+int64_t start = ticks;
+while (ticks == start)
+  barrier ();
+

+ +Without an optimization barrier in the loop, the compiler could +conclude that the loop would never terminate, because start and +ticks start out equal and the loop itself never changes them. +It could then "optimize" the function into an infinite loop, which +would definitely be undesirable. +

+

+ +Optimization barriers can be used to avoid other compiler +optimizations. The busy_wait() function, also in +devices/timer.c, is an example. It contains this loop: +

+

+ +
 
while (loops-- > 0)
+  barrier ();
+

+ +The goal of this loop is to busy-wait by counting loops down +from its original value to 0. Without the barrier, the compiler could +delete the loop entirely, because it produces no useful output and has +no side effects. The barrier forces the compiler to pretend that the +loop body has an important effect. +

+

+ +Finally, optimization barriers can be used to force the ordering of +memory reads or writes. For example, suppose we add a "feature" +that, whenever a timer interrupt occurs, the character in global +variable timer_put_char is printed on the console, but only if +global Boolean variable timer_do_put is true. The best way to +set up x to be printed is then to use an optimization barrier, +like this: +

+

+ +
 
timer_put_char = 'x';
+barrier ();
+timer_do_put = true;
+

+ +Without the barrier, the code is buggy because the compiler is free to +reorder operations when it doesn't see a reason to keep them in the +same order. In this case, the compiler doesn't know that the order of +assignments is important, so its optimizer is permitted to exchange +their order. There's no telling whether it will actually do this, and +it is possible that passing the compiler different optimization flags +or using a different version of the compiler will produce different +behavior. +

+

+ +Another solution is to disable interrupts around the assignments. +This does not prevent reordering, but it prevents the interrupt +handler from intervening between the assignments. It also has the +extra runtime cost of disabling and re-enabling interrupts: +

+

+ +
 
enum intr_level old_level = intr_disable ();
+timer_put_char = 'x';
+timer_do_put = true;
+intr_set_level (old_level);
+

+ +A second solution is to mark the declarations of +timer_put_char and timer_do_put as volatile. This +keyword tells the compiler that the variables are externally observable +and restricts its latitude for optimization. However, the semantics of +volatile are not well-defined, so it is not a good general +solution. The base Pintos code does not use volatile at all. +

+

+ +The following is not a solution, because locks neither prevent +interrupts nor prevent the compiler from reordering the code within the +region where the lock is held: +

+

+ +
 
lock_acquire (&timer_lock);     /* INCORRECT CODE */
+timer_put_char = 'x';
+timer_do_put = true;
+lock_release (&timer_lock);
+

+ +The compiler treats invocation of any function defined externally, +that is, in another source file, as a limited form of optimization +barrier. Specifically, the compiler assumes that any externally +defined function may access any statically or dynamically allocated +data and any local variable whose address is taken. This often means +that explicit barriers can be omitted. It is one reason that Pintos +contains few explicit barriers. +

+

+ +A function defined in the same source file, or in a header included by +the source file, cannot be relied upon as a optimization barrier. +This applies even to invocation of a function before its +definition, because the compiler may read and parse the entire source +file before performing optimization. +

+

+ + +


+ +

A.4 Interrupt Handling

+ +

+ +An interrupt notifies the CPU of some event. Much of the work +of an operating system relates to interrupts in one way or another. +For our purposes, we classify interrupts into two broad categories: +

+

+ +

+

+ +The CPU treats both classes of interrupts largely the same way, +so Pintos has common infrastructure to handle both classes. +The following section describes this +common infrastructure. The sections after that give the specifics of +external and internal interrupts. +

+

+ +If you haven't already read chapter 3, "Basic Execution Environment," +in [ IA32-v1], it is recommended that you do so now. You might +also want to skim chapter 5, "Interrupt and Exception Handling," in +[ IA32-v3a]. +

+

+ + +


+ +

A.4.1 Interrupt Infrastructure

+ +

+ +When an interrupt occurs, the CPU saves +its most essential state on a stack and jumps to an interrupt +handler routine. The 80x86 architecture supports 256 +interrupts, numbered 0 through 255, each with an independent +handler defined in an array called the interrupt +descriptor table or IDT. +

+

+ +In Pintos, intr_init() in threads/interrupt.c sets up the +IDT so that each entry points to a unique entry point in +threads/intr-stubs.S named intrNN_stub(), where +NN is the interrupt number in +hexadecimal. Because the CPU doesn't give +us any other way to find out the interrupt number, this entry point +pushes the interrupt number on the stack. Then it jumps to +intr_entry(), which pushes all the registers that the processor +didn't already push for us, and then calls intr_handler(), which +brings us back into C in threads/interrupt.c. +

+

+ +The main job of intr_handler() is to call the function +registered for handling the particular interrupt. (If no +function is registered, it dumps some information to the console and +panics.) It also does some extra processing for external +interrupts (see section A.4.3 External Interrupt Handling). +

+

+ +When intr_handler() returns, the assembly code in +threads/intr-stubs.S restores all the CPU registers saved +earlier and directs the CPU to return from the interrupt. +

+

+ +The following types and functions are common to all +interrupts. +

+

+ + +

+
+
Type: void intr_handler_func (struct intr_frame *frame) +
This is how an interrupt handler function must be declared. Its frame +argument (see below) allows it to determine the cause of the interrupt +and the state of the thread that was interrupted. +
+

+ + +

+
+
Type: struct intr_frame +
The stack frame of an interrupt handler, as saved by the CPU, the interrupt +stubs, and intr_entry(). Its most interesting members are described +below. +
+

+ + +

+
+
Member of struct intr_frame: uint32_t edi +
+
Member of struct intr_frame: uint32_t esi +
+
Member of struct intr_frame: uint32_t ebp +
+
Member of struct intr_frame: uint32_t esp_dummy +
+
Member of struct intr_frame: uint32_t ebx +
+
Member of struct intr_frame: uint32_t edx +
+
Member of struct intr_frame: uint32_t ecx +
+
Member of struct intr_frame: uint32_t eax +
+
Member of struct intr_frame: uint16_t es +
+
Member of struct intr_frame: uint16_t ds +
Register values in the interrupted thread, pushed by intr_entry(). +The esp_dummy value isn't actually used (refer to the +description of PUSHA in [ IA32-v2b] for details). +
+

+ + +

+
+
Member of struct intr_frame: uint32_t vec_no +
The interrupt vector number, ranging from 0 to 255. +
+

+ + +

+
+
Member of struct intr_frame: uint32_t error_code +
The "error code" pushed on the stack by the CPU for some internal +interrupts. +
+

+ + +

+
+
Member of struct intr_frame: void (*eip) (void) +
The address of the next instruction to be executed by the interrupted +thread. +
+

+ + +

+
+
Member of struct intr_frame: void *esp +
The interrupted thread's stack pointer. +
+

+ + +

+
+
Function: const char *intr_name (uint8_t vec) +
Returns the name of the interrupt numbered vec, or +"unknown" if the interrupt has no registered name. +
+

+ + +


+ +

A.4.2 Internal Interrupt Handling

+ +

+ +Internal interrupts are caused directly by CPU instructions executed by +the running kernel thread or user process (from project 2 onward). An +internal interrupt is therefore said to arise in a "process context." +

+

+ +In an internal interrupt's handler, it can make sense to examine the +struct intr_frame passed to the interrupt handler, or even to modify +it. When the interrupt returns, modifications in struct intr_frame +become changes to the calling thread or process's state. For example, +the Pintos system call handler returns a value to the user program by +modifying the saved EAX register (see section 2.5.2 System Call Details). +

+

+ +There are no special restrictions on what an internal interrupt +handler can or can't do. Generally they should run with interrupts +enabled, just like other code, and so they can be preempted by other +kernel threads. Thus, they do need to synchronize with other threads +on shared data and other resources (see section A.3 Synchronization). +

+

+ +Internal interrupt handlers can be invoked recursively. For example, +the system call handler might cause a page fault while attempting to +read user memory. Deep recursion would risk overflowing the limited +kernel stack (see section A.2.1 struct thread), but should be unnecessary. +

+

+ + +

+
+
Function: void intr_register_int (uint8_t vec, int dpl, enum intr_level level, intr_handler_func *handler, const char *name) +
Registers handler to be called when internal interrupt numbered +vec is triggered. Names the interrupt name for debugging +purposes. +

+ +If level is INTR_ON, external interrupts will be processed +normally during the interrupt handler's execution, which is normally +desirable. Specifying INTR_OFF will cause the CPU to disable +external interrupts when it invokes the interrupt handler. The effect +is slightly different from calling intr_disable() inside the +handler, because that leaves a window of one or more CPU instructions in +which external interrupts are still enabled. This is important for the +page fault handler; refer to the comments in userprog/exception.c +for details. +

+

+ +dpl determines how the interrupt can be invoked. If dpl is +0, then the interrupt can be invoked only by kernel threads. Otherwise +dpl should be 3, which allows user processes to invoke the +interrupt with an explicit INT instruction. The value of dpl +doesn't affect user processes' ability to invoke the interrupt +indirectly, e.g. an invalid memory reference will cause a page fault +regardless of dpl. +

+
+

+ + +


+ +

A.4.3 External Interrupt Handling

+ +

+ +External interrupts are caused by events outside the CPU. +They are asynchronous, so they can be invoked at any time that +interrupts have not been disabled. We say that an external interrupt +runs in an "interrupt context." +

+

+ +In an external interrupt, the struct intr_frame passed to the +handler is not very meaningful. It describes the state of the thread +or process that was interrupted, but there is no way to predict which +one that is. It is possible, although rarely useful, to examine it, but +modifying it is a recipe for disaster. +

+

+ +Only one external interrupt may be processed at a time. Neither +internal nor external interrupt may nest within an external interrupt +handler. Thus, an external interrupt's handler must run with interrupts +disabled (see section A.3.1 Disabling Interrupts). +

+

+ +An external interrupt handler must not sleep or yield, which rules out +calling lock_acquire(), thread_yield(), and many other +functions. Sleeping in interrupt context would effectively put the +interrupted thread to sleep, too, until the interrupt handler was again +scheduled and returned. This would be unfair to the unlucky thread, and +it would deadlock if the handler were waiting for the sleeping thread +to, e.g., release a lock. +

+

+ +An external interrupt handler +effectively monopolizes the machine and delays all other activities. +Therefore, external interrupt handlers should complete as quickly as +they can. Anything that require much CPU time should instead run in a +kernel thread, possibly one that the interrupt triggers using a +synchronization primitive. +

+

+ +External interrupts are controlled by a +pair of devices outside the CPU called programmable interrupt +controllers, PICs for short. When intr_init() sets up the +CPU's IDT, it also initializes the PICs for interrupt handling. The +PICs also must be "acknowledged" at the end of processing for each +external interrupt. intr_handler() takes care of that by calling +pic_end_of_interrupt(), which properly signals the PICs. +

+

+ +The following functions relate to external +interrupts. +

+

+ + +

+
+
Function: void intr_register_ext (uint8_t vec, intr_handler_func *handler, const char *name) +
Registers handler to be called when external interrupt numbered +vec is triggered. Names the interrupt name for debugging +purposes. The handler will run with interrupts disabled. +
+

+ + +

+
+
Function: bool intr_context (void) +
Returns true if we are running in an interrupt context, otherwise +false. Mainly used in functions that might sleep +or that otherwise should not be called from interrupt context, in this +form: +
 
ASSERT (!intr_context ());
+
+

+ + +

+
+
Function: void intr_yield_on_return (void) +
When called in an interrupt context, causes thread_yield() to be +called just before the interrupt returns. Used +in the timer interrupt handler when a thread's time slice expires, to +cause a new thread to be scheduled. +
+

+ + +


+ +

A.5 Memory Allocation

+ +

+ +Pintos contains two memory allocators, one that allocates memory in +units of a page, and one that can allocate blocks of any size. +

+

+ + +


+ +

A.5.1 Page Allocator

+ +

+ +The page allocator declared in threads/palloc.h allocates +memory in units of a page. It is most often used to allocate memory +one page at a time, but it can also allocate multiple contiguous pages +at once. +

+

+ +The page allocator divides the memory it allocates into two pools, +called the kernel and user pools. By default, each pool gets half of +system memory above 1 MB, but the division can be changed with the +-ul kernel +command line +option (see Why PAL_USER?). An allocation request draws from one +pool or the other. If one pool becomes empty, the other may still +have free pages. The user pool should be used for allocating memory +for user processes and the kernel pool for all other allocations. +This will only become important starting with project 3. Until then, +all allocations should be made from the kernel pool. +

+

+ +Each pool's usage is tracked with a bitmap, one bit per page in +the pool. A request to allocate n pages scans the bitmap +for n consecutive bits set to +false, indicating that those pages are free, and then sets those bits +to true to mark them as used. This is a "first fit" allocation +strategy (see Wilson). +

+

+ +The page allocator is subject to fragmentation. That is, it may not +be possible to allocate n contiguous pages even though n +or more pages are free, because the free pages are separated by used +pages. In fact, in pathological cases it may be impossible to +allocate 2 contiguous pages even though half of the pool's pages are free. +Single-page requests can't fail due to fragmentation, so +requests for multiple contiguous pages should be limited as much as +possible. +

+

+ +Pages may not be allocated from interrupt context, but they may be +freed. +

+

+ +When a page is freed, all of its bytes are cleared to 0xcc, as +a debugging aid (see section D.8 Tips). +

+

+ +Page allocator types and functions are described below. +

+

+ + +

+
+
Function: void *palloc_get_page (enum palloc_flags flags) +
+
Function: void *palloc_get_multiple (enum palloc_flags flags, size_t page_cnt) +
Obtains and returns one page, or page_cnt contiguous pages, +respectively. Returns a null pointer if the pages cannot be allocated. +

+ +The flags argument may be any combination of the following flags: +

+

+ + +

+
+
Page Allocator Flag: PAL_ASSERT +
If the pages cannot be allocated, panic the kernel. This is only +appropriate during kernel initialization. User processes +should never be permitted to panic the kernel. +
+

+ + +

+
+
Page Allocator Flag: PAL_ZERO +
Zero all the bytes in the allocated pages before returning them. If not +set, the contents of newly allocated pages are unpredictable. +
+

+ + +

+
+
Page Allocator Flag: PAL_USER +
Obtain the pages from the user pool. If not set, pages are allocated +from the kernel pool. +
+
+

+ + +

+
+
Function: void palloc_free_page (void *page) +
+
Function: void palloc_free_multiple (void *pages, size_t page_cnt) +
Frees one page, or page_cnt contiguous pages, respectively, +starting at pages. All of the pages must have been obtained using +palloc_get_page() or palloc_get_multiple(). +
+

+ + +


+ +

A.5.2 Block Allocator

+ +

+ +The block allocator, declared in threads/malloc.h, can allocate +blocks of any size. It is layered on top of the page allocator +described in the previous section. Blocks returned by the block +allocator are obtained from the kernel pool. +

+

+ +The block allocator uses two different strategies for allocating memory. +The first strategy applies to blocks that are 1 kB or smaller +(one-fourth of the page size). These allocations are rounded up to the +nearest power of 2, or 16 bytes, whichever is larger. Then they are +grouped into a page used only for allocations of that size. +

+

+ +The second strategy applies to blocks larger than 1 kB. +These allocations (plus a small amount of overhead) are rounded up to +the nearest page in size, and then the block allocator requests that +number of contiguous pages from the page allocator. +

+

+ +In either case, the difference between the allocation requested size +and the actual block size is wasted. A real operating system would +carefully tune its allocator to minimize this waste, but this is +unimportant in an instructional system like Pintos. +

+

+ +As long as a page can be obtained from the page allocator, small +allocations always succeed. Most small allocations do not require a +new page from the page allocator at all, because they are satisfied +using part of a page already allocated. However, large allocations +always require calling into the page allocator, and any allocation +that needs more than one contiguous page can fail due to fragmentation, +as already discussed in the previous section. Thus, you should +minimize the number of large allocations in your code, especially +those over approximately 4 kB each. +

+

+ +When a block is freed, all of its bytes are cleared to 0xcc, as +a debugging aid (see section D.8 Tips). +

+

+ +The block allocator may not be called from interrupt context. +

+

+ +The block allocator functions are described below. Their interfaces are +the same as the standard C library functions of the same names. +

+

+ + +

+
+
Function: void *malloc (size_t size) +
Obtains and returns a new block, from the kernel pool, at least +size bytes long. Returns a null pointer if size is zero or +if memory is not available. +
+

+ + +

+
+
Function: void *calloc (size_t a, size_t b) +
Obtains a returns a new block, from the kernel pool, at least +a * b bytes long. The block's contents will be +cleared to zeros. Returns a null pointer if a or b is zero +or if insufficient memory is available. +
+

+ + +

+
+
Function: void *realloc (void *block, size_t new_size) +
Attempts to resize block to new_size bytes, possibly moving +it in the process. If successful, returns the new block, in which case +the old block must no longer be accessed. On failure, returns a null +pointer, and the old block remains valid. +

+ +A call with block null is equivalent to malloc(). A call +with new_size zero is equivalent to free(). +

+
+

+ + +

+
+
Function: void free (void *block) +
Frees block, which must have been previously returned by +malloc(), calloc(), or realloc() (and not yet freed). +
+

+ + +


+ +

A.6 Virtual Addresses

+ +

+ +A 32-bit virtual address can be divided into a 20-bit page number +and a 12-bit page offset (or just offset), like this: +

+

+ +
 
               31               12 11        0
+              +-------------------+-----------+
+              |    Page Number    |   Offset  |
+              +-------------------+-----------+
+                       Virtual Address
+

+ +Header threads/vaddr.h defines these functions and macros for +working with virtual addresses: +

+

+ + +

+
+
Macro: PGSHIFT +
+
Macro: PGBITS +
The bit index (0) and number of bits (12) of the offset part of a +virtual address, respectively. +
+

+ + +

+
+
Macro: PGMASK +
A bit mask with the bits in the page offset set to 1, the rest set to 0 +(0xfff). +
+

+ + +

+
+
Macro: PGSIZE +
The page size in bytes (4,096). +
+

+ + +

+
+
Function: unsigned pg_ofs (const void *va) +
Extracts and returns the page offset in virtual address va. +
+

+ + +

+
+
Function: uintptr_t pg_no (const void *va) +
Extracts and returns the page number in virtual address va. +
+

+ + +

+
+
Function: void *pg_round_down (const void *va) +
Returns the start of the virtual page that va points within, that +is, va with the page offset set to 0. +
+

+ + +

+
+
Function: void *pg_round_up (const void *va) +
Returns va rounded up to the nearest page boundary. +
+

+ +Virtual memory in Pintos is divided into two regions: user virtual +memory and kernel virtual memory (see section 2.2.4 Virtual Memory Layout). The +boundary between them is PHYS_BASE: +

+

+ + +

+
+
Macro: PHYS_BASE +
Base address of kernel virtual memory. It defaults to 0xc0000000 (3 +GB), but it may be changed to any multiple of 0x10000000 from +0x80000000 to 0xf0000000. +

+ +User virtual memory ranges from virtual address 0 up to +PHYS_BASE. Kernel virtual memory occupies the rest of the +virtual address space, from PHYS_BASE up to 4 GB. +

+
+

+ + +

+
+
Function: bool is_user_vaddr (const void *va) +
+
Function: bool is_kernel_vaddr (const void *va) +
Returns true if va is a user or kernel virtual address, +respectively, false otherwise. +
+

+ +The 80x86 doesn't provide any way to directly access memory given +a physical address. This ability is often necessary in an operating +system kernel, so Pintos works around it by mapping kernel virtual +memory one-to-one to physical memory. That is, virtual address +PHYS_BASE accesses physical address 0, virtual address +PHYS_BASE + 0x1234 accesses physical address 0x1234, and +so on up to the size of the machine's physical memory. Thus, adding +PHYS_BASE to a physical address obtains a kernel virtual address +that accesses that address; conversely, subtracting PHYS_BASE +from a kernel virtual address obtains the corresponding physical +address. Header threads/vaddr.h provides a pair of functions to +do these translations: +

+

+ + +

+
+
Function: void *ptov (uintptr_t pa) +
Returns the kernel virtual address corresponding to physical address +pa, which should be between 0 and the number of bytes of physical +memory. +
+

+ + +

+
+
Function: uintptr_t vtop (void *va) +
Returns the physical address corresponding to va, which must be a +kernel virtual address. +
+

+ + +


+ +

A.7 Page Table

+ +

+ +The code in pagedir.c is an abstract interface to the 80x86 +hardware page table, also called a "page directory" by Intel processor +documentation. The page table interface uses a uint32_t * to +represent a page table because this is convenient for accessing their +internal structure. +

+

+ +The sections below describe the page table interface and internals. +

+

+ + +


+ +

A.7.1 Creation, Destruction, and Activation

+ +

+ +These functions create, destroy, and activate page tables. The base +Pintos code already calls these functions where necessary, so it should +not be necessary to call them yourself. +

+

+ + +

+
+
Function: uint32_t *pagedir_create (void) +
Creates and returns a new page table. The new page table contains +Pintos's normal kernel virtual page mappings, but no user virtual +mappings. +

+ +Returns a null pointer if memory cannot be obtained. +

+
+

+ + +

+
+
Function: void pagedir_destroy (uint32_t *pd) +
Frees all of the resources held by pd, including the page table +itself and the frames that it maps. +
+

+ + +

+
+
Function: void pagedir_activate (uint32_t *pd) +
Activates pd. The active page table is the one used by the CPU to +translate memory references. +
+

+ + +


+ +

A.7.2 Inspection and Updates

+ +

+ +These functions examine or update the mappings from pages to frames +encapsulated by a page table. They work on both active and inactive +page tables (that is, those for running and suspended processes), +flushing the TLB as necessary. +

+

+ + +

+
+
Function: bool pagedir_set_page (uint32_t *pd, void *upage, void *kpage, bool writable) +
Adds to pd a mapping from user page upage to the frame identified +by kernel virtual address kpage. If writable is true, the +page is mapped read/write; otherwise, it is mapped read-only. +

+ +User page upage must not already be mapped in pd. +

+

+ +Kernel page kpage should be a kernel virtual address obtained from +the user pool with palloc_get_page(PAL_USER) (see Why PAL_USER?). +

+

+ +Returns true if successful, false on failure. Failure will occur if +additional memory required for the page table cannot be obtained. +

+
+

+ + +

+
+
Function: void *pagedir_get_page (uint32_t *pd, const void *uaddr) +
Looks up the frame mapped to uaddr in pd. Returns the +kernel virtual address for that frame, if uaddr is mapped, or a +null pointer if it is not. +
+

+ + +

+
+
Function: void pagedir_clear_page (uint32_t *pd, void *page) +
Marks page "not present" in pd. Later accesses to +the page will fault. +

+ +Other bits in the page table for page are preserved, permitting +the accessed and dirty bits (see the next section) to be checked. +

+

+ +This function has no effect if page is not mapped. +

+
+

+ + +


+ +

A.7.3 Accessed and Dirty Bits

+ +

+ +80x86 hardware provides some assistance for implementing page +replacement algorithms, through a pair of bits in the page table entry +(PTE) for each page. On any read or write to a page, the CPU sets the +accessed bit to 1 in the page's PTE, and on any write, the CPU +sets the dirty bit to 1. The CPU never resets these bits to 0, +but the OS may do so. +

+

+ +Proper interpretation of these bits requires understanding of +aliases, that is, two (or more) pages that refer to the same +frame. When an aliased frame is accessed, the accessed and dirty bits +are updated in only one page table entry (the one for the page used for +access). The accessed and dirty bits for the other aliases are not +updated. +

+

+ +See Accessed and Dirty Bits, on applying these bits in implementing +page replacement algorithms. +

+

+ + +

+
+
Function: bool pagedir_is_dirty (uint32_t *pd, const void *page) +
+
Function: bool pagedir_is_accessed (uint32_t *pd, const void *page) +
Returns true if page directory pd contains a page table entry for +page that is marked dirty (or accessed). Otherwise, +returns false. +
+

+ + +

+
+
Function: void pagedir_set_dirty (uint32_t *pd, const void *page, bool value) +
+
Function: void pagedir_set_accessed (uint32_t *pd, const void *page, bool value) +
If page directory pd has a page table entry for page, then +its dirty (or accessed) bit is set to value. +
+

+ + +


+ +

A.7.4 Page Table Details

+ +

+ +The functions provided with Pintos are sufficient to implement the +projects. However, you may still find it worthwhile to understand the +hardware page table format, so we'll go into a little detail in this +section. +

+

+ + +


+ +

A.7.4.1 Structure

+ +

+ +The top-level paging data structure is a page called the "page +directory" (PD) arranged as an array of 1,024 32-bit page directory +entries (PDEs), each of which represents 4 MB of virtual memory. Each +PDE may point to the physical address of another page called a +"page table" (PT) arranged, similarly, as an array of 1,024 +32-bit page table entries (PTEs), each of which translates a single 4 +kB virtual page to a physical page. +

+

+ +Translation of a virtual address into a physical address follows +the three-step process illustrated in the diagram +below:(4) +

+

+ +

    +
  1. +The most-significant 10 bits of the virtual address (bits 22...31) +index the page directory. If the PDE is marked "present," the +physical address of a page table is read from the PDE thus obtained. +If the PDE is marked "not present" then a page fault occurs. +

    + +

    +
  2. +The next 10 bits of the virtual address (bits 12...21) index +the page table. If the PTE is marked "present," the physical +address of a data page is read from the PTE thus obtained. If the PTE +is marked "not present" then a page fault occurs. +

    + +

    +
  3. +The least-significant 12 bits of the virtual address (bits 0...11) +are added to the data page's physical base address, yielding the final +physical address. +
+

+ +
 
 31                  22 21                  12 11                   0
++----------------------+----------------------+----------------------+
+| Page Directory Index |   Page Table Index   |    Page Offset       |
++----------------------+----------------------+----------------------+
+             |                    |                     |
+     _______/             _______/                _____/
+    /                    /                       /
+   /    Page Directory  /      Page Table       /    Data Page
+  /     .____________. /     .____________.    /   .____________.
+  |1,023|____________| |1,023|____________|    |   |____________|
+  |1,022|____________| |1,022|____________|    |   |____________|
+  |1,021|____________| |1,021|____________|    \__\|____________|
+  |1,020|____________| |1,020|____________|       /|____________|
+  |     |            | |     |            |        |            |
+  |     |            | \____\|            |_       |            |
+  |     |      .     |      /|      .     | \      |      .     |
+  \____\|      .     |_      |      .     |  |     |      .     |
+       /|      .     | \     |      .     |  |     |      .     |
+        |      .     |  |    |      .     |  |     |      .     |
+        |            |  |    |            |  |     |            |
+        |____________|  |    |____________|  |     |____________|
+       4|____________|  |   4|____________|  |     |____________|
+       3|____________|  |   3|____________|  |     |____________|
+       2|____________|  |   2|____________|  |     |____________|
+       1|____________|  |   1|____________|  |     |____________|
+       0|____________|  \__\0|____________|  \____\|____________|
+                           /                      /
+

+ +Pintos provides some macros and functions that are useful for working +with raw page tables: +

+

+ + +

+
+
Macro: PTSHIFT +
+
Macro: PTBITS +
The starting bit index (12) and number of bits (10), respectively, in a +page table index. +
+

+ + +

+
+
Macro: PTMASK +
A bit mask with the bits in the page table index set to 1 and the rest +set to 0 (0x3ff000). +
+

+ + +

+
+
Macro: PTSPAN +
The number of bytes of virtual address space that a single page table +page covers (4,194,304 bytes, or 4 MB). +
+

+ + +

+
+
Macro: PDSHIFT +
+
Macro: PDBITS +
The starting bit index (22) and number of bits (10), respectively, in a +page directory index. +
+

+ + +

+
+
Macro: PDMASK +
A bit mask with the bits in the page directory index set to 1 and other +bits set to 0 (0xffc00000). +
+

+ + +

+
+
Function: uintptr_t pd_no (const void *va) +
+
Function: uintptr_t pt_no (const void *va) +
Returns the page directory index or page table index, respectively, for +virtual address va. These functions are defined in +threads/pte.h. +
+

+ + +

+
+
Function: unsigned pg_ofs (const void *va) +
Returns the page offset for virtual address va. This function is +defined in threads/vaddr.h. +
+

+ + +


+ +

A.7.4.2 Page Table Entry Format

+ +

+ +You do not need to understand the PTE format to do the Pintos +projects, unless you wish to incorporate the page table into your +supplemental page table (see Managing the Supplemental Page Table). +

+

+ +The actual format of a page table entry is summarized below. For +complete information, refer to section 3.7, "Page Translation Using +32-Bit Physical Addressing," in [ IA32-v3a]. +

+

+ +
 
 31                                   12 11 9      6 5     2 1 0
++---------------------------------------+----+----+-+-+---+-+-+-+
+|           Physical Address            | AVL|    |D|A|   |U|W|P|
++---------------------------------------+----+----+-+-+---+-+-+-+
+

+ +Some more information on each bit is given below. The names are +threads/pte.h macros that represent the bits' values: +

+

+ + +

+
+
Macro: PTE_P +
Bit 0, the "present" bit. When this bit is 1, the +other bits are interpreted as described below. When this bit is 0, any +attempt to access the page will page fault. The remaining bits are then +not used by the CPU and may be used by the OS for any purpose. +
+

+ + +

+
+
Macro: PTE_W +
Bit 1, the "read/write" bit. When it is 1, the page +is writable. When it is 0, write attempts will page fault. +
+

+ + +

+
+
Macro: PTE_U +
Bit 2, the "user/supervisor" bit. When it is 1, user +processes may access the page. When it is 0, only the kernel may access +the page (user accesses will page fault). +

+ +Pintos clears this bit in PTEs for kernel virtual memory, to prevent +user processes from accessing them. +

+
+

+ + +

+
+
Macro: PTE_A +
Bit 5, the "accessed" bit. See section A.7.3 Accessed and Dirty Bits. +
+

+ + +

+
+
Macro: PTE_D +
Bit 6, the "dirty" bit. See section A.7.3 Accessed and Dirty Bits. +
+

+ + +

+
+
Macro: PTE_AVL +
Bits 9...11, available for operating system use. +Pintos, as provided, does not use them and sets them to 0. +
+

+ + +

+
+
Macro: PTE_ADDR +
Bits 12...31, the top 20 bits of the physical address of a frame. +The low 12 bits of the frame's address are always 0. +
+

+ +Other bits are either reserved or uninteresting in a Pintos context and +should be set to@tie{}0. +

+

+ +Header threads/pte.h defines three functions for working with +page table entries: +

+

+ + +

+
+
Function: uint32_t pte_create_kernel (uint32_t *page, bool writable) +
Returns a page table entry that points to page, which should be a +kernel virtual address. The PTE's present bit will be set. It will be +marked for kernel-only access. If writable is true, the PTE will +also be marked read/write; otherwise, it will be read-only. +
+

+ + +

+
+
Function: uint32_t pte_create_user (uint32_t *page, bool writable) +
Returns a page table entry that points to page, which should be +the kernel virtual address of a frame in the user pool (see Why PAL_USER?). The PTE's present bit will be set and it will be marked to +allow user-mode access. If writable is true, the PTE will also be +marked read/write; otherwise, it will be read-only. +
+

+ + +

+
+
Function: void *pte_get_page (uint32_t pte) +
Returns the kernel virtual address for the frame that pte points +to. The pte may be present or not-present; if it is not-present +then the pointer returned is only meaningful if the address bits in the PTE +actually represent a physical address. +
+

+ + +


+ +

A.7.4.3 Page Directory Entry Format

+ +

+ +Page directory entries have the same format as PTEs, except that the +physical address points to a page table page instead of a frame. Header +threads/pte.h defines two functions for working with page +directory entries: +

+

+ + +

+
+
Function: uint32_t pde_create (uint32_t *pt) +
Returns a page directory that points to page, which should be the +kernel virtual address of a page table page. The PDE's present bit will +be set, it will be marked to allow user-mode access, and it will be +marked read/write. +
+

+ + +

+
+
Function: uint32_t *pde_get_pt (uint32_t pde) +
Returns the kernel virtual address for the page table page that +pde, which must be marked present, points to. +
+

+ + +


+ +

A.8 Hash Table

+ +

+ +Pintos provides a hash table data structure in lib/kernel/hash.c. +To use it you will need to include its header file, +lib/kernel/hash.h, with #include <hash.h>. +No code provided with Pintos uses the hash table, which means that you +are free to use it as is, modify its implementation for your own +purposes, or ignore it, as you wish. +

+

+ +Most implementations of the virtual memory project use a hash table to +translate pages to frames. You may find other uses for hash tables as +well. +

+

+ + +


+ +

A.8.1 Data Types

+ +

+ +A hash table is represented by struct hash. +

+

+ + +

+
+
Type: struct hash +
Represents an entire hash table. The actual members of struct hash +are "opaque." That is, code that uses a hash table should not access +struct hash members directly, nor should it need to. Instead, use +hash table functions and macros. +
+

+ +The hash table operates on elements of type struct hash_elem. +

+

+ + +

+
+
Type: struct hash_elem +
Embed a struct hash_elem member in the structure you want to include +in a hash table. Like struct hash, struct hash_elem is opaque. +All functions for operating on hash table elements actually take and +return pointers to struct hash_elem, not pointers to your hash table's +real element type. +
+

+ +You will often need to obtain a struct hash_elem given a real element +of the hash table, and vice versa. Given a real element of the hash +table, you may use the & operator to obtain a pointer to its +struct hash_elem. Use the hash_entry() macro to go the other +direction. +

+

+ + +

+
+
Macro: type *hash_entry (struct hash_elem *elem, type, member) +
Returns a pointer to the structure that elem, a pointer to a +struct hash_elem, is embedded within. You must provide type, +the name of the structure that elem is inside, and member, +the name of the member in type that elem points to. +

+ +For example, suppose h is a struct hash_elem * variable +that points to a struct thread member (of type struct hash_elem) +named h_elem. Then, hash_entry@tie{(h, struct thread, h_elem)} +yields the address of the struct thread that h points within. +

+
+

+ +See section A.8.5 Hash Table Example, for an example. +

+

+ +Each hash table element must contain a key, that is, data that +identifies and distinguishes elements, which must be unique +among elements in the hash table. (Elements may +also contain non-key data that need not be unique.) While an element is +in a hash table, its key data must not be changed. Instead, if need be, +remove the element from the hash table, modify its key, then reinsert +the element. +

+

+ +For each hash table, you must write two functions that act on keys: a +hash function and a comparison function. These functions must match the +following prototypes: +

+

+ + +

+
+
Type: unsigned hash_hash_func (const struct hash_elem *element, void *aux) +
Returns a hash of element's data, as a value anywhere in the range +of unsigned int. The hash of an element should be a +pseudo-random function of the element's key. It must not depend on +non-key data in the element or on any non-constant data other than the +key. Pintos provides the following functions as a suitable basis for +hash functions. +

+ + +

+
+
Function: unsigned hash_bytes (const void *buf, size_t *size) +
Returns a hash of the size bytes starting at buf. The +implementation is the general-purpose +Fowler-Noll-Vo +hash for 32-bit words. +
+

+ + +

+
+
Function: unsigned hash_string (const char *s) +
Returns a hash of null-terminated string s. +
+

+ + +

+
+
Function: unsigned hash_int (int i) +
Returns a hash of integer i. +
+

+ +If your key is a single piece of data of an appropriate type, it is +sensible for your hash function to directly return the output of one of +these functions. For multiple pieces of data, you may wish to combine +the output of more than one call to them using, e.g., the ^ +(exclusive or) +operator. Finally, you may entirely ignore these functions and write +your own hash function from scratch, but remember that your goal is to +build an operating system kernel, not to design a hash function. +

+

+ +See section A.8.6 Auxiliary Data, for an explanation of aux. +

+
+

+ + +

+
+
Type: bool hash_less_func (const struct hash_elem *a, const struct hash_elem *b, void *aux) +
Compares the keys stored in elements a and b. Returns +true if a is less than b, false if a is greater than +or equal to b. +

+ +If two elements compare equal, then they must hash to equal values. +

+

+ +See section A.8.6 Auxiliary Data, for an explanation of aux. +

+
+

+ +See section A.8.5 Hash Table Example, for hash and comparison function examples. +

+

+ +A few functions accept a pointer to a third kind of +function as an argument: +

+

+ + +

+
+
Type: void hash_action_func (struct hash_elem *element, void *aux) +
Performs some kind of action, chosen by the caller, on element. +

+ +See section A.8.6 Auxiliary Data, for an explanation of aux. +

+
+

+ + +


+ +

A.8.2 Basic Functions

+ +

+ +These functions create, destroy, and inspect hash tables. +

+

+ + +

+
+
Function: bool hash_init (struct hash *hash, hash_hash_func *hash_func, hash_less_func *less_func, void *aux) +
Initializes hash as a hash table with hash_func as hash +function, less_func as comparison function, and aux as +auxiliary data. +Returns true if successful, false on failure. hash_init() calls +malloc() and fails if memory cannot be allocated. +

+ +See section A.8.6 Auxiliary Data, for an explanation of aux, which is +most often a null pointer. +

+
+

+ + +

+
+
Function: void hash_clear (struct hash *hash, hash_action_func *action) +
Removes all the elements from hash, which must have been +previously initialized with hash_init(). +

+ +If action is non-null, then it is called once for each element in +the hash table, which gives the caller an opportunity to deallocate any +memory or other resources used by the element. For example, if the hash +table elements are dynamically allocated using malloc(), then +action could free() the element. This is safe because +hash_clear() will not access the memory in a given hash element +after calling action on it. However, action must not call +any function that may modify the hash table, such as hash_insert() +or hash_delete(). +

+
+

+ + +

+
+
Function: void hash_destroy (struct hash *hash, hash_action_func *action) +
If action is non-null, calls it for each element in the hash, with +the same semantics as a call to hash_clear(). Then, frees the +memory held by hash. Afterward, hash must not be passed to +any hash table function, absent an intervening call to hash_init(). +
+

+ + +

+
+
Function: size_t hash_size (struct hash *hash) +
Returns the number of elements currently stored in hash. +
+

+ + +

+
+
Function: bool hash_empty (struct hash *hash) +
Returns true if hash currently contains no elements, +false if hash contains at least one element. +
+

+ + +


+ +

A.8.3 Search Functions

+ +

+ +Each of these functions searches a hash table for an element that +compares equal to one provided. Based on the success of the search, +they perform some action, such as inserting a new element into the hash +table, or simply return the result of the search. +

+

+ + +

+
+
Function: struct hash_elem *hash_insert (struct hash *hash, struct hash_elem *element) +
Searches hash for an element equal to element. If none is +found, inserts element into hash and returns a null pointer. +If the table already contains an element equal to element, it is +returned without modifying hash. +
+

+ + +

+
+
Function: struct hash_elem *hash_replace (struct hash *hash, struct hash_elem *element) +
Inserts element into hash. Any element equal to +element already in hash is removed. Returns the element +removed, or a null pointer if hash did not contain an element +equal to element. +

+ +The caller is responsible for deallocating any resources associated with +the returned element, as appropriate. For example, if the hash table +elements are dynamically allocated using malloc(), then the caller +must free() the element after it is no longer needed. +

+
+

+ +The element passed to the following functions is only used for hashing +and comparison purposes. It is never actually inserted into the hash +table. Thus, only key data in the element needs to be initialized, and +other data in the element will not be used. It often makes sense to +declare an instance of the element type as a local variable, initialize +the key data, and then pass the address of its struct hash_elem to +hash_find() or hash_delete(). See section A.8.5 Hash Table Example, for +an example. (Large structures should not be +allocated as local variables. See section A.2.1 struct thread, for more +information.) +

+

+ + +

+
+
Function: struct hash_elem *hash_find (struct hash *hash, struct hash_elem *element) +
Searches hash for an element equal to element. Returns the +element found, if any, or a null pointer otherwise. +
+

+ + +

+
+
Function: struct hash_elem *hash_delete (struct hash *hash, struct hash_elem *element) +
Searches hash for an element equal to element. If one is +found, it is removed from hash and returned. Otherwise, a null +pointer is returned and hash is unchanged. +

+ +The caller is responsible for deallocating any resources associated with +the returned element, as appropriate. For example, if the hash table +elements are dynamically allocated using malloc(), then the caller +must free() the element after it is no longer needed. +

+
+

+ + +


+ +

A.8.4 Iteration Functions

+ +

+ +These functions allow iterating through the elements in a hash table. +Two interfaces are supplied. The first requires writing and supplying a +hash_action_func to act on each element (see section A.8.1 Data Types). +

+

+ + +

+
+
Function: void hash_apply (struct hash *hash, hash_action_func *action) +
Calls action once for each element in hash, in arbitrary +order. action must not call any function that may modify the hash +table, such as hash_insert() or hash_delete(). action +must not modify key data in elements, although it may modify any other +data. +
+

+ +The second interface is based on an "iterator" data type. +Idiomatically, iterators are used as follows: +

+

+ +
 
struct hash_iterator i;
+
+hash_first (&i, h);
+while (hash_next (&i))
+  {
+    struct foo *f = hash_entry (hash_cur (&i), struct foo, elem);
+    ...do something with f...
+  }
+

+ + +

+
+
Type: struct hash_iterator +
Represents a position within a hash table. Calling any function that +may modify a hash table, such as hash_insert() or +hash_delete(), invalidates all iterators within that hash table. +

+ +Like struct hash and struct hash_elem, struct hash_elem is opaque. +

+
+

+ + +

+
+
Function: void hash_first (struct hash_iterator *iterator, struct hash *hash) +
Initializes iterator to just before the first element in +hash. +
+

+ + +

+
+
Function: struct hash_elem *hash_next (struct hash_iterator *iterator) +
Advances iterator to the next element in hash, and returns +that element. Returns a null pointer if no elements remain. After +hash_next() returns null for iterator, calling it again +yields undefined behavior. +
+

+ + +

+
+
Function: struct hash_elem *hash_cur (struct hash_iterator *iterator) +
Returns the value most recently returned by hash_next() for +iterator. Yields undefined behavior after hash_first() has +been called on iterator but before hash_next() has been +called for the first time. +
+

+ + +


+ +

A.8.5 Hash Table Example

+ +

+ +Suppose you have a structure, called struct page, that you +want to put into a hash table. First, define struct page to include a +struct hash_elem member: +

+

+ +
 
struct page
+  {
+    struct hash_elem hash_elem; /* Hash table element. */
+    void *addr;                 /* Virtual address. */
+    /* ...other members... */
+  };
+

+ +We write a hash function and a comparison function using addr as +the key. A pointer can be hashed based on its bytes, and the < +operator works fine for comparing pointers: +

+

+ +
 
/* Returns a hash value for page p. */
+unsigned
+page_hash (const struct hash_elem *p_, void *aux UNUSED)
+{
+  const struct page *p = hash_entry (p_, struct page, hash_elem);
+  return hash_bytes (&p->addr, sizeof p->addr);
+}
+
+/* Returns true if page a precedes page b. */
+bool
+page_less (const struct hash_elem *a_, const struct hash_elem *b_,
+           void *aux UNUSED)
+{
+  const struct page *a = hash_entry (a_, struct page, hash_elem);
+  const struct page *b = hash_entry (b_, struct page, hash_elem);
+
+  return a->addr < b->addr;
+}
+

+ +(The use of UNUSED in these functions' prototypes suppresses a +warning that aux is unused. See section D.3 Function and Parameter Attributes, for information about UNUSED. See section A.8.6 Auxiliary Data, for an explanation of aux.) +

+

+ +Then, we can create a hash table like this: +

+

+ +
 
struct hash pages;
+
+hash_init (&pages, page_hash, page_less, NULL);
+

+ +Now we can manipulate the hash table we've created. If p +is a pointer to a struct page, we can insert it into the hash table +with: +

+

+ +
 
hash_insert (&pages, &p->hash_elem);
+

+ +If there's a chance that pages might already contain a +page with the same addr, then we should check hash_insert()'s +return value. +

+

+ +To search for an element in the hash table, use hash_find(). This +takes a little setup, because hash_find() takes an element to +compare against. Here's a function that will find and return a page +based on a virtual address, assuming that pages is defined at file +scope: +

+

+ +
 
/* Returns the page containing the given virtual address,
+   or a null pointer if no such page exists. */
+struct page *
+page_lookup (const void *address)
+{
+  struct page p;
+  struct hash_elem *e;
+
+  p.addr = address;
+  e = hash_find (&pages, &p.hash_elem);
+  return e != NULL ? hash_entry (e, struct page, hash_elem) : NULL;
+}
+

+ +struct page is allocated as a local variable here on the assumption +that it is fairly small. Large structures should not be allocated as +local variables. See section A.2.1 struct thread, for more information. +

+

+ +A similar function could delete a page by address using +hash_delete(). +

+

+ + +


+ +

A.8.6 Auxiliary Data

+ +

+ +In simple cases like the example above, there's no need for the +aux parameters. In these cases, just pass a null pointer to +hash_init() for aux and ignore the values passed to the hash +function and comparison functions. (You'll get a compiler warning if +you don't use the aux parameter, but you can turn that off with +the UNUSED macro, as shown in the example, or you can just ignore +it.) +

+

+ +aux is useful when you have some property of the data in the +hash table is both constant and needed for hashing or comparison, +but not stored in the data items themselves. For example, if +the items in a hash table are fixed-length strings, but the items +themselves don't indicate what that fixed length is, you could pass +the length as an aux parameter. +

+

+ + +


+ +

A.8.7 Synchronization

+ +

+ +The hash table does not do any internal synchronization. It is the +caller's responsibility to synchronize calls to hash table functions. +In general, any number of functions that examine but do not modify the +hash table, such as hash_find() or hash_next(), may execute +simultaneously. However, these function cannot safely execute at the +same time as any function that may modify a given hash table, such as +hash_insert() or hash_delete(), nor may more than one function +that can modify a given hash table execute safely at once. +

+

+ +It is also the caller's responsibility to synchronize access to data in +hash table elements. How to synchronize access to this data depends on +how it is designed and organized, as with any other data structure. +

+

+ + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_6.html b/doc/pintos_6.html new file mode 100644 index 0000000..afe7d8a --- /dev/null +++ b/doc/pintos_6.html @@ -0,0 +1,314 @@ + + + + + +Pintos Projects: Coding Standards + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

B. Coding Standards

+ +

+ +TODO: TUW coding standards +We expect you to be +familiar with some set of coding standards such as +CS 107 Coding Standards. Even if you've taken 107, we recommend +reviewing that document. We expect code at the "Peer-Review Quality" +level described there. +

+

+ +Our standards for coding are most important for grading. We want to +stress that aside from the fact that we are explicitly basing part of +your grade on these things, good coding practices will improve the +quality of your code. This makes it easier for your partners to +interact with it, and ultimately, will improve your chances of having a +good working program. That said once, the rest of this document will +discuss only the ways in which our coding standards will affect our +grading. +

+

+ + +


+ +

B.1 Style

+ +

+ +Style, for the purposes of our grading, refers to how readable your +code is. At minimum, this means that your code is well formatted, your +variable names are descriptive and your functions are decomposed and +well commented. Any other factors which make it hard (or easy) for us +to read or use your code will be reflected in your style grade. +

+

+ +The existing Pintos code is written in the GNU style and largely +follows the GNU +Coding Standards. We encourage you to follow the applicable parts of +them too, especially chapter 5, "Making the Best Use of C." Using a +different style won't cause actual problems, but it's ugly to see +gratuitous differences in style from one function to another. If your +code is too ugly, it will cost you points. +

+

+ +Please limit C source file lines to at most 79 characters long. +

+

+ +Pintos comments sometimes refer to external standards or +specifications by writing a name inside square brackets, like this: +[IA32-v3a]. These names refer to the reference names used in +this documentation (see section Bibliography). +

+

+ +If you remove existing Pintos code, please delete it from your source +file entirely. Don't just put it into a comment or a conditional +compilation directive, because that makes the resulting code hard to +read. +

+

+ +We're only going to do a compile in the directory for the project being +submitted. You don't need to make sure that the previous projects also +compile. +

+

+ +Project code should be written so that all of the subproblems for the +project function together, that is, without the need to rebuild with +different macros defined, etc. If you do extra credit work that +changes normal Pintos behavior so as to interfere with grading, then +you must implement it so that it only acts that way when given a +special command-line option of the form -name, where +name is a name of your choice. You can add such an option by +modifying parse_options() in threads/init.c. +

+

+ +The introduction describes additional coding style requirements +(see section 1.2.2 Design). +

+

+ + +


+ +

B.2 C99

+ +

+ +The Pintos source code uses a few features of the "C99" standard +library that were not in the original 1989 standard for C. Many +programmers are unaware of these feature, so we will describe them. The +new features used in Pintos are +mostly in new headers: +

+

+ +

+
+
<stdbool.h> +
Defines macros bool, a 1-bit type that takes on only the values +0 and 1, true, which expands to 1, and false, which +expands to 0. +

+ +

+
<stdint.h> +
On systems that support them, this header defines types +intn_t and uintn_t for n = 8, 16, 32, +64, and possibly other values. These are 2's complement signed and unsigned +types, respectively, with the given number of bits. +

+ +On systems where it is possible, this header also defines types +intptr_t and uintptr_t, which are integer types big +enough to hold a pointer. +

+

+ +On all systems, this header defines types intmax_t and +uintmax_t, which are the system's signed and unsigned integer +types with the widest ranges. +

+

+ +For every signed integer type type_t defined here, as well +as for ptrdiff_t defined in <stddef.h>, this header also +defines macros TYPE_MAX and TYPE_MIN that +give the type's range. Similarly, for every unsigned integer type +type_t defined here, as well as for size_t defined +in <stddef.h>, this header defines a TYPE_MAX +macro giving its maximum value. +

+

+ +

+
<inttypes.h> +
<stdint.h> provides no straightforward way to format +the types it defines with printf() and related functions. This +header provides macros to help with that. For every +intn_t defined by <stdint.h>, it provides macros +PRIdn and PRIin for formatting values of +that type with "%d" and "%i". Similarly, for every +uintn_t, it provides PRIon, +PRIun, PRIux, and PRIuX. +

+ +You use these something like this, taking advantage of the fact that +the C compiler concatenates adjacent string literals: +
 
#include <inttypes.h>
+...
+int32_t value = ...;
+printf ("value=%08"PRId32"\n", value);
+
The % is not supplied by the PRI macros. As shown +above, you supply it yourself and follow it by any flags, field +width, etc. +

+ +

+
<stdio.h> +
The printf() function has some new type modifiers for printing +standard types: +

+ +

+
+
j +
For intmax_t (e.g. %jd) or uintmax_t (e.g. +%ju). +

+ +

+
z +
For size_t (e.g. %zu). +

+ +

+
t +
For ptrdiff_t (e.g. %td). +
+

+ +Pintos printf() also implements a nonstandard ' flag that +groups large numbers with commas to make them easier to read. +

+

+ + +


+ +

B.3 Unsafe String Functions

+ +

+ +A few of the string functions declared in the standard +<string.h> and <stdio.h> headers are notoriously unsafe. +The worst offenders are intentionally not included in the Pintos C +library: +

+

+ +

+
+
strcpy +
When used carelessly this function can overflow the buffer reserved +for its output string. Use strlcpy() instead. Refer to +comments in its source code in lib/string.c for documentation. +

+ +

+
strncpy +
This function can leave its destination buffer without a null string +terminator. It also has performance problems. Again, use +strlcpy(). +

+ +

+
strcat +
Same issue as strcpy(). Use strlcat() instead. +Again, refer to comments in its source code in lib/string.c for +documentation. +

+ +

+
strncat +
The meaning of its buffer size argument is surprising. +Again, use strlcat(). +

+ +

+
strtok +
Uses global data, so it is unsafe in threaded programs such as +kernels. Use strtok_r() instead, and see its source code in +lib/string.c for documentation and an example. +

+ +

+
sprintf +
Same issue as strcpy(). Use snprintf() instead. Refer +to comments in lib/stdio.h for documentation. +

+ +

+
vsprintf +
Same issue as strcpy(). Use vsnprintf() instead. +
+

+ +If you try to use any of these functions, the error message will give +you a hint by referring to an identifier like +dont_use_sprintf_use_snprintf. + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_7.html b/doc/pintos_7.html new file mode 100644 index 0000000..e7ac7b5 --- /dev/null +++ b/doc/pintos_7.html @@ -0,0 +1,238 @@ + + + + + +Pintos Projects: Project Documentation + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

C. Project Documentation

+ +

+ +This chapter presents a sample assignment and a filled-in design +document for one possible implementation. Its purpose is to give you an +idea of what we expect to see in your own design documents. +

+

+ + +


+ +

C.1 Sample Assignment

+ +

+ +Implement thread_join(). +

+

+ + +

+
+
Function: void thread_join (tid_t tid) +
Blocks the current thread until thread tid exits. If A is +the running thread and B is the argument, then we say that +"A joins B." +

+ +Incidentally, the argument is a thread id, instead of a thread pointer, +because a thread pointer is not unique over time. That is, when a +thread dies, its memory may be, whether immediately or much later, +reused for another thread. If thread A over time had two children +B and C that were stored at the same address, then +thread_join(B) and thread_join(C) would be +ambiguous. +

+

+ +A thread may only join its immediate children. Calling +thread_join() on a thread that is not the caller's child should +cause the caller to return immediately. Children are not "inherited," +that is, if A has child B and B has child C, +then A always returns immediately should it try to join C, +even if B is dead. +

+

+ +A thread need not ever be joined. Your solution should properly free +all of a thread's resources, including its struct thread, +whether it is ever joined or not, and regardless of whether the child +exits before or after its parent. That is, a thread should be freed +exactly once in all cases. +

+

+ +Joining a given thread is idempotent. That is, joining a thread +multiple times is equivalent to joining it once, because it has already +exited at the time of the later joins. Thus, joins on a given thread +after the first should return immediately. +

+

+ +You must handle all the ways a join can occur: nested joins (A +joins B, then B joins C), multiple joins (A +joins B, then A joins C), and so on. +

+
+

+ + +


+ +

C.2 Sample Design Document

+ +

+ +
 
+                         +-----------------+
+                         |      CS 140     |
+                         |  SAMPLE PROJECT |
+                         | DESIGN DOCUMENT |
+                         +-----------------+
+
+---- GROUP ----
+
+Ben Pfaff <blp@stanford.edu>
+
+---- PRELIMINARIES ----
+
+>> If you have any preliminary comments on your submission, notes for
+>> the TAs, or extra credit, please give them here.
+
+(This is a sample design document.)
+
+>> Please cite any offline or online sources you consulted while
+>> preparing your submission, other than the Pintos documentation,
+>> course text, and lecture notes.
+
+None.
+
+                                 JOIN
+                                 ====
+
+---- DATA STRUCTURES ----
+
+>> Copy here the declaration of each new or changed `struct' or `struct'
+>> member, global or static variable, `typedef', or enumeration.
+>> Identify the purpose of each in 25 words or less.
+
+A "latch" is a new synchronization primitive.  Acquires block
+until the first release.  Afterward, all ongoing and future
+acquires pass immediately.
+
+    /* Latch. */
+    struct latch
+      {
+        bool released;              /* Released yet? */
+        struct lock monitor_lock;   /* Monitor lock. */
+        struct condition rel_cond;  /* Signaled when released. */
+      };
+
+Added to struct thread:
+
+    /* Members for implementing thread_join(). */
+    struct latch ready_to_die;   /* Release when thread about to die. */
+    struct semaphore can_die;    /* Up when thread allowed to die. */
+    struct list children;        /* List of child threads. */
+    list_elem children_elem;     /* Element of `children' list. */
+
+---- ALGORITHMS ----
+
+>> Briefly describe your implementation of thread_join() and how it
+>> interacts with thread termination.
+
+thread_join() finds the joined child on the thread's list of
+children and waits for the child to exit by acquiring the child's
+ready_to_die latch.  When thread_exit() is called, the thread
+releases its ready_to_die latch, allowing the parent to continue.
+
+---- SYNCHRONIZATION ----
+
+>> Consider parent thread P with child thread C.  How do you ensure
+>> proper synchronization and avoid race conditions when P calls wait(C)
+>> before C exits?  After C exits?  How do you ensure that all resources
+>> are freed in each case?  How about when P terminates without waiting,
+>> before C exits?  After C exits?  Are there any special cases?
+
+C waits in thread_exit() for P to die before it finishes its own
+exit, using the can_die semaphore "down"ed by C and "up"ed by P as
+it exits.  Regardless of whether whether C has terminated, there
+is no race on wait(C), because C waits for P's permission before
+it frees itself.
+
+Regardless of whether P waits for C, P still "up"s C's can_die
+semaphore when P dies, so C will always be freed.  (However,
+freeing C's resources is delayed until P's death.)
+
+The initial thread is a special case because it has no parent to
+wait for it or to "up" its can_die semaphore.  Therefore, its
+can_die semaphore is initialized to 1.
+
+---- RATIONALE ----
+
+>> Critique your design, pointing out advantages and disadvantages in
+>> your design choices.
+
+This design has the advantage of simplicity.  Encapsulating most
+of the synchronization logic into a new "latch" structure
+abstracts what little complexity there is into a separate layer,
+making the design easier to reason about.  Also, all the new data
+members are in `struct thread', with no need for any extra dynamic
+allocation, etc., that would require extra management code.
+
+On the other hand, this design is wasteful in that a child thread
+cannot free itself before its parent has terminated.  A parent
+thread that creates a large number of short-lived child threads
+could unnecessarily exhaust kernel memory.  This is probably
+acceptable for implementing kernel threads, but it may be a bad
+idea for use with user processes because of the larger number of
+resources that user processes tend to own.
+
+


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_8.html b/doc/pintos_8.html new file mode 100644 index 0000000..e354458 --- /dev/null +++ b/doc/pintos_8.html @@ -0,0 +1,1041 @@ + + + + + +Pintos Projects: Debugging Tools + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

D. Debugging Tools

+ +

+ +Many tools lie at your disposal for debugging Pintos. This appendix +introduces you to a few of them. +

+

+ + +


+ +

D.1 printf()

+ +

+ +Don't underestimate the value of printf(). The way +printf() is implemented in Pintos, you can call it from +practically anywhere in the kernel, whether it's in a kernel thread or +an interrupt handler, almost regardless of what locks are held. +

+

+ +printf() is useful for more than just examining data. +It can also help figure out when and where something goes wrong, even +when the kernel crashes or panics without a useful error message. The +strategy is to sprinkle calls to printf() with different strings +(e.g. "<1>", "<2>", ...) throughout the pieces of +code you suspect are failing. If you don't even see <1> printed, +then something bad happened before that point, if you see <1> +but not <2>, then something bad happened between those two +points, and so on. Based on what you learn, you can then insert more +printf() calls in the new, smaller region of code you suspect. +Eventually you can narrow the problem down to a single statement. +See section D.6 Triple Faults, for a related technique. +

+

+ + +


+ +

D.2 ASSERT

+ +

+ +Assertions are useful because they can catch problems early, before +they'd otherwise be noticed. Ideally, each function should begin with a +set of assertions that check its arguments for validity. (Initializers +for functions' local variables are evaluated before assertions are +checked, so be careful not to assume that an argument is valid in an +initializer.) You can also sprinkle assertions throughout the body of +functions in places where you suspect things are likely to go wrong. +They are especially useful for checking loop invariants. +

+

+ +Pintos provides the ASSERT macro, defined in <debug.h>, +for checking assertions. +

+

+ + +

+
+
Macro: ASSERT (expression) +
Tests the value of expression. If it evaluates to zero (false), +the kernel panics. The panic message includes the expression that +failed, its file and line number, and a backtrace, which should help you +to find the problem. See section D.4 Backtraces, for more information. +
+

+ + +


+ +

D.3 Function and Parameter Attributes

+ +

+ +These macros defined in <debug.h> tell the compiler special +attributes of a function or function parameter. Their expansions are +GCC-specific. +

+

+ + +

+
+
Macro: UNUSED +
Appended to a function parameter to tell the compiler that the +parameter might not be used within the function. It suppresses the +warning that would otherwise appear. +
+

+ + +

+
+
Macro: NO_RETURN +
Appended to a function prototype to tell the compiler that the +function never returns. It allows the compiler to fine-tune its +warnings and its code generation. +
+

+ + +

+
+
Macro: NO_INLINE +
Appended to a function prototype to tell the compiler to never emit +the function in-line. Occasionally useful to improve the quality of +backtraces (see below). +
+

+ + +

+
+
Macro: PRINTF_FORMAT (format, first) +
Appended to a function prototype to tell the compiler that the function +takes a printf()-like format string as the argument numbered +format (starting from 1) and that the corresponding value +arguments start at the argument numbered first. This lets the +compiler tell you if you pass the wrong argument types. +
+

+ + +


+ +

D.4 Backtraces

+ +

+ +When the kernel panics, it prints a "backtrace," that is, a summary +of how your program got where it is, as a list of addresses inside the +functions that were running at the time of the panic. You can also +insert a call to debug_backtrace(), prototyped in +<debug.h>, to print a backtrace at any point in your code. +debug_backtrace_all(), also declared in <debug.h>, +prints backtraces of all threads. +

+

+ +The addresses in a backtrace are listed as raw hexadecimal numbers, +which are difficult to interpret. We provide a tool called +backtrace to translate these into function names and source +file line numbers. +Give it the name of your kernel.o as the first argument and the +hexadecimal numbers composing the backtrace (including the 0x +prefixes) as the remaining arguments. It outputs the function name +and source file line numbers that correspond to each address. +

+

+ +If the translated form of a backtrace is garbled, or doesn't make +sense (e.g. function A is listed above function B, but B doesn't +call A), then it's a good sign that you're corrupting a kernel +thread's stack, because the backtrace is extracted from the stack. +Alternatively, it could be that the kernel.o you passed to +backtrace is not the same kernel that produced +the backtrace. +

+

+ +Sometimes backtraces can be confusing without any corruption. +Compiler optimizations can cause surprising behavior. When a function +has called another function as its final action (a tail call), the +calling function may not appear in a backtrace at all. Similarly, when +function A calls another function B that never returns, the compiler may +optimize such that an unrelated function C appears in the backtrace +instead of A. Function C is simply the function that happens to be in +memory just after A. In the threads project, this is commonly seen in +backtraces for test failures. +

+

+ + +


+ +

D.4.1 Example

+ +

+ +Here's an example. Suppose that Pintos printed out this following call +stack, which is taken from an actual Pintos submission for the file +system project: +

+

+ +
 
Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319
+0xc010325a 0x804812c 0x8048a96 0x8048ac8.
+

+ +You would then invoke the backtrace utility like shown below, +cutting and pasting the backtrace information into the command line. +This assumes that kernel.o is in the current directory. You +would of course enter all of the following on a single shell command +line, even though that would overflow our margins here: +

+

+ +
 
backtrace kernel.o 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67
+0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8
+

+ +The backtrace output would then look something like this: +

+

+ +
 
0xc0106eff: debug_panic (lib/debug.c:86)
+0xc01102fb: file_seek (filesys/file.c:405)
+0xc010dc22: seek (userprog/syscall.c:744)
+0xc010cf67: syscall_handler (userprog/syscall.c:444)
+0xc0102319: intr_handler (threads/interrupt.c:334)
+0xc010325a: intr_entry (threads/intr-stubs.S:38)
+0x0804812c: (unknown)
+0x08048a96: (unknown)
+0x08048ac8: (unknown)
+

+ +(You will probably not see exactly the same addresses if you run the +command above on your own kernel binary, because the source code you +compiled and the compiler you used are probably different.) +

+

+ +The first line in the backtrace refers to debug_panic(), the +function that implements kernel panics. Because backtraces commonly +result from kernel panics, debug_panic() will often be the first +function shown in a backtrace. +

+

+ +The second line shows file_seek() as the function that panicked, +in this case as the result of an assertion failure. In the source code +tree used for this example, line 405 of filesys/file.c is the +assertion +

+

+ +
 
ASSERT (file_ofs >= 0);
+

+ +(This line was also cited in the assertion failure message.) +Thus, file_seek() panicked because it passed a negative file offset +argument. +

+

+ +The third line indicates that seek() called file_seek(), +presumably without validating the offset argument. In this submission, +seek() implements the seek system call. +

+

+ +The fourth line shows that syscall_handler(), the system call +handler, invoked seek(). +

+

+ +The fifth and sixth lines are the interrupt handler entry path. +

+

+ +The remaining lines are for addresses below PHYS_BASE. This +means that they refer to addresses in the user program, not in the +kernel. If you know what user program was running when the kernel +panicked, you can re-run backtrace on the user program, like +so: (typing the command on a single line, of course): +

+

+ +
 
backtrace tests/filesys/extended/grow-too-big 0xc0106eff 0xc01102fb
+0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96
+0x8048ac8
+

+ +The results look like this: +

+

+ +
 
0xc0106eff: (unknown)
+0xc01102fb: (unknown)
+0xc010dc22: (unknown)
+0xc010cf67: (unknown)
+0xc0102319: (unknown)
+0xc010325a: (unknown)
+0x0804812c: test_main (...xtended/grow-too-big.c:20)
+0x08048a96: main (tests/main.c:10)
+0x08048ac8: _start (lib/user/entry.c:9)
+

+ +You can even specify both the kernel and the user program names on +the command line, like so: +

+

+ +
 
backtrace kernel.o tests/filesys/extended/grow-too-big 0xc0106eff
+0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c
+0x8048a96 0x8048ac8
+

+ +The result is a combined backtrace: +

+

+ +
 
In kernel.o:
+0xc0106eff: debug_panic (lib/debug.c:86)
+0xc01102fb: file_seek (filesys/file.c:405)
+0xc010dc22: seek (userprog/syscall.c:744)
+0xc010cf67: syscall_handler (userprog/syscall.c:444)
+0xc0102319: intr_handler (threads/interrupt.c:334)
+0xc010325a: intr_entry (threads/intr-stubs.S:38)
+In tests/filesys/extended/grow-too-big:
+0x0804812c: test_main (...xtended/grow-too-big.c:20)
+0x08048a96: main (tests/main.c:10)
+0x08048ac8: _start (lib/user/entry.c:9)
+

+ +Here's an extra tip for anyone who read this far: backtrace +is smart enough to strip the Call stack: header and . +trailer from the command line if you include them. This can save you +a little bit of trouble in cutting and pasting. Thus, the following +command prints the same output as the first one we used: +

+

+ +
 
backtrace kernel.o Call stack: 0xc0106eff 0xc01102fb 0xc010dc22
+0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8.
+

+ + +


+ +

D.5 GDB

+ +

+ +You can run Pintos under the supervision of the GDB debugger. +First, start Pintos with the --gdb option, e.g. +pintos --gdb -- run mytest. Second, open a second terminal on +the same machine and +use pintos-gdb to invoke GDB on +kernel.o:(5) +
 
pintos-gdb kernel.o
+
and issue the following GDB command: +
 
target remote localhost:1234
+

+ +Now GDB is connected to the simulator over a local +network connection. You can now issue any normal GDB +commands. If you issue the c command, the simulated BIOS will take +control, load Pintos, and then Pintos will run in the usual way. You +can pause the process at any point with Ctrl+C. +

+

+ + +


+ +

D.5.1 Using GDB

+ +

+ +You can read the GDB manual by typing info gdb at a +terminal command prompt. Here's a few commonly useful GDB commands: +

+

+ + +

+
+
GDB Command: c +
Continues execution until Ctrl+C or the next breakpoint. +
+

+ + +

+
+
GDB Command: break function +
+
GDB Command: break file:line +
+
GDB Command: break *address +
Sets a breakpoint at function, at line within file, or +address. +(Use a 0x prefix to specify an address in hex.) +

+ +Use break main to make GDB stop when Pintos starts running. +

+
+

+ + +

+
+
GDB Command: p expression +
Evaluates the given expression and prints its value. +If the expression contains a function call, that function will actually +be executed. +
+

+ + +

+
+
GDB Command: l *address +
Lists a few lines of code around address. +(Use a 0x prefix to specify an address in hex.) +
+

+ + +

+
+
GDB Command: bt +
Prints a stack backtrace similar to that output by the +backtrace program described above. +
+

+ + +

+
+
GDB Command: p/a address +
Prints the name of the function or variable that occupies address. +(Use a 0x prefix to specify an address in hex.) +
+

+ + +

+
+
GDB Command: diassemble function +
Disassembles function. +
+

+ +We also provide a set of macros specialized for debugging Pintos, +written by Godmar Back gback@cs.vt.edu. You can type +help user-defined for basic help with the macros. Here is an +overview of their functionality, based on Godmar's documentation: +

+

+ + +

+
+
GDB Macro: debugpintos +
Attach debugger to a waiting pintos process on the same machine. +Shorthand for target remote localhost:1234. +
+

+ + +

+
+
GDB Macro: dumplist list type element +
Prints the elements of list, which should be a struct list +that contains elements of the given type (without the word +struct) in which element is the struct list_elem member +that links the elements. +

+ +Example: dumplist all_list thread allelem prints all elements of +struct thread that are linked in struct list all_list using the +struct list_elem allelem which is part of struct thread. +

+
+

+ + +

+
+
GDB Macro: btthread thread +
Shows the backtrace of thread, which is a pointer to the +struct thread of the thread whose backtrace it should show. For the +current thread, this is identical to the bt (backtrace) command. +It also works for any thread suspended in schedule(), +provided you know where its kernel stack page is located. +
+

+ + +

+
+
GDB Macro: btthreadlist list element +
Shows the backtraces of all threads in list, the struct list in +which the threads are kept. Specify element as the +struct list_elem field used inside struct thread to link the threads +together. +

+ +Example: btthreadlist all_list allelem shows the backtraces of +all threads contained in struct list all_list, linked together by +allelem. This command is useful to determine where your threads +are stuck when a deadlock occurs. Please see the example scenario below. +

+
+

+ + +

+
+
GDB Macro: btthreadall +
Short-hand for btthreadlist all_list allelem. +
+

+ + +

+
+
GDB Macro: btpagefault +
Print a backtrace of the current thread after a page fault exception. +Normally, when a page fault exception occurs, GDB will stop +with a message that might say:(6) +

+ +
 
Program received signal 0, Signal 0.
+0xc0102320 in intr0e_stub ()
+

+ +In that case, the bt command might not give a useful +backtrace. Use btpagefault instead. +

+

+ +You may also use btpagefault for page faults that occur in a user +process. In this case, you may wish to also load the user program's +symbol table using the loadusersymbols macro, as described above. +

+
+

+ + +

+
+
GDB Macro: hook-stop +
GDB invokes this macro every time the simulation stops, which Bochs will +do for every processor exception, among other reasons. If the +simulation stops due to a page fault, hook-stop will print a +message that says and explains further whether the page fault occurred +in the kernel or in user code. +

+ +If the exception occurred from user code, hook-stop will say: +
 
pintos-debug: a page fault exception occurred in user mode
+pintos-debug: hit 'c' to continue, or 's' to step to intr_handler
+

+ +In Project 2, a page fault in a user process leads to the termination of +the process. You should expect those page faults to occur in the +robustness tests where we test that your kernel properly terminates +processes that try to access invalid addresses. To debug those, set a +break point in page_fault() in exception.c, which you will +need to modify accordingly. +

+

+ +In Project 3, a page fault in a user process no longer automatically +leads to the termination of a process. Instead, it may require reading in +data for the page the process was trying to access, either +because it was swapped out or because this is the first time it's +accessed. In either case, you will reach page_fault() and need to +take the appropriate action there. +

+

+ +If the page fault did not occur in user mode while executing a user +process, then it occurred in kernel mode while executing kernel code. +In this case, hook-stop will print this message: +
 
pintos-debug: a page fault occurred in kernel mode
+
followed by the output of the btpagefault command. +

+ +Before Project 3, a page fault exception in kernel code is always a bug +in your kernel, because your kernel should never crash. Starting with +Project 3, the situation will change if you use the get_user() and +put_user() strategy to verify user memory accesses +(see section 2.2.5 Accessing User Memory). +

+

+ +

+
+

+ + +


+ +

D.5.2 Example GDB Session

+ +

+ +This section narrates a sample GDB session, provided by Godmar Back. +This example illustrates how one might debug a Project 1 solution in +which occasionally a thread that calls timer_sleep() is not woken +up. With this bug, tests such as mlfqs_load_1 get stuck. +

+

+ +This session was captured with a slightly older version of Bochs and the +GDB macros for Pintos, so it looks slightly different than it would now. +Program output is shown in normal type, user input in strong +type. +

+

+ +First, I start Pintos: +

+

+ +
 
$ pintos -v --gdb -- -q -mlfqs run mlfqs-load-1
+Writing command line to /tmp/gDAlqTB5Uf.dsk...
+bochs -q
+========================================================================
+                       Bochs x86 Emulator 2.2.5
+             Build from CVS snapshot on December 30, 2005
+========================================================================
+00000000000i[     ] reading configuration from bochsrc.txt
+00000000000i[     ] Enabled gdbstub
+00000000000i[     ] installing nogui module as the Bochs GUI
+00000000000i[     ] using log file bochsout.txt
+Waiting for gdb connection on localhost:1234
+

+ +Then, I open a second window on the same machine and start GDB: +

+

+ +
 
$ pintos-gdb kernel.o
+GNU gdb Red Hat Linux (6.3.0.0-1.84rh)
+Copyright 2004 Free Software Foundation, Inc.
+GDB is free software, covered by the GNU General Public License, and you are
+welcome to change it and/or distribute copies of it under certain conditions.
+Type "show copying" to see the conditions.
+There is absolutely no warranty for GDB.  Type "show warranty" for details.
+This GDB was configured as "i386-redhat-linux-gnu"...
+Using host libthread_db library "/lib/libthread_db.so.1".
+

+ +Then, I tell GDB to attach to the waiting Pintos emulator: +

+

+ +
 
(gdb) debugpintos
+Remote debugging using localhost:1234
+0x0000fff0 in ?? ()
+Reply contains invalid hex digit 78
+

+ +Now I tell Pintos to run by executing c (short for +continue) twice: +

+

+ +
 
(gdb) c
+Continuing.
+Reply contains invalid hex digit 78
+(gdb) c
+Continuing.
+

+ +Now Pintos will continue and output: +

+

+ +
 
Pintos booting with 4,096 kB RAM...
+Kernel command line: -q -mlfqs run mlfqs-load-1
+374 pages available in kernel pool.
+373 pages available in user pool.
+Calibrating timer...  102,400 loops/s.
+Boot complete.
+Executing 'mlfqs-load-1':
+(mlfqs-load-1) begin
+(mlfqs-load-1) spinning for up to 45 seconds, please wait...
+(mlfqs-load-1) load average rose to 0.5 after 42 seconds
+(mlfqs-load-1) sleeping for another 10 seconds, please wait...
+

+ +...until it gets stuck because of the bug I had introduced. I hit +Ctrl+C in the debugger window: +

+

+ +
 
Program received signal 0, Signal 0.
+0xc010168c in next_thread_to_run () at ../../threads/thread.c:649
+649        while (i <= PRI_MAX && list_empty (&ready_list[i]))
+(gdb)
+

+ +The thread that was running when I interrupted Pintos was the idle +thread. If I run backtrace, it shows this backtrace: +

+

+ +
 
(gdb) bt
+#0  0xc010168c in next_thread_to_run () at ../../threads/thread.c:649
+#1  0xc0101778 in schedule () at ../../threads/thread.c:714
+#2  0xc0100f8f in thread_block () at ../../threads/thread.c:324
+#3  0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551
+#4  0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0)
+    at ../../threads/thread.c:575
+#5  0x00000000 in ?? ()
+

+ +Not terribly useful. What I really like to know is what's up with the +other thread (or threads). Since I keep all threads in a linked list +called all_list, linked together by a struct list_elem member +named allelem, I can use the btthreadlist macro from the +macro library I wrote. btthreadlist iterates through the list of +threads and prints the backtrace for each thread: +

+

+ +
 
(gdb) btthreadlist all_list allelem
+pintos-debug: dumping backtrace of thread 'main' @0xc002f000
+#0  0xc0101820 in schedule () at ../../threads/thread.c:722
+#1  0xc0100f8f in thread_block () at ../../threads/thread.c:324
+#2  0xc0104755 in timer_sleep (ticks=1000) at ../../devices/timer.c:141
+#3  0xc010bf7c in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:49
+#4  0xc010aabb in run_test (name=0xc0007d8c "mlfqs-load-1")
+    at ../../tests/threads/tests.c:50
+#5  0xc0100647 in run_task (argv=0xc0110d28) at ../../threads/init.c:281
+#6  0xc0100721 in run_actions (argv=0xc0110d28) at ../../threads/init.c:331
+#7  0xc01000c7 in main () at ../../threads/init.c:140
+
+pintos-debug: dumping backtrace of thread 'idle' @0xc0116000
+#0  0xc010168c in next_thread_to_run () at ../../threads/thread.c:649
+#1  0xc0101778 in schedule () at ../../threads/thread.c:714
+#2  0xc0100f8f in thread_block () at ../../threads/thread.c:324
+#3  0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551
+#4  0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0)
+    at ../../threads/thread.c:575
+#5  0x00000000 in ?? ()
+

+ +In this case, there are only two threads, the idle thread and the main +thread. The kernel stack pages (to which the struct thread points) +are at 0xc0116000 and 0xc002f000, respectively. The main thread +is stuck in timer_sleep(), called from test_mlfqs_load_1. +

+

+ +Knowing where threads are stuck can be tremendously useful, for instance +when diagnosing deadlocks or unexplained hangs. +

+

+ + +

+
+
GDB Macro: loadusersymbols +

+ +You can also use GDB to debug a user program running under Pintos. +To do that, use the loadusersymbols macro to load the program's +symbol table: +
 
loadusersymbols program
+
where program is the name of the program's executable (in the host +file system, not in the Pintos file system). For example, you may issue: +
 
(gdb) loadusersymbols tests/userprog/exec-multiple
+add symbol table from file "tests/userprog/exec-multiple" at
+    .text_addr = 0x80480a0
+(gdb)
+

+ +After this, you should be +able to debug the user program the same way you would the kernel, by +placing breakpoints, inspecting data, etc. Your actions apply to every +user program running in Pintos, not just to the one you want to debug, +so be careful in interpreting the results: GDB does not know +which process is currently active (because that is an abstraction +the Pintos kernel creates). Also, a name that appears in +both the kernel and the user program will actually refer to the kernel +name. (The latter problem can be avoided by giving the user executable +name on the GDB command line, instead of kernel.o, and then using +loadusersymbols to load kernel.o.) +loadusersymbols is implemented via GDB's add-symbol-file +command. +

+

+ +

+
+

+ + +


+ +

D.5.3 FAQ

+ +

+ +

+
+
GDB can't connect to Bochs. +

+ +If the target remote command fails, then make sure that both +GDB and pintos are running on the same machine by +running hostname in each terminal. If the names printed +differ, then you need to open a new terminal for GDB on the +machine running pintos. +

+

+ +

+
GDB doesn't recognize any of the macros. +

+ +If you start GDB with pintos-gdb, it should load the Pintos +macros automatically. If you start GDB some other way, then you must +issue the command source pintosdir/src/misc/gdb-macros, +where pintosdir is the root of your Pintos directory, before you +can use them. +

+

+ +

+
Can I debug Pintos with DDD? +

+ +Yes, you can. DDD invokes GDB as a subprocess, so you'll need to tell +it to invokes pintos-gdb instead: +
 
ddd --gdb --debugger pintos-gdb
+

+ +

+
Can I use GDB inside Emacs? +

+ +Yes, you can. Emacs has special support for running GDB as a +subprocess. Type M-x gdb and enter your pintos-gdb +command at the prompt. The Emacs manual has information on how to use +its debugging features in a section titled "Debuggers." +

+

+ +

+
GDB is doing something weird. +

+ +If you notice strange behavior while using GDB, there +are three possibilities: a bug in your +modified Pintos, a bug in Bochs's +interface to GDB or in GDB itself, or +a bug in the original Pintos code. The first and second +are quite likely, and you should seriously consider both. We hope +that the third is less likely, but it is also possible. +

+

+ + +


+ +

D.6 Triple Faults

+ +

+ +When a CPU exception handler, such as a page fault handler, cannot be +invoked because it is missing or defective, the CPU will try to invoke +the "double fault" handler. If the double fault handler is itself +missing or defective, that's called a "triple fault." A triple fault +causes an immediate CPU reset. +

+

+ +Thus, if you get yourself into a situation where the machine reboots in +a loop, that's probably a "triple fault." In a triple fault +situation, you might not be able to use printf() for debugging, +because the reboots might be happening even before everything needed for +printf() is initialized. +

+

+ +There are at least two ways to debug triple faults. First, you can run +Pintos in Bochs under GDB (see section D.5 GDB). If Bochs has been built +properly for Pintos, a triple fault under GDB will cause it to print the +message "Triple fault: stopping for gdb" on the console and break into +the debugger. (If Bochs is not running under GDB, a triple fault will +still cause it to reboot.) You can then inspect where Pintos stopped, +which is where the triple fault occurred. +

+

+ +Another option is what I call "debugging by infinite loop." +Pick a place in the Pintos code, insert the infinite loop +for (;;); there, and recompile and run. There are two likely +possibilities: +

+

+ +

+

+ +If you move around the infinite loop in a "binary search" fashion, you +can use this technique to pin down the exact spot that everything goes +wrong. It should only take a few minutes at most. +

+

+ + +


+ +

D.7 Modifying Bochs

+ +

+ +An advanced debugging technique is to modify and recompile the +simulator. This proves useful when the simulated hardware has more +information than it makes available to the OS. For example, page +faults have a long list of potential causes, but the hardware does not +report to the OS exactly which one is the particular cause. +Furthermore, a bug in the kernel's handling of page faults can easily +lead to recursive faults, but a "triple fault" will cause the CPU to +reset itself, which is hardly conducive to debugging. +

+

+ +In a case like this, you might appreciate being able to make Bochs +print out more debug information, such as the exact type of fault that +occurred. It's not very hard. You start by retrieving the source +code for Bochs 2.2.6 from http://bochs.sourceforge.net and +saving the file bochs-2.2.6.tar.gz into a directory. +The script pintos/src/misc/bochs-2.2.6-build.sh +applies a number of patches contained in pintos/src/misc +to the Bochs tree, then builds Bochs and installs it in a directory +of your choice. +Run this script without arguments to learn usage instructions. +To use your bochs binary with pintos, make sure +it is the one printed by which `bochs`; otherwise, modify +your PATH accordingly. +

+

+ +Of course, to get any good out of this you'll have to actually modify +Bochs. Instructions for doing this are firmly out of the scope of +this document. However, if you want to debug page faults as suggested +above, a good place to start adding printf()s is +BX_CPU_C::dtranslate_linear() in cpu/paging.cc. +

+

+ + +


+ +

D.8 Tips

+ +

+ +The page allocator in threads/palloc.c and the block allocator in +threads/malloc.c clear all the bytes in memory to +0xcc at time of free. Thus, if you see an attempt to +dereference a pointer like 0xcccccccc, or some other reference to +0xcc, there's a good chance you're trying to reuse a page that's +already been freed. Also, byte 0xcc is the CPU opcode for "invoke +interrupt 3," so if you see an error like Interrupt 0x03 (#BP +Breakpoint Exception), then Pintos tried to execute code in a freed page or +block. +

+

+ +An assertion failure on the expression sec_no < d->capacity +indicates that Pintos tried to access a file through an inode that has +been closed and freed. Freeing an inode clears its starting sector +number to 0xcccccccc, which is not a valid sector number for disks +smaller than about 1.6 TB. + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_9.html b/doc/pintos_9.html new file mode 100644 index 0000000..b63a0ba --- /dev/null +++ b/doc/pintos_9.html @@ -0,0 +1,143 @@ + + + + + +Pintos Projects: Development Tools + + + + + + + + + + + + + + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+ +
+

E. Development Tools

+ +

+ +Here are some tools that you might find useful while developing code. +

+

+ + +


+ +

E.1 Tags

+ +

+ +Tags are an index to the functions and global variables declared in a +program. Many editors, including Emacs and vi, can use +them. The Makefile in pintos/src produces Emacs-style +tags with the command make TAGS or vi-style tags with +make tags. +

+

+ +In Emacs, use M-. to follow a tag in the current window, +C-x 4 . in a new window, or C-x 5 . in a new frame. If +your cursor is on a symbol name for any of those commands, it becomes +the default target. If a tag name has multiple definitions, M-0 +M-. jumps to the next one. To jump back to where you were before +you followed the last tag, use M-*. +

+

+ + +


+ +

E.2 cscope

+ +

+ +The cscope program also provides an index to functions and +variables declared in a program. It has some features that tag +facilities lack. Most notably, it can find all the points in a +program at which a given function is called. +

+

+ +The Makefile in pintos/src produces cscope +indexes when it is invoked as make cscope. Once the index has +been generated, run cscope from a shell command line; no +command-line arguments are normally necessary. Then use the arrow +keys to choose one of the search criteria listed near the bottom of +the terminal, type in an identifier, and hit Enter. +cscope will then display the matches in the upper part of +the terminal. You may use the arrow keys to choose a particular +match; if you then hit Enter, cscope will invoke the +default system editor(7) and position the +cursor on that match. To start a new search, type Tab. To exit +cscope, type Ctrl-d. +

+

+ +Emacs and some versions of vi have their own interfaces to +cscope. For information on how to use these interface, +visit http://cscope.sourceforge.net, the cscope home +page. +

+

+ + +


+ +

E.3 git

+ +

+ +git is a version-control system. That is, you can use it to keep +track of multiple versions of files. The idea is that you do some +work on your code and test it, then check it into the version-control +system. If you decide that the work you've done since your last +check-in is no good, you can easily revert to the last checked-in +version. Furthermore, you can retrieve any old version of your code +as of some given day and time. The version control logs tell you who +made changes and when. +

+

+ + +


+ + + + + + + +
[ << ][ >> ]           [Top][Contents][Index][ ? ]
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_abt.html b/doc/pintos_abt.html new file mode 100644 index 0000000..6683381 --- /dev/null +++ b/doc/pintos_abt.html @@ -0,0 +1,205 @@ + + + + + +Pintos Projects: About this document + + + + + + + + + + + + + + + + + +
[Top][Contents][Index][ ? ]
+

About this document

+This document was generated +by +using texi2html +

+The buttons in the navigation panels have the following meaning: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Button Name Go to From 1.2.3 go to
+ [ < ] +Back + +previous section in reading order + +1.2.2 +
+ [ > ] +Forward + +next section in reading order + +1.2.4 +
+ [ << ] +FastBack + +beginning of this chapter or previous chapter + +1 +
+ [ Up ] +Up + +up section + +1.2 +
+ [ >> ] +FastForward + +next chapter + +2 +
+ [Top] +Top + +cover (top) of document + +   +
+ [Contents] +Contents + +table of contents + +   +
+ [Index] +Index + +concept index + +   +
+ [ ? ] +About + +this page + +   +
+

+ where the Example assumes that the current position + is at Subsubsection One-Two-Three of a document of + the following structure:

+ + +
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_fot.html b/doc/pintos_fot.html new file mode 100644 index 0000000..820500e --- /dev/null +++ b/doc/pintos_fot.html @@ -0,0 +1,79 @@ + + + + + +Pintos Projects: Footnotes + + + + + + + + + + + + + + + + + +
[Top][Contents][Index][ ? ]
+

Footnotes

+

(1)

+

GDB might tell you that +schedule() doesn't exist, which is arguably a GDB bug. +You can work around this by setting the breakpoint by filename and +line number, e.g. break thread.c:ln where ln is +the line number of the first declaration in schedule(). +

(2)

+

We +will treat these terms as synonyms. There is no standard +distinction between them, although Intel processor manuals make +a minor distinction between them on 80x86. +

(3)

+

This is because switch_threads() takes +arguments on the stack and the 80x86 SVR4 calling convention +requires the caller, not the called function, to remove them when the +call is complete. See [ SysV-i386] chapter 3 for details. +

(4)

+

Actually, virtual to physical translation on the +80x86 architecture occurs via an intermediate "linear +address," but Pintos (and most modern 80x86 OSes) set up the CPU +so that linear and virtual addresses are one and the same. Thus, you +can effectively ignore this CPU feature. +

(5)

+

pintos-gdb is a wrapper around +gdb (80x86) or i386-elf-gdb (SPARC) that loads +the Pintos macros at startup. +

(6)

+

To be precise, GDB will stop +only when running under Bochs. When running under QEMU, you must +set a breakpoint in the page_fault function to stop execution +when a page fault occurs. In that case, the btpagefault macro is +unnecessary. +

(7)

+

This is typically vi. To +exit vi, type : q Enter. +


+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_ovr.html b/doc/pintos_ovr.html new file mode 100644 index 0000000..610d243 --- /dev/null +++ b/doc/pintos_ovr.html @@ -0,0 +1,69 @@ + + + + + +Pintos Projects: Short Table of Contents + + + + + + + + + + + + + + + + + +
[Top][Contents][Index][ ? ]
+

Short Table of Contents

+
+1. Introduction +
+2. Project 0: Introducing Pintos +
+3. Project 1: Threads +
+4. Project 2: Virtual Memory +
+A. Reference Guide +
+B. Coding Standards +
+C. Project Documentation +
+D. Debugging Tools +
+E. Development Tools +
+Bibliography +
+License +
+ +
+
+
+ +This document was generated +by on March, 6 2012 +using texi2html + + + + diff --git a/doc/pintos_tour.pdf b/doc/pintos_tour.pdf new file mode 100644 index 0000000..98ec371 Binary files /dev/null and b/doc/pintos_tour.pdf differ diff --git a/doc/sample.tmpl b/doc/sample.tmpl new file mode 100644 index 0000000..2d07635 --- /dev/null +++ b/doc/sample.tmpl @@ -0,0 +1,104 @@ + + +-----------------+ + | CS 140 | + | SAMPLE PROJECT | + | DESIGN DOCUMENT | + +-----------------+ + +---- GROUP ---- + +Ben Pfaff + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for +>> the TAs, or extra credit, please give them here. + +(This is a sample design document.) + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, +>> course text, and lecture notes. + +None. + + JOIN + ==== + +---- DATA STRUCTURES ---- + +>> Copy here the declaration of each new or changed `struct' or `struct' +>> member, global or static variable, `typedef', or enumeration. +>> Identify the purpose of each in 25 words or less. + +A "latch" is a new synchronization primitive. Acquires block +until the first release. Afterward, all ongoing and future +acquires pass immediately. + + /* Latch. */ + struct latch + { + bool released; /* Released yet? */ + struct lock monitor_lock; /* Monitor lock. */ + struct condition rel_cond; /* Signaled when released. */ + }; + +Added to struct thread: + + /* Members for implementing thread_join(). */ + struct latch ready_to_die; /* Release when thread about to die. */ + struct semaphore can_die; /* Up when thread allowed to die. */ + struct list children; /* List of child threads. */ + list_elem children_elem; /* Element of `children' list. */ + +---- ALGORITHMS ---- + +>> Briefly describe your implementation of thread_join() and how it +>> interacts with thread termination. + +thread_join() finds the joined child on the thread's list of +children and waits for the child to exit by acquiring the child's +ready_to_die latch. When thread_exit() is called, the thread +releases its ready_to_die latch, allowing the parent to continue. + +---- SYNCHRONIZATION ---- + +>> Consider parent thread P with child thread C. How do you ensure +>> proper synchronization and avoid race conditions when P calls wait(C) +>> before C exits? After C exits? How do you ensure that all resources +>> are freed in each case? How about when P terminates without waiting, +>> before C exits? After C exits? Are there any special cases? + +C waits in thread_exit() for P to die before it finishes its own +exit, using the can_die semaphore "down"ed by C and "up"ed by P as +it exits. Regardless of whether whether C has terminated, there +is no race on wait(C), because C waits for P's permission before +it frees itself. + +Regardless of whether P waits for C, P still "up"s C's can_die +semaphore when P dies, so C will always be freed. (However, +freeing C's resources is delayed until P's death.) + +The initial thread is a special case because it has no parent to +wait for it or to "up" its can_die semaphore. Therefore, its +can_die semaphore is initialized to 1. + +---- RATIONALE ---- + +>> Critique your design, pointing out advantages and disadvantages in +>> your design choices. + +This design has the advantage of simplicity. Encapsulating most +of the synchronization logic into a new "latch" structure +abstracts what little complexity there is into a separate layer, +making the design easier to reason about. Also, all the new data +members are in `struct thread', with no need for any extra dynamic +allocation, etc., that would require extra management code. + +On the other hand, this design is wasteful in that a child thread +cannot free itself before its parent has terminated. A parent +thread that creates a large number of short-lived child threads +could unnecessarily exhaust kernel memory. This is probably +acceptable for implementing kernel threads, but it may be a bad +idea for use with user processes because of the larger number of +resources that user processes tend to own. diff --git a/doc/start.tmpl b/doc/start.tmpl new file mode 100644 index 0000000..83b17ad --- /dev/null +++ b/doc/start.tmpl @@ -0,0 +1,101 @@ + +--------------------+ + | OS Development | + | PROJECT 0: INTRO | + | DESIGN DOCUMENT | + +--------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + ALARM CLOCK + =========== + +---- DATA STRUCTURES ---- + +>> A1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> A2: Briefly describe what happens in a call to timer_sleep(), +>> including the effects of the timer interrupt handler. + +>> A3: What steps are taken to minimize the amount of time spent in +>> the timer interrupt handler? + +---- SYNCHRONIZATION ---- + +>> A4: How are race conditions avoided when multiple threads call +>> timer_sleep() simultaneously? + +>> A5: How are race conditions avoided when a timer interrupt occurs +>> during a call to timer_sleep()? + +---- RATIONALE ---- + +>> A6: Why did you choose this design? In what ways is it superior to +>> another design you considered? + + ARGUMENT PASSING + ================ + +---- DATA STRUCTURES ---- + +>> A1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> A2: Briefly describe how you implemented argument parsing. How do +>> you arrange for the elements of argv[] to be in the right order? +>> How do you avoid overflowing the stack page? + +---- RATIONALE ---- + +>> A3: Why does Pintos implement strtok_r() but not strtok()? + +>> A4: In Pintos, the kernel separates commands into a executable name +>> and arguments. In Unix-like systems, the shell does this +>> separation. Identify at least two advantages of the Unix approach. + + + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. + +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? + +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining projects? + +>> Any other comments? diff --git a/doc/threads.tmpl b/doc/threads.tmpl new file mode 100644 index 0000000..c3df5fe --- /dev/null +++ b/doc/threads.tmpl @@ -0,0 +1,82 @@ + +--------------------+ + | CS 140 | + | PROJECT 1: THREADS | + | DESIGN DOCUMENT | + +--------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + + PRIORITY SCHEDULING + =================== + +---- DATA STRUCTURES ---- + +>> B1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +>> B2: Explain the data structure used to track priority donation. +>> Use ASCII art to diagram a nested donation. (Alternately, submit a +>> .png file.) + +---- ALGORITHMS ---- + +>> B3: How do you ensure that the highest priority thread waiting for +>> a lock, semaphore, or condition variable wakes up first? + +>> B4: Describe the sequence of events when a call to lock_acquire() +>> causes a priority donation. How is nested donation handled? + +>> B5: Describe the sequence of events when lock_release() is called +>> on a lock that a higher-priority thread is waiting for. + +---- SYNCHRONIZATION ---- + +>> B6: Describe a potential race in thread_set_priority() and explain +>> how your implementation avoids it. Can you use a lock to avoid +>> this race? + +---- RATIONALE ---- + +>> B7: Why did you choose this design? In what ways is it superior to +>> another design you considered? + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. + +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? + +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining projects? + +>> Any other comments? diff --git a/doc/vm.tmpl b/doc/vm.tmpl new file mode 100644 index 0000000..82b0806 --- /dev/null +++ b/doc/vm.tmpl @@ -0,0 +1,81 @@ + +---------------------------+ + | CS 140 | + | PROJECT 3: VIRTUAL MEMORY | + | DESIGN DOCUMENT | + +---------------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + PAGE TABLE MANAGEMENT + ===================== + +TODO + + STACK GROWTH + ============ +TODO + + MEMORY MAPPED FILES + =================== + +---- DATA STRUCTURES ---- + +>> C1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> C2: Describe how memory mapped files integrate into your virtual +>> memory subsystem. + +>> C3: Explain how you determine whether a new file mapping overlaps +>> any existing segment. + +---- RATIONALE ---- + +>> C4: Mappings created with "mmap" have similar semantics to those of +>> data demand-paged from executables, except that "mmap" mappings are +>> written back to their original files, not to swap. This implies +>> that much of their implementation can be shared. Explain why your +>> implementation either does or does not share much of the code for +>> the two situations. + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. + +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? + +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining projects? + +>> Any other comments? -- cgit v1.2.3