diff options
| author | manuel <manuel@mausz.at> | 2012-03-26 12:54:45 +0200 |
|---|---|---|
| committer | manuel <manuel@mausz.at> | 2012-03-26 12:54:45 +0200 |
| commit | b5f0874cd96ee2a62aabc645b9626c2749cb6a01 (patch) | |
| tree | 1262e4bbe0634de6650be130c36e0538240f4cbf /doc/pintos_5.html | |
| download | progos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.tar.gz progos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.tar.bz2 progos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.zip | |
initial pintos checkin
Diffstat (limited to 'doc/pintos_5.html')
| -rw-r--r-- | doc/pintos_5.html | 3343 |
1 files changed, 3343 insertions, 0 deletions
diff --git a/doc/pintos_5.html b/doc/pintos_5.html new file mode 100644 index 0000000..670f351 --- /dev/null +++ b/doc/pintos_5.html | |||
| @@ -0,0 +1,3343 @@ | |||
| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" | ||
| 2 | "http://www.w3.org/TR/html40/loose.dtd"> | ||
| 3 | <HTML> | ||
| 4 | <!-- Created on March, 6 2012 by texi2html 1.66 --> | ||
| 5 | <!-- | ||
| 6 | Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author) | ||
| 7 | Karl Berry <karl@freefriends.org> | ||
| 8 | Olaf Bachmann <obachman@mathematik.uni-kl.de> | ||
| 9 | and many others. | ||
| 10 | Maintained by: Many creative people <dev@texi2html.cvshome.org> | ||
| 11 | Send bugs and suggestions to <users@texi2html.cvshome.org> | ||
| 12 | |||
| 13 | --> | ||
| 14 | <HEAD> | ||
| 15 | <TITLE>Pintos Projects: Reference Guide</TITLE> | ||
| 16 | |||
| 17 | <META NAME="description" CONTENT="Pintos Projects: Reference Guide"> | ||
| 18 | <META NAME="keywords" CONTENT="Pintos Projects: Reference Guide"> | ||
| 19 | <META NAME="resource-type" CONTENT="document"> | ||
| 20 | <META NAME="distribution" CONTENT="global"> | ||
| 21 | <META NAME="Generator" CONTENT="texi2html 1.66"> | ||
| 22 | <LINK REL="stylesheet" HREF="pintos.css"> | ||
| 23 | </HEAD> | ||
| 24 | |||
| 25 | <BODY > | ||
| 26 | |||
| 27 | <A NAME="SEC48"></A> | ||
| 28 | <TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0> | ||
| 29 | <TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_4.html#SEC47"> << </A>]</TD> | ||
| 30 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_6.html#SEC89"> >> </A>]</TD> | ||
| 31 | <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD> | ||
| 32 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD> | ||
| 33 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD> | ||
| 34 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD> | ||
| 35 | </TR></TABLE> | ||
| 36 | |||
| 37 | <HR SIZE=2> | ||
| 38 | <H1> A. Reference Guide </H1> | ||
| 39 | <!--docid::SEC48::--> | ||
| 40 | <P> | ||
| 41 | |||
| 42 | This chapter is a reference for the Pintos code. The reference guide | ||
| 43 | does not cover all of the code in Pintos, but it does cover those | ||
| 44 | pieces that students most often find troublesome. You may find that | ||
| 45 | you want to read each part of the reference guide as you work on the | ||
| 46 | project where it becomes important. | ||
| 47 | </P> | ||
| 48 | <P> | ||
| 49 | |||
| 50 | We recommend using "tags" to follow along with references to function | ||
| 51 | and variable names (see section <A HREF="pintos_9.html#SEC110">E.1 Tags</A>). | ||
| 52 | </P> | ||
| 53 | <P> | ||
| 54 | |||
| 55 | <A NAME="Pintos Loading"></A> | ||
| 56 | <HR SIZE="6"> | ||
| 57 | <A NAME="SEC49"></A> | ||
| 58 | <H2> A.1 Loading </H2> | ||
| 59 | <!--docid::SEC49::--> | ||
| 60 | <P> | ||
| 61 | |||
| 62 | This section covers the Pintos loader and basic kernel | ||
| 63 | initialization. | ||
| 64 | </P> | ||
| 65 | <P> | ||
| 66 | |||
| 67 | <A NAME="Pintos Loader"></A> | ||
| 68 | <HR SIZE="6"> | ||
| 69 | <A NAME="SEC50"></A> | ||
| 70 | <H3> A.1.1 The Loader </H3> | ||
| 71 | <!--docid::SEC50::--> | ||
| 72 | <P> | ||
| 73 | |||
| 74 | The first part of Pintos that runs is the loader, in | ||
| 75 | <Q><TT>threads/loader.S</TT></Q>. The PC BIOS loads the loader into memory. | ||
| 76 | The loader, in turn, is responsible for finding the kernel on disk, | ||
| 77 | loading it into memory, and then jumping to its start. It's | ||
| 78 | not important to understand exactly how the loader works, but if | ||
| 79 | you're interested, read on. You should probably read along with the | ||
| 80 | loader's source. You should also understand the basics of the | ||
| 81 | 80<VAR>x</VAR>86 architecture as described by chapter 3, "Basic Execution | ||
| 82 | Environment," of [ <A HREF="pintos_10.html#IA32-v1">IA32-v1</A>]. | ||
| 83 | </P> | ||
| 84 | <P> | ||
| 85 | |||
| 86 | The PC BIOS loads the loader from the first sector of the first hard | ||
| 87 | disk, called the <EM>master boot record</EM> (MBR). PC conventions | ||
| 88 | reserve 64 bytes of the MBR for the partition table, and Pintos uses | ||
| 89 | about 128 additional bytes for kernel command-line arguments. This | ||
| 90 | leaves a little over 300 bytes for the loader's own code. This is a | ||
| 91 | severe restriction that means, practically speaking, the loader must | ||
| 92 | be written in assembly language. | ||
| 93 | </P> | ||
| 94 | <P> | ||
| 95 | |||
| 96 | The Pintos loader and kernel don't have to be on the same disk, nor | ||
| 97 | does is the kernel required to be in any particular location on a | ||
| 98 | given disk. The loader's first job, then, is to find the kernel by | ||
| 99 | reading the partition table on each hard disk, looking for a bootable | ||
| 100 | partition of the type used for a Pintos kernel. | ||
| 101 | </P> | ||
| 102 | <P> | ||
| 103 | |||
| 104 | When the loader finds a bootable kernel partition, it reads the | ||
| 105 | partition's contents into memory at physical address 128 kB. The | ||
| 106 | kernel is at the beginning of the partition, which might be larger | ||
| 107 | than necessary due to partition boundary alignment conventions, so the | ||
| 108 | loader reads no more than 512 kB (and the Pintos build process | ||
| 109 | will refuse to produce kernels larger than that). Reading more data | ||
| 110 | than this would cross into the region from 640 kB to 1 MB that | ||
| 111 | the PC architecture reserves for hardware and the BIOS, and a standard | ||
| 112 | PC BIOS does not provide any means to load the kernel above 1 MB. | ||
| 113 | </P> | ||
| 114 | <P> | ||
| 115 | |||
| 116 | The loader's final job is to extract the entry point from the loaded | ||
| 117 | kernel image and transfer control to it. The entry point is not at a | ||
| 118 | predictable location, but the kernel's ELF header contains a pointer | ||
| 119 | to it. The loader extracts the pointer and jumps to the location it | ||
| 120 | points to. | ||
| 121 | </P> | ||
| 122 | <P> | ||
| 123 | |||
| 124 | The Pintos kernel command line | ||
| 125 | is stored in the boot loader. The <CODE>pintos</CODE> program actually | ||
| 126 | modifies a copy of the boot loader on disk each time it runs the kernel, | ||
| 127 | inserting whatever command-line arguments the user supplies to the kernel, | ||
| 128 | and then the kernel at boot time reads those arguments out of the boot | ||
| 129 | loader in memory. This is not an elegant solution, but it is simple | ||
| 130 | and effective. | ||
| 131 | </P> | ||
| 132 | <P> | ||
| 133 | |||
| 134 | <A NAME="Low-Level Kernel Initialization"></A> | ||
| 135 | <HR SIZE="6"> | ||
| 136 | <A NAME="SEC51"></A> | ||
| 137 | <H3> A.1.2 Low-Level Kernel Initialization </H3> | ||
| 138 | <!--docid::SEC51::--> | ||
| 139 | <P> | ||
| 140 | |||
| 141 | The loader's last action is to transfer control to the kernel's entry | ||
| 142 | point, which is <CODE>start()</CODE> in <Q><TT>threads/start.S</TT></Q>. The job of | ||
| 143 | this code is to switch the CPU from legacy 16-bit "real mode" into | ||
| 144 | the 32-bit "protected mode" used by all modern 80<VAR>x</VAR>86 operating | ||
| 145 | systems. | ||
| 146 | </P> | ||
| 147 | <P> | ||
| 148 | |||
| 149 | The startup code's first task is actually to obtain the machine's | ||
| 150 | memory size, by asking the BIOS for the PC's memory size. The | ||
| 151 | simplest BIOS function to do this can only detect up to 64 MB of RAM, | ||
| 152 | so that's the practical limit that Pintos can support. The function | ||
| 153 | stores the memory size, in pages, in global variable | ||
| 154 | <CODE>init_ram_pages</CODE>. | ||
| 155 | </P> | ||
| 156 | <P> | ||
| 157 | |||
| 158 | The first part of CPU initialization is to enable the A20 line, that | ||
| 159 | is, the CPU's address line numbered 20. For historical reasons, PCs | ||
| 160 | boot with this address line fixed at 0, which means that attempts to | ||
| 161 | access memory beyond the first 1 MB (2 raised to the 20th power) will | ||
| 162 | fail. Pintos wants to access more memory than this, so we have to | ||
| 163 | enable it. | ||
| 164 | </P> | ||
| 165 | <P> | ||
| 166 | |||
| 167 | Next, the loader creates a basic page table. This page table maps | ||
| 168 | the 64 MB at the base of virtual memory (starting at virtual address | ||
| 169 | 0) directly to the identical physical addresses. It also maps the | ||
| 170 | same physical memory starting at virtual address | ||
| 171 | <CODE>LOADER_PHYS_BASE</CODE>, which defaults to <TT>0xc0000000</TT> (3 GB). The | ||
| 172 | Pintos kernel only wants the latter mapping, but there's a | ||
| 173 | chicken-and-egg problem if we don't include the former: our current | ||
| 174 | virtual address is roughly <TT>0x20000</TT>, the location where the loader | ||
| 175 | put us, and we can't jump to <TT>0xc0020000</TT> until we turn on the | ||
| 176 | page table, but if we turn on the page table without jumping there, | ||
| 177 | then we've just pulled the rug out from under ourselves. | ||
| 178 | </P> | ||
| 179 | <P> | ||
| 180 | |||
| 181 | After the page table is initialized, we load the CPU's control | ||
| 182 | registers to turn on protected mode and paging, and set up the segment | ||
| 183 | registers. We aren't yet equipped to handle interrupts in protected | ||
| 184 | mode, so we disable interrupts. The final step is to call <CODE>main()</CODE>. | ||
| 185 | </P> | ||
| 186 | <P> | ||
| 187 | |||
| 188 | <A NAME="High-Level Kernel Initialization"></A> | ||
| 189 | <HR SIZE="6"> | ||
| 190 | <A NAME="SEC52"></A> | ||
| 191 | <H3> A.1.3 High-Level Kernel Initialization </H3> | ||
| 192 | <!--docid::SEC52::--> | ||
| 193 | <P> | ||
| 194 | |||
| 195 | The kernel proper starts with the <CODE>main()</CODE> function. The | ||
| 196 | <CODE>main()</CODE> function is written in C, as will be most of the code we | ||
| 197 | encounter in Pintos from here on out. | ||
| 198 | </P> | ||
| 199 | <P> | ||
| 200 | |||
| 201 | When <CODE>main()</CODE> starts, the system is in a pretty raw state. We're | ||
| 202 | in 32-bit protected mode with paging enabled, but hardly anything else is | ||
| 203 | ready. Thus, the <CODE>main()</CODE> function consists primarily of calls | ||
| 204 | into other Pintos modules' initialization functions. | ||
| 205 | These are usually named <CODE><VAR>module</VAR>_init()</CODE>, where | ||
| 206 | <VAR>module</VAR> is the module's name, <Q><TT><VAR>module</VAR>.c</TT></Q> is the | ||
| 207 | module's source code, and <Q><TT><VAR>module</VAR>.h</TT></Q> is the module's | ||
| 208 | header. | ||
| 209 | </P> | ||
| 210 | <P> | ||
| 211 | |||
| 212 | The first step in <CODE>main()</CODE> is to call <CODE>bss_init()</CODE>, which clears | ||
| 213 | out the kernel's "BSS", which is the traditional name for a | ||
| 214 | segment that should be initialized to all zeros. In most C | ||
| 215 | implementations, whenever you | ||
| 216 | declare a variable outside a function without providing an | ||
| 217 | initializer, that variable goes into the BSS. Because it's all zeros, the | ||
| 218 | BSS isn't stored in the image that the loader brought into memory. We | ||
| 219 | just use <CODE>memset()</CODE> to zero it out. | ||
| 220 | </P> | ||
| 221 | <P> | ||
| 222 | |||
| 223 | Next, <CODE>main()</CODE> calls <CODE>read_command_line()</CODE> to break the kernel command | ||
| 224 | line into arguments, then <CODE>parse_options()</CODE> to read any options at | ||
| 225 | the beginning of the command line. (Actions specified on the | ||
| 226 | command line execute later.) | ||
| 227 | </P> | ||
| 228 | <P> | ||
| 229 | |||
| 230 | <CODE>thread_init()</CODE> initializes the thread system. We will defer full | ||
| 231 | discussion to our discussion of Pintos threads below. It is called so | ||
| 232 | early in initialization because a valid thread structure is a | ||
| 233 | prerequisite for acquiring a lock, and lock acquisition in turn is | ||
| 234 | important to other Pintos subsystems. Then we initialize the console | ||
| 235 | and print a startup message to the console. | ||
| 236 | </P> | ||
| 237 | <P> | ||
| 238 | |||
| 239 | The next block of functions we call initializes the kernel's memory | ||
| 240 | system. <CODE>palloc_init()</CODE> sets up the kernel page allocator, which | ||
| 241 | doles out memory one or more pages at a time (see section <A HREF="pintos_5.html#SEC70">A.5.1 Page Allocator</A>). | ||
| 242 | <CODE>malloc_init()</CODE> sets | ||
| 243 | up the allocator that handles allocations of arbitrary-size blocks of | ||
| 244 | memory (see section <A HREF="pintos_5.html#SEC71">A.5.2 Block Allocator</A>). | ||
| 245 | <CODE>paging_init()</CODE> sets up a page table for the kernel (see section <A HREF="pintos_5.html#SEC73">A.7 Page Table</A>). | ||
| 246 | </P> | ||
| 247 | <P> | ||
| 248 | |||
| 249 | In projects 2 and later, <CODE>main()</CODE> also calls <CODE>tss_init()</CODE> and | ||
| 250 | <CODE>gdt_init()</CODE>. | ||
| 251 | </P> | ||
| 252 | <P> | ||
| 253 | |||
| 254 | The next set of calls initializes the interrupt system. | ||
| 255 | <CODE>intr_init()</CODE> sets up the CPU's <EM>interrupt descriptor table</EM> | ||
| 256 | (IDT) to ready it for interrupt handling (see section <A HREF="pintos_5.html#SEC66">A.4.1 Interrupt Infrastructure</A>), then <CODE>timer_init()</CODE> and <CODE>kbd_init()</CODE> prepare for | ||
| 257 | handling timer interrupts and keyboard interrupts, respectively. | ||
| 258 | <CODE>input_init()</CODE> sets up to merge serial and keyboard input into one | ||
| 259 | stream. In | ||
| 260 | projects 2 and later, we also prepare to handle interrupts caused by | ||
| 261 | user programs using <CODE>exception_init()</CODE> and <CODE>syscall_init()</CODE>. | ||
| 262 | </P> | ||
| 263 | <P> | ||
| 264 | |||
| 265 | Now that interrupts are set up, we can start the scheduler | ||
| 266 | with <CODE>thread_start()</CODE>, which creates the idle thread and enables | ||
| 267 | interrupts. | ||
| 268 | With interrupts enabled, interrupt-driven serial port I/O becomes | ||
| 269 | possible, so we use | ||
| 270 | <CODE>serial_init_queue()</CODE> to switch to that mode. Finally, | ||
| 271 | <CODE>timer_calibrate()</CODE> calibrates the timer for accurate short delays. | ||
| 272 | </P> | ||
| 273 | <P> | ||
| 274 | |||
| 275 | If the file system is compiled in, as it will starting in project 2, we | ||
| 276 | initialize the IDE disks with <CODE>ide_init()</CODE>, then the | ||
| 277 | file system with <CODE>filesys_init()</CODE>. | ||
| 278 | </P> | ||
| 279 | <P> | ||
| 280 | |||
| 281 | Boot is complete, so we print a message. | ||
| 282 | </P> | ||
| 283 | <P> | ||
| 284 | |||
| 285 | Function <CODE>run_actions()</CODE> now parses and executes actions specified on | ||
| 286 | the kernel command line, such as <CODE>run</CODE> to run a test (in project | ||
| 287 | 1) or a user program (in later projects). | ||
| 288 | </P> | ||
| 289 | <P> | ||
| 290 | |||
| 291 | Finally, if <Q><SAMP>-q</SAMP></Q> was specified on the kernel command line, we | ||
| 292 | call <CODE>shutdown_power_off()</CODE> to terminate the machine simulator. Otherwise, | ||
| 293 | <CODE>main()</CODE> calls <CODE>thread_exit()</CODE>, which allows any other running | ||
| 294 | threads to continue running. | ||
| 295 | </P> | ||
| 296 | <P> | ||
| 297 | |||
| 298 | <A NAME="Physical Memory Map"></A> | ||
| 299 | <HR SIZE="6"> | ||
| 300 | <A NAME="SEC53"></A> | ||
| 301 | <H3> A.1.4 Physical Memory Map </H3> | ||
| 302 | <!--docid::SEC53::--> | ||
| 303 | <P> | ||
| 304 | |||
| 305 | </P> | ||
| 306 | <TABLE> | ||
| 307 | @headitem Memory Range | ||
| 308 | </TD><TD> Owner | ||
| 309 | </TD><TD> Contents | ||
| 310 | |||
| 311 | <TR><TD><TT>00000000</TT>--<TT>000003ff</TT> </TD><TD> CPU </TD><TD> Real mode interrupt table.</TD> | ||
| 312 | </TR> | ||
| 313 | <TR><TD><TT>00000400</TT>--<TT>000005ff</TT> </TD><TD> BIOS </TD><TD> Miscellaneous data area.</TD> | ||
| 314 | </TR> | ||
| 315 | <TR><TD><TT>00000600</TT>--<TT>00007bff</TT> </TD><TD> -- </TD><TD> ---</TD> | ||
| 316 | </TR> | ||
| 317 | <TR><TD><TT>00007c00</TT>--<TT>00007dff</TT> </TD><TD> Pintos </TD><TD> Loader.</TD> | ||
| 318 | </TR> | ||
| 319 | <TR><TD><TT>0000e000</TT>--<TT>0000efff</TT> </TD><TD> Pintos</TD> | ||
| 320 | </TD><TD> Stack for loader; kernel stack and <CODE>struct thread</CODE> for initial | ||
| 321 | kernel thread. | ||
| 322 | </TR> | ||
| 323 | <TR><TD><TT>0000f000</TT>--<TT>0000ffff</TT> </TD><TD> Pintos</TD> | ||
| 324 | </TD><TD> Page directory for startup code. | ||
| 325 | </TR> | ||
| 326 | <TR><TD><TT>00010000</TT>--<TT>00020000</TT> </TD><TD> Pintos</TD> | ||
| 327 | </TD><TD> Page tables for startup code. | ||
| 328 | </TR> | ||
| 329 | <TR><TD><TT>00020000</TT>--<TT>0009ffff</TT> </TD><TD> Pintos</TD> | ||
| 330 | </TD><TD> Kernel code, data, and uninitialized data segments. | ||
| 331 | </TR> | ||
| 332 | <TR><TD><TT>000a0000</TT>--<TT>000bffff</TT> </TD><TD> Video </TD><TD> VGA display memory.</TD> | ||
| 333 | </TR> | ||
| 334 | <TR><TD><TT>000c0000</TT>--<TT>000effff</TT> </TD><TD> Hardware</TD> | ||
| 335 | </TD><TD> Reserved for expansion card RAM and ROM. | ||
| 336 | </TR> | ||
| 337 | <TR><TD><TT>000f0000</TT>--<TT>000fffff</TT> </TD><TD> BIOS </TD><TD> ROM BIOS.</TD> | ||
| 338 | </TR> | ||
| 339 | <TR><TD><TT>00100000</TT>--<TT>03ffffff</TT> </TD><TD> Pintos </TD><TD> Dynamic memory allocation.</TD> | ||
| 340 | </TR></TABLE> | ||
| 341 | <P> | ||
| 342 | |||
| 343 | <A NAME="Threads"></A> | ||
| 344 | <HR SIZE="6"> | ||
| 345 | <A NAME="SEC54"></A> | ||
| 346 | <H2> A.2 Threads </H2> | ||
| 347 | <!--docid::SEC54::--> | ||
| 348 | <P> | ||
| 349 | |||
| 350 | <A NAME="struct thread"></A> | ||
| 351 | <HR SIZE="6"> | ||
| 352 | <A NAME="SEC55"></A> | ||
| 353 | <H3> A.2.1 <CODE>struct thread</CODE> </H3> | ||
| 354 | <!--docid::SEC55::--> | ||
| 355 | <P> | ||
| 356 | |||
| 357 | The main Pintos data structure for threads is <CODE>struct thread</CODE>, | ||
| 358 | declared in <Q><TT>threads/thread.h</TT></Q>. | ||
| 359 | </P> | ||
| 360 | <P> | ||
| 361 | |||
| 362 | <A NAME="IDX4"></A> | ||
| 363 | </P> | ||
| 364 | <DL> | ||
| 365 | <DT><U>Structure:</U> <B>struct thread</B> | ||
| 366 | <DD>Represents a thread or a user process. In the projects, you will have | ||
| 367 | to add your own members to <CODE>struct thread</CODE>. You may also change or | ||
| 368 | delete the definitions of existing members. | ||
| 369 | <P> | ||
| 370 | |||
| 371 | Every <CODE>struct thread</CODE> occupies the beginning of its own page of | ||
| 372 | memory. The rest of the page is used for the thread's stack, which | ||
| 373 | grows downward from the end of the page. It looks like this: | ||
| 374 | </P> | ||
| 375 | <P> | ||
| 376 | |||
| 377 | <TABLE><tr><td> </td><td class=example><pre> 4 kB +---------------------------------+ | ||
| 378 | | kernel stack | | ||
| 379 | | | | | ||
| 380 | | | | | ||
| 381 | | V | | ||
| 382 | | grows downward | | ||
| 383 | | | | ||
| 384 | | | | ||
| 385 | | | | ||
| 386 | | | | ||
| 387 | | | | ||
| 388 | | | | ||
| 389 | | | | ||
| 390 | | | | ||
| 391 | sizeof (struct thread) +---------------------------------+ | ||
| 392 | | magic | | ||
| 393 | | : | | ||
| 394 | | : | | ||
| 395 | | status | | ||
| 396 | | tid | | ||
| 397 | 0 kB +---------------------------------+ | ||
| 398 | </pre></td></tr></table><P> | ||
| 399 | |||
| 400 | This has two consequences. First, <CODE>struct thread</CODE> must not be allowed | ||
| 401 | to grow too big. If it does, then there will not be enough room for the | ||
| 402 | kernel stack. The base <CODE>struct thread</CODE> is only a few bytes in size. It | ||
| 403 | probably should stay well under 1 kB. | ||
| 404 | </P> | ||
| 405 | <P> | ||
| 406 | |||
| 407 | Second, kernel stacks must not be allowed to grow too large. If a stack | ||
| 408 | overflows, it will corrupt the thread state. Thus, kernel functions | ||
| 409 | should not allocate large structures or arrays as non-static local | ||
| 410 | variables. Use dynamic allocation with <CODE>malloc()</CODE> or | ||
| 411 | <CODE>palloc_get_page()</CODE> instead (see section <A HREF="pintos_5.html#SEC69">A.5 Memory Allocation</A>). | ||
| 412 | </P> | ||
| 413 | </DL> | ||
| 414 | <P> | ||
| 415 | |||
| 416 | <A NAME="IDX5"></A> | ||
| 417 | </P> | ||
| 418 | <DL> | ||
| 419 | <DT><U>Member of <CODE>struct thread</CODE>:</U> tid_t <B>tid</B> | ||
| 420 | <DD>The thread's thread identifier or <EM>tid</EM>. Every thread must have a | ||
| 421 | tid that is unique over the entire lifetime of the kernel. By | ||
| 422 | default, <CODE>tid_t</CODE> is a <CODE>typedef</CODE> for <CODE>int</CODE> and each new | ||
| 423 | thread receives the numerically next higher tid, starting from 1 for | ||
| 424 | the initial process. You can change the type and the numbering scheme | ||
| 425 | if you like. | ||
| 426 | </DL> | ||
| 427 | <P> | ||
| 428 | |||
| 429 | <A NAME="IDX6"></A> | ||
| 430 | </P> | ||
| 431 | <DL> | ||
| 432 | <DT><U>Member of <CODE>struct thread</CODE>:</U> enum thread_status <B>status</B> | ||
| 433 | <DD><A NAME="Thread States"></A> | ||
| 434 | The thread's state, one of the following: | ||
| 435 | <P> | ||
| 436 | |||
| 437 | <A NAME="IDX7"></A> | ||
| 438 | </P> | ||
| 439 | <DL> | ||
| 440 | <DT><U>Thread State:</U> <B><CODE>THREAD_RUNNING</CODE></B> | ||
| 441 | <DD>The thread is running. Exactly one thread is running at a given time. | ||
| 442 | <CODE>thread_current()</CODE> returns the running thread. | ||
| 443 | </DL> | ||
| 444 | <P> | ||
| 445 | |||
| 446 | <A NAME="IDX8"></A> | ||
| 447 | </P> | ||
| 448 | <DL> | ||
| 449 | <DT><U>Thread State:</U> <B><CODE>THREAD_READY</CODE></B> | ||
| 450 | <DD>The thread is ready to run, but it's not running right now. The | ||
| 451 | thread could be selected to run the next time the scheduler is | ||
| 452 | invoked. Ready threads are kept in a doubly linked list called | ||
| 453 | <CODE>ready_list</CODE>. | ||
| 454 | </DL> | ||
| 455 | <P> | ||
| 456 | |||
| 457 | <A NAME="IDX9"></A> | ||
| 458 | </P> | ||
| 459 | <DL> | ||
| 460 | <DT><U>Thread State:</U> <B><CODE>THREAD_BLOCKED</CODE></B> | ||
| 461 | <DD>The thread is waiting for something, e.g. a lock to become | ||
| 462 | available, an interrupt to be invoked. The thread won't be scheduled | ||
| 463 | again until it transitions to the <CODE>THREAD_READY</CODE> state with a | ||
| 464 | call to <CODE>thread_unblock()</CODE>. This is most conveniently done | ||
| 465 | indirectly, using one of the Pintos synchronization primitives that | ||
| 466 | block and unblock threads automatically (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>). | ||
| 467 | <P> | ||
| 468 | |||
| 469 | There is no <I>a priori</I> way to tell what a blocked thread is waiting | ||
| 470 | for, but a backtrace can help (see section <A HREF="pintos_8.html#SEC100">D.4 Backtraces</A>). | ||
| 471 | </P> | ||
| 472 | </DL> | ||
| 473 | <P> | ||
| 474 | |||
| 475 | <A NAME="IDX10"></A> | ||
| 476 | </P> | ||
| 477 | <DL> | ||
| 478 | <DT><U>Thread State:</U> <B><CODE>THREAD_DYING</CODE></B> | ||
| 479 | <DD>The thread will be destroyed by the scheduler after switching to the | ||
| 480 | next thread. | ||
| 481 | </DL> | ||
| 482 | </DL> | ||
| 483 | <P> | ||
| 484 | |||
| 485 | <A NAME="IDX11"></A> | ||
| 486 | </P> | ||
| 487 | <DL> | ||
| 488 | <DT><U>Member of <CODE>struct thread</CODE>:</U> char <B>name[16]</B> | ||
| 489 | <DD>The thread's name as a string, or at least the first few characters of | ||
| 490 | it. | ||
| 491 | </DL> | ||
| 492 | <P> | ||
| 493 | |||
| 494 | <A NAME="IDX12"></A> | ||
| 495 | </P> | ||
| 496 | <DL> | ||
| 497 | <DT><U>Member of <CODE>struct thread</CODE>:</U> uint8_t *<B>stack</B> | ||
| 498 | <DD>Every thread has its own stack to keep track of its state. When the | ||
| 499 | thread is running, the CPU's stack pointer register tracks the top of | ||
| 500 | the stack and this member is unused. But when the CPU switches to | ||
| 501 | another thread, this member saves the thread's stack pointer. No | ||
| 502 | other members are needed to save the thread's registers, because the | ||
| 503 | other registers that must be saved are saved on the stack. | ||
| 504 | <P> | ||
| 505 | |||
| 506 | When an interrupt occurs, whether in the kernel or a user program, an | ||
| 507 | <CODE>struct intr_frame</CODE> is pushed onto the stack. When the interrupt occurs | ||
| 508 | in a user program, the <CODE>struct intr_frame</CODE> is always at the very top of | ||
| 509 | the page. See section <A HREF="pintos_5.html#SEC65">A.4 Interrupt Handling</A>, for more information. | ||
| 510 | </P> | ||
| 511 | </DL> | ||
| 512 | <P> | ||
| 513 | |||
| 514 | <A NAME="IDX13"></A> | ||
| 515 | </P> | ||
| 516 | <DL> | ||
| 517 | <DT><U>Member of <CODE>struct thread</CODE>:</U> int <B>priority</B> | ||
| 518 | <DD>A thread priority, ranging from <CODE>PRI_MIN</CODE> (0) to <CODE>PRI_MAX</CODE> | ||
| 519 | (63). Lower numbers correspond to lower priorities, so that | ||
| 520 | priority 0 is the lowest priority and priority 63 is the highest. | ||
| 521 | Pintos as provided ignores thread priorities, but you will implement | ||
| 522 | priority scheduling in project 1 (see section <A HREF="pintos_3.html#SEC45">3.2.2 Priority Scheduling</A>). | ||
| 523 | </DL> | ||
| 524 | <P> | ||
| 525 | |||
| 526 | <A NAME="IDX14"></A> | ||
| 527 | </P> | ||
| 528 | <DL> | ||
| 529 | <DT><U>Member of <CODE>struct thread</CODE>:</U> <CODE>struct list_elem</CODE> <B>allelem</B> | ||
| 530 | <DD>This "list element" is used to link the thread into the list of all | ||
| 531 | threads. Each thread is inserted into this list when it is created | ||
| 532 | and removed when it exits. The <CODE>thread_foreach()</CODE> function should | ||
| 533 | be used to iterate over all threads. | ||
| 534 | </DL> | ||
| 535 | <P> | ||
| 536 | |||
| 537 | <A NAME="IDX15"></A> | ||
| 538 | </P> | ||
| 539 | <DL> | ||
| 540 | <DT><U>Member of <CODE>struct thread</CODE>:</U> <CODE>struct list_elem</CODE> <B>elem</B> | ||
| 541 | <DD>A "list element" used to put the thread into doubly linked lists, | ||
| 542 | either <CODE>ready_list</CODE> (the list of threads ready to run) or a list of | ||
| 543 | threads waiting on a semaphore in <CODE>sema_down()</CODE>. It can do double | ||
| 544 | duty because a thread waiting on a semaphore is not ready, and vice | ||
| 545 | versa. | ||
| 546 | </DL> | ||
| 547 | <P> | ||
| 548 | |||
| 549 | <A NAME="IDX16"></A> | ||
| 550 | </P> | ||
| 551 | <DL> | ||
| 552 | <DT><U>Member of <CODE>struct thread</CODE>:</U> uint32_t *<B>pagedir</B> | ||
| 553 | <DD>Only present in project 2 and later. See <A HREF="pintos_4.html#Page Tables">Page Tables</A>. | ||
| 554 | </DL> | ||
| 555 | <P> | ||
| 556 | |||
| 557 | <A NAME="IDX17"></A> | ||
| 558 | </P> | ||
| 559 | <DL> | ||
| 560 | <DT><U>Member of <CODE>struct thread</CODE>:</U> unsigned <B>magic</B> | ||
| 561 | <DD>Always set to <CODE>THREAD_MAGIC</CODE>, which is just an arbitrary number defined | ||
| 562 | in <Q><TT>threads/thread.c</TT></Q>, and used to detect stack overflow. | ||
| 563 | <CODE>thread_current()</CODE> checks that the <CODE>magic</CODE> member of the running | ||
| 564 | thread's <CODE>struct thread</CODE> is set to <CODE>THREAD_MAGIC</CODE>. Stack overflow | ||
| 565 | tends to change this value, triggering the assertion. For greatest | ||
| 566 | benefit, as you add members to <CODE>struct thread</CODE>, leave <CODE>magic</CODE> at | ||
| 567 | the end. | ||
| 568 | </DL> | ||
| 569 | <P> | ||
| 570 | |||
| 571 | <A NAME="Thread Functions"></A> | ||
| 572 | <HR SIZE="6"> | ||
| 573 | <A NAME="SEC56"></A> | ||
| 574 | <H3> A.2.2 Thread Functions </H3> | ||
| 575 | <!--docid::SEC56::--> | ||
| 576 | <P> | ||
| 577 | |||
| 578 | <Q><TT>threads/thread.c</TT></Q> implements several public functions for thread | ||
| 579 | support. Let's take a look at the most useful: | ||
| 580 | </P> | ||
| 581 | <P> | ||
| 582 | |||
| 583 | <A NAME="IDX18"></A> | ||
| 584 | </P> | ||
| 585 | <DL> | ||
| 586 | <DT><U>Function:</U> void <B>thread_init</B> (void) | ||
| 587 | <DD>Called by <CODE>main()</CODE> to initialize the thread system. Its main | ||
| 588 | purpose is to create a <CODE>struct thread</CODE> for Pintos's initial thread. | ||
| 589 | This is possible because the Pintos loader puts the initial | ||
| 590 | thread's stack at the top of a page, in the same position as any other | ||
| 591 | Pintos thread. | ||
| 592 | <P> | ||
| 593 | |||
| 594 | Before <CODE>thread_init()</CODE> runs, | ||
| 595 | <CODE>thread_current()</CODE> will fail because the running thread's | ||
| 596 | <CODE>magic</CODE> value is incorrect. Lots of functions call | ||
| 597 | <CODE>thread_current()</CODE> directly or indirectly, including | ||
| 598 | <CODE>lock_acquire()</CODE> for locking a lock, so <CODE>thread_init()</CODE> is | ||
| 599 | called early in Pintos initialization. | ||
| 600 | </P> | ||
| 601 | </DL> | ||
| 602 | <P> | ||
| 603 | |||
| 604 | <A NAME="IDX19"></A> | ||
| 605 | </P> | ||
| 606 | <DL> | ||
| 607 | <DT><U>Function:</U> void <B>thread_start</B> (void) | ||
| 608 | <DD>Called by <CODE>main()</CODE> to start the scheduler. Creates the idle | ||
| 609 | thread, that is, the thread that is scheduled when no other thread is | ||
| 610 | ready. Then enables interrupts, which as a side effect enables the | ||
| 611 | scheduler because the scheduler runs on return from the timer interrupt, using | ||
| 612 | <CODE>intr_yield_on_return()</CODE> (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>). | ||
| 613 | </DL> | ||
| 614 | <P> | ||
| 615 | |||
| 616 | <A NAME="IDX20"></A> | ||
| 617 | </P> | ||
| 618 | <DL> | ||
| 619 | <DT><U>Function:</U> void <B>thread_tick</B> (void) | ||
| 620 | <DD>Called by the timer interrupt at each timer tick. It keeps track of | ||
| 621 | thread statistics and triggers the scheduler when a time slice expires. | ||
| 622 | </DL> | ||
| 623 | <P> | ||
| 624 | |||
| 625 | <A NAME="IDX21"></A> | ||
| 626 | </P> | ||
| 627 | <DL> | ||
| 628 | <DT><U>Function:</U> void <B>thread_print_stats</B> (void) | ||
| 629 | <DD>Called during Pintos shutdown to print thread statistics. | ||
| 630 | </DL> | ||
| 631 | <P> | ||
| 632 | |||
| 633 | <A NAME="IDX22"></A> | ||
| 634 | </P> | ||
| 635 | <DL> | ||
| 636 | <DT><U>Function:</U> tid_t <B>thread_create</B> (const char *<VAR>name</VAR>, int <VAR>priority</VAR>, thread_func *<VAR>func</VAR>, void *<VAR>aux</VAR>) | ||
| 637 | <DD>Creates and starts a new thread named <VAR>name</VAR> with the given | ||
| 638 | <VAR>priority</VAR>, returning the new thread's tid. The thread executes | ||
| 639 | <VAR>func</VAR>, passing <VAR>aux</VAR> as the function's single argument. | ||
| 640 | <P> | ||
| 641 | |||
| 642 | <CODE>thread_create()</CODE> allocates a page for the thread's | ||
| 643 | <CODE>struct thread</CODE> and stack and initializes its members, then it sets | ||
| 644 | up a set of fake stack frames for it (see section <A HREF="pintos_5.html#SEC57">A.2.3 Thread Switching</A>). The | ||
| 645 | thread is initialized in the blocked state, then unblocked just before | ||
| 646 | returning, which allows the new thread to | ||
| 647 | be scheduled (see <A HREF="pintos_5.html#Thread States">Thread States</A>). | ||
| 648 | </P> | ||
| 649 | <P> | ||
| 650 | |||
| 651 | <A NAME="IDX23"></A> | ||
| 652 | </P> | ||
| 653 | <DL> | ||
| 654 | <DT><U>Type:</U> <B>void thread_func (void *<VAR>aux</VAR>)</B> | ||
| 655 | <DD>This is the type of the function passed to <CODE>thread_create()</CODE>, whose | ||
| 656 | <VAR>aux</VAR> argument is passed along as the function's argument. | ||
| 657 | </DL> | ||
| 658 | </DL> | ||
| 659 | <P> | ||
| 660 | |||
| 661 | <A NAME="IDX24"></A> | ||
| 662 | </P> | ||
| 663 | <DL> | ||
| 664 | <DT><U>Function:</U> void <B>thread_block</B> (void) | ||
| 665 | <DD>Transitions the running thread from the running state to the blocked | ||
| 666 | state (see <A HREF="pintos_5.html#Thread States">Thread States</A>). The thread will not run again until | ||
| 667 | <CODE>thread_unblock()</CODE> is | ||
| 668 | called on it, so you'd better have some way arranged for that to happen. | ||
| 669 | Because <CODE>thread_block()</CODE> is so low-level, you should prefer to use | ||
| 670 | one of the synchronization primitives instead (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>). | ||
| 671 | </DL> | ||
| 672 | <P> | ||
| 673 | |||
| 674 | <A NAME="IDX25"></A> | ||
| 675 | </P> | ||
| 676 | <DL> | ||
| 677 | <DT><U>Function:</U> void <B>thread_unblock</B> (struct thread *<VAR>thread</VAR>) | ||
| 678 | <DD>Transitions <VAR>thread</VAR>, which must be in the blocked state, to the | ||
| 679 | ready state, allowing it to resume running (see <A HREF="pintos_5.html#Thread States">Thread States</A>). | ||
| 680 | This is called when the event that the thread is waiting for occurs, | ||
| 681 | e.g. when the lock that | ||
| 682 | the thread is waiting on becomes available. | ||
| 683 | </DL> | ||
| 684 | <P> | ||
| 685 | |||
| 686 | <A NAME="IDX26"></A> | ||
| 687 | </P> | ||
| 688 | <DL> | ||
| 689 | <DT><U>Function:</U> struct thread *<B>thread_current</B> (void) | ||
| 690 | <DD>Returns the running thread. | ||
| 691 | </DL> | ||
| 692 | <P> | ||
| 693 | |||
| 694 | <A NAME="IDX27"></A> | ||
| 695 | </P> | ||
| 696 | <DL> | ||
| 697 | <DT><U>Function:</U> tid_t <B>thread_tid</B> (void) | ||
| 698 | <DD>Returns the running thread's thread id. Equivalent to | ||
| 699 | <CODE>thread_current ()->tid</CODE>. | ||
| 700 | </DL> | ||
| 701 | <P> | ||
| 702 | |||
| 703 | <A NAME="IDX28"></A> | ||
| 704 | </P> | ||
| 705 | <DL> | ||
| 706 | <DT><U>Function:</U> const char *<B>thread_name</B> (void) | ||
| 707 | <DD>Returns the running thread's name. Equivalent to <CODE>thread_current | ||
| 708 | ()->name</CODE>. | ||
| 709 | </DL> | ||
| 710 | <P> | ||
| 711 | |||
| 712 | <A NAME="IDX29"></A> | ||
| 713 | </P> | ||
| 714 | <DL> | ||
| 715 | <DT><U>Function:</U> void <B>thread_exit</B> (void) <CODE>NO_RETURN</CODE> | ||
| 716 | <DD>Causes the current thread to exit. Never returns, hence | ||
| 717 | <CODE>NO_RETURN</CODE> (see section <A HREF="pintos_8.html#SEC99">D.3 Function and Parameter Attributes</A>). | ||
| 718 | </DL> | ||
| 719 | <P> | ||
| 720 | |||
| 721 | <A NAME="IDX30"></A> | ||
| 722 | </P> | ||
| 723 | <DL> | ||
| 724 | <DT><U>Function:</U> void <B>thread_yield</B> (void) | ||
| 725 | <DD>Yields the CPU to the scheduler, which picks a new thread to run. The | ||
| 726 | new thread might be the current thread, so you can't depend on this | ||
| 727 | function to keep this thread from running for any particular length of | ||
| 728 | time. | ||
| 729 | </DL> | ||
| 730 | <P> | ||
| 731 | |||
| 732 | <A NAME="IDX31"></A> | ||
| 733 | </P> | ||
| 734 | <DL> | ||
| 735 | <DT><U>Function:</U> void <B>thread_foreach</B> (thread_action_func *<VAR>action</VAR>, void *<VAR>aux</VAR>) | ||
| 736 | <DD>Iterates over all threads <VAR>t</VAR> and invokes <CODE>action(t, aux)</CODE> on each. | ||
| 737 | <VAR>action</VAR> must refer to a function that matches the signature | ||
| 738 | given by <CODE>thread_action_func()</CODE>: | ||
| 739 | <P> | ||
| 740 | |||
| 741 | <A NAME="IDX32"></A> | ||
| 742 | </P> | ||
| 743 | <DL> | ||
| 744 | <DT><U>Type:</U> <B>void thread_action_func (struct thread *<VAR>thread</VAR>, void *<VAR>aux</VAR>)</B> | ||
| 745 | <DD>Performs some action on a thread, given <VAR>aux</VAR>. | ||
| 746 | </DL> | ||
| 747 | </DL> | ||
| 748 | <P> | ||
| 749 | |||
| 750 | <A NAME="IDX33"></A> | ||
| 751 | </P> | ||
| 752 | <DL> | ||
| 753 | <DT><U>Function:</U> int <B>thread_get_priority</B> (void) | ||
| 754 | <DD><A NAME="IDX34"></A> | ||
| 755 | <DT><U>Function:</U> void <B>thread_set_priority</B> (int <VAR>new_priority</VAR>) | ||
| 756 | <DD>Stub to set and get thread priority. See section <A HREF="pintos_3.html#SEC45">3.2.2 Priority Scheduling</A>. | ||
| 757 | </DL> | ||
| 758 | <P> | ||
| 759 | |||
| 760 | <A NAME="IDX35"></A> | ||
| 761 | </P> | ||
| 762 | <DL> | ||
| 763 | <DT><U>Function:</U> int <B>thread_get_nice</B> (void) | ||
| 764 | <DD><A NAME="IDX36"></A> | ||
| 765 | <DT><U>Function:</U> void <B>thread_set_nice</B> (int <VAR>new_nice</VAR>) | ||
| 766 | <DD><A NAME="IDX37"></A> | ||
| 767 | <DT><U>Function:</U> int <B>thread_get_recent_cpu</B> (void) | ||
| 768 | <DD><A NAME="IDX38"></A> | ||
| 769 | <DT><U>Function:</U> int <B>thread_get_load_avg</B> (void) | ||
| 770 | <DD>Stubs for the advanced scheduler (not used in this course). | ||
| 771 | </DL> | ||
| 772 | <P> | ||
| 773 | |||
| 774 | <A NAME="Thread Switching"></A> | ||
| 775 | <HR SIZE="6"> | ||
| 776 | <A NAME="SEC57"></A> | ||
| 777 | <H3> A.2.3 Thread Switching </H3> | ||
| 778 | <!--docid::SEC57::--> | ||
| 779 | <P> | ||
| 780 | |||
| 781 | <CODE>schedule()</CODE> is responsible for switching threads. It | ||
| 782 | is internal to <Q><TT>threads/thread.c</TT></Q> and called only by the three | ||
| 783 | public thread functions that need to switch threads: | ||
| 784 | <CODE>thread_block()</CODE>, <CODE>thread_exit()</CODE>, and <CODE>thread_yield()</CODE>. | ||
| 785 | Before any of these functions call <CODE>schedule()</CODE>, they disable | ||
| 786 | interrupts (or ensure that they are already disabled) and then change | ||
| 787 | the running thread's state to something other than running. | ||
| 788 | </P> | ||
| 789 | <P> | ||
| 790 | |||
| 791 | <CODE>schedule()</CODE> is short but tricky. It records the | ||
| 792 | current thread in local variable <VAR>cur</VAR>, determines the next thread | ||
| 793 | to run as local variable <VAR>next</VAR> (by calling | ||
| 794 | <CODE>next_thread_to_run()</CODE>), and then calls <CODE>switch_threads()</CODE> to do | ||
| 795 | the actual thread switch. The thread we switched to was also running | ||
| 796 | inside <CODE>switch_threads()</CODE>, as are all the threads not currently | ||
| 797 | running, so the new thread now returns out of | ||
| 798 | <CODE>switch_threads()</CODE>, returning the previously running thread. | ||
| 799 | </P> | ||
| 800 | <P> | ||
| 801 | |||
| 802 | <CODE>switch_threads()</CODE> is an assembly language routine in | ||
| 803 | <Q><TT>threads/switch.S</TT></Q>. It saves registers on the stack, saves the | ||
| 804 | CPU's current stack pointer in the current <CODE>struct thread</CODE>'s <CODE>stack</CODE> | ||
| 805 | member, restores the new thread's <CODE>stack</CODE> into the CPU's stack | ||
| 806 | pointer, restores registers from the stack, and returns. | ||
| 807 | </P> | ||
| 808 | <P> | ||
| 809 | |||
| 810 | The rest of the scheduler is implemented in <CODE>thread_schedule_tail()</CODE>. It | ||
| 811 | marks the new thread as running. If the thread we just switched from | ||
| 812 | is in the dying state, then it also frees the page that contained the | ||
| 813 | dying thread's <CODE>struct thread</CODE> and stack. These couldn't be freed | ||
| 814 | prior to the thread switch because the switch needed to use it. | ||
| 815 | </P> | ||
| 816 | <P> | ||
| 817 | |||
| 818 | Running a thread for the first time is a special case. When | ||
| 819 | <CODE>thread_create()</CODE> creates a new thread, it goes through a fair | ||
| 820 | amount of trouble to get it started properly. In particular, the new | ||
| 821 | thread hasn't started running yet, so there's no way for it to be | ||
| 822 | running inside <CODE>switch_threads()</CODE> as the scheduler expects. To | ||
| 823 | solve the problem, <CODE>thread_create()</CODE> creates some fake stack frames | ||
| 824 | in the new thread's stack: | ||
| 825 | </P> | ||
| 826 | <P> | ||
| 827 | |||
| 828 | <UL> | ||
| 829 | <LI> | ||
| 830 | The topmost fake stack frame is for <CODE>switch_threads()</CODE>, represented | ||
| 831 | by <CODE>struct switch_threads_frame</CODE>. The important part of this frame is | ||
| 832 | its <CODE>eip</CODE> member, the return address. We point <CODE>eip</CODE> to | ||
| 833 | <CODE>switch_entry()</CODE>, indicating it to be the function that called | ||
| 834 | <CODE>switch_entry()</CODE>. | ||
| 835 | <P> | ||
| 836 | |||
| 837 | </P> | ||
| 838 | <LI> | ||
| 839 | The next fake stack frame is for <CODE>switch_entry()</CODE>, an assembly | ||
| 840 | language routine in <Q><TT>threads/switch.S</TT></Q> that adjusts the stack | ||
| 841 | pointer,<A NAME="DOCF3" HREF="pintos_fot.html#FOOT3">(3)</A> | ||
| 842 | calls <CODE>thread_schedule_tail()</CODE> (this special case is why | ||
| 843 | <CODE>thread_schedule_tail()</CODE> is separate from <CODE>schedule()</CODE>), and returns. | ||
| 844 | We fill in its stack frame so that it returns into | ||
| 845 | <CODE>kernel_thread()</CODE>, a function in <Q><TT>threads/thread.c</TT></Q>. | ||
| 846 | <P> | ||
| 847 | |||
| 848 | </P> | ||
| 849 | <LI> | ||
| 850 | The final stack frame is for <CODE>kernel_thread()</CODE>, which enables | ||
| 851 | interrupts and calls the thread's function (the function passed to | ||
| 852 | <CODE>thread_create()</CODE>). If the thread's function returns, it calls | ||
| 853 | <CODE>thread_exit()</CODE> to terminate the thread. | ||
| 854 | </UL> | ||
| 855 | <P> | ||
| 856 | |||
| 857 | <A NAME="Synchronization"></A> | ||
| 858 | <HR SIZE="6"> | ||
| 859 | <A NAME="SEC58"></A> | ||
| 860 | <H2> A.3 Synchronization </H2> | ||
| 861 | <!--docid::SEC58::--> | ||
| 862 | <P> | ||
| 863 | |||
| 864 | If sharing of resources between threads is not handled in a careful, | ||
| 865 | controlled fashion, the result is usually a big mess. | ||
| 866 | This is especially the case in operating system kernels, where | ||
| 867 | faulty sharing can crash the entire machine. Pintos provides several | ||
| 868 | synchronization primitives to help out. | ||
| 869 | </P> | ||
| 870 | <P> | ||
| 871 | |||
| 872 | <A NAME="Disabling Interrupts"></A> | ||
| 873 | <HR SIZE="6"> | ||
| 874 | <A NAME="SEC59"></A> | ||
| 875 | <H3> A.3.1 Disabling Interrupts </H3> | ||
| 876 | <!--docid::SEC59::--> | ||
| 877 | <P> | ||
| 878 | |||
| 879 | The crudest way to do synchronization is to disable interrupts, that | ||
| 880 | is, to temporarily prevent the CPU from responding to interrupts. If | ||
| 881 | interrupts are off, no other thread will preempt the running thread, | ||
| 882 | because thread preemption is driven by the timer interrupt. If | ||
| 883 | interrupts are on, as they normally are, then the running thread may | ||
| 884 | be preempted by another at any time, whether between two C statements | ||
| 885 | or even within the execution of one. | ||
| 886 | </P> | ||
| 887 | <P> | ||
| 888 | |||
| 889 | Incidentally, this means that Pintos is a "preemptible kernel," that | ||
| 890 | is, kernel threads can be preempted at any time. Traditional Unix | ||
| 891 | systems are "nonpreemptible," that is, kernel threads can only be | ||
| 892 | preempted at points where they explicitly call into the scheduler. | ||
| 893 | (User programs can be preempted at any time in both models.) As you | ||
| 894 | might imagine, preemptible kernels require more explicit | ||
| 895 | synchronization. | ||
| 896 | </P> | ||
| 897 | <P> | ||
| 898 | |||
| 899 | You should have little need to set the interrupt state directly. Most | ||
| 900 | of the time you should use the other synchronization primitives | ||
| 901 | described in the following sections. The main reason to disable | ||
| 902 | interrupts is to synchronize kernel threads with external interrupt | ||
| 903 | handlers, which cannot sleep and thus cannot use most other forms of | ||
| 904 | synchronization (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>). | ||
| 905 | </P> | ||
| 906 | <P> | ||
| 907 | |||
| 908 | Some external interrupts cannot be postponed, even by disabling | ||
| 909 | interrupts. These interrupts, called <EM>non-maskable interrupts</EM> | ||
| 910 | (NMIs), are supposed to be used only in emergencies, e.g. when the | ||
| 911 | computer is on fire. Pintos does not handle non-maskable interrupts. | ||
| 912 | </P> | ||
| 913 | <P> | ||
| 914 | |||
| 915 | Types and functions for disabling and enabling interrupts are in | ||
| 916 | <Q><TT>threads/interrupt.h</TT></Q>. | ||
| 917 | </P> | ||
| 918 | <P> | ||
| 919 | |||
| 920 | <A NAME="IDX39"></A> | ||
| 921 | </P> | ||
| 922 | <DL> | ||
| 923 | <DT><U>Type:</U> <B>enum intr_level</B> | ||
| 924 | <DD>One of <CODE>INTR_OFF</CODE> or <CODE>INTR_ON</CODE>, denoting that interrupts are | ||
| 925 | disabled or enabled, respectively. | ||
| 926 | </DL> | ||
| 927 | <P> | ||
| 928 | |||
| 929 | <A NAME="IDX40"></A> | ||
| 930 | </P> | ||
| 931 | <DL> | ||
| 932 | <DT><U>Function:</U> enum intr_level <B>intr_get_level</B> (void) | ||
| 933 | <DD>Returns the current interrupt state. | ||
| 934 | </DL> | ||
| 935 | <P> | ||
| 936 | |||
| 937 | <A NAME="IDX41"></A> | ||
| 938 | </P> | ||
| 939 | <DL> | ||
| 940 | <DT><U>Function:</U> enum intr_level <B>intr_set_level</B> (enum intr_level <VAR>level</VAR>) | ||
| 941 | <DD>Turns interrupts on or off according to <VAR>level</VAR>. Returns the | ||
| 942 | previous interrupt state. | ||
| 943 | </DL> | ||
| 944 | <P> | ||
| 945 | |||
| 946 | <A NAME="IDX42"></A> | ||
| 947 | </P> | ||
| 948 | <DL> | ||
| 949 | <DT><U>Function:</U> enum intr_level <B>intr_enable</B> (void) | ||
| 950 | <DD>Turns interrupts on. Returns the previous interrupt state. | ||
| 951 | </DL> | ||
| 952 | <P> | ||
| 953 | |||
| 954 | <A NAME="IDX43"></A> | ||
| 955 | </P> | ||
| 956 | <DL> | ||
| 957 | <DT><U>Function:</U> enum intr_level <B>intr_disable</B> (void) | ||
| 958 | <DD>Turns interrupts off. Returns the previous interrupt state. | ||
| 959 | </DL> | ||
| 960 | <P> | ||
| 961 | |||
| 962 | <A NAME="Semaphores"></A> | ||
| 963 | <HR SIZE="6"> | ||
| 964 | <A NAME="SEC60"></A> | ||
| 965 | <H3> A.3.2 Semaphores </H3> | ||
| 966 | <!--docid::SEC60::--> | ||
| 967 | <P> | ||
| 968 | |||
| 969 | A <EM>semaphore</EM> is a nonnegative integer together with two operators | ||
| 970 | that manipulate it atomically, which are: | ||
| 971 | </P> | ||
| 972 | <P> | ||
| 973 | |||
| 974 | <UL> | ||
| 975 | <LI> | ||
| 976 | "Down" or "P": wait for the value to become positive, then | ||
| 977 | decrement it. | ||
| 978 | <P> | ||
| 979 | |||
| 980 | </P> | ||
| 981 | <LI> | ||
| 982 | "Up" or "V": increment the value (and wake up one waiting thread, | ||
| 983 | if any). | ||
| 984 | </UL> | ||
| 985 | <P> | ||
| 986 | |||
| 987 | A semaphore initialized to 0 may be used to wait for an event | ||
| 988 | that will happen exactly once. For example, suppose thread <VAR>A</VAR> | ||
| 989 | starts another thread <VAR>B</VAR> and wants to wait for <VAR>B</VAR> to signal | ||
| 990 | that some activity is complete. <VAR>A</VAR> can create a semaphore | ||
| 991 | initialized to 0, pass it to <VAR>B</VAR> as it starts it, and then | ||
| 992 | "down" the semaphore. When <VAR>B</VAR> finishes its activity, it | ||
| 993 | "ups" the semaphore. This works regardless of whether <VAR>A</VAR> | ||
| 994 | "downs" the semaphore or <VAR>B</VAR> "ups" it first. | ||
| 995 | </P> | ||
| 996 | <P> | ||
| 997 | |||
| 998 | A semaphore initialized to 1 is typically used for controlling access | ||
| 999 | to a resource. Before a block of code starts using the resource, it | ||
| 1000 | "downs" the semaphore, then after it is done with the resource it | ||
| 1001 | "ups" the resource. In such a case a lock, described below, may be | ||
| 1002 | more appropriate. | ||
| 1003 | </P> | ||
| 1004 | <P> | ||
| 1005 | |||
| 1006 | Semaphores can also be initialized to values larger than 1. These are | ||
| 1007 | rarely used. | ||
| 1008 | </P> | ||
| 1009 | <P> | ||
| 1010 | |||
| 1011 | Semaphores were invented by Edsger Dijkstra and first used in the THE | ||
| 1012 | operating system ([ <A HREF="pintos_10.html#Dijkstra">Dijkstra</A>]). | ||
| 1013 | </P> | ||
| 1014 | <P> | ||
| 1015 | |||
| 1016 | Pintos' semaphore type and operations are declared in | ||
| 1017 | <Q><TT>threads/synch.h</TT></Q>. | ||
| 1018 | </P> | ||
| 1019 | <P> | ||
| 1020 | |||
| 1021 | <A NAME="IDX44"></A> | ||
| 1022 | </P> | ||
| 1023 | <DL> | ||
| 1024 | <DT><U>Type:</U> <B>struct semaphore</B> | ||
| 1025 | <DD>Represents a semaphore. | ||
| 1026 | </DL> | ||
| 1027 | <P> | ||
| 1028 | |||
| 1029 | <A NAME="IDX45"></A> | ||
| 1030 | </P> | ||
| 1031 | <DL> | ||
| 1032 | <DT><U>Function:</U> void <B>sema_init</B> (struct semaphore *<VAR>sema</VAR>, unsigned <VAR>value</VAR>) | ||
| 1033 | <DD>Initializes <VAR>sema</VAR> as a new semaphore with the given initial | ||
| 1034 | <VAR>value</VAR>. | ||
| 1035 | </DL> | ||
| 1036 | <P> | ||
| 1037 | |||
| 1038 | <A NAME="IDX46"></A> | ||
| 1039 | </P> | ||
| 1040 | <DL> | ||
| 1041 | <DT><U>Function:</U> void <B>sema_down</B> (struct semaphore *<VAR>sema</VAR>) | ||
| 1042 | <DD>Executes the "down" or "P" operation on <VAR>sema</VAR>, waiting for | ||
| 1043 | its value to become positive and then decrementing it by one. | ||
| 1044 | </DL> | ||
| 1045 | <P> | ||
| 1046 | |||
| 1047 | <A NAME="IDX47"></A> | ||
| 1048 | </P> | ||
| 1049 | <DL> | ||
| 1050 | <DT><U>Function:</U> bool <B>sema_try_down</B> (struct semaphore *<VAR>sema</VAR>) | ||
| 1051 | <DD>Tries to execute the "down" or "P" operation on <VAR>sema</VAR>, | ||
| 1052 | without waiting. Returns true if <VAR>sema</VAR> | ||
| 1053 | was successfully decremented, or false if it was already | ||
| 1054 | zero and thus could not be decremented without waiting. Calling this | ||
| 1055 | function in a | ||
| 1056 | tight loop wastes CPU time, so use <CODE>sema_down()</CODE> or find a | ||
| 1057 | different approach instead. | ||
| 1058 | </DL> | ||
| 1059 | <P> | ||
| 1060 | |||
| 1061 | <A NAME="IDX48"></A> | ||
| 1062 | </P> | ||
| 1063 | <DL> | ||
| 1064 | <DT><U>Function:</U> void <B>sema_up</B> (struct semaphore *<VAR>sema</VAR>) | ||
| 1065 | <DD>Executes the "up" or "V" operation on <VAR>sema</VAR>, | ||
| 1066 | incrementing its value. If any threads are waiting on | ||
| 1067 | <VAR>sema</VAR>, wakes one of them up. | ||
| 1068 | <P> | ||
| 1069 | |||
| 1070 | Unlike most synchronization primitives, <CODE>sema_up()</CODE> may be called | ||
| 1071 | inside an external interrupt handler (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>). | ||
| 1072 | </P> | ||
| 1073 | </DL> | ||
| 1074 | <P> | ||
| 1075 | |||
| 1076 | Semaphores are internally built out of disabling interrupt | ||
| 1077 | (see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>) and thread blocking and unblocking | ||
| 1078 | (<CODE>thread_block()</CODE> and <CODE>thread_unblock()</CODE>). Each semaphore maintains | ||
| 1079 | a list of waiting threads, using the linked list | ||
| 1080 | implementation in <Q><TT>lib/kernel/list.c</TT></Q>. | ||
| 1081 | </P> | ||
| 1082 | <P> | ||
| 1083 | |||
| 1084 | <A NAME="Locks"></A> | ||
| 1085 | <HR SIZE="6"> | ||
| 1086 | <A NAME="SEC61"></A> | ||
| 1087 | <H3> A.3.3 Locks </H3> | ||
| 1088 | <!--docid::SEC61::--> | ||
| 1089 | <P> | ||
| 1090 | |||
| 1091 | A <EM>lock</EM> is like a semaphore with an initial value of 1 | ||
| 1092 | (see section <A HREF="pintos_5.html#SEC60">A.3.2 Semaphores</A>). A lock's equivalent of "up" is called | ||
| 1093 | "release", and the "down" operation is called "acquire". | ||
| 1094 | </P> | ||
| 1095 | <P> | ||
| 1096 | |||
| 1097 | Compared to a semaphore, a lock has one added restriction: only the | ||
| 1098 | thread that acquires a lock, called the lock's "owner", is allowed to | ||
| 1099 | release it. If this restriction is a problem, it's a good sign that a | ||
| 1100 | semaphore should be used, instead of a lock. | ||
| 1101 | </P> | ||
| 1102 | <P> | ||
| 1103 | |||
| 1104 | Locks in Pintos are not "recursive," that is, it is an error for the | ||
| 1105 | thread currently holding a lock to try to acquire that lock. | ||
| 1106 | </P> | ||
| 1107 | <P> | ||
| 1108 | |||
| 1109 | Lock types and functions are declared in <Q><TT>threads/synch.h</TT></Q>. | ||
| 1110 | </P> | ||
| 1111 | <P> | ||
| 1112 | |||
| 1113 | <A NAME="IDX49"></A> | ||
| 1114 | </P> | ||
| 1115 | <DL> | ||
| 1116 | <DT><U>Type:</U> <B>struct lock</B> | ||
| 1117 | <DD>Represents a lock. | ||
| 1118 | </DL> | ||
| 1119 | <P> | ||
| 1120 | |||
| 1121 | <A NAME="IDX50"></A> | ||
| 1122 | </P> | ||
| 1123 | <DL> | ||
| 1124 | <DT><U>Function:</U> void <B>lock_init</B> (struct lock *<VAR>lock</VAR>) | ||
| 1125 | <DD>Initializes <VAR>lock</VAR> as a new lock. | ||
| 1126 | The lock is not initially owned by any thread. | ||
| 1127 | </DL> | ||
| 1128 | <P> | ||
| 1129 | |||
| 1130 | <A NAME="IDX51"></A> | ||
| 1131 | </P> | ||
| 1132 | <DL> | ||
| 1133 | <DT><U>Function:</U> void <B>lock_acquire</B> (struct lock *<VAR>lock</VAR>) | ||
| 1134 | <DD>Acquires <VAR>lock</VAR> for the current thread, first waiting for | ||
| 1135 | any current owner to release it if necessary. | ||
| 1136 | </DL> | ||
| 1137 | <P> | ||
| 1138 | |||
| 1139 | <A NAME="IDX52"></A> | ||
| 1140 | </P> | ||
| 1141 | <DL> | ||
| 1142 | <DT><U>Function:</U> bool <B>lock_try_acquire</B> (struct lock *<VAR>lock</VAR>) | ||
| 1143 | <DD>Tries to acquire <VAR>lock</VAR> for use by the current thread, without | ||
| 1144 | waiting. Returns true if successful, false if the lock is already | ||
| 1145 | owned. Calling this function in a tight loop is a bad idea because it | ||
| 1146 | wastes CPU time, so use <CODE>lock_acquire()</CODE> instead. | ||
| 1147 | </DL> | ||
| 1148 | <P> | ||
| 1149 | |||
| 1150 | <A NAME="IDX53"></A> | ||
| 1151 | </P> | ||
| 1152 | <DL> | ||
| 1153 | <DT><U>Function:</U> void <B>lock_release</B> (struct lock *<VAR>lock</VAR>) | ||
| 1154 | <DD>Releases <VAR>lock</VAR>, which the current thread must own. | ||
| 1155 | </DL> | ||
| 1156 | <P> | ||
| 1157 | |||
| 1158 | <A NAME="IDX54"></A> | ||
| 1159 | </P> | ||
| 1160 | <DL> | ||
| 1161 | <DT><U>Function:</U> bool <B>lock_held_by_current_thread</B> (const struct lock *<VAR>lock</VAR>) | ||
| 1162 | <DD>Returns true if the running thread owns <VAR>lock</VAR>, | ||
| 1163 | false otherwise. | ||
| 1164 | There is no function to test whether an arbitrary thread owns a lock, | ||
| 1165 | because the answer could change before the caller could act on it. | ||
| 1166 | </DL> | ||
| 1167 | <P> | ||
| 1168 | |||
| 1169 | <A NAME="Monitors"></A> | ||
| 1170 | <HR SIZE="6"> | ||
| 1171 | <A NAME="SEC62"></A> | ||
| 1172 | <H3> A.3.4 Monitors </H3> | ||
| 1173 | <!--docid::SEC62::--> | ||
| 1174 | <P> | ||
| 1175 | |||
| 1176 | A <EM>monitor</EM> is a higher-level form of synchronization than a | ||
| 1177 | semaphore or a lock. A monitor consists of data being synchronized, | ||
| 1178 | plus a lock, called the <EM>monitor lock</EM>, and one or more | ||
| 1179 | <EM>condition variables</EM>. Before it accesses the protected data, a | ||
| 1180 | thread first acquires the monitor lock. It is then said to be "in the | ||
| 1181 | monitor". While in the monitor, the thread has control over all the | ||
| 1182 | protected data, which it may freely examine or modify. When access to | ||
| 1183 | the protected data is complete, it releases the monitor lock. | ||
| 1184 | </P> | ||
| 1185 | <P> | ||
| 1186 | |||
| 1187 | Condition variables allow code in the monitor to wait for a condition to | ||
| 1188 | become true. Each condition variable is associated with an abstract | ||
| 1189 | condition, e.g. "some data has arrived for processing" or "over 10 | ||
| 1190 | seconds has passed since the user's last keystroke". When code in the | ||
| 1191 | monitor needs to wait for a condition to become true, it "waits" on | ||
| 1192 | the associated condition variable, which releases the lock and waits for | ||
| 1193 | the condition to be signaled. If, on the other hand, it has caused one | ||
| 1194 | of these conditions to become true, it "signals" the condition to wake | ||
| 1195 | up one waiter, or "broadcasts" the condition to wake all of them. | ||
| 1196 | </P> | ||
| 1197 | <P> | ||
| 1198 | |||
| 1199 | The theoretical framework for monitors was laid out by C. A. R. | ||
| 1200 | Hoare ([ <A HREF="pintos_10.html#Hoare">Hoare</A>]). Their practical usage was later elaborated in a | ||
| 1201 | paper on the Mesa operating system ([ <A HREF="pintos_10.html#Lampson">Lampson</A>]). | ||
| 1202 | </P> | ||
| 1203 | <P> | ||
| 1204 | |||
| 1205 | Condition variable types and functions are declared in | ||
| 1206 | <Q><TT>threads/synch.h</TT></Q>. | ||
| 1207 | </P> | ||
| 1208 | <P> | ||
| 1209 | |||
| 1210 | <A NAME="IDX55"></A> | ||
| 1211 | </P> | ||
| 1212 | <DL> | ||
| 1213 | <DT><U>Type:</U> <B>struct condition</B> | ||
| 1214 | <DD>Represents a condition variable. | ||
| 1215 | </DL> | ||
| 1216 | <P> | ||
| 1217 | |||
| 1218 | <A NAME="IDX56"></A> | ||
| 1219 | </P> | ||
| 1220 | <DL> | ||
| 1221 | <DT><U>Function:</U> void <B>cond_init</B> (struct condition *<VAR>cond</VAR>) | ||
| 1222 | <DD>Initializes <VAR>cond</VAR> as a new condition variable. | ||
| 1223 | </DL> | ||
| 1224 | <P> | ||
| 1225 | |||
| 1226 | <A NAME="IDX57"></A> | ||
| 1227 | </P> | ||
| 1228 | <DL> | ||
| 1229 | <DT><U>Function:</U> void <B>cond_wait</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>) | ||
| 1230 | <DD>Atomically releases <VAR>lock</VAR> (the monitor lock) and waits for | ||
| 1231 | <VAR>cond</VAR> to be signaled by some other piece of code. After | ||
| 1232 | <VAR>cond</VAR> is signaled, reacquires <VAR>lock</VAR> before returning. | ||
| 1233 | <VAR>lock</VAR> must be held before calling this function. | ||
| 1234 | <P> | ||
| 1235 | |||
| 1236 | Sending a signal and waking up from a wait are not an atomic operation. | ||
| 1237 | Thus, typically <CODE>cond_wait()</CODE>'s caller must recheck the condition | ||
| 1238 | after the wait completes and, if necessary, wait again. See the next | ||
| 1239 | section for an example. | ||
| 1240 | </P> | ||
| 1241 | </DL> | ||
| 1242 | <P> | ||
| 1243 | |||
| 1244 | <A NAME="IDX58"></A> | ||
| 1245 | </P> | ||
| 1246 | <DL> | ||
| 1247 | <DT><U>Function:</U> void <B>cond_signal</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>) | ||
| 1248 | <DD>If any threads are waiting on <VAR>cond</VAR> (protected by monitor lock | ||
| 1249 | <VAR>lock</VAR>), then this function wakes up one of them. If no threads are | ||
| 1250 | waiting, returns without performing any action. | ||
| 1251 | <VAR>lock</VAR> must be held before calling this function. | ||
| 1252 | </DL> | ||
| 1253 | <P> | ||
| 1254 | |||
| 1255 | <A NAME="IDX59"></A> | ||
| 1256 | </P> | ||
| 1257 | <DL> | ||
| 1258 | <DT><U>Function:</U> void <B>cond_broadcast</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>) | ||
| 1259 | <DD>Wakes up all threads, if any, waiting on <VAR>cond</VAR> (protected by | ||
| 1260 | monitor lock <VAR>lock</VAR>). <VAR>lock</VAR> must be held before calling this | ||
| 1261 | function. | ||
| 1262 | </DL> | ||
| 1263 | <P> | ||
| 1264 | |||
| 1265 | <HR SIZE="6"> | ||
| 1266 | <A NAME="SEC63"></A> | ||
| 1267 | <H4> A.3.4.1 Monitor Example </H4> | ||
| 1268 | <!--docid::SEC63::--> | ||
| 1269 | <P> | ||
| 1270 | |||
| 1271 | The classical example of a monitor is handling a buffer into which one | ||
| 1272 | or more | ||
| 1273 | "producer" threads write characters and out of which one or more | ||
| 1274 | "consumer" threads read characters. To implement this we need, | ||
| 1275 | besides the monitor lock, two condition variables which we will call | ||
| 1276 | <VAR>not_full</VAR> and <VAR>not_empty</VAR>: | ||
| 1277 | </P> | ||
| 1278 | <P> | ||
| 1279 | |||
| 1280 | <TABLE><tr><td> </td><td class=example><pre>char buf[BUF_SIZE]; /* Buffer. */ | ||
| 1281 | size_t n = 0; /* 0 <= n <= <VAR>BUF_SIZE</VAR>: # of characters in buffer. */ | ||
| 1282 | size_t head = 0; /* <VAR>buf</VAR> index of next char to write (mod <VAR>BUF_SIZE</VAR>). */ | ||
| 1283 | size_t tail = 0; /* <VAR>buf</VAR> index of next char to read (mod <VAR>BUF_SIZE</VAR>). */ | ||
| 1284 | struct lock lock; /* Monitor lock. */ | ||
| 1285 | struct condition not_empty; /* Signaled when the buffer is not empty. */ | ||
| 1286 | struct condition not_full; /* Signaled when the buffer is not full. */ | ||
| 1287 | |||
| 1288 | <small>...</small>initialize the locks and condition variables<small>...</small> | ||
| 1289 | |||
| 1290 | void put (char ch) { | ||
| 1291 | lock_acquire (&lock); | ||
| 1292 | while (n == BUF_SIZE) /* Can't add to <VAR>buf</VAR> as long as it's full. */ | ||
| 1293 | cond_wait (&not_full, &lock); | ||
| 1294 | buf[head++ % BUF_SIZE] = ch; /* Add <VAR>ch</VAR> to <VAR>buf</VAR>. */ | ||
| 1295 | n++; | ||
| 1296 | cond_signal (&not_empty, &lock); /* <VAR>buf</VAR> can't be empty anymore. */ | ||
| 1297 | lock_release (&lock); | ||
| 1298 | } | ||
| 1299 | |||
| 1300 | char get (void) { | ||
| 1301 | char ch; | ||
| 1302 | lock_acquire (&lock); | ||
| 1303 | while (n == 0) /* Can't read <VAR>buf</VAR> as long as it's empty. */ | ||
| 1304 | cond_wait (&not_empty, &lock); | ||
| 1305 | ch = buf[tail++ % BUF_SIZE]; /* Get <VAR>ch</VAR> from <VAR>buf</VAR>. */ | ||
| 1306 | n--; | ||
| 1307 | cond_signal (&not_full, &lock); /* <VAR>buf</VAR> can't be full anymore. */ | ||
| 1308 | lock_release (&lock); | ||
| 1309 | } | ||
| 1310 | </pre></td></tr></table><P> | ||
| 1311 | |||
| 1312 | Note that <CODE>BUF_SIZE</CODE> must divide evenly into <CODE>SIZE_MAX + 1</CODE> | ||
| 1313 | for the above code to be completely correct. Otherwise, it will fail | ||
| 1314 | the first time <CODE>head</CODE> wraps around to 0. In practice, | ||
| 1315 | <CODE>BUF_SIZE</CODE> would ordinarily be a power of 2. | ||
| 1316 | </P> | ||
| 1317 | <P> | ||
| 1318 | |||
| 1319 | <A NAME="Optimization Barriers"></A> | ||
| 1320 | <HR SIZE="6"> | ||
| 1321 | <A NAME="SEC64"></A> | ||
| 1322 | <H3> A.3.5 Optimization Barriers </H3> | ||
| 1323 | <!--docid::SEC64::--> | ||
| 1324 | <P> | ||
| 1325 | |||
| 1326 | An <EM>optimization barrier</EM> is a special statement that prevents the | ||
| 1327 | compiler from making assumptions about the state of memory across the | ||
| 1328 | barrier. The compiler will not reorder reads or writes of variables | ||
| 1329 | across the barrier or assume that a variable's value is unmodified | ||
| 1330 | across the barrier, except for local variables whose address is never | ||
| 1331 | taken. In Pintos, <Q><TT>threads/synch.h</TT></Q> defines the <CODE>barrier()</CODE> | ||
| 1332 | macro as an optimization barrier. | ||
| 1333 | </P> | ||
| 1334 | <P> | ||
| 1335 | |||
| 1336 | One reason to use an optimization barrier is when data can change | ||
| 1337 | asynchronously, without the compiler's knowledge, e.g. by another | ||
| 1338 | thread or an interrupt handler. The <CODE>too_many_loops()</CODE> function in | ||
| 1339 | <Q><TT>devices/timer.c</TT></Q> is an example. This function starts out by | ||
| 1340 | busy-waiting in a loop until a timer tick occurs: | ||
| 1341 | </P> | ||
| 1342 | <P> | ||
| 1343 | |||
| 1344 | <TABLE><tr><td> </td><td class=example><pre>/* Wait for a timer tick. */ | ||
| 1345 | int64_t start = ticks; | ||
| 1346 | while (ticks == start) | ||
| 1347 | barrier (); | ||
| 1348 | </pre></td></tr></table><P> | ||
| 1349 | |||
| 1350 | Without an optimization barrier in the loop, the compiler could | ||
| 1351 | conclude that the loop would never terminate, because <CODE>start</CODE> and | ||
| 1352 | <CODE>ticks</CODE> start out equal and the loop itself never changes them. | ||
| 1353 | It could then "optimize" the function into an infinite loop, which | ||
| 1354 | would definitely be undesirable. | ||
| 1355 | </P> | ||
| 1356 | <P> | ||
| 1357 | |||
| 1358 | Optimization barriers can be used to avoid other compiler | ||
| 1359 | optimizations. The <CODE>busy_wait()</CODE> function, also in | ||
| 1360 | <Q><TT>devices/timer.c</TT></Q>, is an example. It contains this loop: | ||
| 1361 | </P> | ||
| 1362 | <P> | ||
| 1363 | |||
| 1364 | <TABLE><tr><td> </td><td class=example><pre>while (loops-- > 0) | ||
| 1365 | barrier (); | ||
| 1366 | </pre></td></tr></table><P> | ||
| 1367 | |||
| 1368 | The goal of this loop is to busy-wait by counting <CODE>loops</CODE> down | ||
| 1369 | from its original value to 0. Without the barrier, the compiler could | ||
| 1370 | delete the loop entirely, because it produces no useful output and has | ||
| 1371 | no side effects. The barrier forces the compiler to pretend that the | ||
| 1372 | loop body has an important effect. | ||
| 1373 | </P> | ||
| 1374 | <P> | ||
| 1375 | |||
| 1376 | Finally, optimization barriers can be used to force the ordering of | ||
| 1377 | memory reads or writes. For example, suppose we add a "feature" | ||
| 1378 | that, whenever a timer interrupt occurs, the character in global | ||
| 1379 | variable <CODE>timer_put_char</CODE> is printed on the console, but only if | ||
| 1380 | global Boolean variable <CODE>timer_do_put</CODE> is true. The best way to | ||
| 1381 | set up <Q><SAMP>x</SAMP></Q> to be printed is then to use an optimization barrier, | ||
| 1382 | like this: | ||
| 1383 | </P> | ||
| 1384 | <P> | ||
| 1385 | |||
| 1386 | <TABLE><tr><td> </td><td class=example><pre>timer_put_char = 'x'; | ||
| 1387 | barrier (); | ||
| 1388 | timer_do_put = true; | ||
| 1389 | </pre></td></tr></table><P> | ||
| 1390 | |||
| 1391 | Without the barrier, the code is buggy because the compiler is free to | ||
| 1392 | reorder operations when it doesn't see a reason to keep them in the | ||
| 1393 | same order. In this case, the compiler doesn't know that the order of | ||
| 1394 | assignments is important, so its optimizer is permitted to exchange | ||
| 1395 | their order. There's no telling whether it will actually do this, and | ||
| 1396 | it is possible that passing the compiler different optimization flags | ||
| 1397 | or using a different version of the compiler will produce different | ||
| 1398 | behavior. | ||
| 1399 | </P> | ||
| 1400 | <P> | ||
| 1401 | |||
| 1402 | Another solution is to disable interrupts around the assignments. | ||
| 1403 | This does not prevent reordering, but it prevents the interrupt | ||
| 1404 | handler from intervening between the assignments. It also has the | ||
| 1405 | extra runtime cost of disabling and re-enabling interrupts: | ||
| 1406 | </P> | ||
| 1407 | <P> | ||
| 1408 | |||
| 1409 | <TABLE><tr><td> </td><td class=example><pre>enum intr_level old_level = intr_disable (); | ||
| 1410 | timer_put_char = 'x'; | ||
| 1411 | timer_do_put = true; | ||
| 1412 | intr_set_level (old_level); | ||
| 1413 | </pre></td></tr></table><P> | ||
| 1414 | |||
| 1415 | A second solution is to mark the declarations of | ||
| 1416 | <CODE>timer_put_char</CODE> and <CODE>timer_do_put</CODE> as <Q><SAMP>volatile</SAMP></Q>. This | ||
| 1417 | keyword tells the compiler that the variables are externally observable | ||
| 1418 | and restricts its latitude for optimization. However, the semantics of | ||
| 1419 | <Q><SAMP>volatile</SAMP></Q> are not well-defined, so it is not a good general | ||
| 1420 | solution. The base Pintos code does not use <Q><SAMP>volatile</SAMP></Q> at all. | ||
| 1421 | </P> | ||
| 1422 | <P> | ||
| 1423 | |||
| 1424 | The following is <EM>not</EM> a solution, because locks neither prevent | ||
| 1425 | interrupts nor prevent the compiler from reordering the code within the | ||
| 1426 | region where the lock is held: | ||
| 1427 | </P> | ||
| 1428 | <P> | ||
| 1429 | |||
| 1430 | <TABLE><tr><td> </td><td class=example><pre>lock_acquire (&timer_lock); /* INCORRECT CODE */ | ||
| 1431 | timer_put_char = 'x'; | ||
| 1432 | timer_do_put = true; | ||
| 1433 | lock_release (&timer_lock); | ||
| 1434 | </pre></td></tr></table><P> | ||
| 1435 | |||
| 1436 | The compiler treats invocation of any function defined externally, | ||
| 1437 | that is, in another source file, as a limited form of optimization | ||
| 1438 | barrier. Specifically, the compiler assumes that any externally | ||
| 1439 | defined function may access any statically or dynamically allocated | ||
| 1440 | data and any local variable whose address is taken. This often means | ||
| 1441 | that explicit barriers can be omitted. It is one reason that Pintos | ||
| 1442 | contains few explicit barriers. | ||
| 1443 | </P> | ||
| 1444 | <P> | ||
| 1445 | |||
| 1446 | A function defined in the same source file, or in a header included by | ||
| 1447 | the source file, cannot be relied upon as a optimization barrier. | ||
| 1448 | This applies even to invocation of a function before its | ||
| 1449 | definition, because the compiler may read and parse the entire source | ||
| 1450 | file before performing optimization. | ||
| 1451 | </P> | ||
| 1452 | <P> | ||
| 1453 | |||
| 1454 | <A NAME="Interrupt Handling"></A> | ||
| 1455 | <HR SIZE="6"> | ||
| 1456 | <A NAME="SEC65"></A> | ||
| 1457 | <H2> A.4 Interrupt Handling </H2> | ||
| 1458 | <!--docid::SEC65::--> | ||
| 1459 | <P> | ||
| 1460 | |||
| 1461 | An <EM>interrupt</EM> notifies the CPU of some event. Much of the work | ||
| 1462 | of an operating system relates to interrupts in one way or another. | ||
| 1463 | For our purposes, we classify interrupts into two broad categories: | ||
| 1464 | </P> | ||
| 1465 | <P> | ||
| 1466 | |||
| 1467 | <UL> | ||
| 1468 | <LI> | ||
| 1469 | <EM>Internal interrupts</EM>, that is, interrupts caused directly by CPU | ||
| 1470 | instructions. System calls, attempts at invalid memory access | ||
| 1471 | (<EM>page faults</EM>), and attempts to divide by zero are some activities | ||
| 1472 | that cause internal interrupts. Because they are caused by CPU | ||
| 1473 | instructions, internal interrupts are <EM>synchronous</EM> or synchronized | ||
| 1474 | with CPU instructions. <CODE>intr_disable()</CODE> does not disable internal | ||
| 1475 | interrupts. | ||
| 1476 | <P> | ||
| 1477 | |||
| 1478 | </P> | ||
| 1479 | <LI> | ||
| 1480 | <EM>External interrupts</EM>, that is, interrupts originating outside the | ||
| 1481 | CPU. These interrupts come from hardware devices such as the system | ||
| 1482 | timer, keyboard, serial ports, and disks. External interrupts are | ||
| 1483 | <EM>asynchronous</EM>, meaning that their delivery is not | ||
| 1484 | synchronized with instruction execution. Handling of external interrupts | ||
| 1485 | can be postponed with <CODE>intr_disable()</CODE> and related functions | ||
| 1486 | (see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>). | ||
| 1487 | </UL> | ||
| 1488 | <P> | ||
| 1489 | |||
| 1490 | The CPU treats both classes of interrupts largely the same way, | ||
| 1491 | so Pintos has common infrastructure to handle both classes. | ||
| 1492 | The following section describes this | ||
| 1493 | common infrastructure. The sections after that give the specifics of | ||
| 1494 | external and internal interrupts. | ||
| 1495 | </P> | ||
| 1496 | <P> | ||
| 1497 | |||
| 1498 | If you haven't already read chapter 3, "Basic Execution Environment," | ||
| 1499 | in [ <A HREF="pintos_10.html#IA32-v1">IA32-v1</A>], it is recommended that you do so now. You might | ||
| 1500 | also want to skim chapter 5, "Interrupt and Exception Handling," in | ||
| 1501 | [ <A HREF="pintos_10.html#IA32-v3a">IA32-v3a</A>]. | ||
| 1502 | </P> | ||
| 1503 | <P> | ||
| 1504 | |||
| 1505 | <A NAME="Interrupt Infrastructure"></A> | ||
| 1506 | <HR SIZE="6"> | ||
| 1507 | <A NAME="SEC66"></A> | ||
| 1508 | <H3> A.4.1 Interrupt Infrastructure </H3> | ||
| 1509 | <!--docid::SEC66::--> | ||
| 1510 | <P> | ||
| 1511 | |||
| 1512 | When an interrupt occurs, the CPU saves | ||
| 1513 | its most essential state on a stack and jumps to an interrupt | ||
| 1514 | handler routine. The 80<VAR>x</VAR>86 architecture supports 256 | ||
| 1515 | interrupts, numbered 0 through 255, each with an independent | ||
| 1516 | handler defined in an array called the <EM>interrupt | ||
| 1517 | descriptor table</EM> or IDT. | ||
| 1518 | </P> | ||
| 1519 | <P> | ||
| 1520 | |||
| 1521 | In Pintos, <CODE>intr_init()</CODE> in <Q><TT>threads/interrupt.c</TT></Q> sets up the | ||
| 1522 | IDT so that each entry points to a unique entry point in | ||
| 1523 | <Q><TT>threads/intr-stubs.S</TT></Q> named <CODE>intr<VAR>NN</VAR>_stub()</CODE>, where | ||
| 1524 | <VAR>NN</VAR> is the interrupt number in | ||
| 1525 | hexadecimal. Because the CPU doesn't give | ||
| 1526 | us any other way to find out the interrupt number, this entry point | ||
| 1527 | pushes the interrupt number on the stack. Then it jumps to | ||
| 1528 | <CODE>intr_entry()</CODE>, which pushes all the registers that the processor | ||
| 1529 | didn't already push for us, and then calls <CODE>intr_handler()</CODE>, which | ||
| 1530 | brings us back into C in <Q><TT>threads/interrupt.c</TT></Q>. | ||
| 1531 | </P> | ||
| 1532 | <P> | ||
| 1533 | |||
| 1534 | The main job of <CODE>intr_handler()</CODE> is to call the function | ||
| 1535 | registered for handling the particular interrupt. (If no | ||
| 1536 | function is registered, it dumps some information to the console and | ||
| 1537 | panics.) It also does some extra processing for external | ||
| 1538 | interrupts (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>). | ||
| 1539 | </P> | ||
| 1540 | <P> | ||
| 1541 | |||
| 1542 | When <CODE>intr_handler()</CODE> returns, the assembly code in | ||
| 1543 | <Q><TT>threads/intr-stubs.S</TT></Q> restores all the CPU registers saved | ||
| 1544 | earlier and directs the CPU to return from the interrupt. | ||
| 1545 | </P> | ||
| 1546 | <P> | ||
| 1547 | |||
| 1548 | The following types and functions are common to all | ||
| 1549 | interrupts. | ||
| 1550 | </P> | ||
| 1551 | <P> | ||
| 1552 | |||
| 1553 | <A NAME="IDX60"></A> | ||
| 1554 | </P> | ||
| 1555 | <DL> | ||
| 1556 | <DT><U>Type:</U> <B>void intr_handler_func (struct intr_frame *<VAR>frame</VAR>)</B> | ||
| 1557 | <DD>This is how an interrupt handler function must be declared. Its <VAR>frame</VAR> | ||
| 1558 | argument (see below) allows it to determine the cause of the interrupt | ||
| 1559 | and the state of the thread that was interrupted. | ||
| 1560 | </DL> | ||
| 1561 | <P> | ||
| 1562 | |||
| 1563 | <A NAME="IDX61"></A> | ||
| 1564 | </P> | ||
| 1565 | <DL> | ||
| 1566 | <DT><U>Type:</U> <B>struct intr_frame</B> | ||
| 1567 | <DD>The stack frame of an interrupt handler, as saved by the CPU, the interrupt | ||
| 1568 | stubs, and <CODE>intr_entry()</CODE>. Its most interesting members are described | ||
| 1569 | below. | ||
| 1570 | </DL> | ||
| 1571 | <P> | ||
| 1572 | |||
| 1573 | <A NAME="IDX62"></A> | ||
| 1574 | </P> | ||
| 1575 | <DL> | ||
| 1576 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>edi</B> | ||
| 1577 | <DD><A NAME="IDX63"></A> | ||
| 1578 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>esi</B> | ||
| 1579 | <DD><A NAME="IDX64"></A> | ||
| 1580 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ebp</B> | ||
| 1581 | <DD><A NAME="IDX65"></A> | ||
| 1582 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>esp_dummy</B> | ||
| 1583 | <DD><A NAME="IDX66"></A> | ||
| 1584 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ebx</B> | ||
| 1585 | <DD><A NAME="IDX67"></A> | ||
| 1586 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>edx</B> | ||
| 1587 | <DD><A NAME="IDX68"></A> | ||
| 1588 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ecx</B> | ||
| 1589 | <DD><A NAME="IDX69"></A> | ||
| 1590 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>eax</B> | ||
| 1591 | <DD><A NAME="IDX70"></A> | ||
| 1592 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint16_t <B>es</B> | ||
| 1593 | <DD><A NAME="IDX71"></A> | ||
| 1594 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint16_t <B>ds</B> | ||
| 1595 | <DD>Register values in the interrupted thread, pushed by <CODE>intr_entry()</CODE>. | ||
| 1596 | The <CODE>esp_dummy</CODE> value isn't actually used (refer to the | ||
| 1597 | description of <CODE>PUSHA</CODE> in [ <A HREF="pintos_10.html#IA32-v2b">IA32-v2b</A>] for details). | ||
| 1598 | </DL> | ||
| 1599 | <P> | ||
| 1600 | |||
| 1601 | <A NAME="IDX72"></A> | ||
| 1602 | </P> | ||
| 1603 | <DL> | ||
| 1604 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>vec_no</B> | ||
| 1605 | <DD>The interrupt vector number, ranging from 0 to 255. | ||
| 1606 | </DL> | ||
| 1607 | <P> | ||
| 1608 | |||
| 1609 | <A NAME="IDX73"></A> | ||
| 1610 | </P> | ||
| 1611 | <DL> | ||
| 1612 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>error_code</B> | ||
| 1613 | <DD>The "error code" pushed on the stack by the CPU for some internal | ||
| 1614 | interrupts. | ||
| 1615 | </DL> | ||
| 1616 | <P> | ||
| 1617 | |||
| 1618 | <A NAME="IDX74"></A> | ||
| 1619 | </P> | ||
| 1620 | <DL> | ||
| 1621 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> void <B>(*eip)</B> (void) | ||
| 1622 | <DD>The address of the next instruction to be executed by the interrupted | ||
| 1623 | thread. | ||
| 1624 | </DL> | ||
| 1625 | <P> | ||
| 1626 | |||
| 1627 | <A NAME="IDX75"></A> | ||
| 1628 | </P> | ||
| 1629 | <DL> | ||
| 1630 | <DT><U>Member of <CODE>struct intr_frame</CODE>:</U> void *<B>esp</B> | ||
| 1631 | <DD>The interrupted thread's stack pointer. | ||
| 1632 | </DL> | ||
| 1633 | <P> | ||
| 1634 | |||
| 1635 | <A NAME="IDX76"></A> | ||
| 1636 | </P> | ||
| 1637 | <DL> | ||
| 1638 | <DT><U>Function:</U> const char *<B>intr_name</B> (uint8_t <VAR>vec</VAR>) | ||
| 1639 | <DD>Returns the name of the interrupt numbered <VAR>vec</VAR>, or | ||
| 1640 | <CODE>"unknown"</CODE> if the interrupt has no registered name. | ||
| 1641 | </DL> | ||
| 1642 | <P> | ||
| 1643 | |||
| 1644 | <A NAME="Internal Interrupt Handling"></A> | ||
| 1645 | <HR SIZE="6"> | ||
| 1646 | <A NAME="SEC67"></A> | ||
| 1647 | <H3> A.4.2 Internal Interrupt Handling </H3> | ||
| 1648 | <!--docid::SEC67::--> | ||
| 1649 | <P> | ||
| 1650 | |||
| 1651 | Internal interrupts are caused directly by CPU instructions executed by | ||
| 1652 | the running kernel thread or user process (from project 2 onward). An | ||
| 1653 | internal interrupt is therefore said to arise in a "process context." | ||
| 1654 | </P> | ||
| 1655 | <P> | ||
| 1656 | |||
| 1657 | In an internal interrupt's handler, it can make sense to examine the | ||
| 1658 | <CODE>struct intr_frame</CODE> passed to the interrupt handler, or even to modify | ||
| 1659 | it. When the interrupt returns, modifications in <CODE>struct intr_frame</CODE> | ||
| 1660 | become changes to the calling thread or process's state. For example, | ||
| 1661 | the Pintos system call handler returns a value to the user program by | ||
| 1662 | modifying the saved EAX register (see section <A HREF="pintos_2.html#SEC40">2.5.2 System Call Details</A>). | ||
| 1663 | </P> | ||
| 1664 | <P> | ||
| 1665 | |||
| 1666 | There are no special restrictions on what an internal interrupt | ||
| 1667 | handler can or can't do. Generally they should run with interrupts | ||
| 1668 | enabled, just like other code, and so they can be preempted by other | ||
| 1669 | kernel threads. Thus, they do need to synchronize with other threads | ||
| 1670 | on shared data and other resources (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>). | ||
| 1671 | </P> | ||
| 1672 | <P> | ||
| 1673 | |||
| 1674 | Internal interrupt handlers can be invoked recursively. For example, | ||
| 1675 | the system call handler might cause a page fault while attempting to | ||
| 1676 | read user memory. Deep recursion would risk overflowing the limited | ||
| 1677 | kernel stack (see section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>), but should be unnecessary. | ||
| 1678 | </P> | ||
| 1679 | <P> | ||
| 1680 | |||
| 1681 | <A NAME="IDX77"></A> | ||
| 1682 | </P> | ||
| 1683 | <DL> | ||
| 1684 | <DT><U>Function:</U> void <B>intr_register_int</B> (uint8_t <VAR>vec</VAR>, int <VAR>dpl</VAR>, enum intr_level <VAR>level</VAR>, intr_handler_func *<VAR>handler</VAR>, const char *<VAR>name</VAR>) | ||
| 1685 | <DD>Registers <VAR>handler</VAR> to be called when internal interrupt numbered | ||
| 1686 | <VAR>vec</VAR> is triggered. Names the interrupt <VAR>name</VAR> for debugging | ||
| 1687 | purposes. | ||
| 1688 | <P> | ||
| 1689 | |||
| 1690 | If <VAR>level</VAR> is <CODE>INTR_ON</CODE>, external interrupts will be processed | ||
| 1691 | normally during the interrupt handler's execution, which is normally | ||
| 1692 | desirable. Specifying <CODE>INTR_OFF</CODE> will cause the CPU to disable | ||
| 1693 | external interrupts when it invokes the interrupt handler. The effect | ||
| 1694 | is slightly different from calling <CODE>intr_disable()</CODE> inside the | ||
| 1695 | handler, because that leaves a window of one or more CPU instructions in | ||
| 1696 | which external interrupts are still enabled. This is important for the | ||
| 1697 | page fault handler; refer to the comments in <Q><TT>userprog/exception.c</TT></Q> | ||
| 1698 | for details. | ||
| 1699 | </P> | ||
| 1700 | <P> | ||
| 1701 | |||
| 1702 | <VAR>dpl</VAR> determines how the interrupt can be invoked. If <VAR>dpl</VAR> is | ||
| 1703 | 0, then the interrupt can be invoked only by kernel threads. Otherwise | ||
| 1704 | <VAR>dpl</VAR> should be 3, which allows user processes to invoke the | ||
| 1705 | interrupt with an explicit INT instruction. The value of <VAR>dpl</VAR> | ||
| 1706 | doesn't affect user processes' ability to invoke the interrupt | ||
| 1707 | indirectly, e.g. an invalid memory reference will cause a page fault | ||
| 1708 | regardless of <VAR>dpl</VAR>. | ||
| 1709 | </P> | ||
| 1710 | </DL> | ||
| 1711 | <P> | ||
| 1712 | |||
| 1713 | <A NAME="External Interrupt Handling"></A> | ||
| 1714 | <HR SIZE="6"> | ||
| 1715 | <A NAME="SEC68"></A> | ||
| 1716 | <H3> A.4.3 External Interrupt Handling </H3> | ||
| 1717 | <!--docid::SEC68::--> | ||
| 1718 | <P> | ||
| 1719 | |||
| 1720 | External interrupts are caused by events outside the CPU. | ||
| 1721 | They are asynchronous, so they can be invoked at any time that | ||
| 1722 | interrupts have not been disabled. We say that an external interrupt | ||
| 1723 | runs in an "interrupt context." | ||
| 1724 | </P> | ||
| 1725 | <P> | ||
| 1726 | |||
| 1727 | In an external interrupt, the <CODE>struct intr_frame</CODE> passed to the | ||
| 1728 | handler is not very meaningful. It describes the state of the thread | ||
| 1729 | or process that was interrupted, but there is no way to predict which | ||
| 1730 | one that is. It is possible, although rarely useful, to examine it, but | ||
| 1731 | modifying it is a recipe for disaster. | ||
| 1732 | </P> | ||
| 1733 | <P> | ||
| 1734 | |||
| 1735 | Only one external interrupt may be processed at a time. Neither | ||
| 1736 | internal nor external interrupt may nest within an external interrupt | ||
| 1737 | handler. Thus, an external interrupt's handler must run with interrupts | ||
| 1738 | disabled (see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>). | ||
| 1739 | </P> | ||
| 1740 | <P> | ||
| 1741 | |||
| 1742 | An external interrupt handler must not sleep or yield, which rules out | ||
| 1743 | calling <CODE>lock_acquire()</CODE>, <CODE>thread_yield()</CODE>, and many other | ||
| 1744 | functions. Sleeping in interrupt context would effectively put the | ||
| 1745 | interrupted thread to sleep, too, until the interrupt handler was again | ||
| 1746 | scheduled and returned. This would be unfair to the unlucky thread, and | ||
| 1747 | it would deadlock if the handler were waiting for the sleeping thread | ||
| 1748 | to, e.g., release a lock. | ||
| 1749 | </P> | ||
| 1750 | <P> | ||
| 1751 | |||
| 1752 | An external interrupt handler | ||
| 1753 | effectively monopolizes the machine and delays all other activities. | ||
| 1754 | Therefore, external interrupt handlers should complete as quickly as | ||
| 1755 | they can. Anything that require much CPU time should instead run in a | ||
| 1756 | kernel thread, possibly one that the interrupt triggers using a | ||
| 1757 | synchronization primitive. | ||
| 1758 | </P> | ||
| 1759 | <P> | ||
| 1760 | |||
| 1761 | External interrupts are controlled by a | ||
| 1762 | pair of devices outside the CPU called <EM>programmable interrupt | ||
| 1763 | controllers</EM>, <EM>PICs</EM> for short. When <CODE>intr_init()</CODE> sets up the | ||
| 1764 | CPU's IDT, it also initializes the PICs for interrupt handling. The | ||
| 1765 | PICs also must be "acknowledged" at the end of processing for each | ||
| 1766 | external interrupt. <CODE>intr_handler()</CODE> takes care of that by calling | ||
| 1767 | <CODE>pic_end_of_interrupt()</CODE>, which properly signals the PICs. | ||
| 1768 | </P> | ||
| 1769 | <P> | ||
| 1770 | |||
| 1771 | The following functions relate to external | ||
| 1772 | interrupts. | ||
| 1773 | </P> | ||
| 1774 | <P> | ||
| 1775 | |||
| 1776 | <A NAME="IDX78"></A> | ||
| 1777 | </P> | ||
| 1778 | <DL> | ||
| 1779 | <DT><U>Function:</U> void <B>intr_register_ext</B> (uint8_t <VAR>vec</VAR>, intr_handler_func *<VAR>handler</VAR>, const char *<VAR>name</VAR>) | ||
| 1780 | <DD>Registers <VAR>handler</VAR> to be called when external interrupt numbered | ||
| 1781 | <VAR>vec</VAR> is triggered. Names the interrupt <VAR>name</VAR> for debugging | ||
| 1782 | purposes. The handler will run with interrupts disabled. | ||
| 1783 | </DL> | ||
| 1784 | <P> | ||
| 1785 | |||
| 1786 | <A NAME="IDX79"></A> | ||
| 1787 | </P> | ||
| 1788 | <DL> | ||
| 1789 | <DT><U>Function:</U> bool <B>intr_context</B> (void) | ||
| 1790 | <DD>Returns true if we are running in an interrupt context, otherwise | ||
| 1791 | false. Mainly used in functions that might sleep | ||
| 1792 | or that otherwise should not be called from interrupt context, in this | ||
| 1793 | form: | ||
| 1794 | <TABLE><tr><td> </td><td class=example><pre>ASSERT (!intr_context ()); | ||
| 1795 | </pre></td></tr></table></DL> | ||
| 1796 | <P> | ||
| 1797 | |||
| 1798 | <A NAME="IDX80"></A> | ||
| 1799 | </P> | ||
| 1800 | <DL> | ||
| 1801 | <DT><U>Function:</U> void <B>intr_yield_on_return</B> (void) | ||
| 1802 | <DD>When called in an interrupt context, causes <CODE>thread_yield()</CODE> to be | ||
| 1803 | called just before the interrupt returns. Used | ||
| 1804 | in the timer interrupt handler when a thread's time slice expires, to | ||
| 1805 | cause a new thread to be scheduled. | ||
| 1806 | </DL> | ||
| 1807 | <P> | ||
| 1808 | |||
| 1809 | <A NAME="Memory Allocation"></A> | ||
| 1810 | <HR SIZE="6"> | ||
| 1811 | <A NAME="SEC69"></A> | ||
| 1812 | <H2> A.5 Memory Allocation </H2> | ||
| 1813 | <!--docid::SEC69::--> | ||
| 1814 | <P> | ||
| 1815 | |||
| 1816 | Pintos contains two memory allocators, one that allocates memory in | ||
| 1817 | units of a page, and one that can allocate blocks of any size. | ||
| 1818 | </P> | ||
| 1819 | <P> | ||
| 1820 | |||
| 1821 | <A NAME="Page Allocator"></A> | ||
| 1822 | <HR SIZE="6"> | ||
| 1823 | <A NAME="SEC70"></A> | ||
| 1824 | <H3> A.5.1 Page Allocator </H3> | ||
| 1825 | <!--docid::SEC70::--> | ||
| 1826 | <P> | ||
| 1827 | |||
| 1828 | The page allocator declared in <Q><TT>threads/palloc.h</TT></Q> allocates | ||
| 1829 | memory in units of a page. It is most often used to allocate memory | ||
| 1830 | one page at a time, but it can also allocate multiple contiguous pages | ||
| 1831 | at once. | ||
| 1832 | </P> | ||
| 1833 | <P> | ||
| 1834 | |||
| 1835 | The page allocator divides the memory it allocates into two pools, | ||
| 1836 | called the kernel and user pools. By default, each pool gets half of | ||
| 1837 | system memory above 1 MB, but the division can be changed with the | ||
| 1838 | <Q><SAMP>-ul</SAMP></Q> kernel | ||
| 1839 | command line | ||
| 1840 | option (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>). An allocation request draws from one | ||
| 1841 | pool or the other. If one pool becomes empty, the other may still | ||
| 1842 | have free pages. The user pool should be used for allocating memory | ||
| 1843 | for user processes and the kernel pool for all other allocations. | ||
| 1844 | This will only become important starting with project 3. Until then, | ||
| 1845 | all allocations should be made from the kernel pool. | ||
| 1846 | </P> | ||
| 1847 | <P> | ||
| 1848 | |||
| 1849 | Each pool's usage is tracked with a bitmap, one bit per page in | ||
| 1850 | the pool. A request to allocate <VAR>n</VAR> pages scans the bitmap | ||
| 1851 | for <VAR>n</VAR> consecutive bits set to | ||
| 1852 | false, indicating that those pages are free, and then sets those bits | ||
| 1853 | to true to mark them as used. This is a "first fit" allocation | ||
| 1854 | strategy (see <A HREF="pintos_10.html#Wilson">Wilson</A>). | ||
| 1855 | </P> | ||
| 1856 | <P> | ||
| 1857 | |||
| 1858 | The page allocator is subject to fragmentation. That is, it may not | ||
| 1859 | be possible to allocate <VAR>n</VAR> contiguous pages even though <VAR>n</VAR> | ||
| 1860 | or more pages are free, because the free pages are separated by used | ||
| 1861 | pages. In fact, in pathological cases it may be impossible to | ||
| 1862 | allocate 2 contiguous pages even though half of the pool's pages are free. | ||
| 1863 | Single-page requests can't fail due to fragmentation, so | ||
| 1864 | requests for multiple contiguous pages should be limited as much as | ||
| 1865 | possible. | ||
| 1866 | </P> | ||
| 1867 | <P> | ||
| 1868 | |||
| 1869 | Pages may not be allocated from interrupt context, but they may be | ||
| 1870 | freed. | ||
| 1871 | </P> | ||
| 1872 | <P> | ||
| 1873 | |||
| 1874 | When a page is freed, all of its bytes are cleared to <TT>0xcc</TT>, as | ||
| 1875 | a debugging aid (see section <A HREF="pintos_8.html#SEC108">D.8 Tips</A>). | ||
| 1876 | </P> | ||
| 1877 | <P> | ||
| 1878 | |||
| 1879 | Page allocator types and functions are described below. | ||
| 1880 | </P> | ||
| 1881 | <P> | ||
| 1882 | |||
| 1883 | <A NAME="IDX81"></A> | ||
| 1884 | </P> | ||
| 1885 | <DL> | ||
| 1886 | <DT><U>Function:</U> void *<B>palloc_get_page</B> (enum palloc_flags <VAR>flags</VAR>) | ||
| 1887 | <DD><A NAME="IDX82"></A> | ||
| 1888 | <DT><U>Function:</U> void *<B>palloc_get_multiple</B> (enum palloc_flags <VAR>flags</VAR>, size_t <VAR>page_cnt</VAR>) | ||
| 1889 | <DD>Obtains and returns one page, or <VAR>page_cnt</VAR> contiguous pages, | ||
| 1890 | respectively. Returns a null pointer if the pages cannot be allocated. | ||
| 1891 | <P> | ||
| 1892 | |||
| 1893 | The <VAR>flags</VAR> argument may be any combination of the following flags: | ||
| 1894 | </P> | ||
| 1895 | <P> | ||
| 1896 | |||
| 1897 | <A NAME="IDX83"></A> | ||
| 1898 | </P> | ||
| 1899 | <DL> | ||
| 1900 | <DT><U>Page Allocator Flag:</U> <B><CODE>PAL_ASSERT</CODE></B> | ||
| 1901 | <DD>If the pages cannot be allocated, panic the kernel. This is only | ||
| 1902 | appropriate during kernel initialization. User processes | ||
| 1903 | should never be permitted to panic the kernel. | ||
| 1904 | </DL> | ||
| 1905 | <P> | ||
| 1906 | |||
| 1907 | <A NAME="IDX84"></A> | ||
| 1908 | </P> | ||
| 1909 | <DL> | ||
| 1910 | <DT><U>Page Allocator Flag:</U> <B><CODE>PAL_ZERO</CODE></B> | ||
| 1911 | <DD>Zero all the bytes in the allocated pages before returning them. If not | ||
| 1912 | set, the contents of newly allocated pages are unpredictable. | ||
| 1913 | </DL> | ||
| 1914 | <P> | ||
| 1915 | |||
| 1916 | <A NAME="IDX85"></A> | ||
| 1917 | </P> | ||
| 1918 | <DL> | ||
| 1919 | <DT><U>Page Allocator Flag:</U> <B><CODE>PAL_USER</CODE></B> | ||
| 1920 | <DD>Obtain the pages from the user pool. If not set, pages are allocated | ||
| 1921 | from the kernel pool. | ||
| 1922 | </DL> | ||
| 1923 | </DL> | ||
| 1924 | <P> | ||
| 1925 | |||
| 1926 | <A NAME="IDX86"></A> | ||
| 1927 | </P> | ||
| 1928 | <DL> | ||
| 1929 | <DT><U>Function:</U> void <B>palloc_free_page</B> (void *<VAR>page</VAR>) | ||
| 1930 | <DD><A NAME="IDX87"></A> | ||
| 1931 | <DT><U>Function:</U> void <B>palloc_free_multiple</B> (void *<VAR>pages</VAR>, size_t <VAR>page_cnt</VAR>) | ||
| 1932 | <DD>Frees one page, or <VAR>page_cnt</VAR> contiguous pages, respectively, | ||
| 1933 | starting at <VAR>pages</VAR>. All of the pages must have been obtained using | ||
| 1934 | <CODE>palloc_get_page()</CODE> or <CODE>palloc_get_multiple()</CODE>. | ||
| 1935 | </DL> | ||
| 1936 | <P> | ||
| 1937 | |||
| 1938 | <A NAME="Block Allocator"></A> | ||
| 1939 | <HR SIZE="6"> | ||
| 1940 | <A NAME="SEC71"></A> | ||
| 1941 | <H3> A.5.2 Block Allocator </H3> | ||
| 1942 | <!--docid::SEC71::--> | ||
| 1943 | <P> | ||
| 1944 | |||
| 1945 | The block allocator, declared in <Q><TT>threads/malloc.h</TT></Q>, can allocate | ||
| 1946 | blocks of any size. It is layered on top of the page allocator | ||
| 1947 | described in the previous section. Blocks returned by the block | ||
| 1948 | allocator are obtained from the kernel pool. | ||
| 1949 | </P> | ||
| 1950 | <P> | ||
| 1951 | |||
| 1952 | The block allocator uses two different strategies for allocating memory. | ||
| 1953 | The first strategy applies to blocks that are 1 kB or smaller | ||
| 1954 | (one-fourth of the page size). These allocations are rounded up to the | ||
| 1955 | nearest power of 2, or 16 bytes, whichever is larger. Then they are | ||
| 1956 | grouped into a page used only for allocations of that size. | ||
| 1957 | </P> | ||
| 1958 | <P> | ||
| 1959 | |||
| 1960 | The second strategy applies to blocks larger than 1 kB. | ||
| 1961 | These allocations (plus a small amount of overhead) are rounded up to | ||
| 1962 | the nearest page in size, and then the block allocator requests that | ||
| 1963 | number of contiguous pages from the page allocator. | ||
| 1964 | </P> | ||
| 1965 | <P> | ||
| 1966 | |||
| 1967 | In either case, the difference between the allocation requested size | ||
| 1968 | and the actual block size is wasted. A real operating system would | ||
| 1969 | carefully tune its allocator to minimize this waste, but this is | ||
| 1970 | unimportant in an instructional system like Pintos. | ||
| 1971 | </P> | ||
| 1972 | <P> | ||
| 1973 | |||
| 1974 | As long as a page can be obtained from the page allocator, small | ||
| 1975 | allocations always succeed. Most small allocations do not require a | ||
| 1976 | new page from the page allocator at all, because they are satisfied | ||
| 1977 | using part of a page already allocated. However, large allocations | ||
| 1978 | always require calling into the page allocator, and any allocation | ||
| 1979 | that needs more than one contiguous page can fail due to fragmentation, | ||
| 1980 | as already discussed in the previous section. Thus, you should | ||
| 1981 | minimize the number of large allocations in your code, especially | ||
| 1982 | those over approximately 4 kB each. | ||
| 1983 | </P> | ||
| 1984 | <P> | ||
| 1985 | |||
| 1986 | When a block is freed, all of its bytes are cleared to <TT>0xcc</TT>, as | ||
| 1987 | a debugging aid (see section <A HREF="pintos_8.html#SEC108">D.8 Tips</A>). | ||
| 1988 | </P> | ||
| 1989 | <P> | ||
| 1990 | |||
| 1991 | The block allocator may not be called from interrupt context. | ||
| 1992 | </P> | ||
| 1993 | <P> | ||
| 1994 | |||
| 1995 | The block allocator functions are described below. Their interfaces are | ||
| 1996 | the same as the standard C library functions of the same names. | ||
| 1997 | </P> | ||
| 1998 | <P> | ||
| 1999 | |||
| 2000 | <A NAME="IDX88"></A> | ||
| 2001 | </P> | ||
| 2002 | <DL> | ||
| 2003 | <DT><U>Function:</U> void *<B>malloc</B> (size_t <VAR>size</VAR>) | ||
| 2004 | <DD>Obtains and returns a new block, from the kernel pool, at least | ||
| 2005 | <VAR>size</VAR> bytes long. Returns a null pointer if <VAR>size</VAR> is zero or | ||
| 2006 | if memory is not available. | ||
| 2007 | </DL> | ||
| 2008 | <P> | ||
| 2009 | |||
| 2010 | <A NAME="IDX89"></A> | ||
| 2011 | </P> | ||
| 2012 | <DL> | ||
| 2013 | <DT><U>Function:</U> void *<B>calloc</B> (size_t <VAR>a</VAR>, size_t <VAR>b</VAR>) | ||
| 2014 | <DD>Obtains a returns a new block, from the kernel pool, at least | ||
| 2015 | <CODE><VAR>a</VAR> * <VAR>b</VAR></CODE> bytes long. The block's contents will be | ||
| 2016 | cleared to zeros. Returns a null pointer if <VAR>a</VAR> or <VAR>b</VAR> is zero | ||
| 2017 | or if insufficient memory is available. | ||
| 2018 | </DL> | ||
| 2019 | <P> | ||
| 2020 | |||
| 2021 | <A NAME="IDX90"></A> | ||
| 2022 | </P> | ||
| 2023 | <DL> | ||
| 2024 | <DT><U>Function:</U> void *<B>realloc</B> (void *<VAR>block</VAR>, size_t <VAR>new_size</VAR>) | ||
| 2025 | <DD>Attempts to resize <VAR>block</VAR> to <VAR>new_size</VAR> bytes, possibly moving | ||
| 2026 | it in the process. If successful, returns the new block, in which case | ||
| 2027 | the old block must no longer be accessed. On failure, returns a null | ||
| 2028 | pointer, and the old block remains valid. | ||
| 2029 | <P> | ||
| 2030 | |||
| 2031 | A call with <VAR>block</VAR> null is equivalent to <CODE>malloc()</CODE>. A call | ||
| 2032 | with <VAR>new_size</VAR> zero is equivalent to <CODE>free()</CODE>. | ||
| 2033 | </P> | ||
| 2034 | </DL> | ||
| 2035 | <P> | ||
| 2036 | |||
| 2037 | <A NAME="IDX91"></A> | ||
| 2038 | </P> | ||
| 2039 | <DL> | ||
| 2040 | <DT><U>Function:</U> void <B>free</B> (void *<VAR>block</VAR>) | ||
| 2041 | <DD>Frees <VAR>block</VAR>, which must have been previously returned by | ||
| 2042 | <CODE>malloc()</CODE>, <CODE>calloc()</CODE>, or <CODE>realloc()</CODE> (and not yet freed). | ||
| 2043 | </DL> | ||
| 2044 | <P> | ||
| 2045 | |||
| 2046 | <A NAME="Virtual Addresses"></A> | ||
| 2047 | <HR SIZE="6"> | ||
| 2048 | <A NAME="SEC72"></A> | ||
| 2049 | <H2> A.6 Virtual Addresses </H2> | ||
| 2050 | <!--docid::SEC72::--> | ||
| 2051 | <P> | ||
| 2052 | |||
| 2053 | A 32-bit virtual address can be divided into a 20-bit <EM>page number</EM> | ||
| 2054 | and a 12-bit <EM>page offset</EM> (or just <EM>offset</EM>), like this: | ||
| 2055 | </P> | ||
| 2056 | <P> | ||
| 2057 | |||
| 2058 | <TABLE><tr><td> </td><td class=example><pre> 31 12 11 0 | ||
| 2059 | +-------------------+-----------+ | ||
| 2060 | | Page Number | Offset | | ||
| 2061 | +-------------------+-----------+ | ||
| 2062 | Virtual Address | ||
| 2063 | </pre></td></tr></table><P> | ||
| 2064 | |||
| 2065 | Header <Q><TT>threads/vaddr.h</TT></Q> defines these functions and macros for | ||
| 2066 | working with virtual addresses: | ||
| 2067 | </P> | ||
| 2068 | <P> | ||
| 2069 | |||
| 2070 | <A NAME="IDX92"></A> | ||
| 2071 | </P> | ||
| 2072 | <DL> | ||
| 2073 | <DT><U>Macro:</U> <B>PGSHIFT</B> | ||
| 2074 | <DD><A NAME="IDX93"></A> | ||
| 2075 | <DT><U>Macro:</U> <B>PGBITS</B> | ||
| 2076 | <DD>The bit index (0) and number of bits (12) of the offset part of a | ||
| 2077 | virtual address, respectively. | ||
| 2078 | </DL> | ||
| 2079 | <P> | ||
| 2080 | |||
| 2081 | <A NAME="IDX94"></A> | ||
| 2082 | </P> | ||
| 2083 | <DL> | ||
| 2084 | <DT><U>Macro:</U> <B>PGMASK</B> | ||
| 2085 | <DD>A bit mask with the bits in the page offset set to 1, the rest set to 0 | ||
| 2086 | (<TT>0xfff</TT>). | ||
| 2087 | </DL> | ||
| 2088 | <P> | ||
| 2089 | |||
| 2090 | <A NAME="IDX95"></A> | ||
| 2091 | </P> | ||
| 2092 | <DL> | ||
| 2093 | <DT><U>Macro:</U> <B>PGSIZE</B> | ||
| 2094 | <DD>The page size in bytes (4,096). | ||
| 2095 | </DL> | ||
| 2096 | <P> | ||
| 2097 | |||
| 2098 | <A NAME="IDX96"></A> | ||
| 2099 | </P> | ||
| 2100 | <DL> | ||
| 2101 | <DT><U>Function:</U> unsigned <B>pg_ofs</B> (const void *<VAR>va</VAR>) | ||
| 2102 | <DD>Extracts and returns the page offset in virtual address <VAR>va</VAR>. | ||
| 2103 | </DL> | ||
| 2104 | <P> | ||
| 2105 | |||
| 2106 | <A NAME="IDX97"></A> | ||
| 2107 | </P> | ||
| 2108 | <DL> | ||
| 2109 | <DT><U>Function:</U> uintptr_t <B>pg_no</B> (const void *<VAR>va</VAR>) | ||
| 2110 | <DD>Extracts and returns the page number in virtual address <VAR>va</VAR>. | ||
| 2111 | </DL> | ||
| 2112 | <P> | ||
| 2113 | |||
| 2114 | <A NAME="IDX98"></A> | ||
| 2115 | </P> | ||
| 2116 | <DL> | ||
| 2117 | <DT><U>Function:</U> void *<B>pg_round_down</B> (const void *<VAR>va</VAR>) | ||
| 2118 | <DD>Returns the start of the virtual page that <VAR>va</VAR> points within, that | ||
| 2119 | is, <VAR>va</VAR> with the page offset set to 0. | ||
| 2120 | </DL> | ||
| 2121 | <P> | ||
| 2122 | |||
| 2123 | <A NAME="IDX99"></A> | ||
| 2124 | </P> | ||
| 2125 | <DL> | ||
| 2126 | <DT><U>Function:</U> void *<B>pg_round_up</B> (const void *<VAR>va</VAR>) | ||
| 2127 | <DD>Returns <VAR>va</VAR> rounded up to the nearest page boundary. | ||
| 2128 | </DL> | ||
| 2129 | <P> | ||
| 2130 | |||
| 2131 | Virtual memory in Pintos is divided into two regions: user virtual | ||
| 2132 | memory and kernel virtual memory (see section <A HREF="pintos_2.html#SEC26">2.2.4 Virtual Memory Layout</A>). The | ||
| 2133 | boundary between them is <CODE>PHYS_BASE</CODE>: | ||
| 2134 | </P> | ||
| 2135 | <P> | ||
| 2136 | |||
| 2137 | <A NAME="IDX100"></A> | ||
| 2138 | </P> | ||
| 2139 | <DL> | ||
| 2140 | <DT><U>Macro:</U> <B>PHYS_BASE</B> | ||
| 2141 | <DD>Base address of kernel virtual memory. It defaults to <TT>0xc0000000</TT> (3 | ||
| 2142 | GB), but it may be changed to any multiple of <TT>0x10000000</TT> from | ||
| 2143 | <TT>0x80000000</TT> to <TT>0xf0000000</TT>. | ||
| 2144 | <P> | ||
| 2145 | |||
| 2146 | User virtual memory ranges from virtual address 0 up to | ||
| 2147 | <CODE>PHYS_BASE</CODE>. Kernel virtual memory occupies the rest of the | ||
| 2148 | virtual address space, from <CODE>PHYS_BASE</CODE> up to 4 GB. | ||
| 2149 | </P> | ||
| 2150 | </DL> | ||
| 2151 | <P> | ||
| 2152 | |||
| 2153 | <A NAME="IDX101"></A> | ||
| 2154 | </P> | ||
| 2155 | <DL> | ||
| 2156 | <DT><U>Function:</U> bool <B>is_user_vaddr</B> (const void *<VAR>va</VAR>) | ||
| 2157 | <DD><A NAME="IDX102"></A> | ||
| 2158 | <DT><U>Function:</U> bool <B>is_kernel_vaddr</B> (const void *<VAR>va</VAR>) | ||
| 2159 | <DD>Returns true if <VAR>va</VAR> is a user or kernel virtual address, | ||
| 2160 | respectively, false otherwise. | ||
| 2161 | </DL> | ||
| 2162 | <P> | ||
| 2163 | |||
| 2164 | The 80<VAR>x</VAR>86 doesn't provide any way to directly access memory given | ||
| 2165 | a physical address. This ability is often necessary in an operating | ||
| 2166 | system kernel, so Pintos works around it by mapping kernel virtual | ||
| 2167 | memory one-to-one to physical memory. That is, virtual address | ||
| 2168 | <CODE>PHYS_BASE</CODE> accesses physical address 0, virtual address | ||
| 2169 | <CODE>PHYS_BASE</CODE> + <TT>0x1234</TT> accesses physical address <TT>0x1234</TT>, and | ||
| 2170 | so on up to the size of the machine's physical memory. Thus, adding | ||
| 2171 | <CODE>PHYS_BASE</CODE> to a physical address obtains a kernel virtual address | ||
| 2172 | that accesses that address; conversely, subtracting <CODE>PHYS_BASE</CODE> | ||
| 2173 | from a kernel virtual address obtains the corresponding physical | ||
| 2174 | address. Header <Q><TT>threads/vaddr.h</TT></Q> provides a pair of functions to | ||
| 2175 | do these translations: | ||
| 2176 | </P> | ||
| 2177 | <P> | ||
| 2178 | |||
| 2179 | <A NAME="IDX103"></A> | ||
| 2180 | </P> | ||
| 2181 | <DL> | ||
| 2182 | <DT><U>Function:</U> void *<B>ptov</B> (uintptr_t <VAR>pa</VAR>) | ||
| 2183 | <DD>Returns the kernel virtual address corresponding to physical address | ||
| 2184 | <VAR>pa</VAR>, which should be between 0 and the number of bytes of physical | ||
| 2185 | memory. | ||
| 2186 | </DL> | ||
| 2187 | <P> | ||
| 2188 | |||
| 2189 | <A NAME="IDX104"></A> | ||
| 2190 | </P> | ||
| 2191 | <DL> | ||
| 2192 | <DT><U>Function:</U> uintptr_t <B>vtop</B> (void *<VAR>va</VAR>) | ||
| 2193 | <DD>Returns the physical address corresponding to <VAR>va</VAR>, which must be a | ||
| 2194 | kernel virtual address. | ||
| 2195 | </DL> | ||
| 2196 | <P> | ||
| 2197 | |||
| 2198 | <A NAME="Page Table"></A> | ||
| 2199 | <HR SIZE="6"> | ||
| 2200 | <A NAME="SEC73"></A> | ||
| 2201 | <H2> A.7 Page Table </H2> | ||
| 2202 | <!--docid::SEC73::--> | ||
| 2203 | <P> | ||
| 2204 | |||
| 2205 | The code in <Q><TT>pagedir.c</TT></Q> is an abstract interface to the 80<VAR>x</VAR>86 | ||
| 2206 | hardware page table, also called a "page directory" by Intel processor | ||
| 2207 | documentation. The page table interface uses a <CODE>uint32_t *</CODE> to | ||
| 2208 | represent a page table because this is convenient for accessing their | ||
| 2209 | internal structure. | ||
| 2210 | </P> | ||
| 2211 | <P> | ||
| 2212 | |||
| 2213 | The sections below describe the page table interface and internals. | ||
| 2214 | </P> | ||
| 2215 | <P> | ||
| 2216 | |||
| 2217 | <A NAME="Page Table Creation Destruction Activation"></A> | ||
| 2218 | <HR SIZE="6"> | ||
| 2219 | <A NAME="SEC74"></A> | ||
| 2220 | <H3> A.7.1 Creation, Destruction, and Activation </H3> | ||
| 2221 | <!--docid::SEC74::--> | ||
| 2222 | <P> | ||
| 2223 | |||
| 2224 | These functions create, destroy, and activate page tables. The base | ||
| 2225 | Pintos code already calls these functions where necessary, so it should | ||
| 2226 | not be necessary to call them yourself. | ||
| 2227 | </P> | ||
| 2228 | <P> | ||
| 2229 | |||
| 2230 | <A NAME="IDX105"></A> | ||
| 2231 | </P> | ||
| 2232 | <DL> | ||
| 2233 | <DT><U>Function:</U> uint32_t *<B>pagedir_create</B> (void) | ||
| 2234 | <DD>Creates and returns a new page table. The new page table contains | ||
| 2235 | Pintos's normal kernel virtual page mappings, but no user virtual | ||
| 2236 | mappings. | ||
| 2237 | <P> | ||
| 2238 | |||
| 2239 | Returns a null pointer if memory cannot be obtained. | ||
| 2240 | </P> | ||
| 2241 | </DL> | ||
| 2242 | <P> | ||
| 2243 | |||
| 2244 | <A NAME="IDX106"></A> | ||
| 2245 | </P> | ||
| 2246 | <DL> | ||
| 2247 | <DT><U>Function:</U> void <B>pagedir_destroy</B> (uint32_t *<VAR>pd</VAR>) | ||
| 2248 | <DD>Frees all of the resources held by <VAR>pd</VAR>, including the page table | ||
| 2249 | itself and the frames that it maps. | ||
| 2250 | </DL> | ||
| 2251 | <P> | ||
| 2252 | |||
| 2253 | <A NAME="IDX107"></A> | ||
| 2254 | </P> | ||
| 2255 | <DL> | ||
| 2256 | <DT><U>Function:</U> void <B>pagedir_activate</B> (uint32_t *<VAR>pd</VAR>) | ||
| 2257 | <DD>Activates <VAR>pd</VAR>. The active page table is the one used by the CPU to | ||
| 2258 | translate memory references. | ||
| 2259 | </DL> | ||
| 2260 | <P> | ||
| 2261 | |||
| 2262 | <A NAME="Page Tables Inspection and Updates"></A> | ||
| 2263 | <HR SIZE="6"> | ||
| 2264 | <A NAME="SEC75"></A> | ||
| 2265 | <H3> A.7.2 Inspection and Updates </H3> | ||
| 2266 | <!--docid::SEC75::--> | ||
| 2267 | <P> | ||
| 2268 | |||
| 2269 | These functions examine or update the mappings from pages to frames | ||
| 2270 | encapsulated by a page table. They work on both active and inactive | ||
| 2271 | page tables (that is, those for running and suspended processes), | ||
| 2272 | flushing the TLB as necessary. | ||
| 2273 | </P> | ||
| 2274 | <P> | ||
| 2275 | |||
| 2276 | <A NAME="IDX108"></A> | ||
| 2277 | </P> | ||
| 2278 | <DL> | ||
| 2279 | <DT><U>Function:</U> bool <B>pagedir_set_page</B> (uint32_t *<VAR>pd</VAR>, void *<VAR>upage</VAR>, void *<VAR>kpage</VAR>, bool <VAR>writable</VAR>) | ||
| 2280 | <DD>Adds to <VAR>pd</VAR> a mapping from user page <VAR>upage</VAR> to the frame identified | ||
| 2281 | by kernel virtual address <VAR>kpage</VAR>. If <VAR>writable</VAR> is true, the | ||
| 2282 | page is mapped read/write; otherwise, it is mapped read-only. | ||
| 2283 | <P> | ||
| 2284 | |||
| 2285 | User page <VAR>upage</VAR> must not already be mapped in <VAR>pd</VAR>. | ||
| 2286 | </P> | ||
| 2287 | <P> | ||
| 2288 | |||
| 2289 | Kernel page <VAR>kpage</VAR> should be a kernel virtual address obtained from | ||
| 2290 | the user pool with <CODE>palloc_get_page(PAL_USER)</CODE> (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>). | ||
| 2291 | </P> | ||
| 2292 | <P> | ||
| 2293 | |||
| 2294 | Returns true if successful, false on failure. Failure will occur if | ||
| 2295 | additional memory required for the page table cannot be obtained. | ||
| 2296 | </P> | ||
| 2297 | </DL> | ||
| 2298 | <P> | ||
| 2299 | |||
| 2300 | <A NAME="IDX109"></A> | ||
| 2301 | </P> | ||
| 2302 | <DL> | ||
| 2303 | <DT><U>Function:</U> void *<B>pagedir_get_page</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>uaddr</VAR>) | ||
| 2304 | <DD>Looks up the frame mapped to <VAR>uaddr</VAR> in <VAR>pd</VAR>. Returns the | ||
| 2305 | kernel virtual address for that frame, if <VAR>uaddr</VAR> is mapped, or a | ||
| 2306 | null pointer if it is not. | ||
| 2307 | </DL> | ||
| 2308 | <P> | ||
| 2309 | |||
| 2310 | <A NAME="IDX110"></A> | ||
| 2311 | </P> | ||
| 2312 | <DL> | ||
| 2313 | <DT><U>Function:</U> void <B>pagedir_clear_page</B> (uint32_t *<VAR>pd</VAR>, void *<VAR>page</VAR>) | ||
| 2314 | <DD>Marks <VAR>page</VAR> "not present" in <VAR>pd</VAR>. Later accesses to | ||
| 2315 | the page will fault. | ||
| 2316 | <P> | ||
| 2317 | |||
| 2318 | Other bits in the page table for <VAR>page</VAR> are preserved, permitting | ||
| 2319 | the accessed and dirty bits (see the next section) to be checked. | ||
| 2320 | </P> | ||
| 2321 | <P> | ||
| 2322 | |||
| 2323 | This function has no effect if <VAR>page</VAR> is not mapped. | ||
| 2324 | </P> | ||
| 2325 | </DL> | ||
| 2326 | <P> | ||
| 2327 | |||
| 2328 | <A NAME="Page Table Accessed and Dirty Bits"></A> | ||
| 2329 | <HR SIZE="6"> | ||
| 2330 | <A NAME="SEC76"></A> | ||
| 2331 | <H3> A.7.3 Accessed and Dirty Bits </H3> | ||
| 2332 | <!--docid::SEC76::--> | ||
| 2333 | <P> | ||
| 2334 | |||
| 2335 | 80<VAR>x</VAR>86 hardware provides some assistance for implementing page | ||
| 2336 | replacement algorithms, through a pair of bits in the page table entry | ||
| 2337 | (PTE) for each page. On any read or write to a page, the CPU sets the | ||
| 2338 | <EM>accessed bit</EM> to 1 in the page's PTE, and on any write, the CPU | ||
| 2339 | sets the <EM>dirty bit</EM> to 1. The CPU never resets these bits to 0, | ||
| 2340 | but the OS may do so. | ||
| 2341 | </P> | ||
| 2342 | <P> | ||
| 2343 | |||
| 2344 | Proper interpretation of these bits requires understanding of | ||
| 2345 | <EM>aliases</EM>, that is, two (or more) pages that refer to the same | ||
| 2346 | frame. When an aliased frame is accessed, the accessed and dirty bits | ||
| 2347 | are updated in only one page table entry (the one for the page used for | ||
| 2348 | access). The accessed and dirty bits for the other aliases are not | ||
| 2349 | updated. | ||
| 2350 | </P> | ||
| 2351 | <P> | ||
| 2352 | |||
| 2353 | See <A HREF="pintos_4.html#Accessed and Dirty Bits">Accessed and Dirty Bits</A>, on applying these bits in implementing | ||
| 2354 | page replacement algorithms. | ||
| 2355 | </P> | ||
| 2356 | <P> | ||
| 2357 | |||
| 2358 | <A NAME="IDX111"></A> | ||
| 2359 | </P> | ||
| 2360 | <DL> | ||
| 2361 | <DT><U>Function:</U> bool <B>pagedir_is_dirty</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>) | ||
| 2362 | <DD><A NAME="IDX112"></A> | ||
| 2363 | <DT><U>Function:</U> bool <B>pagedir_is_accessed</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>) | ||
| 2364 | <DD>Returns true if page directory <VAR>pd</VAR> contains a page table entry for | ||
| 2365 | <VAR>page</VAR> that is marked dirty (or accessed). Otherwise, | ||
| 2366 | returns false. | ||
| 2367 | </DL> | ||
| 2368 | <P> | ||
| 2369 | |||
| 2370 | <A NAME="IDX113"></A> | ||
| 2371 | </P> | ||
| 2372 | <DL> | ||
| 2373 | <DT><U>Function:</U> void <B>pagedir_set_dirty</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>, bool <VAR>value</VAR>) | ||
| 2374 | <DD><A NAME="IDX114"></A> | ||
| 2375 | <DT><U>Function:</U> void <B>pagedir_set_accessed</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>, bool <VAR>value</VAR>) | ||
| 2376 | <DD>If page directory <VAR>pd</VAR> has a page table entry for <VAR>page</VAR>, then | ||
| 2377 | its dirty (or accessed) bit is set to <VAR>value</VAR>. | ||
| 2378 | </DL> | ||
| 2379 | <P> | ||
| 2380 | |||
| 2381 | <A NAME="Page Table Details"></A> | ||
| 2382 | <HR SIZE="6"> | ||
| 2383 | <A NAME="SEC77"></A> | ||
| 2384 | <H3> A.7.4 Page Table Details </H3> | ||
| 2385 | <!--docid::SEC77::--> | ||
| 2386 | <P> | ||
| 2387 | |||
| 2388 | The functions provided with Pintos are sufficient to implement the | ||
| 2389 | projects. However, you may still find it worthwhile to understand the | ||
| 2390 | hardware page table format, so we'll go into a little detail in this | ||
| 2391 | section. | ||
| 2392 | </P> | ||
| 2393 | <P> | ||
| 2394 | |||
| 2395 | <A NAME="Page Table Structure"></A> | ||
| 2396 | <HR SIZE="6"> | ||
| 2397 | <A NAME="SEC78"></A> | ||
| 2398 | <H4> A.7.4.1 Structure </H4> | ||
| 2399 | <!--docid::SEC78::--> | ||
| 2400 | <P> | ||
| 2401 | |||
| 2402 | The top-level paging data structure is a page called the "page | ||
| 2403 | directory" (PD) arranged as an array of 1,024 32-bit page directory | ||
| 2404 | entries (PDEs), each of which represents 4 MB of virtual memory. Each | ||
| 2405 | PDE may point to the physical address of another page called a | ||
| 2406 | "page table" (PT) arranged, similarly, as an array of 1,024 | ||
| 2407 | 32-bit page table entries (PTEs), each of which translates a single 4 | ||
| 2408 | kB virtual page to a physical page. | ||
| 2409 | </P> | ||
| 2410 | <P> | ||
| 2411 | |||
| 2412 | Translation of a virtual address into a physical address follows | ||
| 2413 | the three-step process illustrated in the diagram | ||
| 2414 | below:<A NAME="DOCF4" HREF="pintos_fot.html#FOOT4">(4)</A> | ||
| 2415 | </P> | ||
| 2416 | <P> | ||
| 2417 | |||
| 2418 | <OL> | ||
| 2419 | <LI> | ||
| 2420 | The most-significant 10 bits of the virtual address (bits 22<small>...</small>31) | ||
| 2421 | index the page directory. If the PDE is marked "present," the | ||
| 2422 | physical address of a page table is read from the PDE thus obtained. | ||
| 2423 | If the PDE is marked "not present" then a page fault occurs. | ||
| 2424 | <P> | ||
| 2425 | |||
| 2426 | </P> | ||
| 2427 | <LI> | ||
| 2428 | The next 10 bits of the virtual address (bits 12<small>...</small>21) index | ||
| 2429 | the page table. If the PTE is marked "present," the physical | ||
| 2430 | address of a data page is read from the PTE thus obtained. If the PTE | ||
| 2431 | is marked "not present" then a page fault occurs. | ||
| 2432 | <P> | ||
| 2433 | |||
| 2434 | </P> | ||
| 2435 | <LI> | ||
| 2436 | The least-significant 12 bits of the virtual address (bits 0<small>...</small>11) | ||
| 2437 | are added to the data page's physical base address, yielding the final | ||
| 2438 | physical address. | ||
| 2439 | </OL> | ||
| 2440 | <P> | ||
| 2441 | |||
| 2442 | <TABLE><tr><td> </td><td class=example><pre> 31 22 21 12 11 0 | ||
| 2443 | +----------------------+----------------------+----------------------+ | ||
| 2444 | | Page Directory Index | Page Table Index | Page Offset | | ||
| 2445 | +----------------------+----------------------+----------------------+ | ||
| 2446 | | | | | ||
| 2447 | _______/ _______/ _____/ | ||
| 2448 | / / / | ||
| 2449 | / Page Directory / Page Table / Data Page | ||
| 2450 | / .____________. / .____________. / .____________. | ||
| 2451 | |1,023|____________| |1,023|____________| | |____________| | ||
| 2452 | |1,022|____________| |1,022|____________| | |____________| | ||
| 2453 | |1,021|____________| |1,021|____________| \__\|____________| | ||
| 2454 | |1,020|____________| |1,020|____________| /|____________| | ||
| 2455 | | | | | | | | | | ||
| 2456 | | | | \____\| |_ | | | ||
| 2457 | | | . | /| . | \ | . | | ||
| 2458 | \____\| . |_ | . | | | . | | ||
| 2459 | /| . | \ | . | | | . | | ||
| 2460 | | . | | | . | | | . | | ||
| 2461 | | | | | | | | | | ||
| 2462 | |____________| | |____________| | |____________| | ||
| 2463 | 4|____________| | 4|____________| | |____________| | ||
| 2464 | 3|____________| | 3|____________| | |____________| | ||
| 2465 | 2|____________| | 2|____________| | |____________| | ||
| 2466 | 1|____________| | 1|____________| | |____________| | ||
| 2467 | 0|____________| \__\0|____________| \____\|____________| | ||
| 2468 | / / | ||
| 2469 | </pre></td></tr></table><P> | ||
| 2470 | |||
| 2471 | Pintos provides some macros and functions that are useful for working | ||
| 2472 | with raw page tables: | ||
| 2473 | </P> | ||
| 2474 | <P> | ||
| 2475 | |||
| 2476 | <A NAME="IDX115"></A> | ||
| 2477 | </P> | ||
| 2478 | <DL> | ||
| 2479 | <DT><U>Macro:</U> <B>PTSHIFT</B> | ||
| 2480 | <DD><A NAME="IDX116"></A> | ||
| 2481 | <DT><U>Macro:</U> <B>PTBITS</B> | ||
| 2482 | <DD>The starting bit index (12) and number of bits (10), respectively, in a | ||
| 2483 | page table index. | ||
| 2484 | </DL> | ||
| 2485 | <P> | ||
| 2486 | |||
| 2487 | <A NAME="IDX117"></A> | ||
| 2488 | </P> | ||
| 2489 | <DL> | ||
| 2490 | <DT><U>Macro:</U> <B>PTMASK</B> | ||
| 2491 | <DD>A bit mask with the bits in the page table index set to 1 and the rest | ||
| 2492 | set to 0 (<TT>0x3ff000</TT>). | ||
| 2493 | </DL> | ||
| 2494 | <P> | ||
| 2495 | |||
| 2496 | <A NAME="IDX118"></A> | ||
| 2497 | </P> | ||
| 2498 | <DL> | ||
| 2499 | <DT><U>Macro:</U> <B>PTSPAN</B> | ||
| 2500 | <DD>The number of bytes of virtual address space that a single page table | ||
| 2501 | page covers (4,194,304 bytes, or 4 MB). | ||
| 2502 | </DL> | ||
| 2503 | <P> | ||
| 2504 | |||
| 2505 | <A NAME="IDX119"></A> | ||
| 2506 | </P> | ||
| 2507 | <DL> | ||
| 2508 | <DT><U>Macro:</U> <B>PDSHIFT</B> | ||
| 2509 | <DD><A NAME="IDX120"></A> | ||
| 2510 | <DT><U>Macro:</U> <B>PDBITS</B> | ||
| 2511 | <DD>The starting bit index (22) and number of bits (10), respectively, in a | ||
| 2512 | page directory index. | ||
| 2513 | </DL> | ||
| 2514 | <P> | ||
| 2515 | |||
| 2516 | <A NAME="IDX121"></A> | ||
| 2517 | </P> | ||
| 2518 | <DL> | ||
| 2519 | <DT><U>Macro:</U> <B>PDMASK</B> | ||
| 2520 | <DD>A bit mask with the bits in the page directory index set to 1 and other | ||
| 2521 | bits set to 0 (<TT>0xffc00000</TT>). | ||
| 2522 | </DL> | ||
| 2523 | <P> | ||
| 2524 | |||
| 2525 | <A NAME="IDX122"></A> | ||
| 2526 | </P> | ||
| 2527 | <DL> | ||
| 2528 | <DT><U>Function:</U> uintptr_t <B>pd_no</B> (const void *<VAR>va</VAR>) | ||
| 2529 | <DD><A NAME="IDX123"></A> | ||
| 2530 | <DT><U>Function:</U> uintptr_t <B>pt_no</B> (const void *<VAR>va</VAR>) | ||
| 2531 | <DD>Returns the page directory index or page table index, respectively, for | ||
| 2532 | virtual address <VAR>va</VAR>. These functions are defined in | ||
| 2533 | <Q><TT>threads/pte.h</TT></Q>. | ||
| 2534 | </DL> | ||
| 2535 | <P> | ||
| 2536 | |||
| 2537 | <A NAME="IDX124"></A> | ||
| 2538 | </P> | ||
| 2539 | <DL> | ||
| 2540 | <DT><U>Function:</U> unsigned <B>pg_ofs</B> (const void *<VAR>va</VAR>) | ||
| 2541 | <DD>Returns the page offset for virtual address <VAR>va</VAR>. This function is | ||
| 2542 | defined in <Q><TT>threads/vaddr.h</TT></Q>. | ||
| 2543 | </DL> | ||
| 2544 | <P> | ||
| 2545 | |||
| 2546 | <A NAME="Page Table Entry Format"></A> | ||
| 2547 | <HR SIZE="6"> | ||
| 2548 | <A NAME="SEC79"></A> | ||
| 2549 | <H4> A.7.4.2 Page Table Entry Format </H4> | ||
| 2550 | <!--docid::SEC79::--> | ||
| 2551 | <P> | ||
| 2552 | |||
| 2553 | You do not need to understand the PTE format to do the Pintos | ||
| 2554 | projects, unless you wish to incorporate the page table into your | ||
| 2555 | supplemental page table (see <A HREF="pintos_4.html#Managing the Supplemental Page Table">Managing the Supplemental Page Table</A>). | ||
| 2556 | </P> | ||
| 2557 | <P> | ||
| 2558 | |||
| 2559 | The actual format of a page table entry is summarized below. For | ||
| 2560 | complete information, refer to section 3.7, "Page Translation Using | ||
| 2561 | 32-Bit Physical Addressing," in [ <A HREF="pintos_10.html#IA32-v3a">IA32-v3a</A>]. | ||
| 2562 | </P> | ||
| 2563 | <P> | ||
| 2564 | |||
| 2565 | <TABLE><tr><td> </td><td class=example><pre> 31 12 11 9 6 5 2 1 0 | ||
| 2566 | +---------------------------------------+----+----+-+-+---+-+-+-+ | ||
| 2567 | | Physical Address | AVL| |D|A| |U|W|P| | ||
| 2568 | +---------------------------------------+----+----+-+-+---+-+-+-+ | ||
| 2569 | </pre></td></tr></table><P> | ||
| 2570 | |||
| 2571 | Some more information on each bit is given below. The names are | ||
| 2572 | <Q><TT>threads/pte.h</TT></Q> macros that represent the bits' values: | ||
| 2573 | </P> | ||
| 2574 | <P> | ||
| 2575 | |||
| 2576 | <A NAME="IDX125"></A> | ||
| 2577 | </P> | ||
| 2578 | <DL> | ||
| 2579 | <DT><U>Macro:</U> <B>PTE_P</B> | ||
| 2580 | <DD>Bit 0, the "present" bit. When this bit is 1, the | ||
| 2581 | other bits are interpreted as described below. When this bit is 0, any | ||
| 2582 | attempt to access the page will page fault. The remaining bits are then | ||
| 2583 | not used by the CPU and may be used by the OS for any purpose. | ||
| 2584 | </DL> | ||
| 2585 | <P> | ||
| 2586 | |||
| 2587 | <A NAME="IDX126"></A> | ||
| 2588 | </P> | ||
| 2589 | <DL> | ||
| 2590 | <DT><U>Macro:</U> <B>PTE_W</B> | ||
| 2591 | <DD>Bit 1, the "read/write" bit. When it is 1, the page | ||
| 2592 | is writable. When it is 0, write attempts will page fault. | ||
| 2593 | </DL> | ||
| 2594 | <P> | ||
| 2595 | |||
| 2596 | <A NAME="IDX127"></A> | ||
| 2597 | </P> | ||
| 2598 | <DL> | ||
| 2599 | <DT><U>Macro:</U> <B>PTE_U</B> | ||
| 2600 | <DD>Bit 2, the "user/supervisor" bit. When it is 1, user | ||
| 2601 | processes may access the page. When it is 0, only the kernel may access | ||
| 2602 | the page (user accesses will page fault). | ||
| 2603 | <P> | ||
| 2604 | |||
| 2605 | Pintos clears this bit in PTEs for kernel virtual memory, to prevent | ||
| 2606 | user processes from accessing them. | ||
| 2607 | </P> | ||
| 2608 | </DL> | ||
| 2609 | <P> | ||
| 2610 | |||
| 2611 | <A NAME="IDX128"></A> | ||
| 2612 | </P> | ||
| 2613 | <DL> | ||
| 2614 | <DT><U>Macro:</U> <B>PTE_A</B> | ||
| 2615 | <DD>Bit 5, the "accessed" bit. See section <A HREF="pintos_5.html#SEC76">A.7.3 Accessed and Dirty Bits</A>. | ||
| 2616 | </DL> | ||
| 2617 | <P> | ||
| 2618 | |||
| 2619 | <A NAME="IDX129"></A> | ||
| 2620 | </P> | ||
| 2621 | <DL> | ||
| 2622 | <DT><U>Macro:</U> <B>PTE_D</B> | ||
| 2623 | <DD>Bit 6, the "dirty" bit. See section <A HREF="pintos_5.html#SEC76">A.7.3 Accessed and Dirty Bits</A>. | ||
| 2624 | </DL> | ||
| 2625 | <P> | ||
| 2626 | |||
| 2627 | <A NAME="IDX130"></A> | ||
| 2628 | </P> | ||
| 2629 | <DL> | ||
| 2630 | <DT><U>Macro:</U> <B>PTE_AVL</B> | ||
| 2631 | <DD>Bits 9<small>...</small>11, available for operating system use. | ||
| 2632 | Pintos, as provided, does not use them and sets them to 0. | ||
| 2633 | </DL> | ||
| 2634 | <P> | ||
| 2635 | |||
| 2636 | <A NAME="IDX131"></A> | ||
| 2637 | </P> | ||
| 2638 | <DL> | ||
| 2639 | <DT><U>Macro:</U> <B>PTE_ADDR</B> | ||
| 2640 | <DD>Bits 12<small>...</small>31, the top 20 bits of the physical address of a frame. | ||
| 2641 | The low 12 bits of the frame's address are always 0. | ||
| 2642 | </DL> | ||
| 2643 | <P> | ||
| 2644 | |||
| 2645 | Other bits are either reserved or uninteresting in a Pintos context and | ||
| 2646 | should be set to@tie{}0. | ||
| 2647 | </P> | ||
| 2648 | <P> | ||
| 2649 | |||
| 2650 | Header <Q><TT>threads/pte.h</TT></Q> defines three functions for working with | ||
| 2651 | page table entries: | ||
| 2652 | </P> | ||
| 2653 | <P> | ||
| 2654 | |||
| 2655 | <A NAME="IDX132"></A> | ||
| 2656 | </P> | ||
| 2657 | <DL> | ||
| 2658 | <DT><U>Function:</U> uint32_t <B>pte_create_kernel</B> (uint32_t *<VAR>page</VAR>, bool <VAR>writable</VAR>) | ||
| 2659 | <DD>Returns a page table entry that points to <VAR>page</VAR>, which should be a | ||
| 2660 | kernel virtual address. The PTE's present bit will be set. It will be | ||
| 2661 | marked for kernel-only access. If <VAR>writable</VAR> is true, the PTE will | ||
| 2662 | also be marked read/write; otherwise, it will be read-only. | ||
| 2663 | </DL> | ||
| 2664 | <P> | ||
| 2665 | |||
| 2666 | <A NAME="IDX133"></A> | ||
| 2667 | </P> | ||
| 2668 | <DL> | ||
| 2669 | <DT><U>Function:</U> uint32_t <B>pte_create_user</B> (uint32_t *<VAR>page</VAR>, bool <VAR>writable</VAR>) | ||
| 2670 | <DD>Returns a page table entry that points to <VAR>page</VAR>, which should be | ||
| 2671 | the kernel virtual address of a frame in the user pool (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>). The PTE's present bit will be set and it will be marked to | ||
| 2672 | allow user-mode access. If <VAR>writable</VAR> is true, the PTE will also be | ||
| 2673 | marked read/write; otherwise, it will be read-only. | ||
| 2674 | </DL> | ||
| 2675 | <P> | ||
| 2676 | |||
| 2677 | <A NAME="IDX134"></A> | ||
| 2678 | </P> | ||
| 2679 | <DL> | ||
| 2680 | <DT><U>Function:</U> void *<B>pte_get_page</B> (uint32_t <VAR>pte</VAR>) | ||
| 2681 | <DD>Returns the kernel virtual address for the frame that <VAR>pte</VAR> points | ||
| 2682 | to. The <VAR>pte</VAR> may be present or not-present; if it is not-present | ||
| 2683 | then the pointer returned is only meaningful if the address bits in the PTE | ||
| 2684 | actually represent a physical address. | ||
| 2685 | </DL> | ||
| 2686 | <P> | ||
| 2687 | |||
| 2688 | <A NAME="Page Directory Entry Format"></A> | ||
| 2689 | <HR SIZE="6"> | ||
| 2690 | <A NAME="SEC80"></A> | ||
| 2691 | <H4> A.7.4.3 Page Directory Entry Format </H4> | ||
| 2692 | <!--docid::SEC80::--> | ||
| 2693 | <P> | ||
| 2694 | |||
| 2695 | Page directory entries have the same format as PTEs, except that the | ||
| 2696 | physical address points to a page table page instead of a frame. Header | ||
| 2697 | <Q><TT>threads/pte.h</TT></Q> defines two functions for working with page | ||
| 2698 | directory entries: | ||
| 2699 | </P> | ||
| 2700 | <P> | ||
| 2701 | |||
| 2702 | <A NAME="IDX135"></A> | ||
| 2703 | </P> | ||
| 2704 | <DL> | ||
| 2705 | <DT><U>Function:</U> uint32_t <B>pde_create</B> (uint32_t *<VAR>pt</VAR>) | ||
| 2706 | <DD>Returns a page directory that points to <VAR>page</VAR>, which should be the | ||
| 2707 | kernel virtual address of a page table page. The PDE's present bit will | ||
| 2708 | be set, it will be marked to allow user-mode access, and it will be | ||
| 2709 | marked read/write. | ||
| 2710 | </DL> | ||
| 2711 | <P> | ||
| 2712 | |||
| 2713 | <A NAME="IDX136"></A> | ||
| 2714 | </P> | ||
| 2715 | <DL> | ||
| 2716 | <DT><U>Function:</U> uint32_t *<B>pde_get_pt</B> (uint32_t <VAR>pde</VAR>) | ||
| 2717 | <DD>Returns the kernel virtual address for the page table page that | ||
| 2718 | <VAR>pde</VAR>, which must be marked present, points to. | ||
| 2719 | </DL> | ||
| 2720 | <P> | ||
| 2721 | |||
| 2722 | <A NAME="Hash Table"></A> | ||
| 2723 | <HR SIZE="6"> | ||
| 2724 | <A NAME="SEC81"></A> | ||
| 2725 | <H2> A.8 Hash Table </H2> | ||
| 2726 | <!--docid::SEC81::--> | ||
| 2727 | <P> | ||
| 2728 | |||
| 2729 | Pintos provides a hash table data structure in <Q><TT>lib/kernel/hash.c</TT></Q>. | ||
| 2730 | To use it you will need to include its header file, | ||
| 2731 | <Q><TT>lib/kernel/hash.h</TT></Q>, with <CODE>#include <hash.h></CODE>. | ||
| 2732 | No code provided with Pintos uses the hash table, which means that you | ||
| 2733 | are free to use it as is, modify its implementation for your own | ||
| 2734 | purposes, or ignore it, as you wish. | ||
| 2735 | </P> | ||
| 2736 | <P> | ||
| 2737 | |||
| 2738 | Most implementations of the virtual memory project use a hash table to | ||
| 2739 | translate pages to frames. You may find other uses for hash tables as | ||
| 2740 | well. | ||
| 2741 | </P> | ||
| 2742 | <P> | ||
| 2743 | |||
| 2744 | <A NAME="Hash Data Types"></A> | ||
| 2745 | <HR SIZE="6"> | ||
| 2746 | <A NAME="SEC82"></A> | ||
| 2747 | <H3> A.8.1 Data Types </H3> | ||
| 2748 | <!--docid::SEC82::--> | ||
| 2749 | <P> | ||
| 2750 | |||
| 2751 | A hash table is represented by <CODE>struct hash</CODE>. | ||
| 2752 | </P> | ||
| 2753 | <P> | ||
| 2754 | |||
| 2755 | <A NAME="IDX137"></A> | ||
| 2756 | </P> | ||
| 2757 | <DL> | ||
| 2758 | <DT><U>Type:</U> <B>struct hash</B> | ||
| 2759 | <DD>Represents an entire hash table. The actual members of <CODE>struct hash</CODE> | ||
| 2760 | are "opaque." That is, code that uses a hash table should not access | ||
| 2761 | <CODE>struct hash</CODE> members directly, nor should it need to. Instead, use | ||
| 2762 | hash table functions and macros. | ||
| 2763 | </DL> | ||
| 2764 | <P> | ||
| 2765 | |||
| 2766 | The hash table operates on elements of type <CODE>struct hash_elem</CODE>. | ||
| 2767 | </P> | ||
| 2768 | <P> | ||
| 2769 | |||
| 2770 | <A NAME="IDX138"></A> | ||
| 2771 | </P> | ||
| 2772 | <DL> | ||
| 2773 | <DT><U>Type:</U> <B>struct hash_elem</B> | ||
| 2774 | <DD>Embed a <CODE>struct hash_elem</CODE> member in the structure you want to include | ||
| 2775 | in a hash table. Like <CODE>struct hash</CODE>, <CODE>struct hash_elem</CODE> is opaque. | ||
| 2776 | All functions for operating on hash table elements actually take and | ||
| 2777 | return pointers to <CODE>struct hash_elem</CODE>, not pointers to your hash table's | ||
| 2778 | real element type. | ||
| 2779 | </DL> | ||
| 2780 | <P> | ||
| 2781 | |||
| 2782 | You will often need to obtain a <CODE>struct hash_elem</CODE> given a real element | ||
| 2783 | of the hash table, and vice versa. Given a real element of the hash | ||
| 2784 | table, you may use the <Q><SAMP>&</SAMP></Q> operator to obtain a pointer to its | ||
| 2785 | <CODE>struct hash_elem</CODE>. Use the <CODE>hash_entry()</CODE> macro to go the other | ||
| 2786 | direction. | ||
| 2787 | </P> | ||
| 2788 | <P> | ||
| 2789 | |||
| 2790 | <A NAME="IDX139"></A> | ||
| 2791 | </P> | ||
| 2792 | <DL> | ||
| 2793 | <DT><U>Macro:</U> <VAR>type</VAR> *<B>hash_entry</B> (struct hash_elem *<VAR>elem</VAR>, <VAR>type</VAR>, <VAR>member</VAR>) | ||
| 2794 | <DD>Returns a pointer to the structure that <VAR>elem</VAR>, a pointer to a | ||
| 2795 | <CODE>struct hash_elem</CODE>, is embedded within. You must provide <VAR>type</VAR>, | ||
| 2796 | the name of the structure that <VAR>elem</VAR> is inside, and <VAR>member</VAR>, | ||
| 2797 | the name of the member in <VAR>type</VAR> that <VAR>elem</VAR> points to. | ||
| 2798 | <P> | ||
| 2799 | |||
| 2800 | For example, suppose <CODE>h</CODE> is a <CODE>struct hash_elem *</CODE> variable | ||
| 2801 | that points to a <CODE>struct thread</CODE> member (of type <CODE>struct hash_elem</CODE>) | ||
| 2802 | named <CODE>h_elem</CODE>. Then, <CODE>hash_entry@tie{</CODE>(h, struct thread, h_elem)} | ||
| 2803 | yields the address of the <CODE>struct thread</CODE> that <CODE>h</CODE> points within. | ||
| 2804 | </P> | ||
| 2805 | </DL> | ||
| 2806 | <P> | ||
| 2807 | |||
| 2808 | See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for an example. | ||
| 2809 | </P> | ||
| 2810 | <P> | ||
| 2811 | |||
| 2812 | Each hash table element must contain a key, that is, data that | ||
| 2813 | identifies and distinguishes elements, which must be unique | ||
| 2814 | among elements in the hash table. (Elements may | ||
| 2815 | also contain non-key data that need not be unique.) While an element is | ||
| 2816 | in a hash table, its key data must not be changed. Instead, if need be, | ||
| 2817 | remove the element from the hash table, modify its key, then reinsert | ||
| 2818 | the element. | ||
| 2819 | </P> | ||
| 2820 | <P> | ||
| 2821 | |||
| 2822 | For each hash table, you must write two functions that act on keys: a | ||
| 2823 | hash function and a comparison function. These functions must match the | ||
| 2824 | following prototypes: | ||
| 2825 | </P> | ||
| 2826 | <P> | ||
| 2827 | |||
| 2828 | <A NAME="IDX140"></A> | ||
| 2829 | </P> | ||
| 2830 | <DL> | ||
| 2831 | <DT><U>Type:</U> <B>unsigned hash_hash_func (const struct hash_elem *<VAR>element</VAR>, void *<VAR>aux</VAR>)</B> | ||
| 2832 | <DD>Returns a hash of <VAR>element</VAR>'s data, as a value anywhere in the range | ||
| 2833 | of <CODE>unsigned int</CODE>. The hash of an element should be a | ||
| 2834 | pseudo-random function of the element's key. It must not depend on | ||
| 2835 | non-key data in the element or on any non-constant data other than the | ||
| 2836 | key. Pintos provides the following functions as a suitable basis for | ||
| 2837 | hash functions. | ||
| 2838 | <P> | ||
| 2839 | |||
| 2840 | <A NAME="IDX141"></A> | ||
| 2841 | </P> | ||
| 2842 | <DL> | ||
| 2843 | <DT><U>Function:</U> unsigned <B>hash_bytes</B> (const void *<VAR>buf</VAR>, size_t *<VAR>size</VAR>) | ||
| 2844 | <DD>Returns a hash of the <VAR>size</VAR> bytes starting at <VAR>buf</VAR>. The | ||
| 2845 | implementation is the general-purpose | ||
| 2846 | <A HREF="http://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash">Fowler-Noll-Vo | ||
| 2847 | hash</A> for 32-bit words. | ||
| 2848 | </DL> | ||
| 2849 | <P> | ||
| 2850 | |||
| 2851 | <A NAME="IDX142"></A> | ||
| 2852 | </P> | ||
| 2853 | <DL> | ||
| 2854 | <DT><U>Function:</U> unsigned <B>hash_string</B> (const char *<VAR>s</VAR>) | ||
| 2855 | <DD>Returns a hash of null-terminated string <VAR>s</VAR>. | ||
| 2856 | </DL> | ||
| 2857 | <P> | ||
| 2858 | |||
| 2859 | <A NAME="IDX143"></A> | ||
| 2860 | </P> | ||
| 2861 | <DL> | ||
| 2862 | <DT><U>Function:</U> unsigned <B>hash_int</B> (int <VAR>i</VAR>) | ||
| 2863 | <DD>Returns a hash of integer <VAR>i</VAR>. | ||
| 2864 | </DL> | ||
| 2865 | <P> | ||
| 2866 | |||
| 2867 | If your key is a single piece of data of an appropriate type, it is | ||
| 2868 | sensible for your hash function to directly return the output of one of | ||
| 2869 | these functions. For multiple pieces of data, you may wish to combine | ||
| 2870 | the output of more than one call to them using, e.g., the <Q><SAMP>^</SAMP></Q> | ||
| 2871 | (exclusive or) | ||
| 2872 | operator. Finally, you may entirely ignore these functions and write | ||
| 2873 | your own hash function from scratch, but remember that your goal is to | ||
| 2874 | build an operating system kernel, not to design a hash function. | ||
| 2875 | </P> | ||
| 2876 | <P> | ||
| 2877 | |||
| 2878 | See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>. | ||
| 2879 | </P> | ||
| 2880 | </DL> | ||
| 2881 | <P> | ||
| 2882 | |||
| 2883 | <A NAME="IDX144"></A> | ||
| 2884 | </P> | ||
| 2885 | <DL> | ||
| 2886 | <DT><U>Type:</U> <B>bool hash_less_func (const struct hash_elem *<VAR>a</VAR>, const struct hash_elem *<VAR>b</VAR>, void *<VAR>aux</VAR>)</B> | ||
| 2887 | <DD>Compares the keys stored in elements <VAR>a</VAR> and <VAR>b</VAR>. Returns | ||
| 2888 | true if <VAR>a</VAR> is less than <VAR>b</VAR>, false if <VAR>a</VAR> is greater than | ||
| 2889 | or equal to <VAR>b</VAR>. | ||
| 2890 | <P> | ||
| 2891 | |||
| 2892 | If two elements compare equal, then they must hash to equal values. | ||
| 2893 | </P> | ||
| 2894 | <P> | ||
| 2895 | |||
| 2896 | See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>. | ||
| 2897 | </P> | ||
| 2898 | </DL> | ||
| 2899 | <P> | ||
| 2900 | |||
| 2901 | See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for hash and comparison function examples. | ||
| 2902 | </P> | ||
| 2903 | <P> | ||
| 2904 | |||
| 2905 | A few functions accept a pointer to a third kind of | ||
| 2906 | function as an argument: | ||
| 2907 | </P> | ||
| 2908 | <P> | ||
| 2909 | |||
| 2910 | <A NAME="IDX145"></A> | ||
| 2911 | </P> | ||
| 2912 | <DL> | ||
| 2913 | <DT><U>Type:</U> <B>void hash_action_func (struct hash_elem *<VAR>element</VAR>, void *<VAR>aux</VAR>)</B> | ||
| 2914 | <DD>Performs some kind of action, chosen by the caller, on <VAR>element</VAR>. | ||
| 2915 | <P> | ||
| 2916 | |||
| 2917 | See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>. | ||
| 2918 | </P> | ||
| 2919 | </DL> | ||
| 2920 | <P> | ||
| 2921 | |||
| 2922 | <A NAME="Basic Hash Functions"></A> | ||
| 2923 | <HR SIZE="6"> | ||
| 2924 | <A NAME="SEC83"></A> | ||
| 2925 | <H3> A.8.2 Basic Functions </H3> | ||
| 2926 | <!--docid::SEC83::--> | ||
| 2927 | <P> | ||
| 2928 | |||
| 2929 | These functions create, destroy, and inspect hash tables. | ||
| 2930 | </P> | ||
| 2931 | <P> | ||
| 2932 | |||
| 2933 | <A NAME="IDX146"></A> | ||
| 2934 | </P> | ||
| 2935 | <DL> | ||
| 2936 | <DT><U>Function:</U> bool <B>hash_init</B> (struct hash *<VAR>hash</VAR>, hash_hash_func *<VAR>hash_func</VAR>, hash_less_func *<VAR>less_func</VAR>, void *<VAR>aux</VAR>) | ||
| 2937 | <DD>Initializes <VAR>hash</VAR> as a hash table with <VAR>hash_func</VAR> as hash | ||
| 2938 | function, <VAR>less_func</VAR> as comparison function, and <VAR>aux</VAR> as | ||
| 2939 | auxiliary data. | ||
| 2940 | Returns true if successful, false on failure. <CODE>hash_init()</CODE> calls | ||
| 2941 | <CODE>malloc()</CODE> and fails if memory cannot be allocated. | ||
| 2942 | <P> | ||
| 2943 | |||
| 2944 | See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>, which is | ||
| 2945 | most often a null pointer. | ||
| 2946 | </P> | ||
| 2947 | </DL> | ||
| 2948 | <P> | ||
| 2949 | |||
| 2950 | <A NAME="IDX147"></A> | ||
| 2951 | </P> | ||
| 2952 | <DL> | ||
| 2953 | <DT><U>Function:</U> void <B>hash_clear</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>) | ||
| 2954 | <DD>Removes all the elements from <VAR>hash</VAR>, which must have been | ||
| 2955 | previously initialized with <CODE>hash_init()</CODE>. | ||
| 2956 | <P> | ||
| 2957 | |||
| 2958 | If <VAR>action</VAR> is non-null, then it is called once for each element in | ||
| 2959 | the hash table, which gives the caller an opportunity to deallocate any | ||
| 2960 | memory or other resources used by the element. For example, if the hash | ||
| 2961 | table elements are dynamically allocated using <CODE>malloc()</CODE>, then | ||
| 2962 | <VAR>action</VAR> could <CODE>free()</CODE> the element. This is safe because | ||
| 2963 | <CODE>hash_clear()</CODE> will not access the memory in a given hash element | ||
| 2964 | after calling <VAR>action</VAR> on it. However, <VAR>action</VAR> must not call | ||
| 2965 | any function that may modify the hash table, such as <CODE>hash_insert()</CODE> | ||
| 2966 | or <CODE>hash_delete()</CODE>. | ||
| 2967 | </P> | ||
| 2968 | </DL> | ||
| 2969 | <P> | ||
| 2970 | |||
| 2971 | <A NAME="IDX148"></A> | ||
| 2972 | </P> | ||
| 2973 | <DL> | ||
| 2974 | <DT><U>Function:</U> void <B>hash_destroy</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>) | ||
| 2975 | <DD>If <VAR>action</VAR> is non-null, calls it for each element in the hash, with | ||
| 2976 | the same semantics as a call to <CODE>hash_clear()</CODE>. Then, frees the | ||
| 2977 | memory held by <VAR>hash</VAR>. Afterward, <VAR>hash</VAR> must not be passed to | ||
| 2978 | any hash table function, absent an intervening call to <CODE>hash_init()</CODE>. | ||
| 2979 | </DL> | ||
| 2980 | <P> | ||
| 2981 | |||
| 2982 | <A NAME="IDX149"></A> | ||
| 2983 | </P> | ||
| 2984 | <DL> | ||
| 2985 | <DT><U>Function:</U> size_t <B>hash_size</B> (struct hash *<VAR>hash</VAR>) | ||
| 2986 | <DD>Returns the number of elements currently stored in <VAR>hash</VAR>. | ||
| 2987 | </DL> | ||
| 2988 | <P> | ||
| 2989 | |||
| 2990 | <A NAME="IDX150"></A> | ||
| 2991 | </P> | ||
| 2992 | <DL> | ||
| 2993 | <DT><U>Function:</U> bool <B>hash_empty</B> (struct hash *<VAR>hash</VAR>) | ||
| 2994 | <DD>Returns true if <VAR>hash</VAR> currently contains no elements, | ||
| 2995 | false if <VAR>hash</VAR> contains at least one element. | ||
| 2996 | </DL> | ||
| 2997 | <P> | ||
| 2998 | |||
| 2999 | <A NAME="Hash Search Functions"></A> | ||
| 3000 | <HR SIZE="6"> | ||
| 3001 | <A NAME="SEC84"></A> | ||
| 3002 | <H3> A.8.3 Search Functions </H3> | ||
| 3003 | <!--docid::SEC84::--> | ||
| 3004 | <P> | ||
| 3005 | |||
| 3006 | Each of these functions searches a hash table for an element that | ||
| 3007 | compares equal to one provided. Based on the success of the search, | ||
| 3008 | they perform some action, such as inserting a new element into the hash | ||
| 3009 | table, or simply return the result of the search. | ||
| 3010 | </P> | ||
| 3011 | <P> | ||
| 3012 | |||
| 3013 | <A NAME="IDX151"></A> | ||
| 3014 | </P> | ||
| 3015 | <DL> | ||
| 3016 | <DT><U>Function:</U> struct hash_elem *<B>hash_insert</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>) | ||
| 3017 | <DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. If none is | ||
| 3018 | found, inserts <VAR>element</VAR> into <VAR>hash</VAR> and returns a null pointer. | ||
| 3019 | If the table already contains an element equal to <VAR>element</VAR>, it is | ||
| 3020 | returned without modifying <VAR>hash</VAR>. | ||
| 3021 | </DL> | ||
| 3022 | <P> | ||
| 3023 | |||
| 3024 | <A NAME="IDX152"></A> | ||
| 3025 | </P> | ||
| 3026 | <DL> | ||
| 3027 | <DT><U>Function:</U> struct hash_elem *<B>hash_replace</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>) | ||
| 3028 | <DD>Inserts <VAR>element</VAR> into <VAR>hash</VAR>. Any element equal to | ||
| 3029 | <VAR>element</VAR> already in <VAR>hash</VAR> is removed. Returns the element | ||
| 3030 | removed, or a null pointer if <VAR>hash</VAR> did not contain an element | ||
| 3031 | equal to <VAR>element</VAR>. | ||
| 3032 | <P> | ||
| 3033 | |||
| 3034 | The caller is responsible for deallocating any resources associated with | ||
| 3035 | the returned element, as appropriate. For example, if the hash table | ||
| 3036 | elements are dynamically allocated using <CODE>malloc()</CODE>, then the caller | ||
| 3037 | must <CODE>free()</CODE> the element after it is no longer needed. | ||
| 3038 | </P> | ||
| 3039 | </DL> | ||
| 3040 | <P> | ||
| 3041 | |||
| 3042 | The element passed to the following functions is only used for hashing | ||
| 3043 | and comparison purposes. It is never actually inserted into the hash | ||
| 3044 | table. Thus, only key data in the element needs to be initialized, and | ||
| 3045 | other data in the element will not be used. It often makes sense to | ||
| 3046 | declare an instance of the element type as a local variable, initialize | ||
| 3047 | the key data, and then pass the address of its <CODE>struct hash_elem</CODE> to | ||
| 3048 | <CODE>hash_find()</CODE> or <CODE>hash_delete()</CODE>. See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for | ||
| 3049 | an example. (Large structures should not be | ||
| 3050 | allocated as local variables. See section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>, for more | ||
| 3051 | information.) | ||
| 3052 | </P> | ||
| 3053 | <P> | ||
| 3054 | |||
| 3055 | <A NAME="IDX153"></A> | ||
| 3056 | </P> | ||
| 3057 | <DL> | ||
| 3058 | <DT><U>Function:</U> struct hash_elem *<B>hash_find</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>) | ||
| 3059 | <DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. Returns the | ||
| 3060 | element found, if any, or a null pointer otherwise. | ||
| 3061 | </DL> | ||
| 3062 | <P> | ||
| 3063 | |||
| 3064 | <A NAME="IDX154"></A> | ||
| 3065 | </P> | ||
| 3066 | <DL> | ||
| 3067 | <DT><U>Function:</U> struct hash_elem *<B>hash_delete</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>) | ||
| 3068 | <DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. If one is | ||
| 3069 | found, it is removed from <VAR>hash</VAR> and returned. Otherwise, a null | ||
| 3070 | pointer is returned and <VAR>hash</VAR> is unchanged. | ||
| 3071 | <P> | ||
| 3072 | |||
| 3073 | The caller is responsible for deallocating any resources associated with | ||
| 3074 | the returned element, as appropriate. For example, if the hash table | ||
| 3075 | elements are dynamically allocated using <CODE>malloc()</CODE>, then the caller | ||
| 3076 | must <CODE>free()</CODE> the element after it is no longer needed. | ||
| 3077 | </P> | ||
| 3078 | </DL> | ||
| 3079 | <P> | ||
| 3080 | |||
| 3081 | <A NAME="Hash Iteration Functions"></A> | ||
| 3082 | <HR SIZE="6"> | ||
| 3083 | <A NAME="SEC85"></A> | ||
| 3084 | <H3> A.8.4 Iteration Functions </H3> | ||
| 3085 | <!--docid::SEC85::--> | ||
| 3086 | <P> | ||
| 3087 | |||
| 3088 | These functions allow iterating through the elements in a hash table. | ||
| 3089 | Two interfaces are supplied. The first requires writing and supplying a | ||
| 3090 | <VAR>hash_action_func</VAR> to act on each element (see section <A HREF="pintos_5.html#SEC82">A.8.1 Data Types</A>). | ||
| 3091 | </P> | ||
| 3092 | <P> | ||
| 3093 | |||
| 3094 | <A NAME="IDX155"></A> | ||
| 3095 | </P> | ||
| 3096 | <DL> | ||
| 3097 | <DT><U>Function:</U> void <B>hash_apply</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>) | ||
| 3098 | <DD>Calls <VAR>action</VAR> once for each element in <VAR>hash</VAR>, in arbitrary | ||
| 3099 | order. <VAR>action</VAR> must not call any function that may modify the hash | ||
| 3100 | table, such as <CODE>hash_insert()</CODE> or <CODE>hash_delete()</CODE>. <VAR>action</VAR> | ||
| 3101 | must not modify key data in elements, although it may modify any other | ||
| 3102 | data. | ||
| 3103 | </DL> | ||
| 3104 | <P> | ||
| 3105 | |||
| 3106 | The second interface is based on an "iterator" data type. | ||
| 3107 | Idiomatically, iterators are used as follows: | ||
| 3108 | </P> | ||
| 3109 | <P> | ||
| 3110 | |||
| 3111 | <TABLE><tr><td> </td><td class=example><pre>struct hash_iterator i; | ||
| 3112 | |||
| 3113 | hash_first (&i, h); | ||
| 3114 | while (hash_next (&i)) | ||
| 3115 | { | ||
| 3116 | struct foo *f = hash_entry (hash_cur (&i), struct foo, elem); | ||
| 3117 | <small>...</small>do something with <I>f</I><small>...</small> | ||
| 3118 | } | ||
| 3119 | </pre></td></tr></table><P> | ||
| 3120 | |||
| 3121 | <A NAME="IDX156"></A> | ||
| 3122 | </P> | ||
| 3123 | <DL> | ||
| 3124 | <DT><U>Type:</U> <B>struct hash_iterator</B> | ||
| 3125 | <DD>Represents a position within a hash table. Calling any function that | ||
| 3126 | may modify a hash table, such as <CODE>hash_insert()</CODE> or | ||
| 3127 | <CODE>hash_delete()</CODE>, invalidates all iterators within that hash table. | ||
| 3128 | <P> | ||
| 3129 | |||
| 3130 | Like <CODE>struct hash</CODE> and <CODE>struct hash_elem</CODE>, <CODE>struct hash_elem</CODE> is opaque. | ||
| 3131 | </P> | ||
| 3132 | </DL> | ||
| 3133 | <P> | ||
| 3134 | |||
| 3135 | <A NAME="IDX157"></A> | ||
| 3136 | </P> | ||
| 3137 | <DL> | ||
| 3138 | <DT><U>Function:</U> void <B>hash_first</B> (struct hash_iterator *<VAR>iterator</VAR>, struct hash *<VAR>hash</VAR>) | ||
| 3139 | <DD>Initializes <VAR>iterator</VAR> to just before the first element in | ||
| 3140 | <VAR>hash</VAR>. | ||
| 3141 | </DL> | ||
| 3142 | <P> | ||
| 3143 | |||
| 3144 | <A NAME="IDX158"></A> | ||
| 3145 | </P> | ||
| 3146 | <DL> | ||
| 3147 | <DT><U>Function:</U> struct hash_elem *<B>hash_next</B> (struct hash_iterator *<VAR>iterator</VAR>) | ||
| 3148 | <DD>Advances <VAR>iterator</VAR> to the next element in <VAR>hash</VAR>, and returns | ||
| 3149 | that element. Returns a null pointer if no elements remain. After | ||
| 3150 | <CODE>hash_next()</CODE> returns null for <VAR>iterator</VAR>, calling it again | ||
| 3151 | yields undefined behavior. | ||
| 3152 | </DL> | ||
| 3153 | <P> | ||
| 3154 | |||
| 3155 | <A NAME="IDX159"></A> | ||
| 3156 | </P> | ||
| 3157 | <DL> | ||
| 3158 | <DT><U>Function:</U> struct hash_elem *<B>hash_cur</B> (struct hash_iterator *<VAR>iterator</VAR>) | ||
| 3159 | <DD>Returns the value most recently returned by <CODE>hash_next()</CODE> for | ||
| 3160 | <VAR>iterator</VAR>. Yields undefined behavior after <CODE>hash_first()</CODE> has | ||
| 3161 | been called on <VAR>iterator</VAR> but before <CODE>hash_next()</CODE> has been | ||
| 3162 | called for the first time. | ||
| 3163 | </DL> | ||
| 3164 | <P> | ||
| 3165 | |||
| 3166 | <A NAME="Hash Table Example"></A> | ||
| 3167 | <HR SIZE="6"> | ||
| 3168 | <A NAME="SEC86"></A> | ||
| 3169 | <H3> A.8.5 Hash Table Example </H3> | ||
| 3170 | <!--docid::SEC86::--> | ||
| 3171 | <P> | ||
| 3172 | |||
| 3173 | Suppose you have a structure, called <CODE>struct page</CODE>, that you | ||
| 3174 | want to put into a hash table. First, define <CODE>struct page</CODE> to include a | ||
| 3175 | <CODE>struct hash_elem</CODE> member: | ||
| 3176 | </P> | ||
| 3177 | <P> | ||
| 3178 | |||
| 3179 | <TABLE><tr><td> </td><td class=example><pre>struct page | ||
| 3180 | { | ||
| 3181 | struct hash_elem hash_elem; /* Hash table element. */ | ||
| 3182 | void *addr; /* Virtual address. */ | ||
| 3183 | /* <small>...</small>other members<small>...</small> */ | ||
| 3184 | }; | ||
| 3185 | </pre></td></tr></table><P> | ||
| 3186 | |||
| 3187 | We write a hash function and a comparison function using <VAR>addr</VAR> as | ||
| 3188 | the key. A pointer can be hashed based on its bytes, and the <Q><SAMP><</SAMP></Q> | ||
| 3189 | operator works fine for comparing pointers: | ||
| 3190 | </P> | ||
| 3191 | <P> | ||
| 3192 | |||
| 3193 | <TABLE><tr><td> </td><td class=example><pre>/* Returns a hash value for page <VAR>p</VAR>. */ | ||
| 3194 | unsigned | ||
| 3195 | page_hash (const struct hash_elem *p_, void *aux UNUSED) | ||
| 3196 | { | ||
| 3197 | const struct page *p = hash_entry (p_, struct page, hash_elem); | ||
| 3198 | return hash_bytes (&p->addr, sizeof p->addr); | ||
| 3199 | } | ||
| 3200 | |||
| 3201 | /* Returns true if page <VAR>a</VAR> precedes page <VAR>b</VAR>. */ | ||
| 3202 | bool | ||
| 3203 | page_less (const struct hash_elem *a_, const struct hash_elem *b_, | ||
| 3204 | void *aux UNUSED) | ||
| 3205 | { | ||
| 3206 | const struct page *a = hash_entry (a_, struct page, hash_elem); | ||
| 3207 | const struct page *b = hash_entry (b_, struct page, hash_elem); | ||
| 3208 | |||
| 3209 | return a->addr < b->addr; | ||
| 3210 | } | ||
| 3211 | </pre></td></tr></table><P> | ||
| 3212 | |||
| 3213 | (The use of <CODE>UNUSED</CODE> in these functions' prototypes suppresses a | ||
| 3214 | warning that <VAR>aux</VAR> is unused. See section <A HREF="pintos_8.html#SEC99">D.3 Function and Parameter Attributes</A>, for information about <CODE>UNUSED</CODE>. See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>.) | ||
| 3215 | </P> | ||
| 3216 | <P> | ||
| 3217 | |||
| 3218 | Then, we can create a hash table like this: | ||
| 3219 | </P> | ||
| 3220 | <P> | ||
| 3221 | |||
| 3222 | <TABLE><tr><td> </td><td class=example><pre>struct hash pages; | ||
| 3223 | |||
| 3224 | hash_init (&pages, page_hash, page_less, NULL); | ||
| 3225 | </pre></td></tr></table><P> | ||
| 3226 | |||
| 3227 | Now we can manipulate the hash table we've created. If <CODE><VAR>p</VAR></CODE> | ||
| 3228 | is a pointer to a <CODE>struct page</CODE>, we can insert it into the hash table | ||
| 3229 | with: | ||
| 3230 | </P> | ||
| 3231 | <P> | ||
| 3232 | |||
| 3233 | <TABLE><tr><td> </td><td class=example><pre>hash_insert (&pages, &p->hash_elem); | ||
| 3234 | </pre></td></tr></table><P> | ||
| 3235 | |||
| 3236 | If there's a chance that <VAR>pages</VAR> might already contain a | ||
| 3237 | page with the same <VAR>addr</VAR>, then we should check <CODE>hash_insert()</CODE>'s | ||
| 3238 | return value. | ||
| 3239 | </P> | ||
| 3240 | <P> | ||
| 3241 | |||
| 3242 | To search for an element in the hash table, use <CODE>hash_find()</CODE>. This | ||
| 3243 | takes a little setup, because <CODE>hash_find()</CODE> takes an element to | ||
| 3244 | compare against. Here's a function that will find and return a page | ||
| 3245 | based on a virtual address, assuming that <VAR>pages</VAR> is defined at file | ||
| 3246 | scope: | ||
| 3247 | </P> | ||
| 3248 | <P> | ||
| 3249 | |||
| 3250 | <TABLE><tr><td> </td><td class=example><pre>/* Returns the page containing the given virtual <VAR>address</VAR>, | ||
| 3251 | or a null pointer if no such page exists. */ | ||
| 3252 | struct page * | ||
| 3253 | page_lookup (const void *address) | ||
| 3254 | { | ||
| 3255 | struct page p; | ||
| 3256 | struct hash_elem *e; | ||
| 3257 | |||
| 3258 | p.addr = address; | ||
| 3259 | e = hash_find (&pages, &p.hash_elem); | ||
| 3260 | return e != NULL ? hash_entry (e, struct page, hash_elem) : NULL; | ||
| 3261 | } | ||
| 3262 | </pre></td></tr></table><P> | ||
| 3263 | |||
| 3264 | <CODE>struct page</CODE> is allocated as a local variable here on the assumption | ||
| 3265 | that it is fairly small. Large structures should not be allocated as | ||
| 3266 | local variables. See section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>, for more information. | ||
| 3267 | </P> | ||
| 3268 | <P> | ||
| 3269 | |||
| 3270 | A similar function could delete a page by address using | ||
| 3271 | <CODE>hash_delete()</CODE>. | ||
| 3272 | </P> | ||
| 3273 | <P> | ||
| 3274 | |||
| 3275 | <A NAME="Hash Auxiliary Data"></A> | ||
| 3276 | <HR SIZE="6"> | ||
| 3277 | <A NAME="SEC87"></A> | ||
| 3278 | <H3> A.8.6 Auxiliary Data </H3> | ||
| 3279 | <!--docid::SEC87::--> | ||
| 3280 | <P> | ||
| 3281 | |||
| 3282 | In simple cases like the example above, there's no need for the | ||
| 3283 | <VAR>aux</VAR> parameters. In these cases, just pass a null pointer to | ||
| 3284 | <CODE>hash_init()</CODE> for <VAR>aux</VAR> and ignore the values passed to the hash | ||
| 3285 | function and comparison functions. (You'll get a compiler warning if | ||
| 3286 | you don't use the <VAR>aux</VAR> parameter, but you can turn that off with | ||
| 3287 | the <CODE>UNUSED</CODE> macro, as shown in the example, or you can just ignore | ||
| 3288 | it.) | ||
| 3289 | </P> | ||
| 3290 | <P> | ||
| 3291 | |||
| 3292 | <VAR>aux</VAR> is useful when you have some property of the data in the | ||
| 3293 | hash table is both constant and needed for hashing or comparison, | ||
| 3294 | but not stored in the data items themselves. For example, if | ||
| 3295 | the items in a hash table are fixed-length strings, but the items | ||
| 3296 | themselves don't indicate what that fixed length is, you could pass | ||
| 3297 | the length as an <VAR>aux</VAR> parameter. | ||
| 3298 | </P> | ||
| 3299 | <P> | ||
| 3300 | |||
| 3301 | <A NAME="Hash Synchronization"></A> | ||
| 3302 | <HR SIZE="6"> | ||
| 3303 | <A NAME="SEC88"></A> | ||
| 3304 | <H3> A.8.7 Synchronization </H3> | ||
| 3305 | <!--docid::SEC88::--> | ||
| 3306 | <P> | ||
| 3307 | |||
| 3308 | The hash table does not do any internal synchronization. It is the | ||
| 3309 | caller's responsibility to synchronize calls to hash table functions. | ||
| 3310 | In general, any number of functions that examine but do not modify the | ||
| 3311 | hash table, such as <CODE>hash_find()</CODE> or <CODE>hash_next()</CODE>, may execute | ||
| 3312 | simultaneously. However, these function cannot safely execute at the | ||
| 3313 | same time as any function that may modify a given hash table, such as | ||
| 3314 | <CODE>hash_insert()</CODE> or <CODE>hash_delete()</CODE>, nor may more than one function | ||
| 3315 | that can modify a given hash table execute safely at once. | ||
| 3316 | </P> | ||
| 3317 | <P> | ||
| 3318 | |||
| 3319 | It is also the caller's responsibility to synchronize access to data in | ||
| 3320 | hash table elements. How to synchronize access to this data depends on | ||
| 3321 | how it is designed and organized, as with any other data structure. | ||
| 3322 | </P> | ||
| 3323 | <P> | ||
| 3324 | |||
| 3325 | <A NAME="Coding Standards"></A> | ||
| 3326 | <HR SIZE="6"> | ||
| 3327 | <TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0> | ||
| 3328 | <TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_5.html#SEC48"> << </A>]</TD> | ||
| 3329 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_6.html#SEC89"> >> </A>]</TD> | ||
| 3330 | <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD> | ||
| 3331 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD> | ||
| 3332 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD> | ||
| 3333 | <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD> | ||
| 3334 | </TR></TABLE> | ||
| 3335 | <BR> | ||
| 3336 | <FONT SIZE="-1"> | ||
| 3337 | This document was generated | ||
| 3338 | by on <I>March, 6 2012</I> | ||
| 3339 | using <A HREF="http://texi2html.cvshome.org"><I>texi2html</I></A> | ||
| 3340 | </FONT> | ||
| 3341 | |||
| 3342 | </BODY> | ||
| 3343 | </HTML> | ||
