summaryrefslogtreecommitdiffstats
path: root/doc/pintos_5.html
diff options
context:
space:
mode:
authormanuel <manuel@mausz.at>2012-03-26 12:54:45 +0200
committermanuel <manuel@mausz.at>2012-03-26 12:54:45 +0200
commitb5f0874cd96ee2a62aabc645b9626c2749cb6a01 (patch)
tree1262e4bbe0634de6650be130c36e0538240f4cbf /doc/pintos_5.html
downloadprogos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.tar.gz
progos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.tar.bz2
progos-b5f0874cd96ee2a62aabc645b9626c2749cb6a01.zip
initial pintos checkin
Diffstat (limited to 'doc/pintos_5.html')
-rw-r--r--doc/pintos_5.html3343
1 files changed, 3343 insertions, 0 deletions
diff --git a/doc/pintos_5.html b/doc/pintos_5.html
new file mode 100644
index 0000000..670f351
--- /dev/null
+++ b/doc/pintos_5.html
@@ -0,0 +1,3343 @@
1<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2 "http://www.w3.org/TR/html40/loose.dtd">
3<HTML>
4<!-- Created on March, 6 2012 by texi2html 1.66 -->
5<!--
6Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
7 Karl Berry <karl@freefriends.org>
8 Olaf Bachmann <obachman@mathematik.uni-kl.de>
9 and many others.
10Maintained by: Many creative people <dev@texi2html.cvshome.org>
11Send bugs and suggestions to <users@texi2html.cvshome.org>
12
13-->
14<HEAD>
15<TITLE>Pintos Projects: Reference Guide</TITLE>
16
17<META NAME="description" CONTENT="Pintos Projects: Reference Guide">
18<META NAME="keywords" CONTENT="Pintos Projects: Reference Guide">
19<META NAME="resource-type" CONTENT="document">
20<META NAME="distribution" CONTENT="global">
21<META NAME="Generator" CONTENT="texi2html 1.66">
22<LINK REL="stylesheet" HREF="pintos.css">
23</HEAD>
24
25<BODY >
26
27<A NAME="SEC48"></A>
28<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
29<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_4.html#SEC47"> &lt;&lt; </A>]</TD>
30<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_6.html#SEC89"> &gt;&gt; </A>]</TD>
31<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD>
32<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD>
33<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
34<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD>
35</TR></TABLE>
36
37<HR SIZE=2>
38<H1> A. Reference Guide </H1>
39<!--docid::SEC48::-->
40<P>
41
42This chapter is a reference for the Pintos code. The reference guide
43does not cover all of the code in Pintos, but it does cover those
44pieces that students most often find troublesome. You may find that
45you want to read each part of the reference guide as you work on the
46project where it becomes important.
47</P>
48<P>
49
50We recommend using &quot;tags&quot; to follow along with references to function
51and variable names (see section <A HREF="pintos_9.html#SEC110">E.1 Tags</A>).
52</P>
53<P>
54
55<A NAME="Pintos Loading"></A>
56<HR SIZE="6">
57<A NAME="SEC49"></A>
58<H2> A.1 Loading </H2>
59<!--docid::SEC49::-->
60<P>
61
62This section covers the Pintos loader and basic kernel
63initialization.
64</P>
65<P>
66
67<A NAME="Pintos Loader"></A>
68<HR SIZE="6">
69<A NAME="SEC50"></A>
70<H3> A.1.1 The Loader </H3>
71<!--docid::SEC50::-->
72<P>
73
74The first part of Pintos that runs is the loader, in
75<Q><TT>threads/loader.S</TT></Q>. The PC BIOS loads the loader into memory.
76The loader, in turn, is responsible for finding the kernel on disk,
77loading it into memory, and then jumping to its start. It's
78not important to understand exactly how the loader works, but if
79you're interested, read on. You should probably read along with the
80loader's source. You should also understand the basics of the
8180<VAR>x</VAR>86 architecture as described by chapter 3, &quot;Basic Execution
82Environment,&quot; of [ <A HREF="pintos_10.html#IA32-v1">IA32-v1</A>].
83</P>
84<P>
85
86The PC BIOS loads the loader from the first sector of the first hard
87disk, called the <EM>master boot record</EM> (MBR). PC conventions
88reserve 64 bytes of the MBR for the partition table, and Pintos uses
89about 128 additional bytes for kernel command-line arguments. This
90leaves a little over 300 bytes for the loader's own code. This is a
91severe restriction that means, practically speaking, the loader must
92be written in assembly language.
93</P>
94<P>
95
96The Pintos loader and kernel don't have to be on the same disk, nor
97does is the kernel required to be in any particular location on a
98given disk. The loader's first job, then, is to find the kernel by
99reading the partition table on each hard disk, looking for a bootable
100partition of the type used for a Pintos kernel.
101</P>
102<P>
103
104When the loader finds a bootable kernel partition, it reads the
105partition's contents into memory at physical address 128 kB. The
106kernel is at the beginning of the partition, which might be larger
107than necessary due to partition boundary alignment conventions, so the
108loader reads no more than 512 kB (and the Pintos build process
109will refuse to produce kernels larger than that). Reading more data
110than this would cross into the region from 640 kB to 1 MB that
111the PC architecture reserves for hardware and the BIOS, and a standard
112PC BIOS does not provide any means to load the kernel above 1 MB.
113</P>
114<P>
115
116The loader's final job is to extract the entry point from the loaded
117kernel image and transfer control to it. The entry point is not at a
118predictable location, but the kernel's ELF header contains a pointer
119to it. The loader extracts the pointer and jumps to the location it
120points to.
121</P>
122<P>
123
124The Pintos kernel command line
125is stored in the boot loader. The <CODE>pintos</CODE> program actually
126modifies a copy of the boot loader on disk each time it runs the kernel,
127inserting whatever command-line arguments the user supplies to the kernel,
128and then the kernel at boot time reads those arguments out of the boot
129loader in memory. This is not an elegant solution, but it is simple
130and effective.
131</P>
132<P>
133
134<A NAME="Low-Level Kernel Initialization"></A>
135<HR SIZE="6">
136<A NAME="SEC51"></A>
137<H3> A.1.2 Low-Level Kernel Initialization </H3>
138<!--docid::SEC51::-->
139<P>
140
141The loader's last action is to transfer control to the kernel's entry
142point, which is <CODE>start()</CODE> in <Q><TT>threads/start.S</TT></Q>. The job of
143this code is to switch the CPU from legacy 16-bit &quot;real mode&quot; into
144the 32-bit &quot;protected mode&quot; used by all modern 80<VAR>x</VAR>86 operating
145systems.
146</P>
147<P>
148
149The startup code's first task is actually to obtain the machine's
150memory size, by asking the BIOS for the PC's memory size. The
151simplest BIOS function to do this can only detect up to 64 MB of RAM,
152so that's the practical limit that Pintos can support. The function
153stores the memory size, in pages, in global variable
154<CODE>init_ram_pages</CODE>.
155</P>
156<P>
157
158The first part of CPU initialization is to enable the A20 line, that
159is, the CPU's address line numbered 20. For historical reasons, PCs
160boot with this address line fixed at 0, which means that attempts to
161access memory beyond the first 1 MB (2 raised to the 20th power) will
162fail. Pintos wants to access more memory than this, so we have to
163enable it.
164</P>
165<P>
166
167Next, the loader creates a basic page table. This page table maps
168the 64 MB at the base of virtual memory (starting at virtual address
1690) directly to the identical physical addresses. It also maps the
170same physical memory starting at virtual address
171<CODE>LOADER_PHYS_BASE</CODE>, which defaults to <TT>0xc0000000</TT> (3 GB). The
172Pintos kernel only wants the latter mapping, but there's a
173chicken-and-egg problem if we don't include the former: our current
174virtual address is roughly <TT>0x20000</TT>, the location where the loader
175put us, and we can't jump to <TT>0xc0020000</TT> until we turn on the
176page table, but if we turn on the page table without jumping there,
177then we've just pulled the rug out from under ourselves.
178</P>
179<P>
180
181After the page table is initialized, we load the CPU's control
182registers to turn on protected mode and paging, and set up the segment
183registers. We aren't yet equipped to handle interrupts in protected
184mode, so we disable interrupts. The final step is to call <CODE>main()</CODE>.
185</P>
186<P>
187
188<A NAME="High-Level Kernel Initialization"></A>
189<HR SIZE="6">
190<A NAME="SEC52"></A>
191<H3> A.1.3 High-Level Kernel Initialization </H3>
192<!--docid::SEC52::-->
193<P>
194
195The kernel proper starts with the <CODE>main()</CODE> function. The
196<CODE>main()</CODE> function is written in C, as will be most of the code we
197encounter in Pintos from here on out.
198</P>
199<P>
200
201When <CODE>main()</CODE> starts, the system is in a pretty raw state. We're
202in 32-bit protected mode with paging enabled, but hardly anything else is
203ready. Thus, the <CODE>main()</CODE> function consists primarily of calls
204into other Pintos modules' initialization functions.
205These are usually named <CODE><VAR>module</VAR>_init()</CODE>, where
206<VAR>module</VAR> is the module's name, <Q><TT><VAR>module</VAR>.c</TT></Q> is the
207module's source code, and <Q><TT><VAR>module</VAR>.h</TT></Q> is the module's
208header.
209</P>
210<P>
211
212The first step in <CODE>main()</CODE> is to call <CODE>bss_init()</CODE>, which clears
213out the kernel's &quot;BSS&quot;, which is the traditional name for a
214segment that should be initialized to all zeros. In most C
215implementations, whenever you
216declare a variable outside a function without providing an
217initializer, that variable goes into the BSS. Because it's all zeros, the
218BSS isn't stored in the image that the loader brought into memory. We
219just use <CODE>memset()</CODE> to zero it out.
220</P>
221<P>
222
223Next, <CODE>main()</CODE> calls <CODE>read_command_line()</CODE> to break the kernel command
224line into arguments, then <CODE>parse_options()</CODE> to read any options at
225the beginning of the command line. (Actions specified on the
226command line execute later.)
227</P>
228<P>
229
230<CODE>thread_init()</CODE> initializes the thread system. We will defer full
231discussion to our discussion of Pintos threads below. It is called so
232early in initialization because a valid thread structure is a
233prerequisite for acquiring a lock, and lock acquisition in turn is
234important to other Pintos subsystems. Then we initialize the console
235and print a startup message to the console.
236</P>
237<P>
238
239The next block of functions we call initializes the kernel's memory
240system. <CODE>palloc_init()</CODE> sets up the kernel page allocator, which
241doles out memory one or more pages at a time (see section <A HREF="pintos_5.html#SEC70">A.5.1 Page Allocator</A>).
242<CODE>malloc_init()</CODE> sets
243up the allocator that handles allocations of arbitrary-size blocks of
244memory (see section <A HREF="pintos_5.html#SEC71">A.5.2 Block Allocator</A>).
245<CODE>paging_init()</CODE> sets up a page table for the kernel (see section <A HREF="pintos_5.html#SEC73">A.7 Page Table</A>).
246</P>
247<P>
248
249In projects 2 and later, <CODE>main()</CODE> also calls <CODE>tss_init()</CODE> and
250<CODE>gdt_init()</CODE>.
251</P>
252<P>
253
254The next set of calls initializes the interrupt system.
255<CODE>intr_init()</CODE> sets up the CPU's <EM>interrupt descriptor table</EM>
256(IDT) to ready it for interrupt handling (see section <A HREF="pintos_5.html#SEC66">A.4.1 Interrupt Infrastructure</A>), then <CODE>timer_init()</CODE> and <CODE>kbd_init()</CODE> prepare for
257handling timer interrupts and keyboard interrupts, respectively.
258<CODE>input_init()</CODE> sets up to merge serial and keyboard input into one
259stream. In
260projects 2 and later, we also prepare to handle interrupts caused by
261user programs using <CODE>exception_init()</CODE> and <CODE>syscall_init()</CODE>.
262</P>
263<P>
264
265Now that interrupts are set up, we can start the scheduler
266with <CODE>thread_start()</CODE>, which creates the idle thread and enables
267interrupts.
268With interrupts enabled, interrupt-driven serial port I/O becomes
269possible, so we use
270<CODE>serial_init_queue()</CODE> to switch to that mode. Finally,
271<CODE>timer_calibrate()</CODE> calibrates the timer for accurate short delays.
272</P>
273<P>
274
275If the file system is compiled in, as it will starting in project 2, we
276initialize the IDE disks with <CODE>ide_init()</CODE>, then the
277file system with <CODE>filesys_init()</CODE>.
278</P>
279<P>
280
281Boot is complete, so we print a message.
282</P>
283<P>
284
285Function <CODE>run_actions()</CODE> now parses and executes actions specified on
286the kernel command line, such as <CODE>run</CODE> to run a test (in project
2871) or a user program (in later projects).
288</P>
289<P>
290
291Finally, if <Q><SAMP>-q</SAMP></Q> was specified on the kernel command line, we
292call <CODE>shutdown_power_off()</CODE> to terminate the machine simulator. Otherwise,
293<CODE>main()</CODE> calls <CODE>thread_exit()</CODE>, which allows any other running
294threads to continue running.
295</P>
296<P>
297
298<A NAME="Physical Memory Map"></A>
299<HR SIZE="6">
300<A NAME="SEC53"></A>
301<H3> A.1.4 Physical Memory Map </H3>
302<!--docid::SEC53::-->
303<P>
304
305</P>
306<TABLE>
307@headitem Memory Range
308</TD><TD> Owner
309</TD><TD> Contents
310
311<TR><TD><TT>00000000</TT>--<TT>000003ff</TT> </TD><TD> CPU </TD><TD> Real mode interrupt table.</TD>
312</TR>
313<TR><TD><TT>00000400</TT>--<TT>000005ff</TT> </TD><TD> BIOS </TD><TD> Miscellaneous data area.</TD>
314</TR>
315<TR><TD><TT>00000600</TT>--<TT>00007bff</TT> </TD><TD> -- </TD><TD> ---</TD>
316</TR>
317<TR><TD><TT>00007c00</TT>--<TT>00007dff</TT> </TD><TD> Pintos </TD><TD> Loader.</TD>
318</TR>
319<TR><TD><TT>0000e000</TT>--<TT>0000efff</TT> </TD><TD> Pintos</TD>
320</TD><TD> Stack for loader; kernel stack and <CODE>struct thread</CODE> for initial
321kernel thread.
322</TR>
323<TR><TD><TT>0000f000</TT>--<TT>0000ffff</TT> </TD><TD> Pintos</TD>
324</TD><TD> Page directory for startup code.
325</TR>
326<TR><TD><TT>00010000</TT>--<TT>00020000</TT> </TD><TD> Pintos</TD>
327</TD><TD> Page tables for startup code.
328</TR>
329<TR><TD><TT>00020000</TT>--<TT>0009ffff</TT> </TD><TD> Pintos</TD>
330</TD><TD> Kernel code, data, and uninitialized data segments.
331</TR>
332<TR><TD><TT>000a0000</TT>--<TT>000bffff</TT> </TD><TD> Video </TD><TD> VGA display memory.</TD>
333</TR>
334<TR><TD><TT>000c0000</TT>--<TT>000effff</TT> </TD><TD> Hardware</TD>
335</TD><TD> Reserved for expansion card RAM and ROM.
336</TR>
337<TR><TD><TT>000f0000</TT>--<TT>000fffff</TT> </TD><TD> BIOS </TD><TD> ROM BIOS.</TD>
338</TR>
339<TR><TD><TT>00100000</TT>--<TT>03ffffff</TT> </TD><TD> Pintos </TD><TD> Dynamic memory allocation.</TD>
340</TR></TABLE>
341<P>
342
343<A NAME="Threads"></A>
344<HR SIZE="6">
345<A NAME="SEC54"></A>
346<H2> A.2 Threads </H2>
347<!--docid::SEC54::-->
348<P>
349
350<A NAME="struct thread"></A>
351<HR SIZE="6">
352<A NAME="SEC55"></A>
353<H3> A.2.1 <CODE>struct thread</CODE> </H3>
354<!--docid::SEC55::-->
355<P>
356
357The main Pintos data structure for threads is <CODE>struct thread</CODE>,
358declared in <Q><TT>threads/thread.h</TT></Q>.
359</P>
360<P>
361
362<A NAME="IDX4"></A>
363</P>
364<DL>
365<DT><U>Structure:</U> <B>struct thread</B>
366<DD>Represents a thread or a user process. In the projects, you will have
367to add your own members to <CODE>struct thread</CODE>. You may also change or
368delete the definitions of existing members.
369<P>
370
371Every <CODE>struct thread</CODE> occupies the beginning of its own page of
372memory. The rest of the page is used for the thread's stack, which
373grows downward from the end of the page. It looks like this:
374</P>
375<P>
376
377<TABLE><tr><td>&nbsp;</td><td class=example><pre> 4 kB +---------------------------------+
378 | kernel stack |
379 | | |
380 | | |
381 | V |
382 | grows downward |
383 | |
384 | |
385 | |
386 | |
387 | |
388 | |
389 | |
390 | |
391sizeof (struct thread) +---------------------------------+
392 | magic |
393 | : |
394 | : |
395 | status |
396 | tid |
397 0 kB +---------------------------------+
398</pre></td></tr></table><P>
399
400This has two consequences. First, <CODE>struct thread</CODE> must not be allowed
401to grow too big. If it does, then there will not be enough room for the
402kernel stack. The base <CODE>struct thread</CODE> is only a few bytes in size. It
403probably should stay well under 1 kB.
404</P>
405<P>
406
407Second, kernel stacks must not be allowed to grow too large. If a stack
408overflows, it will corrupt the thread state. Thus, kernel functions
409should not allocate large structures or arrays as non-static local
410variables. Use dynamic allocation with <CODE>malloc()</CODE> or
411<CODE>palloc_get_page()</CODE> instead (see section <A HREF="pintos_5.html#SEC69">A.5 Memory Allocation</A>).
412</P>
413</DL>
414<P>
415
416<A NAME="IDX5"></A>
417</P>
418<DL>
419<DT><U>Member of <CODE>struct thread</CODE>:</U> tid_t <B>tid</B>
420<DD>The thread's thread identifier or <EM>tid</EM>. Every thread must have a
421tid that is unique over the entire lifetime of the kernel. By
422default, <CODE>tid_t</CODE> is a <CODE>typedef</CODE> for <CODE>int</CODE> and each new
423thread receives the numerically next higher tid, starting from 1 for
424the initial process. You can change the type and the numbering scheme
425if you like.
426</DL>
427<P>
428
429<A NAME="IDX6"></A>
430</P>
431<DL>
432<DT><U>Member of <CODE>struct thread</CODE>:</U> enum thread_status <B>status</B>
433<DD><A NAME="Thread States"></A>
434The thread's state, one of the following:
435<P>
436
437<A NAME="IDX7"></A>
438</P>
439<DL>
440<DT><U>Thread State:</U> <B><CODE>THREAD_RUNNING</CODE></B>
441<DD>The thread is running. Exactly one thread is running at a given time.
442<CODE>thread_current()</CODE> returns the running thread.
443</DL>
444<P>
445
446<A NAME="IDX8"></A>
447</P>
448<DL>
449<DT><U>Thread State:</U> <B><CODE>THREAD_READY</CODE></B>
450<DD>The thread is ready to run, but it's not running right now. The
451thread could be selected to run the next time the scheduler is
452invoked. Ready threads are kept in a doubly linked list called
453<CODE>ready_list</CODE>.
454</DL>
455<P>
456
457<A NAME="IDX9"></A>
458</P>
459<DL>
460<DT><U>Thread State:</U> <B><CODE>THREAD_BLOCKED</CODE></B>
461<DD>The thread is waiting for something, e.g. a lock to become
462available, an interrupt to be invoked. The thread won't be scheduled
463again until it transitions to the <CODE>THREAD_READY</CODE> state with a
464call to <CODE>thread_unblock()</CODE>. This is most conveniently done
465indirectly, using one of the Pintos synchronization primitives that
466block and unblock threads automatically (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>).
467<P>
468
469There is no <I>a priori</I> way to tell what a blocked thread is waiting
470for, but a backtrace can help (see section <A HREF="pintos_8.html#SEC100">D.4 Backtraces</A>).
471</P>
472</DL>
473<P>
474
475<A NAME="IDX10"></A>
476</P>
477<DL>
478<DT><U>Thread State:</U> <B><CODE>THREAD_DYING</CODE></B>
479<DD>The thread will be destroyed by the scheduler after switching to the
480next thread.
481</DL>
482</DL>
483<P>
484
485<A NAME="IDX11"></A>
486</P>
487<DL>
488<DT><U>Member of <CODE>struct thread</CODE>:</U> char <B>name[16]</B>
489<DD>The thread's name as a string, or at least the first few characters of
490it.
491</DL>
492<P>
493
494<A NAME="IDX12"></A>
495</P>
496<DL>
497<DT><U>Member of <CODE>struct thread</CODE>:</U> uint8_t *<B>stack</B>
498<DD>Every thread has its own stack to keep track of its state. When the
499thread is running, the CPU's stack pointer register tracks the top of
500the stack and this member is unused. But when the CPU switches to
501another thread, this member saves the thread's stack pointer. No
502other members are needed to save the thread's registers, because the
503other registers that must be saved are saved on the stack.
504<P>
505
506When an interrupt occurs, whether in the kernel or a user program, an
507<CODE>struct intr_frame</CODE> is pushed onto the stack. When the interrupt occurs
508in a user program, the <CODE>struct intr_frame</CODE> is always at the very top of
509the page. See section <A HREF="pintos_5.html#SEC65">A.4 Interrupt Handling</A>, for more information.
510</P>
511</DL>
512<P>
513
514<A NAME="IDX13"></A>
515</P>
516<DL>
517<DT><U>Member of <CODE>struct thread</CODE>:</U> int <B>priority</B>
518<DD>A thread priority, ranging from <CODE>PRI_MIN</CODE> (0) to <CODE>PRI_MAX</CODE>
519(63). Lower numbers correspond to lower priorities, so that
520priority 0 is the lowest priority and priority 63 is the highest.
521Pintos as provided ignores thread priorities, but you will implement
522priority scheduling in project 1 (see section <A HREF="pintos_3.html#SEC45">3.2.2 Priority Scheduling</A>).
523</DL>
524<P>
525
526<A NAME="IDX14"></A>
527</P>
528<DL>
529<DT><U>Member of <CODE>struct thread</CODE>:</U> <CODE>struct list_elem</CODE> <B>allelem</B>
530<DD>This &quot;list element&quot; is used to link the thread into the list of all
531threads. Each thread is inserted into this list when it is created
532and removed when it exits. The <CODE>thread_foreach()</CODE> function should
533be used to iterate over all threads.
534</DL>
535<P>
536
537<A NAME="IDX15"></A>
538</P>
539<DL>
540<DT><U>Member of <CODE>struct thread</CODE>:</U> <CODE>struct list_elem</CODE> <B>elem</B>
541<DD>A &quot;list element&quot; used to put the thread into doubly linked lists,
542either <CODE>ready_list</CODE> (the list of threads ready to run) or a list of
543threads waiting on a semaphore in <CODE>sema_down()</CODE>. It can do double
544duty because a thread waiting on a semaphore is not ready, and vice
545versa.
546</DL>
547<P>
548
549<A NAME="IDX16"></A>
550</P>
551<DL>
552<DT><U>Member of <CODE>struct thread</CODE>:</U> uint32_t *<B>pagedir</B>
553<DD>Only present in project 2 and later. See <A HREF="pintos_4.html#Page Tables">Page Tables</A>.
554</DL>
555<P>
556
557<A NAME="IDX17"></A>
558</P>
559<DL>
560<DT><U>Member of <CODE>struct thread</CODE>:</U> unsigned <B>magic</B>
561<DD>Always set to <CODE>THREAD_MAGIC</CODE>, which is just an arbitrary number defined
562in <Q><TT>threads/thread.c</TT></Q>, and used to detect stack overflow.
563<CODE>thread_current()</CODE> checks that the <CODE>magic</CODE> member of the running
564thread's <CODE>struct thread</CODE> is set to <CODE>THREAD_MAGIC</CODE>. Stack overflow
565tends to change this value, triggering the assertion. For greatest
566benefit, as you add members to <CODE>struct thread</CODE>, leave <CODE>magic</CODE> at
567the end.
568</DL>
569<P>
570
571<A NAME="Thread Functions"></A>
572<HR SIZE="6">
573<A NAME="SEC56"></A>
574<H3> A.2.2 Thread Functions </H3>
575<!--docid::SEC56::-->
576<P>
577
578<Q><TT>threads/thread.c</TT></Q> implements several public functions for thread
579support. Let's take a look at the most useful:
580</P>
581<P>
582
583<A NAME="IDX18"></A>
584</P>
585<DL>
586<DT><U>Function:</U> void <B>thread_init</B> (void)
587<DD>Called by <CODE>main()</CODE> to initialize the thread system. Its main
588purpose is to create a <CODE>struct thread</CODE> for Pintos's initial thread.
589This is possible because the Pintos loader puts the initial
590thread's stack at the top of a page, in the same position as any other
591Pintos thread.
592<P>
593
594Before <CODE>thread_init()</CODE> runs,
595<CODE>thread_current()</CODE> will fail because the running thread's
596<CODE>magic</CODE> value is incorrect. Lots of functions call
597<CODE>thread_current()</CODE> directly or indirectly, including
598<CODE>lock_acquire()</CODE> for locking a lock, so <CODE>thread_init()</CODE> is
599called early in Pintos initialization.
600</P>
601</DL>
602<P>
603
604<A NAME="IDX19"></A>
605</P>
606<DL>
607<DT><U>Function:</U> void <B>thread_start</B> (void)
608<DD>Called by <CODE>main()</CODE> to start the scheduler. Creates the idle
609thread, that is, the thread that is scheduled when no other thread is
610ready. Then enables interrupts, which as a side effect enables the
611scheduler because the scheduler runs on return from the timer interrupt, using
612<CODE>intr_yield_on_return()</CODE> (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>).
613</DL>
614<P>
615
616<A NAME="IDX20"></A>
617</P>
618<DL>
619<DT><U>Function:</U> void <B>thread_tick</B> (void)
620<DD>Called by the timer interrupt at each timer tick. It keeps track of
621thread statistics and triggers the scheduler when a time slice expires.
622</DL>
623<P>
624
625<A NAME="IDX21"></A>
626</P>
627<DL>
628<DT><U>Function:</U> void <B>thread_print_stats</B> (void)
629<DD>Called during Pintos shutdown to print thread statistics.
630</DL>
631<P>
632
633<A NAME="IDX22"></A>
634</P>
635<DL>
636<DT><U>Function:</U> tid_t <B>thread_create</B> (const char *<VAR>name</VAR>, int <VAR>priority</VAR>, thread_func *<VAR>func</VAR>, void *<VAR>aux</VAR>)
637<DD>Creates and starts a new thread named <VAR>name</VAR> with the given
638<VAR>priority</VAR>, returning the new thread's tid. The thread executes
639<VAR>func</VAR>, passing <VAR>aux</VAR> as the function's single argument.
640<P>
641
642<CODE>thread_create()</CODE> allocates a page for the thread's
643<CODE>struct thread</CODE> and stack and initializes its members, then it sets
644up a set of fake stack frames for it (see section <A HREF="pintos_5.html#SEC57">A.2.3 Thread Switching</A>). The
645thread is initialized in the blocked state, then unblocked just before
646returning, which allows the new thread to
647be scheduled (see <A HREF="pintos_5.html#Thread States">Thread States</A>).
648</P>
649<P>
650
651<A NAME="IDX23"></A>
652</P>
653<DL>
654<DT><U>Type:</U> <B>void thread_func (void *<VAR>aux</VAR>)</B>
655<DD>This is the type of the function passed to <CODE>thread_create()</CODE>, whose
656<VAR>aux</VAR> argument is passed along as the function's argument.
657</DL>
658</DL>
659<P>
660
661<A NAME="IDX24"></A>
662</P>
663<DL>
664<DT><U>Function:</U> void <B>thread_block</B> (void)
665<DD>Transitions the running thread from the running state to the blocked
666state (see <A HREF="pintos_5.html#Thread States">Thread States</A>). The thread will not run again until
667<CODE>thread_unblock()</CODE> is
668called on it, so you'd better have some way arranged for that to happen.
669Because <CODE>thread_block()</CODE> is so low-level, you should prefer to use
670one of the synchronization primitives instead (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>).
671</DL>
672<P>
673
674<A NAME="IDX25"></A>
675</P>
676<DL>
677<DT><U>Function:</U> void <B>thread_unblock</B> (struct thread *<VAR>thread</VAR>)
678<DD>Transitions <VAR>thread</VAR>, which must be in the blocked state, to the
679ready state, allowing it to resume running (see <A HREF="pintos_5.html#Thread States">Thread States</A>).
680This is called when the event that the thread is waiting for occurs,
681e.g. when the lock that
682the thread is waiting on becomes available.
683</DL>
684<P>
685
686<A NAME="IDX26"></A>
687</P>
688<DL>
689<DT><U>Function:</U> struct thread *<B>thread_current</B> (void)
690<DD>Returns the running thread.
691</DL>
692<P>
693
694<A NAME="IDX27"></A>
695</P>
696<DL>
697<DT><U>Function:</U> tid_t <B>thread_tid</B> (void)
698<DD>Returns the running thread's thread id. Equivalent to
699<CODE>thread_current ()-&gt;tid</CODE>.
700</DL>
701<P>
702
703<A NAME="IDX28"></A>
704</P>
705<DL>
706<DT><U>Function:</U> const char *<B>thread_name</B> (void)
707<DD>Returns the running thread's name. Equivalent to <CODE>thread_current
708()-&gt;name</CODE>.
709</DL>
710<P>
711
712<A NAME="IDX29"></A>
713</P>
714<DL>
715<DT><U>Function:</U> void <B>thread_exit</B> (void) <CODE>NO_RETURN</CODE>
716<DD>Causes the current thread to exit. Never returns, hence
717<CODE>NO_RETURN</CODE> (see section <A HREF="pintos_8.html#SEC99">D.3 Function and Parameter Attributes</A>).
718</DL>
719<P>
720
721<A NAME="IDX30"></A>
722</P>
723<DL>
724<DT><U>Function:</U> void <B>thread_yield</B> (void)
725<DD>Yields the CPU to the scheduler, which picks a new thread to run. The
726new thread might be the current thread, so you can't depend on this
727function to keep this thread from running for any particular length of
728time.
729</DL>
730<P>
731
732<A NAME="IDX31"></A>
733</P>
734<DL>
735<DT><U>Function:</U> void <B>thread_foreach</B> (thread_action_func *<VAR>action</VAR>, void *<VAR>aux</VAR>)
736<DD>Iterates over all threads <VAR>t</VAR> and invokes <CODE>action(t, aux)</CODE> on each.
737<VAR>action</VAR> must refer to a function that matches the signature
738given by <CODE>thread_action_func()</CODE>:
739<P>
740
741<A NAME="IDX32"></A>
742</P>
743<DL>
744<DT><U>Type:</U> <B>void thread_action_func (struct thread *<VAR>thread</VAR>, void *<VAR>aux</VAR>)</B>
745<DD>Performs some action on a thread, given <VAR>aux</VAR>.
746</DL>
747</DL>
748<P>
749
750<A NAME="IDX33"></A>
751</P>
752<DL>
753<DT><U>Function:</U> int <B>thread_get_priority</B> (void)
754<DD><A NAME="IDX34"></A>
755<DT><U>Function:</U> void <B>thread_set_priority</B> (int <VAR>new_priority</VAR>)
756<DD>Stub to set and get thread priority. See section <A HREF="pintos_3.html#SEC45">3.2.2 Priority Scheduling</A>.
757</DL>
758<P>
759
760<A NAME="IDX35"></A>
761</P>
762<DL>
763<DT><U>Function:</U> int <B>thread_get_nice</B> (void)
764<DD><A NAME="IDX36"></A>
765<DT><U>Function:</U> void <B>thread_set_nice</B> (int <VAR>new_nice</VAR>)
766<DD><A NAME="IDX37"></A>
767<DT><U>Function:</U> int <B>thread_get_recent_cpu</B> (void)
768<DD><A NAME="IDX38"></A>
769<DT><U>Function:</U> int <B>thread_get_load_avg</B> (void)
770<DD>Stubs for the advanced scheduler (not used in this course).
771</DL>
772<P>
773
774<A NAME="Thread Switching"></A>
775<HR SIZE="6">
776<A NAME="SEC57"></A>
777<H3> A.2.3 Thread Switching </H3>
778<!--docid::SEC57::-->
779<P>
780
781<CODE>schedule()</CODE> is responsible for switching threads. It
782is internal to <Q><TT>threads/thread.c</TT></Q> and called only by the three
783public thread functions that need to switch threads:
784<CODE>thread_block()</CODE>, <CODE>thread_exit()</CODE>, and <CODE>thread_yield()</CODE>.
785Before any of these functions call <CODE>schedule()</CODE>, they disable
786interrupts (or ensure that they are already disabled) and then change
787the running thread's state to something other than running.
788</P>
789<P>
790
791<CODE>schedule()</CODE> is short but tricky. It records the
792current thread in local variable <VAR>cur</VAR>, determines the next thread
793to run as local variable <VAR>next</VAR> (by calling
794<CODE>next_thread_to_run()</CODE>), and then calls <CODE>switch_threads()</CODE> to do
795the actual thread switch. The thread we switched to was also running
796inside <CODE>switch_threads()</CODE>, as are all the threads not currently
797running, so the new thread now returns out of
798<CODE>switch_threads()</CODE>, returning the previously running thread.
799</P>
800<P>
801
802<CODE>switch_threads()</CODE> is an assembly language routine in
803<Q><TT>threads/switch.S</TT></Q>. It saves registers on the stack, saves the
804CPU's current stack pointer in the current <CODE>struct thread</CODE>'s <CODE>stack</CODE>
805member, restores the new thread's <CODE>stack</CODE> into the CPU's stack
806pointer, restores registers from the stack, and returns.
807</P>
808<P>
809
810The rest of the scheduler is implemented in <CODE>thread_schedule_tail()</CODE>. It
811marks the new thread as running. If the thread we just switched from
812is in the dying state, then it also frees the page that contained the
813dying thread's <CODE>struct thread</CODE> and stack. These couldn't be freed
814prior to the thread switch because the switch needed to use it.
815</P>
816<P>
817
818Running a thread for the first time is a special case. When
819<CODE>thread_create()</CODE> creates a new thread, it goes through a fair
820amount of trouble to get it started properly. In particular, the new
821thread hasn't started running yet, so there's no way for it to be
822running inside <CODE>switch_threads()</CODE> as the scheduler expects. To
823solve the problem, <CODE>thread_create()</CODE> creates some fake stack frames
824in the new thread's stack:
825</P>
826<P>
827
828<UL>
829<LI>
830The topmost fake stack frame is for <CODE>switch_threads()</CODE>, represented
831by <CODE>struct switch_threads_frame</CODE>. The important part of this frame is
832its <CODE>eip</CODE> member, the return address. We point <CODE>eip</CODE> to
833<CODE>switch_entry()</CODE>, indicating it to be the function that called
834<CODE>switch_entry()</CODE>.
835<P>
836
837</P>
838<LI>
839The next fake stack frame is for <CODE>switch_entry()</CODE>, an assembly
840language routine in <Q><TT>threads/switch.S</TT></Q> that adjusts the stack
841pointer,<A NAME="DOCF3" HREF="pintos_fot.html#FOOT3">(3)</A>
842calls <CODE>thread_schedule_tail()</CODE> (this special case is why
843<CODE>thread_schedule_tail()</CODE> is separate from <CODE>schedule()</CODE>), and returns.
844We fill in its stack frame so that it returns into
845<CODE>kernel_thread()</CODE>, a function in <Q><TT>threads/thread.c</TT></Q>.
846<P>
847
848</P>
849<LI>
850The final stack frame is for <CODE>kernel_thread()</CODE>, which enables
851interrupts and calls the thread's function (the function passed to
852<CODE>thread_create()</CODE>). If the thread's function returns, it calls
853<CODE>thread_exit()</CODE> to terminate the thread.
854</UL>
855<P>
856
857<A NAME="Synchronization"></A>
858<HR SIZE="6">
859<A NAME="SEC58"></A>
860<H2> A.3 Synchronization </H2>
861<!--docid::SEC58::-->
862<P>
863
864If sharing of resources between threads is not handled in a careful,
865controlled fashion, the result is usually a big mess.
866This is especially the case in operating system kernels, where
867faulty sharing can crash the entire machine. Pintos provides several
868synchronization primitives to help out.
869</P>
870<P>
871
872<A NAME="Disabling Interrupts"></A>
873<HR SIZE="6">
874<A NAME="SEC59"></A>
875<H3> A.3.1 Disabling Interrupts </H3>
876<!--docid::SEC59::-->
877<P>
878
879The crudest way to do synchronization is to disable interrupts, that
880is, to temporarily prevent the CPU from responding to interrupts. If
881interrupts are off, no other thread will preempt the running thread,
882because thread preemption is driven by the timer interrupt. If
883interrupts are on, as they normally are, then the running thread may
884be preempted by another at any time, whether between two C statements
885or even within the execution of one.
886</P>
887<P>
888
889Incidentally, this means that Pintos is a &quot;preemptible kernel,&quot; that
890is, kernel threads can be preempted at any time. Traditional Unix
891systems are &quot;nonpreemptible,&quot; that is, kernel threads can only be
892preempted at points where they explicitly call into the scheduler.
893(User programs can be preempted at any time in both models.) As you
894might imagine, preemptible kernels require more explicit
895synchronization.
896</P>
897<P>
898
899You should have little need to set the interrupt state directly. Most
900of the time you should use the other synchronization primitives
901described in the following sections. The main reason to disable
902interrupts is to synchronize kernel threads with external interrupt
903handlers, which cannot sleep and thus cannot use most other forms of
904synchronization (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>).
905</P>
906<P>
907
908Some external interrupts cannot be postponed, even by disabling
909interrupts. These interrupts, called <EM>non-maskable interrupts</EM>
910(NMIs), are supposed to be used only in emergencies, e.g. when the
911computer is on fire. Pintos does not handle non-maskable interrupts.
912</P>
913<P>
914
915Types and functions for disabling and enabling interrupts are in
916<Q><TT>threads/interrupt.h</TT></Q>.
917</P>
918<P>
919
920<A NAME="IDX39"></A>
921</P>
922<DL>
923<DT><U>Type:</U> <B>enum intr_level</B>
924<DD>One of <CODE>INTR_OFF</CODE> or <CODE>INTR_ON</CODE>, denoting that interrupts are
925disabled or enabled, respectively.
926</DL>
927<P>
928
929<A NAME="IDX40"></A>
930</P>
931<DL>
932<DT><U>Function:</U> enum intr_level <B>intr_get_level</B> (void)
933<DD>Returns the current interrupt state.
934</DL>
935<P>
936
937<A NAME="IDX41"></A>
938</P>
939<DL>
940<DT><U>Function:</U> enum intr_level <B>intr_set_level</B> (enum intr_level <VAR>level</VAR>)
941<DD>Turns interrupts on or off according to <VAR>level</VAR>. Returns the
942previous interrupt state.
943</DL>
944<P>
945
946<A NAME="IDX42"></A>
947</P>
948<DL>
949<DT><U>Function:</U> enum intr_level <B>intr_enable</B> (void)
950<DD>Turns interrupts on. Returns the previous interrupt state.
951</DL>
952<P>
953
954<A NAME="IDX43"></A>
955</P>
956<DL>
957<DT><U>Function:</U> enum intr_level <B>intr_disable</B> (void)
958<DD>Turns interrupts off. Returns the previous interrupt state.
959</DL>
960<P>
961
962<A NAME="Semaphores"></A>
963<HR SIZE="6">
964<A NAME="SEC60"></A>
965<H3> A.3.2 Semaphores </H3>
966<!--docid::SEC60::-->
967<P>
968
969A <EM>semaphore</EM> is a nonnegative integer together with two operators
970that manipulate it atomically, which are:
971</P>
972<P>
973
974<UL>
975<LI>
976&quot;Down&quot; or &quot;P&quot;: wait for the value to become positive, then
977decrement it.
978<P>
979
980</P>
981<LI>
982&quot;Up&quot; or &quot;V&quot;: increment the value (and wake up one waiting thread,
983if any).
984</UL>
985<P>
986
987A semaphore initialized to 0 may be used to wait for an event
988that will happen exactly once. For example, suppose thread <VAR>A</VAR>
989starts another thread <VAR>B</VAR> and wants to wait for <VAR>B</VAR> to signal
990that some activity is complete. <VAR>A</VAR> can create a semaphore
991initialized to 0, pass it to <VAR>B</VAR> as it starts it, and then
992&quot;down&quot; the semaphore. When <VAR>B</VAR> finishes its activity, it
993&quot;ups&quot; the semaphore. This works regardless of whether <VAR>A</VAR>
994&quot;downs&quot; the semaphore or <VAR>B</VAR> &quot;ups&quot; it first.
995</P>
996<P>
997
998A semaphore initialized to 1 is typically used for controlling access
999to a resource. Before a block of code starts using the resource, it
1000&quot;downs&quot; the semaphore, then after it is done with the resource it
1001&quot;ups&quot; the resource. In such a case a lock, described below, may be
1002more appropriate.
1003</P>
1004<P>
1005
1006Semaphores can also be initialized to values larger than 1. These are
1007rarely used.
1008</P>
1009<P>
1010
1011Semaphores were invented by Edsger Dijkstra and first used in the THE
1012operating system ([ <A HREF="pintos_10.html#Dijkstra">Dijkstra</A>]).
1013</P>
1014<P>
1015
1016Pintos' semaphore type and operations are declared in
1017<Q><TT>threads/synch.h</TT></Q>.
1018</P>
1019<P>
1020
1021<A NAME="IDX44"></A>
1022</P>
1023<DL>
1024<DT><U>Type:</U> <B>struct semaphore</B>
1025<DD>Represents a semaphore.
1026</DL>
1027<P>
1028
1029<A NAME="IDX45"></A>
1030</P>
1031<DL>
1032<DT><U>Function:</U> void <B>sema_init</B> (struct semaphore *<VAR>sema</VAR>, unsigned <VAR>value</VAR>)
1033<DD>Initializes <VAR>sema</VAR> as a new semaphore with the given initial
1034<VAR>value</VAR>.
1035</DL>
1036<P>
1037
1038<A NAME="IDX46"></A>
1039</P>
1040<DL>
1041<DT><U>Function:</U> void <B>sema_down</B> (struct semaphore *<VAR>sema</VAR>)
1042<DD>Executes the &quot;down&quot; or &quot;P&quot; operation on <VAR>sema</VAR>, waiting for
1043its value to become positive and then decrementing it by one.
1044</DL>
1045<P>
1046
1047<A NAME="IDX47"></A>
1048</P>
1049<DL>
1050<DT><U>Function:</U> bool <B>sema_try_down</B> (struct semaphore *<VAR>sema</VAR>)
1051<DD>Tries to execute the &quot;down&quot; or &quot;P&quot; operation on <VAR>sema</VAR>,
1052without waiting. Returns true if <VAR>sema</VAR>
1053was successfully decremented, or false if it was already
1054zero and thus could not be decremented without waiting. Calling this
1055function in a
1056tight loop wastes CPU time, so use <CODE>sema_down()</CODE> or find a
1057different approach instead.
1058</DL>
1059<P>
1060
1061<A NAME="IDX48"></A>
1062</P>
1063<DL>
1064<DT><U>Function:</U> void <B>sema_up</B> (struct semaphore *<VAR>sema</VAR>)
1065<DD>Executes the &quot;up&quot; or &quot;V&quot; operation on <VAR>sema</VAR>,
1066incrementing its value. If any threads are waiting on
1067<VAR>sema</VAR>, wakes one of them up.
1068<P>
1069
1070Unlike most synchronization primitives, <CODE>sema_up()</CODE> may be called
1071inside an external interrupt handler (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>).
1072</P>
1073</DL>
1074<P>
1075
1076Semaphores are internally built out of disabling interrupt
1077(see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>) and thread blocking and unblocking
1078(<CODE>thread_block()</CODE> and <CODE>thread_unblock()</CODE>). Each semaphore maintains
1079a list of waiting threads, using the linked list
1080implementation in <Q><TT>lib/kernel/list.c</TT></Q>.
1081</P>
1082<P>
1083
1084<A NAME="Locks"></A>
1085<HR SIZE="6">
1086<A NAME="SEC61"></A>
1087<H3> A.3.3 Locks </H3>
1088<!--docid::SEC61::-->
1089<P>
1090
1091A <EM>lock</EM> is like a semaphore with an initial value of 1
1092(see section <A HREF="pintos_5.html#SEC60">A.3.2 Semaphores</A>). A lock's equivalent of &quot;up&quot; is called
1093&quot;release&quot;, and the &quot;down&quot; operation is called &quot;acquire&quot;.
1094</P>
1095<P>
1096
1097Compared to a semaphore, a lock has one added restriction: only the
1098thread that acquires a lock, called the lock's &quot;owner&quot;, is allowed to
1099release it. If this restriction is a problem, it's a good sign that a
1100semaphore should be used, instead of a lock.
1101</P>
1102<P>
1103
1104Locks in Pintos are not &quot;recursive,&quot; that is, it is an error for the
1105thread currently holding a lock to try to acquire that lock.
1106</P>
1107<P>
1108
1109Lock types and functions are declared in <Q><TT>threads/synch.h</TT></Q>.
1110</P>
1111<P>
1112
1113<A NAME="IDX49"></A>
1114</P>
1115<DL>
1116<DT><U>Type:</U> <B>struct lock</B>
1117<DD>Represents a lock.
1118</DL>
1119<P>
1120
1121<A NAME="IDX50"></A>
1122</P>
1123<DL>
1124<DT><U>Function:</U> void <B>lock_init</B> (struct lock *<VAR>lock</VAR>)
1125<DD>Initializes <VAR>lock</VAR> as a new lock.
1126The lock is not initially owned by any thread.
1127</DL>
1128<P>
1129
1130<A NAME="IDX51"></A>
1131</P>
1132<DL>
1133<DT><U>Function:</U> void <B>lock_acquire</B> (struct lock *<VAR>lock</VAR>)
1134<DD>Acquires <VAR>lock</VAR> for the current thread, first waiting for
1135any current owner to release it if necessary.
1136</DL>
1137<P>
1138
1139<A NAME="IDX52"></A>
1140</P>
1141<DL>
1142<DT><U>Function:</U> bool <B>lock_try_acquire</B> (struct lock *<VAR>lock</VAR>)
1143<DD>Tries to acquire <VAR>lock</VAR> for use by the current thread, without
1144waiting. Returns true if successful, false if the lock is already
1145owned. Calling this function in a tight loop is a bad idea because it
1146wastes CPU time, so use <CODE>lock_acquire()</CODE> instead.
1147</DL>
1148<P>
1149
1150<A NAME="IDX53"></A>
1151</P>
1152<DL>
1153<DT><U>Function:</U> void <B>lock_release</B> (struct lock *<VAR>lock</VAR>)
1154<DD>Releases <VAR>lock</VAR>, which the current thread must own.
1155</DL>
1156<P>
1157
1158<A NAME="IDX54"></A>
1159</P>
1160<DL>
1161<DT><U>Function:</U> bool <B>lock_held_by_current_thread</B> (const struct lock *<VAR>lock</VAR>)
1162<DD>Returns true if the running thread owns <VAR>lock</VAR>,
1163false otherwise.
1164There is no function to test whether an arbitrary thread owns a lock,
1165because the answer could change before the caller could act on it.
1166</DL>
1167<P>
1168
1169<A NAME="Monitors"></A>
1170<HR SIZE="6">
1171<A NAME="SEC62"></A>
1172<H3> A.3.4 Monitors </H3>
1173<!--docid::SEC62::-->
1174<P>
1175
1176A <EM>monitor</EM> is a higher-level form of synchronization than a
1177semaphore or a lock. A monitor consists of data being synchronized,
1178plus a lock, called the <EM>monitor lock</EM>, and one or more
1179<EM>condition variables</EM>. Before it accesses the protected data, a
1180thread first acquires the monitor lock. It is then said to be &quot;in the
1181monitor&quot;. While in the monitor, the thread has control over all the
1182protected data, which it may freely examine or modify. When access to
1183the protected data is complete, it releases the monitor lock.
1184</P>
1185<P>
1186
1187Condition variables allow code in the monitor to wait for a condition to
1188become true. Each condition variable is associated with an abstract
1189condition, e.g. &quot;some data has arrived for processing&quot; or &quot;over 10
1190seconds has passed since the user's last keystroke&quot;. When code in the
1191monitor needs to wait for a condition to become true, it &quot;waits&quot; on
1192the associated condition variable, which releases the lock and waits for
1193the condition to be signaled. If, on the other hand, it has caused one
1194of these conditions to become true, it &quot;signals&quot; the condition to wake
1195up one waiter, or &quot;broadcasts&quot; the condition to wake all of them.
1196</P>
1197<P>
1198
1199The theoretical framework for monitors was laid out by C. A. R.
1200Hoare ([ <A HREF="pintos_10.html#Hoare">Hoare</A>]). Their practical usage was later elaborated in a
1201paper on the Mesa operating system ([ <A HREF="pintos_10.html#Lampson">Lampson</A>]).
1202</P>
1203<P>
1204
1205Condition variable types and functions are declared in
1206<Q><TT>threads/synch.h</TT></Q>.
1207</P>
1208<P>
1209
1210<A NAME="IDX55"></A>
1211</P>
1212<DL>
1213<DT><U>Type:</U> <B>struct condition</B>
1214<DD>Represents a condition variable.
1215</DL>
1216<P>
1217
1218<A NAME="IDX56"></A>
1219</P>
1220<DL>
1221<DT><U>Function:</U> void <B>cond_init</B> (struct condition *<VAR>cond</VAR>)
1222<DD>Initializes <VAR>cond</VAR> as a new condition variable.
1223</DL>
1224<P>
1225
1226<A NAME="IDX57"></A>
1227</P>
1228<DL>
1229<DT><U>Function:</U> void <B>cond_wait</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>)
1230<DD>Atomically releases <VAR>lock</VAR> (the monitor lock) and waits for
1231<VAR>cond</VAR> to be signaled by some other piece of code. After
1232<VAR>cond</VAR> is signaled, reacquires <VAR>lock</VAR> before returning.
1233<VAR>lock</VAR> must be held before calling this function.
1234<P>
1235
1236Sending a signal and waking up from a wait are not an atomic operation.
1237Thus, typically <CODE>cond_wait()</CODE>'s caller must recheck the condition
1238after the wait completes and, if necessary, wait again. See the next
1239section for an example.
1240</P>
1241</DL>
1242<P>
1243
1244<A NAME="IDX58"></A>
1245</P>
1246<DL>
1247<DT><U>Function:</U> void <B>cond_signal</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>)
1248<DD>If any threads are waiting on <VAR>cond</VAR> (protected by monitor lock
1249<VAR>lock</VAR>), then this function wakes up one of them. If no threads are
1250waiting, returns without performing any action.
1251<VAR>lock</VAR> must be held before calling this function.
1252</DL>
1253<P>
1254
1255<A NAME="IDX59"></A>
1256</P>
1257<DL>
1258<DT><U>Function:</U> void <B>cond_broadcast</B> (struct condition *<VAR>cond</VAR>, struct lock *<VAR>lock</VAR>)
1259<DD>Wakes up all threads, if any, waiting on <VAR>cond</VAR> (protected by
1260monitor lock <VAR>lock</VAR>). <VAR>lock</VAR> must be held before calling this
1261function.
1262</DL>
1263<P>
1264
1265<HR SIZE="6">
1266<A NAME="SEC63"></A>
1267<H4> A.3.4.1 Monitor Example </H4>
1268<!--docid::SEC63::-->
1269<P>
1270
1271The classical example of a monitor is handling a buffer into which one
1272or more
1273&quot;producer&quot; threads write characters and out of which one or more
1274&quot;consumer&quot; threads read characters. To implement this we need,
1275besides the monitor lock, two condition variables which we will call
1276<VAR>not_full</VAR> and <VAR>not_empty</VAR>:
1277</P>
1278<P>
1279
1280<TABLE><tr><td>&nbsp;</td><td class=example><pre>char buf[BUF_SIZE]; /* Buffer. */
1281size_t n = 0; /* 0 &lt;= n &lt;= <VAR>BUF_SIZE</VAR>: # of characters in buffer. */
1282size_t head = 0; /* <VAR>buf</VAR> index of next char to write (mod <VAR>BUF_SIZE</VAR>). */
1283size_t tail = 0; /* <VAR>buf</VAR> index of next char to read (mod <VAR>BUF_SIZE</VAR>). */
1284struct lock lock; /* Monitor lock. */
1285struct condition not_empty; /* Signaled when the buffer is not empty. */
1286struct condition not_full; /* Signaled when the buffer is not full. */
1287
1288<small>...</small>initialize the locks and condition variables<small>...</small>
1289
1290void put (char ch) {
1291 lock_acquire (&amp;lock);
1292 while (n == BUF_SIZE) /* Can't add to <VAR>buf</VAR> as long as it's full. */
1293 cond_wait (&amp;not_full, &amp;lock);
1294 buf[head++ % BUF_SIZE] = ch; /* Add <VAR>ch</VAR> to <VAR>buf</VAR>. */
1295 n++;
1296 cond_signal (&amp;not_empty, &amp;lock); /* <VAR>buf</VAR> can't be empty anymore. */
1297 lock_release (&amp;lock);
1298}
1299
1300char get (void) {
1301 char ch;
1302 lock_acquire (&amp;lock);
1303 while (n == 0) /* Can't read <VAR>buf</VAR> as long as it's empty. */
1304 cond_wait (&amp;not_empty, &amp;lock);
1305 ch = buf[tail++ % BUF_SIZE]; /* Get <VAR>ch</VAR> from <VAR>buf</VAR>. */
1306 n--;
1307 cond_signal (&amp;not_full, &amp;lock); /* <VAR>buf</VAR> can't be full anymore. */
1308 lock_release (&amp;lock);
1309}
1310</pre></td></tr></table><P>
1311
1312Note that <CODE>BUF_SIZE</CODE> must divide evenly into <CODE>SIZE_MAX + 1</CODE>
1313for the above code to be completely correct. Otherwise, it will fail
1314the first time <CODE>head</CODE> wraps around to 0. In practice,
1315<CODE>BUF_SIZE</CODE> would ordinarily be a power of 2.
1316</P>
1317<P>
1318
1319<A NAME="Optimization Barriers"></A>
1320<HR SIZE="6">
1321<A NAME="SEC64"></A>
1322<H3> A.3.5 Optimization Barriers </H3>
1323<!--docid::SEC64::-->
1324<P>
1325
1326An <EM>optimization barrier</EM> is a special statement that prevents the
1327compiler from making assumptions about the state of memory across the
1328barrier. The compiler will not reorder reads or writes of variables
1329across the barrier or assume that a variable's value is unmodified
1330across the barrier, except for local variables whose address is never
1331taken. In Pintos, <Q><TT>threads/synch.h</TT></Q> defines the <CODE>barrier()</CODE>
1332macro as an optimization barrier.
1333</P>
1334<P>
1335
1336One reason to use an optimization barrier is when data can change
1337asynchronously, without the compiler's knowledge, e.g. by another
1338thread or an interrupt handler. The <CODE>too_many_loops()</CODE> function in
1339<Q><TT>devices/timer.c</TT></Q> is an example. This function starts out by
1340busy-waiting in a loop until a timer tick occurs:
1341</P>
1342<P>
1343
1344<TABLE><tr><td>&nbsp;</td><td class=example><pre>/* Wait for a timer tick. */
1345int64_t start = ticks;
1346while (ticks == start)
1347 barrier ();
1348</pre></td></tr></table><P>
1349
1350Without an optimization barrier in the loop, the compiler could
1351conclude that the loop would never terminate, because <CODE>start</CODE> and
1352<CODE>ticks</CODE> start out equal and the loop itself never changes them.
1353It could then &quot;optimize&quot; the function into an infinite loop, which
1354would definitely be undesirable.
1355</P>
1356<P>
1357
1358Optimization barriers can be used to avoid other compiler
1359optimizations. The <CODE>busy_wait()</CODE> function, also in
1360<Q><TT>devices/timer.c</TT></Q>, is an example. It contains this loop:
1361</P>
1362<P>
1363
1364<TABLE><tr><td>&nbsp;</td><td class=example><pre>while (loops-- &gt; 0)
1365 barrier ();
1366</pre></td></tr></table><P>
1367
1368The goal of this loop is to busy-wait by counting <CODE>loops</CODE> down
1369from its original value to 0. Without the barrier, the compiler could
1370delete the loop entirely, because it produces no useful output and has
1371no side effects. The barrier forces the compiler to pretend that the
1372loop body has an important effect.
1373</P>
1374<P>
1375
1376Finally, optimization barriers can be used to force the ordering of
1377memory reads or writes. For example, suppose we add a &quot;feature&quot;
1378that, whenever a timer interrupt occurs, the character in global
1379variable <CODE>timer_put_char</CODE> is printed on the console, but only if
1380global Boolean variable <CODE>timer_do_put</CODE> is true. The best way to
1381set up <Q><SAMP>x</SAMP></Q> to be printed is then to use an optimization barrier,
1382like this:
1383</P>
1384<P>
1385
1386<TABLE><tr><td>&nbsp;</td><td class=example><pre>timer_put_char = 'x';
1387barrier ();
1388timer_do_put = true;
1389</pre></td></tr></table><P>
1390
1391Without the barrier, the code is buggy because the compiler is free to
1392reorder operations when it doesn't see a reason to keep them in the
1393same order. In this case, the compiler doesn't know that the order of
1394assignments is important, so its optimizer is permitted to exchange
1395their order. There's no telling whether it will actually do this, and
1396it is possible that passing the compiler different optimization flags
1397or using a different version of the compiler will produce different
1398behavior.
1399</P>
1400<P>
1401
1402Another solution is to disable interrupts around the assignments.
1403This does not prevent reordering, but it prevents the interrupt
1404handler from intervening between the assignments. It also has the
1405extra runtime cost of disabling and re-enabling interrupts:
1406</P>
1407<P>
1408
1409<TABLE><tr><td>&nbsp;</td><td class=example><pre>enum intr_level old_level = intr_disable ();
1410timer_put_char = 'x';
1411timer_do_put = true;
1412intr_set_level (old_level);
1413</pre></td></tr></table><P>
1414
1415A second solution is to mark the declarations of
1416<CODE>timer_put_char</CODE> and <CODE>timer_do_put</CODE> as <Q><SAMP>volatile</SAMP></Q>. This
1417keyword tells the compiler that the variables are externally observable
1418and restricts its latitude for optimization. However, the semantics of
1419<Q><SAMP>volatile</SAMP></Q> are not well-defined, so it is not a good general
1420solution. The base Pintos code does not use <Q><SAMP>volatile</SAMP></Q> at all.
1421</P>
1422<P>
1423
1424The following is <EM>not</EM> a solution, because locks neither prevent
1425interrupts nor prevent the compiler from reordering the code within the
1426region where the lock is held:
1427</P>
1428<P>
1429
1430<TABLE><tr><td>&nbsp;</td><td class=example><pre>lock_acquire (&amp;timer_lock); /* INCORRECT CODE */
1431timer_put_char = 'x';
1432timer_do_put = true;
1433lock_release (&amp;timer_lock);
1434</pre></td></tr></table><P>
1435
1436The compiler treats invocation of any function defined externally,
1437that is, in another source file, as a limited form of optimization
1438barrier. Specifically, the compiler assumes that any externally
1439defined function may access any statically or dynamically allocated
1440data and any local variable whose address is taken. This often means
1441that explicit barriers can be omitted. It is one reason that Pintos
1442contains few explicit barriers.
1443</P>
1444<P>
1445
1446A function defined in the same source file, or in a header included by
1447the source file, cannot be relied upon as a optimization barrier.
1448This applies even to invocation of a function before its
1449definition, because the compiler may read and parse the entire source
1450file before performing optimization.
1451</P>
1452<P>
1453
1454<A NAME="Interrupt Handling"></A>
1455<HR SIZE="6">
1456<A NAME="SEC65"></A>
1457<H2> A.4 Interrupt Handling </H2>
1458<!--docid::SEC65::-->
1459<P>
1460
1461An <EM>interrupt</EM> notifies the CPU of some event. Much of the work
1462of an operating system relates to interrupts in one way or another.
1463For our purposes, we classify interrupts into two broad categories:
1464</P>
1465<P>
1466
1467<UL>
1468<LI>
1469<EM>Internal interrupts</EM>, that is, interrupts caused directly by CPU
1470instructions. System calls, attempts at invalid memory access
1471(<EM>page faults</EM>), and attempts to divide by zero are some activities
1472that cause internal interrupts. Because they are caused by CPU
1473instructions, internal interrupts are <EM>synchronous</EM> or synchronized
1474with CPU instructions. <CODE>intr_disable()</CODE> does not disable internal
1475interrupts.
1476<P>
1477
1478</P>
1479<LI>
1480<EM>External interrupts</EM>, that is, interrupts originating outside the
1481CPU. These interrupts come from hardware devices such as the system
1482timer, keyboard, serial ports, and disks. External interrupts are
1483<EM>asynchronous</EM>, meaning that their delivery is not
1484synchronized with instruction execution. Handling of external interrupts
1485can be postponed with <CODE>intr_disable()</CODE> and related functions
1486(see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>).
1487</UL>
1488<P>
1489
1490The CPU treats both classes of interrupts largely the same way,
1491so Pintos has common infrastructure to handle both classes.
1492The following section describes this
1493common infrastructure. The sections after that give the specifics of
1494external and internal interrupts.
1495</P>
1496<P>
1497
1498If you haven't already read chapter 3, &quot;Basic Execution Environment,&quot;
1499in [ <A HREF="pintos_10.html#IA32-v1">IA32-v1</A>], it is recommended that you do so now. You might
1500also want to skim chapter 5, &quot;Interrupt and Exception Handling,&quot; in
1501[ <A HREF="pintos_10.html#IA32-v3a">IA32-v3a</A>].
1502</P>
1503<P>
1504
1505<A NAME="Interrupt Infrastructure"></A>
1506<HR SIZE="6">
1507<A NAME="SEC66"></A>
1508<H3> A.4.1 Interrupt Infrastructure </H3>
1509<!--docid::SEC66::-->
1510<P>
1511
1512When an interrupt occurs, the CPU saves
1513its most essential state on a stack and jumps to an interrupt
1514handler routine. The 80<VAR>x</VAR>86 architecture supports 256
1515interrupts, numbered 0 through 255, each with an independent
1516handler defined in an array called the <EM>interrupt
1517descriptor table</EM> or IDT.
1518</P>
1519<P>
1520
1521In Pintos, <CODE>intr_init()</CODE> in <Q><TT>threads/interrupt.c</TT></Q> sets up the
1522IDT so that each entry points to a unique entry point in
1523<Q><TT>threads/intr-stubs.S</TT></Q> named <CODE>intr<VAR>NN</VAR>_stub()</CODE>, where
1524<VAR>NN</VAR> is the interrupt number in
1525hexadecimal. Because the CPU doesn't give
1526us any other way to find out the interrupt number, this entry point
1527pushes the interrupt number on the stack. Then it jumps to
1528<CODE>intr_entry()</CODE>, which pushes all the registers that the processor
1529didn't already push for us, and then calls <CODE>intr_handler()</CODE>, which
1530brings us back into C in <Q><TT>threads/interrupt.c</TT></Q>.
1531</P>
1532<P>
1533
1534The main job of <CODE>intr_handler()</CODE> is to call the function
1535registered for handling the particular interrupt. (If no
1536function is registered, it dumps some information to the console and
1537panics.) It also does some extra processing for external
1538interrupts (see section <A HREF="pintos_5.html#SEC68">A.4.3 External Interrupt Handling</A>).
1539</P>
1540<P>
1541
1542When <CODE>intr_handler()</CODE> returns, the assembly code in
1543<Q><TT>threads/intr-stubs.S</TT></Q> restores all the CPU registers saved
1544earlier and directs the CPU to return from the interrupt.
1545</P>
1546<P>
1547
1548The following types and functions are common to all
1549interrupts.
1550</P>
1551<P>
1552
1553<A NAME="IDX60"></A>
1554</P>
1555<DL>
1556<DT><U>Type:</U> <B>void intr_handler_func (struct intr_frame *<VAR>frame</VAR>)</B>
1557<DD>This is how an interrupt handler function must be declared. Its <VAR>frame</VAR>
1558argument (see below) allows it to determine the cause of the interrupt
1559and the state of the thread that was interrupted.
1560</DL>
1561<P>
1562
1563<A NAME="IDX61"></A>
1564</P>
1565<DL>
1566<DT><U>Type:</U> <B>struct intr_frame</B>
1567<DD>The stack frame of an interrupt handler, as saved by the CPU, the interrupt
1568stubs, and <CODE>intr_entry()</CODE>. Its most interesting members are described
1569below.
1570</DL>
1571<P>
1572
1573<A NAME="IDX62"></A>
1574</P>
1575<DL>
1576<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>edi</B>
1577<DD><A NAME="IDX63"></A>
1578<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>esi</B>
1579<DD><A NAME="IDX64"></A>
1580<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ebp</B>
1581<DD><A NAME="IDX65"></A>
1582<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>esp_dummy</B>
1583<DD><A NAME="IDX66"></A>
1584<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ebx</B>
1585<DD><A NAME="IDX67"></A>
1586<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>edx</B>
1587<DD><A NAME="IDX68"></A>
1588<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>ecx</B>
1589<DD><A NAME="IDX69"></A>
1590<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>eax</B>
1591<DD><A NAME="IDX70"></A>
1592<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint16_t <B>es</B>
1593<DD><A NAME="IDX71"></A>
1594<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint16_t <B>ds</B>
1595<DD>Register values in the interrupted thread, pushed by <CODE>intr_entry()</CODE>.
1596The <CODE>esp_dummy</CODE> value isn't actually used (refer to the
1597description of <CODE>PUSHA</CODE> in [ <A HREF="pintos_10.html#IA32-v2b">IA32-v2b</A>] for details).
1598</DL>
1599<P>
1600
1601<A NAME="IDX72"></A>
1602</P>
1603<DL>
1604<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>vec_no</B>
1605<DD>The interrupt vector number, ranging from 0 to 255.
1606</DL>
1607<P>
1608
1609<A NAME="IDX73"></A>
1610</P>
1611<DL>
1612<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> uint32_t <B>error_code</B>
1613<DD>The &quot;error code&quot; pushed on the stack by the CPU for some internal
1614interrupts.
1615</DL>
1616<P>
1617
1618<A NAME="IDX74"></A>
1619</P>
1620<DL>
1621<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> void <B>(*eip)</B> (void)
1622<DD>The address of the next instruction to be executed by the interrupted
1623thread.
1624</DL>
1625<P>
1626
1627<A NAME="IDX75"></A>
1628</P>
1629<DL>
1630<DT><U>Member of <CODE>struct intr_frame</CODE>:</U> void *<B>esp</B>
1631<DD>The interrupted thread's stack pointer.
1632</DL>
1633<P>
1634
1635<A NAME="IDX76"></A>
1636</P>
1637<DL>
1638<DT><U>Function:</U> const char *<B>intr_name</B> (uint8_t <VAR>vec</VAR>)
1639<DD>Returns the name of the interrupt numbered <VAR>vec</VAR>, or
1640<CODE>&quot;unknown&quot;</CODE> if the interrupt has no registered name.
1641</DL>
1642<P>
1643
1644<A NAME="Internal Interrupt Handling"></A>
1645<HR SIZE="6">
1646<A NAME="SEC67"></A>
1647<H3> A.4.2 Internal Interrupt Handling </H3>
1648<!--docid::SEC67::-->
1649<P>
1650
1651Internal interrupts are caused directly by CPU instructions executed by
1652the running kernel thread or user process (from project 2 onward). An
1653internal interrupt is therefore said to arise in a &quot;process context.&quot;
1654</P>
1655<P>
1656
1657In an internal interrupt's handler, it can make sense to examine the
1658<CODE>struct intr_frame</CODE> passed to the interrupt handler, or even to modify
1659it. When the interrupt returns, modifications in <CODE>struct intr_frame</CODE>
1660become changes to the calling thread or process's state. For example,
1661the Pintos system call handler returns a value to the user program by
1662modifying the saved EAX register (see section <A HREF="pintos_2.html#SEC40">2.5.2 System Call Details</A>).
1663</P>
1664<P>
1665
1666There are no special restrictions on what an internal interrupt
1667handler can or can't do. Generally they should run with interrupts
1668enabled, just like other code, and so they can be preempted by other
1669kernel threads. Thus, they do need to synchronize with other threads
1670on shared data and other resources (see section <A HREF="pintos_5.html#SEC58">A.3 Synchronization</A>).
1671</P>
1672<P>
1673
1674Internal interrupt handlers can be invoked recursively. For example,
1675the system call handler might cause a page fault while attempting to
1676read user memory. Deep recursion would risk overflowing the limited
1677kernel stack (see section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>), but should be unnecessary.
1678</P>
1679<P>
1680
1681<A NAME="IDX77"></A>
1682</P>
1683<DL>
1684<DT><U>Function:</U> void <B>intr_register_int</B> (uint8_t <VAR>vec</VAR>, int <VAR>dpl</VAR>, enum intr_level <VAR>level</VAR>, intr_handler_func *<VAR>handler</VAR>, const char *<VAR>name</VAR>)
1685<DD>Registers <VAR>handler</VAR> to be called when internal interrupt numbered
1686<VAR>vec</VAR> is triggered. Names the interrupt <VAR>name</VAR> for debugging
1687purposes.
1688<P>
1689
1690If <VAR>level</VAR> is <CODE>INTR_ON</CODE>, external interrupts will be processed
1691normally during the interrupt handler's execution, which is normally
1692desirable. Specifying <CODE>INTR_OFF</CODE> will cause the CPU to disable
1693external interrupts when it invokes the interrupt handler. The effect
1694is slightly different from calling <CODE>intr_disable()</CODE> inside the
1695handler, because that leaves a window of one or more CPU instructions in
1696which external interrupts are still enabled. This is important for the
1697page fault handler; refer to the comments in <Q><TT>userprog/exception.c</TT></Q>
1698for details.
1699</P>
1700<P>
1701
1702<VAR>dpl</VAR> determines how the interrupt can be invoked. If <VAR>dpl</VAR> is
17030, then the interrupt can be invoked only by kernel threads. Otherwise
1704<VAR>dpl</VAR> should be 3, which allows user processes to invoke the
1705interrupt with an explicit INT instruction. The value of <VAR>dpl</VAR>
1706doesn't affect user processes' ability to invoke the interrupt
1707indirectly, e.g. an invalid memory reference will cause a page fault
1708regardless of <VAR>dpl</VAR>.
1709</P>
1710</DL>
1711<P>
1712
1713<A NAME="External Interrupt Handling"></A>
1714<HR SIZE="6">
1715<A NAME="SEC68"></A>
1716<H3> A.4.3 External Interrupt Handling </H3>
1717<!--docid::SEC68::-->
1718<P>
1719
1720External interrupts are caused by events outside the CPU.
1721They are asynchronous, so they can be invoked at any time that
1722interrupts have not been disabled. We say that an external interrupt
1723runs in an &quot;interrupt context.&quot;
1724</P>
1725<P>
1726
1727In an external interrupt, the <CODE>struct intr_frame</CODE> passed to the
1728handler is not very meaningful. It describes the state of the thread
1729or process that was interrupted, but there is no way to predict which
1730one that is. It is possible, although rarely useful, to examine it, but
1731modifying it is a recipe for disaster.
1732</P>
1733<P>
1734
1735Only one external interrupt may be processed at a time. Neither
1736internal nor external interrupt may nest within an external interrupt
1737handler. Thus, an external interrupt's handler must run with interrupts
1738disabled (see section <A HREF="pintos_5.html#SEC59">A.3.1 Disabling Interrupts</A>).
1739</P>
1740<P>
1741
1742An external interrupt handler must not sleep or yield, which rules out
1743calling <CODE>lock_acquire()</CODE>, <CODE>thread_yield()</CODE>, and many other
1744functions. Sleeping in interrupt context would effectively put the
1745interrupted thread to sleep, too, until the interrupt handler was again
1746scheduled and returned. This would be unfair to the unlucky thread, and
1747it would deadlock if the handler were waiting for the sleeping thread
1748to, e.g., release a lock.
1749</P>
1750<P>
1751
1752An external interrupt handler
1753effectively monopolizes the machine and delays all other activities.
1754Therefore, external interrupt handlers should complete as quickly as
1755they can. Anything that require much CPU time should instead run in a
1756kernel thread, possibly one that the interrupt triggers using a
1757synchronization primitive.
1758</P>
1759<P>
1760
1761External interrupts are controlled by a
1762pair of devices outside the CPU called <EM>programmable interrupt
1763controllers</EM>, <EM>PICs</EM> for short. When <CODE>intr_init()</CODE> sets up the
1764CPU's IDT, it also initializes the PICs for interrupt handling. The
1765PICs also must be &quot;acknowledged&quot; at the end of processing for each
1766external interrupt. <CODE>intr_handler()</CODE> takes care of that by calling
1767<CODE>pic_end_of_interrupt()</CODE>, which properly signals the PICs.
1768</P>
1769<P>
1770
1771The following functions relate to external
1772interrupts.
1773</P>
1774<P>
1775
1776<A NAME="IDX78"></A>
1777</P>
1778<DL>
1779<DT><U>Function:</U> void <B>intr_register_ext</B> (uint8_t <VAR>vec</VAR>, intr_handler_func *<VAR>handler</VAR>, const char *<VAR>name</VAR>)
1780<DD>Registers <VAR>handler</VAR> to be called when external interrupt numbered
1781<VAR>vec</VAR> is triggered. Names the interrupt <VAR>name</VAR> for debugging
1782purposes. The handler will run with interrupts disabled.
1783</DL>
1784<P>
1785
1786<A NAME="IDX79"></A>
1787</P>
1788<DL>
1789<DT><U>Function:</U> bool <B>intr_context</B> (void)
1790<DD>Returns true if we are running in an interrupt context, otherwise
1791false. Mainly used in functions that might sleep
1792or that otherwise should not be called from interrupt context, in this
1793form:
1794<TABLE><tr><td>&nbsp;</td><td class=example><pre>ASSERT (!intr_context ());
1795</pre></td></tr></table></DL>
1796<P>
1797
1798<A NAME="IDX80"></A>
1799</P>
1800<DL>
1801<DT><U>Function:</U> void <B>intr_yield_on_return</B> (void)
1802<DD>When called in an interrupt context, causes <CODE>thread_yield()</CODE> to be
1803called just before the interrupt returns. Used
1804in the timer interrupt handler when a thread's time slice expires, to
1805cause a new thread to be scheduled.
1806</DL>
1807<P>
1808
1809<A NAME="Memory Allocation"></A>
1810<HR SIZE="6">
1811<A NAME="SEC69"></A>
1812<H2> A.5 Memory Allocation </H2>
1813<!--docid::SEC69::-->
1814<P>
1815
1816Pintos contains two memory allocators, one that allocates memory in
1817units of a page, and one that can allocate blocks of any size.
1818</P>
1819<P>
1820
1821<A NAME="Page Allocator"></A>
1822<HR SIZE="6">
1823<A NAME="SEC70"></A>
1824<H3> A.5.1 Page Allocator </H3>
1825<!--docid::SEC70::-->
1826<P>
1827
1828The page allocator declared in <Q><TT>threads/palloc.h</TT></Q> allocates
1829memory in units of a page. It is most often used to allocate memory
1830one page at a time, but it can also allocate multiple contiguous pages
1831at once.
1832</P>
1833<P>
1834
1835The page allocator divides the memory it allocates into two pools,
1836called the kernel and user pools. By default, each pool gets half of
1837system memory above 1 MB, but the division can be changed with the
1838<Q><SAMP>-ul</SAMP></Q> kernel
1839command line
1840option (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>). An allocation request draws from one
1841pool or the other. If one pool becomes empty, the other may still
1842have free pages. The user pool should be used for allocating memory
1843for user processes and the kernel pool for all other allocations.
1844This will only become important starting with project 3. Until then,
1845all allocations should be made from the kernel pool.
1846</P>
1847<P>
1848
1849Each pool's usage is tracked with a bitmap, one bit per page in
1850the pool. A request to allocate <VAR>n</VAR> pages scans the bitmap
1851for <VAR>n</VAR> consecutive bits set to
1852false, indicating that those pages are free, and then sets those bits
1853to true to mark them as used. This is a &quot;first fit&quot; allocation
1854strategy (see <A HREF="pintos_10.html#Wilson">Wilson</A>).
1855</P>
1856<P>
1857
1858The page allocator is subject to fragmentation. That is, it may not
1859be possible to allocate <VAR>n</VAR> contiguous pages even though <VAR>n</VAR>
1860or more pages are free, because the free pages are separated by used
1861pages. In fact, in pathological cases it may be impossible to
1862allocate 2 contiguous pages even though half of the pool's pages are free.
1863Single-page requests can't fail due to fragmentation, so
1864requests for multiple contiguous pages should be limited as much as
1865possible.
1866</P>
1867<P>
1868
1869Pages may not be allocated from interrupt context, but they may be
1870freed.
1871</P>
1872<P>
1873
1874When a page is freed, all of its bytes are cleared to <TT>0xcc</TT>, as
1875a debugging aid (see section <A HREF="pintos_8.html#SEC108">D.8 Tips</A>).
1876</P>
1877<P>
1878
1879Page allocator types and functions are described below.
1880</P>
1881<P>
1882
1883<A NAME="IDX81"></A>
1884</P>
1885<DL>
1886<DT><U>Function:</U> void *<B>palloc_get_page</B> (enum palloc_flags <VAR>flags</VAR>)
1887<DD><A NAME="IDX82"></A>
1888<DT><U>Function:</U> void *<B>palloc_get_multiple</B> (enum palloc_flags <VAR>flags</VAR>, size_t <VAR>page_cnt</VAR>)
1889<DD>Obtains and returns one page, or <VAR>page_cnt</VAR> contiguous pages,
1890respectively. Returns a null pointer if the pages cannot be allocated.
1891<P>
1892
1893The <VAR>flags</VAR> argument may be any combination of the following flags:
1894</P>
1895<P>
1896
1897<A NAME="IDX83"></A>
1898</P>
1899<DL>
1900<DT><U>Page Allocator Flag:</U> <B><CODE>PAL_ASSERT</CODE></B>
1901<DD>If the pages cannot be allocated, panic the kernel. This is only
1902appropriate during kernel initialization. User processes
1903should never be permitted to panic the kernel.
1904</DL>
1905<P>
1906
1907<A NAME="IDX84"></A>
1908</P>
1909<DL>
1910<DT><U>Page Allocator Flag:</U> <B><CODE>PAL_ZERO</CODE></B>
1911<DD>Zero all the bytes in the allocated pages before returning them. If not
1912set, the contents of newly allocated pages are unpredictable.
1913</DL>
1914<P>
1915
1916<A NAME="IDX85"></A>
1917</P>
1918<DL>
1919<DT><U>Page Allocator Flag:</U> <B><CODE>PAL_USER</CODE></B>
1920<DD>Obtain the pages from the user pool. If not set, pages are allocated
1921from the kernel pool.
1922</DL>
1923</DL>
1924<P>
1925
1926<A NAME="IDX86"></A>
1927</P>
1928<DL>
1929<DT><U>Function:</U> void <B>palloc_free_page</B> (void *<VAR>page</VAR>)
1930<DD><A NAME="IDX87"></A>
1931<DT><U>Function:</U> void <B>palloc_free_multiple</B> (void *<VAR>pages</VAR>, size_t <VAR>page_cnt</VAR>)
1932<DD>Frees one page, or <VAR>page_cnt</VAR> contiguous pages, respectively,
1933starting at <VAR>pages</VAR>. All of the pages must have been obtained using
1934<CODE>palloc_get_page()</CODE> or <CODE>palloc_get_multiple()</CODE>.
1935</DL>
1936<P>
1937
1938<A NAME="Block Allocator"></A>
1939<HR SIZE="6">
1940<A NAME="SEC71"></A>
1941<H3> A.5.2 Block Allocator </H3>
1942<!--docid::SEC71::-->
1943<P>
1944
1945The block allocator, declared in <Q><TT>threads/malloc.h</TT></Q>, can allocate
1946blocks of any size. It is layered on top of the page allocator
1947described in the previous section. Blocks returned by the block
1948allocator are obtained from the kernel pool.
1949</P>
1950<P>
1951
1952The block allocator uses two different strategies for allocating memory.
1953The first strategy applies to blocks that are 1 kB or smaller
1954(one-fourth of the page size). These allocations are rounded up to the
1955nearest power of 2, or 16 bytes, whichever is larger. Then they are
1956grouped into a page used only for allocations of that size.
1957</P>
1958<P>
1959
1960The second strategy applies to blocks larger than 1 kB.
1961These allocations (plus a small amount of overhead) are rounded up to
1962the nearest page in size, and then the block allocator requests that
1963number of contiguous pages from the page allocator.
1964</P>
1965<P>
1966
1967In either case, the difference between the allocation requested size
1968and the actual block size is wasted. A real operating system would
1969carefully tune its allocator to minimize this waste, but this is
1970unimportant in an instructional system like Pintos.
1971</P>
1972<P>
1973
1974As long as a page can be obtained from the page allocator, small
1975allocations always succeed. Most small allocations do not require a
1976new page from the page allocator at all, because they are satisfied
1977using part of a page already allocated. However, large allocations
1978always require calling into the page allocator, and any allocation
1979that needs more than one contiguous page can fail due to fragmentation,
1980as already discussed in the previous section. Thus, you should
1981minimize the number of large allocations in your code, especially
1982those over approximately 4 kB each.
1983</P>
1984<P>
1985
1986When a block is freed, all of its bytes are cleared to <TT>0xcc</TT>, as
1987a debugging aid (see section <A HREF="pintos_8.html#SEC108">D.8 Tips</A>).
1988</P>
1989<P>
1990
1991The block allocator may not be called from interrupt context.
1992</P>
1993<P>
1994
1995The block allocator functions are described below. Their interfaces are
1996the same as the standard C library functions of the same names.
1997</P>
1998<P>
1999
2000<A NAME="IDX88"></A>
2001</P>
2002<DL>
2003<DT><U>Function:</U> void *<B>malloc</B> (size_t <VAR>size</VAR>)
2004<DD>Obtains and returns a new block, from the kernel pool, at least
2005<VAR>size</VAR> bytes long. Returns a null pointer if <VAR>size</VAR> is zero or
2006if memory is not available.
2007</DL>
2008<P>
2009
2010<A NAME="IDX89"></A>
2011</P>
2012<DL>
2013<DT><U>Function:</U> void *<B>calloc</B> (size_t <VAR>a</VAR>, size_t <VAR>b</VAR>)
2014<DD>Obtains a returns a new block, from the kernel pool, at least
2015<CODE><VAR>a</VAR> * <VAR>b</VAR></CODE> bytes long. The block's contents will be
2016cleared to zeros. Returns a null pointer if <VAR>a</VAR> or <VAR>b</VAR> is zero
2017or if insufficient memory is available.
2018</DL>
2019<P>
2020
2021<A NAME="IDX90"></A>
2022</P>
2023<DL>
2024<DT><U>Function:</U> void *<B>realloc</B> (void *<VAR>block</VAR>, size_t <VAR>new_size</VAR>)
2025<DD>Attempts to resize <VAR>block</VAR> to <VAR>new_size</VAR> bytes, possibly moving
2026it in the process. If successful, returns the new block, in which case
2027the old block must no longer be accessed. On failure, returns a null
2028pointer, and the old block remains valid.
2029<P>
2030
2031A call with <VAR>block</VAR> null is equivalent to <CODE>malloc()</CODE>. A call
2032with <VAR>new_size</VAR> zero is equivalent to <CODE>free()</CODE>.
2033</P>
2034</DL>
2035<P>
2036
2037<A NAME="IDX91"></A>
2038</P>
2039<DL>
2040<DT><U>Function:</U> void <B>free</B> (void *<VAR>block</VAR>)
2041<DD>Frees <VAR>block</VAR>, which must have been previously returned by
2042<CODE>malloc()</CODE>, <CODE>calloc()</CODE>, or <CODE>realloc()</CODE> (and not yet freed).
2043</DL>
2044<P>
2045
2046<A NAME="Virtual Addresses"></A>
2047<HR SIZE="6">
2048<A NAME="SEC72"></A>
2049<H2> A.6 Virtual Addresses </H2>
2050<!--docid::SEC72::-->
2051<P>
2052
2053A 32-bit virtual address can be divided into a 20-bit <EM>page number</EM>
2054and a 12-bit <EM>page offset</EM> (or just <EM>offset</EM>), like this:
2055</P>
2056<P>
2057
2058<TABLE><tr><td>&nbsp;</td><td class=example><pre> 31 12 11 0
2059 +-------------------+-----------+
2060 | Page Number | Offset |
2061 +-------------------+-----------+
2062 Virtual Address
2063</pre></td></tr></table><P>
2064
2065Header <Q><TT>threads/vaddr.h</TT></Q> defines these functions and macros for
2066working with virtual addresses:
2067</P>
2068<P>
2069
2070<A NAME="IDX92"></A>
2071</P>
2072<DL>
2073<DT><U>Macro:</U> <B>PGSHIFT</B>
2074<DD><A NAME="IDX93"></A>
2075<DT><U>Macro:</U> <B>PGBITS</B>
2076<DD>The bit index (0) and number of bits (12) of the offset part of a
2077virtual address, respectively.
2078</DL>
2079<P>
2080
2081<A NAME="IDX94"></A>
2082</P>
2083<DL>
2084<DT><U>Macro:</U> <B>PGMASK</B>
2085<DD>A bit mask with the bits in the page offset set to 1, the rest set to 0
2086(<TT>0xfff</TT>).
2087</DL>
2088<P>
2089
2090<A NAME="IDX95"></A>
2091</P>
2092<DL>
2093<DT><U>Macro:</U> <B>PGSIZE</B>
2094<DD>The page size in bytes (4,096).
2095</DL>
2096<P>
2097
2098<A NAME="IDX96"></A>
2099</P>
2100<DL>
2101<DT><U>Function:</U> unsigned <B>pg_ofs</B> (const void *<VAR>va</VAR>)
2102<DD>Extracts and returns the page offset in virtual address <VAR>va</VAR>.
2103</DL>
2104<P>
2105
2106<A NAME="IDX97"></A>
2107</P>
2108<DL>
2109<DT><U>Function:</U> uintptr_t <B>pg_no</B> (const void *<VAR>va</VAR>)
2110<DD>Extracts and returns the page number in virtual address <VAR>va</VAR>.
2111</DL>
2112<P>
2113
2114<A NAME="IDX98"></A>
2115</P>
2116<DL>
2117<DT><U>Function:</U> void *<B>pg_round_down</B> (const void *<VAR>va</VAR>)
2118<DD>Returns the start of the virtual page that <VAR>va</VAR> points within, that
2119is, <VAR>va</VAR> with the page offset set to 0.
2120</DL>
2121<P>
2122
2123<A NAME="IDX99"></A>
2124</P>
2125<DL>
2126<DT><U>Function:</U> void *<B>pg_round_up</B> (const void *<VAR>va</VAR>)
2127<DD>Returns <VAR>va</VAR> rounded up to the nearest page boundary.
2128</DL>
2129<P>
2130
2131Virtual memory in Pintos is divided into two regions: user virtual
2132memory and kernel virtual memory (see section <A HREF="pintos_2.html#SEC26">2.2.4 Virtual Memory Layout</A>). The
2133boundary between them is <CODE>PHYS_BASE</CODE>:
2134</P>
2135<P>
2136
2137<A NAME="IDX100"></A>
2138</P>
2139<DL>
2140<DT><U>Macro:</U> <B>PHYS_BASE</B>
2141<DD>Base address of kernel virtual memory. It defaults to <TT>0xc0000000</TT> (3
2142GB), but it may be changed to any multiple of <TT>0x10000000</TT> from
2143<TT>0x80000000</TT> to <TT>0xf0000000</TT>.
2144<P>
2145
2146User virtual memory ranges from virtual address 0 up to
2147<CODE>PHYS_BASE</CODE>. Kernel virtual memory occupies the rest of the
2148virtual address space, from <CODE>PHYS_BASE</CODE> up to 4 GB.
2149</P>
2150</DL>
2151<P>
2152
2153<A NAME="IDX101"></A>
2154</P>
2155<DL>
2156<DT><U>Function:</U> bool <B>is_user_vaddr</B> (const void *<VAR>va</VAR>)
2157<DD><A NAME="IDX102"></A>
2158<DT><U>Function:</U> bool <B>is_kernel_vaddr</B> (const void *<VAR>va</VAR>)
2159<DD>Returns true if <VAR>va</VAR> is a user or kernel virtual address,
2160respectively, false otherwise.
2161</DL>
2162<P>
2163
2164The 80<VAR>x</VAR>86 doesn't provide any way to directly access memory given
2165a physical address. This ability is often necessary in an operating
2166system kernel, so Pintos works around it by mapping kernel virtual
2167memory one-to-one to physical memory. That is, virtual address
2168<CODE>PHYS_BASE</CODE> accesses physical address 0, virtual address
2169<CODE>PHYS_BASE</CODE> + <TT>0x1234</TT> accesses physical address <TT>0x1234</TT>, and
2170so on up to the size of the machine's physical memory. Thus, adding
2171<CODE>PHYS_BASE</CODE> to a physical address obtains a kernel virtual address
2172that accesses that address; conversely, subtracting <CODE>PHYS_BASE</CODE>
2173from a kernel virtual address obtains the corresponding physical
2174address. Header <Q><TT>threads/vaddr.h</TT></Q> provides a pair of functions to
2175do these translations:
2176</P>
2177<P>
2178
2179<A NAME="IDX103"></A>
2180</P>
2181<DL>
2182<DT><U>Function:</U> void *<B>ptov</B> (uintptr_t <VAR>pa</VAR>)
2183<DD>Returns the kernel virtual address corresponding to physical address
2184<VAR>pa</VAR>, which should be between 0 and the number of bytes of physical
2185memory.
2186</DL>
2187<P>
2188
2189<A NAME="IDX104"></A>
2190</P>
2191<DL>
2192<DT><U>Function:</U> uintptr_t <B>vtop</B> (void *<VAR>va</VAR>)
2193<DD>Returns the physical address corresponding to <VAR>va</VAR>, which must be a
2194kernel virtual address.
2195</DL>
2196<P>
2197
2198<A NAME="Page Table"></A>
2199<HR SIZE="6">
2200<A NAME="SEC73"></A>
2201<H2> A.7 Page Table </H2>
2202<!--docid::SEC73::-->
2203<P>
2204
2205The code in <Q><TT>pagedir.c</TT></Q> is an abstract interface to the 80<VAR>x</VAR>86
2206hardware page table, also called a &quot;page directory&quot; by Intel processor
2207documentation. The page table interface uses a <CODE>uint32_t *</CODE> to
2208represent a page table because this is convenient for accessing their
2209internal structure.
2210</P>
2211<P>
2212
2213The sections below describe the page table interface and internals.
2214</P>
2215<P>
2216
2217<A NAME="Page Table Creation Destruction Activation"></A>
2218<HR SIZE="6">
2219<A NAME="SEC74"></A>
2220<H3> A.7.1 Creation, Destruction, and Activation </H3>
2221<!--docid::SEC74::-->
2222<P>
2223
2224These functions create, destroy, and activate page tables. The base
2225Pintos code already calls these functions where necessary, so it should
2226not be necessary to call them yourself.
2227</P>
2228<P>
2229
2230<A NAME="IDX105"></A>
2231</P>
2232<DL>
2233<DT><U>Function:</U> uint32_t *<B>pagedir_create</B> (void)
2234<DD>Creates and returns a new page table. The new page table contains
2235Pintos's normal kernel virtual page mappings, but no user virtual
2236mappings.
2237<P>
2238
2239Returns a null pointer if memory cannot be obtained.
2240</P>
2241</DL>
2242<P>
2243
2244<A NAME="IDX106"></A>
2245</P>
2246<DL>
2247<DT><U>Function:</U> void <B>pagedir_destroy</B> (uint32_t *<VAR>pd</VAR>)
2248<DD>Frees all of the resources held by <VAR>pd</VAR>, including the page table
2249itself and the frames that it maps.
2250</DL>
2251<P>
2252
2253<A NAME="IDX107"></A>
2254</P>
2255<DL>
2256<DT><U>Function:</U> void <B>pagedir_activate</B> (uint32_t *<VAR>pd</VAR>)
2257<DD>Activates <VAR>pd</VAR>. The active page table is the one used by the CPU to
2258translate memory references.
2259</DL>
2260<P>
2261
2262<A NAME="Page Tables Inspection and Updates"></A>
2263<HR SIZE="6">
2264<A NAME="SEC75"></A>
2265<H3> A.7.2 Inspection and Updates </H3>
2266<!--docid::SEC75::-->
2267<P>
2268
2269These functions examine or update the mappings from pages to frames
2270encapsulated by a page table. They work on both active and inactive
2271page tables (that is, those for running and suspended processes),
2272flushing the TLB as necessary.
2273</P>
2274<P>
2275
2276<A NAME="IDX108"></A>
2277</P>
2278<DL>
2279<DT><U>Function:</U> bool <B>pagedir_set_page</B> (uint32_t *<VAR>pd</VAR>, void *<VAR>upage</VAR>, void *<VAR>kpage</VAR>, bool <VAR>writable</VAR>)
2280<DD>Adds to <VAR>pd</VAR> a mapping from user page <VAR>upage</VAR> to the frame identified
2281by kernel virtual address <VAR>kpage</VAR>. If <VAR>writable</VAR> is true, the
2282page is mapped read/write; otherwise, it is mapped read-only.
2283<P>
2284
2285User page <VAR>upage</VAR> must not already be mapped in <VAR>pd</VAR>.
2286</P>
2287<P>
2288
2289Kernel page <VAR>kpage</VAR> should be a kernel virtual address obtained from
2290the user pool with <CODE>palloc_get_page(PAL_USER)</CODE> (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>).
2291</P>
2292<P>
2293
2294Returns true if successful, false on failure. Failure will occur if
2295additional memory required for the page table cannot be obtained.
2296</P>
2297</DL>
2298<P>
2299
2300<A NAME="IDX109"></A>
2301</P>
2302<DL>
2303<DT><U>Function:</U> void *<B>pagedir_get_page</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>uaddr</VAR>)
2304<DD>Looks up the frame mapped to <VAR>uaddr</VAR> in <VAR>pd</VAR>. Returns the
2305kernel virtual address for that frame, if <VAR>uaddr</VAR> is mapped, or a
2306null pointer if it is not.
2307</DL>
2308<P>
2309
2310<A NAME="IDX110"></A>
2311</P>
2312<DL>
2313<DT><U>Function:</U> void <B>pagedir_clear_page</B> (uint32_t *<VAR>pd</VAR>, void *<VAR>page</VAR>)
2314<DD>Marks <VAR>page</VAR> &quot;not present&quot; in <VAR>pd</VAR>. Later accesses to
2315the page will fault.
2316<P>
2317
2318Other bits in the page table for <VAR>page</VAR> are preserved, permitting
2319the accessed and dirty bits (see the next section) to be checked.
2320</P>
2321<P>
2322
2323This function has no effect if <VAR>page</VAR> is not mapped.
2324</P>
2325</DL>
2326<P>
2327
2328<A NAME="Page Table Accessed and Dirty Bits"></A>
2329<HR SIZE="6">
2330<A NAME="SEC76"></A>
2331<H3> A.7.3 Accessed and Dirty Bits </H3>
2332<!--docid::SEC76::-->
2333<P>
2334
233580<VAR>x</VAR>86 hardware provides some assistance for implementing page
2336replacement algorithms, through a pair of bits in the page table entry
2337(PTE) for each page. On any read or write to a page, the CPU sets the
2338<EM>accessed bit</EM> to 1 in the page's PTE, and on any write, the CPU
2339sets the <EM>dirty bit</EM> to 1. The CPU never resets these bits to 0,
2340but the OS may do so.
2341</P>
2342<P>
2343
2344Proper interpretation of these bits requires understanding of
2345<EM>aliases</EM>, that is, two (or more) pages that refer to the same
2346frame. When an aliased frame is accessed, the accessed and dirty bits
2347are updated in only one page table entry (the one for the page used for
2348access). The accessed and dirty bits for the other aliases are not
2349updated.
2350</P>
2351<P>
2352
2353See <A HREF="pintos_4.html#Accessed and Dirty Bits">Accessed and Dirty Bits</A>, on applying these bits in implementing
2354page replacement algorithms.
2355</P>
2356<P>
2357
2358<A NAME="IDX111"></A>
2359</P>
2360<DL>
2361<DT><U>Function:</U> bool <B>pagedir_is_dirty</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>)
2362<DD><A NAME="IDX112"></A>
2363<DT><U>Function:</U> bool <B>pagedir_is_accessed</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>)
2364<DD>Returns true if page directory <VAR>pd</VAR> contains a page table entry for
2365<VAR>page</VAR> that is marked dirty (or accessed). Otherwise,
2366returns false.
2367</DL>
2368<P>
2369
2370<A NAME="IDX113"></A>
2371</P>
2372<DL>
2373<DT><U>Function:</U> void <B>pagedir_set_dirty</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>, bool <VAR>value</VAR>)
2374<DD><A NAME="IDX114"></A>
2375<DT><U>Function:</U> void <B>pagedir_set_accessed</B> (uint32_t *<VAR>pd</VAR>, const void *<VAR>page</VAR>, bool <VAR>value</VAR>)
2376<DD>If page directory <VAR>pd</VAR> has a page table entry for <VAR>page</VAR>, then
2377its dirty (or accessed) bit is set to <VAR>value</VAR>.
2378</DL>
2379<P>
2380
2381<A NAME="Page Table Details"></A>
2382<HR SIZE="6">
2383<A NAME="SEC77"></A>
2384<H3> A.7.4 Page Table Details </H3>
2385<!--docid::SEC77::-->
2386<P>
2387
2388The functions provided with Pintos are sufficient to implement the
2389projects. However, you may still find it worthwhile to understand the
2390hardware page table format, so we'll go into a little detail in this
2391section.
2392</P>
2393<P>
2394
2395<A NAME="Page Table Structure"></A>
2396<HR SIZE="6">
2397<A NAME="SEC78"></A>
2398<H4> A.7.4.1 Structure </H4>
2399<!--docid::SEC78::-->
2400<P>
2401
2402The top-level paging data structure is a page called the &quot;page
2403directory&quot; (PD) arranged as an array of 1,024 32-bit page directory
2404entries (PDEs), each of which represents 4 MB of virtual memory. Each
2405PDE may point to the physical address of another page called a
2406&quot;page table&quot; (PT) arranged, similarly, as an array of 1,024
240732-bit page table entries (PTEs), each of which translates a single 4
2408kB virtual page to a physical page.
2409</P>
2410<P>
2411
2412Translation of a virtual address into a physical address follows
2413the three-step process illustrated in the diagram
2414below:<A NAME="DOCF4" HREF="pintos_fot.html#FOOT4">(4)</A>
2415</P>
2416<P>
2417
2418<OL>
2419<LI>
2420The most-significant 10 bits of the virtual address (bits 22<small>...</small>31)
2421index the page directory. If the PDE is marked &quot;present,&quot; the
2422physical address of a page table is read from the PDE thus obtained.
2423If the PDE is marked &quot;not present&quot; then a page fault occurs.
2424<P>
2425
2426</P>
2427<LI>
2428The next 10 bits of the virtual address (bits 12<small>...</small>21) index
2429the page table. If the PTE is marked &quot;present,&quot; the physical
2430address of a data page is read from the PTE thus obtained. If the PTE
2431is marked &quot;not present&quot; then a page fault occurs.
2432<P>
2433
2434</P>
2435<LI>
2436The least-significant 12 bits of the virtual address (bits 0<small>...</small>11)
2437are added to the data page's physical base address, yielding the final
2438physical address.
2439</OL>
2440<P>
2441
2442<TABLE><tr><td>&nbsp;</td><td class=example><pre> 31 22 21 12 11 0
2443+----------------------+----------------------+----------------------+
2444| Page Directory Index | Page Table Index | Page Offset |
2445+----------------------+----------------------+----------------------+
2446 | | |
2447 _______/ _______/ _____/
2448 / / /
2449 / Page Directory / Page Table / Data Page
2450 / .____________. / .____________. / .____________.
2451 |1,023|____________| |1,023|____________| | |____________|
2452 |1,022|____________| |1,022|____________| | |____________|
2453 |1,021|____________| |1,021|____________| \__\|____________|
2454 |1,020|____________| |1,020|____________| /|____________|
2455 | | | | | | | |
2456 | | | \____\| |_ | |
2457 | | . | /| . | \ | . |
2458 \____\| . |_ | . | | | . |
2459 /| . | \ | . | | | . |
2460 | . | | | . | | | . |
2461 | | | | | | | |
2462 |____________| | |____________| | |____________|
2463 4|____________| | 4|____________| | |____________|
2464 3|____________| | 3|____________| | |____________|
2465 2|____________| | 2|____________| | |____________|
2466 1|____________| | 1|____________| | |____________|
2467 0|____________| \__\0|____________| \____\|____________|
2468 / /
2469</pre></td></tr></table><P>
2470
2471Pintos provides some macros and functions that are useful for working
2472with raw page tables:
2473</P>
2474<P>
2475
2476<A NAME="IDX115"></A>
2477</P>
2478<DL>
2479<DT><U>Macro:</U> <B>PTSHIFT</B>
2480<DD><A NAME="IDX116"></A>
2481<DT><U>Macro:</U> <B>PTBITS</B>
2482<DD>The starting bit index (12) and number of bits (10), respectively, in a
2483page table index.
2484</DL>
2485<P>
2486
2487<A NAME="IDX117"></A>
2488</P>
2489<DL>
2490<DT><U>Macro:</U> <B>PTMASK</B>
2491<DD>A bit mask with the bits in the page table index set to 1 and the rest
2492set to 0 (<TT>0x3ff000</TT>).
2493</DL>
2494<P>
2495
2496<A NAME="IDX118"></A>
2497</P>
2498<DL>
2499<DT><U>Macro:</U> <B>PTSPAN</B>
2500<DD>The number of bytes of virtual address space that a single page table
2501page covers (4,194,304 bytes, or 4 MB).
2502</DL>
2503<P>
2504
2505<A NAME="IDX119"></A>
2506</P>
2507<DL>
2508<DT><U>Macro:</U> <B>PDSHIFT</B>
2509<DD><A NAME="IDX120"></A>
2510<DT><U>Macro:</U> <B>PDBITS</B>
2511<DD>The starting bit index (22) and number of bits (10), respectively, in a
2512page directory index.
2513</DL>
2514<P>
2515
2516<A NAME="IDX121"></A>
2517</P>
2518<DL>
2519<DT><U>Macro:</U> <B>PDMASK</B>
2520<DD>A bit mask with the bits in the page directory index set to 1 and other
2521bits set to 0 (<TT>0xffc00000</TT>).
2522</DL>
2523<P>
2524
2525<A NAME="IDX122"></A>
2526</P>
2527<DL>
2528<DT><U>Function:</U> uintptr_t <B>pd_no</B> (const void *<VAR>va</VAR>)
2529<DD><A NAME="IDX123"></A>
2530<DT><U>Function:</U> uintptr_t <B>pt_no</B> (const void *<VAR>va</VAR>)
2531<DD>Returns the page directory index or page table index, respectively, for
2532virtual address <VAR>va</VAR>. These functions are defined in
2533<Q><TT>threads/pte.h</TT></Q>.
2534</DL>
2535<P>
2536
2537<A NAME="IDX124"></A>
2538</P>
2539<DL>
2540<DT><U>Function:</U> unsigned <B>pg_ofs</B> (const void *<VAR>va</VAR>)
2541<DD>Returns the page offset for virtual address <VAR>va</VAR>. This function is
2542defined in <Q><TT>threads/vaddr.h</TT></Q>.
2543</DL>
2544<P>
2545
2546<A NAME="Page Table Entry Format"></A>
2547<HR SIZE="6">
2548<A NAME="SEC79"></A>
2549<H4> A.7.4.2 Page Table Entry Format </H4>
2550<!--docid::SEC79::-->
2551<P>
2552
2553You do not need to understand the PTE format to do the Pintos
2554projects, unless you wish to incorporate the page table into your
2555supplemental page table (see <A HREF="pintos_4.html#Managing the Supplemental Page Table">Managing the Supplemental Page Table</A>).
2556</P>
2557<P>
2558
2559The actual format of a page table entry is summarized below. For
2560complete information, refer to section 3.7, &quot;Page Translation Using
256132-Bit Physical Addressing,&quot; in [ <A HREF="pintos_10.html#IA32-v3a">IA32-v3a</A>].
2562</P>
2563<P>
2564
2565<TABLE><tr><td>&nbsp;</td><td class=example><pre> 31 12 11 9 6 5 2 1 0
2566+---------------------------------------+----+----+-+-+---+-+-+-+
2567| Physical Address | AVL| |D|A| |U|W|P|
2568+---------------------------------------+----+----+-+-+---+-+-+-+
2569</pre></td></tr></table><P>
2570
2571Some more information on each bit is given below. The names are
2572<Q><TT>threads/pte.h</TT></Q> macros that represent the bits' values:
2573</P>
2574<P>
2575
2576<A NAME="IDX125"></A>
2577</P>
2578<DL>
2579<DT><U>Macro:</U> <B>PTE_P</B>
2580<DD>Bit 0, the &quot;present&quot; bit. When this bit is 1, the
2581other bits are interpreted as described below. When this bit is 0, any
2582attempt to access the page will page fault. The remaining bits are then
2583not used by the CPU and may be used by the OS for any purpose.
2584</DL>
2585<P>
2586
2587<A NAME="IDX126"></A>
2588</P>
2589<DL>
2590<DT><U>Macro:</U> <B>PTE_W</B>
2591<DD>Bit 1, the &quot;read/write&quot; bit. When it is 1, the page
2592is writable. When it is 0, write attempts will page fault.
2593</DL>
2594<P>
2595
2596<A NAME="IDX127"></A>
2597</P>
2598<DL>
2599<DT><U>Macro:</U> <B>PTE_U</B>
2600<DD>Bit 2, the &quot;user/supervisor&quot; bit. When it is 1, user
2601processes may access the page. When it is 0, only the kernel may access
2602the page (user accesses will page fault).
2603<P>
2604
2605Pintos clears this bit in PTEs for kernel virtual memory, to prevent
2606user processes from accessing them.
2607</P>
2608</DL>
2609<P>
2610
2611<A NAME="IDX128"></A>
2612</P>
2613<DL>
2614<DT><U>Macro:</U> <B>PTE_A</B>
2615<DD>Bit 5, the &quot;accessed&quot; bit. See section <A HREF="pintos_5.html#SEC76">A.7.3 Accessed and Dirty Bits</A>.
2616</DL>
2617<P>
2618
2619<A NAME="IDX129"></A>
2620</P>
2621<DL>
2622<DT><U>Macro:</U> <B>PTE_D</B>
2623<DD>Bit 6, the &quot;dirty&quot; bit. See section <A HREF="pintos_5.html#SEC76">A.7.3 Accessed and Dirty Bits</A>.
2624</DL>
2625<P>
2626
2627<A NAME="IDX130"></A>
2628</P>
2629<DL>
2630<DT><U>Macro:</U> <B>PTE_AVL</B>
2631<DD>Bits 9<small>...</small>11, available for operating system use.
2632Pintos, as provided, does not use them and sets them to 0.
2633</DL>
2634<P>
2635
2636<A NAME="IDX131"></A>
2637</P>
2638<DL>
2639<DT><U>Macro:</U> <B>PTE_ADDR</B>
2640<DD>Bits 12<small>...</small>31, the top 20 bits of the physical address of a frame.
2641The low 12 bits of the frame's address are always 0.
2642</DL>
2643<P>
2644
2645Other bits are either reserved or uninteresting in a Pintos context and
2646should be set to@tie{}0.
2647</P>
2648<P>
2649
2650Header <Q><TT>threads/pte.h</TT></Q> defines three functions for working with
2651page table entries:
2652</P>
2653<P>
2654
2655<A NAME="IDX132"></A>
2656</P>
2657<DL>
2658<DT><U>Function:</U> uint32_t <B>pte_create_kernel</B> (uint32_t *<VAR>page</VAR>, bool <VAR>writable</VAR>)
2659<DD>Returns a page table entry that points to <VAR>page</VAR>, which should be a
2660kernel virtual address. The PTE's present bit will be set. It will be
2661marked for kernel-only access. If <VAR>writable</VAR> is true, the PTE will
2662also be marked read/write; otherwise, it will be read-only.
2663</DL>
2664<P>
2665
2666<A NAME="IDX133"></A>
2667</P>
2668<DL>
2669<DT><U>Function:</U> uint32_t <B>pte_create_user</B> (uint32_t *<VAR>page</VAR>, bool <VAR>writable</VAR>)
2670<DD>Returns a page table entry that points to <VAR>page</VAR>, which should be
2671the kernel virtual address of a frame in the user pool (see <A HREF="pintos_4.html#Why PAL_USER?">Why PAL_USER?</A>). The PTE's present bit will be set and it will be marked to
2672allow user-mode access. If <VAR>writable</VAR> is true, the PTE will also be
2673marked read/write; otherwise, it will be read-only.
2674</DL>
2675<P>
2676
2677<A NAME="IDX134"></A>
2678</P>
2679<DL>
2680<DT><U>Function:</U> void *<B>pte_get_page</B> (uint32_t <VAR>pte</VAR>)
2681<DD>Returns the kernel virtual address for the frame that <VAR>pte</VAR> points
2682to. The <VAR>pte</VAR> may be present or not-present; if it is not-present
2683then the pointer returned is only meaningful if the address bits in the PTE
2684actually represent a physical address.
2685</DL>
2686<P>
2687
2688<A NAME="Page Directory Entry Format"></A>
2689<HR SIZE="6">
2690<A NAME="SEC80"></A>
2691<H4> A.7.4.3 Page Directory Entry Format </H4>
2692<!--docid::SEC80::-->
2693<P>
2694
2695Page directory entries have the same format as PTEs, except that the
2696physical address points to a page table page instead of a frame. Header
2697<Q><TT>threads/pte.h</TT></Q> defines two functions for working with page
2698directory entries:
2699</P>
2700<P>
2701
2702<A NAME="IDX135"></A>
2703</P>
2704<DL>
2705<DT><U>Function:</U> uint32_t <B>pde_create</B> (uint32_t *<VAR>pt</VAR>)
2706<DD>Returns a page directory that points to <VAR>page</VAR>, which should be the
2707kernel virtual address of a page table page. The PDE's present bit will
2708be set, it will be marked to allow user-mode access, and it will be
2709marked read/write.
2710</DL>
2711<P>
2712
2713<A NAME="IDX136"></A>
2714</P>
2715<DL>
2716<DT><U>Function:</U> uint32_t *<B>pde_get_pt</B> (uint32_t <VAR>pde</VAR>)
2717<DD>Returns the kernel virtual address for the page table page that
2718<VAR>pde</VAR>, which must be marked present, points to.
2719</DL>
2720<P>
2721
2722<A NAME="Hash Table"></A>
2723<HR SIZE="6">
2724<A NAME="SEC81"></A>
2725<H2> A.8 Hash Table </H2>
2726<!--docid::SEC81::-->
2727<P>
2728
2729Pintos provides a hash table data structure in <Q><TT>lib/kernel/hash.c</TT></Q>.
2730To use it you will need to include its header file,
2731<Q><TT>lib/kernel/hash.h</TT></Q>, with <CODE>#include &lt;hash.h&gt;</CODE>.
2732No code provided with Pintos uses the hash table, which means that you
2733are free to use it as is, modify its implementation for your own
2734purposes, or ignore it, as you wish.
2735</P>
2736<P>
2737
2738Most implementations of the virtual memory project use a hash table to
2739translate pages to frames. You may find other uses for hash tables as
2740well.
2741</P>
2742<P>
2743
2744<A NAME="Hash Data Types"></A>
2745<HR SIZE="6">
2746<A NAME="SEC82"></A>
2747<H3> A.8.1 Data Types </H3>
2748<!--docid::SEC82::-->
2749<P>
2750
2751A hash table is represented by <CODE>struct hash</CODE>.
2752</P>
2753<P>
2754
2755<A NAME="IDX137"></A>
2756</P>
2757<DL>
2758<DT><U>Type:</U> <B>struct hash</B>
2759<DD>Represents an entire hash table. The actual members of <CODE>struct hash</CODE>
2760are &quot;opaque.&quot; That is, code that uses a hash table should not access
2761<CODE>struct hash</CODE> members directly, nor should it need to. Instead, use
2762hash table functions and macros.
2763</DL>
2764<P>
2765
2766The hash table operates on elements of type <CODE>struct hash_elem</CODE>.
2767</P>
2768<P>
2769
2770<A NAME="IDX138"></A>
2771</P>
2772<DL>
2773<DT><U>Type:</U> <B>struct hash_elem</B>
2774<DD>Embed a <CODE>struct hash_elem</CODE> member in the structure you want to include
2775in a hash table. Like <CODE>struct hash</CODE>, <CODE>struct hash_elem</CODE> is opaque.
2776All functions for operating on hash table elements actually take and
2777return pointers to <CODE>struct hash_elem</CODE>, not pointers to your hash table's
2778real element type.
2779</DL>
2780<P>
2781
2782You will often need to obtain a <CODE>struct hash_elem</CODE> given a real element
2783of the hash table, and vice versa. Given a real element of the hash
2784table, you may use the <Q><SAMP>&amp;</SAMP></Q> operator to obtain a pointer to its
2785<CODE>struct hash_elem</CODE>. Use the <CODE>hash_entry()</CODE> macro to go the other
2786direction.
2787</P>
2788<P>
2789
2790<A NAME="IDX139"></A>
2791</P>
2792<DL>
2793<DT><U>Macro:</U> <VAR>type</VAR> *<B>hash_entry</B> (struct hash_elem *<VAR>elem</VAR>, <VAR>type</VAR>, <VAR>member</VAR>)
2794<DD>Returns a pointer to the structure that <VAR>elem</VAR>, a pointer to a
2795<CODE>struct hash_elem</CODE>, is embedded within. You must provide <VAR>type</VAR>,
2796the name of the structure that <VAR>elem</VAR> is inside, and <VAR>member</VAR>,
2797the name of the member in <VAR>type</VAR> that <VAR>elem</VAR> points to.
2798<P>
2799
2800For example, suppose <CODE>h</CODE> is a <CODE>struct hash_elem *</CODE> variable
2801that points to a <CODE>struct thread</CODE> member (of type <CODE>struct hash_elem</CODE>)
2802named <CODE>h_elem</CODE>. Then, <CODE>hash_entry@tie{</CODE>(h, struct thread, h_elem)}
2803yields the address of the <CODE>struct thread</CODE> that <CODE>h</CODE> points within.
2804</P>
2805</DL>
2806<P>
2807
2808See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for an example.
2809</P>
2810<P>
2811
2812Each hash table element must contain a key, that is, data that
2813identifies and distinguishes elements, which must be unique
2814among elements in the hash table. (Elements may
2815also contain non-key data that need not be unique.) While an element is
2816in a hash table, its key data must not be changed. Instead, if need be,
2817remove the element from the hash table, modify its key, then reinsert
2818the element.
2819</P>
2820<P>
2821
2822For each hash table, you must write two functions that act on keys: a
2823hash function and a comparison function. These functions must match the
2824following prototypes:
2825</P>
2826<P>
2827
2828<A NAME="IDX140"></A>
2829</P>
2830<DL>
2831<DT><U>Type:</U> <B>unsigned hash_hash_func (const struct hash_elem *<VAR>element</VAR>, void *<VAR>aux</VAR>)</B>
2832<DD>Returns a hash of <VAR>element</VAR>'s data, as a value anywhere in the range
2833of <CODE>unsigned int</CODE>. The hash of an element should be a
2834pseudo-random function of the element's key. It must not depend on
2835non-key data in the element or on any non-constant data other than the
2836key. Pintos provides the following functions as a suitable basis for
2837hash functions.
2838<P>
2839
2840<A NAME="IDX141"></A>
2841</P>
2842<DL>
2843<DT><U>Function:</U> unsigned <B>hash_bytes</B> (const void *<VAR>buf</VAR>, size_t *<VAR>size</VAR>)
2844<DD>Returns a hash of the <VAR>size</VAR> bytes starting at <VAR>buf</VAR>. The
2845implementation is the general-purpose
2846<A HREF="http://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash">Fowler-Noll-Vo
2847hash</A> for 32-bit words.
2848</DL>
2849<P>
2850
2851<A NAME="IDX142"></A>
2852</P>
2853<DL>
2854<DT><U>Function:</U> unsigned <B>hash_string</B> (const char *<VAR>s</VAR>)
2855<DD>Returns a hash of null-terminated string <VAR>s</VAR>.
2856</DL>
2857<P>
2858
2859<A NAME="IDX143"></A>
2860</P>
2861<DL>
2862<DT><U>Function:</U> unsigned <B>hash_int</B> (int <VAR>i</VAR>)
2863<DD>Returns a hash of integer <VAR>i</VAR>.
2864</DL>
2865<P>
2866
2867If your key is a single piece of data of an appropriate type, it is
2868sensible for your hash function to directly return the output of one of
2869these functions. For multiple pieces of data, you may wish to combine
2870the output of more than one call to them using, e.g., the <Q><SAMP>^</SAMP></Q>
2871(exclusive or)
2872operator. Finally, you may entirely ignore these functions and write
2873your own hash function from scratch, but remember that your goal is to
2874build an operating system kernel, not to design a hash function.
2875</P>
2876<P>
2877
2878See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>.
2879</P>
2880</DL>
2881<P>
2882
2883<A NAME="IDX144"></A>
2884</P>
2885<DL>
2886<DT><U>Type:</U> <B>bool hash_less_func (const struct hash_elem *<VAR>a</VAR>, const struct hash_elem *<VAR>b</VAR>, void *<VAR>aux</VAR>)</B>
2887<DD>Compares the keys stored in elements <VAR>a</VAR> and <VAR>b</VAR>. Returns
2888true if <VAR>a</VAR> is less than <VAR>b</VAR>, false if <VAR>a</VAR> is greater than
2889or equal to <VAR>b</VAR>.
2890<P>
2891
2892If two elements compare equal, then they must hash to equal values.
2893</P>
2894<P>
2895
2896See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>.
2897</P>
2898</DL>
2899<P>
2900
2901See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for hash and comparison function examples.
2902</P>
2903<P>
2904
2905A few functions accept a pointer to a third kind of
2906function as an argument:
2907</P>
2908<P>
2909
2910<A NAME="IDX145"></A>
2911</P>
2912<DL>
2913<DT><U>Type:</U> <B>void hash_action_func (struct hash_elem *<VAR>element</VAR>, void *<VAR>aux</VAR>)</B>
2914<DD>Performs some kind of action, chosen by the caller, on <VAR>element</VAR>.
2915<P>
2916
2917See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>.
2918</P>
2919</DL>
2920<P>
2921
2922<A NAME="Basic Hash Functions"></A>
2923<HR SIZE="6">
2924<A NAME="SEC83"></A>
2925<H3> A.8.2 Basic Functions </H3>
2926<!--docid::SEC83::-->
2927<P>
2928
2929These functions create, destroy, and inspect hash tables.
2930</P>
2931<P>
2932
2933<A NAME="IDX146"></A>
2934</P>
2935<DL>
2936<DT><U>Function:</U> bool <B>hash_init</B> (struct hash *<VAR>hash</VAR>, hash_hash_func *<VAR>hash_func</VAR>, hash_less_func *<VAR>less_func</VAR>, void *<VAR>aux</VAR>)
2937<DD>Initializes <VAR>hash</VAR> as a hash table with <VAR>hash_func</VAR> as hash
2938function, <VAR>less_func</VAR> as comparison function, and <VAR>aux</VAR> as
2939auxiliary data.
2940Returns true if successful, false on failure. <CODE>hash_init()</CODE> calls
2941<CODE>malloc()</CODE> and fails if memory cannot be allocated.
2942<P>
2943
2944See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>, which is
2945most often a null pointer.
2946</P>
2947</DL>
2948<P>
2949
2950<A NAME="IDX147"></A>
2951</P>
2952<DL>
2953<DT><U>Function:</U> void <B>hash_clear</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>)
2954<DD>Removes all the elements from <VAR>hash</VAR>, which must have been
2955previously initialized with <CODE>hash_init()</CODE>.
2956<P>
2957
2958If <VAR>action</VAR> is non-null, then it is called once for each element in
2959the hash table, which gives the caller an opportunity to deallocate any
2960memory or other resources used by the element. For example, if the hash
2961table elements are dynamically allocated using <CODE>malloc()</CODE>, then
2962<VAR>action</VAR> could <CODE>free()</CODE> the element. This is safe because
2963<CODE>hash_clear()</CODE> will not access the memory in a given hash element
2964after calling <VAR>action</VAR> on it. However, <VAR>action</VAR> must not call
2965any function that may modify the hash table, such as <CODE>hash_insert()</CODE>
2966or <CODE>hash_delete()</CODE>.
2967</P>
2968</DL>
2969<P>
2970
2971<A NAME="IDX148"></A>
2972</P>
2973<DL>
2974<DT><U>Function:</U> void <B>hash_destroy</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>)
2975<DD>If <VAR>action</VAR> is non-null, calls it for each element in the hash, with
2976the same semantics as a call to <CODE>hash_clear()</CODE>. Then, frees the
2977memory held by <VAR>hash</VAR>. Afterward, <VAR>hash</VAR> must not be passed to
2978any hash table function, absent an intervening call to <CODE>hash_init()</CODE>.
2979</DL>
2980<P>
2981
2982<A NAME="IDX149"></A>
2983</P>
2984<DL>
2985<DT><U>Function:</U> size_t <B>hash_size</B> (struct hash *<VAR>hash</VAR>)
2986<DD>Returns the number of elements currently stored in <VAR>hash</VAR>.
2987</DL>
2988<P>
2989
2990<A NAME="IDX150"></A>
2991</P>
2992<DL>
2993<DT><U>Function:</U> bool <B>hash_empty</B> (struct hash *<VAR>hash</VAR>)
2994<DD>Returns true if <VAR>hash</VAR> currently contains no elements,
2995false if <VAR>hash</VAR> contains at least one element.
2996</DL>
2997<P>
2998
2999<A NAME="Hash Search Functions"></A>
3000<HR SIZE="6">
3001<A NAME="SEC84"></A>
3002<H3> A.8.3 Search Functions </H3>
3003<!--docid::SEC84::-->
3004<P>
3005
3006Each of these functions searches a hash table for an element that
3007compares equal to one provided. Based on the success of the search,
3008they perform some action, such as inserting a new element into the hash
3009table, or simply return the result of the search.
3010</P>
3011<P>
3012
3013<A NAME="IDX151"></A>
3014</P>
3015<DL>
3016<DT><U>Function:</U> struct hash_elem *<B>hash_insert</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>)
3017<DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. If none is
3018found, inserts <VAR>element</VAR> into <VAR>hash</VAR> and returns a null pointer.
3019If the table already contains an element equal to <VAR>element</VAR>, it is
3020returned without modifying <VAR>hash</VAR>.
3021</DL>
3022<P>
3023
3024<A NAME="IDX152"></A>
3025</P>
3026<DL>
3027<DT><U>Function:</U> struct hash_elem *<B>hash_replace</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>)
3028<DD>Inserts <VAR>element</VAR> into <VAR>hash</VAR>. Any element equal to
3029<VAR>element</VAR> already in <VAR>hash</VAR> is removed. Returns the element
3030removed, or a null pointer if <VAR>hash</VAR> did not contain an element
3031equal to <VAR>element</VAR>.
3032<P>
3033
3034The caller is responsible for deallocating any resources associated with
3035the returned element, as appropriate. For example, if the hash table
3036elements are dynamically allocated using <CODE>malloc()</CODE>, then the caller
3037must <CODE>free()</CODE> the element after it is no longer needed.
3038</P>
3039</DL>
3040<P>
3041
3042The element passed to the following functions is only used for hashing
3043and comparison purposes. It is never actually inserted into the hash
3044table. Thus, only key data in the element needs to be initialized, and
3045other data in the element will not be used. It often makes sense to
3046declare an instance of the element type as a local variable, initialize
3047the key data, and then pass the address of its <CODE>struct hash_elem</CODE> to
3048<CODE>hash_find()</CODE> or <CODE>hash_delete()</CODE>. See section <A HREF="pintos_5.html#SEC86">A.8.5 Hash Table Example</A>, for
3049an example. (Large structures should not be
3050allocated as local variables. See section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>, for more
3051information.)
3052</P>
3053<P>
3054
3055<A NAME="IDX153"></A>
3056</P>
3057<DL>
3058<DT><U>Function:</U> struct hash_elem *<B>hash_find</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>)
3059<DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. Returns the
3060element found, if any, or a null pointer otherwise.
3061</DL>
3062<P>
3063
3064<A NAME="IDX154"></A>
3065</P>
3066<DL>
3067<DT><U>Function:</U> struct hash_elem *<B>hash_delete</B> (struct hash *<VAR>hash</VAR>, struct hash_elem *<VAR>element</VAR>)
3068<DD>Searches <VAR>hash</VAR> for an element equal to <VAR>element</VAR>. If one is
3069found, it is removed from <VAR>hash</VAR> and returned. Otherwise, a null
3070pointer is returned and <VAR>hash</VAR> is unchanged.
3071<P>
3072
3073The caller is responsible for deallocating any resources associated with
3074the returned element, as appropriate. For example, if the hash table
3075elements are dynamically allocated using <CODE>malloc()</CODE>, then the caller
3076must <CODE>free()</CODE> the element after it is no longer needed.
3077</P>
3078</DL>
3079<P>
3080
3081<A NAME="Hash Iteration Functions"></A>
3082<HR SIZE="6">
3083<A NAME="SEC85"></A>
3084<H3> A.8.4 Iteration Functions </H3>
3085<!--docid::SEC85::-->
3086<P>
3087
3088These functions allow iterating through the elements in a hash table.
3089Two interfaces are supplied. The first requires writing and supplying a
3090<VAR>hash_action_func</VAR> to act on each element (see section <A HREF="pintos_5.html#SEC82">A.8.1 Data Types</A>).
3091</P>
3092<P>
3093
3094<A NAME="IDX155"></A>
3095</P>
3096<DL>
3097<DT><U>Function:</U> void <B>hash_apply</B> (struct hash *<VAR>hash</VAR>, hash_action_func *<VAR>action</VAR>)
3098<DD>Calls <VAR>action</VAR> once for each element in <VAR>hash</VAR>, in arbitrary
3099order. <VAR>action</VAR> must not call any function that may modify the hash
3100table, such as <CODE>hash_insert()</CODE> or <CODE>hash_delete()</CODE>. <VAR>action</VAR>
3101must not modify key data in elements, although it may modify any other
3102data.
3103</DL>
3104<P>
3105
3106The second interface is based on an &quot;iterator&quot; data type.
3107Idiomatically, iterators are used as follows:
3108</P>
3109<P>
3110
3111<TABLE><tr><td>&nbsp;</td><td class=example><pre>struct hash_iterator i;
3112
3113hash_first (&amp;i, h);
3114while (hash_next (&amp;i))
3115 {
3116 struct foo *f = hash_entry (hash_cur (&amp;i), struct foo, elem);
3117 <small>...</small>do something with <I>f</I><small>...</small>
3118 }
3119</pre></td></tr></table><P>
3120
3121<A NAME="IDX156"></A>
3122</P>
3123<DL>
3124<DT><U>Type:</U> <B>struct hash_iterator</B>
3125<DD>Represents a position within a hash table. Calling any function that
3126may modify a hash table, such as <CODE>hash_insert()</CODE> or
3127<CODE>hash_delete()</CODE>, invalidates all iterators within that hash table.
3128<P>
3129
3130Like <CODE>struct hash</CODE> and <CODE>struct hash_elem</CODE>, <CODE>struct hash_elem</CODE> is opaque.
3131</P>
3132</DL>
3133<P>
3134
3135<A NAME="IDX157"></A>
3136</P>
3137<DL>
3138<DT><U>Function:</U> void <B>hash_first</B> (struct hash_iterator *<VAR>iterator</VAR>, struct hash *<VAR>hash</VAR>)
3139<DD>Initializes <VAR>iterator</VAR> to just before the first element in
3140<VAR>hash</VAR>.
3141</DL>
3142<P>
3143
3144<A NAME="IDX158"></A>
3145</P>
3146<DL>
3147<DT><U>Function:</U> struct hash_elem *<B>hash_next</B> (struct hash_iterator *<VAR>iterator</VAR>)
3148<DD>Advances <VAR>iterator</VAR> to the next element in <VAR>hash</VAR>, and returns
3149that element. Returns a null pointer if no elements remain. After
3150<CODE>hash_next()</CODE> returns null for <VAR>iterator</VAR>, calling it again
3151yields undefined behavior.
3152</DL>
3153<P>
3154
3155<A NAME="IDX159"></A>
3156</P>
3157<DL>
3158<DT><U>Function:</U> struct hash_elem *<B>hash_cur</B> (struct hash_iterator *<VAR>iterator</VAR>)
3159<DD>Returns the value most recently returned by <CODE>hash_next()</CODE> for
3160<VAR>iterator</VAR>. Yields undefined behavior after <CODE>hash_first()</CODE> has
3161been called on <VAR>iterator</VAR> but before <CODE>hash_next()</CODE> has been
3162called for the first time.
3163</DL>
3164<P>
3165
3166<A NAME="Hash Table Example"></A>
3167<HR SIZE="6">
3168<A NAME="SEC86"></A>
3169<H3> A.8.5 Hash Table Example </H3>
3170<!--docid::SEC86::-->
3171<P>
3172
3173Suppose you have a structure, called <CODE>struct page</CODE>, that you
3174want to put into a hash table. First, define <CODE>struct page</CODE> to include a
3175<CODE>struct hash_elem</CODE> member:
3176</P>
3177<P>
3178
3179<TABLE><tr><td>&nbsp;</td><td class=example><pre>struct page
3180 {
3181 struct hash_elem hash_elem; /* Hash table element. */
3182 void *addr; /* Virtual address. */
3183 /* <small>...</small>other members<small>...</small> */
3184 };
3185</pre></td></tr></table><P>
3186
3187We write a hash function and a comparison function using <VAR>addr</VAR> as
3188the key. A pointer can be hashed based on its bytes, and the <Q><SAMP>&lt;</SAMP></Q>
3189operator works fine for comparing pointers:
3190</P>
3191<P>
3192
3193<TABLE><tr><td>&nbsp;</td><td class=example><pre>/* Returns a hash value for page <VAR>p</VAR>. */
3194unsigned
3195page_hash (const struct hash_elem *p_, void *aux UNUSED)
3196{
3197 const struct page *p = hash_entry (p_, struct page, hash_elem);
3198 return hash_bytes (&amp;p-&gt;addr, sizeof p-&gt;addr);
3199}
3200
3201/* Returns true if page <VAR>a</VAR> precedes page <VAR>b</VAR>. */
3202bool
3203page_less (const struct hash_elem *a_, const struct hash_elem *b_,
3204 void *aux UNUSED)
3205{
3206 const struct page *a = hash_entry (a_, struct page, hash_elem);
3207 const struct page *b = hash_entry (b_, struct page, hash_elem);
3208
3209 return a-&gt;addr &lt; b-&gt;addr;
3210}
3211</pre></td></tr></table><P>
3212
3213(The use of <CODE>UNUSED</CODE> in these functions' prototypes suppresses a
3214warning that <VAR>aux</VAR> is unused. See section <A HREF="pintos_8.html#SEC99">D.3 Function and Parameter Attributes</A>, for information about <CODE>UNUSED</CODE>. See section <A HREF="pintos_5.html#SEC87">A.8.6 Auxiliary Data</A>, for an explanation of <VAR>aux</VAR>.)
3215</P>
3216<P>
3217
3218Then, we can create a hash table like this:
3219</P>
3220<P>
3221
3222<TABLE><tr><td>&nbsp;</td><td class=example><pre>struct hash pages;
3223
3224hash_init (&amp;pages, page_hash, page_less, NULL);
3225</pre></td></tr></table><P>
3226
3227Now we can manipulate the hash table we've created. If <CODE><VAR>p</VAR></CODE>
3228is a pointer to a <CODE>struct page</CODE>, we can insert it into the hash table
3229with:
3230</P>
3231<P>
3232
3233<TABLE><tr><td>&nbsp;</td><td class=example><pre>hash_insert (&amp;pages, &amp;p-&gt;hash_elem);
3234</pre></td></tr></table><P>
3235
3236If there's a chance that <VAR>pages</VAR> might already contain a
3237page with the same <VAR>addr</VAR>, then we should check <CODE>hash_insert()</CODE>'s
3238return value.
3239</P>
3240<P>
3241
3242To search for an element in the hash table, use <CODE>hash_find()</CODE>. This
3243takes a little setup, because <CODE>hash_find()</CODE> takes an element to
3244compare against. Here's a function that will find and return a page
3245based on a virtual address, assuming that <VAR>pages</VAR> is defined at file
3246scope:
3247</P>
3248<P>
3249
3250<TABLE><tr><td>&nbsp;</td><td class=example><pre>/* Returns the page containing the given virtual <VAR>address</VAR>,
3251 or a null pointer if no such page exists. */
3252struct page *
3253page_lookup (const void *address)
3254{
3255 struct page p;
3256 struct hash_elem *e;
3257
3258 p.addr = address;
3259 e = hash_find (&amp;pages, &amp;p.hash_elem);
3260 return e != NULL ? hash_entry (e, struct page, hash_elem) : NULL;
3261}
3262</pre></td></tr></table><P>
3263
3264<CODE>struct page</CODE> is allocated as a local variable here on the assumption
3265that it is fairly small. Large structures should not be allocated as
3266local variables. See section <A HREF="pintos_5.html#SEC55">A.2.1 <CODE>struct thread</CODE></A>, for more information.
3267</P>
3268<P>
3269
3270A similar function could delete a page by address using
3271<CODE>hash_delete()</CODE>.
3272</P>
3273<P>
3274
3275<A NAME="Hash Auxiliary Data"></A>
3276<HR SIZE="6">
3277<A NAME="SEC87"></A>
3278<H3> A.8.6 Auxiliary Data </H3>
3279<!--docid::SEC87::-->
3280<P>
3281
3282In simple cases like the example above, there's no need for the
3283<VAR>aux</VAR> parameters. In these cases, just pass a null pointer to
3284<CODE>hash_init()</CODE> for <VAR>aux</VAR> and ignore the values passed to the hash
3285function and comparison functions. (You'll get a compiler warning if
3286you don't use the <VAR>aux</VAR> parameter, but you can turn that off with
3287the <CODE>UNUSED</CODE> macro, as shown in the example, or you can just ignore
3288it.)
3289</P>
3290<P>
3291
3292<VAR>aux</VAR> is useful when you have some property of the data in the
3293hash table is both constant and needed for hashing or comparison,
3294but not stored in the data items themselves. For example, if
3295the items in a hash table are fixed-length strings, but the items
3296themselves don't indicate what that fixed length is, you could pass
3297the length as an <VAR>aux</VAR> parameter.
3298</P>
3299<P>
3300
3301<A NAME="Hash Synchronization"></A>
3302<HR SIZE="6">
3303<A NAME="SEC88"></A>
3304<H3> A.8.7 Synchronization </H3>
3305<!--docid::SEC88::-->
3306<P>
3307
3308The hash table does not do any internal synchronization. It is the
3309caller's responsibility to synchronize calls to hash table functions.
3310In general, any number of functions that examine but do not modify the
3311hash table, such as <CODE>hash_find()</CODE> or <CODE>hash_next()</CODE>, may execute
3312simultaneously. However, these function cannot safely execute at the
3313same time as any function that may modify a given hash table, such as
3314<CODE>hash_insert()</CODE> or <CODE>hash_delete()</CODE>, nor may more than one function
3315that can modify a given hash table execute safely at once.
3316</P>
3317<P>
3318
3319It is also the caller's responsibility to synchronize access to data in
3320hash table elements. How to synchronize access to this data depends on
3321how it is designed and organized, as with any other data structure.
3322</P>
3323<P>
3324
3325<A NAME="Coding Standards"></A>
3326<HR SIZE="6">
3327<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
3328<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_5.html#SEC48"> &lt;&lt; </A>]</TD>
3329<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_6.html#SEC89"> &gt;&gt; </A>]</TD>
3330<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD>
3331<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD>
3332<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
3333<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD>
3334</TR></TABLE>
3335<BR>
3336<FONT SIZE="-1">
3337This document was generated
3338by on <I>March, 6 2012</I>
3339using <A HREF="http://texi2html.cvshome.org"><I>texi2html</I></A>
3340</FONT>
3341
3342</BODY>
3343</HTML>