Diffstat (limited to 'doc/pintos_8.html')
-rw-r--r-- doc/pintos_8.html 1041
1 files changed, 1041 insertions, 0 deletions
diff --git a/doc/pintos_8.html b/doc/pintos_8.html
new file mode 100644
index 0000000..e354458
--- /dev/null
+++ b/doc/pintos_8.html
@@ -0,0 +1,1041 @@
1<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2 "http://www.w3.org/TR/html40/loose.dtd">
3<HTML>
4<!-- Created on March, 6 2012 by texi2html 1.66 -->
5<!--
6Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
7 Karl Berry <karl@freefriends.org>
8 Olaf Bachmann <obachman@mathematik.uni-kl.de>
9 and many others.
10Maintained by: Many creative people <dev@texi2html.cvshome.org>
11Send bugs and suggestions to <users@texi2html.cvshome.org>
12
13-->
14<HEAD>
15<TITLE>Pintos Projects: Debugging Tools</TITLE>
16
17<META NAME="description" CONTENT="Pintos Projects: Debugging Tools">
18<META NAME="keywords" CONTENT="Pintos Projects: Debugging Tools">
19<META NAME="resource-type" CONTENT="document">
20<META NAME="distribution" CONTENT="global">
21<META NAME="Generator" CONTENT="texi2html 1.66">
22<LINK REL="stylesheet" HREF="pintos.css">
23</HEAD>
24
25<BODY >
26
27<A NAME="SEC96"></A>
28<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
29<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_7.html#SEC93"> &lt;&lt; </A>]</TD>
30<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_9.html#SEC109"> &gt;&gt; </A>]</TD>
31<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD>
32<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD>
33<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
34<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD>
35</TR></TABLE>
36
37<HR SIZE=2>
38<H1> D. Debugging Tools </H1>
39<!--docid::SEC96::-->
40<P>
41
42Many tools lie at your disposal for debugging Pintos. This appendix
43introduces you to a few of them.
44</P>
45<P>
46
47<A NAME="printf"></A>
48<HR SIZE="6">
49<A NAME="SEC97"></A>
50<H2> D.1 <CODE>printf()</CODE> </H2>
51<!--docid::SEC97::-->
52<P>
53
54Don't underestimate the value of <CODE>printf()</CODE>. The way
55<CODE>printf()</CODE> is implemented in Pintos, you can call it from
56practically anywhere in the kernel, whether it's in a kernel thread or
57an interrupt handler, almost regardless of what locks are held.
58</P>
59<P>
60
61<CODE>printf()</CODE> is useful for more than just examining data.
62It can also help figure out when and where something goes wrong, even
63when the kernel crashes or panics without a useful error message. The
64strategy is to sprinkle calls to <CODE>printf()</CODE> with different strings
65(e.g. <CODE>&quot;&lt;1&gt;&quot;</CODE>, <CODE>&quot;&lt;2&gt;&quot;</CODE>, <small>...</small>) throughout the pieces of
66code you suspect are failing. If you don't even see <CODE>&lt;1&gt;</CODE> printed,
67then something bad happened before that point; if you see <CODE>&lt;1&gt;</CODE>
68but not <CODE>&lt;2&gt;</CODE>, then something bad happened between those two
69points, and so on. Based on what you learn, you can then insert more
70<CODE>printf()</CODE> calls in the new, smaller region of code you suspect.
71Eventually you can narrow the problem down to a single statement.
72See section <A HREF="pintos_8.html#SEC106">D.6 Triple Faults</A>, for a related technique.
73</P>
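<P>
As a rough illustration of the marker technique, the sketch below places
numbered markers around the steps of a hypothetical function,
<CODE>sum_with_markers()</CODE>, which is not part of Pintos. It uses the
standard C library so that it compiles on its own; inside the kernel you
would call the Pintos <CODE>printf()</CODE> in the same way.
</P>

```c
#include <stdio.h>

/* Hypothetical stand-in for a region of code you suspect is failing.
   Each marker prints as soon as execution reaches it, so the last
   marker that appears in the output brackets the failure. */
int
sum_with_markers (const int *values, int count)
{
  int total = 0;
  int i;

  printf ("<1>\n");             /* Entered the suspect region. */

  for (i = 0; i < count; i++)
    total += values[i];

  printf ("<2>\n");             /* The loop finished without crashing. */

  total *= 2;

  printf ("<3>\n");             /* The final step finished as well. */
  return total;
}
```

<P>
If the output stopped after <CODE>&lt;1&gt;</CODE>, the loop would be the
next place to add finer-grained markers.
</P>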
74<P>
75
76<A NAME="ASSERT"></A>
77<HR SIZE="6">
78<A NAME="SEC98"></A>
79<H2> D.2 <CODE>ASSERT</CODE> </H2>
80<!--docid::SEC98::-->
81<P>
82
83Assertions are useful because they can catch problems early, before
84they'd otherwise be noticed. Ideally, each function should begin with a
85set of assertions that check its arguments for validity. (Initializers
86for functions' local variables are evaluated before assertions are
87checked, so be careful not to assume that an argument is valid in an
88initializer.) You can also sprinkle assertions throughout the body of
89functions in places where you suspect things are likely to go wrong.
90They are especially useful for checking loop invariants.
91</P>
92<P>
93
94Pintos provides the <CODE>ASSERT</CODE> macro, defined in <Q><TT>&lt;debug.h&gt;</TT></Q>,
95for checking assertions.
96</P>
97<P>
98
99<A NAME="IDX161"></A>
100</P>
101<DL>
102<DT><U>Macro:</U> <B>ASSERT</B> <I>(expression)</I>
103<DD>Tests the value of <VAR>expression</VAR>. If it evaluates to zero (false),
104the kernel panics. The panic message includes the expression that
105failed, its file and line number, and a backtrace, which should help you
106to find the problem. See section <A HREF="pintos_8.html#SEC100">D.4 Backtraces</A>, for more information.
107</DL>
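<P>
The sketch below shows one way to open a function with argument checks
while respecting the caveat about initializers. The
<CODE>struct buffer</CODE> type and <CODE>buffer_last_byte()</CODE> are
hypothetical, and <CODE>ASSERT</CODE> is mapped onto the standard
<CODE>assert()</CODE> only so that the example compiles outside Pintos.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in so the sketch compiles outside Pintos; in the kernel,
   ASSERT comes from <debug.h>. */
#define ASSERT(CONDITION) assert (CONDITION)

/* Hypothetical structure used only for this illustration. */
struct buffer
  {
    size_t length;
    char *bytes;
  };

/* Returns the last byte stored in BUF. */
char
buffer_last_byte (const struct buffer *buf)
{
  /* Risky alternative: writing `size_t length = buf->length;' here
     would dereference BUF before the assertions below could run. */
  size_t length;

  ASSERT (buf != NULL);
  ASSERT (buf->length > 0);

  length = buf->length;
  return buf->bytes[length - 1];
}
```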
108<P>
109
110<A NAME="Function and Parameter Attributes"></A>
111<HR SIZE="6">
112<A NAME="SEC99"></A>
113<H2> D.3 Function and Parameter Attributes </H2>
114<!--docid::SEC99::-->
115<P>
116
117These macros, defined in <Q><TT>&lt;debug.h&gt;</TT></Q>, tell the compiler about special
118attributes of a function or function parameter. Their expansions are
119GCC-specific.
120</P>
121<P>
122
123<A NAME="IDX162"></A>
124</P>
125<DL>
126<DT><U>Macro:</U> <B>UNUSED</B>
127<DD>Appended to a function parameter to tell the compiler that the
128parameter might not be used within the function. It suppresses the
129warning that would otherwise appear.
130</DL>
131<P>
132
133<A NAME="IDX163"></A>
134</P>
135<DL>
136<DT><U>Macro:</U> <B>NO_RETURN</B>
137<DD>Appended to a function prototype to tell the compiler that the
138function never returns. It allows the compiler to fine-tune its
139warnings and its code generation.
140</DL>
141<P>
142
143<A NAME="IDX164"></A>
144</P>
145<DL>
146<DT><U>Macro:</U> <B>NO_INLINE</B>
147<DD>Appended to a function prototype to tell the compiler to never emit
148the function in-line. Occasionally useful to improve the quality of
149backtraces (see below).
150</DL>
151<P>
152
153<A NAME="IDX165"></A>
154</P>
155<DL>
156<DT><U>Macro:</U> <B>PRINTF_FORMAT</B> <I>(<VAR>format</VAR>, <VAR>first</VAR>)</I>
157<DD>Appended to a function prototype to tell the compiler that the function
158takes a <CODE>printf()</CODE>-like format string as the argument numbered
159<VAR>format</VAR> (starting from 1) and that the corresponding value
160arguments start at the argument numbered <VAR>first</VAR>. This lets the
161compiler tell you if you pass the wrong argument types.
162</DL>
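<P>
The following sketch shows how these macros might be applied. The macro
definitions are an assumption about how <Q><TT>&lt;debug.h&gt;</TT></Q> expands
them, written out here so that the example compiles on its own with GCC,
and <CODE>log_message()</CODE> and <CODE>constant_callback()</CODE> are
hypothetical functions.
</P>

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed expansions, modeled on the GCC attributes of the same names. */
#define UNUSED __attribute__ ((unused))
#define NO_RETURN __attribute__ ((noreturn))
#define NO_INLINE __attribute__ ((noinline))
#define PRINTF_FORMAT(FMT, FIRST) __attribute__ ((format (printf, FMT, FIRST)))

/* Hypothetical logger: argument 2 is the format string, and the values
   it describes start at argument 3. */
int log_message (int level, const char *format, ...) PRINTF_FORMAT (2, 3);

/* Prints LEVEL followed by the formatted message and returns the
   number of characters written. */
int
log_message (int level, const char *format, ...)
{
  va_list args;
  int written;

  written = printf ("[%d] ", level);
  va_start (args, format);
  written += vprintf (format, args);
  va_end (args);
  return written;
}

/* AUX is required by the callback signature but is not needed here,
   so it is marked UNUSED to suppress the compiler warning. */
int
constant_callback (void *aux UNUSED)
{
  return 42;
}
```

<P>
With the <CODE>PRINTF_FORMAT (2, 3)</CODE> annotation in place, a call such
as <CODE>log_message (1, "%d\n", "text")</CODE> should draw a format
warning from GCC when warnings are enabled.
</P>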
163<P>
164
165<A NAME="Backtraces"></A>
166<HR SIZE="6">
167<A NAME="SEC100"></A>
168<H2> D.4 Backtraces </H2>
169<!--docid::SEC100::-->
170<P>
171
172When the kernel panics, it prints a &quot;backtrace,&quot; that is, a summary
173of how your program got where it is, as a list of addresses inside the
174functions that were running at the time of the panic. You can also
175insert a call to <CODE>debug_backtrace()</CODE>, prototyped in
176<Q><TT>&lt;debug.h&gt;</TT></Q>, to print a backtrace at any point in your code.
177<CODE>debug_backtrace_all()</CODE>, also declared in <Q><TT>&lt;debug.h&gt;</TT></Q>,
178prints backtraces of all threads.
179</P>
180<P>
181
182The addresses in a backtrace are listed as raw hexadecimal numbers,
183which are difficult to interpret. We provide a tool called
184<CODE>backtrace</CODE> to translate these into function names and source
185file line numbers.
186Give it the name of your <Q><TT>kernel.o</TT></Q> as the first argument and the
187hexadecimal numbers composing the backtrace (including the <Q><SAMP>0x</SAMP></Q>
188prefixes) as the remaining arguments. It outputs the function name
189and source file line numbers that correspond to each address.
190</P>
191<P>
192
193If the translated form of a backtrace is garbled, or doesn't make
194sense (e.g. function A is listed above function B, but B doesn't
195call A), then it's a good sign that you're corrupting a kernel
196thread's stack, because the backtrace is extracted from the stack.
197Alternatively, it could be that the <Q><TT>kernel.o</TT></Q> you passed to
198<CODE>backtrace</CODE> is not the same kernel that produced
199the backtrace.
200</P>
201<P>
202
203Sometimes backtraces can be confusing without any corruption.
204Compiler optimizations can cause surprising behavior. When a function
205has called another function as its final action (a <EM>tail call</EM>), the
206calling function may not appear in a backtrace at all. Similarly, when
207function A calls another function B that never returns, the compiler may
208optimize such that an unrelated function C appears in the backtrace
209instead of A. Function C is simply the function that happens to be in
210memory just after A. In the threads project, this is commonly seen in
211backtraces for test failures.
212</P>
213<P>
214
215<A NAME="Backtrace Example"></A>
216<HR SIZE="6">
217<A NAME="SEC101"></A>
218<H3> D.4.1 Example </H3>
219<!--docid::SEC101::-->
220<P>
221
222Here's an example. Suppose that Pintos printed out the following call
223stack, which is taken from an actual Pintos submission for the file
224system project:
225</P>
226<P>
227
228<TABLE><tr><td>&nbsp;</td><td class=example><pre>Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319
2290xc010325a 0x804812c 0x8048a96 0x8048ac8.
230</pre></td></tr></table><P>
231
232You would then invoke the <CODE>backtrace</CODE> utility as shown below,
233cutting and pasting the backtrace information into the command line.
234This assumes that <Q><TT>kernel.o</TT></Q> is in the current directory. You
235would of course enter all of the following on a single shell command
236line, even though that would overflow our margins here:
237</P>
238<P>
239
240<TABLE><tr><td>&nbsp;</td><td class=example><pre>backtrace kernel.o 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67
2410xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8
242</pre></td></tr></table><P>
243
244The backtrace output would then look something like this:
245</P>
246<P>
247
248<TABLE><tr><td>&nbsp;</td><td class=example><pre>0xc0106eff: debug_panic (lib/debug.c:86)
2490xc01102fb: file_seek (filesys/file.c:405)
2500xc010dc22: seek (userprog/syscall.c:744)
2510xc010cf67: syscall_handler (userprog/syscall.c:444)
2520xc0102319: intr_handler (threads/interrupt.c:334)
2530xc010325a: intr_entry (threads/intr-stubs.S:38)
2540x0804812c: (unknown)
2550x08048a96: (unknown)
2560x08048ac8: (unknown)
257</pre></td></tr></table><P>
258
259(You will probably not see exactly the same addresses if you run the
260command above on your own kernel binary, because the source code you
261compiled and the compiler you used are probably different.)
262</P>
263<P>
264
265The first line in the backtrace refers to <CODE>debug_panic()</CODE>, the
266function that implements kernel panics. Because backtraces commonly
267result from kernel panics, <CODE>debug_panic()</CODE> will often be the first
268function shown in a backtrace.
269</P>
270<P>
271
272The second line shows <CODE>file_seek()</CODE> as the function that panicked,
273in this case as the result of an assertion failure. In the source code
274tree used for this example, line 405 of <Q><TT>filesys/file.c</TT></Q> is the
275assertion
276</P>
277<P>
278
279<TABLE><tr><td>&nbsp;</td><td class=example><pre>ASSERT (file_ofs &gt;= 0);
280</pre></td></tr></table><P>
281
282(This line was also cited in the assertion failure message.)
283Thus, <CODE>file_seek()</CODE> panicked because it was passed a negative file offset
284argument.
285</P>
286<P>
287
288The third line indicates that <CODE>seek()</CODE> called <CODE>file_seek()</CODE>,
289presumably without validating the offset argument. In this submission,
290<CODE>seek()</CODE> implements the <CODE>seek</CODE> system call.
291</P>
292<P>
293
294The fourth line shows that <CODE>syscall_handler()</CODE>, the system call
295handler, invoked <CODE>seek()</CODE>.
296</P>
297<P>
298
299The fifth and sixth lines are the interrupt handler entry path.
300</P>
301<P>
302
303The remaining lines are for addresses below <CODE>PHYS_BASE</CODE>. This
304means that they refer to addresses in the user program, not in the
305kernel. If you know what user program was running when the kernel
306panicked, you can re-run <CODE>backtrace</CODE> on the user program, like
307so (typing the command on a single line, of course):
308</P>
309<P>
310
311<TABLE><tr><td>&nbsp;</td><td class=example><pre>backtrace tests/filesys/extended/grow-too-big 0xc0106eff 0xc01102fb
3120xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96
3130x8048ac8
314</pre></td></tr></table><P>
315
316The results look like this:
317</P>
318<P>
319
320<TABLE><tr><td>&nbsp;</td><td class=example><pre>0xc0106eff: (unknown)
3210xc01102fb: (unknown)
3220xc010dc22: (unknown)
3230xc010cf67: (unknown)
3240xc0102319: (unknown)
3250xc010325a: (unknown)
3260x0804812c: test_main (...xtended/grow-too-big.c:20)
3270x08048a96: main (tests/main.c:10)
3280x08048ac8: _start (lib/user/entry.c:9)
329</pre></td></tr></table><P>
330
331You can even specify both the kernel and the user program names on
332the command line, like so:
333</P>
334<P>
335
336<TABLE><tr><td>&nbsp;</td><td class=example><pre>backtrace kernel.o tests/filesys/extended/grow-too-big 0xc0106eff
3370xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c
3380x8048a96 0x8048ac8
339</pre></td></tr></table><P>
340
341The result is a combined backtrace:
342</P>
343<P>
344
345<TABLE><tr><td>&nbsp;</td><td class=example><pre>In kernel.o:
3460xc0106eff: debug_panic (lib/debug.c:86)
3470xc01102fb: file_seek (filesys/file.c:405)
3480xc010dc22: seek (userprog/syscall.c:744)
3490xc010cf67: syscall_handler (userprog/syscall.c:444)
3500xc0102319: intr_handler (threads/interrupt.c:334)
3510xc010325a: intr_entry (threads/intr-stubs.S:38)
352In tests/filesys/extended/grow-too-big:
3530x0804812c: test_main (...xtended/grow-too-big.c:20)
3540x08048a96: main (tests/main.c:10)
3550x08048ac8: _start (lib/user/entry.c:9)
356</pre></td></tr></table><P>
357
358Here's an extra tip for anyone who read this far: <CODE>backtrace</CODE>
359is smart enough to strip the <CODE>Call stack:</CODE> header and <Q><SAMP>.</SAMP></Q>
360trailer from the command line if you include them. This can save you
361a little bit of trouble in cutting and pasting. Thus, the following
362command prints the same output as the first one we used:
363</P>
364<P>
365
366<TABLE><tr><td>&nbsp;</td><td class=example><pre>backtrace kernel.o Call stack: 0xc0106eff 0xc01102fb 0xc010dc22
3670xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8.
368</pre></td></tr></table><P>
369
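<P>
The split between kernel and user addresses in the example above can be
sketched in C. This is only an illustration: the value
<TT>0xc0000000</TT> is an assumption about the default
<CODE>PHYS_BASE</CODE>, and <CODE>is_user_address()</CODE> and
<CODE>count_user_addresses()</CODE> are hypothetical helpers, not Pintos
functions.
</P>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: PHYS_BASE is taken to be 0xc0000000 for this sketch. */
#define PHYS_BASE_SKETCH 0xc0000000u

/* Returns true if ADDRESS lies below PHYS_BASE, i.e. in user space. */
bool
is_user_address (uint32_t address)
{
  return address < PHYS_BASE_SKETCH;
}

/* Counts how many of the COUNT entries in ADDRESSES are user addresses. */
size_t
count_user_addresses (const uint32_t *addresses, size_t count)
{
  size_t users = 0;
  size_t i;

  for (i = 0; i < count; i++)
    if (is_user_address (addresses[i]))
      users++;
  return users;
}
```

<P>
Applied to the nine addresses in the call stack above, this counts three
user addresses, matching the three lines that only resolve against the
user program.
</P>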
370<A NAME="GDB"></A>
371<HR SIZE="6">
372<A NAME="SEC102"></A>
373<H2> D.5 GDB </H2>
374<!--docid::SEC102::-->
375<P>
376
377You can run Pintos under the supervision of the GDB debugger.
378First, start Pintos with the <Q><SAMP>--gdb</SAMP></Q> option, e.g.
379<CODE>pintos --gdb -- run mytest</CODE>. Second, open a second terminal on
380the same machine and
381use <CODE>pintos-gdb</CODE> to invoke GDB on
382<Q><TT>kernel.o</TT></Q>:<A NAME="DOCF5" HREF="pintos_fot.html#FOOT5">(5)</A>
383<TABLE><tr><td>&nbsp;</td><td class=example><pre>pintos-gdb kernel.o
384</pre></td></tr></table>and issue the following GDB command:
385<TABLE><tr><td>&nbsp;</td><td class=example><pre>target remote localhost:1234
386</pre></td></tr></table><P>
387
388Now GDB is connected to the simulator over a local
389network connection. You can now issue any normal GDB
390commands. If you issue the <Q><SAMP>c</SAMP></Q> command, the simulated BIOS will take
391control, load Pintos, and then Pintos will run in the usual way. You
392can pause the process at any point with <KBD>Ctrl+C</KBD>.
393</P>
394<P>
395
396<A NAME="Using GDB"></A>
397<HR SIZE="6">
398<A NAME="SEC103"></A>
399<H3> D.5.1 Using GDB </H3>
400<!--docid::SEC103::-->
401<P>
402
403You can read the GDB manual by typing <CODE>info gdb</CODE> at a
404terminal command prompt. Here are a few commonly useful GDB commands:
405</P>
406<P>
407
408<A NAME="IDX166"></A>
409</P>
410<DL>
411<DT><U>GDB Command:</U> <B>c</B>
412<DD>Continues execution until <KBD>Ctrl+C</KBD> or the next breakpoint.
413</DL>
414<P>
415
416<A NAME="IDX167"></A>
417</P>
418<DL>
419<DT><U>GDB Command:</U> <B>break</B> <I>function</I>
420<DD><A NAME="IDX168"></A>
421<DT><U>GDB Command:</U> <B>break</B> <I>file:line</I>
422<DD><A NAME="IDX169"></A>
423<DT><U>GDB Command:</U> <B>break</B> <I>*address</I>
424<DD>Sets a breakpoint at <VAR>function</VAR>, at <VAR>line</VAR> within <VAR>file</VAR>, or
425at <VAR>address</VAR>.
426(Use a <Q><SAMP>0x</SAMP></Q> prefix to specify an address in hex.)
427<P>
428
429Use <CODE>break main</CODE> to make GDB stop when Pintos starts running.
430</P>
431</DL>
432<P>
433
434<A NAME="IDX170"></A>
435</P>
436<DL>
437<DT><U>GDB Command:</U> <B>p</B> <I>expression</I>
438<DD>Evaluates the given <VAR>expression</VAR> and prints its value.
439If the expression contains a function call, that function will actually
440be executed.
441</DL>
442<P>
443
444<A NAME="IDX171"></A>
445</P>
446<DL>
447<DT><U>GDB Command:</U> <B>l</B> <I>*address</I>
448<DD>Lists a few lines of code around <VAR>address</VAR>.
449(Use a <Q><SAMP>0x</SAMP></Q> prefix to specify an address in hex.)
450</DL>
451<P>
452
453<A NAME="IDX172"></A>
454</P>
455<DL>
456<DT><U>GDB Command:</U> <B>bt</B>
457<DD>Prints a stack backtrace similar to that output by the
458<CODE>backtrace</CODE> program described above.
459</DL>
460<P>
461
462<A NAME="IDX173"></A>
463</P>
464<DL>
465<DT><U>GDB Command:</U> <B>p/a</B> <I>address</I>
466<DD>Prints the name of the function or variable that occupies <VAR>address</VAR>.
467(Use a <Q><SAMP>0x</SAMP></Q> prefix to specify an address in hex.)
468</DL>
469<P>
470
471<A NAME="IDX174"></A>
472</P>
473<DL>
474<DT><U>GDB Command:</U> <B>disassemble</B> <I>function</I>
475<DD>Disassembles <VAR>function</VAR>.
476</DL>
477<P>
478
479We also provide a set of macros specialized for debugging Pintos,
480written by Godmar Back <A HREF="mailto:gback@cs.vt.edu">gback@cs.vt.edu</A>. You can type
481<CODE>help user-defined</CODE> for basic help with the macros. Here is an
482overview of their functionality, based on Godmar's documentation:
483</P>
484<P>
485
486<A NAME="IDX175"></A>
487</P>
488<DL>
489<DT><U>GDB Macro:</U> <B>debugpintos</B>
490<DD>Attach debugger to a waiting pintos process on the same machine.
491Shorthand for <CODE>target remote localhost:1234</CODE>.
492</DL>
493<P>
494
495<A NAME="IDX176"></A>
496</P>
497<DL>
498<DT><U>GDB Macro:</U> <B>dumplist</B> <I>list type element</I>
499<DD>Prints the elements of <VAR>list</VAR>, which should be a <CODE>struct list</CODE>
500that contains elements of the given <VAR>type</VAR> (without the word
501<CODE>struct</CODE>) in which <VAR>element</VAR> is the <CODE>struct list_elem</CODE> member
502that links the elements.
503<P>
504
505Example: <CODE>dumplist all_list thread allelem</CODE> prints all elements of
506<CODE>struct thread</CODE> that are linked in <CODE>struct list all_list</CODE> using the
507<CODE>struct list_elem allelem</CODE> which is part of <CODE>struct thread</CODE>.
508</P>
509</DL>
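<P>
<CODE>dumplist</CODE> asks for both a type and a member name because the
list links are embedded inside each element, so recovering the enclosing
structure requires knowing where the link sits within it. The sketch
below illustrates that idea in plain C under simplifying assumptions:
the list is terminated by a null pointer rather than by the sentinel
elements Pintos uses, and <CODE>ELEM_TO_STRUCT</CODE>,
<CODE>struct thread_sketch</CODE>, and <CODE>sum_tids()</CODE> are
hypothetical names.
</P>

```c
#include <stddef.h>

/* Simplified stand-in for the Pintos list link type. */
struct list_elem
  {
    struct list_elem *prev;
    struct list_elem *next;
  };

/* Converts a pointer to the embedded MEMBER back into a pointer to the
   enclosing STRUCT by subtracting the member's offset. */
#define ELEM_TO_STRUCT(LIST_ELEM, STRUCT, MEMBER) \
  ((STRUCT *) ((char *) (LIST_ELEM) - offsetof (STRUCT, MEMBER)))

/* Hypothetical element type, loosely modeled on struct thread. */
struct thread_sketch
  {
    int tid;
    struct list_elem allelem;
  };

/* Follows the NEXT pointers starting at FIRST until the chain ends
   and returns the sum of the thread ids of the enclosing structures. */
int
sum_tids (struct list_elem *first)
{
  int total = 0;
  struct list_elem *e;

  for (e = first; e != NULL; e = e->next)
    total += ELEM_TO_STRUCT (e, struct thread_sketch, allelem)->tid;
  return total;
}
```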
510<P>
511
512<A NAME="IDX177"></A>
513</P>
514<DL>
515<DT><U>GDB Macro:</U> <B>btthread</B> <I>thread</I>
516<DD>Shows the backtrace of <VAR>thread</VAR>, which is a pointer to the
517<CODE>struct thread</CODE> of the thread whose backtrace it should show. For the
518current thread, this is identical to the <CODE>bt</CODE> (backtrace) command.
519It also works for any thread suspended in <CODE>schedule()</CODE>,
520provided you know where its kernel stack page is located.
521</DL>
522<P>
523
524<A NAME="IDX178"></A>
525</P>
526<DL>
527<DT><U>GDB Macro:</U> <B>btthreadlist</B> <I>list element</I>
528<DD>Shows the backtraces of all threads in <VAR>list</VAR>, the <CODE>struct list</CODE> in
529which the threads are kept. Specify <VAR>element</VAR> as the
530<CODE>struct list_elem</CODE> field used inside <CODE>struct thread</CODE> to link the threads
531together.
532<P>
533
534Example: <CODE>btthreadlist all_list allelem</CODE> shows the backtraces of
535all threads contained in <CODE>struct list all_list</CODE>, linked together by
536<CODE>allelem</CODE>. This command is useful to determine where your threads
537are stuck when a deadlock occurs. Please see the example scenario below.
538</P>
539</DL>
540<P>
541
542<A NAME="IDX179"></A>
543</P>
544<DL>
545<DT><U>GDB Macro:</U> <B>btthreadall</B>
546<DD>Shorthand for <CODE>btthreadlist all_list allelem</CODE>.
547</DL>
548<P>
549
550<A NAME="IDX180"></A>
551</P>
552<DL>
553<DT><U>GDB Macro:</U> <B>btpagefault</B>
554<DD>Print a backtrace of the current thread after a page fault exception.
555Normally, when a page fault exception occurs, GDB will stop
556with a message that might say:<A NAME="DOCF6" HREF="pintos_fot.html#FOOT6">(6)</A>
557<P>
558
559<TABLE><tr><td>&nbsp;</td><td class=example><pre>Program received signal 0, Signal 0.
5600xc0102320 in intr0e_stub ()
561</pre></td></tr></table><P>
562
563In that case, the <CODE>bt</CODE> command might not give a useful
564backtrace. Use <CODE>btpagefault</CODE> instead.
565</P>
566<P>
567
568You may also use <CODE>btpagefault</CODE> for page faults that occur in a user
569process. In this case, you may wish to also load the user program's
570symbol table using the <CODE>loadusersymbols</CODE> macro, as described below.
571</P>
572</DL>
573<P>
574
575<A NAME="IDX181"></A>
576</P>
577<DL>
578<DT><U>GDB Macro:</U> <B>hook-stop</B>
579<DD>GDB invokes this macro every time the simulation stops, which Bochs will
580do for every processor exception, among other reasons. If the
581simulation stops due to a page fault, <CODE>hook-stop</CODE> will print a
582message that says so and further explains whether the page fault occurred
583in the kernel or in user code.
584<P>
585
586If the exception occurred from user code, <CODE>hook-stop</CODE> will say:
587<TABLE><tr><td>&nbsp;</td><td class=example><pre>pintos-debug: a page fault exception occurred in user mode
588pintos-debug: hit 'c' to continue, or 's' to step to intr_handler
589</pre></td></tr></table><P>
590
591In Project 2, a page fault in a user process leads to the termination of
592the process. You should expect those page faults to occur in the
593robustness tests where we test that your kernel properly terminates
594processes that try to access invalid addresses. To debug those, set a
595breakpoint in <CODE>page_fault()</CODE> in <Q><TT>exception.c</TT></Q>, which you will
596need to modify accordingly.
597</P>
598<P>
599
600In Project 3, a page fault in a user process no longer automatically
601leads to the termination of a process. Instead, it may require reading in
602data for the page the process was trying to access, either
603because it was swapped out or because this is the first time it's
604accessed. In either case, you will reach <CODE>page_fault()</CODE> and need to
605take the appropriate action there.
606</P>
607<P>
608
609If the page fault did not occur in user mode while executing a user
610process, then it occurred in kernel mode while executing kernel code.
611In this case, <CODE>hook-stop</CODE> will print this message:
612<TABLE><tr><td>&nbsp;</td><td class=example><pre>pintos-debug: a page fault occurred in kernel mode
613</pre></td></tr></table>followed by the output of the <CODE>btpagefault</CODE> command.
614<P>
615
616Before Project 3, a page fault exception in kernel code is always a bug
617in your kernel, because your kernel should never crash. Starting with
618Project 3, the situation will change if you use the <CODE>get_user()</CODE> and
619<CODE>put_user()</CODE> strategy to verify user memory accesses
620(see section <A HREF="pintos_2.html#SEC28">2.2.5 Accessing User Memory</A>).
621</P>
622<P>
623
624</P>
625</DL>
626<P>
627
628<A NAME="Example GDB Session"></A>
629<HR SIZE="6">
630<A NAME="SEC104"></A>
631<H3> D.5.2 Example GDB Session </H3>
632<!--docid::SEC104::-->
633<P>
634
635This section narrates a sample GDB session, provided by Godmar Back.
636This example illustrates how one might debug a Project 1 solution in
637which occasionally a thread that calls <CODE>timer_sleep()</CODE> is not woken
638up. With this bug, tests such as <CODE>mlfqs-load-1</CODE> get stuck.
639</P>
640<P>
641
642This session was captured with a slightly older version of Bochs and the
643GDB macros for Pintos, so it looks slightly different than it would now.
644Program output is shown in normal type, user input in <STRONG>strong</STRONG>
645type.
646</P>
647<P>
648
649First, I start Pintos:
650</P>
651<P>
652
653<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>$ <STRONG>pintos -v --gdb -- -q -mlfqs run mlfqs-load-1</STRONG>
654Writing command line to /tmp/gDAlqTB5Uf.dsk...
655bochs -q
656========================================================================
657 Bochs x86 Emulator 2.2.5
658 Build from CVS snapshot on December 30, 2005
659========================================================================
66000000000000i[ ] reading configuration from bochsrc.txt
66100000000000i[ ] Enabled gdbstub
66200000000000i[ ] installing nogui module as the Bochs GUI
66300000000000i[ ] using log file bochsout.txt
664Waiting for gdb connection on localhost:1234
665</FONT></pre></td></tr></table><P>
666
667Then, I open a second window on the same machine and start GDB:
668</P>
669<P>
670
671<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>$ <STRONG>pintos-gdb kernel.o</STRONG>
672GNU gdb Red Hat Linux (6.3.0.0-1.84rh)
673Copyright 2004 Free Software Foundation, Inc.
674GDB is free software, covered by the GNU General Public License, and you are
675welcome to change it and/or distribute copies of it under certain conditions.
676Type &quot;show copying&quot; to see the conditions.
677There is absolutely no warranty for GDB. Type &quot;show warranty&quot; for details.
678This GDB was configured as &quot;i386-redhat-linux-gnu&quot;...
679Using host libthread_db library &quot;/lib/libthread_db.so.1&quot;.
680</FONT></pre></td></tr></table><P>
681
682Then, I tell GDB to attach to the waiting Pintos emulator:
683</P>
684<P>
685
686<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>(gdb) <STRONG>debugpintos</STRONG>
687Remote debugging using localhost:1234
6880x0000fff0 in ?? ()
689Reply contains invalid hex digit 78
690</FONT></pre></td></tr></table><P>
691
692Now I tell Pintos to run by executing <CODE>c</CODE> (short for
693<CODE>continue</CODE>) twice:
694</P>
695<P>
696
697<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>(gdb) <STRONG>c</STRONG>
698Continuing.
699Reply contains invalid hex digit 78
700(gdb) <STRONG>c</STRONG>
701Continuing.
702</FONT></pre></td></tr></table><P>
703
704Now Pintos will continue and output:
705</P>
706<P>
707
708<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>Pintos booting with 4,096 kB RAM...
709Kernel command line: -q -mlfqs run mlfqs-load-1
710374 pages available in kernel pool.
711373 pages available in user pool.
712Calibrating timer... 102,400 loops/s.
713Boot complete.
714Executing 'mlfqs-load-1':
715(mlfqs-load-1) begin
716(mlfqs-load-1) spinning for up to 45 seconds, please wait...
717(mlfqs-load-1) load average rose to 0.5 after 42 seconds
718(mlfqs-load-1) sleeping for another 10 seconds, please wait...
719</FONT></pre></td></tr></table><P>
720
721<small>...</small>until it gets stuck because of the bug I had introduced. I hit
722<KBD>Ctrl+C</KBD> in the debugger window:
723</P>
724<P>
725
726<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>Program received signal 0, Signal 0.
7270xc010168c in next_thread_to_run () at ../../threads/thread.c:649
728649 while (i &lt;= PRI_MAX &amp;&amp; list_empty (&amp;ready_list[i]))
729(gdb)
730</FONT></pre></td></tr></table><P>
731
732The thread that was running when I interrupted Pintos was the idle
733thread. If I run <CODE>backtrace</CODE>, it shows this backtrace:
734</P>
735<P>
736
737<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>(gdb) <STRONG>bt</STRONG>
738#0 0xc010168c in next_thread_to_run () at ../../threads/thread.c:649
739#1 0xc0101778 in schedule () at ../../threads/thread.c:714
740#2 0xc0100f8f in thread_block () at ../../threads/thread.c:324
741#3 0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551
742#4 0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0)
743 at ../../threads/thread.c:575
744#5 0x00000000 in ?? ()
745</FONT></pre></td></tr></table><P>
746
747Not terribly useful. What I'd really like to know is what's up with the
748other thread (or threads). Since I keep all threads in a linked list
749called <CODE>all_list</CODE>, linked together by a <CODE>struct list_elem</CODE> member
750named <CODE>allelem</CODE>, I can use the <CODE>btthreadlist</CODE> macro from the
751macro library I wrote. <CODE>btthreadlist</CODE> iterates through the list of
752threads and prints the backtrace for each thread:
753</P>
754<P>
755
756<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>(gdb) <STRONG>btthreadlist all_list allelem</STRONG>
757pintos-debug: dumping backtrace of thread 'main' @0xc002f000
758#0 0xc0101820 in schedule () at ../../threads/thread.c:722
759#1 0xc0100f8f in thread_block () at ../../threads/thread.c:324
760#2 0xc0104755 in timer_sleep (ticks=1000) at ../../devices/timer.c:141
761#3 0xc010bf7c in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:49
762#4 0xc010aabb in run_test (name=0xc0007d8c &quot;mlfqs-load-1&quot;)
763 at ../../tests/threads/tests.c:50
764#5 0xc0100647 in run_task (argv=0xc0110d28) at ../../threads/init.c:281
765#6 0xc0100721 in run_actions (argv=0xc0110d28) at ../../threads/init.c:331
766#7 0xc01000c7 in main () at ../../threads/init.c:140
767
768pintos-debug: dumping backtrace of thread 'idle' @0xc0116000
769#0 0xc010168c in next_thread_to_run () at ../../threads/thread.c:649
770#1 0xc0101778 in schedule () at ../../threads/thread.c:714
771#2 0xc0100f8f in thread_block () at ../../threads/thread.c:324
772#3 0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551
773#4 0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0)
774 at ../../threads/thread.c:575
775#5 0x00000000 in ?? ()
776</FONT></pre></td></tr></table><P>
777
778In this case, there are only two threads, the idle thread and the main
779thread. The kernel stack pages (to which the <CODE>struct thread</CODE> points)
780are at <TT>0xc0116000</TT> and <TT>0xc002f000</TT>, respectively. The main thread
781is stuck in <CODE>timer_sleep()</CODE>, called from <CODE>test_mlfqs_load_1</CODE>.
782</P>
783<P>
784
785Knowing where threads are stuck can be tremendously useful, for instance
786when diagnosing deadlocks or unexplained hangs.
787</P>
788<P>
789
790<A NAME="IDX182"></A>
791</P>
792<DL>
793<DT><U>GDB Macro:</U> <B>loadusersymbols</B>
794<DD><P>
795
796You can also use GDB to debug a user program running under Pintos.
797To do that, use the <CODE>loadusersymbols</CODE> macro to load the program's
798symbol table:
799<TABLE><tr><td>&nbsp;</td><td class=example><pre>loadusersymbols <VAR>program</VAR>
800</pre></td></tr></table>where <VAR>program</VAR> is the name of the program's executable (in the host
801file system, not in the Pintos file system). For example, you may issue:
802<TABLE><tr><td>&nbsp;</td><td class=smallexample><pre><FONT SIZE=-1>(gdb) <STRONG>loadusersymbols tests/userprog/exec-multiple</STRONG>
803add symbol table from file &quot;tests/userprog/exec-multiple&quot; at
804 .text_addr = 0x80480a0
805(gdb)
806</FONT></pre></td></tr></table><P>
807
808After this, you should be
809able to debug the user program the same way you would the kernel, by
810placing breakpoints, inspecting data, etc. Your actions apply to every
811user program running in Pintos, not just to the one you want to debug,
812so be careful in interpreting the results: GDB does not know
813which process is currently active (because that is an abstraction
814the Pintos kernel creates). Also, a name that appears in
815both the kernel and the user program will actually refer to the kernel
816name. (The latter problem can be avoided by giving the user executable
817name on the GDB command line, instead of <Q><TT>kernel.o</TT></Q>, and then using
818<CODE>loadusersymbols</CODE> to load <Q><TT>kernel.o</TT></Q>.)
819<CODE>loadusersymbols</CODE> is implemented via GDB's <CODE>add-symbol-file</CODE>
820command.
821</P>
822<P>
823
824</P>
825</DL>
826<P>
827
828<A NAME="GDB FAQ"></A>
829<HR SIZE="6">
830<A NAME="SEC105"></A>
831<H3> D.5.3 FAQ </H3>
832<!--docid::SEC105::-->
833<P>
834
835</P>
836<DL COMPACT>
837<DT>GDB can't connect to Bochs.
838<DD><P>
839
840If the <CODE>target remote</CODE> command fails, then make sure that both
841GDB and <CODE>pintos</CODE> are running on the same machine by
842running <CODE>hostname</CODE> in each terminal. If the names printed
843differ, then you need to open a new terminal for GDB on the
844machine running <CODE>pintos</CODE>.
845</P>
846<P>
847
848</P>
849<DT>GDB doesn't recognize any of the macros.
850<DD><P>
851
852If you start GDB with <CODE>pintos-gdb</CODE>, it should load the Pintos
853macros automatically. If you start GDB some other way, then you must
854issue the command <CODE>source <VAR>pintosdir</VAR>/src/misc/gdb-macros</CODE>,
855where <VAR>pintosdir</VAR> is the root of your Pintos directory, before you
856can use them.
857</P>
858<P>
859
860</P>
861<DT>Can I debug Pintos with DDD?
862<DD><P>
863
864Yes, you can. DDD invokes GDB as a subprocess, so you'll need to tell
865it to invoke <CODE>pintos-gdb</CODE> instead:
866<TABLE><tr><td>&nbsp;</td><td class=example><pre>ddd --gdb --debugger pintos-gdb
867</pre></td></tr></table><P>
868
869</P>
870<DT>Can I use GDB inside Emacs?
871<DD><P>
872
873Yes, you can. Emacs has special support for running GDB as a
874subprocess. Type <KBD>M-x gdb</KBD> and enter your <CODE>pintos-gdb</CODE>
875command at the prompt. The Emacs manual has information on how to use
876its debugging features in a section titled &quot;Debuggers.&quot;
877</P>
878<P>
879
880</P>
881<DT>GDB is doing something weird.
882<DD><P>
883
884If you notice strange behavior while using GDB, there
885are three possibilities: a bug in your
886modified Pintos, a bug in Bochs's
887interface to GDB or in GDB itself, or
888a bug in the original Pintos code. The first and second
889are quite likely, and you should seriously consider both. We hope
890that the third is less likely, but it is also possible.
891</DL>
892<P>
893
894<A NAME="Triple Faults"></A>
895<HR SIZE="6">
896<A NAME="SEC106"></A>
897<H2> D.6 Triple Faults </H2>
898<!--docid::SEC106::-->
899<P>
900
901When a CPU exception handler, such as a page fault handler, cannot be
902invoked because it is missing or defective, the CPU will try to invoke
903the &quot;double fault&quot; handler. If the double fault handler is itself
904missing or defective, that's called a &quot;triple fault.&quot; A triple fault
905causes an immediate CPU reset.
906</P>
907<P>
908
909Thus, if you get yourself into a situation where the machine reboots in
910a loop, that's probably a &quot;triple fault.&quot; In a triple fault
911situation, you might not be able to use <CODE>printf()</CODE> for debugging,
912because the reboots might be happening even before everything needed for
913<CODE>printf()</CODE> is initialized.
914</P>
915<P>
916
917There are at least two ways to debug triple faults. First, you can run
918Pintos in Bochs under GDB (see section <A HREF="pintos_8.html#SEC102">D.5 GDB</A>). If Bochs has been built
919properly for Pintos, a triple fault under GDB will cause it to print the
920message &quot;Triple fault: stopping for gdb&quot; on the console and break into
921the debugger. (If Bochs is not running under GDB, a triple fault will
922still cause it to reboot.) You can then inspect where Pintos stopped,
923which is where the triple fault occurred.
924</P>
925<P>
926
927Another option is what I call &quot;debugging by infinite loop.&quot;
928Pick a place in the Pintos code, insert the infinite loop
929<CODE>for (;;);</CODE> there, and recompile and run. There are two likely
930possibilities:
931</P>
932<P>
933
934<UL>
935<LI>
936The machine hangs without rebooting. If this happens, you know that
937the infinite loop is running. That means that whatever caused the
938reboot must be <EM>after</EM> the place you inserted the infinite loop.
939Now move the infinite loop later in the code sequence.
940<P>
941
942</P>
943<LI>
944The machine reboots in a loop. If this happens, you know that the
945machine didn't make it to the infinite loop. Thus, whatever caused the
946reboot must be <EM>before</EM> the place you inserted the infinite loop.
947Now move the infinite loop earlier in the code sequence.
948</UL>
949<P>
950
951If you move the infinite loop around in a &quot;binary search&quot; fashion, you
952can use this technique to pin down the exact spot where everything goes
953wrong. It should only take a few minutes at most.
954</P>
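<P>
The bookkeeping behind this search can be sketched in C. This is only a
model, not part of Pintos: it assumes the boot sequence can be numbered as
steps 0 through N-1, that exactly one step triple-faults, and that each
experiment is deterministic. The names <CODE>probe_func</CODE>,
<CODE>find_faulting_step()</CODE>, and <CODE>simulated_probe()</CODE> are
hypothetical. In practice each "probe" is you moving the loop, recompiling,
and watching whether the machine hangs or reboots.
</P>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one recompile-and-run experiment.
   Returns true if the machine reboots when the infinite loop is
   placed just before step LOOP_POS, false if it hangs. */
typedef bool probe_func (int loop_pos, void *aux);

/* Returns the index of the faulting step among steps 0...N_STEPS-1,
   assuming exactly one step faults and probes are deterministic. */
int
find_faulting_step (int n_steps, probe_func *reboots, void *aux)
{
  int lo = 0;        /* Loop before step LO hangs: fault is at or after LO. */
  int hi = n_steps;  /* Loop before step HI reboots: fault is before HI. */

  assert (n_steps > 0);
  while (hi - lo > 1)
    {
      int mid = lo + (hi - lo) / 2;
      if (reboots (mid, aux))
        hi = mid;    /* Never reached the loop: move it earlier. */
      else
        lo = mid;    /* Reached the loop: move it later. */
    }
  return lo;
}

/* Simulated probe for demonstration only.  AUX points to an int
   holding the step that faults.  The machine reboots if and only if
   the faulting step runs before the loop is reached. */
bool
simulated_probe (int loop_pos, void *aux)
{
  int fault_step = *(int *) aux;
  return fault_step < loop_pos;
}
```

<P>
With N steps, the search needs about log2(N) experiments, which is why even
a long initialization sequence can be narrowed down quickly.
</P>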
955<P>
956
957<A NAME="Modifying Bochs"></A>
958<HR SIZE="6">
959<A NAME="SEC107"></A>
960<H2> D.7 Modifying Bochs </H2>
961<!--docid::SEC107::-->
962<P>
963
964An advanced debugging technique is to modify and recompile the
965simulator. This proves useful when the simulated hardware has more
966information than it makes available to the OS. For example, page
967faults have a long list of potential causes, but the hardware does not
968report to the OS exactly which one is the particular cause.
969Furthermore, a bug in the kernel's handling of page faults can easily
970lead to recursive faults, but a &quot;triple fault&quot; will cause the CPU to
971reset itself, which is hardly conducive to debugging.
972</P>
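<P>
To illustrate how little the OS is told, here is a minimal sketch of
decoding the error code that an x86 CPU pushes on a page fault. The three
bit meanings are those defined by the x86 architecture. The names
<CODE>struct pf_cause</CODE>, <CODE>decode_page_fault()</CODE>, and
<CODE>print_page_fault()</CODE> are chosen for illustration and are not
taken from this document.
</P>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Page fault error code bits defined by the x86 architecture. */
#define PF_P 0x1    /* 0: not-present page.  1: access rights violation. */
#define PF_W 0x2    /* 0: read.  1: write. */
#define PF_U 0x4    /* 0: kernel mode.  1: user mode. */

/* Everything the three low error code bits say about a fault
   (illustrative type, not from the Pintos source). */
struct pf_cause
  {
    bool not_present;   /* True if the page was not present. */
    bool write;         /* True if the access was a write. */
    bool user;          /* True if the access came from user mode. */
  };

/* Decodes the low three bits of ERROR_CODE. */
struct pf_cause
decode_page_fault (uint32_t error_code)
{
  struct pf_cause c;
  c.not_present = (error_code & PF_P) == 0;
  c.write = (error_code & PF_W) != 0;
  c.user = (error_code & PF_U) != 0;
  return c;
}

/* Prints a one-line summary of the fault described by ERROR_CODE. */
void
print_page_fault (uint32_t error_code)
{
  struct pf_cause c = decode_page_fault (error_code);
  printf ("%s error %s page in %s context\n",
          c.not_present ? "not present" : "rights violation",
          c.write ? "writing" : "reading",
          c.user ? "user" : "kernel");
}
```

<P>
Three bits cannot distinguish, say, the different reasons a page might be
marked not present, which is the kind of detail a modified simulator can
print.
</P>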
973<P>
974
975In a case like this, you might appreciate being able to make Bochs
976print out more debug information, such as the exact type of fault that
977occurred. It's not very hard. You start by retrieving the source
978code for Bochs 2.2.6 from <A HREF="http://bochs.sourceforge.net">http://bochs.sourceforge.net</A> and
979saving the file <Q><TT>bochs-2.2.6.tar.gz</TT></Q> into a directory.
980The script <Q><TT>pintos/src/misc/bochs-2.2.6-build.sh</TT></Q>
981applies a number of patches contained in <Q><TT>pintos/src/misc</TT></Q>
982to the Bochs tree, then builds Bochs and installs it in a directory
983of your choice.
984Run this script without arguments to see usage instructions.
985To use your <Q><TT>bochs</TT></Q> binary with <CODE>pintos</CODE>, make sure
986it is the one printed by <Q><SAMP>which bochs</SAMP></Q>; otherwise, modify
987your <CODE>PATH</CODE> accordingly.
988</P>
989<P>
990
991Of course, to get any good out of this you'll have to actually modify
992Bochs. Instructions for doing this are firmly out of the scope of
993this document. However, if you want to debug page faults as suggested
994above, a good place to start adding <CODE>printf()</CODE>s is
995<CODE>BX_CPU_C::dtranslate_linear()</CODE> in <Q><TT>cpu/paging.cc</TT></Q>.
996</P>
997<P>
998
999<A NAME="Debugging Tips"></A>
1000<HR SIZE="6">
1001<A NAME="SEC108"></A>
1002<H2> D.8 Tips </H2>
1003<!--docid::SEC108::-->
1004<P>
1005
1006The page allocator in <Q><TT>threads/palloc.c</TT></Q> and the block allocator in
1007<Q><TT>threads/malloc.c</TT></Q> fill freed memory with the byte
1008<TT>0xcc</TT> at time of free. Thus, if you see an attempt to
1009dereference a pointer like <TT>0xcccccccc</TT>, or some other reference to
1010<TT>0xcc</TT>, there's a good chance you're trying to reuse a page that's
1011already been freed. Also, byte <TT>0xcc</TT> is the CPU opcode for &quot;invoke
1012interrupt 3,&quot; so if you see an error like <CODE>Interrupt 0x03 (#BP
1013Breakpoint Exception)</CODE>, then Pintos tried to execute code in a freed page or
1014block.
1015</P>
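<P>
A small sketch shows why a stale pointer ends up reading as
<TT>0xcccccccc</TT>. It is not the Pintos allocator code:
<CODE>poison()</CODE> only imitates the fill that happens at free time, and
<CODE>poison()</CODE>, <CODE>is_poisoned()</CODE>, and
<CODE>struct node</CODE> are hypothetical names used for illustration.
</P>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0xcc

/* Example structure holding a pointer, as a freed block might. */
struct node
  {
    struct node *next;
    int value;
  };

/* Imitates the fill done at free time: overwrites all SIZE bytes
   of BLOCK with 0xcc.  It does not release any memory. */
void
poison (void *block, size_t size)
{
  assert (block != NULL);
  memset (block, POISON_BYTE, size);
}

/* Returns true if all SIZE bytes starting at P are 0xcc, which
   suggests that P points into memory that has been freed. */
bool
is_poisoned (const void *p, size_t size)
{
  const unsigned char *b = p;
  size_t i;

  assert (p != NULL);
  for (i = 0; i < size; i++)
    if (b[i] != POISON_BYTE)
      return false;
  return true;
}
```

<P>
After <CODE>poison (&amp;n, sizeof n)</CODE> on a <CODE>struct node n</CODE>,
every byte of <CODE>n.next</CODE> is <TT>0xcc</TT>, so on a 32-bit machine
the pointer reads as <TT>0xcccccccc</TT> and following it faults.
</P>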
1016<P>
1017
1018An assertion failure on the expression <CODE>sec_no &lt; d-&gt;capacity</CODE>
1019indicates that Pintos tried to access a file through an inode that has
1020been closed and freed. Freeing an inode clears its starting sector
1021number to <TT>0xcccccccc</TT>, which is not a valid sector number for disks
1022smaller than about 1.6 TB.
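<P>
The 1.6 TB figure can be checked with a little arithmetic. The sketch below
assumes 512-byte sectors, and <CODE>SECTOR_SIZE</CODE> and
<CODE>min_disk_bytes_for_sector()</CODE> are names made up for this example.
</P>

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u    /* Assumption: bytes per disk sector. */

/* Returns the smallest disk size, in bytes, for which SECTOR is a
   valid sector number.  Sector numbers start at 0, so the disk
   must hold at least SECTOR + 1 sectors. */
uint64_t
min_disk_bytes_for_sector (uint32_t sector)
{
  return ((uint64_t) sector + 1) * SECTOR_SIZE;
}
```

<P>
For sector <TT>0xcccccccc</TT> this gives 1,759,218,604,544 bytes, which is
about 1.6 TB when a terabyte is counted as 2^40 bytes. Any smaller disk has
a capacity below that sector number, so the assertion fails.
</P>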
1023<A NAME="Development Tools"></A>
1024<HR SIZE="6">
1025<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
1026<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_8.html#SEC96"> &lt;&lt; </A>]</TD>
1027<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_9.html#SEC109"> &gt;&gt; </A>]</TD>
1028<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Top">Top</A>]</TD>
1029<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos.html#SEC_Contents">Contents</A>]</TD>
1030<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
1031<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="pintos_abt.html#SEC_About"> ? </A>]</TD>
1032</TR></TABLE>
1033<BR>
1034<FONT SIZE="-1">
1035This document was generated
1036on <I>March, 6 2012</I>
1037using <A HREF="http://texi2html.cvshome.org"><I>texi2html</I></A>
1038</FONT>
1039
1040</BODY>
1041</HTML>