Add native debugger manual section

tmcgilchrist · Jan 6, 2025 · 550de75
1 parent e8d4798
commit 550de75
Showing 4 changed files with 319 additions and 1 deletion.
diff --git a/Changes b/Changes
@@ -149,6 +149,9 @@ Working version
 
 ### Manual and documentation:
 
+- #?????: Document support for native debugging with GDB and LLDB.
+   (Tim McGilchrist, review by ???)
+
 ### Compiler user-interface and warnings:
 
 - #13428: support dump=[source | parsetree | lambda | ... | cmm | ...]

diff --git a/manual/src/allfiles.etex b/manual/src/allfiles.etex
@@ -69,6 +69,7 @@ and as a
 \input{ocamldep.tex}
 \input{ocamldoc.tex}
 \input{debugger.tex}
+\input{native-debugger.tex}
 \input{profil.tex}
 \input{intf-c.tex}
 \input{flambda.tex}

diff --git a/manual/src/cmds/Makefile b/manual/src/cmds/Makefile
@@ -10,7 +10,7 @@ TEXQUOTE = $(OCAMLRUN) $(TOOLS)/texquote2
 TRANSF = $(OCAMLRUN) $(TOOLS)/transf
 
 FILES = comp.tex top.tex runtime.tex native.tex lexyacc.tex intf-c.tex \
-  ocamldep.tex profil.tex debugger.tex ocamldoc.tex \
+  ocamldep.tex profil.tex debugger.tex native-debugger.tex ocamldoc.tex \
   warnings-help.tex flambda.tex tail-mod-cons.tex \
   afl-fuzz.tex runtime-tracing.tex unified-options.tex tsan.tex
 

diff --git a/manual/src/cmds/native-debugger.etex b/manual/src/cmds/native-debugger.etex
@@ -0,0 +1,314 @@
+\chapter{Native debugger} \label{c:native-debugger}
+%HEVEA\cutname{native-debugger.html}
+
+\section{s:native-debugger-preliminaries}{Preliminaries}
+
+This chapter describes the support for debugging OCaml executables built using the native-code compiler \texttt{ocamlopt}, with GDB or LLDB on Linux, macOS, and FreeBSD platforms. We call this \emph{native debugging}, in contrast to the bytecode debugging supported by \texttt{ocamldebug} (see chapter~\ref{c:debugger}).
+
+\subsection{ss:native-debugger-dwarf}{DWARF}
+
+The OCaml compiler emits its debugging information in the \href{http://dwarfstd.org/}{DWARF} format, which is used by many compilers and debuggers to support source-level debugging. DWARF is supported on Linux ELF, macOS Mach-O and FreeBSD ELF.
+
+Within the larger DWARF standard, the compiler specifically uses Call Frame Information (CFI) to describe the stack frames for OCaml code, which allows a debugger to unwind the call stack and generate a backtrace. CFI is preserved across OCaml stack frames into the sections of the runtime written in C, and across the Foreign Function Interface (FFI) provided the foreign code also supplies CFI. OCaml defines its own calling convention, which details how arguments are passed to functions, how values are returned and how registers are used. This information is also encoded using CFI.
+
+The OCaml compiler generates line information that maps machine instructions back to their source program location (e.g., the instruction at address \texttt{0xdeadbeef} originated from \texttt{myprogram.ml} line 42). This allows native debuggers to display the OCaml source code for the program being debugged and enables stepping through OCaml source code.
+
+\subsection{ss:native-debugger-name-mangling}{Name Mangling}
+
+Name mangling is the process by which the OCaml compiler generates symbol names for OCaml language constructs. The format of these symbols matters because it lets native debuggers uniquely identify the source function for a symbol without referencing the original source code. In the absence of source mappings, you need to use mangled names to set breakpoints, and mangled names also appear in debugger output such as backtraces. Understanding OCaml's name mangling is therefore useful when debugging OCaml programs.
+
+From OCaml 5.1.1, symbols have the form \texttt{caml<MODULE\_NAME>.<FUNCTION\_NAME>\_<NNN>}, where \texttt{NNN} is a number generated by the compiler to keep symbols unique. Before OCaml 5.1.1, the separator was two underscores, i.e., \texttt{caml<MODULE\_NAME>\_\_<FUNCTION\_NAME>\_<NNN>}. For the Windows MSVC port (restored in OCaml 5.3), the separator is \texttt{\$}, i.e., \texttt{caml<MODULE\_NAME>\$<FUNCTION\_NAME>\_<NNN>}. From OCaml 5.4 onwards, \texttt{\$} is the separator on all platforms.
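+
+The schemes above can be summarised with a small sketch. The \texttt{mangle} function and the \texttt{scheme} type are hypothetical helpers, not compiler APIs, and they reproduce only the prefix, separator and numeric suffix described here, not any other detail of how the compiler generates symbols.
+
+\begin{caml_example*}{verbatim}
+(* Hypothetical helper, not part of the compiler: builds a symbol name
+   from a module name, a function name and the number NNN. *)
+type scheme =
+  | Double_underscore  (* before OCaml 5.1.1 *)
+  | Dot                (* OCaml 5.1.1 up to, but not including, 5.4 *)
+  | Dollar             (* OCaml 5.4 onwards, and the MSVC port *)
+
+let separator = function
+  | Double_underscore -> "__"
+  | Dot -> "."
+  | Dollar -> "$"
+
+let mangle scheme ~modname ~fname ~stamp =
+  Printf.sprintf "caml%s%s%s_%d" modname (separator scheme) fname stamp
+
+let () =
+  (* 271 is the number that appears in the GDB session in this chapter. *)
+  List.iter
+    (fun s -> print_endline (mangle s ~modname:"Fib" ~fname:"fib" ~stamp:271))
+    [Double_underscore; Dot; Dollar]
+\end{caml_example*}
+
+Run as is, this prints \texttt{camlFib\_\_fib\_271}, \texttt{camlFib.fib\_271} and \texttt{camlFib\$fib\_271}, one per line.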
+
+\subsection{ss:native-debugger-frame-pointers}{Frame Pointers}
+
+The OCaml native compiler supports maintaining frame pointers on AMD64 and ARM64 platforms, which a native debugger can use to walk the stack of function calls in a program. The frame pointer (also known as the base pointer) is a register (e.g., \texttt{\%rbp} on AMD64 or \texttt{x29} on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, the call stack for OCaml can be maintained. It is possible to debug OCaml programs using frame pointers only, without CFI enabled. However, the experience is closer to debugging assembly, so using DWARF with CFI is recommended.
+
+\section{s:native-debugger-compilation}{Compiling for Debugging}
+
+Before debugging programs written in OCaml, the native compiler \texttt{ocamlopt} must be installed with CFI emission support, which is enabled by default. CFI emission is controlled by the \texttt{--enable-cfi} flag of the compiler's \texttt{configure} script.
+
+To perform source-level debugging, compile code with the \texttt{-g} flag. This records DWARF information for exception backtraces and generates line information for mapping between assembly and OCaml source locations. Compiling with \texttt{-g} entails no runtime penalty, but it produces larger binaries because they include extra sections of debugging information.
+
+\section{s:native-debugger-gdb}{Using GDB}
+Here we walk through debugging a simple OCaml program using GDB on Linux, showing the commands to use and the expected outputs. This session uses Ubuntu 24.04 LTS on AMD64.
+
+Consider the following program:
+\begin{caml_example*}{verbatim}
+(* fib.ml *)
+let rec fib n =
+  if n = 0 then 0
+  else if n = 1 then 1
+  else fib (n-1) + fib (n-2)
+
+let main () = 
+  let r = fib 20 in 
+  Printf.printf "fib(20) = %d" r
+
+let _ = main ()
+\end{caml_example*}
+
+Compile this program with \texttt{ocamlopt} like so:
+
+\begin{verbatim}
+$ ocamlopt -g -o fib.exe fib.ml
+$ ./fib.exe 20
+fib(20) = 6765
+\end{verbatim}
+
+When run, this program prints the 20th Fibonacci number. The use of recursion is an excuse to inspect the call stack. To do so, start a GDB session for this program:
+
+\begin{verbatim}
+$ gdb ./fib.exe
+\end{verbatim}
+
+Breakpoints can be set using either the mangled names produced by the compiler or a combination of file name and line number. For example:
+
+\begin{verbatim}
+(gdb) break camlFib$fib_     # press tab
+(gdb) break camlFib$fib_271  # 271 happens to be the number generated for NNN
+Breakpoint 1 at 0x3cd50: file fib.ml, line 2.
+
+(gdb) break fib.ml:7         # breakpoint for main function
+Breakpoint 2 at 0x3cdc0: file fib.ml, line 7.
+\end{verbatim}
+
+Now we can run the program and print a backtrace.
+
+\begin{verbatim}
+(gdb) run
+Starting program: fib.exe 
+[Thread debugging using libthread_db enabled]
+Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
+
+Breakpoint 2, camlFib$main_273 () at fib.ml:7
+7	let main () =
+(gdb) continue
+Continuing.
+
+Breakpoint 1, camlFib$fib_271 () at fib.ml:2
+2	let rec fib n =
+(gdb) backtrace
+#0  camlFib$fib_271 () at fib.ml:2
+#1  0x0000555555590de1 in camlFib$main_273 () at fib.ml:7
+#2  0x0000555555590e86 in camlFib$entry () at fib.ml:11
+#3  0x000055555558eaa7 in caml_program ()
+#4  <signal handler called>
+#5  0x00005555555de126 in caml_startup_common (pooling=<optimized out>, argv=0x7fffffffe3f8)
+    at runtime/startup_nat.c:132
+#6  caml_startup_common (argv=0x7fffffffe3f8, pooling=<optimized out>) at runtime/startup_nat.c:88
+#7  0x00005555555de19f in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:139
+#8  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:144
+#9  caml_main (argv=<optimized out>) at runtime/startup_nat.c:151
+#10 0x000055555558e892 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
+\end{verbatim}
+
+There is basic support for printing OCaml values using \href{https://github.com/ocaml/ocaml/blob/5.4.0/tools/gdb.py}{tools/gdb.py} and the built-in Python scripting in GDB. Download that file and load it into GDB like so:
+
+\begin{verbatim}
+(gdb) source ~/ocaml/tools/gdb.py
+OCaml support module loaded. Values of type 'value' will now
+print as OCaml values, there is a $Array() convenience function,
+and an 'ocaml' command is available for heap exploration
+(see 'help ocaml' for more information).
+
+(gdb) p (value)$rax
+$1 = caml:14
+
+\end{verbatim}
+
+We can also print other kinds of OCaml values. To illustrate this, consider the following program:
+\begin{caml_example*}{verbatim}
+(* test_blocks.ml *)
+type t = {s : string; i : int}
+
+let main a b =
+  print_endline "Hello, world!";
+  print_endline a;
+  print_endline b.s
+
+let _ = main "foo" {s = "bar"; i = 42}
+\end{caml_example*}
+
+Compile this program with \texttt{ocamlopt} and load it into GDB:
+
+\begin{verbatim}
+$ ocamlopt -g -o test_blocks.exe test_blocks.ml
+$ gdb ./test_blocks.exe
+(gdb) source ~/ocaml/tools/gdb.py
+...
+(gdb) break camlTest_blocks$main_273 
+Breakpoint 1 at 0x16db0: file test_blocks.ml, line 4.
+(gdb) run
+...
+Breakpoint 1, camlTest_blocks$main_273 () at test_blocks.ml:4
+4	let main a b =
+(gdb) p (value)$rax      # Print out the first argument to main
+$1 = caml(-):'foo'<3>
+(gdb) p (value)$rbx      # Then print the second argument
+$2 = caml(-):('bar', 42) = {caml(-):'bar'<3>, caml:42}
+\end{verbatim}
+
+Note the use of the AMD64 registers \texttt{\$rax} and \texttt{\$rbx} to access the first and second arguments to a function. This follows the OCaml calling convention on AMD64, where the registers from \texttt{\$rax} to \texttt{\$r13} hold OCaml function arguments and \texttt{\$rax} holds the function result. Consult the \texttt{asmcomp/<ARCH>/proc.ml} file for a given architecture for further information about OCaml calling conventions.
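+
+As a rough sketch of that convention, the following helper maps an argument position to the register to inspect. The helper and its names are hypothetical, and the register order is our reading of \texttt{asmcomp/amd64/proc.ml} for integer and pointer arguments only, so check that file for the compiler version in use.
+
+\begin{caml_example*}{verbatim}
+(* Assumed order of the AMD64 registers that carry integer and pointer
+   arguments to an OCaml function; see asmcomp/amd64/proc.ml. *)
+let amd64_int_arg_regs =
+  [| "rax"; "rbx"; "rdi"; "rsi"; "rdx"; "rcx"; "r8"; "r9"; "r12"; "r13" |]
+
+(* Register holding the n-th argument (0-based), written the way GDB
+   expects it, or None when the argument is not in one of these registers. *)
+let arg_register n =
+  if n >= 0 && n < Array.length amd64_int_arg_regs
+  then Some ("$" ^ amd64_int_arg_regs.(n))
+  else None
+
+let () =
+  match arg_register 1 with
+  | Some r -> Printf.printf "second argument: p (value)%s\n" r
+  | None -> print_endline "argument is not in a register"
+\end{caml_example*}
+
+Run as is, this prints \texttt{second argument: p (value)\$rbx}, which is the command used in the session above.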
+
+\subsection{ss:native-debugger-gdb-commands}{GDB Commands}
+A summary of GDB commands that are useful when debugging OCaml programs:
+\begin{options}
+\item["break "\var{locspec}]
+Set a breakpoint at all of the code locations matching \var{locspec}, e.g., using the mangled OCaml names or specifying the linenum in the source file as \texttt{filename:linenum}.
+
+\item["backtrace"]
+Print the backtrace of the entire stack. This will include OCaml source references identifying which stack frame maps to a source location, e.g., \texttt{fib.ml:4}.
+
+\item["disassemble "\var{addresses}]
+Display a range of \var{addresses} as machine instructions. Typically used with the mangled OCaml names to display the assembly for a function. 
+
+\item["info frame"]
+Print a verbose description of the selected stack frame.
+
+\item["list "\var{linenum}]
+Print lines centered around line number \var{linenum} in the current source file. This will print the source code for OCaml and the OCaml runtime written in C.
+
+\end{options}
+
+See the \href{https://sourceware.org/gdb/current/onlinedocs/gdb.html/}{Debugging with GDB} documentation for more details. In general, only the features described above work in GDB; for anything else, users will need to fall back to assembly debugging. GDB is expected to work on all supported Linux architectures.
+
+\section{s:native-debugger-lldb}{Using LLDB}
+
+Here we walk through debugging the earlier fib example using LLDB on Linux. Start an LLDB session using the \texttt{fib.exe} built earlier. This session uses Ubuntu 24.04 LTS on ARM64.
+
+\begin{verbatim}
+$ lldb ./fib.exe
+Current executable set to 'fib.exe' (aarch64).
+(lldb)
+\end{verbatim}
+
+Breakpoints can be set using the OCaml mangled names or using a combination of file name and line number. For example:
+
+\begin{verbatim}
+(lldb) breakpoint set -n camlFib$fib        # press tab for autocomplete
+(lldb) breakpoint set -n camlFib$fib_271
+Breakpoint 1: where = fib.exe`camlFib$fib_271 + 80, address = 0x0000000000052360
+(lldb) breakpoint set -f fib.ml -l 7         # breakpoint for line 7 in fib.ml
+Breakpoint 2: where = fib.exe`camlFib$main_272, address = 0x0000000000051088
+(lldb)
+\end{verbatim}
+
+Now we can run the program.
+\begin{verbatim}
+(lldb) run
+...
+Process 11391 stopped
+* thread #1, name = 'fib.exe', stop reason = breakpoint 2.1
+    frame #0: 0x0000aaaaaaaf1088 fib.exe`camlFib$main_272 at fib.ml:7
+   4   	  else if n = 1 then 1
+   5   	  else fib (n-1) + fib (n-2)
+   6   	
+-> 7   	let main () = 
+   8   	  let r = fib 20 in 
+   9   	  Printf.printf "fib(20) = %d" r
+   10  	
+...
+(lldb) continue
+Process 28032 resuming
+Process 28032 stopped
+* thread #1, name = 'fib.exe', stop reason = breakpoint 1.1
+    frame #0: 0x0000aaaaaaaf2360 fib.exe`camlFib$fib_271 at fib.ml:5
+   2   	let rec fib n =
+   3   	  if n = 0 then 0
+   4   	  else if n = 1 then 1
+-> 5   	  else fib (n-1) + fib (n-2)
+   6   	
+   7   	let main () =
+   8   	  let r = fib 20 in
+
+(lldb) bt               # Print a backtrace
+* thread #1, name = 'fib.exe', stop reason = breakpoint 1.1
+  * frame #0: 0x0000aaaaaaaf2360 fib.exe`camlFib$fib_271 at fib.ml:5
+    frame #1: 0x0000aaaaaaaf23d0 fib.exe`camlFib$main_272 at fib.ml:8
+    frame #2: 0x0000aaaaaaaf2490 fib.exe`camlFib$entry at fib.ml:11
+    frame #3: 0x0000aaaaaaaef748 fib.exe`caml_program + 480
+    frame #4: 0x0000aaaaaab4ab90 fib.exe`caml_start_program + 132
+    frame #5: 0x0000aaaaaab4a5f8 fib.exe`caml_startup_common [inlined] caml_startup_common(pooling=-1430712272, argv=0x0000000000000010) at startup_nat.c:127:9
+    frame #6: 0x0000aaaaaab4a528 fib.exe`caml_startup_common(argv=0x0000000000000010, pooling=-1430712272) at startup_nat.c:86:7
+    frame #7: 0x0000aaaaaab4a670 fib.exe`caml_main [inlined] caml_startup_exn(argv=<unavailable>) at startup_nat.c:134:10
+    frame #8: 0x0000aaaaaab4a66c fib.exe`caml_main [inlined] caml_startup(argv=<unavailable>) at startup_nat.c:139:15
+    frame #9: 0x0000aaaaaab4a66c fib.exe`caml_main(argv=<unavailable>) at startup_nat.c:146:3
+    frame #10: 0x0000aaaaaaaef3d0 fib.exe`main(argc=<unavailable>, argv=<unavailable>) at main.c:37:3
+    frame #11: 0x0000fffff7d784c4 libc.so.6`__libc_start_call_main(main=(fib.exe`main at main.c:31:1), argc=1, argv=0x0000fffffffffc98) at libc_start_call_main.h:58:16
+    frame #12: 0x0000fffff7d78598 libc.so.6`__libc_start_main_impl(main=0x0000aaaaaaba0e68, argc=1, argv=0x0000fffffffffc98, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=<unavailable>) at libc-start.c:360:3
+    frame #13: 0x0000aaaaaaaef470 fib.exe`_start + 48
+\end{verbatim}
+
+There is basic support for printing OCaml values using \href{https://github.com/ocaml/ocaml/blob/5.4.0/tools/lldb.py}{tools/lldb.py} and the built-in Python scripting in LLDB. Download that file and load it into LLDB like so:
+
+\begin{verbatim}
+(lldb) command script import ~/ocaml/tools/lldb.py
+OCaml support module loaded. Values of type 'value' will now
+print as OCaml values, and an 'ocaml' command is available for
+heap exploration (see 'help ocaml' for more information).
+(lldb)  p (value)$x0
+(value) 41 caml:20
+(lldb) 
+\end{verbatim}
+
+Note: above we are using an ARM64 Linux machine, so our first argument is passed in the first register \texttt{x0}.
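+
+The two numbers in that output are related by OCaml's representation of immediate integers: the integer $n$ is stored as the machine word $2n+1$, so the raw register value 41 is the OCaml integer 20. A minimal sketch of the decoding follows, where \texttt{decode\_word} is a hypothetical helper and not part of \texttt{tools/lldb.py}:
+
+\begin{caml_example*}{verbatim}
+(* An immediate integer n is stored as the word 2n + 1; a word with its
+   low bit clear is a pointer. An OCaml int is one bit narrower than a
+   machine word, so this only handles raw values that fit in an int. *)
+let decode_word (w : int) =
+  if w land 1 = 1 then `Int (w asr 1) else `Pointer w
+
+let () =
+  match decode_word 41 with
+  | `Int n -> Printf.printf "raw 41 is the OCaml integer %d\n" n
+  | `Pointer p -> Printf.printf "raw %d is a pointer\n" p
+\end{caml_example*}
+
+Run as is, this prints \texttt{raw 41 is the OCaml integer 20}. Even raw values, such as the addresses LLDB shows for strings and records, fall into the pointer case.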
+
+We can also print other kinds of OCaml values. Reusing \texttt{test\_blocks.exe}, start a new LLDB session:
+
+\begin{verbatim}
+$ lldb ./test_blocks.exe
+...
+(lldb) command script import ~/ocaml/tools/lldb.py
+OCaml support module loaded. Values of type 'value' will now
+print as OCaml values, and an 'ocaml' command is available for
+heap exploration (see 'help ocaml' for more information).
+(lldb) breakpoint set -n camlTest_blocks$main_274 
+Breakpoint 1: where = test_blocks.exe`camlTest_blocks$main_274 + 44, address = 0x000000000001a6fc
+(lldb) run
+...
+Process 15536 stopped
+* thread #1, name = 'test_blocks.exe', stop reason = breakpoint 1.1
+    frame #0: 0x0000aaaaaaaba6fc test_blocks.exe`camlTest_blocks$main_274 at test_blocks.ml:5
+   2   	type t = {s : string; i : int}
+   3   	
+   4   	let main a b =
+-> 5   	  print_endline "Hello, world!";
+   6   	  print_endline a;
+   7   	  print_endline b.s
+   8   	
+...
+(lldb) p (value)$x0
+(value) 187649984957416 caml(-):'Hello, world!'<13>
+(lldb) p (value)$x1
+(value) 187649984957360 caml(-):('bar', 42)
+\end{verbatim}
+
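+Here we use the ARM64 registers \texttt{\$x0} and \texttt{\$x1} to access the first and second arguments to a function. This follows the OCaml calling convention on ARM64, where \texttt{\$x0} to \texttt{\$x15} hold OCaml function arguments. In the session above the breakpoint stopped at line 5, past the function entry, so \texttt{\$x0} appears to have already been reloaded with the string passed to the first \texttt{print\_endline} call rather than \texttt{a}. Consult the \texttt{asmcomp/<ARCH>/proc.ml} file for a given architecture for further information about OCaml calling conventions.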
+
+\subsection{ss:native-debugger-lldb-commands}{LLDB Commands}
+
+A summary of LLDB commands that are useful when debugging OCaml programs:
+
+\begin{options}
+\item["breakpoint set -n "\var{symbol}]
+Set a breakpoint at the code location matching \var{symbol}, e.g., using the mangled OCaml name.
+
+\item["breakpoint set -f "\var{filename}" -l "\var{linenum}]
+Set a breakpoint at \var{linenum} in \var{filename}, e.g., \texttt{fib.ml:7}.
+
+\item["breakpoint set -a "\var{address}]
+Set a breakpoint on a memory \var{address}.
+
+\item["thread backtrace"]
+Print the backtrace of the entire stack; \texttt{bt} is a shorthand for this command. This will include OCaml source references identifying which stack frame maps to a source location.
+
+\item["disassemble"]
+Disassemble specified instructions in the current target. Useful options include \texttt{-n} plus a mangled OCaml name to disassemble a specific function, and \texttt{-a} plus an address to disassemble the function containing that address.
+
+\item["frame info"]
+List information about the current stack frame in the current thread.
+
+\item["source"]
+Commands for examining source code described by debug information for the current target process.
+
+\end{options}