Debug perf manual #18

Closed
wants to merge 29 commits into from

@tmcgilchrist tmcgilchrist commented Nov 1, 2024

@jmid jmid left a comment

Thanks for this! It was a nice read and I learned a few things along the way 😃

In a few places, I found the text to be a bit sparse on details, e.g., for it to be understandable by a gdb/lldb newcomer.

Language-wise I have been raised with the Oxford comma: "X, Y, and Z"
which I can see you don't use. Have you checked what the other manual chapters use (to fit nicely in)? 🙂

For debugging

  • to mirror the ocamldebug manual page, it could be nice to include a description of how to debug programs that
    • take command line parameters
    • read from stdin
  • it could be a nice touch if opam switches installed the gdb/python scripts with the switch during opam switch create to avoid ad-hoc downloading afterwards

For profiling

  • would it make sense to include an example output of perf report with a short explanation of how to read/understand it?
  • similarly, an example flamegraph and how to read/understand it could also be a nice addition
  • perf is by now standard, but I could imagine push-back on including material that depends on (unreleased?) (perl?) scripts from a Brendan Gregg repo.

manual/src/cmds/native-debugger.etex Outdated

The OCaml compiler uses the \href{http://dwarfstd.org/}{DWARF} debugging information file format to describe the debug information it generates. DWARF is a debugging information file format used by many compilers and debuggers to support source level debugging, and it is used by Linux ELF, macOS Mach-O and FreBSD ELF.

Within the DWARF standard the compiler specifically uses Call Frame Information (hereafter abbreviated as CFI) to describe a call stack for OCaml code, sections of the runtime written in C. e.g. Garbage Collector and across the Foreign Function Interface (FFI) if the language provides CFI information. (If the language has been compiled to include CFI information).

This sentence reads incomplete to me. Are there words missing, "and sections of the runtime"?
Nit: The usual sequence surrounding "e.g" is , e.g., https://www.merriam-webster.com/dictionary/e.g. (note: two commas and two periods)

Owner Author

Curious, I've always known "e.g." to be used like "sections of the runtime written in C, e.g. Garbage Collector", without the trailing comma. That usage seems to be common to the existing manual pages, but I'm no National Grammar Rodeo Champion. ;-)

A quick grep "e\.g" *.etex in manual/src/cmds reveals a different convention (mostly in parens, if not then with an initial comma but omitting the second comma) from the Merriam Webster one. Feel free to ignore this part then.

Collaborator

We do use the Oxford Comma in our style guide and also the comma after e.g. or i.e.
I'm totally obsessed with this stuff, so I'm sure to edit things that come across my desk to adhere to our style guide, but I certainly don't expect anyone else to remember all those details! I'll take care of those things. 🙂

Hi @christinerose ! While we do want to use the style guide when writing our own documentation, blog posts and comms, in this case we are aiming at changes to an existing body of text. It's important that we retain uniformity, rather than impose our style guide. Let's ensure we do so!


\subsection{ss:native-debugger-name-mangling}{Name Mangling}

Name mangling is the process for describing how the OCaml compiler generates symbol names for OCaml language constructs. The format of these symbols is important for debuggers, performance and observability tools, to uniquely identify the source function for a symbol and to do so without resource to the original source code. In the absensce of source mappings, you often need to use mangled names to set breakpoints or they will appear in information the native debugger will display. As such knowing how OCaml performs name mangling is important when debugging OCaml programs. OCaml 5.1.1 uses the a name mangling scheme of \texttt{caml<MODULE_NAME>.<FUNCTION_NAME>_<NNN>} where \texttt{NNN} is a randomly generated number. Before 5.1.1 the scheme is uses two underscores as the separator e.g. \texttt{caml<MODULE_NAME>__<FUNCTION_NAME>_<NNN>}.

Suggested change
Name mangling is the process for describing how the OCaml compiler generates symbol names for OCaml language constructs. The format of these symbols is important for debuggers, performance and observability tools, to uniquely identify the source function for a symbol and to do so without resource to the original source code. In the absensce of source mappings, you often need to use mangled names to set breakpoints or they will appear in information the native debugger will display. As such knowing how OCaml performs name mangling is important when debugging OCaml programs. OCaml 5.1.1 uses the a name mangling scheme of \texttt{caml<MODULE_NAME>.<FUNCTION_NAME>_<NNN>} where \texttt{NNN} is a randomly generated number. Before 5.1.1 the scheme is uses two underscores as the separator e.g. \texttt{caml<MODULE_NAME>__<FUNCTION_NAME>_<NNN>}.
Name mangling is the process for describing how the OCaml compiler generates symbol names for OCaml language constructs. The format of these symbols is important for debuggers, performance and observability tools, to uniquely identify the source function for a symbol and to do so without reference to the original source code. In the absence of source mappings, you often need to use mangled names to set breakpoints or they will appear in the information the native debugger will display. As such knowing how OCaml performs name mangling is important when debugging OCaml programs. OCaml 5.1.1 uses a name mangling scheme of \texttt{caml<MODULE_NAME>.<FUNCTION_NAME>_<NNN>} where \texttt{NNN} is a randomly generated number. Before 5.1.1 the scheme used two underscores as the separator, e.g., \texttt{caml<MODULE_NAME>__<FUNCTION_NAME>_<NNN>}.

There's the MSVC exception. I don't know if it is worth a footnote/mention?

According to ocaml#11430, packed modules use the <SUB_MODULE_NAME> convention, which may or may not be worth a footnote/mention?
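
To make the two separators concrete, here is a minimal OCaml sketch that builds a symbol name in each form quoted above. The `scheme` type, the `mangle` function, and the stamp `123` are hypothetical names chosen for illustration; this is not how the compiler itself is written, and the compiler picks the numeric suffix on its own.

```ocaml
(* Illustrative only: models the two symbol formats quoted in the manual text.
   [scheme], [mangle], and the example stamp are hypothetical. *)
type scheme =
  | Dot_separator      (* caml<MODULE_NAME>.<FUNCTION_NAME>_<NNN> *)
  | Double_underscore  (* caml<MODULE_NAME>__<FUNCTION_NAME>_<NNN> *)

let mangle scheme ~module_name ~function_name ~stamp =
  let sep =
    match scheme with
    | Dot_separator -> "."
    | Double_underscore -> "__"
  in
  Printf.sprintf "caml%s%s%s_%d" module_name sep function_name stamp

let () =
  print_endline
    (mangle Dot_separator ~module_name:"Fib" ~function_name:"fib" ~stamp:123);
  print_endline
    (mangle Double_underscore ~module_name:"Fib" ~function_name:"fib" ~stamp:123)
```

A name of this shape is what a reader would have to type when setting a breakpoint without source mappings, which is why the separator change matters.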

Owner Author

Worth including a mention of MSVC 👍🏻


\subsection{ss:native-debugger-frame-pointers}{Frame Pointers}

The OCaml native compiler also supports maintaining Frame Pointers, which can be used by a debugger to walk the stack of function calls in a program. The Frame Pointer (also known as the base pointer) is a register (e.g. %rbp on x86_64 or x29 on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, between stack frames the call stack for OCaml can be maintained. It should be possible to use just frame pointers to debug OCaml programs, similar to debugging plain assembly code.

Suggested change
The OCaml native compiler also supports maintaining Frame Pointers, which can be used by a debugger to walk the stack of function calls in a program. The Frame Pointer (also known as the base pointer) is a register (e.g. %rbp on x86_64 or x29 on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, between stack frames the call stack for OCaml can be maintained. It should be possible to use just frame pointers to debug OCaml programs, similar to debugging plain assembly code.
The OCaml native compiler also supports maintaining Frame Pointers, which can be used by a debugger to walk the stack of function calls in a program. The Frame Pointer (also known as the base pointer) is a register (e.g. %rbp on x86_64 or x29 on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address between stack frames, the call stack for OCaml can be maintained. It should be possible to use just frame pointers to debug OCaml programs, similar to debugging plain assembly code.

Also, I'm not following the last sentence. Can you elaborate?

Owner Author

The intent is you could strip an OCaml binary and still be able to debug it if you compile with frame pointers. You would only get stack frames with a call stack plus assembly code stepping and mangled names for setting breakpoints.

Having a stripped binary might be useful if you want the smallest executable possible for say MirageOS or performance. I haven't thoroughly tested this scenario but I wanted to include the option for completeness.

manual/README.md Outdated

Note that ocamlc,ocamlopt and the toplevel options overlap a lot.
Note that ocamlc, ocamlopt and the toplevel options overlap a lot.
Collaborator

Suggested change
Note that ocamlc, ocamlopt and the toplevel options overlap a lot.
Note that `ocamlc`, `ocamlopt`, and the toplevel options overlap a lot.

@@ -85,8 +85,10 @@ chapters (or sometimes sections) are mapped to a distinct `.etex` file:
- Optimisation with Flambda: `flambda.etex`
- Fuzzing with afl-fuzz: `afl-fuzz.etex`
Collaborator

Suggested change
- Fuzzing with afl-fuzz: `afl-fuzz.etex`
- Fuzzing with `afl-fuzz`: `afl-fuzz.etex`

Comment on lines 49 to +50
Each part of the manual corresponds to a specific directory, and each distinct
chapters (or sometimes sections) are mapped to a distinct `.etex` file:
chapter (or sometimes sections) are mapped to a distinct `.etex` file:
Collaborator

Do you like this better? It's a little more concise, but what you have here is totally fine, too

Each part of the manual is organised into specific directories, with chapters or sections mapped to separate .etex files.

christinerose (Collaborator) commented Nov 8, 2024

In the ReadMe file, you use both "Parts" and "Chapters" to describe different sections. Are these the same thing? If so, stick to either "Parts" or "Chapters" throughout. If not, perhaps we can clarify how they're different.

For BNF Grammar Notation: The description of @-quotes is clear, but the mention of the @ needing to be escaped could be clarified by specifying where this might cause issues.

Under "LaTeX Extensions," consider mentioning the purpose of sections and subsections early to highlight their specific role in maintaining links. This might improve readability, especially for new users.

In the "Building the Manual" section, the numbered list could use a more descriptive intro, helping readers quickly understand each task. Also, did you mean to start with 0. and then 1. ... instead of 1. and then 2.?

Just initial thoughts if you think they'll help!

Otherwise, the tone seems consistent. It's clear and precise, which is perfect for a manual! Well done!


Frame pointer support for OCaml is available on x86_64 architecture on Linux starting with OCaml 4.12 and on macOS from OCaml 5.3. ARM64 architecture is supported on Linux and macOS from OCaml 5.4, while other Tier-1 architectures (POWER, RISC-V, and s390x) are currently unsupported.

OCaml 5 requires frame pointers due to its use of non-contiguous stacks for effects, as mentioned in \href{https://dl.acm.org/doi/10.1145/3453483.3454039}{Retrofitting Effect Handlers Onto OCaml} (Section 5.5). Non-contiguous stacks are incompatible with \texttt{perf}’s stack-copying approach, so traces produced for OCaml 5 without frame pointers will be truncated and inaccurate. Profiling will also be truncated if the Linux distribution doesn't enable frame pointers for libraries that OCaml links against.
Collaborator

"Profiling will also be truncated" or "also be inaccurate" or both?

Owner Author

"Profiling will be inaccurate due to stack traces being truncated by perf and that loss of information causes the reporting tools to produce visualisations that attribute stack traces to the wrong places or group unrelated stack traces together. "

This deserves a little more context. Is that version clearer?

Collaborator

Thanks! Would this be added to the paragraph in li. 25/27?


\section{s:native-debugger-preliminaries}{Preliminaries}

This chapter describes the support for debugging executables built with \texttt{ocamlopt}, the native-code compiler, using GDB or LLDB on Linux, macOS, and FreeBSD. We will call this \texttt{native debugging}, in contrastto bytecode debugging supported via \texttt{ocamldebug} (see chapter~\ref{c:debugger}).
Collaborator
@christinerose christinerose Nov 12, 2024

Suggested change
This chapter describes the support for debugging executables built with \texttt{ocamlopt}, the native-code compiler, using GDB or LLDB on Linux, macOS, and FreeBSD. We will call this \texttt{native debugging}, in contrastto bytecode debugging supported via \texttt{ocamldebug} (see chapter~\ref{c:debugger}).
This chapter describes the support for debugging executables built with \texttt{ocamlopt}, OCaml's native compiler, using GDB or LLDB on Linux, macOS, or FreeBSD. We will call this \texttt{native debugging}, in contrast to bytecode debugging supported via \texttt{ocamldebug} (see chapter~\ref{c:debugger}).

Let me know if these suggestions change the meaning.

Drive by comment: contrastto -> contrast to

Collaborator

Nice catch!


\subsection{ss:native-debugger-dwarf}{DWARF}

The OCaml compiler uses the \href{http://dwarfstd.org/}{DWARF} debugging information file format to describe the debug information it generates. DWARF is a debugging information file format used by many compilers and debuggers to support source level debugging, and it is used by Linux ELF, macOS Mach-O and FreeBSD ELF.
Collaborator

Suggested change
The OCaml compiler uses the \href{http://dwarfstd.org/}{DWARF} debugging information file format to describe the debug information it generates. DWARF is a debugging information file format used by many compilers and debuggers to support source level debugging, and it is used by Linux ELF, macOS Mach-O and FreeBSD ELF.
The OCaml compiler uses the \href{http://dwarfstd.org/}{DWARF} to describe the debug information it generates. DWARF is a debugging information file format widely used by compilers and debuggers to support source-level debugging across Linux ELF, macOS Mach-O, and FreeBSD ELF systems.

Since you define DWARF in the second sentence, it probably doesn't need to be in the first where it complicates the structure. Also I tightened up the second sentence somewhat.

Alternatively:

The OCaml compiler uses \href{http://dwarfstd.org/}{DWARF} debugging information file format to describe the debug information it generates. DWARF is widely used by compilers and debuggers to support source-level debugging across Linux ELF, macOS Mach-O, and FreeBSD ELF systems.


The OCaml compiler uses the \href{http://dwarfstd.org/}{DWARF} debugging information file format to describe the debug information it generates. DWARF is a debugging information file format used by many compilers and debuggers to support source level debugging, and it is used by Linux ELF, macOS Mach-O and FreeBSD ELF.

Within the DWARF standard, the compiler specifically uses Call Frame Information (CFI) to describe a call stack for OCaml code, sections of the runtime written in C, and across the Foreign Function Interface (FFI) if the language provides CFI information. (If the language has been compiled to include CFI information).
Collaborator

Suggested change
Within the DWARF standard, the compiler specifically uses Call Frame Information (CFI) to describe a call stack for OCaml code, sections of the runtime written in C, and across the Foreign Function Interface (FFI) if the language provides CFI information. (If the language has been compiled to include CFI information).
Within the DWARF standard, the compiler specifically uses Call Frame Information (CFI) to describe a call stack for OCaml code, sections of the runtime written in C, and across the Foreign Function Interface (FFI) if the language provides CFI information.

How do you feel about deleting that last parenthetical sentence? Is it understood that the language was compiled to include CFI information if "the language provides CFI information"?


\subsection{ss:native-debugger-frame-pointers}{Frame Pointers}

The OCaml native compiler supports generating frame pointers, which native debugger can use to walk the stack of function calls in a program. The frame pointer (also known as the base pointer) is a register (e.g., \texttt{\%rbp} on x86_64 or \texttt{x29} on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, the call stack for OCaml can be maintained. Using frame pointers only, without CFI enabled, it is possible to debug OCaml programs, however the experience is closer to debugging assembly and using DWARF with CFI is recommended.
Collaborator
@christinerose christinerose Nov 12, 2024

Suggested change
The OCaml native compiler supports generating frame pointers, which native debugger can use to walk the stack of function calls in a program. The frame pointer (also known as the base pointer) is a register (e.g., \texttt{\%rbp} on x86_64 or \texttt{x29} on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, the call stack for OCaml can be maintained. Using frame pointers only, without CFI enabled, it is possible to debug OCaml programs, however the experience is closer to debugging assembly and using DWARF with CFI is recommended.
The OCaml native compiler supports generating frame pointers, which native debugger can use to walk the stack of a program's function calls. The frame pointer (also known as the base pointer) is a register (e.g., \texttt{\%rbp} on x86_64 or \texttt{x29} on ARM64) that points to the base of the current stack frame. The stack frame (also known as the activation frame or the activation record) refers to the portion of the stack allocated to a single function call. By saving the frame pointer along with the return address, the call stack for OCaml can be maintained. Although it is possible to debug OCaml programs using frame pointers alone without CFI enabled, this experience is closer to assembly-level debugging, so using DWARF with CFI is recommended.

Does this still make sense / is accurate if we remove that little prepositional phrase?
How about the last sentence suggestion?

fib(20) = 6765
\end{verbatim}

When run this program prints the 20th Fibonacci number. The use of recursion is an excuse to inspect the call stack. To do so, startup a GDB session for this program:
Collaborator

Suggested change
When run this program prints the 20th Fibonacci number. The use of recursion is an excuse to inspect the call stack. To do so, startup a GDB session for this program:
When run this program prints the 20th Fibonacci number. The use of recursion allows us to inspect the call stack. To do so, startup a GDB session for this program:

$3 = {caml(-):'bar'<3>, caml:42}
\end{verbatim}

Note the use of x86_64 register names: : \texttt{\$rax} and \texttt{\$rbx}. We can print values as their OCaml representations (note The (m) or (u) (or (g) or (-)) is the GC color).
Collaborator

Suggested change
Note the use of x86_64 register names: : \texttt{\$rax} and \texttt{\$rbx}. We can print values as their OCaml representations (note The (m) or (u) (or (g) or (-)) is the GC color).
Note the use of x86_64 register names: \texttt{\$rax} and \texttt{\$rbx}. We can print values as their OCaml representations (note The (m) or (u) (or (g) or (-)) is the GC color).

Should this have the double : ?


\section{s:native-debugger-lldb}{Using LLDB}

Here we will walk through debugging the earlier fib example using LLDB on Linux. Startup an LLDB session using the `fib.exe` from earlier:
Collaborator

Suggested change
Here we will walk through debugging the earlier fib example using LLDB on Linux. Startup an LLDB session using the `fib.exe` from earlier:
Here we will walk through debugging the earlier \texttt{fib} example using LLDB on Linux. Startup an LLDB session using the \texttt{fib.exe} from earlier:

Should both these be in monospace, or are you referring to Fibonacci rather than the fib command?

Collaborator
@christinerose christinerose left a comment

A few questions / suggestions on native-debugger.etex

@tmcgilchrist
Owner Author

For profiling

  • would it make sense to include an example output of perf report with a short explanation of how to read/understand it?
  • similarly, an example flamegraph and how to read/understand it could also be a nice addition
  • perf is by now standard, but I could imagine push-back on including material that depends on (unreleased?) (perl?) scripts from a Brendan Gregg repo.

There are Rust ports of these tools https://github.com/jonhoo/inferno if Perl is not appealing. We link off to hyperfine in the parallel programming section of the manual. A stand alone rust program has a certain appeal (better if it was OCaml).

For understanding perf report and flamegraphs I was explicitly not going to go into those, I think there is plenty of documentation elsewhere and we can link off to that. Thoughts?

@jmid jmid left a comment

I spotted a duplicate paragraph in the latest version.

Also, I can't help wondering whether tak is a good example for perf, as stack traces will all be the same (a bunch of recursive calls), leading to little insight 🤔

There are Rust ports of these tools https://github.com/jonhoo/inferno if Perl is not appealing. We link off to hyperfine in the parallel programming section of the manual. A stand alone rust program has a certain appeal (better if it was OCaml).

True, although in fairness hyperfine is included in many distributions:
https://pkgs.org/download/hyperfine

For understanding perf report and flamegraphs I was explicitly not going to go into those, I think there is plenty of documentation elsewhere and we can link off to that. Thoughts?

That may work too. On the flip-side there's something nice about letting people get up-and-running by having read a manual chapter without having to then read others.

For my own part, I investigated flamegraphs, e.g., 1-2 years back and haven't had my fingers in them since. For someone in a similar situation, it would be nice with a brief reminder. For a newcomer, I could similarly see an advantage in briefly introducing them, and then point the interested reader to Gregg's pages and book to learn more.


Frame Pointer based call graphs use a convention where the head of the list of stack frames can be found in a register called the frame pointer (e.g. \%rbp on x86_64), a pointer to the previous stack frame is saved at a known offset from the frame pointer, and the return address is also saved at a known offset. This linked list of stack frames is then used to walk the call stack. OCaml 5 features non-contiguous stacks as part of the implementation of effects, see \href{https://dl.acm.org/doi/10.1145/3453483.3454039}{Retrofitting effect handlers onto OCaml} (Section 5.5).

Frame pointer based call graphs use a convention where the head of the linked list of stack frames can be found in a register called the frame pointer (e.g. \$rbp on x86_64), and two pointers to the previous stack frame and the return address are saved at a know offset from the frame pointer. This linked list of stack frames is then used to walk the stack of called functions. OCaml 5 features non-contiguous stacks as part of the implementation of effects, see \href{https://dl.acm.org/doi/10.1145/3453483.3454039}{Retrofitting effect handlers onto OCaml} (Section 5.5).

Duplicate paragraph

Collaborator

Which of these two do you prefer, @tmcgilchrist? They are slightly different.

Owner Author

The duplicate paragraph is two rewritings of the same content, I think I prefer the second one. A diagram would be better but I can't see where we use diagrams in the manual and I'm not clever enough to make this diagram in etex.
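
In lieu of a diagram, the linked-list walk described in the paragraph can be sketched in OCaml over a simulated memory. Everything here is an assumption made for illustration: memory is a word-indexed array rather than byte-addressed, address 0 marks the end of the chain, and each frame keeps the saved frame pointer at offset 0 and the return address at offset 1. Real offsets are defined by the platform ABI.

```ocaml
(* A simplified model, not the real ABI: memory is a word-indexed array,
   address 0 marks the end of the chain, and each frame stores the saved
   frame pointer at [fp] and the return address at [fp + 1]. *)
let walk_stack (memory : int array) (fp : int) : int list =
  let rec go fp acc =
    if fp = 0 then List.rev acc
    else
      let saved_fp = memory.(fp) in
      let return_addr = memory.(fp + 1) in
      go saved_fp (return_addr :: acc)
  in
  go fp []

(* Three frames at addresses 2, 4, and 6; the frame at 2 is the outermost. *)
let example_memory = [| 0; 0; 0; 0x100; 2; 0x200; 4; 0x300 |]

let () =
  walk_stack example_memory 6
  |> List.iter (fun addr -> Printf.printf "return address: 0x%x\n" addr)
```

Starting from the innermost frame, the walk follows each saved frame pointer to the caller's frame and collects return addresses along the way, which is the information a debugger or profiler turns into a call stack.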

perf record -F 99 --call-graph fp <YOUR_EXECUTABLE>
\end{verbatim}

The \texttt{-F 99} option tells \texttt{perf} to sample at 99Hz, which avoids generating excessive data for longer runs and minimising overlap other periodic activities. The \texttt{--call-graph fp} instructs \texttt{perf} to use frame pointers to get the call graph, followed by the OCaml executable you want to profile. This command creates a \texttt{perf.data} file in the current directory. Alternatively use \texttt{--output} to choose a more descriptive filename.
Collaborator

Suggested change
The \texttt{-F 99} option tells \texttt{perf} to sample at 99Hz, which avoids generating excessive data for longer runs and minimising overlap other periodic activities. The \texttt{--call-graph fp} instructs \texttt{perf} to use frame pointers to get the call graph, followed by the OCaml executable you want to profile. This command creates a \texttt{perf.data} file in the current directory. Alternatively use \texttt{--output} to choose a more descriptive filename.
The \texttt{-F 99} option sets \texttt{perf} to sample at 99Hz, reducing excessive data generation during longer runs and minimising interference with other periodic activities. The \texttt{--call-graph fp} instructs \texttt{perf} to use frame pointers to get the call graph, followed by the OCaml executable you want to profile. This command creates a \texttt{perf.data} file in the current directory. Alternatively use \texttt{--output} to choose a more descriptive filename.

Would "interference" improve clarity here?

Owner Author

Interference isn't the right word here. Rather, we want a good statistical distribution of when perf record samples. We want to avoid two situations: (1) always sampling while some periodic activity is running, and so always including it in our results; and (2) always sampling while some other activity isn't running.
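
The lockstep effect can be illustrated with a small OCaml simulation. The workload is hypothetical: a periodic activity that is busy for the first 1 ms of every 10 ms (so a 100 Hz activity with a 10% duty cycle), sampled for one second at evenly spaced instants. The function names are invented for this sketch, and real `perf` sampling is not this regular.

```ocaml
(* Hypothetical workload: a periodic activity that is busy for the first
   1 ms of every 10 ms, i.e. a 100 Hz activity with a 10% duty cycle. *)
let activity_running t_us = t_us mod 10_000 < 1_000

(* Count how many of the [freq_hz] evenly spaced samples taken in one
   second land while the activity is running. *)
let hits ~freq_hz =
  let count = ref 0 in
  for k = 0 to freq_hz - 1 do
    let t_us = k * 1_000_000 / freq_hz in
    if activity_running t_us then incr count
  done;
  !count

let () =
  List.iter
    (fun freq_hz ->
      Printf.printf "%d Hz: %d of %d samples hit the activity\n"
        freq_hz (hits ~freq_hz) freq_hz)
    [ 100; 99 ]
```

In this model the 100 Hz sampler sees the activity in every sample, while the 99 Hz sampler sees it in 10 of 99 samples, close to the true 10% share.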

Ensured all instances of `perf` were in monospace.
@christinerose
Collaborator

Hi @tmcgilchrist! Overall I think it flows really well! The tone is consistent, and I think it definitely helps using more consistent terminology. The glossary was a great idea, too. Well done!

I fixed a few grammar/formatting/syntax things, and I left a few suggestions in cases that it might change the meaning.


4 participants