Built site for gh-pages

UKGovernmentBEIS · Dec 13, 2024
1 parent 50c9ef5
commit 487b728

Showing 43 changed files with 1,706 additions and 1,677 deletions.
diff --git a/.nojekyll b/.nojekyll
@@ -1 +1 @@
-6c5c2b45
+b9b769e5
diff --git a/agents-api.html b/agents-api.html
@@ -2,12 +2,12 @@
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
 
 <meta charset="utf-8">
-<meta name="generator" content="quarto-1.5.32">
+<meta name="generator" content="quarto-1.5.57">
 
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Inspect – Agents API</title>
+<title>Agents API – Inspect</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -474,7 +474,7 @@ <h3 class="anchored" data-anchor-id="sec-stop-reasons">Stop Reasons</h3>
 <span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> output.stop_reason <span class="op">==</span> <span class="st">"model_length"</span>:</span>
 <span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>    <span class="co"># do something to recover from context window overflow</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <p>Here are the possible values for <code>StopReason</code> :</p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 35%">
 <col style="width: 65%">

diff --git a/agents-api.html.md b/agents-api.html.md
@@ -175,14 +175,14 @@ if output.stop_reason == "model_length":
 
 Here are the possible values for `StopReason` :
 
-| Stop Reason      | Description                                                        |
-|------------------|--------------------------------------------------------------------|
-| `stop`           | The model hit a natural stop point or a provided stop sequence     |
-| `max_tokens`     | The maximum number of tokens specified in the request was reached. |
-| `model_length`   | The model’s context length was exceeded.                           |
-| `tool_calls`     | The model called a tool                                            |
-| `content_filter` | Content was omitted due to a content filter.                       |
-| `unknown`        | Unknown (e.g. unexpected runtime error)                            |
+| Stop Reason | Description |
+|----|----|
+| `stop` | The model hit a natural stop point or a provided stop sequence |
+| `max_tokens` | The maximum number of tokens specified in the request was reached. |
+| `model_length` | The model’s context length was exceeded. |
+| `tool_calls` | The model called a tool |
+| `content_filter` | Content was omitted due to a content filter. |
+| `unknown` | Unknown (e.g. unexpected runtime error) |
 
 ### Error Handling
 

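Note on the `StopReason` table in the hunk above: the six values lend themselves to a small dispatch in an agent loop. The following is a minimal sketch only. The `StopReason` alias here is a local stand-in built from the table (it is not imported from Inspect), and `next_action` and the action names it returns are hypothetical.

```python
from typing import Literal

# Local stand-in for the six values listed in the StopReason table.
StopReason = Literal[
    "stop", "max_tokens", "model_length", "tool_calls", "content_filter", "unknown"
]


def next_action(stop_reason: StopReason) -> str:
    """Map a stop reason to a hypothetical agent-loop action name."""
    if stop_reason == "tool_calls":
        return "run_tools"            # the model called a tool
    if stop_reason == "model_length":
        return "truncate_history"     # recover from context window overflow
    if stop_reason == "max_tokens":
        return "continue_generation"  # request token limit was reached
    if stop_reason == "stop":
        return "finish"               # natural stop point or stop sequence
    return "abort"                    # content_filter or unknown
```

Checking `model_length` explicitly mirrors the `if output.stop_reason == "model_length":` snippet shown in the hunk context; how to actually recover is left to the caller.
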
diff --git a/agents.html b/agents.html
@@ -2,12 +2,12 @@
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
 
 <meta charset="utf-8">
-<meta name="generator" content="quarto-1.5.32">
+<meta name="generator" content="quarto-1.5.57">
 
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Inspect – Agents</title>
+<title>Agents – Inspect</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -459,7 +459,7 @@ <h3 class="anchored" data-anchor-id="example">Example</h3>
 <section id="options" class="level3">
 <h3 class="anchored" data-anchor-id="options">Options</h3>
 <p>There are several options available for customising the behaviour of the basic agent:</p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 23%">
 <col style="width: 20%">
@@ -578,7 +578,7 @@ <h3 class="anchored" data-anchor-id="sec-stop-reasons">Stop Reasons</h3>
 <span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> output.stop_reason <span class="op">==</span> <span class="st">"model_length"</span>:</span>
 <span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>    <span class="co"># do something to recover from context window overflow</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <p>Here are the possible values for <code>StopReason</code> :</p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 35%">
 <col style="width: 65%">
@@ -910,7 +910,7 @@ <h3 class="anchored" data-anchor-id="environment-interface">Environment Interfac
 <section id="environment-binding" class="level3">
 <h3 class="anchored" data-anchor-id="environment-binding">Environment Binding</h3>
 <p>There are two sandbox environments built in to Inspect:</p>
-<table class="table">
+<table class="caption-top table">
 <thead>
 <tr class="header">
 <th>Environment Type</th>
@@ -965,7 +965,7 @@ <h3 class="anchored" data-anchor-id="sec-docker-configuration">Docker Configurat
 <p>Before using Docker sandbox environments, please be sure to install <a href="https://docs.docker.com/engine/install/">Docker Engine</a> (version 24.0.7 or greater).</p>
 <p>You can use the Docker sandbox environment without any special configuration, however most commonly you’ll provide explicit configuration via either a <code>Dockerfile</code> or a <a href="https://docs.docker.com/compose/compose-file/">Docker Compose</a> configuration file (<code>compose.yaml</code>).</p>
 <p>Here is how Docker sandbox environments are created based on the presence of <code>Dockerfile</code> and/or <code>compose.yml</code> in the task directory:</p>
-<table class="table">
+<table class="caption-top table">
 <thead>
 <tr class="header">
 <th>Config Files</th>

diff --git a/agents.html.md b/agents.html.md
@@ -129,18 +129,18 @@ repository at
 There are several options available for customising the behaviour of the
 basic agent:
 
-| Option               | Type                    | Description                                                                                                       |
-|----------------------|-------------------------|-------------------------------------------------------------------------------------------------------------------|
-| `init`               | `Solver | list[Solver]` | Agent initialisation (e.g. `system_message()`).                                                                   |
-| `tools`              | `list[Tool]`            | List of tools available to the agent.                                                                             |
-| `max_attempts`       | `int`                   | Maximum number of submission attempts to accept.                                                                  |
-| `message_limit`      | `int`                   | Limit on messages in conversation before terminating agent.                                                       |
-| `token_limit`        | `int`                   | Limit on in conversation before terminating agent.                                                                |
-| `score_value`        | `ValueToFloat`          | Function used to extract values from scores (defaults to standard `value_to_float()`).                            |
-| `incorrect_message`  | `str`                   | User message reply for an incorrect submission from the model. Alternatively, a function which returns a message. |
-| `continue_message`   | `str`                   | User message to urge the model to continue when it doesn’t make a tool call.                                      |
-| `submit_name`        | `str`                   | Name for tool used to make submissions (defaults to ‘submit’).                                                    |
-| `submit_description` | `str`                   | Description of submit tool (defaults to ‘Submit an answer for evaluation’)                                        |
+| Option | Type | Description |
+|----|----|----|
+| `init` | `Solver | list[Solver]` | Agent initialisation (e.g. `system_message()`). |
+| `tools` | `list[Tool]` | List of tools available to the agent. |
+| `max_attempts` | `int` | Maximum number of submission attempts to accept. |
+| `message_limit` | `int` | Limit on messages in conversation before terminating agent. |
+| `token_limit` | `int` | Limit on in conversation before terminating agent. |
+| `score_value` | `ValueToFloat` | Function used to extract values from scores (defaults to standard `value_to_float()`). |
+| `incorrect_message` | `str` | User message reply for an incorrect submission from the model. Alternatively, a function which returns a message. |
+| `continue_message` | `str` | User message to urge the model to continue when it doesn’t make a tool call. |
+| `submit_name` | `str` | Name for tool used to make submissions (defaults to ‘submit’). |
+| `submit_description` | `str` | Description of submit tool (defaults to ‘Submit an answer for evaluation’) |
 
 For multiple attempts, submissions are evaluated using the task’s main
 scorer, with value of 1.0 indicating a correct answer. Scorer values are
@@ -225,14 +225,14 @@ if output.stop_reason == "model_length":
 
 Here are the possible values for `StopReason` :
 
-| Stop Reason      | Description                                                        |
-|------------------|--------------------------------------------------------------------|
-| `stop`           | The model hit a natural stop point or a provided stop sequence     |
-| `max_tokens`     | The maximum number of tokens specified in the request was reached. |
-| `model_length`   | The model’s context length was exceeded.                           |
-| `tool_calls`     | The model called a tool                                            |
-| `content_filter` | Content was omitted due to a content filter.                       |
-| `unknown`        | Unknown (e.g. unexpected runtime error)                            |
+| Stop Reason | Description |
+|----|----|
+| `stop` | The model hit a natural stop point or a provided stop sequence |
+| `max_tokens` | The maximum number of tokens specified in the request was reached. |
+| `model_length` | The model’s context length was exceeded. |
+| `tool_calls` | The model called a tool |
+| `content_filter` | Content was omitted due to a content filter. |
+| `unknown` | Unknown (e.g. unexpected runtime error) |
 
 ### Error Handling
 
@@ -629,10 +629,10 @@ The sandbox is also available to custom scorers.
 
 There are two sandbox environments built in to Inspect:
 
-| Environment Type | Description                                                                                                                                                      |
-|------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `local`          | Run `sandbox()` methods in the same file system as the running evaluation (should *only be used* if you are already running your evaluation in another sandbox). |
-| `docker`         | Run `sandbox()` methods within a Docker container (see the [Docker Configuration](#sec-docker-configuration) section below for additional details).              |
+| Environment Type | Description |
+|----|----|
+| `local` | Run `sandbox()` methods in the same file system as the running evaluation (should *only be used* if you are already running your evaluation in another sandbox). |
+| `docker` | Run `sandbox()` methods within a Docker container (see the [Docker Configuration](#sec-docker-configuration) section below for additional details). |
 
 Sandbox environment definitions can be bound at the `Sample`, `Task`, or
 `eval()` level. Binding precedence goes from `eval()`, to `Task` to
@@ -717,11 +717,11 @@ file (`compose.yaml`).
 Here is how Docker sandbox environments are created based on the
 presence of `Dockerfile` and/or `compose.yml` in the task directory:
 
-| Config Files   | Behavior                                                                                                           |
-|----------------|--------------------------------------------------------------------------------------------------------------------|
-| None           | Creates a sandbox environment based on the official [python:3.12-bookworm](https://hub.docker.com/_/python) image. |
-| `Dockerfile`   | Creates a sandbox environment by building the image.                                                               |
-| `compose.yaml` | Creates sandbox environment(s) based on `compose.yaml`.                                                            |
+| Config Files | Behavior |
+|----|----|
+| None | Creates a sandbox environment based on the official [python:3.12-bookworm](https://hub.docker.com/_/python) image. |
+| `Dockerfile` | Creates a sandbox environment by building the image. |
+| `compose.yaml` | Creates sandbox environment(s) based on `compose.yaml`. |
 
 Providing a `compose.yaml` is not strictly required, as Inspect will
 automatically generate one as needed. Note that the automatically

diff --git a/approval.html b/approval.html
@@ -2,12 +2,12 @@
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
 
 <meta charset="utf-8">
-<meta name="generator" content="quarto-1.5.32">
+<meta name="generator" content="quarto-1.5.57">
 
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Inspect – Approval</title>
+<title>Approval – Inspect</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -420,7 +420,7 @@ <h2 class="anchored" data-anchor-id="custom-approvers">Custom Approvers</h2>
 <span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> approve</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <p>There are five possible approval decisions:</p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 50%">
 <col style="width: 50%">

diff --git a/approval.html.md b/approval.html.md
@@ -116,13 +116,13 @@ def auto_approver(decision: ApprovalDecision = "approve") -> Approver:
 
 There are five possible approval decisions:
 
-| Decision  | Description                                                                                          |
-|-----------|------------------------------------------------------------------------------------------------------|
-| approve   | The tool call is approved                                                                            |
-| modify    | The tool call is approved with modification (included in `modified` field of `Approver`)             |
-| reject    | The tool call is rejected (report to the model that the call was rejected along with an explanation) |
-| escalate  | The tool call should be escalated to the next approver in the chain.                                 |
-| terminate | The current sample should be terminated as a result of the tool call.                                |
+| Decision | Description |
+|----|----|
+| approve | The tool call is approved |
+| modify | The tool call is approved with modification (included in `modified` field of `Approver`) |
+| reject | The tool call is rejected (report to the model that the call was rejected along with an explanation) |
+| escalate | The tool call should be escalated to the next approver in the chain. |
+| terminate | The current sample should be terminated as a result of the tool call. |
 
 Here’s a more complicated custom approver that implements an allow list
 for bash commands. Imagine that we’ve implemented this approver within a

diff --git a/caching.html b/caching.html
@@ -2,12 +2,12 @@
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
 
 <meta charset="utf-8">
-<meta name="generator" content="quarto-1.5.32">
+<meta name="generator" content="quarto-1.5.57">
 
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Inspect – Caching</title>
+<title>Caching – Inspect</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -475,7 +475,7 @@ <h3 class="anchored" data-anchor-id="usage-reporting">Usage Reporting</h3>
 <p>When using provider caching, model token usage will be reported with 4 distinct values rather than the normal input and output. For example:</p>
 <div class="sourceCode" id="cb13"><pre class="sourceCode default code-with-copy"><code class="sourceCode default"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>13,684 tokens [I: 22, CW: 1,711, CR: 11,442, O: 509]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <p>Where the prefixes on reported token counts stand for:</p>
-<table class="table">
+<table class="caption-top table">
 <tbody>
 <tr class="odd">
 <td><strong>I</strong></td>

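Note on the usage line in the `caching.html` hunk above: in the sample `13,684 tokens [I: 22, CW: 1,711, CR: 11,442, O: 509]`, the four prefixed counts add up to the reported total. A minimal sketch of parsing such a line, assuming the format is exactly as shown in that sample; `parse_usage` is a hypothetical helper, not part of Inspect.

```python
import re

# A number with optional thousands separators, e.g. "22" or "11,442".
_NUMBER = r"\d+(?:,\d{3})*"


def parse_usage(line: str) -> dict[str, int]:
    """Parse '<total> tokens [I: n, CW: n, CR: n, O: n]' into integer counts."""
    match = re.match(rf"\s*({_NUMBER}) tokens \[(.*)\]\s*$", line)
    if match is None:
        raise ValueError(f"unrecognised usage line: {line!r}")
    counts = {"total": int(match.group(1).replace(",", ""))}
    # Each entry inside the brackets looks like 'CW: 1,711'.
    for prefix, value in re.findall(rf"([A-Z]+): ({_NUMBER})", match.group(2)):
        counts[prefix] = int(value.replace(",", ""))
    return counts


usage = parse_usage("13,684 tokens [I: 22, CW: 1,711, CR: 11,442, O: 509]")
parts = usage["I"] + usage["CW"] + usage["CR"] + usage["O"]
```

For the sample line, `parts` equals `usage["total"]` (22 + 1,711 + 11,442 + 509 = 13,684).
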
diff --git a/datasets.html b/datasets.html
@@ -2,12 +2,12 @@
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
 
 <meta charset="utf-8">
-<meta name="generator" content="quarto-1.5.32">
+<meta name="generator" content="quarto-1.5.57">
 
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Inspect – Datasets</title>
+<title>Datasets – Inspect</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -376,7 +376,7 @@ <h2 class="anchored" data-anchor-id="overview">Overview</h2>
 <h2 class="anchored" data-anchor-id="dataset-samples">Dataset Samples</h2>
 <p>The core data type underlying the use of datasets with Inspect is the <code>Sample</code>, which consists of a required <code>input</code> field and several other optional fields:</p>
 <p><strong>Class</strong> <code>inspect_ai.dataset.Sample</code></p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 20%">
 <col style="width: 40%">
@@ -433,7 +433,7 @@ <h2 class="anchored" data-anchor-id="dataset-samples">Dataset Samples</h2>
 </tbody>
 </table>
 <p>So a CSV dataset with the following structure:</p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 56%">
 <col style="width: 43%">
@@ -547,7 +547,7 @@ <h2 class="anchored" data-anchor-id="amazon-s3">Amazon S3</h2>
 <h2 class="anchored" data-anchor-id="chat-messages">Chat Messages</h2>
 <p>The most important data structure within <code>Sample</code> is the <code>ChatMessage</code>. Note that often datasets will contain a simple string as their input (which is then internally converted to a <code>ChatMessageUser</code>). However, it is possible to include a full message history as the input via <code>ChatMessage</code>. Another useful application of <code>ChatMessage</code> is providing multi-modal input (e.g.&nbsp;images).</p>
 <p><strong>Class</strong> <code>inspect_ai.model.ChatMessage</code></p>
-<table class="table">
+<table class="caption-top table">
 <colgroup>
 <col style="width: 10%">
 <col style="width: 35%">