From c9b764419da4ecb9738e60168a0e6960d4dbfb01 Mon Sep 17 00:00:00 2001 From: killian <63927363+KillianLucas@users.noreply.github.com> Date: Tue, 23 Jan 2024 12:16:40 -0800 Subject: [PATCH 1/4] Documentation Update --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9a85fad5..923b1661 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@

● Open Interpreter

- + Discord JA doc ZH doc From d39268b4bc0df4bd2eb414efcf64ad544bc15529 Mon Sep 17 00:00:00 2001 From: killian <63927363+KillianLucas@users.noreply.github.com> Date: Tue, 23 Jan 2024 12:47:02 -0800 Subject: [PATCH 2/4] Documentation Update --- docs/code-execution/computer-api.mdx | 128 ++++++++++++++++++++- docs/code-execution/custom-languages.mdx | 73 +++++++++++- docs/code-execution/usage.mdx | 33 +++++- docs/language-models/introduction.mdx | 8 +- docs/settings/all-settings.mdx | 137 ++++++----------------- 5 files changed, 272 insertions(+), 107 deletions(-) diff --git a/docs/code-execution/computer-api.mdx b/docs/code-execution/computer-api.mdx index 902b164f..ef379e1e 100644 --- a/docs/code-execution/computer-api.mdx +++ b/docs/code-execution/computer-api.mdx @@ -2,4 +2,130 @@ title: Computer API --- -Coming soon... +The following settings and functions are primarily for the language model to use, not for users to use. + +### Display - View + +Takes a screenshot of the primary display. + + + +```python Python +interpreter.computer.display.view() +``` + + + +### Display - Center + +Gets the x, y value of the center of the screen. + + + +```python Python +x, y = interpreter.computer.display.center() +``` + + + +### Keyboard - Hotkey + +Performs a hotkey on the computer. + + + +```python Python +interpreter.computer.keyboard.hotkey(" ", "command") +``` + + + +### Keyboard - Write + +Writes the text into the currently focused window. + + + +```python Python +interpreter.computer.keyboard.write("hello") +``` + + + +### Mouse - Click + +Clicks on the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and click on it. 
+ + + ```python Python +# Click on coordinates +interpreter.computer.mouse.click(x=100, y=100) + +# Click on text on the screen +interpreter.computer.mouse.click("Onscreen Text") + +# Click on a gear icon +interpreter.computer.mouse.click(icon="gear icon") +``` + + + +### Mouse - Move + +Moves to the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and move to it. + + + +```python Python +# Move to coordinates +interpreter.computer.mouse.move(x=100, y=100) + +# Move to text on the screen +interpreter.computer.mouse.move("Onscreen Text") + +# Move to a gear icon +interpreter.computer.mouse.move(icon="gear icon") +``` + + + +### Mouse - Scroll + +Scrolls the mouse a specified number of pixels. + + + +```python Python +# Scroll Down +interpreter.computer.mouse.scroll(-10) + +# Scroll Up +interpreter.computer.mouse.scroll(10) +``` + + + +### Clipboard - View + +Returns the contents of the clipboard. + + + +```python Python +interpreter.computer.clipboard.view() +``` + + + +### OS - Get Selected Text + +Gets the selected text on the screen. + + + +```python Python +interpreter.computer.os.get_selected_text() +``` + + diff --git a/docs/code-execution/custom-languages.mdx b/docs/code-execution/custom-languages.mdx index 035a2876..9f342e50 100644 --- a/docs/code-execution/custom-languages.mdx +++ b/docs/code-execution/custom-languages.mdx @@ -2,4 +2,75 @@ title: Custom Languages --- -Coming soon... +You can add or edit the programming languages that Open Interpreter's computer runs. + +In this example, we'll swap out the `python` language for a version of `python` that runs in the cloud. We'll use `E2B` to do this. + +([`E2B`](https://e2b.dev/) is a secure, sandboxed environment where you can run arbitrary code.) + +First, [get an API key here](https://e2b.dev/), and set it: + +```python +import os +os.environ["E2B_API_KEY"] = "" +``` + +Then, define a custom language for Open Interpreter. 
The class name doesn't matter, but we'll call it `PythonE2B`: + +```python +import e2b + +class PythonE2B: + """ + This class contains all requirements for being a custom language in Open Interpreter: + + - name (an attribute) + - run (a method) + - stop (a method) + - terminate (a method) + + You can use this class to run any language you know how to run, or edit any of the official languages (which also conform to this class). + + Here, we'll use E2B to power the `run` method. + """ + + # This is the name that will appear to the LLM. + name = "python" + + # Optionally, you can append some information about this language to the system message: + system_message = "# Follow this rule: Every Python code block MUST contain at least one print statement." + + # (E2B isn't a Jupyter Notebook, so we added ^ this so it would print things, + # instead of putting variables at the end of code blocks, which is a Jupyter thing.) + + def run(self, code): + """Generator that yields a dictionary in LMC Format.""" + + # Run the code on E2B + stdout, stderr = e2b.run_code('Python3', code) + + # Yield the output + yield { + "type": "console", "format": "output", + "content": stdout + stderr # We combined these arbitrarily. Yield anything you'd like! + } + + def stop(self): + """Stops the code.""" + # Not needed here, because e2b.run_code isn't stateful. + pass + + def terminate(self): + """Terminates the entire process.""" + # Not needed here, because e2b.run_code isn't stateful. + pass + +# (Tip: Do this before adding/removing languages, otherwise OI might retain the state of previous languages:) +interpreter.computer.terminate() + +# Give Open Interpreter its languages. This will only let it run PythonE2B: +interpreter.computer.languages = [PythonE2B] + +# Try it out! 
+interpreter.chat("What's 349808*38490739?") +``` \ No newline at end of file diff --git a/docs/code-execution/usage.mdx b/docs/code-execution/usage.mdx index 62948f16..9fe2b754 100644 --- a/docs/code-execution/usage.mdx +++ b/docs/code-execution/usage.mdx @@ -2,4 +2,35 @@ title: Usage --- -Coming soon... +# Running Code + +The `computer` itself is separate from Open Interpreter's core, so you can run it independently: + +```python +from interpreter import interpreter + +interpreter.computer.run("python", "print('Hello World!')") +``` + +This runs in the same Python instance that interpreter uses, so you can define functions, variables, or log in to services before the AI starts running code: + +```python +interpreter.computer.run("python", "import replicate\nreplicate.api_key='...'") + +interpreter.custom_instructions = "Replicate has already been imported." + +interpreter.chat("Please generate an image on replicate...") # Interpreter will be logged into Replicate +``` + +# Custom Languages + +You also have control over the `computer`'s languages (like Python, Javascript, and Shell), and can easily append custom languages: + + + Add or customize the programming languages that Open Interpreter can use. + \ No newline at end of file diff --git a/docs/language-models/introduction.mdx b/docs/language-models/introduction.mdx index 7b3ff507..cd454ae9 100644 --- a/docs/language-models/introduction.mdx +++ b/docs/language-models/introduction.mdx @@ -4,7 +4,7 @@ title: Introduction **Open Interpreter** works with both hosted and local language models. -Hosted models are faster and far more capable, but require payment. Local models are private and free, but are often difficult to set up. +Hosted models are faster and more capable, but require payment. Local models are private and free, but are often less capable. For this reason, we recommend starting with a **hosted** model, then switching to a local model once you've explored Open Interpreter's capabilities. 
@@ -13,7 +13,7 @@ For this reason, we recommend starting with a **hosted** model, then switching t Connect to a hosted language model like GPT-4 **(recommended)** @@ -21,9 +21,9 @@ For this reason, we recommend starting with a **hosted** model, then switching t - Setup a local language model like Code Llama + Set up a local language model like Mistral diff --git a/docs/settings/all-settings.mdx b/docs/settings/all-settings.mdx index 40182329..17a75436 100644 --- a/docs/settings/all-settings.mdx +++ b/docs/settings/all-settings.mdx @@ -6,7 +6,7 @@ title: All Settings ### Model Selection -Specifies which language model to use. Check out the [models](https://docs.openinterpreter.com/language-model-setup/introduction) section for a list of available models. +Specifies which language model to use. Check out the [models](/language-models/) section for a list of available models. @@ -198,11 +198,15 @@ interpreter --no-llm_supports_functions interpreter.llm.llm_supports_functions = False ``` +```yaml Profile +llm_supports_functions: false +``` + ### LLM Supports Vision -Inform Open Interpreter that the language model you're using supports vision. +Inform Open Interpreter that the language model you're using supports vision. Defaults to `False`. @@ -224,7 +228,7 @@ llm_supports_vision: true ### Vision Mode -Enables vision mode for multimodal models. Defaults to GPT-4-turbo. +Enables vision mode, which adds some special instructions to the prompt and switches to `gpt-4-vision-preview`. ```bash Terminal @@ -245,7 +249,7 @@ llm.model: "gpt-4-vision-preview" # Any vision supporting model ### OS Mode -Enables OS mode for multimodal models. Defaults to GPT-4-turbo. Currently not available in Python. +Enables OS mode for multimodal models. Currently not available in Python. @@ -278,7 +282,7 @@ Opens the profiles directory. ```bash Terminal -interpreter --profile +interpreter --profiles ``` @@ -290,7 +294,7 @@ Select a profile to use. 
```bash Terminal -interpreter --profile "profile.yaml" +interpreter --profile local.yaml ``` @@ -517,8 +521,15 @@ This boolean flag determines whether to enable or disable some offline features ```python Python -interpreter.offline = True # Check for updates, use procedures -interpreter.offline = False # Don't check for updates, don't use procedures +interpreter.offline = True +``` + +```bash Terminal +interpreter --offline true +``` + +```yaml Profile +offline: true ``` @@ -563,130 +574,56 @@ interpreter.messages = messages # A list that resembles the one above # Computer -The following settings and functions are primarily for the language model to use, not for users to use. +The `computer` object in `interpreter.computer` is a virtual computer that the AI controls. Its primary interface/function is to execute code and return the output in real-time. -### Display - View - -Takes a screenshot of the primary display. - - - -```python Python -interpreter.computer.display.view() -``` - - - -### Display - Center - -Gets the x, y value of the center of the screen. - - - -```python Python -x, y = interpreter.computer.display.center() -``` - - - -### Keyboard - Hotkey +### Offline -Performs a hotkey on the computer +Running the `computer` in offline mode will disable some online features, like the hosted [Computer API](https://api.openinterpreter.com/). Inherits from `interpreter.offline`. ```python Python -interpreter.computer.keboard.hotkey(" ", "command") +interpreter.computer.offline = True ``` - - -### Keyboard - Write - -Writes the text into the currently focused window. - - - -```python Python -interpreter.computer.keyboard.write("hello") +```yaml Profile +computer.offline: True ``` -### Mouse - Click +### Verbose -Clicks on the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and click on it. +This is primarily used for debugging `interpreter.computer`. 
Inherits from `interpreter.verbose`. ```python Python -# Click on coordinates -interpreter.computer.mouse.click(x=100, y=100) - -# Click on text on the screen -interpreter.computer.mouse.click("Onscreen Text") - -# Click on a gear icon -interpreter.computer.mouse.click(icon="gear icon") +interpreter.computer.verbose = True ``` - - -### Mouse - Move - -Moves to the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and move to it. - - - -```python Python -# Click on coordinates -interpreter.computer.mouse.move(x=100, y=100) - -# Click on text on the screen -interpreter.computer.mouse.move("Onscreen Text") - -# Click on a gear icon -interpreter.computer.mouse.move(icon="gear icon") +```yaml Profile +computer.verbose: True ``` -### Mouse - Scroll +### Emit Images -Scrolls the mouse a specified number of pixels. +The `emit_images` attribute in `interpreter.computer` controls whether the computer should emit images or not. This is inherited from `interpreter.llm.supports_vision`. - - -```python Python -# Scroll Down -interpreter.computer.mouse.scroll(-10) - -# Scroll Up -interpreter.computer.mouse.scroll(10) -``` - - - -### Clipboard - View +This is used for multimodal vs. text-only models. Running `computer.display.view()` will return an actual screenshot for multimodal models if `emit_images` is True. If it's False, `computer.display.view()` will return all the text on the screen. -Returns the contents of the clipboard. +Many other functions of the computer can produce image/text outputs, and this parameter controls that. ```python Python -interpreter.computer.clipboard.view() +interpreter.computer.emit_images = True ``` - - -### OS - Get Selected Text - -Get the selected text on the screen. 
- - - -```python Python -interpreter.computer.os.get_selected_text() +```yaml Profile +computer.emit_images: True ``` \ No newline at end of file From 985a8d6dab98a24b6b7bbc5e7c12846b227b36c4 Mon Sep 17 00:00:00 2001 From: killian <63927363+KillianLucas@users.noreply.github.com> Date: Tue, 23 Jan 2024 12:51:32 -0800 Subject: [PATCH 3/4] Documentation Update --- docs/code-execution/computer-api.mdx | 2 +- docs/mint.json | 5 ++++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/code-execution/computer-api.mdx b/docs/code-execution/computer-api.mdx index ef379e1e..217ee426 100644 --- a/docs/code-execution/computer-api.mdx +++ b/docs/code-execution/computer-api.mdx @@ -2,7 +2,7 @@ title: Computer API --- -The following settings and functions are primarily for the language model to use, not for users to use. +The following functions are designed for language models to use in Open Interpreter, currently only supported in [OS Mode](/guides/os-mode/). ### Display - View diff --git a/docs/mint.json b/docs/mint.json index 8b621ba7..f8550ff5 100644 --- a/docs/mint.json +++ b/docs/mint.json @@ -99,7 +99,10 @@ { "group": "Code Execution", "pages": [ - "code-execution/settings" + "code-execution/settings", + "code-execution/usage", + "code-execution/computer-api", + "code-execution/custom-languages" ] }, { From 5672a8e404ee84a3be7f16b8b6b6c4bbcd381d14 Mon Sep 17 00:00:00 2001 From: killian <63927363+KillianLucas@users.noreply.github.com> Date: Tue, 23 Jan 2024 14:08:29 -0800 Subject: [PATCH 4/4] Documentation Update --- docs/getting-started/introduction.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/getting-started/introduction.mdx b/docs/getting-started/introduction.mdx index 15f7486c..10ed3842 100644 --- a/docs/getting-started/introduction.mdx +++ b/docs/getting-started/introduction.mdx @@ -5,7 +5,7 @@ description: A new way to use computers #

-thumbnail +thumbnail **Open Interpreter** lets language models run code. @@ -41,4 +41,4 @@ interpreter -We've also developed [one-line installers](/getting-started/setup) that install Python and set up Open Interpreter. \ No newline at end of file +We've also developed [one-line installers](/getting-started/setup) that install Python and set up Open Interpreter.