github-actions[bot] committed Mar 11, 2024
2 parents fc6ccb7 + aca11f7 commit ce43dd3
Showing 9 changed files with 128 additions and 83 deletions.
113 changes: 57 additions & 56 deletions docs/code-execution/computer-api.mdx
@@ -8,57 +8,57 @@ The following functions are designed for language models to use in Open Interpre

Takes a screenshot of the primary display.

<CodeGroup>

```python Python

```python
interpreter.computer.display.view()
```

</CodeGroup>


### Display - Center

Gets the (x, y) coordinates of the center of the screen.

<CodeGroup>

```python Python

```python
x, y = interpreter.computer.display.center()
```

</CodeGroup>
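As a rough illustration of the kind of value a center lookup returns, the sketch below computes a midpoint from a width and height. The helper name `screen_center` and the sample resolution are hypothetical and not part of Open Interpreter; the real method reads the display size itself.

```python
def screen_center(width, height):
    """Hypothetical helper: integer midpoint of a width x height screen."""
    return width // 2, height // 2


# Sample resolution chosen only for illustration
x, y = screen_center(1920, 1080)
print(x, y)
```

The resulting pair could then be passed as `x=` and `y=` to the mouse functions documented further down.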


### Keyboard - Hotkey

Performs a hotkey on the computer.

<CodeGroup>

```python Python

```python
interpreter.computer.keyboard.hotkey(" ", "command")
```

</CodeGroup>


### Keyboard - Write

Writes the text into the currently focused window.

<CodeGroup>

```python Python

```python
interpreter.computer.keyboard.write("hello")
```

</CodeGroup>


### Mouse - Click

Clicks on the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and click on it.

<CodeGroup>

```python Python

```python
# Click on coordinates
interpreter.computer.mouse.click(x=100, y=100)

@@ -69,15 +69,15 @@ interpreter.computer.mouse.click("Onscreen Text")
interpreter.computer.mouse.click(icon="gear icon")
```

</CodeGroup>


### Mouse - Move

Moves to the specified coordinates, or an icon, or text. If text is specified, OCR will be run on the screenshot to find the text coordinates and move to it.

<CodeGroup>

```python Python

```python
# Move to coordinates
interpreter.computer.mouse.move(x=100, y=100)

@@ -88,152 +88,153 @@ interpreter.computer.mouse.move("Onscreen Text")
interpreter.computer.mouse.move(icon="gear icon")
```

</CodeGroup>


### Mouse - Scroll

Scrolls the mouse a specified number of pixels.

<CodeGroup>

```python Python

```python
# Scroll Down
interpreter.computer.mouse.scroll(-10)

# Scroll Up
interpreter.computer.mouse.scroll(10)
```

</CodeGroup>


### Clipboard - View

Returns the contents of the clipboard.

<CodeGroup>

```python Python

```python
interpreter.computer.clipboard.view()
```

</CodeGroup>


### OS - Get Selected Text

Get the selected text on the screen.

<CodeGroup>

```python Python

```python
interpreter.computer.os.get_selected_text()
```

</CodeGroup>


### Mail - Get

Retrieves the last {number} emails from the inbox, optionally filtering for only unread emails. (Mac only)
Retrieves the last `number` emails from the inbox, optionally filtering for only unread emails. (Mac only)


<CodeGroup>

```python Python
```python
interpreter.computer.mail.get(number=10, unread=True)
```

</CodeGroup>


### Mail - Send

Sends an email with the given parameters using the default mail app. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.mail.send("[email protected]", "Subject", "Body", ["path/to/attachment.pdf", "path/to/attachment2.pdf"])
```

</CodeGroup>


### Mail - Unread Count

Retrieves the count of unread emails in the inbox. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.mail.unread_count()
```

</CodeGroup>


### SMS - Send

Send a text message using the default SMS app. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.sms.send("2068675309", "Hello from Open Interpreter!")
```

</CodeGroup>


### Contacts - Get Phone Number

Returns the phone number of a contact, looked up by name. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.contacts.get_phone_number("John Doe")
```

</CodeGroup>


### Contacts - Get Email Address

Returns the email address of a contact, looked up by name. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.contacts.get_email_address("John Doe")
```

</CodeGroup>


### Calendar - Get Events

Fetches calendar events for the given date or date range from all calendars. (Mac only)

<CodeGroup>

```python Python
interpreter.computer.calendar.get_events(datetime, datetime)

```python
interpreter.computer.calendar.get_events(start_date=datetime, end_date=datetime)
```

</CodeGroup>
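The `datetime` placeholders above are not defined in the snippet. A minimal sketch of building a one-week range, assuming the arguments are standard-library date objects (the docs do not say whether `date` or `datetime` is expected); the call itself is left commented out because it needs Open Interpreter running on a Mac:

```python
import datetime

# Assumption: start_date / end_date accept standard-library date objects.
start_date = datetime.date(2024, 3, 11)
end_date = start_date + datetime.timedelta(days=7)

# Not executed here; requires Open Interpreter and the macOS Calendar app:
# interpreter.computer.calendar.get_events(start_date=start_date, end_date=end_date)

print(start_date.isoformat(), end_date.isoformat())
```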


### Calendar - Create Event

Creates a new calendar event. Uses the first calendar if none is specified. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.calendar.create_event(title="Title", start_date=datetime, end_date=datetime, location="Location", notes="Notes", calendar="Work")
```

</CodeGroup>


### Calendar - Delete Event

Delete a specific calendar event. (Mac only)

<CodeGroup>

```python Python

```python
interpreter.computer.calendar.delete_event(event_title="Title", start_date=datetime, calendar="Work")
```

</CodeGroup>


13 changes: 7 additions & 6 deletions interpreter/core/computer/display/display.py
@@ -134,14 +134,15 @@ def find(self, description, screenshot=None):
return self.find_text(description.strip('"'), screenshot)
else:
try:
message = format_to_recipient(
"Locating this icon will take ~10 seconds. Subsequent icons should be found more quickly.",
recipient="user",
)
print(message)

if self.computer.debug:
print("DEBUG MODE ON")
print("NUM HASHES:", len(self._hashes))
else:
message = format_to_recipient(
"Locating this icon will take ~10 seconds. Subsequent icons should be found more quickly.",
recipient="user",
)
print(message)

from .point.point import point

41 changes: 36 additions & 5 deletions interpreter/core/computer/display/point/point.py
@@ -68,15 +68,17 @@ def find_icon(description, screenshot=None, debug=False, hashes=None):

icons_bounding_boxes = get_element_boxes(image_data, debug)

debug_path = os.path.join(os.path.expanduser("~"), "Desktop", "oi-debug")

if debug:
# Create a draw object
image_data_copy = image_data.copy()
draw = ImageDraw.Draw(image_data_copy)
# Draw red rectangles around all blocks
for block in icons_bounding_boxes:
left, top, width, height = (
block["left"],
block["top"],
block["x"],
block["y"],
block["width"],
block["height"],
)
@@ -104,8 +106,8 @@ def find_icon(description, screenshot=None, debug=False, hashes=None):
# Draw red rectangles around all blocks
for block in icons_bounding_boxes:
left, top, width, height = (
block["left"],
block["top"],
block["x"],
block["y"],
block["width"],
block["height"],
)
@@ -136,8 +138,8 @@ def find_icon(description, screenshot=None, debug=False, hashes=None):
block["height"],
)
draw.rectangle([(left, top), (left + width, top + height)], outline="blue")

# Save the image to the desktop
debug_path = os.path.join(os.path.expanduser("~"), "Desktop", "oi-debug")
if not os.path.exists(debug_path):
os.makedirs(debug_path)
image_data_copy.save(os.path.join(debug_path, "pytesseract_blocks_image.png"))
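The debug images in this hunk are written to an `oi-debug` folder that is created on demand. A minimal sketch of that pattern, assuming a hypothetical helper `ensure_debug_dir` and using `exist_ok=True` in place of the explicit `os.path.exists` check; a temporary directory stands in for the Desktop so the example leaves nothing behind:

```python
import os
import tempfile


def ensure_debug_dir(base_dir):
    """Hypothetical helper: create <base_dir>/oi-debug if missing and return its path."""
    debug_path = os.path.join(base_dir, "oi-debug")
    # exist_ok=True makes the call safe to repeat
    os.makedirs(debug_path, exist_ok=True)
    return debug_path


with tempfile.TemporaryDirectory() as tmp:
    path = ensure_debug_dir(tmp)
    print(os.path.isdir(path))
```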
@@ -411,6 +413,9 @@ def combine_boxes(icons_bounding_boxes):
desktop = os.path.join(os.path.join(os.path.expanduser("~")), "Desktop")
image_data_copy.save(os.path.join(desktop, "point_vision.png"))

if "icon" not in description.lower():
description += " icon"

top_icons = image_search(description, icons, hashes)

coordinates = [t["coordinate"] for t in top_icons]
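The added lines in this hunk append " icon" to the description before the image search unless the word already appears. The same check as a standalone function; the name `ensure_icon_suffix` is hypothetical:

```python
def ensure_icon_suffix(description):
    """Append ' icon' unless 'icon' already occurs in the description (case-insensitive)."""
    if "icon" not in description.lower():
        description += " icon"
    return description


print(ensure_icon_suffix("gear"))
print(ensure_icon_suffix("Gear Icon"))
```

Because this is a substring test, a description such as "silicon chip" also counts as containing "icon" and is left unchanged.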
@@ -665,4 +670,30 @@ def process_image(
# Append the box as a dictionary to the list
boxes.append({"x": x, "y": y, "width": w, "height": h})

# Remove any boxes whose edges cross over any contours
filtered_boxes = []
for box in boxes:
crosses_contour = False
for contour in contours_contrasted:
if (
cv2.pointPolygonTest(contour, (box["x"], box["y"]), False) >= 0
or cv2.pointPolygonTest(
contour, (box["x"] + box["width"], box["y"]), False
)
>= 0
or cv2.pointPolygonTest(
contour, (box["x"], box["y"] + box["height"]), False
)
>= 0
or cv2.pointPolygonTest(
contour, (box["x"] + box["width"], box["y"] + box["height"]), False
)
>= 0
):
crosses_contour = True
break
if not crosses_contour:
filtered_boxes.append(box)
boxes = filtered_boxes

return boxes
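The filter added to `process_image` drops a box when any of its four corners lies inside or on a contour (`cv2.pointPolygonTest(...) >= 0`). Note that it tests corners only, so a box that fully encloses a contour is still kept. Below is a minimal sketch of the same idea without OpenCV: a hand-written ray-casting test is swapped in for `cv2.pointPolygonTest`, and the names `point_in_polygon`, `box_corners`, `filter_boxes`, and the sample data are all hypothetical. Unlike the OpenCV call, this simplified test does not handle points lying exactly on an edge in any special way.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point is inside polygon (a list of (x, y) vertices).

    Points exactly on the boundary are not handled specially.
    """
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def box_corners(box):
    """Return the four corners of a box stored as x, y, width, height."""
    x, y, w, h = box["x"], box["y"], box["width"], box["height"]
    return [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]


def filter_boxes(boxes, contours):
    """Keep only boxes that have no corner inside any contour."""
    return [
        box
        for box in boxes
        if not any(
            point_in_polygon(corner, contour)
            for contour in contours
            for corner in box_corners(box)
        )
    ]


contour = [(0, 0), (10, 0), (10, 10), (0, 10)]
boxes = [
    {"x": 5, "y": 5, "width": 20, "height": 20},  # top-left corner inside the contour
    {"x": 50, "y": 50, "width": 5, "height": 5},  # entirely outside the contour
]
kept = filter_boxes(boxes, [contour])
print(kept)
```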
1 change: 0 additions & 1 deletion interpreter/core/computer/mouse/mouse.py
@@ -41,7 +41,6 @@ def move(self, *args, x=None, y=None, icon=None, text=None, screenshot=None):
"""
Moves the mouse to specified coordinates, an icon, or text.
"""
screenshot = None
if len(args) > 1:
raise ValueError(
"Too many positional arguments provided. To move/click specific coordinates, use kwargs (x=x, y=y).\n\nPlease take a screenshot with computer.display.view() to find text/icons to click, then use computer.mouse.click(text) or computer.mouse.click(icon=description_of_icon) if at all possible. This is **significantly** more accurate than using coordinates. Specifying (x=x, y=y) is highly likely to fail. Specifying ('text to click') is highly likely to succeed."
Expand Down
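The `move` signature accepts a target as positional text, `icon=`, or `x=`/`y=`, and rejects more than one positional argument. A minimal sketch of that dispatch, assuming a lone positional argument is treated as on-screen text (as the docs example `mouse.click("Onscreen Text")` suggests). The helper `resolve_target`, its return format, the precedence among keyword arguments, and the shortened error messages are hypothetical:

```python
def resolve_target(*args, x=None, y=None, icon=None, text=None):
    """Hypothetical helper: classify what a move/click call is pointing at."""
    if len(args) > 1:
        raise ValueError(
            "Too many positional arguments provided. "
            "To move/click specific coordinates, use kwargs (x=x, y=y)."
        )
    if len(args) == 1:
        # Assumption: a single positional argument means on-screen text
        text = args[0]
    if text is not None:
        return ("text", text)
    if icon is not None:
        return ("icon", icon)
    if x is not None and y is not None:
        return ("coordinates", (x, y))
    raise ValueError("Provide text, icon=..., or both x=... and y=...")


print(resolve_target("Onscreen Text"))
print(resolve_target(icon="gear icon"))
print(resolve_target(x=100, y=100))
```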