<!--
hoody-display Subskill (sdk)
Auto-generated by Hoody Skills Generator
Generated: 2026-06-20T00:04:10.689Z
Model: mimo-v2.5-pro + fixer:mimo-v2.5-pro
Mode: sdk


Tokens: 20232

DO NOT EDIT MANUALLY - Changes will be overwritten on next generation
-->

# hoody-display Subskill

## Overview

The **hoody-display** service provides fully embeddable, multiplayer desktop environments accessible via URL. It exposes a virtual display that can be controlled programmatically or accessed through an HTML5 web client, enabling AI agents and applications to interact with GUI software remotely.

### Core Capabilities

- **Screenshots & Thumbnails** — Capture, retrieve, and list display screenshots/thumbnails
- **Input Automation** — Full mouse and keyboard control (clicks, drags, typing, key combos)
- **Window Management** — List, focus, move, resize, minimize, close, and search windows
- **Clipboard Access** — Read and write the display clipboard
- **HTML5 Client** — Embeddable web interface with extensive configuration options
- **Batch Operations** — Execute sequences of actions in a single request
- **Compound Actions** — Combined move-click-type, drag, select operations

### When to Use

hoody-display is ideal for remote desktop automation, visual verification, embedding in web apps, and multi-user collaboration.

### Philosophy Alignment

hoody-display makes desktop environments programmable resources — as easy to control as making an HTTP request. The embeddable HTML5 client means any GUI application is accessible from any browser via URL. Multiplayer support enables collaborative interactions where multiple agents or users can share a single display.

### Service Access

The service URL follows the Hoody Kit pattern:

```
https://{projectId}-{containerId}-display-{serviceId}.{node}.containers.hoody.com
```

All endpoints use the `/api/v1/display/` prefix. Authentication is handled by the Hoody SDK automatically.
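
As a rough illustration of the pattern above, the service base URL can be assembled from its four placeholders. The helper name `buildDisplayBaseUrl`, the `DisplayServiceLocation` type, and the example identifier values are hypothetical and not part of the SDK; only the host pattern and the `/api/v1/display/` prefix come from this document.

```typescript
// Hypothetical helper: field names mirror the placeholders in the URL pattern.
interface DisplayServiceLocation {
  projectId: string
  containerId: string
  serviceId: string
  node: string
}

function buildDisplayBaseUrl(loc: DisplayServiceLocation): string {
  const host = `${loc.projectId}-${loc.containerId}-display-${loc.serviceId}.${loc.node}.containers.hoody.com`
  return `https://${host}/api/v1/display/`
}

// Placeholder values, not real identifiers.
const baseUrl = buildDisplayBaseUrl({
  projectId: 'proj1',
  containerId: 'ctr1',
  serviceId: '1',
  node: 'node-a'
})
```

When the SDK is used, this URL normally does not need to be built by hand; the sketch is only meant to make the naming scheme concrete.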

---

## Common Workflows

### 1. Health Check and Display Discovery

Verify service availability and gather display information before operations.

```
import { HoodyClient } from '@hoody-ai/hoody-sdk'

const client = new HoodyClient({
  baseURL: 'https://api.hoody.com',
  token: 'YOUR_TOKEN'
})

// Verify service is healthy
const health = await client.display.health.check()

// Get display information and available screenshots
const info = await client.display.getInformation()

// List all available screenshots with metadata
const screenshots = await client.display.listScreenshots()
```

### 2. Screenshot Capture and Retrieval

Capture display state for visual verification or audit trails.

```
// Capture a fresh screenshot (returns image data)
const screenshot = await client.display.screenshots.capture()

// Capture with base64 encoding for JSON transport
const b64 = await client.display.screenshots.capture({ base64: true })

// Capture metadata only (no image download)
const meta = await client.display.screenshots.captureMetadata()

// Retrieve the most recent cached screenshot
const latest = await client.display.screenshots.getLatest()

// Get metadata for the latest screenshot
const latestMeta = await client.display.screenshots.getLatestMetadata()

// Retrieve a specific screenshot by Unix timestamp
const historical = await client.display.screenshots.getByTimestamp(
  '1704067200'
)

// Thumbnails (320x180 scaled) for quick previews
const thumb = await client.display.thumbnails.capture()
const latestThumb = await client.display.thumbnails.getLatest()
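// As written in this document, thumbnails.getByTimestamp takes an options object,
// whereas screenshots.getByTimestamp above takes the timestamp string directly
const thumbByTime = await client.display.thumbnails.getByTimestamp({ timestamp: '1704067200' })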
```
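
The timestamp examples above use a 10-digit string, which suggests Unix time in seconds rather than milliseconds. Under that assumption, a small hypothetical helper (`toUnixTimestamp` is not an SDK function) can convert a JavaScript `Date` into the expected string:

```typescript
// Assumption: the service expects whole seconds since the Unix epoch, as a string.
function toUnixTimestamp(date: Date): string {
  return Math.floor(date.getTime() / 1000).toString()
}

// 2024-01-01T00:00:00Z corresponds to the '1704067200' value used above.
const ts = toUnixTimestamp(new Date('2024-01-01T00:00:00Z'))
```

The resulting string could then be passed to `screenshots.getByTimestamp(ts)`.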

### 3. Mouse Control

Full mouse automation with absolute positioning, relative offsets, and button control.

```
// Move cursor to absolute position
await client.display.input.mouseMove({
  data: { x: 500, y: 300 }
})

// Move cursor relative to current position
await client.display.input.mouseMoveRelative({
  data: { x: 50, y: -25 }
})

// Click at current position (left button default)
await client.display.input.mouseClick()

// Double-click
await client.display.input.mouseDoubleClick()

// Press and hold a mouse button
await client.display.input.mouseDown()

// Release a mouse button
await client.display.input.mouseUp()

// Scroll in a direction
await client.display.input.mouseScroll({
  data: { direction: 'down' }
})

// Get current cursor position
const pos = await client.display.input.mouseLocation()
```
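
Some applications react poorly to a cursor that jumps instantly across the screen. One possible approach, sketched below, is to compute intermediate points by linear interpolation and send each one through `mouseMove`. The `Point` type and `interpolatePath` function are illustrative names, not part of the SDK.

```typescript
interface Point {
  x: number
  y: number
}

// Returns `steps` evenly spaced points ending exactly at `to`.
// The starting point itself is not included in the result.
function interpolatePath(from: Point, to: Point, steps: number): Point[] {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError('steps must be a positive integer')
  }
  const path: Point[] = []
  for (let i = 1; i <= steps; i++) {
    const t = i / steps
    path.push({
      x: Math.round(from.x + (to.x - from.x) * t),
      y: Math.round(from.y + (to.y - from.y) * t)
    })
  }
  return path
}
```

Each returned point can be sent in order, for example `await client.display.input.mouseMove({ data: point })` inside a `for...of` loop, starting from the position reported by `mouseLocation()`.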

### 4. Keyboard Automation

Type text and send key combinations for keyboard-driven interactions.

```
// Type a string of text
await client.display.input.keyboardType({
  data: { text: 'Hello, World!' }
})

// Send key combinations (ctrl+c, ctrl+shift+s, Return, etc.)
await client.display.input.keyboardKey({
  data: { keys: ['ctrl+c'] }
})

// Hold a key down (X11 keysym names: Shift_L, Control_L, Alt_L)
await client.display.input.keyboardKeyDown({
  data: { key: 'Shift_L' }
})

// Release a held key
await client.display.input.keyboardKeyUp({
  data: { key: 'Shift_L' }
})
```
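
The combination strings shown above (`ctrl+c`, `ctrl+shift+s`) join modifiers and a key with `+`. A minimal sketch of building such strings consistently follows. The `keyCombo` helper, the fixed modifier order, and the `alt` modifier name are assumptions made for illustration; only `ctrl` and `shift` appear in combinations in this document.

```typescript
// Assumed modifier names and ordering; adjust to what the service accepts.
const MODIFIER_ORDER = ['ctrl', 'alt', 'shift'] as const
type Modifier = typeof MODIFIER_ORDER[number]

// Builds a combo string with modifiers in a stable order and duplicates removed.
function keyCombo(modifiers: Modifier[], key: string): string {
  const ordered = MODIFIER_ORDER.filter((m) => modifiers.indexOf(m) !== -1)
  return [...ordered, key].join('+')
}

const saveAs = keyCombo(['shift', 'ctrl'], 's')
```

The result fits the `keys` array of `keyboardKey`, for example `{ data: { keys: [saveAs] } }`.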

### 5. Window Management

Inspect, focus, arrange, and manipulate windows on the display.

```
// List all windows
const windows = await client.display.listWindows()

// List only visible windows
const visible = await client.display.listWindows({ onlyVisible: true })

// Get the currently active window
const active = await client.display.input.windowActive()

// Focus a specific window
await client.display.input.windowFocus({
  data: { windowId: '0x01234567' }
})

// Move a window to new coordinates
await client.display.input.windowMove({
  data: { windowId: '0x01234567', x: 100, y: 100 }
})

// Resize a window
await client.display.input.windowResize({
  data: { windowId: '0x01234567', width: 800, height: 600 }
})

// Minimize, raise, or close
await client.display.input.windowMinimize({
  data: { windowId: '0x01234567' }
})
await client.display.input.windowRaise({
  data: { windowId: '0x01234567' }
})
await client.display.input.windowClose({
  data: { windowId: '0x01234567' }
})

// Search for windows matching a pattern
const found = await client.display.input.windowSearch({
  data: { pattern: 'Firefox' }
})

// Note: query/getter methods such as windowGeometry, windowName, and
// getWindowProperties take a flat argument object (e.g. { windowId }),
// while action methods wrap their arguments in { data: { ... } }.
// Get window geometry, name, and extended properties
const geo = await client.display.input.windowGeometry({
  windowId: '0x01234567'
})
const name = await client.display.input.windowName({
  windowId: '0x01234567'
})
const props = await client.display.getWindowProperties({
  windowId: '0x01234567'
})
```
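
When the window list has already been fetched, a target window can be picked locally instead of issuing a separate search. The sketch below assumes the list result has the `{ windows: Array<{ id, name }> }` shape given under Key Response Types; the `WindowSummary` and `WindowListLike` types and the `findWindowByName` helper are illustrative names, not SDK exports.

```typescript
interface WindowSummary {
  id: string
  name: string
}

interface WindowListLike {
  windows: WindowSummary[]
}

// Returns the first window whose name matches, or undefined if none does.
// Avoid passing a RegExp with the `g` flag: `test` is stateful in that case.
function findWindowByName(list: WindowListLike, pattern: RegExp): WindowSummary | undefined {
  for (const w of list.windows) {
    if (pattern.test(w.name)) return w
  }
  return undefined
}
```

Callers should handle the `undefined` case before using the returned `id` with `windowFocus` or similar methods.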

### 6. Clipboard Operations

Read and write text to the display clipboard for data transfer.

```
// Read clipboard content
const clipboard = await client.display.getClipboard()

// Read a specific selection buffer (e.g., 'primary')
const primary = await client.display.getClipboard({ selection: 'primary' })

// Write text to clipboard
await client.display.setClipboard({
  data: { text: 'Content to paste' }
})
```

### 7. Verified Interaction Pattern

Combine operations with screenshot verification between steps to ensure actions had the expected effect.

```
// Step 1: Capture initial state
const before = await client.display.screenshots.capture()

// Step 2: Perform input action
await client.display.input.mouseMove({ data: { x: 400, y: 250 } })
await client.display.input.mouseClick()

// Step 3: Wait briefly for UI response
await client.display.input.wait({ data: { duration: 500 } })

// Step 4: Capture post-action state for verification
const after = await client.display.screenshots.capture()

// Step 5: Verify by checking window state
const active = await client.display.input.windowActive()
```
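
One simple way to check that an action changed anything is to compare the two captures byte by byte. The sketch below assumes the screenshot data can be obtained as raw bytes (`Uint8Array`); this document only states that `capture()` returns image data, so the conversion step is left to the caller. `screenshotsDiffer` is a hypothetical helper.

```typescript
// Returns true if the two byte sequences differ in length or content.
function screenshotsDiffer(before: Uint8Array, after: Uint8Array): boolean {
  if (before.length !== after.length) return true
  for (let i = 0; i < before.length; i++) {
    if (before[i] !== after[i]) return true
  }
  return false
}
```

A byte-level difference only indicates that something on the display changed. It says nothing about what changed, and unrelated activity such as a blinking cursor can also trigger it, so it works best alongside a state check like `windowActive()`.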

---

## Advanced Operations

### 1. Compound Input: Click-At and Type-At

Execute combined move-click and move-click-type operations in single requests.

```
// Move to position and click in one action
await client.display.input.clickAt({
  data: { x: 400, y: 250 }
})

// Move, click, and type text in one action
await client.display.input.typeAt({
  data: { x: 400, y: 250, text: 'username@example.com' }
})
```

### 2. Drag and Select Operations

Perform drag gestures and text selection in single requests.

```
// Drag from one position to another
await client.display.input.drag({
  data: { from_x: 100, from_y: 100, to_x: 500, to_y: 300 }
})

// Select a range via click + shift-click
await client.display.input.select({
  data: { from_x: 100, from_y: 200, to_x: 400, to_y: 200 }
})
```
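
The `drag` and `select` payloads above use snake_case fields (`from_x`, `to_y`), while most other calls use plain `x` and `y`. If the rest of the calling code works with point objects, a small adapter such as the following may help. `DragPoint`, `DragData`, and `dragData` are illustrative names only.

```typescript
interface DragPoint {
  x: number
  y: number
}

interface DragData {
  from_x: number
  from_y: number
  to_x: number
  to_y: number
}

// Maps a pair of points onto the field names used by drag and select.
function dragData(from: DragPoint, to: DragPoint): DragData {
  return { from_x: from.x, from_y: from.y, to_x: to.x, to_y: to.y }
}
```

For example, `client.display.input.drag({ data: dragData(start, end) })` would send the same payload shape as the literal objects shown above.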

### 3. Action with Screenshot Return

Execute an action and capture the resulting display state in one request.

```
// Perform action and get screenshot of the result
const result = await client.display.input.act({
  data: { action: 'click', x: 300, y: 200 }
})

// Wait for a duration and capture state
const waitResult = await client.display.input.wait({
  data: { duration: 2000 }
})
```

### 4. Batch Operations

Execute a sequence of actions in a single request instead of sending one request per action.

```
const batchResult = await client.display.input.batch({
  data: {
    actions: [
      { action: 'mouse_move', x: 100, y: 100 },
      { action: 'mouse_click' },
      { action: 'keyboard_type', text: 'search query' },
      { action: 'keyboard_key', keys: ['Return'] }
    ]
  }
})
```
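
Typing the action list can catch malformed entries before a batch is sent. The union below covers only the four action names that appear in the example above; the service may accept more, and the `BatchAction` type and `fillAndSubmit` helper are assumptions made for illustration rather than SDK definitions.

```typescript
// Covers only the action shapes shown in this document.
type BatchAction =
  | { action: 'mouse_move'; x: number; y: number }
  | { action: 'mouse_click' }
  | { action: 'keyboard_type'; text: string }
  | { action: 'keyboard_key'; keys: string[] }

// Builds the click, type, submit sequence for a single text field.
function fillAndSubmit(x: number, y: number, text: string): BatchAction[] {
  return [
    { action: 'mouse_move', x, y },
    { action: 'mouse_click' },
    { action: 'keyboard_type', text },
    { action: 'keyboard_key', keys: ['Return'] }
  ]
}

const searchActions = fillAndSubmit(100, 100, 'search query')
```

The array can be passed as `client.display.input.batch({ data: { actions: searchActions } })`.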

### 5. Display Geometry and Reset

Get display dimensions and release stuck inputs.

```
// Get display size for coordinate-aware interactions
const geometry = await client.display.input.geometry()

// Emergency reset — release all held mouse buttons and keys
await client.display.input.reset()
```
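
Display geometry is useful for keeping computed coordinates on screen. The following sketch clamps a point to the display bounds, assuming zero-based pixel coordinates (so the largest valid `x` is `width - 1`) and the `{ width, height }` shape listed under Key Response Types. `clampToDisplay` and its types are hypothetical names.

```typescript
interface DisplaySize {
  width: number
  height: number
}

interface ScreenPoint {
  x: number
  y: number
}

// Rounds to whole pixels and keeps the point inside [0, width-1] x [0, height-1].
function clampToDisplay(p: ScreenPoint, size: DisplaySize): ScreenPoint {
  const clamp = (value: number, extent: number): number =>
    Math.min(Math.max(Math.round(value), 0), extent - 1)
  return { x: clamp(p.x, size.width), y: clamp(p.y, size.height) }
}
```

A typical use would be to fetch the geometry once with `client.display.input.geometry()` and clamp every computed target before calling `mouseMove` or `clickAt`.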

### 6. Full Workflow: Window Discovery and Form Automation

A complete pattern that discovers a window, focuses it, and automates form interaction.

```
// Find the target application
const searchResult = await client.display.input.windowSearch({
  data: { pattern: 'Application' }
})

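// Stop early if nothing matched, rather than failing on an undefined window
const [target] = searchResult.windows
if (!target) {
  throw new Error('No window matched the pattern "Application"')
}
const windowId = target.id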

// Focus and wait for activation
await client.display.input.windowFocus({ data: { windowId } })
await client.display.input.wait({ data: { duration: 500 } })

// Get window geometry for coordinate calculation
const geo = await client.display.input.windowGeometry({ windowId })

// Fill first field
await client.display.input.typeAt({
  data: { x: geo.x + 200, y: geo.y + 150, text: 'John Doe' }
})

// Tab to next field and type
await client.display.input.keyboardKey({ data: { keys: ['Tab'] } })
await client.display.input.keyboardType({
  data: { text: 'john@example.com' }
})

// Submit form
await client.display.input.keyboardKey({ data: { keys: ['Return'] } })

// Verify result
await client.display.input.wait({ data: { duration: 1000 } })
const result = await client.display.screenshots.capture()
```
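
The workflow above converts window-relative offsets to display coordinates by adding `geo.x` and `geo.y`. A slightly safer variant, sketched here, also rejects offsets that fall outside the window. It assumes the geometry result additionally exposes `width` and `height`, which this document does not confirm; `WindowGeometryLike` and `toDisplayPoint` are illustrative names.

```typescript
// Assumed shape: only x and y are used in the workflow above.
interface WindowGeometryLike {
  x: number
  y: number
  width: number
  height: number
}

// Converts an offset inside the window to absolute display coordinates.
function toDisplayPoint(
  geo: WindowGeometryLike,
  offsetX: number,
  offsetY: number
): { x: number; y: number } {
  if (offsetX < 0 || offsetY < 0 || offsetX >= geo.width || offsetY >= geo.height) {
    throw new RangeError(
      `offset (${offsetX}, ${offsetY}) is outside the ${geo.width}x${geo.height} window`
    )
  }
  return { x: geo.x + offsetX, y: geo.y + offsetY }
}
```

With this helper, the first field in the workflow could be addressed as `typeAt({ data: { ...toDisplayPoint(geo, 200, 150), text: 'John Doe' } })`.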

### 7. Error Recovery Pattern

Handle stuck input states gracefully with automatic reset.

```
try {
  await client.display.input.keyboardType({
    data: { text: 'important input' }
  })
} catch (error) {
  // Reset all input state on failure
  await client.display.input.reset()
  // Retry
  await client.display.input.keyboardType({
    data: { text: 'important input' }
  })
}
```
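
The pattern above retries exactly once, and a second failure propagates without another reset. A more general form, shown as a sketch, bounds the number of attempts and resets after every failure. `withInputReset` is a hypothetical wrapper; it takes the action and the reset call as plain functions so it does not depend on any particular SDK type.

```typescript
// Runs `action`, calling `reset` after each failure, for up to `maxAttempts` tries.
// Rethrows the last error if every attempt fails.
async function withInputReset<T>(
  action: () => Promise<T>,
  reset: () => Promise<unknown>,
  maxAttempts = 2
): Promise<T> {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action()
    } catch (error) {
      lastError = error
      // Release held keys and buttons before the next attempt (or before giving up).
      await reset()
    }
  }
  throw lastError
}
```

Usage might look like `withInputReset(() => client.display.input.keyboardType({ data: { text } }), () => client.display.input.reset())`. Note that retrying a typing action can produce duplicated text if the first attempt partially succeeded, so this wrapper suits idempotent actions best.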

### 8. HTML5 Client Access

Generate embeddable display client URLs with configuration options.

```
// Access with default settings
await client.display.accessClient()

// Access with specific configuration
await client.display.accessClient({
  readonly: true,
  dark_mode: true,
  toolbar: false,
  sound: true,
  keyboard: true,
  clipboard: true
})
```

---

## Quick Reference

### Most Common Operations

| Operation | SDK Method | Required Data |
|-----------|-----------|---------------|
| Health check | `client.display.health.check()` | — |
| Capture screenshot | `client.display.screenshots.capture()` | — |
| Get latest screenshot | `client.display.screenshots.getLatest()` | — |
| Screenshot metadata | `client.display.screenshots.captureMetadata()` | — |
| Screenshot by time | `client.display.screenshots.getByTimestamp(ts)` | `timestamp` |
| Capture thumbnail | `client.display.thumbnails.capture()` | — |
| Get display info | `client.display.getInformation()` | — |
| List windows | `client.display.listWindows()` | — |
| Read clipboard | `client.display.getClipboard()` | — |
| Write clipboard | `client.display.setClipboard({ data: { text } })` | `text` |
| Mouse move | `client.display.input.mouseMove({ data: { x, y } })` | `x`, `y` |
| Mouse click | `client.display.input.mouseClick()` | — |
| Mouse scroll | `client.display.input.mouseScroll({ data: { direction } })` | `direction` |
| Type text | `client.display.input.keyboardType({ data: { text } })` | `text` |
| Key combo | `client.display.input.keyboardKey({ data: { keys } })` | `keys` (array) |
| Hold key | `client.display.input.keyboardKeyDown({ data: { key } })` | `key` |
| Release key | `client.display.input.keyboardKeyUp({ data: { key } })` | `key` |
| Focus window | `client.display.input.windowFocus({ data: { windowId } })` | `windowId` |
| Move window | `client.display.input.windowMove({ data: { windowId, x, y } })` | `windowId`, `x`, `y` |
| Reset inputs | `client.display.input.reset()` | — |
| Display geometry | `client.display.input.geometry()` | — |

### Key Response Types

| Type | Shape |
|------|-------|
| `HealthResponse` | `{ status: string, ... }` (standardized health payload with 9 fields) |
| `Base64ScreenshotResponse` | Image binary or `{ base64: string, ... }` |
| `ScreenshotInfo` | `{ timestamp, width, height, ... }` |
| `WindowListResult` | `{ windows: Array<{ id, name, ... }> }` |
| `InputActionResponse` | `{ success: boolean, ... }` |
| `MouseLocationResult` | `{ x: number, y: number }` |
| `DisplayGeometryResult` | `{ width: number, height: number }` |
| `ClipboardReadResult` | `{ text: string, ... }` |
| `ActionWithScreenshotResult` | Action result + screenshot data |
| `BatchResult` | Results array for each batched action |

### HTML5 Client URL Parameters

Key configuration options for `client.display.accessClient()`: `readonly`, `dark_mode`, `toolbar`, `menu`, `sound`, `keyboard`, `clipboard`, `file_transfer`, `video`, `sharing`, `floating_menu`, `reconnect`, `steal`, `printing`, `web_notifications`, `open_url`.
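
Since these options are described as URL parameters, the following sketch shows one way a set of boolean flags could be serialized into a query string. How the service actually encodes booleans is not stated in this document; the `true`/`false` encoding, the `ClientFlags` type, and the `clientQueryString` helper are all assumptions.

```typescript
type ClientFlags = { [name: string]: boolean }

// Serializes flags in insertion order as name=true / name=false pairs.
function clientQueryString(flags: ClientFlags): string {
  return Object.keys(flags)
    .map((name) => `${encodeURIComponent(name)}=${flags[name] ? 'true' : 'false'}`)
    .join('&')
}

const viewerQuery = clientQueryString({ readonly: true, dark_mode: true, toolbar: false })
```

In practice `client.display.accessClient()` accepts these options directly, so a hand-built query string would only matter when constructing an embed URL outside the SDK.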