GlassKit Agent Skill for Smart Glasses Development

The GlassKit agent skill packages smart-glasses context (device constraints, reference patterns, and a starter template) into a format that AI coding agents can read and act on. This page explains what the skill contains, why it matters for glasses development, which agents support it, how to install and update it, and how to prompt an agent once it is installed.

Why Coding Agents Need It

Smart-glasses apps have constraints that general-purpose coding agents do not handle well out of the box:

HUD constraints: Rokid Glasses use a monochrome 480×640 portrait display. Black pixels are transparent and white pixels render as green. There is no touchscreen.
Input model: The only physical control is a temple touchpad (tap, double-tap, swipe forward, swipe backward). Voice commands are recognized offline from a fixed set of phrases.
Camera and microphone: Device-specific capture constraints, codec preferences, and HAL quirks affect WebRTC streaming behavior.
Battery and performance: The glasses have less CPU and RAM than phones, so implementations must be efficient.
Wearer UX: The wearer cannot look at a phone while using the app, so spoken guidance, minimal HUD text, and low-latency feedback matter more than they do in phone apps.

Without this context, an agent tends to generate code that works on a phone but fails or feels wrong on glasses.
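
To make the display constraint concrete, here is a minimal sketch of how a color chosen for a phone UI might collapse on a monochrome HUD. It is illustrative only: the Rec. 601 luma weights and the visibility threshold are assumptions, not the measured response of the Rokid panel, and the function names are hypothetical.

```python
# Assumption: the HUD shows brightness only, so an RGB color collapses to a
# single intensity. Rec. 601 luma weights are used purely for illustration;
# the real panel response may differ.

HUD_WIDTH, HUD_HEIGHT = 480, 640  # portrait resolution stated on this page


def hud_intensity(r: int, g: int, b: int) -> float:
    """Return an assumed 0.0-1.0 brightness for an 8-bit RGB color."""
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    return luma / 255.0


def is_visible(r: int, g: int, b: int, threshold: float = 0.2) -> bool:
    """Treat colors below a hypothetical threshold as effectively transparent."""
    return hud_intensity(r, g, b) >= threshold
```

Under these assumptions black maps to zero intensity (transparent) and white to full intensity (brightest green), while a saturated blue that reads clearly on a phone falls below the threshold and would be hard to see.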

Which Agents It Works With

The GlassKit skill is distributed through the Agent Skills CLI. Any agent that supports that CLI can use it, including:

OpenAI Codex
Claude Code
Cursor
Other editors and CLI agents that support the Agent Skills spec

Installing the Skill

Run the install command once in your project directory or globally:

npx skills add RealComputer/GlassKit

Updating the Skill

When GlassKit ships new references or updates existing ones, pull the latest version with:

npx skills update glasskit

What the Skill Includes

After installation the agent can read three things:

SKILL.md

Top-level orientation: Rokid hardware basics, display and input constraints, workflow steps, and a reference index. The agent reads this first.

references/

Detailed reference docs covering Rokid setup, inputs, voice commands, WebRTC, proactive perception, OpenAI Realtime, and object detection.

assets/rokid-hello-world/

A minimal Rokid Glasses starter app the agent can copy into your workspace as a scaffold. Includes HUD layout and touchpad navigation patterns.
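
The starter app itself is an Android project and its code is not reproduced here. As a language-neutral sketch of what touchpad navigation can look like, the following models a HUD menu driven only by the four touchpad gestures. The gesture-to-action mapping (swipe to move, tap to select, double-tap to go back) and the names `Gesture` and `HudMenu` are assumptions for illustration, not the template's actual design.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class Gesture(Enum):
    TAP = auto()
    DOUBLE_TAP = auto()
    SWIPE_FORWARD = auto()
    SWIPE_BACKWARD = auto()


@dataclass
class HudMenu:
    """A single-column menu with one highlighted item at a time."""
    items: List[str]
    index: int = 0
    selected: Optional[str] = None

    def handle(self, gesture: Gesture) -> None:
        if gesture is Gesture.SWIPE_FORWARD:
            # Move the highlight to the next item, wrapping at the end.
            self.index = (self.index + 1) % len(self.items)
        elif gesture is Gesture.SWIPE_BACKWARD:
            # Move the highlight to the previous item, wrapping at the start.
            self.index = (self.index - 1) % len(self.items)
        elif gesture is Gesture.TAP:
            # Confirm the highlighted item.
            self.selected = self.items[self.index]
        elif gesture is Gesture.DOUBLE_TAP:
            # Assumed "back" action: clear the current selection.
            self.selected = None
```

Keeping every screen reachable through these four gestures is the point of the pattern, since there is no touchscreen to fall back on.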

The full reference set the agent reads:

Reference	What it covers
`rokid-setup.md`	Hardware, Wi-Fi/ADB connection, common commands, phone/emulator setup
`rokid-inputs.md`	Touchpad handling, camera access, microphone access
`vosk-voice-commands.md`	Offline command-word recognition setup and implementation
`rokid-webrtc.md`	WebRTC sessions, audio/video tracks, SDP signaling, data channels, ICE/TURN
`proactive-perception-pattern.md`	Continuous observation pattern for workflow-driven apps
`openai-realtime.md`	WebRTC media brokering, VAD, backend-gated turns, sideband events
`object-detection.md`	Backend inference, normalized events, RF-DETR, realtime model augmentation

Example Prompts

Once the skill is installed, use a prompt like this with your agent:

Create a starter Rokid Glasses app using the glasskit skill.

What Happens When the Agent Uses the Skill

Agent reads SKILL.md

The agent loads the top-level orientation doc to understand Rokid hardware, display constraints, touchpad input model, and the available reference set.

Agent reads relevant references

Based on the task, the agent reads the specific references it needs — for example rokid-webrtc.md and openai-realtime.md for a voice assistant, or object-detection.md for a detection workflow.
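
The selection step can be pictured as a lookup from task type to reference files. This is only a sketch: a real agent chooses references by reading the index in SKILL.md, not from a hard-coded table, and the task keys and function name below are hypothetical. The file names come from the reference table on this page.

```python
# Hypothetical task keys mapped to the references named in the example above.
REFERENCES_BY_TASK = {
    "voice_assistant": ["rokid-webrtc.md", "openai-realtime.md"],
    "object_detection": ["object-detection.md"],
}


def references_for(task: str) -> list:
    """Return SKILL.md first, then any task-specific reference paths."""
    names = REFERENCES_BY_TASK.get(task, [])
    return ["SKILL.md"] + [f"references/{name}" for name in names]
```

An unrecognized task falls back to SKILL.md alone, which mirrors the rule that the orientation doc is always read first.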

Agent copies the starter template

For a new app, the agent copies assets/rokid-hello-world/ into the target workspace as the initial scaffold, then renames the package and application ID.
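
A rough sketch of the copy-and-rename step is shown below. It assumes the rename can be approximated by replacing an identifier string in text files. The function name, the file-suffix list, and the `old_id`/`new_id` parameters are hypothetical, and the template's real package name is not documented on this page, so it must be read from the template itself.

```python
import shutil
from pathlib import Path

# Assumed set of text files that may contain the package or application ID.
TEXT_SUFFIXES = {".kt", ".java", ".xml", ".kts", ".gradle"}


def scaffold_app(template_dir: Path, target_dir: Path, old_id: str, new_id: str) -> list:
    """Copy the template, then replace old_id with new_id in text files.

    Returns the list of files that were rewritten. target_dir must not exist.
    """
    shutil.copytree(template_dir, target_dir)
    changed = []
    for path in sorted(target_dir.rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            text = path.read_text(encoding="utf-8")
            if old_id in text:
                path.write_text(text.replace(old_id, new_id), encoding="utf-8")
                changed.append(path)
    return changed
```

This sketch does not move source directories to match the new package name, which a complete rename of an Android project would also need.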

Agent implements with constraints in mind

The agent writes Android and backend code that accounts for the monochrome HUD, touchpad navigation, camera HAL quirks, codec preferences, and wearer UX patterns from the references.

If the agent produces code that looks like a phone app rather than a glasses app — touchscreen gestures, colorful UI, or missing touchpad navigation — remind it to re-read the SKILL.md orientation and the rokid-inputs.md reference.
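
One of these symptoms, colorful UI, can be spotted mechanically. The sketch below scans source or resource text for hex colors that are not grayscale, on the assumption that only brightness matters on a monochrome HUD. The function name and regular expression are hypothetical, and the check is a review aid, not a rule defined by the skill.

```python
import re

# Matches #RRGGBB or #AARRGGBB and captures the RRGGBB part.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{2})?([0-9a-fA-F]{6})\b")


def non_grayscale_colors(source: str) -> list:
    """Return hex color literals whose red, green, and blue channels differ."""
    found = []
    for match in HEX_COLOR.finditer(source):
        rgb = match.group(1).lower()
        r, g, b = rgb[0:2], rgb[2:4], rgb[4:6]
        if not (r == g == b):
            found.append(match.group(0))
    return found
```

Any color it reports is a candidate to replace with a grayscale value or to raise with the agent as a sign that it is designing for a phone screen.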

Questions and Contributions

For questions about the skill, or to contribute new references and examples, open an issue in the GlassKit repository or ask in the Discord server.
