Overview

The server-client architecture enables distributed inference where the GR00T model runs on a GPU server and clients connect remotely to query actions. This is useful for:
  • Running inference on a remote GPU while controlling robots locally
  • Sharing a single model across multiple robot instances
  • Separating model execution from environment simulation

PolicyServer

Class definition

gr00t/policy/server_client.py
class PolicyServer:
    """An inference server that spins up a ZeroMQ socket and listens for incoming requests.
    
    Can add custom endpoints by calling `register_endpoint`.
    """

Constructor

  • policy (BasePolicy, required): The policy instance to serve (e.g., Gr00tPolicy or Gr00tSimPolicyWrapper)
  • host (str, default: "*"): Host address to bind to. Use "*" to listen on all interfaces, or "localhost" for local-only access
  • port (int, default: 5555): Port number to listen on
  • api_token (str | None, default: None): Optional API token for authentication (currently not enforced)

Methods

register_endpoint

Register a custom endpoint on the server.
def register_endpoint(
    self, 
    name: str, 
    handler: Callable, 
    requires_input: bool = True
) -> None
  • name (str, required): The name of the endpoint (e.g., "get_action", "reset")
  • handler (Callable, required): The handler function that will be called when the endpoint is hit
  • requires_input (bool, default: True): Whether the handler requires input data
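
The effect of requires_input can be illustrated with a small stand-in registry. This is a minimal sketch, not the PolicyServer implementation: EndpointRegistry and its call method are hypothetical names, and it assumes that handlers registered with requires_input=False are invoked without arguments.

```python
from typing import Any, Callable


class EndpointRegistry:
    """Hypothetical stand-in for the endpoint table kept by a server."""

    def __init__(self) -> None:
        self._endpoints: dict[str, tuple[Callable, bool]] = {}

    def register_endpoint(
        self, name: str, handler: Callable, requires_input: bool = True
    ) -> None:
        # Store the handler together with its input requirement.
        self._endpoints[name] = (handler, requires_input)

    def call(self, name: str, data: Any = None) -> Any:
        if name not in self._endpoints:
            raise KeyError(f"Unknown endpoint: {name}")
        handler, requires_input = self._endpoints[name]
        # Assumption: input-free handlers are called with no arguments.
        return handler(data) if requires_input else handler()


registry = EndpointRegistry()
registry.register_endpoint("echo", lambda data: data)
registry.register_endpoint("version", lambda: "0.1", requires_input=False)
```

With the real server, the equivalent call would be server.register_endpoint("echo", handler) before server.run().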

run

Start the server and listen for requests.
def run(self) -> None
This method runs an infinite loop until the server is killed via the "kill" endpoint.

Default endpoints

The server automatically registers these endpoints:
  • ping (GET): Health check endpoint that returns {"status": "ok", "message": "Server is running"}
  • kill (POST): Gracefully shut down the server
  • get_action (POST): Generate actions from observations. Calls policy.get_action(observation, options)
  • reset (POST): Reset the policy state. Calls policy.reset(options)
  • get_modality_config (GET): Get modality configurations. Calls policy.get_modality_config()

Usage example

from gr00t.policy.gr00t_policy import Gr00tPolicy
from gr00t.policy.server_client import PolicyServer
from gr00t.data.embodiment_tags import EmbodimentTag

# Initialize policy
policy = Gr00tPolicy(
    embodiment_tag=EmbodimentTag.GR1,
    model_path="nvidia/GR00T-N1.6-3B",
    device="cuda:0"
)

# Create and run server
server = PolicyServer(
    policy=policy,
    host="0.0.0.0",  # Listen on all interfaces
    port=5555
)

print("Starting policy server on port 5555...")
server.run()
Or use the provided script:
uv run python gr00t/eval/run_gr00t_server.py \
    --embodiment-tag GR1 \
    --model-path nvidia/GR00T-N1.6-3B \
    --device cuda:0 \
    --host 0.0.0.0 \
    --port 5555

PolicyClient

Class definition

gr00t/policy/server_client.py
class PolicyClient(BasePolicy):
    """Client for connecting to a PolicyServer.
    
    Implements the same BasePolicy interface but forwards requests to a remote server.
    """

Constructor

  • host (str, default: "localhost"): Hostname or IP address of the policy server
  • port (int, default: 5555): Port number of the policy server
  • api_token (str | None, default: None): Optional API token for authentication
  • strict (bool, default: True): Whether to enforce strict validation (passed to the remote policy)

Methods

ping

Check if the server is reachable.
def ping(self) -> bool
Returns:
  • is_alive (bool): True if the server responds to ping, False otherwise
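
Because ping returns a plain boolean, it is easy to poll while a server is still loading its model. The helper below is a minimal sketch under that assumption; wait_for_server, timeout_s, and interval_s are hypothetical names, not part of the GR00T API, and a fake ping function stands in for policy.ping so the example runs without a server.

```python
import time
from typing import Callable


def wait_for_server(
    ping: Callable[[], bool], timeout_s: float = 30.0, interval_s: float = 0.5
) -> bool:
    """Poll `ping` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if ping():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for `policy.ping`: fails twice, then succeeds.
attempts = {"count": 0}


def fake_ping() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_for_server(fake_ping, timeout_s=5.0, interval_s=0.01)
```

With a real client, pass policy.ping as the callable.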

get_action

Generate actions from observations via the remote server.
def get_action(
    self, 
    observation: dict[str, Any], 
    options: dict[str, Any] | None = None
) -> tuple[dict[str, Any], dict[str, Any]]
Parameters:
  • observation (dict[str, Any], required): Observation dictionary (format depends on the server’s policy type)
  • options (dict[str, Any] | None): Optional parameters
Returns:
  • actions (dict[str, np.ndarray]): Dictionary of action arrays
  • info (dict[str, Any]): Additional information

reset

Reset the remote policy.
def reset(self, options: dict[str, Any] | None = None) -> dict[str, Any]
Parameters:
  • options (dict[str, Any] | None): Optional reset parameters (e.g., {"episode_index": 5} for ReplayPolicy)
Returns:
  • info (dict[str, Any]): Information dictionary

get_modality_config

Get modality configurations from the remote server.
def get_modality_config(self) -> dict[str, ModalityConfig]
Returns:
  • modality_configs (dict[str, ModalityConfig]): Modality configurations

send_request

Send a custom request to the server.
def send_request(self, endpoint: str, data: Any = None) -> Any
Parameters:
  • endpoint (str, required): The endpoint name (e.g., "get_action", "ping")
  • data (Any): Data to send with the request
Returns:
  • response (Any): Response from the server

Usage example

from gr00t.policy.server_client import PolicyClient
import numpy as np

# Connect to remote policy server
policy = PolicyClient(host="10.0.0.5", port=5555)

# Verify connection
if not policy.ping():
    raise RuntimeError("Cannot connect to policy server!")

# Get modality configuration
modality_configs = policy.get_modality_config()
print(f"Video modalities: {modality_configs['video'].modality_keys}")

# Prepare observation
observation = {
    "video": {
        "head_camera": np.zeros((1, 1, 224, 224, 3), dtype=np.uint8),
    },
    "state": {
        "joint_positions": np.zeros((1, 1, 14), dtype=np.float32),
    },
    "language": {
        "task": [["pick up the apple"]]
    }
}

# Generate action via network
action, info = policy.get_action(observation)
print(f"Received action with shape: {action['joint_positions'].shape}")

MsgSerializer

Internal serialization class for encoding/decoding messages over the network.
gr00t/policy/server_client.py
class MsgSerializer:
    """Handles serialization of numpy arrays and ModalityConfig objects using msgpack."""
    
    @staticmethod
    def to_bytes(data: Any) -> bytes:
        """Serialize data to bytes."""
        
    @staticmethod
    def from_bytes(data: bytes) -> Any:
        """Deserialize bytes to data."""
The serializer automatically handles numpy arrays and ModalityConfig objects. Custom classes can be supported by extending encode_custom_classes and decode_custom_classes.
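
The encode/decode hook pattern described above can be sketched as follows. This is an illustration only, not the MsgSerializer code: it uses the standard-library json module as a stand-in for msgpack (both accept a default hook for encoding and an object_hook for decoding), and the "__ndarray__" tag and the function names encode_custom and decode_custom are hypothetical.

```python
import base64
import io
import json
from typing import Any

import numpy as np

NDARRAY_TAG = "__ndarray__"  # hypothetical marker key, not taken from GR00T


def encode_custom(obj: Any) -> Any:
    """`default` hook: turn numpy arrays into a tagged, JSON-safe dict."""
    if isinstance(obj, np.ndarray):
        buf = io.BytesIO()
        np.save(buf, obj, allow_pickle=False)  # keeps dtype and shape
        return {NDARRAY_TAG: base64.b64encode(buf.getvalue()).decode("ascii")}
    raise TypeError(f"Cannot serialize object of type {type(obj).__name__}")


def decode_custom(obj: dict) -> Any:
    """`object_hook`: rebuild numpy arrays from tagged dicts."""
    if NDARRAY_TAG in obj:
        raw = base64.b64decode(obj[NDARRAY_TAG])
        return np.load(io.BytesIO(raw), allow_pickle=False)
    return obj


def to_bytes(data: Any) -> bytes:
    return json.dumps(data, default=encode_custom).encode("utf-8")


def from_bytes(data: bytes) -> Any:
    return json.loads(data.decode("utf-8"), object_hook=decode_custom)


payload = {"state": {"joint_positions": np.arange(6, dtype=np.float32).reshape(1, 2, 3)}}
restored = from_bytes(to_bytes(payload))
```

Supporting another class would follow the same shape: add a tagged branch to the encode hook and a matching branch to the decode hook.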

Network protocol

The server uses ZeroMQ (REP socket) with msgpack serialization:
  1. Client sends request: {"endpoint": "get_action", "data": {...}}
  2. Server processes request and calls the appropriate handler
  3. Server sends response: {"status": "success", "data": {...}} or {"status": "error", "error": "..."}
  4. Client receives and deserializes response
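
The four steps above can be traced in a small in-process sketch. It makes several assumptions: json stands in for msgpack, no ZeroMQ socket is involved (bytes are passed directly between functions), and the names encode, decode, serve_once, and the handler table are hypothetical. Only the request and response envelopes follow the shapes listed above.

```python
import json
from typing import Any, Callable


def encode(message: dict[str, Any]) -> bytes:
    # Stand-in for msgpack serialization.
    return json.dumps(message).encode("utf-8")


def decode(raw: bytes) -> dict[str, Any]:
    return json.loads(raw.decode("utf-8"))


def serve_once(raw_request: bytes, endpoints: dict[str, Callable[[Any], Any]]) -> bytes:
    """Steps 2 and 3: decode the request, call the handler, encode the response."""
    request = decode(raw_request)
    try:
        handler = endpoints[request["endpoint"]]
        response = {"status": "success", "data": handler(request.get("data"))}
    except Exception as exc:
        response = {"status": "error", "error": str(exc)}
    return encode(response)


# Hypothetical handler table; a real server would dispatch to the policy.
endpoints = {"get_action": lambda obs: {"joint_positions": [0.0] * len(obs["state"])}}

# Step 1: the client builds and sends the request.
raw_request = encode({"endpoint": "get_action", "data": {"state": [0.1, 0.2]}})
# Step 4: the client decodes the response.
ok = decode(serve_once(raw_request, endpoints))
bad = decode(serve_once(encode({"endpoint": "missing"}), endpoints))
```

In this sketch an unknown endpoint takes the same error path as a failing handler, so the client always receives a well-formed envelope.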

Error handling

Server-side errors are caught and returned to the client:
# Server catches exceptions and returns error response
try:
    result = handler(data)
    response = {"status": "success", "data": result}
except Exception as e:
    response = {"status": "error", "error": str(e)}
Client-side:
response = policy.send_request("get_action", observation)
if response["status"] == "error":
    raise RuntimeError(f"Server error: {response['error']}")

Performance considerations

Network latency: Each get_action call requires a round-trip to the server. For high-frequency control, consider:
  • Running the server on the same local network
  • Using action chunking to reduce query frequency
  • Batching multiple queries if your environment supports it
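
Action chunking, the second option above, can be sketched as a small buffer that contacts the server only when it runs out of actions. This is a minimal illustration under stated assumptions: ChunkedActionSource and fake_query are hypothetical names, and the policy is assumed to return several time steps per call. How a real action dictionary is split along its time dimension depends on the policy.

```python
from collections import deque
from typing import Any, Callable


class ChunkedActionSource:
    """Serve one action per control step, querying only when the buffer is empty."""

    def __init__(self, query_chunk: Callable[[Any], list]) -> None:
        self._query_chunk = query_chunk
        self._buffer: deque = deque()
        self.num_queries = 0

    def next_action(self, observation: Any) -> Any:
        if not self._buffer:
            # One network round trip refills the whole buffer.
            self._buffer.extend(self._query_chunk(observation))
            self.num_queries += 1
        return self._buffer.popleft()


# Stand-in for a remote call that returns a chunk of 4 actions.
def fake_query(observation: int) -> list:
    return [observation * 10 + i for i in range(4)]


source = ChunkedActionSource(fake_query)
actions = [source.next_action(step) for step in range(8)]
```

Here eight control steps cost two queries instead of eight. The trade-off is that the later actions in each chunk were computed from an older observation.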
Multi-GPU serving: You can run multiple policy servers on different GPUs and load-balance clients across them for higher throughput.
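
One simple way to spread clients across several servers is round-robin assignment. The sketch below assumes each server was started separately on its own GPU and port; the addresses, client names, and the assign_round_robin helper are hypothetical.

```python
from itertools import cycle

# Hypothetical server addresses, one policy server per GPU.
servers = [("10.0.0.5", 5555), ("10.0.0.5", 5556), ("10.0.0.6", 5555)]


def assign_round_robin(client_ids, servers):
    """Map each client to a (host, port) pair, cycling through the servers."""
    pool = cycle(servers)
    return {client_id: next(pool) for client_id in client_ids}


assignment = assign_round_robin(["robot_a", "robot_b", "robot_c", "robot_d"], servers)
```

Each client would then connect with PolicyClient(host=host, port=port) using its assigned pair.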

See also

Server-client guide

Complete deployment guide

run_gr00t_server.py

Server launch script reference

Gr00tPolicy

Core policy class

Policy API guide

Using the policy API