Cloud Repositorio discovers and registers worker nodes at startup. The manager reads the list of workers from workers_list in database.yaml and connects to each host over SSH to collect its hardware specs before the CLI becomes available. Workers that cannot be reached are skipped silently; only reachable workers are registered in the workers section of the database.

workers_list

workers_list is the authoritative list of compute worker IP addresses:
workers_list: ["10.0.10.1", "10.0.10.2", "10.0.10.3"]
At startup, WorkerDiscovery.discover_all() iterates over this list and tries to open an SSH connection to each IP. Each successful connection adds an entry to the workers dict containing the collected hardware specs. To add a worker, append its IP to workers_list; WorkerDiscovery will attempt to reach it on the next startup.
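The loop described above can be sketched as follows. This is a minimal illustration, not the project's actual WorkerDiscovery code: the probe callable, the spec keys (cpu, ram_gb, disk_gb), and fake_probe are hypothetical names introduced here so the example runs without SSH.

```python
from typing import Callable, Dict, List, Optional

Specs = Dict[str, int]


def discover_all(workers_list: List[str],
                 probe: Callable[[str], Optional[Specs]]) -> Dict[str, Specs]:
    """Return {ip: specs} for every worker that probe() could reach.

    `probe` is a hypothetical stand-in for the SSH step: it returns a specs
    dict on success and returns None or raises OSError when the host is
    unreachable.
    """
    workers: Dict[str, Specs] = {}
    for ip in workers_list:
        try:
            specs = probe(ip)
        except OSError:  # TimeoutError and ConnectionError are subclasses
            specs = None
        if specs is None:
            continue  # unreachable workers are skipped, no error is raised
        workers[ip] = specs
    return workers


def fake_probe(ip: str) -> Optional[Specs]:
    """Pretend 10.0.10.2 is down so the skip path is exercised."""
    if ip == "10.0.10.2":
        raise TimeoutError("ssh timed out")
    return {"cpu": 4, "ram_gb": 8, "disk_gb": 100}


workers = discover_all(["10.0.10.1", "10.0.10.2", "10.0.10.3"], fake_probe)
```

With the fake probe above, workers contains entries for 10.0.10.1 and 10.0.10.3 only, which mirrors how an unreachable host never appears in the workers section.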

SSH requirements

WorkerDiscovery connects as the ubuntu user (the default value of RemoteExecutor's remote_user parameter). SSH must work without a password prompt, so key-based authentication is required. Discovery runs three commands on each worker:
Metric      Command
CPU cores   nproc
RAM (GB)    free -g | grep Mem | awk '{print $2}'
Disk (GB)   df /tmp -B G | tail -1 | awk '{print $2}' | tr -d G
If a command fails or the connection times out, discovery falls back to default values: 2 cores, 1 GB RAM, and 500 GB disk.
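One way to picture the fallback behavior is a parser that converts each command's output to an integer and substitutes the default when the output is missing or malformed. The commands and default values come from this page; the function names and dictionary keys are hypothetical, and the real implementation may differ.

```python
from typing import Dict, Optional

# Commands and fallback values as listed in the table and text above.
COMMANDS = {
    "cpu": "nproc",
    "ram_gb": "free -g | grep Mem | awk '{print $2}'",
    "disk_gb": "df /tmp -B G | tail -1 | awk '{print $2}' | tr -d G",
}
FALLBACK = {"cpu": 2, "ram_gb": 1, "disk_gb": 500}


def parse_metric(raw: Optional[str], fallback: int) -> int:
    """Parse command stdout as an integer; use the fallback on any failure."""
    try:
        return int(raw.strip())
    except (AttributeError, ValueError):  # raw is None, empty, or not a number
        return fallback


def collect_specs(outputs: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Build a specs dict from raw outputs keyed like COMMANDS."""
    return {key: parse_metric(outputs.get(key), FALLBACK[key]) for key in COMMANDS}


# nproc succeeded, free printed nothing, df never returned.
specs = collect_specs({"cpu": "8\n", "ram_gb": "", "disk_gb": None})
```

In this sketch the fallback is applied per metric, which is an assumption; the source does not say whether one failed command resets all three values.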
Unreachable workers are silently skipped during discovery. No error is raised and the manager starts normally. Check the application logs to identify workers that failed discovery.
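Because skipped workers produce no error, it can help to verify passwordless SSH before starting the manager. The sketch below is a hypothetical pre-flight helper, not part of Cloud Repositorio. It relies on the standard OpenSSH options BatchMode=yes (fail instead of prompting for a password) and ConnectTimeout, and assumes the ubuntu user described above.

```python
import subprocess
from typing import List


def ssh_check_command(ip: str, user: str = "ubuntu", timeout: int = 5) -> List[str]:
    """Build an ssh invocation that fails fast instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never ask for a password
        "-o", f"ConnectTimeout={timeout}",   # give up on unreachable hosts
        f"{user}@{ip}",
        "true",                              # no-op remote command
    ]


def can_login(ip: str, user: str = "ubuntu", timeout: int = 5) -> bool:
    """Return True if key-based SSH login to the worker succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(ip, user, timeout),
            capture_output=True,
            timeout=timeout + 5,  # hard stop in case ssh itself hangs
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Example usage (requires network access, so it is left as a comment):
# for ip in ["10.0.10.1", "10.0.10.2", "10.0.10.3"]:
#     print(ip, "ok" if can_login(ip) else "UNREACHABLE")
```

Any worker reported as unreachable here would most likely also be skipped by discovery.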

Required software on workers

Each compute worker (e.g., 10.0.10.1, 10.0.10.2) must have the following installed and configured:
  • qemu-system-x86_64 — QEMU/KVM hypervisor for running virtual machines
  • Open vSwitch (ovs-vsctl), with a br-int bridge already created (ovs-vsctl add-br br-int)
  • ip tuntap support — for creating TAP interfaces attached to OVS ports
  • Base VM images in /tmp/vm_images/: cirros-0.6.2-x86_64-disk.img and/or focal-server-cloudimg-amd64.img (missing images are copied over automatically via SCP, but pre-staging them speeds up the first deployment)

Network node requirements (10.0.10.3)

The designated network node (10.0.10.3 by default) runs DHCP and NAT services and requires additional software:
  • dnsmasq — for per-VLAN DHCP namespaces
  • iptables with MASQUERADE support — for internet access via NAT
  • ip netns support — for isolated DHCP network namespaces per VLAN
  • Open vSwitch with a br-int bridge — same as compute workers
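The software requirements for both roles can be expressed as a table of shell checks to run on each host, for example over SSH. This is a hypothetical pre-flight sketch, not something the project ships. The check commands are standard (command -v, ovs-vsctl br-exists, test -c /dev/net/tun, ip netns list), but mapping each requirement to one command is an assumption, and ovs-vsctl may need root privileges on your hosts.

```python
from typing import Dict

# Each entry maps a requirement to a shell command that exits 0 when it is met.
COMMON_CHECKS = {
    "ovs-vsctl": "command -v ovs-vsctl",
    "br-int bridge": "ovs-vsctl br-exists br-int",
}
COMPUTE_CHECKS = {
    "qemu": "command -v qemu-system-x86_64",
    "tun/tap device": "test -c /dev/net/tun",
}
NETWORK_CHECKS = {
    "dnsmasq": "command -v dnsmasq",
    "iptables": "command -v iptables",
    "network namespaces": "ip netns list",
}


def checks_for(ip: str, network_node: str = "10.0.10.3") -> Dict[str, str]:
    """Return the checks that apply to a host, based on its role."""
    checks = dict(COMMON_CHECKS)
    checks.update(NETWORK_CHECKS if ip == network_node else COMPUTE_CHECKS)
    return checks


compute = checks_for("10.0.10.1")
network = checks_for("10.0.10.3")
```

Running each command on the target host and collecting the non-zero exit codes gives a quick list of what is missing. If the network node also hosts VMs in your deployment, merge COMPUTE_CHECKS into its set as well.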

Removing a worker

To remove a worker, delete its IP from workers_list in database.yaml and restart the manager. The workers dict entry for that IP will no longer be refreshed. Existing VMs assigned to the removed worker remain in the database but cannot be started, stopped, or deleted until the worker is registered and reachable again.
Cloud Repositorio does not support live migration. Removing a worker that has running VMs leaves those VMs orphaned in the database — they appear in slice listings but cannot be controlled. Clean up VMs before removing a worker from the list.
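A guard like the one below illustrates the "clean up first" rule. The VM record layout (dicts with name and worker keys) is purely hypothetical, since this page does not describe how VMs are stored in database.yaml. Only the workers_list key is taken from the source.

```python
from typing import Dict, List


def vms_on_worker(vms: List[Dict[str, str]], worker_ip: str) -> List[str]:
    """Names of VMs placed on the given worker (hypothetical record layout)."""
    return [vm["name"] for vm in vms if vm.get("worker") == worker_ip]


def remove_worker(config: Dict[str, list], vms: List[Dict[str, str]],
                  worker_ip: str) -> Dict[str, list]:
    """Return a copy of config without worker_ip, refusing if VMs remain."""
    blocking = vms_on_worker(vms, worker_ip)
    if blocking:
        raise RuntimeError(
            f"{worker_ip} still hosts VMs: {', '.join(blocking)}; delete them first"
        )
    updated = dict(config)
    updated["workers_list"] = [ip for ip in config["workers_list"] if ip != worker_ip]
    return updated


config = {"workers_list": ["10.0.10.1", "10.0.10.2", "10.0.10.3"]}
vms = [{"name": "vm1", "worker": "10.0.10.1"}]  # made-up example record

updated = remove_worker(config, vms, "10.0.10.2")  # no VMs there, so allowed
```

Calling remove_worker(config, vms, "10.0.10.1") raises RuntimeError in this sketch because vm1 is still assigned to that worker.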
