

Worker nodes are SSH-accessible machines running QEMU/KVM and Open vSwitch where virtual machines are actually launched. The orchestrator maintains a registry of worker specs collected at startup and uses a round-robin index to spread VMs evenly across all available workers. The compute and network roles can overlap: 10.0.10.3 runs VMs and also acts as the dedicated network node for VLAN gateway and DHCP operations.

Worker discovery

At startup, WorkerDiscovery.discover_all() iterates the workers_list from database.yaml and SSHs to each node to collect hardware specs:
| Metric | Command | Field |
| --- | --- | --- |
| CPU cores | nproc | max_cores |
| RAM (GB) | free -g | max_ram_gb |
| Disk space | df /tmp | max_disk_gb |
ssh <user>@<worker-ip> nproc
ssh <user>@<worker-ip> 'free -g | grep Mem | awk "{print \$2}"'
ssh <user>@<worker-ip> 'df /tmp -B G | tail -1 | awk "{print \$2}" | tr -d G'
Workers that are unreachable (SSH timeout or non-zero exit) are silently skipped: they do not appear in the workers registry and receive no VMs during that session.

The default worker list in database.yaml is:
workers_list: ["10.0.10.1", "10.0.10.2", "10.0.10.3"]
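
The discovery loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual WorkerDiscovery code: the `ssh_run` helper, the `PROBES` mapping, the default user name, the timeout, and the decision to skip a worker whose output is not a plain integer are all assumptions made for the example. The SSH call is passed in as a parameter so the loop can be exercised without real hosts.

```python
import subprocess
from typing import Callable, Dict, List, Optional

# Probe commands taken from the section above, keyed by the field they fill.
PROBES: Dict[str, str] = {
    "max_cores": "nproc",
    "max_ram_gb": "free -g | grep Mem | awk '{print $2}'",
    "max_disk_gb": "df /tmp -B G | tail -1 | awk '{print $2}' | tr -d G",
}


def ssh_run(ip: str, command: str, user: str = "ubuntu", timeout: int = 5) -> Optional[str]:
    """Run one command over SSH and return stdout, or None on failure.

    The user name and timeout are placeholders (assumptions), not values
    taken from the project.
    """
    try:
        result = subprocess.run(
            ["ssh", "-o", f"ConnectTimeout={timeout}", f"{user}@{ip}", command],
            capture_output=True,
            text=True,
            timeout=timeout + 5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def discover_all(
    workers_list: List[str],
    run: Callable[[str, str], Optional[str]] = ssh_run,
) -> Dict[str, Dict[str, object]]:
    """Probe every worker and keep only those that answer all probes."""
    workers: Dict[str, Dict[str, object]] = {}
    for ip in workers_list:
        specs: Dict[str, object] = {"ip": ip}
        for field, command in PROBES.items():
            output = run(ip, command)
            # Assumption: an unparsable answer is treated like a failed probe.
            if output is None or not output.isdigit():
                break
            specs[field] = int(output)
        else:
            # Only reached when no probe failed, so the worker is registered.
            workers[ip] = specs
    return workers
```

Passing a stub in place of `ssh_run` (for example a function that returns `None` for one IP) makes it easy to confirm that an unreachable worker simply drops out of the resulting registry.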

Worker specs schema

After discovery, each worker is stored under the workers key in database.yaml:
workers:
  10.0.10.1:
    ip: 10.0.10.1
    max_vms: 10
    max_cores: 2
    max_ram_gb: 1
    max_disk_gb: 500
    used_cores: 0
    used_ram_gb: 0
    used_disk_gb: 0
All three workers share the same schema. The used_* fields start at 0 after each discovery run.
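
As a rough illustration of how a probed result might be turned into a record with this schema, the sketch below builds the dictionary in plain Python. The helper names `make_worker_record` and `build_workers_section` are hypothetical, and the source does not say where `max_vms: 10` comes from, so it is treated here as a fixed default purely for the example.

```python
from typing import Dict


def make_worker_record(
    ip: str,
    max_cores: int,
    max_ram_gb: int,
    max_disk_gb: int,
    max_vms: int = 10,  # assumption: fixed default, origin not documented
) -> Dict[str, object]:
    """Return one worker entry with the same keys as the schema above."""
    return {
        "ip": ip,
        "max_vms": max_vms,
        "max_cores": max_cores,
        "max_ram_gb": max_ram_gb,
        "max_disk_gb": max_disk_gb,
        # Usage counters always start at zero after a discovery run.
        "used_cores": 0,
        "used_ram_gb": 0,
        "used_disk_gb": 0,
    }


def build_workers_section(probed: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, object]]:
    """Map each probed IP to a full record, keyed by IP like the workers key."""
    return {ip: make_worker_record(ip, **specs) for ip, specs in probed.items()}


# Example input shaped like the probe results for a single worker.
section = build_workers_section(
    {"10.0.10.1": {"max_cores": 2, "max_ram_gb": 1, "max_disk_gb": 500}}
)
```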

Round-robin scheduling

OrchestratorAPI.get_next_worker() selects the target worker for each new VM using a simple round-robin index:
def get_next_worker(self):
    worker = self.workers[self.round_robin_idx % len(self.workers)]
    self.round_robin_idx += 1
    return worker
self.workers is loaded from workers_list in the database at initialization. With three workers and sequential VM additions, the assignment pattern is 10.0.10.1 → 10.0.10.2 → 10.0.10.3 → 10.0.10.1 → …. The index is not persisted between sessions, so it resets to 0 on each restart.
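
The assignment pattern and the reset on restart can be reproduced with a small standalone sketch. The `RoundRobinScheduler` class is a hypothetical stand-in for OrchestratorAPI that keeps only the two attributes the method uses; creating a second instance plays the role of a manager restart.

```python
from typing import List


class RoundRobinScheduler:
    """Hypothetical stand-in holding only the round-robin state."""

    def __init__(self, workers: List[str]) -> None:
        self.workers = list(workers)
        self.round_robin_idx = 0  # kept in memory only, never persisted

    def get_next_worker(self) -> str:
        # Same logic as the method shown above.
        worker = self.workers[self.round_robin_idx % len(self.workers)]
        self.round_robin_idx += 1
        return worker


workers_list = ["10.0.10.1", "10.0.10.2", "10.0.10.3"]

scheduler = RoundRobinScheduler(workers_list)
assignments = [scheduler.get_next_worker() for _ in range(4)]
# assignments is ['10.0.10.1', '10.0.10.2', '10.0.10.3', '10.0.10.1']

# A fresh instance models a restart: the index starts again at 0,
# so the first VM of the new session lands on the first worker.
restarted = RoundRobinScheduler(workers_list)
first_after_restart = restarted.get_next_worker()
```

Because the index restarts at 0, the first worker in the list receives the first VM of every session regardless of how many VMs it already hosts.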

Network node

Worker 10.0.10.3 serves a dual role: it accepts QEMU VMs from the round-robin scheduler just like the other workers, and it is also the network node targeted by VLANManager for all OVS gateway and DHCP namespace operations. This means:
  • All gw_vlan{id} OVS ports are created on 10.0.10.3.
  • All ns-dhcp-vlan{id} network namespaces and dnsmasq processes run on 10.0.10.3.
  • IP forwarding and MASQUERADE rules for VLAN 400 internet access are applied on 10.0.10.3.
The network node IP is hardcoded as the default in VLANManager:
class VLANManager:
    def __init__(self, remote_executor, network_node_ip="10.0.10.3"):
Resource accounting is approximate. The system tracks used_cores, used_ram_gb, and used_disk_gb per worker in database.yaml, but these counters are populated by WorkerDiscovery at startup (set to 0) and incremented as VMs are added. They are not decremented when a VM or slice is deleted. After several create/delete cycles the counters will diverge from actual usage. Restart the manager to reset them via a fresh discovery run.
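
The drift can be illustrated with a toy model. Everything in this sketch (the `add_vm`, `delete_vm`, and `actual_usage` helpers and the VM sizes) is hypothetical; it only mimics the documented behaviour of incrementing counters on creation and leaving them untouched on deletion.

```python
from typing import Dict, Tuple

FIELDS = ("used_cores", "used_ram_gb", "used_disk_gb")


def add_vm(worker: Dict[str, int], running: Dict[str, Tuple[int, int, int]],
           name: str, cores: int, ram_gb: int, disk_gb: int) -> None:
    """Record the VM and increment the worker's counters."""
    running[name] = (cores, ram_gb, disk_gb)
    for field, amount in zip(FIELDS, (cores, ram_gb, disk_gb)):
        worker[field] += amount


def delete_vm(worker: Dict[str, int], running: Dict[str, Tuple[int, int, int]],
              name: str) -> None:
    """Remove the VM; the counters are deliberately left unchanged."""
    running.pop(name)


def actual_usage(running: Dict[str, Tuple[int, int, int]]) -> Dict[str, int]:
    """Sum the resources of the VMs that are really still running."""
    totals = [sum(vm[i] for vm in running.values()) for i in range(len(FIELDS))]
    return dict(zip(FIELDS, totals))


worker = {field: 0 for field in FIELDS}  # state right after discovery
running: Dict[str, Tuple[int, int, int]] = {}

add_vm(worker, running, "vm1", cores=1, ram_gb=1, disk_gb=10)
delete_vm(worker, running, "vm1")
add_vm(worker, running, "vm2", cores=1, ram_gb=1, disk_gb=10)

tracked = dict(worker)          # what the counters claim
real = actual_usage(running)    # what is actually in use
```

After one create/delete/create cycle the tracked counters report twice the real usage, which is the divergence that a restart and fresh discovery run clears.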
To add or remove workers, edit the workers_list key in database.yaml before starting the manager. Workers added to the list are probed during discover_all() at the next startup and, if reachable, begin receiving VMs immediately. Removing a worker from the list prevents new VMs from being scheduled to it but does not affect VMs already running on that host.
