Installation

Pre-Release Status

WorldStereo code and model weights are not yet publicly available. This page describes the planned installation process for when the release occurs.

The WorldStereo team is preparing both for public release. Watch the GitHub repository for updates on the release timeline.

Stay Updated

GitHub Repository

Watch the repository for release announcements

arXiv Paper

Read the research paper for technical details

Expected System Requirements

Because WorldStereo is built on a Video Diffusion Model (VDM) with geometric memory modules, the following requirements are anticipated:

Hardware Requirements

These are estimated requirements. Actual requirements will be confirmed upon code release.

GPU: NVIDIA GPU with at least 24GB VRAM (e.g., RTX 3090, RTX 4090, or A100)
- Video diffusion models are computationally intensive
- Geometric memory modules require additional VRAM for point cloud storage
RAM: At least 32GB system memory recommended
Storage: 50GB+ free space for model weights and dependencies
CUDA: CUDA 11.8 or higher
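
As a rough pre-flight check, the estimated minimums above can be compared against the local machine. The sketch below is not part of WorldStereo; the function names are illustrative, the thresholds are the unconfirmed estimates from this page, and the 5% tolerance is an assumption that accounts for hardware reporting slightly less than its nominal capacity (a "24GB" card often reports a little under 24 GiB).

```python
import os
import shutil
import subprocess

# Estimated minimums from the list above; not confirmed by the authors.
MIN_VRAM_GB = 24
MIN_RAM_GB = 32
MIN_DISK_GB = 50


def gpu_vram_gb():
    """Total VRAM of the first GPU in GiB, or None if nvidia-smi cannot be queried."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        return float(out.splitlines()[0]) / 1024  # nvidia-smi reports MiB
    except (OSError, subprocess.SubprocessError, ValueError, IndexError):
        return None


def system_ram_gb():
    """Physical RAM in GiB, or None where os.sysconf is unavailable (e.g. Windows)."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


def free_disk_gb(path="."):
    """Free space in GiB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def check(measured, minimum, tolerance=0.05):
    """Return 'ok', 'low', or 'unknown' (when the value could not be read)."""
    if measured is None:
        return "unknown"
    return "ok" if measured >= minimum * (1 - tolerance) else "low"


def main():
    rows = [
        ("GPU VRAM", gpu_vram_gb(), MIN_VRAM_GB),
        ("System RAM", system_ram_gb(), MIN_RAM_GB),
        ("Free disk", free_disk_gb(), MIN_DISK_GB),
    ]
    for label, measured, minimum in rows:
        shown = "n/a" if measured is None else f"{measured:.1f} GB"
        print(f"{label:<11}{shown:>12}  (want >= {minimum} GB): "
              f"{check(measured, minimum)}")


if __name__ == "__main__":
    main()
```

The script uses only the standard library, so it can run before PyTorch or any other dependency is installed.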

Software Requirements

Python: 3.9 or higher
PyTorch: 2.0+ with CUDA support
Additional dependencies (expected):
- Diffusion model libraries (likely diffusers or custom implementation)
- 3D processing libraries (e.g., PyTorch3D, Open3D for point cloud operations)
- Video processing utilities (e.g., OpenCV, imageio)
- Camera pose handling libraries
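
The software minimums can be checked in a similar way once an environment exists. This is a minimal sketch using only the standard library; the list of optional distribution names is a guess based on the examples above, not a confirmed dependency list, and the helper names are illustrative.

```python
import sys
from importlib import metadata

# Minimums stated above. The optional names are guesses, not confirmed.
MIN_PYTHON = (3, 9)
REQUIRED = {"torch": (2, 0)}
OPTIONAL = ["diffusers", "open3d", "opencv-python", "imageio"]


def parse_version(text):
    """Turn '2.1.0+cu118' into (2, 1, 0), dropping local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(dist_name):
    """Version string of an installed distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version_text, minimum):
    """True if the package is installed and its version is at least `minimum`."""
    return version_text is not None and parse_version(version_text) >= minimum


def main():
    py_ok = sys.version_info >= MIN_PYTHON
    print(f"Python {sys.version.split()[0]}: {'ok' if py_ok else 'too old'}")
    for name, minimum in REQUIRED.items():
        found = installed_version(name)
        status = "ok" if meets_minimum(found, minimum) else "missing or too old"
        print(f"{name} {found}: {status}")
    for name in OPTIONAL:
        print(f"{name}: {installed_version(name) or 'not installed'}")


if __name__ == "__main__":
    main()
```

Reading versions through `importlib.metadata` avoids importing the packages themselves, so the check stays fast and does not fail on a broken install.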

Planned Installation Steps

The following installation procedure is provisional and will be updated when the code is released.

Clone the Repository

Once released, clone the WorldStereo repository:

git clone https://github.com/FuchengSu/WorldStereo.git
cd WorldStereo

Create Virtual Environment

Set up a Python virtual environment to isolate dependencies:

python -m venv worldstereo-env
source worldstereo-env/bin/activate  # On Windows: worldstereo-env\Scripts\activate

Install Dependencies

Install required Python packages:

pip install -r requirements.txt

This will likely include PyTorch, diffusion libraries, and 3D processing tools.

Download Model Weights

Download pre-trained model weights:

# Method will be provided upon release
# Possibly through Hugging Face Hub or direct download
python download_models.py

Expected model components:

Base VDM (Video Diffusion Model) backbone
Global-geometric memory module weights
Spatial-stereo memory module weights
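
After downloading, it may be worth confirming that all three components are present before running anything. The sketch below assumes one file per component; the file names and the `weights` directory are hypothetical, since the real checkpoint layout has not been published.

```python
import tempfile
from pathlib import Path

# Hypothetical file names: the real checkpoint layout is not yet known.
EXPECTED_COMPONENTS = {
    "VDM backbone": "vdm_backbone.safetensors",
    "Global-geometric memory": "global_geometric_memory.safetensors",
    "Spatial-stereo memory": "spatial_stereo_memory.safetensors",
}


def missing_components(weights_dir):
    """Return the names of components whose file is absent or empty."""
    root = Path(weights_dir)
    missing = []
    for name, filename in EXPECTED_COMPONENTS.items():
        path = root / filename
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Demonstration on a temporary directory holding only one fake checkpoint.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "vdm_backbone.safetensors").write_bytes(b"\x00")
        print("Missing:", missing_components(tmp))
```

An empty file is treated as missing because an interrupted download can leave a zero-byte placeholder behind.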

Verify Installation

Test the installation:

python test_installation.py

This should verify:

CUDA availability and GPU detection
Model weights loaded correctly
All dependencies installed properly
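
Until `test_installation.py` is published, the CUDA and GPU part of this check can be approximated by hand. The following is a sketch of what such a check might look like, not the project's actual script; `cuda_report` is an illustrative name, and the function degrades gracefully when PyTorch is not installed.

```python
def cuda_report():
    """Collect CUDA facts from PyTorch as a dict; does not raise if torch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["gpu_vram_gb"] = round(props.total_memory / 1024**3, 1)
    return info


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is false while `built_with_cuda` shows a version, the usual cause is a missing or outdated NVIDIA driver rather than the PyTorch install itself.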

Docker Installation (Expected)

A Docker container may be provided for easier setup:

# Expected Docker usage (tentative)
docker pull worldstereo/worldstereo:latest
docker run --gpus all -it worldstereo/worldstereo:latest

If provided, a Docker image would simplify dependency management and give consistent environments across different systems.

Model Architecture Components

When released, WorldStereo will include:

VDM Backbone

The foundation Video Diffusion Model trained with distribution matching distillation for efficient generation.

Control Branch

The flexible control branch architecture that integrates geometric memory modules without requiring joint training of the entire system.

Geometric Memory Modules

Global-geometric memory: Point cloud-based structural priors
Spatial-stereo memory: 3D correspondence-based attention constraints

Troubleshooting (Anticipated)

GPU Memory Issues

If you encounter out-of-memory errors:

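Reduce batch size, video resolution, or the number of generated frames
Use gradient checkpointing if available (relevant to training or fine-tuning, not inference)
Run the model in reduced precision (e.g., fp16/bf16) or, if supported, with quantized weights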
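
To see why lower precision helps, the memory taken by the weights alone can be estimated from the parameter count and the bytes per parameter. The parameter count below is a hypothetical placeholder, since WorldStereo's model size is not stated on this page, and activations, the latent video tensor, and point cloud storage add to this figure.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Memory in GiB needed to hold `num_params` weights stored as `dtype`."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    # Hypothetical model size, for illustration only.
    num_params = 2_000_000_000
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(num_params, dtype):.2f} GiB")
```

Moving from fp32 to fp16 or bf16 halves the weight footprint, and 8-bit quantization halves it again, at some potential cost in output quality.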

CUDA Compatibility

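Confirm that PyTorch was built with CUDA support and can see your GPU. The first command should print True; the second prints the CUDA version PyTorch was built against (None for a CPU-only build), which your NVIDIA driver must support: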

python -c "import torch; print(torch.cuda.is_available())"
python -c "import torch; print(torch.version.cuda)"

Getting Help

Once the code is released:

Issues: Report bugs on GitHub Issues
Discussions: Join discussions on the GitHub repository
Documentation: Check the official documentation for detailed guides

Next Steps

While waiting for the release:

Read the Paper

Understand the technical details and methodology

Quick Start Guide

Preview the planned usage workflow
