This API documentation is provisional. WorldStereo code and model weights are not yet publicly released. The API described here represents the expected interface based on the research framework.

Overview

The 3D Reconstruction Pipeline reconstructs consistent 3D scenes from WorldStereo-generated videos. Because these videos are generated with multi-view consistency and precise camera control, they can be reliably reconstructed into detailed 3D models.

Reconstruction Pipeline

reconstruct_scene()

Reconstruct a 3D scene from a generated video.
video_path
str
required
Path to the input video file generated by WorldStereo.
camera_trajectory
CameraTrajectory
required
Camera trajectory used during video generation. Required for accurate 3D reconstruction.
output_path
str
required
Path where reconstruction outputs will be saved.
method
str
default:"neural"
Reconstruction method:
  • neural: Neural implicit reconstruction (high quality)
  • mvs: Multi-view stereo (fast)
  • hybrid: Hybrid approach combining both methods
resolution
int
default:"512"
Reconstruction resolution. Higher values produce more detailed geometry but require more memory.
depth_estimation
bool
default:"true"
Enable depth estimation refinement using the spatial-stereo memory.
point_cloud_density
str
default:"medium"
Initial point cloud density: low, medium, or high.
Returns:
scene
ReconstructedScene
Reconstructed 3D scene object containing geometry, textures, and metadata.
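
Because the package is not yet released, the parameter list above cannot be run directly. The sketch below mirrors the documented parameters and defaults in a plain dataclass and rejects values outside the documented choices. The names ReconstructArgs, METHODS, and DENSITIES are hypothetical and are not part of WorldStereo; the real function may validate its arguments differently.

```python
from dataclasses import dataclass
from typing import Any

# Allowed values copied from the parameter list above.
METHODS = {"neural", "mvs", "hybrid"}
DENSITIES = {"low", "medium", "high"}


@dataclass(frozen=True)
class ReconstructArgs:
    """Hypothetical container mirroring the documented parameters and defaults."""

    video_path: str
    camera_trajectory: Any  # stands in for the CameraTrajectory type
    output_path: str
    method: str = "neural"
    resolution: int = 512
    depth_estimation: bool = True
    point_cloud_density: str = "medium"

    def __post_init__(self):
        if self.method not in METHODS:
            raise ValueError(f"method must be one of {sorted(METHODS)}, got {self.method!r}")
        if self.point_cloud_density not in DENSITIES:
            raise ValueError(
                f"point_cloud_density must be one of {sorted(DENSITIES)}, "
                f"got {self.point_cloud_density!r}"
            )
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.resolution, bool) or not isinstance(self.resolution, int):
            raise ValueError("resolution must be an integer")
        if self.resolution <= 0:
            raise ValueError("resolution must be positive")


# Only the required arguments are given, so the documented defaults apply.
args = ReconstructArgs(
    video_path="generated_video.mp4",
    camera_trajectory=None,  # placeholder; a real call would pass a trajectory
    output_path="./reconstruction",
)
```

Validating up front like this surfaces a misspelled method or density before any expensive reconstruction work starts.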

ReconstructedScene

Represents a reconstructed 3D scene.
mesh
Mesh
Reconstructed 3D mesh with vertices, faces, and normals.
point_cloud
PointCloud
Dense point cloud representation of the scene.
textures
Dict[str, Texture]
Texture maps for the reconstructed geometry:
  • albedo: Base color texture
  • normal: Normal map
  • roughness: Surface roughness map
cameras
List[Camera]
Camera poses and intrinsics used during reconstruction.
metadata
Dict
Reconstruction metadata including quality metrics and parameters.

Output Formats

Mesh Export

Export the reconstructed mesh in various formats with export_mesh().
scene
ReconstructedScene
required
Reconstructed scene to export.
output_path
str
required
Output file path.
format
str
default:"obj"
Export format:
  • obj: Wavefront OBJ (widely compatible)
  • ply: PLY format (preserves vertex colors)
  • gltf: glTF 2.0 (for web and real-time applications)
  • fbx: FBX format (for game engines and 3D software)
  • usd: Universal Scene Description
include_textures
bool
default:"true"
Include texture maps in the export.
texture_resolution
int
default:"2048"
Texture map resolution in pixels.
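
To make the obj option concrete, here is a minimal sketch of what a Wavefront OBJ file contains for untextured geometry: one "v" line per vertex and one "f" line per face, with 1-based vertex indices. The function write_obj is a hypothetical helper for illustration only; it is not the WorldStereo exporter and it ignores textures, normals, and materials.

```python
import io


def write_obj(vertices, faces, stream):
    """Write geometry as Wavefront OBJ text.

    vertices: sequence of (x, y, z) floats
    faces: sequence of tuples of 0-based vertex indices
    """
    for x, y, z in vertices:
        stream.write(f"v {x:.6f} {y:.6f} {z:.6f}\n")
    for face in faces:
        # OBJ indices are 1-based, so shift each 0-based index by one.
        stream.write("f " + " ".join(str(i + 1) for i in face) + "\n")


# A single triangle, written to an in-memory buffer instead of a file.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]
buf = io.StringIO()
write_obj(vertices, faces, buf)
obj_text = buf.getvalue()
```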

Point Cloud Export

Export the point cloud representation with export_point_cloud().
scene
ReconstructedScene
required
Reconstructed scene containing point cloud.
output_path
str
required
Output file path.
format
str
default:"ply"
Export format: ply, pcd, xyz, or las.
include_colors
bool
default:"true"
Include RGB colors for each point.
include_normals
bool
default:"true"
Include surface normals for each point.
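
The include_colors and include_normals flags decide which per-point properties end up in the file. As an illustration, the sketch below writes an ASCII PLY file whose header declares normals and colors only when they are supplied. The function write_ply is a hypothetical stand-in, not the WorldStereo exporter, and a real exporter would more likely write binary PLY.

```python
import io


def write_ply(points, stream, colors=None, normals=None):
    """Write points as ASCII PLY, with optional per-point normals and RGB colors."""
    lines = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
    ]
    if normals is not None:
        lines += ["property float nx", "property float ny", "property float nz"]
    if colors is not None:
        lines += ["property uchar red", "property uchar green", "property uchar blue"]
    lines.append("end_header")

    # Each data row lists values in the same order as the header properties.
    for i, (x, y, z) in enumerate(points):
        row = [f"{x:g}", f"{y:g}", f"{z:g}"]
        if normals is not None:
            row += [f"{c:g}" for c in normals[i]]
        if colors is not None:
            row += [str(int(c)) for c in colors[i]]
        lines.append(" ".join(row))
    stream.write("\n".join(lines) + "\n")


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
colors = [(255, 0, 0), (0, 255, 0)]
buf = io.StringIO()
write_ply(points, buf, colors=colors, normals=normals)
ply_text = buf.getvalue()
```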

Usage Examples

Basic Reconstruction

from worldstereo.reconstruction import reconstruct_scene
from worldstereo.camera import load_trajectory

# Load camera trajectory used during generation
trajectory = load_trajectory("trajectory.json")

# Reconstruct scene
scene = reconstruct_scene(
    video_path="generated_video.mp4",
    camera_trajectory=trajectory,
    output_path="./reconstruction",
    method="neural",
    resolution=512
)

print(f"Reconstructed {len(scene.mesh.vertices)} vertices")
print(f"Point cloud size: {len(scene.point_cloud.points)}")

Export to Multiple Formats

from worldstereo.reconstruction import export_mesh, export_point_cloud

# Export mesh as OBJ
export_mesh(
    scene,
    "scene.obj",
    format="obj",
    include_textures=True,
    texture_resolution=2048
)

# Export as glTF for web viewing
export_mesh(
    scene,
    "scene.gltf",
    format="gltf",
    include_textures=True
)

# Export point cloud
export_point_cloud(
    scene,
    "scene.ply",
    format="ply",
    include_colors=True,
    include_normals=True
)

Advanced Reconstruction Options

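from worldstereo.camera import load_trajectory
from worldstereo.reconstruction import reconstruct_scene, ReconstructionConfig

# Load the camera trajectory used during generation
trajectory = load_trajectory("trajectory.json")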

# Configure advanced reconstruction settings
config = ReconstructionConfig(
    method="hybrid",
    resolution=1024,
    depth_estimation=True,
    point_cloud_density="high",
    
    # Neural reconstruction settings
    neural_iterations=10000,
    neural_batch_size=8192,
    
    # MVS settings
    mvs_min_views=3,
    mvs_geometric_consistency=True,
    
    # Post-processing
    mesh_cleanup=True,
    outlier_removal=True,
    smoothing_iterations=5
)

scene = reconstruct_scene(
    video_path="video.mp4",
    camera_trajectory=trajectory,
    output_path="./reconstruction",
    config=config
)

Reconstruction Quality Metrics

Scene Quality Assessment

Access reconstruction quality metrics.
# Get quality metrics
metrics = scene.metadata["quality_metrics"]

print(f"Reconstruction completeness: {metrics['completeness']:.2%}")
print(f"Geometric consistency: {metrics['geometric_consistency']:.3f}")
print(f"Texture quality: {metrics['texture_quality']:.3f}")
print(f"Multi-view consistency: {metrics['multiview_consistency']:.3f}")
completeness
float
Fraction of the scene successfully reconstructed (0-1).
geometric_consistency
float
Geometric consistency score across views (0-1).
texture_quality
float
Texture sharpness and detail preservation score (0-1).
multiview_consistency
float
Consistency of appearance across different camera views (0-1).
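
Since all four scores share the 0-1 range, a simple gate can flag reconstructions that need another pass. The sketch below works on a plain dictionary with the documented keys. The threshold values and the failing_metrics helper are hypothetical; the documentation does not say which scores count as acceptable.

```python
# Hypothetical minimum acceptable scores; tune these per project.
DEFAULT_THRESHOLDS = {
    "completeness": 0.90,
    "geometric_consistency": 0.80,
    "texture_quality": 0.70,
    "multiview_consistency": 0.80,
}


def failing_metrics(metrics, thresholds=DEFAULT_THRESHOLDS):
    """Return {name: (score, minimum)} for metrics that are missing or below threshold."""
    failures = {}
    for name, minimum in thresholds.items():
        score = metrics.get(name)
        if score is None or score < minimum:
            failures[name] = (score, minimum)
    return failures


# Example values standing in for scene.metadata["quality_metrics"].
example_metrics = {
    "completeness": 0.95,
    "geometric_consistency": 0.75,
    "texture_quality": 0.82,
    "multiview_consistency": 0.88,
}
problems = failing_metrics(example_metrics)
for name, (score, minimum) in problems.items():
    print(f"{name}: {score} is below the minimum of {minimum}")
```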

Post-Processing

Mesh Refinement

Refine the reconstructed mesh geometry with refine_mesh().
mesh
Mesh
required
Input mesh to refine.
remove_outliers
bool
default:"true"
Remove outlier vertices and faces.
smooth_normals
bool
default:"true"
Smooth surface normals for better lighting.
fill_holes
bool
default:"true"
Fill small holes in the mesh.
simplify
float
default:"None"
Simplify the mesh to a target fraction of the original vertex count (0-1). The default, None, leaves the mesh unsimplified.
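
The documentation does not say how outliers are detected or how the simplify fraction is applied, so the following is only a sketch under stated assumptions. It shows one common statistical filter (drop points farther from the centroid than the mean distance plus k standard deviations) and the arithmetic that turns a simplify fraction into a target vertex count. Both helper names are hypothetical, and the filter handles vertices only, not faces.

```python
import math
import statistics


def remove_outlier_points(points, k=2.0):
    """Drop points whose distance to the centroid exceeds mean + k * stdev."""
    n = len(points)
    if n < 3:
        return list(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    dists = [math.dist(p, centroid) for p in points]
    cutoff = statistics.mean(dists) + k * statistics.pstdev(dists)
    return [p for p, d in zip(points, dists) if d <= cutoff]


def target_vertex_count(vertex_count, simplify):
    """Number of vertices to keep for a simplify fraction in (0, 1]."""
    if not 0.0 < simplify <= 1.0:
        raise ValueError("simplify must be in the range (0, 1]")
    return max(1, round(vertex_count * simplify))


# Eight unit-cube corners plus one stray point far from the rest.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
noisy = cube + [(100.0, 100.0, 100.0)]
cleaned = remove_outlier_points(noisy)
kept = target_vertex_count(10000, 0.8)
```

A centroid-based filter is crude for elongated scenes; neighborhood-based filters are a common alternative when outliers sit close to real geometry.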

Texture Optimization

Optimize and enhance texture maps with optimize_textures().
scene
ReconstructedScene
required
Scene with textures to optimize.
super_resolution
bool
default:"false"
Apply super-resolution to texture maps.
seam_blending
bool
default:"true"
Blend texture seams for seamless appearance.
enhance_details
bool
default:"true"
Enhance fine details in textures.

Example: Post-Processing Pipeline

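from worldstereo.camera import load_trajectory
from worldstereo.reconstruction import (
    reconstruct_scene,
    refine_mesh,
    optimize_textures,
    export_mesh
)

# Load the camera trajectory used during generation
trajectory = load_trajectory("trajectory.json")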

# Reconstruct scene
scene = reconstruct_scene(
    video_path="video.mp4",
    camera_trajectory=trajectory,
    output_path="./reconstruction"
)

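# Refine mesh
refined_mesh = refine_mesh(
    scene.mesh,
    remove_outliers=True,
    smooth_normals=True,
    fill_holes=True,
    simplify=0.8  # Reduce to 80% of vertices
)

# Attach the refined mesh so that texture optimization and export use it.
# Assumption: ReconstructedScene.mesh is assignable; the provisional API does not confirm this.
scene.mesh = refined_mesh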

# Optimize textures
optimized_scene = optimize_textures(
    scene,
    super_resolution=True,
    seam_blending=True,
    enhance_details=True
)

# Export final result
export_mesh(
    optimized_scene,
    "final_scene.gltf",
    format="gltf",
    include_textures=True,
    texture_resolution=4096
)

Integration with Geometric Memory

WorldStereo’s reconstruction pipeline leverages the global geometric memory and spatial-stereo memory modules used during video generation:
  • Global geometric memory provides coarse structural priors through incrementally updated point clouds
  • Spatial-stereo memory captures 3D correspondences for fine-grained detail preservation
This integration enables higher-quality reconstruction than traditional multi-view stereo methods.

Reconstruction from Panoramic Input

Reconstruct scenes from panoramic image inputs.
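from worldstereo import WorldStereo
from worldstereo.camera import load_trajectory
from worldstereo.reconstruction import reconstruct_scene

# Load the camera trajectory to follow during generation
trajectory = load_trajectory("trajectory.json")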

# Generate video from panoramic image
model = WorldStereo.from_pretrained("worldstereo-v1")

video = model.generate_from_panorama(
    panorama_path="panorama.jpg",
    camera_trajectory=trajectory
)

video.save("panoramic_video.mp4")

# Reconstruct with panoramic context
scene = reconstruct_scene(
    video_path="panoramic_video.mp4",
    camera_trajectory=trajectory,
    output_path="./panoramic_reconstruction",
    panoramic_input=True  # Enable panoramic mode
)

Performance Considerations

  • Memory usage: High-resolution reconstructions require significant GPU memory. Use resolution=256 or resolution=512 for standard GPUs.
  • Reconstruction time: Neural reconstruction methods take longer but produce higher quality results. Use method="mvs" for faster reconstruction.
  • Video length: Longer videos with more camera views generally produce more complete reconstructions but require more processing time.
  • Camera trajectory: Dense camera trajectories with overlapping views improve reconstruction quality.
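
The documentation does not state how resolution maps to memory. As a rough illustration only, the sketch below assumes the reconstruction stores one 4-byte value per cell of a dense resolution × resolution × resolution grid. Under that assumption memory grows with the cube of the resolution, so doubling the resolution costs about eight times the memory. The real pipeline may use sparse or hierarchical structures with very different costs, and dense_grid_bytes is a hypothetical helper.

```python
def dense_grid_bytes(resolution, bytes_per_cell=4):
    """Bytes needed for one dense resolution^3 grid of scalar values (assumed layout)."""
    return resolution ** 3 * bytes_per_cell


# Compare the resolutions mentioned in this page under the dense-grid assumption.
for res in (256, 512, 1024):
    mib = dense_grid_bytes(res) / 2**20
    print(f"resolution={res}: about {mib:.0f} MiB per dense float32 grid")
```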

Best Practices

  1. Use consistent camera trajectories: Maintain smooth camera motion and adequate overlap between views
  2. Generate high-quality videos: Higher resolution video input produces better reconstruction quality
  3. Leverage geometric memory: WorldStereo’s geometric memory modules improve reconstruction consistency
  4. Post-process results: Apply mesh refinement and texture optimization for production-ready assets
  5. Validate metrics: Check quality metrics to assess reconstruction completeness
