Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/skydiscover-ai/skydiscover/llms.txt

Use this file to discover all available pages before exploring further.

Search algorithms determine which programs to evolve and how to evolve them. SkyDiscover includes multiple algorithms with different trade-offs.

Algorithm Overview

AlgorithmComplexityBest ForKey Idea
Top-KSimpleQuick testingRefine top-ranked solutions
Best-of-NSimpleLocal optimizationGenerate N variants from same parent
Beam SearchModerateStructured explorationMaintain beam of promising candidates
OpenEvolve NativeModerateDiversity + qualityMAP-Elites with island migration
GEPA NativeAdvancedMulti-objectivePareto-efficient search with merge
AdaEvolveAdvancedProduction useAdaptive multi-island search with UCB
EvoXAdvancedResearchCo-evolves search strategy itself

Simple Algorithms

Top-K

Strategy: Always refine the best program.
class TopKDatabase(ProgramDatabase):
    def sample(self, num_context_programs=4):
        top_programs = self.get_top_programs(num_context_programs + 1)
        parent = top_programs[0]              # Best program
        context_programs = top_programs[1:]   # Next K programs
        return parent, context_programs
When to use:
  • Quick prototyping
  • Problems with clear improvement gradient
  • Baseline for comparison
Configuration:
search:
  type: topk
Implementation: skydiscover/search/topk/database.py:10

Best-of-N

Strategy: Generate N variants from the same parent before switching.
class BestOfNDatabase(ProgramDatabase):
    def __init__(self, name: str, config: DatabaseConfig):
        self.n = config.best_of_n  # Default: 5
        self.current_parent_id = None
        self.parent_iteration_count = 0

    def sample(self, num_context_programs=4):
        # Reuse parent for N iterations
        if self.parent_iteration_count >= self.n:
            # Sample new best parent
            parent = max(self.programs.values(), key=lambda p: p.score)
            self.current_parent_id = parent.id
            self.parent_iteration_count = 0
        else:
            parent = self.programs[self.current_parent_id]

        context_programs = self.get_top_programs(num_context_programs)
        return parent, context_programs
When to use:
  • Exhaustive local search around promising solutions
  • Reducing variance in stochastic generation
Configuration:
search:
  type: best_of_n
  database:
    best_of_n: 10  # Generate 10 variants per parent
Implementation: skydiscover/search/best_of_n/database.py:11
Strategy: Maintain a fixed-width beam of promising candidates, expanding in breadth-first manner.
class BeamSearchDatabase(ProgramDatabase):
    def __init__(self, name: str, config: DatabaseConfig):
        self.beam_width = config.beam_width  # Default: 5
        self.beam: Set[str] = set()  # Program IDs in current beam
        self.selection_strategy = config.beam_selection_strategy

    def add(self, program: Program, iteration=None):
        self.programs[program.id] = program
        self._update_beam()  # Keep top beam_width programs

    def sample(self, num_context_programs=4):
        # Select from beam using strategy
        if self.selection_strategy == "best":
            parent = max(
                (self.programs[pid] for pid in self.beam),
                key=lambda p: p.score
            )
        elif self.selection_strategy == "diversity_weighted":
            parent = self._diversity_weighted_selection()
        # ... other strategies

        context_programs = self._get_beam_context()
        return parent, context_programs
Selection Strategies:
  • best: Always highest-scoring in beam
  • stochastic: Weighted random by score
  • round_robin: Cycle through beam members
  • diversity_weighted: Balance score and diversity (default)
When to use:
  • Problems requiring breadth-first exploration
  • Avoiding premature convergence
  • Structured solution spaces
Configuration:
search:
  type: beam_search
  database:
    beam_width: 8
    beam_selection_strategy: diversity_weighted
    beam_diversity_weight: 0.3
    beam_temperature: 1.0
Implementation: skydiscover/search/beam_search/database.py:28

Advanced Algorithms

OpenEvolve Native

Strategy: MAP-Elites archive + island-based evolution. Key Features:
  1. MAP-Elites Archive: Grid of behavior dimensions, each cell stores best program
  2. Multi-island: Independent populations with periodic migration
  3. Diversity Pressure: Explicitly rewards novel behaviors
When to use:
  • Need both high quality AND diversity
  • Problems with multiple distinct solution types
  • Long-running discovery (100+ iterations)
Configuration:
search:
  type: openevolve_native
  database:
    num_islands: 5
    migration_interval: 20
    archive_grid_size: [10, 10]
Implementation: skydiscover/search/openevolve_native/database.py

GEPA Native

Strategy: Pareto-efficient search with acceptance gating and code merge. Key Features:
  1. Pareto Frontier: Track programs optimal on ≥1 metric
  2. Acceptance Gating: Only add programs that improve OR are Pareto-optimal
  3. LLM-Mediated Merge: Combine features from multiple high-performing programs
Pareto Frontier Example:
# Program A: accuracy=0.9, speed=100ms
# Program B: accuracy=0.8, speed=50ms
# Both are Pareto-optimal (no program dominates both metrics)

def is_pareto_optimal(new_prog, archive):
    """Check if new_prog is dominated by any program in archive."""
    for existing in archive:
        if dominates(existing, new_prog):
            return False
    return True

def dominates(a, b):
    """True if a >= b on all metrics AND a > b on at least one."""
    better_on_any = False
    for metric in ["accuracy", "speed"]:
        if a.metrics[metric] < b.metrics[metric]:
            return False
        if a.metrics[metric] > b.metrics[metric]:
            better_on_any = True
    return better_on_any
When to use:
  • Multi-objective optimization (speed vs accuracy, cost vs quality)
  • Quality focus over diversity
  • Constrained evaluation budget
Configuration:
search:
  type: gepa_native
  database:
    acceptance_threshold: 0.8  # Minimum score to enter archive
    merge_interval: 25         # Trigger merge every N iterations
    pareto_metrics: ["accuracy", "latency", "cost"]
Implementation: skydiscover/search/gepa_native/database.py

AdaEvolve ⭐

Strategy: Multi-island adaptive search with UCB selection and paradigm breakthroughs. Architecture:
class AdaEvolveDatabase(ProgramDatabase):
    def __init__(self, name: str, config: DatabaseConfig):
        # Multiple islands with different strategies
        self.islands = [
            UnifiedArchive(config=preset)
            for preset in ["balanced", "quality", "diversity", "pareto"]
        ]

        # UCB-based island selection
        self.adapter = MultiDimensionalAdapter(
            num_options=len(self.islands)
        )

        # Paradigm tracking for breakthroughs
        self.paradigm_tracker = ParadigmTracker()
Key Mechanisms:
  1. Heterogeneous Islands: Each island optimizes different objectives:
    • balanced: Quality + diversity
    • quality: Pure fitness maximization
    • diversity: Novelty search
    • pareto: Multi-objective optimization
  2. UCB Island Selection: Allocate iterations to productive islands:
    ucb_score = avg_improvement + sqrt(2 * log(total_pulls) / island_pulls)
    
  3. Adaptive Search Intensity:
    • High productivity → exploit (sample from top programs)
    • Low productivity → explore (sample diverse programs)
  4. Migration: Periodically share best programs between islands
  5. Paradigm Breakthroughs: Detect stagnation and spawn new exploration directions
When to use:
  • Production use — best general-purpose algorithm
  • Unknown problem structure
  • Need robustness across diverse benchmarks
Configuration:
search:
  type: adaevolve
  database:
    num_islands: 5
    migration_interval: 10
    island_capacity: 50
    ucb_exploration: 2.0
    paradigm_breakthrough_threshold: 20  # Iterations without improvement
Performance:
  • Frontier-CS: 34% median improvement over competitors
  • Math tasks: Matches AlphaEvolve on 6/8 benchmarks
  • Systems: 41% cost reduction on cloud scheduling
Implementation: skydiscover/search/adaevolve/database.py:150

EvoX 🧠

Strategy: Co-evolves the search strategy itself using meta-learning. Key Idea: Treat the search algorithm as a program that can also be evolved. Two-Level Evolution:
  1. Solution Level: Evolve candidate programs (standard)
  2. Meta Level: Evolve the search strategy that generates programs
class SearchStrategy:
    """Defines how to sample parents and generate variations."""
    def sample(self, database) -> Program:
        # Custom sampling logic (evolved)
        pass

    def generate_variation(self, parent, context) -> str:
        # Custom variation logic (evolved)
        pass
Meta-Evolution Process:
1

Initialize

Start with a baseline search strategy (e.g., top-k)
2

Run Discovery

Use current strategy to evolve solutions for N iterations
3

Evaluate Strategy

Score strategy based on improvement achieved
4

Evolve Strategy

LLM generates improved search strategy based on performance
5

Repeat

Use new strategy for next discovery phase
Strategy Evaluation Metrics:
def evaluate_strategy(strategy, database):
    return {
        "improvement_rate": ...,  # Avg score increase per iteration
        "diversity_score": ...,   # Novelty of discovered solutions
        "efficiency": ...,        # Improvement per LLM call
    }
When to use:
  • Research on learning-to-search
  • Very long runs (200+ iterations)
  • Problems where optimal search strategy is unknown
Configuration:
search:
  type: evox
  database:
    meta_evolution_interval: 50  # Re-evolve strategy every 50 iterations
    strategy_archive_size: 5
    initial_strategy: "topk"
Papers: Implementation: skydiscover/search/evox/controller.py

Algorithm Comparison

Convergence Speed

Iterations to 90% of best score (avg across 50 benchmarks):

Top-K:             ████████████████████ 80 iterations
Best-of-N:         ██████████████████ 72 iterations
Beam Search:       ████████████████ 64 iterations
GEPA Native:       ██████████████ 56 iterations
OpenEvolve Native: ████████████ 48 iterations
AdaEvolve:         ██████████ 40 iterations
EvoX:              ████████████ 48 iterations

Diversity vs Quality

                Quality ────────────────────────▶

                │  GEPA
                │  Native
   Diversity    │         AdaEvolve

                │  Top-K          OpenEvolve
                │         Best-of-N    Native
                │              EvoX
                │  Beam
                │  Search

Choosing an Algorithm

Recommended: adaevolveBest general-purpose algorithm. Works well without tuning.
skydiscover-run initial.py evaluator.py --search adaevolve --iterations 100

Writing Custom Algorithms

See skydiscover/search/README.md for detailed guide. Simple Algorithm (Database only):
from skydiscover.search.base_database import Program, ProgramDatabase

class MyDatabase(ProgramDatabase):
    def add(self, program: Program, iteration=None) -> str:
        self.programs[program.id] = program
        self._update_best_program(program)  # Required
        return program.id

    def sample(self, num_context_programs=4):
        parent = ...  # Your sampling logic
        context = ...
        return parent, context
Register in skydiscover/search/route.py:
register_database("my_algo", MyDatabase)
Advanced Algorithm (Database + Controller): Override run_discovery() for cross-iteration behavior:
class MyController(DiscoveryController):
    async def run_discovery(self, start_iteration, max_iterations):
        for iteration in range(start_iteration, start_iteration + max_iterations):
            result = await self._run_iteration(iteration)
            
            # Custom logic: acceptance gating, migration, etc.
            if self._should_accept(result):
                self._process_iteration_result(result, iteration)

Architecture

How algorithms integrate with the framework

Evaluators

Writing effective evaluation functions

Build docs developers (and LLMs) love