Core Concepts Overview

SkyDiscover is a modular framework for AI-driven scientific and algorithmic discovery. It provides a unified interface for implementing, running, and comparing discovery algorithms across diverse optimization tasks.

What is SkyDiscover?

SkyDiscover enables you to use large language models (LLMs) to automatically discover and optimize:

Algorithms: Sorting, scheduling, routing, packing problems
Mathematical solutions: Geometric optimization, inequality proofs
System configurations: GPU kernels, cloud scheduling, load balancing
Prompts: Optimizing LLM prompts for specific tasks
Creative content: AI image generation

SkyDiscover has been validated across 200+ optimization tasks, with its flagship algorithms AdaEvolve and EvoX achieving state-of-the-art results comparable to DeepMind’s AlphaEvolve.

Core Components

SkyDiscover consists of four primary components that work together:

1. Initial Program (Optional)

The starting point for optimization. Can be:

A baseline solution to improve upon
Omitted entirely (LLM generates from scratch)
Marked with EVOLVE-BLOCK markers to specify mutable regions

2. Search Algorithm

Determines which programs to evolve and how to evolve them. Options include:

AdaEvolve: Multi-island adaptive search with UCB selection
EvoX: Self-evolving search that co-adapts its own strategy
Top-K: Simple refinement of top-performing solutions
Beam Search: Breadth-first exploration of solution space
Best-of-N: Multiple variants from the same parent

See Search Algorithms for details.

3. Evaluator

A Python function that scores candidate programs:

def evaluate(program_path):
    score = run_and_grade(program_path)
    return {
        "combined_score": score,  # Primary optimization target
        "artifacts": {             # Optional feedback for LLM
            "feedback": "Off by one in loop boundary",
        },
    }

See Evaluators for examples.

4. LLM (Language Model)

Generates program mutations based on:

Parent program
Context programs (high-performing examples)
Evaluation feedback from previous attempts
Population statistics

Supports any LiteLLM-compatible model including OpenAI, Anthropic, Google, and local models.

The Discovery Loop

SkyDiscover runs this cycle for each iteration:

Sample

Search algorithm selects a parent program and context programs from the database

Prompt

Build prompts with parent code, context examples, feedback, and population stats

Generate

LLM creates a new program variant

Evaluate

Run the evaluator to score the program

Add

Store the program and metrics in the database

Adapt

Search algorithm updates its strategy based on results

This loop repeats for the configured number of iterations (typically 50-200).

Key Design Principles

Modularity

Every component is swappable:

Try different search algorithms without changing your problem
Use the same evaluator across multiple algorithms
Switch LLM providers seamlessly

Fairness

All algorithms run with:

Same evaluation budget
Same LLM calls per iteration
Standardized prompt templates
Reproducible checkpointing

Extensibility

Easy to add new:

Search algorithms (see skydiscover/search/README.md:29)
Benchmarks (see benchmarks/README.md)
Context builders for custom prompt strategies

What Makes SkyDiscover Different?

Adaptive Algorithms

AdaEvolve and EvoX dynamically adjust search intensity based on progress, unlike fixed strategies in other frameworks

200+ Benchmarks

Comprehensive evaluation across math, systems, algorithms, and reasoning tasks

Native Implementations

Built-in versions of OpenEvolve and GEPA for fair comparison without external dependencies

Real-time Monitoring

Live dashboard with scatter plots, code diffs, and human feedback integration

Performance Highlights

Across ~200 optimization benchmarks:

Frontier-CS: 34% median score improvement over OpenEvolve, GEPA, and ShinkaEvolve
Math + Systems: Matches or exceeds AlphaEvolve and human SOTA on 12/14 tasks
Real-world impact:
- 41% lower cross-cloud transfer cost
- 14% better GPU load balance for MoE serving
- 29% lower KV-cache pressure via GPU model placement

Next Steps

Architecture

Deep dive into SkyDiscover’s internal architecture

Search Algorithms

Learn about available search algorithms

Evaluators

Write effective evaluation functions

Evolution Blocks

Control what code gets evolved

Get Started

Core Concepts

Guides

Examples

Extending

Core Concepts Overview

What is SkyDiscover?

Core Components

1. Initial Program (Optional)

2. Search Algorithm

3. Evaluator

4. LLM (Language Model)

The Discovery Loop

Key Design Principles

Modularity

Fairness

Extensibility

What Makes SkyDiscover Different?

Adaptive Algorithms

200+ Benchmarks

Native Implementations

Real-time Monitoring

Performance Highlights

Next Steps

Architecture

Search Algorithms

Evaluators

Evolution Blocks

Build docs developers (and LLMs) love

Get Started

Core Concepts

Guides

Examples

Extending

Documentation Index

​What is SkyDiscover?

​Core Components

​1. Initial Program (Optional)

​2. Search Algorithm

​3. Evaluator

​4. LLM (Language Model)

​The Discovery Loop

​Key Design Principles

​Modularity

​Fairness

​Extensibility

​What Makes SkyDiscover Different?

Adaptive Algorithms

200+ Benchmarks

Native Implementations

Real-time Monitoring

​Performance Highlights

​Next Steps

Architecture

Search Algorithms

Evaluators

Evolution Blocks

Build docs developers (and LLMs) love

What is SkyDiscover?

Core Components

1. Initial Program (Optional)

2. Search Algorithm

3. Evaluator

4. LLM (Language Model)

The Discovery Loop

Key Design Principles

Modularity

Fairness

Extensibility

What Makes SkyDiscover Different?

Performance Highlights

Next Steps