SkyDiscover is a modular framework for AI-driven scientific and algorithmic discovery, providing a unified interface for implementing, running, and fairly comparing discovery algorithms across 200+ optimization tasks.
What is SkyDiscover?
SkyDiscover lets you use LLMs to automatically discover and optimize solutions to complex problems, from circle packing and competitive programming challenges to GPU kernel optimization and cloud scheduling. Instead of manually coding algorithms, you provide:
- An evaluator function that scores candidate solutions
- Optionally, an initial program to improve upon (omit it to start from scratch)
SkyDiscover is under active development. New algorithms, benchmarks, and features are being added regularly.
Key Features
State-of-the-Art Algorithms
SkyDiscover introduces two new adaptive optimization algorithms:
- AdaEvolve - Dynamically adjusts optimization behavior based on observed progress with multi-island search, UCB-based selection, and paradigm breakthroughs
- EvoX - Self-evolving paradigm that co-adapts solution generation and experience management using LLMs on the fly
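To make "UCB-based selection" concrete, here is a minimal sketch of the textbook UCB1 rule applied to choosing which island to sample next. It is an illustration only: the `Island` class, the `select_island` and `record` helpers, and the exploration constant `c` are hypothetical names, and AdaEvolve's actual selection rule may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Island:
    """Hypothetical per-island statistics (not a SkyDiscover class)."""
    name: str
    visits: int = 0            # how many times this island was chosen
    total_reward: float = 0.0  # sum of observed improvements

    @property
    def mean_reward(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def select_island(islands, c=math.sqrt(2)):
    """Pick an island with the textbook UCB1 rule (assumed, generic form)."""
    # Try every island once before trusting any statistics.
    for island in islands:
        if island.visits == 0:
            return island

    total_visits = sum(island.visits for island in islands)

    def ucb(island):
        # Exploration bonus shrinks as an island accumulates visits.
        bonus = c * math.sqrt(math.log(total_visits) / island.visits)
        return island.mean_reward + bonus

    return max(islands, key=ucb)


def record(island, reward):
    """Update an island's statistics after observing a reward."""
    island.visits += 1
    island.total_reward += reward
```

With `c=0` the rule is purely greedy and always returns the island with the highest mean reward; larger values of `c` push the search toward islands that have been sampled less often.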
Multiple Search Strategies
Choose from native algorithms:
- AdaEvolve - Multi-island adaptive search (recommended)
- EvoX - Self-evolving paradigm
- Top-K - Select and refine top-K solutions
- Beam Search - Breadth-first expansion
- Best-of-N - Generate N variants per iteration
- OpenEvolve Native - MAP-Elites + island-based search
- GEPA Native - Pareto-efficient search with reflective prompting
External backends (installed with --extra external):
- OpenEvolve
- GEPA
- ShinkaEvolve
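As a rough illustration of the simplest strategy in the native list, the sketch below runs a Top-K loop: keep the K best candidates, generate a few variants of each, and repeat. The names `top_k_search`, `mutate`, and `score` are hypothetical stand-ins. In SkyDiscover an LLM would propose the mutations and the evaluator would supply the scores, so treat this as a sketch of the idea rather than the framework's implementation.

```python
import heapq
import random


def top_k_search(initial, mutate, score, k=3, variants=4, iterations=20, seed=0):
    """Generic Top-K refinement loop over arbitrary candidates.

    mutate(candidate, rng) -> new candidate   (stand-in for an LLM mutation)
    score(candidate)       -> float, higher is better (stand-in for the evaluator)
    """
    rng = random.Random(seed)
    population = [initial]
    for _ in range(iterations):
        children = [
            mutate(parent, rng)
            for parent in population
            for _ in range(variants)
        ]
        # Parents stay in the pool, so the best score never regresses.
        population = heapq.nlargest(k, population + children, key=score)
    return max(population, key=score)
```

For a toy run, candidates can be plain floats, with `mutate=lambda x, rng: x + rng.gauss(0, 0.5)` and `score=lambda x: -(x - 3) ** 2`. Because parents compete with their children, the returned candidate never scores worse than the starting point.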
200+ Benchmark Tasks
SkyDiscover includes diverse benchmarks across multiple domains:

| Domain | Benchmark | Tasks | Description |
|---|---|---|---|
| 🔢 Math | Circle packing, Erdos problems | 14 | Geometric optimization challenges |
| 🖥️ Systems | ADRS, GPU mode | 9 | Cloud scheduling, load balancing, kernel optimization |
| 🧩 Algorithms | Frontier-CS | 172 | Competitive programming challenges |
| 💻 Programming | ALE Bench | 10 | Algorithmic contests |
| 💬 NLP | Prompt optimization | 1 | HotPotQA prompt evolution |
| 🎨 Creative | Image generation | 1 | AI image generation evolution |
Flexible Model Support
Works with any LiteLLM-compatible model:
- OpenAI (GPT-5, GPT-4o, etc.)
- Google (Gemini 2.0, Gemini 3 Pro)
- Anthropic (Claude)
- Local models (Ollama, vLLM)
- Multi-model pools with weighted sampling
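Weighted sampling over a model pool can be pictured as follows. This is a minimal sketch using only the Python standard library. The pool contents, the weights, and the `sample_model` helper are placeholders invented for illustration and do not reflect SkyDiscover's configuration format.

```python
import random

# Hypothetical pool: model identifiers and weights are placeholders.
MODEL_POOL = {
    "model-a": 0.7,
    "model-b": 0.2,
    "model-c": 0.1,
}


def sample_model(pool, rng=random):
    """Pick one model name with probability proportional to its weight."""
    names = list(pool)
    weights = [pool[name] for name in names]
    # random.choices normalizes the weights, so they need not sum to 1.
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the sequence of sampled models reproducible, which helps when comparing runs.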
Live Monitoring & Human Feedback
Built-in dashboard for real-time progress tracking:
- Scatter plot of all generated programs
- Code diffs and metrics visualization
- AI-generated summaries
- Human feedback panel to steer evolution
Modular & Extensible
Easy to extend with:
- Custom search algorithms
- New benchmarks
- Custom context builders
- Domain-specific prompts
Architecture Overview
SkyDiscover follows a modular architecture with clear separation of concerns:
Core Components
Initial Program (Optional)
Starting point for evolution. Can contain EVOLVE-BLOCK markers to specify regions to mutate. If omitted, the LLM generates solutions from scratch.
Evaluator (Required)
Python function that scores candidate solutions. Returns a dictionary with combined_score (maximized) and optional artifacts for contextual feedback.
Search Algorithm
Evolutionary strategy that selects which programs to mutate. Examples: AdaEvolve, EvoX, Beam Search, Top-K.
LLM Pool
One or more language models that generate program mutations. Supports weighted sampling across multiple models.
Database
Tracks all generated programs, scores, and metadata. Enables checkpointing and resume functionality.
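The evaluator contract described above can be sketched for a circle-packing task as follows. Only the `combined_score` and `artifacts` fields come from the description. The entry-point name `evaluate`, its `program_path` argument, the candidate's `pack()` function, and the layout of the `artifacts` dictionary are assumptions made for this example, so check the API reference for the actual signature.

```python
import importlib.util
import math


def _fail(message):
    # Assumed convention: failures score 0 and explain why in artifacts.
    return {"combined_score": 0.0, "artifacts": {"error": message}}


def evaluate(program_path):
    """Score a candidate packing of circles in the unit square.

    The candidate file is assumed to define pack(), returning a list of
    (x, y, r) tuples. The score is the sum of radii (higher is better).
    """
    try:
        spec = importlib.util.spec_from_file_location("candidate", program_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        circles = list(module.pack())
    except Exception as exc:  # a broken candidate must not crash the run
        return _fail(repr(exc))

    # Every circle must lie inside the unit square.
    for x, y, r in circles:
        if r <= 0 or x - r < 0 or x + r > 1 or y - r < 0 or y + r > 1:
            return _fail(f"circle {(x, y, r)} leaves the unit square")

    # No two circles may overlap (small tolerance for rounding).
    for i, (x1, y1, r1) in enumerate(circles):
        for x2, y2, r2 in circles[i + 1:]:
            if math.hypot(x1 - x2, y1 - y2) < r1 + r2 - 1e-12:
                return _fail("two circles overlap")

    return {
        "combined_score": sum(r for _, _, r in circles),
        "artifacts": {"num_circles": len(circles)},
    }
```

Returning a descriptive message in `artifacts` on failure, instead of raising, gives the search something to feed back into the next round of generation.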
Real-World Impact
SkyDiscover has achieved significant improvements on real systems optimization tasks:
- 41% lower cross-cloud transfer costs
- 14% better GPU load balancing for MoE serving
- 29% lower KV-cache pressure via optimized GPU model placement
- Matches or exceeds AlphaEvolve and human SOTA on 12/14 math and systems tasks
Performance Benchmarks
Across ~200 optimization benchmarks:
- Frontier-CS (172 problems): ~34% median improvement over OpenEvolve, GEPA, ShinkaEvolve
- Math tasks (8 problems): Matches or exceeds AlphaEvolve on 6/8 tasks
- Systems tasks (6 problems): Matches or exceeds AlphaEvolve on all 6 tasks
Quick Start
Get started with your first discovery in under 5 minutes
Installation
Detailed installation instructions and system requirements
API Reference
Complete Python API documentation
CLI Reference
Command-line interface documentation
Next Steps
Try the Quickstart
Run your first discovery problem with the circle packing example
View Benchmarks
Explore the 200+ included benchmark tasks