2026-04-14 concept agents evolution ai tools open-source optimization

ShinkaEvolve

LLMs as mutation operators for evolutionary code optimization. Sakana AI, 1.1k stars, Apache-2.0. Maintains populations of programs that evolve toward a fitness function.

I used this heavily ~6 months ago. Revisiting: major API refactor (Mar 2026), PyPI package, agent skills, ICLR 2026 acceptance.

Source: github.com/SakanaAI/ShinkaEvolve

How it works

Multi-island evolutionary algorithm where LLMs propose code changes:

Population – archive of programs (default 40), split across islands
Selection – elite parents chosen (top 30%), weighted/power-law/beam search
Mutation – LLM generates patches: diff (60%), full rewrite (30%), crossover (10%)
Evaluation – program runs run_experiment(), returns combined_score
Migration – best programs move between islands every N generations

Three patch types keep diversity: small diffs for local search, full rewrites for exploration, crossover for combining good ideas from different lineages.

What changed since Oct 2025

When	What
Oct 2025	Won ICFP 2025 Programming Contest (Team Unagi)
Jan 2026	ICLR 2026 accepted. Budget control (`max_api_costs`), adaptive targeting, WebUI
Feb 2026	Agent skills for Claude Code/Codex: `shinka-setup`, `shinka-run`, `shinka-inspect`, `shinka-convert`
Mar 2026	Major refactor: unified `ShinkaEvolveRunner`, PyPI as `shinka-evolve`, uv support
Apr 2026	New docs site: sakanaai.github.io/ShinkaEvolve/

Key design choices

Model ensemble with cost-aware UCB bandit: rotates between gpt-5-mini, gemini-3-flash, gemini-3.1-pro, gpt-5.4. Cheaper models get more tries, expensive ones only when they prove better. Budget cap in USD.

Meta-evolution: the system prompt itself evolves every N generations. Archive of 10 prompts, UCB sampling. The prompts that produce better mutations survive.

Novelty search: code embedding similarity (text-embedding-3-small) prevents the population from collapsing to one solution. Threshold 0.99, optional LLM judgment.

Async 5-10x speedup: proposals and evaluations run concurrently with EWMA smoothing for adaptive pacing.

How we use it

Our skills (/shinka-setup, /shinka-run, /shinka-inspect, /shinka-convert) wrap ShinkaEvolve for agent-bit prompt evolution and code optimization. The prompts/system.md in agent-bit was evolved through ShinkaEvolve runs.

This is the production implementation of self-evolution: the fixed/mutable partition. System prompt is mutable, tool definitions are fixed. ShinkaEvolve searches the mutable space.

Connects to erc3-agents-competition: ERC3 used evolution (68→100%) on agent configurations. ShinkaEvolve is the general-purpose version of that pattern.

And to asi-evolve-ai-research-agents: both use LLMs in evolutionary loops, but ASI-Evolve targets model architectures while ShinkaEvolve targets code/prompts.

Quick start

pip install shinka-evolve
shinka_launch variant=circle_packing_example

Or via our skills:

/shinka-setup    # scaffold evaluate.py + initial.py
/shinka-run      # launch evolution
/shinka-inspect  # load top programs into context