Quick Summary: Just ten days after the sudden ban of Fable 5, Japan’s breakout AI lab, Sakana AI, launched FUGU and FUGU Ultra. Unlike the heavyweight frontier LLMs built by US or Chinese tech giants, FUGU is built around multi-agent orchestration: rather than acting as a single brute-force brain, it behaves like a smart project manager that routes tasks across multiple models. In early developer benchmarks, it outpaced Claude Opus 4.8 and GPT 5.5 in raw coding speed.
What is Sakana AI FUGU? (The Smart Manager Paradigm)
Sakana AI FUGU is not a standalone frontier LLM. Instead, it is a highly sophisticated, learned multi-agent orchestrator. Because engineering teams in Japan operate under tighter computational and GPU infrastructure constraints compared to the US or China, they innovated through structural efficiency rather than sheer size.
Instead of building a massive, resource-heavy model, Sakana built a system that coordinates an elite group of existing frontier models (like GPT and Claude variants). When you give FUGU a task, multiple underlying models analyze it simultaneously, collaborate, cross-correct, and deliver a combined, highly optimized solution.
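To make the fan-out idea concrete, here is a toy sketch. It is not Sakana's implementation: FUGU's orchestrator is described as learned, whereas this example swaps in plain majority voting, and the three `model_*` functions are hypothetical stubs standing in for real model calls.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real model calls; each returns a candidate answer.
def model_a(task: str) -> str:
    return "use a hash map"


def model_b(task: str) -> str:
    return "use a hash map"


def model_c(task: str) -> str:
    return "use nested loops"


def orchestrate(task, workers):
    """Send the task to every worker in parallel, then keep the majority answer."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        answers = list(pool.map(lambda worker: worker(task), workers))
    # Majority vote is an assumption made for illustration only.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes, answers


winner, votes, answers = orchestrate(
    "find duplicate ids", [model_a, model_b, model_c]
)
print(f"{winner} ({votes}/{len(answers)} workers agreed)")
```

A real orchestrator would presumably do more than vote (the article mentions collaboration and cross-correction), but the shape is the same: one task in, several candidate answers, one combined result out.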
In major coding and reasoning benchmarks, this swarm intelligence allows FUGU Ultra to match or exceed the performance of heavyweights like Mythos Preview and Fable 5.

FUGU Default vs. FUGU Ultra
Sakana AI has rolled out two distinct tiers depending on your development pipeline needs:
FUGU Default
Designed for daily developer workflows like standard coding, rapid code reviews, and chat-style interactions. It is finely tuned to strike an optimal balance between execution latency and response quality. It also features strict data compliance controls, allowing you to manually exclude specific model providers from the orchestration pool.
FUGU Ultra
Built for deep, multi-layered problem-solving. FUGU Ultra deploys a much larger agent pool, assigning 3 or more AI agents to attack a single problem from different angles. This improves the quality of its reasoning and complex debugging, but it adds noticeable latency and a significantly higher cost per token.
The Crossy Road Benchmark: FUGU vs. Claude Opus 4.8
To showcase how powerful this multi-agent architecture is, developers ran a head-to-head live experiment. Both Sakana AI FUGU and Claude Opus 4.8 by Anthropic were given the identical complex prompt: Build a functional Crossy Road game from scratch.
The performance metrics revealed a dramatic speed disparity:
| Metric | Claude Opus 4.8 | Sakana AI FUGU |
| Development Time | 79 Minutes | 22 Minutes |
| Token Consumption | Extremely High | Massively Optimized |
| UI Design & Refinement | Winner (More Polished) | Basic / Strictly Functional |
The Takeaway: Opus 4.8 still holds the upper hand when it comes to visual polish, aesthetic layout, and creative UI details. However, FUGU was far faster, finishing the entire core game loop in 22 minutes against 79, less than a third of the time, while using significantly fewer tokens.
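The "less than a third" figure can be checked directly from the two development times in the table. The snippet below is just that arithmetic; the variable names are made up for illustration.

```python
# Development times reported in the comparison table, in minutes.
opus_minutes = 79
fugu_minutes = 22

ratio = fugu_minutes / opus_minutes    # share of Opus's time that FUGU needed
speedup = opus_minutes / fugu_minutes  # how many times shorter FUGU's run was

print(f"FUGU needed {ratio:.1%} of the time, a {speedup:.2f}x speedup")
print("under one third:", ratio < 1 / 3)
```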
Step-by-Step Setup: Configuring FUGU in Codex
If you have an API key and want to experience FUGU’s collective intelligence inside Codex, the official setup steps can feel a bit scattered. Here is the cleanest way to get it running quickly:
Step 1: Let an Agent Read the Docs
Instead of reading through disorganized setup pages manually, feed the official Sakana AI documentation URL into an advanced coding assistant (like Kimi 2.7 Code Agent) and let it parse the environment variables for you.
Step 2: The Only Manual Step (Setting Your API Key)
Honestly, nobody likes messing around with terminal commands, but this part is a breeze and takes less than a minute. You just need to store your Sakana API key in an environment variable so that Codex can authenticate with it.
Open your terminal, type export SAKANA_API_KEY="your_actual_api_key" (with straight quotes) and press Enter. That sets the key for the current terminal session only. To keep it across sessions, add the same export line to your ~/.bashrc file, then run source ~/.bashrc to reload it.
And that is literally it! No crazy technical gymnastics required, and no confusing setups. You are officially ready to roll.
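If you want to confirm the key is actually visible to programs launched from your shell, a quick check like the one below may help. It is a minimal sketch: the helper name `has_api_key` is hypothetical, and it only verifies that the variable is set, not that the key is valid.

```python
import os


def has_api_key(env=os.environ, name="SAKANA_API_KEY"):
    """Return True if the variable holds a real-looking, non-empty value."""
    value = env.get(name, "").strip()
    # Reject the placeholder text copied straight from the instructions.
    return bool(value) and value != "your_actual_api_key"


print("SAKANA_API_KEY visible:", has_api_key())
```

If this prints False in a new terminal window, the export line most likely never made it into ~/.bashrc.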
Step 3: Set Your Reasoning Effort
When building your model catalog, you will be prompted to choose a reasoning effort profile. This is where your budget is won or lost:
- High: Ideal for standard system checks, feature drafting, code reviews, and general programming questions. This provides deep reasoning without draining your wallet.
- X-High: Tailored exclusively for complex code refactoring, massive engineering overhauls, or long-term multi-layered bugs.
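One way to keep that choice deliberate is to encode it as a small rule, so X-High is never picked by accident. This is a sketch only: the task labels and the `pick_effort` helper are hypothetical and are not part of Codex or the Sakana API.

```python
# Hypothetical task labels that justify the expensive profile.
X_HIGH_TASKS = {"complex refactor", "engineering overhaul", "long-running bug"}


def pick_effort(task_kind: str) -> str:
    """Default to High; escalate to X-High only for the listed heavy tasks."""
    return "X-High" if task_kind in X_HIGH_TASKS else "High"


for kind in ("code review", "complex refactor"):
    print(kind, "->", pick_effort(kind))
```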
Step 4: Finalize the Provider Setup
Direct your agent to configure the Sakana provider profile. Set the Base URL to api.sakana.ai/v1 and set your default system target to Fugu Ultra. Once configured, you can effortlessly switch between the standard and ultra models inside Codex using the simple /model command.
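For reference, the values from this step can be collected in one place as below. Treat it as an illustration rather than Codex's actual configuration schema: the field names are hypothetical, the https scheme is an assumption (the article gives the base URL without one), and "Fugu Ultra" is kept as the display name from the text, not a confirmed model identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderProfile:
    # Field names are illustrative, not a real Codex config format.
    name: str
    base_url: str
    default_model: str
    api_key_env: str


def with_scheme(url: str, scheme: str = "https") -> str:
    """Prefix a scheme if the URL has none (https is assumed, not confirmed)."""
    return url if "://" in url else f"{scheme}://{url}"


sakana = ProviderProfile(
    name="sakana",
    base_url=with_scheme("api.sakana.ai/v1"),
    default_model="Fugu Ultra",
    api_key_env="SAKANA_API_KEY",
)
print(sakana)
```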
My Personal Testing Report (The Honest Truth)
I wanted to see how FUGU Ultra actually behaves under pressure, so I threw it right into my ongoing development sandbox. I tested it on my current trading indicator project called CBR (a system designed to build custom TradingView indicators based on the strategy of trading YouTuber Tom Trades).
The blueprint and indicator foundations of this project were originally built using Fable 5, which worked beautifully. However, the backtesting data and research layers became messy after being passed between various models like Opus 4.8, Minimax M3, Kimi Code 2.7, and GLM 5.2, a common issue known as Spec Drift.
Here is what happened when I let FUGU Ultra analyze the directory:
The Cost Shock is Real
First things first: be extremely careful with your settings. I ran a single, straightforward prompt asking FUGU Ultra to review all files in my project directory and evaluate the project’s health. Because I had mistakenly left the reasoning effort set to X-High, that one prompt consumed 29% of my monthly limit on the standard $20 plan and 84% of my 5-hour high-speed usage window within seconds. It is a resource hog of the highest order.
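To put those percentages in perspective, here is the back-of-the-envelope arithmetic. It assumes every similar prompt would cost the same share and that the monthly limit maps proportionally onto the $20 price; both are simplifications, not something Sakana has published.

```python
import math

# Figures from the single X-High prompt described above.
monthly_share_per_prompt = 0.29  # fraction of the monthly limit used
window_share_per_prompt = 0.84   # fraction of the 5-hour window used
plan_price_usd = 20

# Whole prompts that fit before each limit is exhausted.
prompts_per_month = math.floor(1 / monthly_share_per_prompt)
prompts_per_window = math.floor(1 / window_share_per_prompt)

# Rough dollar value of one prompt, assuming the limit scales with the price.
implied_cost_usd = plan_price_usd * monthly_share_per_prompt

print(f"prompts per month:  {prompts_per_month}")
print(f"prompts per window: {prompts_per_window}")
print(f"implied cost per prompt: ${implied_cost_usd:.2f}")
```

Under those assumptions, three such prompts would nearly exhaust the whole month, and a second one would not fit in the same 5-hour window.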
Incredible Diagnostic Power
Despite the wallet-draining cost, the intelligence is undeniable. FUGU Ultra immediately spotted the exact weaknesses in my project. It instantly recognized that while the core indicator code was structurally sound, the backtesting layer was completely unreliable due to the constant context shifting between different LLMs.
Watching its thought process live was fascinating. You could clearly see the multi-agent mechanism at work: it had a Claude instance and a GPT instance audit the same file simultaneously, cross-checking each other’s conclusions to point out where the data layer was broken.
The Verdict & Cost Management Strategy
Sakana AI FUGU proves that the future of software development is shifting rapidly toward Multi-Model Routing and automated loop engineering. The convenience of having the collective intelligence of multiple frontier brains working on a single file is incredible.
However, because the orchestration process is so aggressive, token costs can spin out of control instantly.
The Ideal Strategy:
- Avoid the $20 Tier: Do not jump straight into a standard monthly subscription, or you will find yourself constantly hitting usage walls.
- Use Pay-As-You-Go: Start with a flexible Pay-As-You-Go API model and load a small credit of $5 or less to test your scripts safely.
- Reserve X-High for Disasters: Keep your default reasoning profile set to High. Only toggle X-High when you are dealing with a catastrophic engineering failure that a single model cannot decipher.
Once you learn how to manage its aggressive resource consumption, FUGU Ultra is easily one of the most powerful automated development tools on the market.
