Roadmap

What we’re building.

Each stage has explicit exit criteria measured against the Bee Security Eval Harness (internal; open-sourcing planned). We don’t move on until the current stage’s criteria are met. Bee Ignite is internal R&D, not a customer tier; research wins flow back into the customer tiers below.

Stage 0 — Honest launch

Done

Get Bee live with the runtime safety wrapper, an honest marketing-claim audit, and the first Bee Security Eval Harness baseline (12.5 / 100). The point isn't the score — it's the ground truth to measure improvements against.

Shipped

  • Three-layer runtime safety wrapper (intent scan + system-prompt anchor + output filter) in front of every chat call
  • Marketing-claim audit — every "X works" claim cross-referenced against code or labelled Roadmap
  • Baseline eval written to Postgres for trend analysis (52 cases, 10 categories)
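As a rough illustration of the three-layer shape listed above (intent scan, system-prompt anchor, output filter), here is a minimal sketch. It is not Bee's actual wrapper: the function name `safe_chat`, the pattern lists, the anchor text, the `wrapper_reason` values, and the `call_model` callback are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical patterns and anchor text; a real wrapper would use far richer checks.
BLOCKED_INTENTS = ("build ransomware", "write a keylogger")
BLOCKED_OUTPUT = ("BEGIN PRIVATE KEY",)
SYSTEM_ANCHOR = "You are a defensive security assistant. Refuse harmful requests."


@dataclass
class WrapperResult:
    text: str
    blocked: bool
    wrapper_reason: Optional[str] = None  # which layer blocked, if any


def safe_chat(user_message: str,
              call_model: Callable[[str, str], str]) -> WrapperResult:
    """Run one chat call through all three layers (illustrative only)."""
    # Layer 1: intent scan on the incoming message.
    lowered = user_message.lower()
    if any(pattern in lowered for pattern in BLOCKED_INTENTS):
        return WrapperResult("Request declined.", True, "intent_scan")

    # Layer 2: the system-prompt anchor is always passed alongside the user turn.
    reply = call_model(SYSTEM_ANCHOR, user_message)

    # Layer 3: output filter on the model's reply.
    if any(marker in reply for marker in BLOCKED_OUTPUT):
        return WrapperResult("Response withheld.", True, "output_filter")

    return WrapperResult(reply, False)
```

Returning the blocking layer as `wrapper_reason` keeps every block attributable to a specific layer, which is the kind of signal a capture queue could aggregate later.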

Stage 0.5 — Cybersec sweep across Comb / Hive / Swarm

In progress

Train domain-specialised LoRA adapters for cybersecurity on every tier from Comb up. The cybersec adapter pipeline is the template for the other nine domains; getting one right end-to-end de-risks the rest.

In flight

  • Vertex Comb cybersec adapter — landed (train_loss 0.314, ~6h on L4)
  • Vertex Hive cybersec smoke run on Qwen3-14B — running
  • 7 Hive domain adapters dispatched in parallel + 2 queued, via 8× A100 quota in us-central1
  • Swarm cybersec on A100 80GB on-demand (Qwen3-30B-A3B MoE)
  • Tier-1 CII detection wrapper for Singapore Cybersecurity Act 2018 s.9 compliance
  • Research queue capture wired — every safety-wrapper block POSTs to /api/research/capture

Exit criteria

  • Hive sweep + Swarm cybersec adapters merged into production routing
  • /api/cron/eval-run re-runs the 52-case harness; total_score strictly higher than 12.5
  • Per-category score ≥ 80% on cybersec-adjacent categories (1, 7, 9, 10)
  • Research queue accumulating production captures with expected wrapper_reason distribution
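The two score-based criteria above can be expressed as a small gate. This is a sketch under assumptions: the function name and the shape of `category_scores` (category number mapped to a fraction between 0 and 1) are hypothetical, and the adapter-merge and research-queue criteria are not modelled here.

```python
BASELINE_TOTAL = 12.5                 # Stage 0 baseline total_score
CATEGORY_THRESHOLD = 0.80             # per-category minimum (80%)
CYBERSEC_CATEGORIES = (1, 7, 9, 10)   # cybersec-adjacent categories


def eval_criteria_met(total_score: float,
                      category_scores: dict[int, float]) -> bool:
    """Check the score-based Stage 0.5 exit criteria (illustrative only)."""
    # The total must be strictly higher than the baseline; equal does not pass.
    if not total_score > BASELINE_TOTAL:
        return False
    # Every cybersec-adjacent category must reach the threshold;
    # a missing category is treated as a failing score of 0.
    return all(
        category_scores.get(category, 0.0) >= CATEGORY_THRESHOLD
        for category in CYBERSEC_CATEGORIES
    )
```

Treating a missing category as a failure is a deliberate choice in this sketch, so an incomplete eval run cannot pass the gate by omission.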

Stage 1 — APK distribution

Next

Ship the Android workspace via direct APK download from /download. Gated on the Stage 0.5 eval lift — we don't publish a mobile surface against an unmerged adapter set.

Exit criteria

  • Stage 0.5 exit criteria met (cybersec adapters merged, eval lift verified)
  • APK signed with the cuilabs CUI release key and hosted at /download
  • Mobile chat parity with the web workspace for Cell + Hive tiers

Stage 2 — Per-tier observability + per-tier health

Next

Today /status surfaces the parent backend probe; per-tier health (Cell, Brood, Comb, Buzz, Hive, Swarm, Enclave) currently inherits the parent verdict. Stage 2 wires per-tier probes so the status board reflects the actual fan-out.

Exit criteria

  • Per-tier /api/health/<tier> endpoints serving real liveness + p50/p95 latency
  • Tier-by-tier status rows on /status driven by independent probes
  • Sentry release tagging extended to the new endpoints
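For illustration, a per-tier health payload with p50/p95 latency could be derived from raw probe samples as below. This is only a sketch: the `tier_health` function, the field names, and the nearest-rank percentile method are assumptions, not the format the real endpoints will return.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile for pct in (0, 100]."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def tier_health(tier: str, alive: bool, latencies_ms: list[float]) -> dict:
    """Build a hypothetical health payload for one tier from its own probe data."""
    return {
        "tier": tier,
        "alive": alive,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
    }
```

Because each tier is computed from its own samples, one slow tier shows up in its own row rather than being masked by a healthy parent probe.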

Stage 3 — MCP HTTP transport + remote MCP

Next

The Bee MCP server today supports stdio (Claude Desktop, Cursor, VS Code, Zed). HTTP transport unlocks remote MCP — Bee usable from hosted clients without a local Python install.

Exit criteria

  • python -m bee.mcp_server --http <port> serves the 4 core tools (chat, code, security, research) over JSON-RPC
  • Authenticated via the same Bee API key used by /v1/chat/completions
  • Documented at /docs/mcp with a Hosted MCP install path
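A hosted client calling one of the four tools over HTTP might build a request like the one below. The `tools/call` method and the `name`/`arguments` params follow the MCP JSON-RPC convention; everything else is an assumption for illustration, including the URL and port, the `Bearer` authorization scheme, and the argument shape. The MCP initialization handshake is omitted, and the request is only constructed, not sent.

```python
import json
import urllib.request

CORE_TOOLS = ("chat", "code", "security", "research")


def build_tool_call(tool: str, arguments: dict, api_key: str,
                    url: str = "http://localhost:8080/mcp",  # hypothetical endpoint
                    request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/call request."""
    if tool not in CORE_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed scheme: the same Bee API key, passed as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.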

Commerce track — Billing, payments, premium routing

Done

Parallel track to the model work — running production cloud-tier subscriptions, per-tier usage counters, credit ledger, premium routing budgets, and the Stripe v2 cutover. Now stable; future commerce work is incremental rather than a stage.

Shipped

  • Seven customer tiers + per-tier counters (Cell, Brood, Comb, Buzz, Hive, Swarm, Enclave) — Ignite is internal R&D, no commerce surface
  • Stripe v2 cutover with monthly + annual billing cycles + idempotent price seeding
  • credit_ledger append-only audit log + grantCredits / debitCredits / splitChargeAgainstWallet
  • Per-tier usage caps + premium-routing budget caps with optional hard-fail mode
  • VC Partner / Partner / BEE for Startups program intakes
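To make the append-only idea behind credit_ledger concrete, here is a minimal in-memory sketch. It is not the production schema: the class and field names are hypothetical, `grant_credits` and `debit_credits` are only loose analogues of grantCredits and debitCredits, and splitChargeAgainstWallet is not modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LedgerEntry:
    account: str
    delta: int    # positive for a grant, negative for a debit
    reason: str


@dataclass
class CreditLedger:
    """Append-only: entries are added, never updated or deleted."""
    _entries: list[LedgerEntry] = field(default_factory=list)

    def balance(self, account: str) -> int:
        # The balance is always derived by replaying the log.
        return sum(e.delta for e in self._entries if e.account == account)

    def grant_credits(self, account: str, amount: int, reason: str) -> None:
        if amount <= 0:
            raise ValueError("grant amount must be positive")
        self._entries.append(LedgerEntry(account, amount, reason))

    def debit_credits(self, account: str, amount: int, reason: str) -> None:
        if amount <= 0:
            raise ValueError("debit amount must be positive")
        if amount > self.balance(account):
            raise ValueError("insufficient credits")
        self._entries.append(LedgerEntry(account, -amount, reason))

    def history(self, account: str) -> tuple[LedgerEntry, ...]:
        # Return an immutable copy so callers cannot edit the log.
        return tuple(e for e in self._entries if e.account == account)
```

Deriving balances from the log rather than storing them means every balance can be audited back to the individual grants and debits that produced it.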

Roadmap caveats

Stage names and exit criteria are stable; specific dates are not. We measure progress by eval lift, not calendar quarters. The internal source of truth is docs/product/roadmap.md in the main repo — this page is the customer-facing distillation, updated when stages flip status.