The complete guide
A pragmatic, AI-powered engineering discipline for solo developers and small teams — maintain a deployable, production-ready state at all times, without a large engineering org.
Describes skills library v1.17.1 · orchestrator /ssd v1.17.1 · bootstrap /ssd-init v1.5.0 · VERSION · CHANGELOG
Verify locally: cat ~/.claude/skills/VERSION
Overview
InsanelyGreat's Shippable States Development (SSD) is a pragmatic engineering discipline designed for solo developers and small teams who use AI — specifically Claude Code — to build production software. The system maintains a deployable, production-ready state at all times throughout the development cycle.
The core principle: maintain a deployable, production-ready state at all times.
SSD synthesizes lessons from continuous deployment, trunk-based development, feature flags, and decades of software engineering failures where "90% done" meant "months from shipping." It was designed from the ground up to be operated by one person or a handful of people, with AI as a force multiplier at every step of the workflow.
The methodology is simple: maintain a deployable state at all times. The discipline is hard: no shortcuts, no "we'll fix it later," no broken code on main. The payoff is enormous: no death marches, predictable delivery, high quality, low stress — without needing a 10-person eng team to enforce it.
Why It Matters
The "90% Done" Problem
Traditional development creates a predictable trap:
Week 1–8: "Making good progress!"
Week 9: "We're 90% done!"
Week 10: "Still 90% done..."
Week 11: "Uh, still 90%..."
Week 12: Panic, cut features, ship something broken
Why? The last 10% includes all the work no one budgeted for:
- Integration between components
- Error handling and edge cases
- Performance under real load
- Production deployment and data migration
- Security hardening and cross-browser testing
- Accessibility and documentation
InsanelyGreat's SSD solution: Do the hard "last 10%" work incrementally throughout development, not as a crisis at the end. Claude Code skills handle the checklist so you stay focused on shipping.
The Iron Law
Every project has exactly three variables:
- Scope — what features and capabilities ship
- Time — when it ships
- Quality — how well it works
| Constraint | What Flexes | When to Use |
|---|---|---|
| Fix Time | Scope reduces, quality preserved | Hard deadlines (conference, contract, funding round) |
| Fix Scope | Timeline extends, quality preserved | API compliance, feature parity requirements |
| Fix Quality | Time and scope flex | Medical, financial, safety-critical systems |
Most projects are time-constrained. Declare your constraint at kickoff. Adjusting scope to meet a deadline is not failure — it's engineering judgment.
Principle 1: Constant Production Parity
Your development environment must match production as closely as possible from Day 1.
Traditional
- Weeks 1–8: Local development
- Week 9: "Okay let's deploy to staging..."
- Week 10: "Why doesn't it work in staging?"
- Week 11: "Production is different from staging..."
- Week 12: "What do you mean SSL certs take 3 days?"
InsanelyGreat's SSD
- Day 1: Deploy "Hello World" to production
- Day 2: Deploy first feature to production
- Day 3: Deploy improved version
- Day 30: Deploy to production (like every day)
Why this works: Deployment is never "the hard part" because you do it constantly. Production issues surface immediately when they're easy to fix. You know your deployment budget from Day 1.
Principle 2: The Shippable State Invariant
At the end of each work session, the system must be in a state where:
- All tests pass
- No compilation errors
- No broken user-facing features
- Documentation matches implementation
- Could be deployed to production without embarrassment
Not required: feature-complete or meeting all goals. Just that what exists actually works.
Principle 3: Feature Flags Over Feature Branches
Long-lived feature branches are antithetical to shippable states.
# Problem: Feature branch divergence
Main: A---B---C---D---E---F---G---H
           \
Feature:    I---J---K---L---M
                             \
                              (days of merge conflicts)
# SSD: All work on main, behind flags
Main: A---B---C---D---E---F---G---H
Day 1: Add feature code (flag off by default)
Day 2: Expand feature (still flagged off)
Day 3: Feature works, flip flag on
All work happens on main/trunk. The feature exists in production but is invisible until ready.
if feature_flags.is_enabled("new_checkout", user=user):
    return new_checkout_flow(user, plan)
else:
    return legacy_checkout_flow(user, plan)
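The snippet above assumes some `feature_flags` object exists. A minimal sketch of one follows, assuming an in-memory store where unknown flags are off; the `FeatureFlags` class and its fields are hypothetical names for illustration, not part of the skills library. A real project would more likely use a flag service or SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store; any unknown flag is off."""

    # flag name -> user ids that should see the feature early
    enabled_for: Dict[str, Set[str]] = field(default_factory=dict)
    # flags switched on for everyone
    globally_on: Set[str] = field(default_factory=set)

    def is_enabled(self, name: str, user: Optional[str] = None) -> bool:
        if name in self.globally_on:
            return True
        return user is not None and user in self.enabled_for.get(name, set())


feature_flags = FeatureFlags()

# Day 1 and Day 2: code is on main and deployed, but the flag is off.
day_one_visible = feature_flags.is_enabled("new_checkout", user="alice")

# Internal verification: enable for a single user.
feature_flags.enabled_for["new_checkout"] = {"alice"}

# Day 3: the feature works, so flip the flag on for everyone.
feature_flags.globally_on.add("new_checkout")
```

Defaulting to off is the design choice that matters here: a flag nobody has configured yet can never expose half-built work.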
Principle 4: The Ratchet Principle
Forward progress only. Each commit improves the system in some measurable way.
Banned commits:
- "WIP" or "checkpoint" commits
- "Broken, will fix tomorrow"
- Commented-out code "for later"
- Partially implemented features visible to users
The ratchet mechanism — every commit must:
- Pass CI/CD
- Maintain or improve code coverage
- Be deployable
If you need to save work that's not ready: use local stash (not committed), Draft PR with "DO NOT MERGE" (not on main), or a feature flag (committed, but invisible).
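The ratchet can be expressed as a small check that CI runs on every commit. This is a sketch only: the function name and the idea of storing the previous coverage figure somewhere CI can read it are assumptions, not something the SSD skills prescribe.

```python
def ratchet_allows(previous_coverage: float,
                   current_coverage: float,
                   ci_passed: bool) -> bool:
    """Return True only if the commit keeps the ratchet from slipping.

    Coverage may stay flat or rise, never fall, and CI must be green.
    """
    return ci_passed and current_coverage >= previous_coverage


# Coverage held steady and CI is green: the commit may land.
steady = ratchet_allows(81.0, 81.0, ci_passed=True)

# Coverage dropped by a fraction of a point: the ratchet refuses.
slipped = ratchet_allows(81.0, 80.9, ci_passed=True)
```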
Principle 5: Scope Flexibility Is a Feature
Traditional thinking: "We must deliver all planned features by the deadline."
Result: Deliver nothing on time, or deliver broken features.
SSD thinking: "We deliver whatever is shippable by the deadline."
Result: Deliver working software, adjust scope based on reality.
How to cut scope well:
- Cut entire features, not the quality of existing features
- Cut depth, not breadth (fewer powerful features beats many broken features)
- Hide features behind flags rather than deleting (easy to resurrect)
- Communicate cuts early and often to stakeholders
Pattern 1: Deployed Day One
Before writing any business logic, establish a real deployment to your distribution channel. The specifics vary by platform — pick yours:
Web
- Frontend deployed to a real URL (even if it just renders a title)
- Backend API deployed and reachable from the frontend
- Database provisioned and migrated
- CI/CD: push to main → deploy to staging automatically
- One authenticated route working end-to-end
- Error tracking (Sentry) wired up in frontend and backend
- Domain + SSL configured
iOS
- App builds and runs on minimum target device/simulator
- Main tab/navigation structure in place (empty screens are fine)
- One piece of data persisted end-to-end: create → persist → relaunch → still there
- Authenticated session working: login → token in Keychain → cold launch restores
- CI: Xcode Cloud or GitHub Actions builds and runs tests on every push
- App archived and submitted to TestFlight (even a Hello World build)
- Crash reporting wired up (Sentry, Crashlytics, or Bugsnag)
- App Store Connect record created with bundle ID matching the app
Android
- App builds and runs on minimum target API
- Hilt dependency injection wired and working
- Navigation structure in place with NavHost
- One piece of data persisted end-to-end in Room: create → persist → kill app → relaunch → still there
- Authenticated session working: login → token in DataStore → relaunch restores
- CI: GitHub Actions or Bitrise builds debug APK and runs unit tests on every push
- Internal Testing track on Play Console with a working build uploaded
- Firebase Crashlytics (or equivalent) initialized and sending test crashes
macOS
- App builds and launches on minimum target OS
- Main window with placeholder navigation
- One real piece of persisted data: create it, see it in the UI, relaunch, still there
- Basic Settings window
- CI: Xcode Cloud or GitHub Actions builds and runs tests on every push
- Archive and notarization working (even for a Hello World app)
- Crash reporting wired up (Sentry, Bugsnag, or Crashlytics)
Headless
- Service containerized and deployed to production environment (even returning {"status": "ok"})
- Health endpoints (/health, /ready) responding correctly
- Database provisioned, connected, and one migration applied
- Structured logging with request_id propagation on every request
- Error tracking (Sentry or equivalent) capturing unhandled exceptions
- One authenticated endpoint working end-to-end
- CI/CD: push to main → container built, tests run, deployed to staging
- .env.example documenting every required environment variable
This is your MVP. It does nothing useful, but it's real. If deployment takes 2 weeks and you budget 0 weeks, you're starting 2 weeks late on Day 1.
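For a headless service, the Day 1 "Hello World" can be as small as the sketch below. It assumes Python's standard-library http.server rather than a real framework, and `route`, `Handler`, and `serve` are illustrative names. A real /ready check would also probe the database; here it only echoes the same payload.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def route(path: str) -> Tuple[int, str]:
    """Map a request path to (status code, JSON body)."""
    if path in ("/", "/health", "/ready"):
        return 200, json.dumps({"status": "ok"})
    return 404, json.dumps({"status": "not found"})


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = route(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Serve until interrupted; call this from the container entry point."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Keeping the routing in a plain function means the first unit test can exist on Day 1 too, before any HTTP client is involved.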
Pattern 2: Walking Skeleton
Build one feature end-to-end before building any feature fully complete.
Wrong order
- Design all UI screens
- Build all database tables / persistence
- Write all API endpoints / services
- Connect everything
- Discover they don't fit together
Right order
- Build login flow end-to-end
- Build "add item" end-to-end
- Build "edit item" end-to-end
- Each step shippable as-is
Never build all UI then all backend/persistence. One complete flow first, then expand breadth. "End-to-end" means different things by platform: on web, UI → API → DB → response. On iOS, View → persist → relaunch → verify. On Android, Compose → Room → relaunch → verify. The principle is the same: one complete flow before breadth.
Pattern 3: Dark Launching
Launch features in production before they're visible to users. The pattern works across all platforms:
# Python
if feature_flags.new_dashboard and user.is_internal_tester:
    return render_new_dashboard(user)
return render_old_dashboard(user)
// Swift
if featureFlags.isEnabled("newDashboard", user: user) {
    return NewDashboardView()
}
return OldDashboardView()
// Kotlin
if (featureFlags.isEnabled("newDashboard", user)) {
    NewDashboardScreen()
} else {
    OldDashboardScreen()
}
Benefits:
- Test in production without risk
- Gradual rollout (internal → beta → everyone)
- Easy rollback: flip flag
- Development never blocks deployment
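The gradual rollout in the list above (internal → beta → everyone) could be sketched as follows. The stage names mirror that list, but the function names, the hash-based bucketing, and the 10% default are assumptions chosen for illustration. Hosted flag services typically provide this logic for you.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Stable bucket in [0, 100) derived from the flag name and user id."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_visible(flag: str, stage: str, user_id: str,
               is_internal: bool, beta_percent: int = 10) -> bool:
    """Decide visibility for one user at the flag's current rollout stage."""
    if stage == "everyone":
        return True
    if stage == "beta":
        # Internal testers keep access; outside users join by bucket.
        return is_internal or bucket(flag, user_id) < beta_percent
    if stage == "internal":
        return is_internal
    return False  # "off" or any unknown stage stays dark
```

Hashing the flag name together with the user id keeps each user's experience stable between requests while giving different flags different beta cohorts. Rolling back is a stage change, not a deploy.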
Pattern 4: Timebox with Eject
For risky or exploratory work, timebox it with a pre-committed eject plan.
"We'll spend 3 days exploring this approach.
On day 3, we decide:
- Ship it
- Iterate it (extend timebox)
- Abandon it (revert to last shippable state)"
This prevents:
- Sunk cost fallacy ("we've invested so much...")
- Endless exploration without shipping
- Half-finished experiments in the codebase
Pattern 5: The Nightly Ritual
End each day with a shippable state. Spend the last 30 minutes on this checklist:
- All tests pass locally
- Code committed and pushed
- CI/CD pipeline green
- Feature flags set appropriately
- Documentation updated if APIs changed
- Tomorrow's first task identified
Your future self (or your teammate) should be able to pick up exactly where you left off, with no confusion about what state things are in.
Decision Framework
Choosing Your Constraint
At project kickoff, declare your primary constraint and communicate it explicitly.
| Constraint Type | Example Projects | Scope | Quality |
|---|---|---|---|
| Time-Constrained | Conference demos, MVP for funding, contractual deliveries | Flexes | Preserved |
| Scope-Constrained | API compliance, platform migrations, feature parity | Fixed | Preserved |
| Quality-Constrained | Medical devices, financial systems, infrastructure | Flexes | Fixed |
When to Cut Scope
Scope cuts should happen early and often, not as last-minute panic.
Cut scope now if you see these signals:
- It's Wednesday and you're not confident about Friday's shippable state
- You're accumulating technical debt faster than paying it off
- Tests are being skipped "temporarily"
- "We'll clean it up after shipping" is appearing in conversations
Metrics That Matter
| Traditional (misleading) | SSD (actually useful) |
|---|---|
| Lines of code written | Days since last production deployment |
| Number of commits | Mean time to deploy a change |
| Features "in progress" | % of code behind feature flags (target: <5%) |
| Percentage complete | Test coverage (and is it passing?) |
Deployment Frequency
This is the single most important SSD metric:
- Once per month — Traditional waterfall
- Once per week — Decent
- Once per day — Excellent
- Multiple times per day — World-class
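Both deployment metrics are cheap to compute from a list of deploy dates. The sketch below is one possible way to do it; the function names and the exact cutoffs between tiers are assumptions, since the tiers above are stated only as labels.

```python
from datetime import date
from typing import List


def days_since_last_deploy(deploys: List[date], today: date) -> int:
    """Days elapsed since the most recent production deployment."""
    return (today - max(deploys)).days


def frequency_tier(deploy_count: int, window_days: int) -> str:
    """Classify a deploy count over a window into the tiers listed above.

    Cutoffs (assumed): at least 2 per day, at least 1 per day,
    at least 1 per 7 days, otherwise the lowest tier.
    """
    if deploy_count >= 2 * window_days:
        return "Multiple times per day"
    if deploy_count >= window_days:
        return "Once per day"
    if deploy_count * 7 >= window_days:
        return "Once per week"
    return "Once per month"


history = [date(2024, 1, 1), date(2024, 1, 10)]
gap = days_since_last_deploy(history, today=date(2024, 1, 12))  # 2 days
```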
Common Objections
"This sounds like more work"
You're doing the work either way. Option A: days 1–85 ignore deployment, days 86–100 frantic debugging, ship broken. Option B: do the hard parts incrementally every day, day 90 ship the fully-working subset you completed. Same total effort, drastically different stress and quality.
"Our stakeholders need to see progress"
SSD gives better demos. Traditional: "Here's a mockup... this button doesn't work yet... imagine when this is connected to the backend..." SSD: "Here's the actual working product. Press any button." Which demo builds more confidence?
"We need to iterate quickly"
False dichotomy. Shippable states don't slow iteration — they enable it. Every iteration is testable by real users. No integration phase blocking feedback. Pivots are cheap because sunk cost is always minimal.
"My team isn't disciplined enough"
This is exactly why you need this. Discipline problems are solved with systems, not willpower. CI/CD forces tests to pass. Can't commit broken code. Daily deployments force completion. Visible production state keeps everyone honest. SSD creates discipline through automation and forcing functions.
"This doesn't work for mobile apps"
It works. You cannot deploy to the App Store daily (review takes 1-3 days). But you CAN deploy to TestFlight / Play Internal Testing daily. SSD targets the internal deployment pipeline, not the store review process. TestFlight is your "production" for SSD purposes until you cut a release.
Feature flags on mobile use an SDK (Firebase Remote Config, LaunchDarkly). Flag changes take effect on next app launch, not instantly. When you cut a store release, it should be a non-event — you've been shipping to testers daily. For macOS desktop: notarization is your deployment gate. Automate it in CI from Day 1.
Getting Started
Four weeks to establish the SSD rhythm. Success criterion: on day 30, deploy to production with confidence in under 10 minutes.
Day 0: Bootstrap
- Install the skills: git clone https://github.com/AlexHorovitz/skills ~/.claude/skills
- Run /ssd-init once at the project root: it creates the .ssd/ working directory (gitignored), writes .ssd/project.yml (detected stack/framework/platform), creates docs/decisions/, docs/runbooks/, and docs/architecture/, and runs prerequisite checks
- All /ssd phases refuse to proceed until init has run
Week 1: Foundation
- Set up CI/CD pipeline
- Deploy "Hello World" to your distribution channel (production server, TestFlight, Play Internal, notarized build)
- Configure automated testing
- Establish feature flag system (server-side for web, SDK-based for mobile/desktop)
- Invoke /ssd start to run the Walking Skeleton playbook
Week 2: First Feature
- Build one feature end-to-end
- Deploy to production behind flag
- Verify in production
- Enable for internal users
Week 3: Rhythm
- Deploy to production daily
- Every commit passes CI
- All incomplete features behind flags
- Documentation current
Week 4: Optimization
- Reduce deploy time to under 10 minutes
- Increase test coverage
- Remove old feature flags
- Retrospective: what's working?
Platform-specific Day 1 checklists for iOS, Android, macOS, Web, and Headless are in Pattern 1: Deployed Day One above.
Claude Code Skills
InsanelyGreat's SSD is implemented as a set of orchestrated skills for Claude Code — this is what makes the methodology practical for a single developer or small team. The /ssd orchestrator sequences the right sub-skills for each development phase, giving you the equivalent of a senior architect, systems designer, and code reviewer on call at all times. The full skill set is free for personal and internal organizational use — github.com/AlexHorovitz/skills (library v1.17.1).
Skill Taxonomy
| Type | Skills | When you invoke directly |
|---|---|---|
| Bootstrap | /ssd-init | Once, at project start (or when .ssd/ has drifted) |
| Orchestrator | /ssd | Always; start here after init |
| Domain | /architect, /coder, /systems-designer, /refactor | When working outside the SSD workflow |
| Review | /code-reviewer, /codebase-skeptic, /software-standards | On-demand or via SSD |
| Reference | /methodology | When you want to understand SSD doctrine or score self-adherence |
Step 1: /ssd-init — Project Bootstrap
Run once per project before any /ssd phase. First-run housekeeping: creates .ssd/ (gitignored working directory), writes .ssd/project.yml (detected language, framework, platform, distribution channel), creates .ssd/current.yml (active workstreams pointer), creates docs/decisions/, docs/runbooks/, and docs/architecture/ (committed decision records), and runs SSD prerequisite checks.
Idempotent — safe to re-run. It never overwrites existing files, never deletes anything, and appends to .ssd/init-log.md on each run.
Prerequisite: run /ssd-init first. /ssd refuses to proceed if .ssd/project.yml is absent. Init is not auto-run: the user decides when to commit to the SSD convention.
Step 2: /ssd — The Orchestrator
As of v1.8.0, the default is no-arg auto-detect: typing just /ssd reads .ssd/current.yml + .ssd/current.notes.yml, surfaces active workstreams, and proposes the next action. The explicit phase commands below remain as escape hatches when you want to force a step. The orchestrator never silently advances a phase — it always proposes; you accept or redirect.
Parallel Feature Workstreams (new in v1.17)
The orchestrator now treats multiple in-flight features as first-class. Up to four active workstreams per project (soft advisory limit, no hard cap), each with its own branch and an optional git worktree so two features can be edited side-by-side without checkout churn. The single-feature flow remains the default; parallel is opt-in.
At gate time, /ssd gate intersects the gated workstream's tracked file footprint with every other active workstream's footprint and surfaces overlaps as SUGGESTION-tier findings (never BLOCKER, never MAJOR). Overlap is often intentional (layered features, one workstream extends a file the other added); the orchestrator surfaces, the user judges. Running bare /ssd on any branch auto-resolves to the correct workstream via branch name → recorded mapping → prompt. The Shippable State Invariant still holds per workstream: parallelism reduces switching friction, it does not lower the bar.
Full design notes: ADR-0007 — Parallel features as first-class workstream artifacts.
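The footprint intersection described above amounts to a set operation per workstream. The following is an illustrative sketch, not the orchestrator's implementation: the function name, the shape of the `footprints` mapping, and the finding dictionary keys are all assumptions.

```python
from typing import Dict, List, Set


def footprint_overlaps(gated: str,
                       footprints: Dict[str, Set[str]]) -> List[dict]:
    """Report files the gated workstream shares with other active ones.

    Every overlap is reported at SUGGESTION severity, never higher.
    """
    mine = footprints[gated]
    findings = []
    for slug, files in sorted(footprints.items()):
        if slug == gated:
            continue
        shared = sorted(mine & files)
        if shared:
            findings.append(
                {"severity": "SUGGESTION", "workstream": slug, "files": shared}
            )
    return findings


active = {
    "checkout": {"app/cart.py", "app/pay.py"},
    "invoices": {"app/pay.py", "app/pdf.py"},
    "search": {"app/search.py"},
}
report = footprint_overlaps("checkout", active)
```

In this example only the "invoices" workstream is reported, because app/pay.py is the single shared file.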
Iterations Inside a Feature
A feature is one cycle by default (one design → build → review → deploy). Real features sometimes ship as multiple iterations. As of v1.5.0 of /ssd, this is a first-class concept — append #<iter-id> to any phase command:
/ssd code my-feature#3a
/ssd review my-feature#3b
/ssd ship my-feature#3b
Iter-ids match [A-Za-z0-9_-]+. The first #iter reference promotes the feature non-destructively to the multi-iteration layout (iterations/<iter-id>/ under the feature root). Single-cycle features keep the flat layout.
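To make the `#<iter-id>` syntax concrete, here is a sketch of how such an argument could be split. The iter-id character class is the one quoted above; the slug rule (anything without `#` or whitespace) and the `parse_target` name are assumptions, not the orchestrator's actual parser.

```python
import re
from typing import Optional, Tuple

# Slug rule is assumed; the iter-id rule is [A-Za-z0-9_-]+ as stated above.
TARGET = re.compile(r"(?P<slug>[^#\s]+)(?:#(?P<iter>[A-Za-z0-9_-]+))?")


def parse_target(arg: str) -> Tuple[str, Optional[str]]:
    """Split 'my-feature#3a' into ('my-feature', '3a')."""
    match = TARGET.fullmatch(arg)
    if match is None:
        raise ValueError(f"not a valid feature target: {arg!r}")
    return match.group("slug"), match.group("iter")


single_cycle = parse_target("my-feature")      # ('my-feature', None)
multi_iter = parse_target("my-feature#3a")     # ('my-feature', '3a')
```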
Multi-Round Gates
If /ssd gate fails (BLOCKER or MAJOR found), the workstream returns to coder. Re-running the gate after fixes produces a round-2 review at 04-code-review-round-2.md (or iterations/<iter>/code-review/round-2.md for multi-iter features). The orchestrator auto-numbers rounds, increments current.yml.gate_rounds, and requires closed_from_previous_round discipline on round 2+ — every closure is verified against the code, not copied from coder-status. gate_rounds: 3 on a workstream is a useful budget signal that scope cut may be wiser than another fix attempt.
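The round-numbering convention can be sketched as a path helper. The round-1 and round-2 paths follow the names given in this guide; the naming for round 3 and beyond is extrapolated from that pattern and is an assumption, as are the function names.

```python
from typing import Optional


def review_path(round_number: int, iter_id: Optional[str] = None) -> str:
    """Path of a gate review, relative to the feature root."""
    if round_number < 1:
        raise ValueError("rounds are numbered from 1")
    if iter_id is not None:
        # Multi-iteration layout keeps every round under code-review/.
        return f"iterations/{iter_id}/code-review/round-{round_number}.md"
    if round_number == 1:
        return "04-code-review.md"
    return f"04-code-review-round-{round_number}.md"


def suggests_scope_cut(gate_rounds: int, threshold: int = 3) -> bool:
    """Treat a third gate round as a signal to consider cutting scope."""
    return gate_rounds >= threshold
```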
The Rails — Canonical Opinionated Path
The eight-step canonical sequence (brief → design → code → review → gate → deploy → rollout-advance → flag-removal) lives in ssd/rails.md. That file is the single source of truth for what no-arg auto-detect proposes and what the eight critic-grade invariants are. A workstream that skips a step records the deviation in current.yml.active[].rail_deviations. Deviations are not failures — they are engineering judgment captured for the record. Teams with genuinely different needs fork rails.md and point project.yml.rails: at the fork.
Developer Profile + Teaching Mode
v1.10.0 adds a developer_profile field to .ssd/project.yml with values novice | standard | expert. The profile adjusts defaults — confirmation prompts, narration verbosity, whether bare phase commands are accepted — without forking the product. A novice can always invoke any command an expert can; the profile is a hint, not a gate. Teaching mode (auto-on for the first 5 invocations) appends a one-line "under the hood: I called architect because we're at phase=design" to every conversational turn so the user can see what the orchestrator chose and why. Re-enable with /ssd --teach; disable permanently with /ssd --no-teach. Bridge flags --explain, --narrate, --raw let either surface reveal the other.
Milestone → Verify Loop
Every milestone takes a before/after snapshot and requires explicit verification:
- Snapshot: record git SHA and metrics to .ssd/milestones/<topic>/sha-before and metrics-before.yml.
- Deep audit: codebase-skeptic writes skeptic-before.md.
- Refactor planning: refactor emits refactor-plan.md; every item cites a specific finding ID from skeptic-before.md. No cite → not in scope.
- Validate: code-reviewer with remediation_mode: true on each refactor PR.
- Deploy and confirm production health.
- Verify (mandatory): re-run codebase-skeptic → skeptic-after.md; diff frontmatter; re-run code-reviewer on the remediation diff. The milestone is complete only when all original BLOCKER/🔴/💀 findings are ✅ closed, no new BLOCKER-severity regression was introduced, and the remediation diff has no BLOCKERs. A refactor that claims to close findings without verification is indistinguishable from wishful thinking.
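The three completion conditions in the Verify step can be checked mechanically once finding IDs are extracted from the before and after reports. This sketch assumes findings are identified by string IDs and that open blocker-severity findings have already been collected into sets; the function name and message wording are hypothetical.

```python
from typing import List, Set


def verification_failures(before: Set[str],
                          after: Set[str],
                          remediation_blockers: int) -> List[str]:
    """Return the reasons a milestone is not yet complete (empty if done).

    before: blocker-severity finding IDs in skeptic-before.md
    after:  blocker-severity finding IDs still open in skeptic-after.md
    remediation_blockers: BLOCKER count from reviewing the remediation diff
    """
    failures = []
    still_open = sorted(before & after)
    if still_open:
        failures.append("original findings still open: " + ", ".join(still_open))
    regressions = sorted(after - before)
    if regressions:
        failures.append("new blocker-severity findings: " + ", ".join(regressions))
    if remediation_blockers > 0:
        failures.append(f"remediation diff has {remediation_blockers} BLOCKER(s)")
    return failures
```

Returning reasons rather than a bare boolean keeps the verification record useful: an empty list means the milestone is complete, anything else says exactly what remains.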
Sub-Skill Reference
| Skill | Role in SSD | Phase |
|---|---|---|
| /ssd-init (v1.5.0) | First-run housekeeping: creates .ssd/ tree, writes project.yml + current.yml (v2 schema) + current.notes.yml, runs prerequisite checks. Idempotent. v1.5.0 added developer_profile and teaching_mode defaults; v1.3.0 added the v1→v2 prompted migration. | prerequisite to all phases |
| /architect (v1.1.0) | Design: models, services, API contracts, ADRs, current-scale baseline. Platform-adaptive (web, iOS, Android, macOS, headless); web guides cover Next.js, Django, FastAPI, Rails, Laravel, Angular, Vue/Nuxt, Spring Boot, ASP.NET Core. Integration has a first-class contract. | start, feature |
| /systems-designer (v1.2.0) | Production readiness: reliability, observability, deployment safety. Validates architect spec in Phase 0. Covers AI/LLM integration, compliance & data lifecycle, cost observability, and chaos/failure injection. | start, feature, ship |
| /coder (v1.1.0) | Implementation from spec (Python, TypeScript, Swift, Ruby, Java, C#, PHP, Go, Rust, C/C++, Obj-C). Halts if the architect spec omits a feature flag. Spec-drift check amends ADRs. Emits 03-coder-status.md with test/lint/typecheck results. | feature |
| /code-reviewer (v1.2.0) | PR gate: BLOCKER/MAJOR findings block merge. Phase 1.5 prior-review follow-up (remediation mode) and Phase 3.5 fix-introduces-edge-cases. Red flags include LLM prompt injection, IntegrityError fetch mismatch, cache-without-race-test, release theatre. Loads examples.md reference. | feature, milestone, gate, verify |
| /codebase-skeptic (v1.2.0) | Deep architectural critique through 10 expert voices. Mandatory Phase 2.5 Operational Failure Modes Sweep. Forward-Looking Pass in Phase 4. Incident-Story attestation (Beck), Domain-Modeling Stance (Evans), Deployment-Gate Hardening (Humble). | milestone |
| /software-standards (v1.1.0) | Adversarial comparative audit. Two modes: Comparative and Adversarial Single. Requires 2–3 evidence citations per /10 score. For vendor selection / legacy onboarding / pre-acquisition, not routine review. | audit |
| /refactor (v1.2.0) | Post-ship targeted improvement. Every item cites a specific finding from skeptic-before.md. Step 4.5 Budget Check with halt-and-rollback. Step 5 per-item re-check loop closure. Step 6 systems-designer coordination trigger. Loads patterns.md reference. | milestone |
| /methodology (v1.2.0) | SSD doctrine reference: Iron Law, Five Principles, Decision Framework. Provides machine-checkable rule source for /ssd gate. /methodology score emits a self-adherence metric. | reference / any phase |
Review Tier Selection
Three skills do "review" work. Never chain all three — pick the right tier:
- /code-reviewer: every PR, always, no exceptions (≤500 changed lines)
- /codebase-skeptic: milestone reviews and pre-release audits of an owned codebase
- /software-standards: comparative/adversarial evaluation only (vendor selection, legacy onboarding, pre-acquisition)
When coder and a language-specific coder (e.g. python-django-coder) both apply, the specific one wins. code-reviewer and codebase-skeptic are mutually exclusive on the same scope. codebase-skeptic and software-standards are mutually exclusive.
The SSD Artifact Tree
Every SSD invocation produces artifacts at well-known paths relative to the project root. Sub-skills read from and write to this tree — that is what lets a session resume, a reviewer verify, and a teammate onboard. As of v1.3.0 of the orchestrator, the working directory is hidden (.ssd/) — the visible ssd/ name collided with the orchestrator skill source directory in the SSD skills repo itself.
<project-root>/
├── docs/ # committed decision records
│ ├── decisions/ # ADRs from architect
│ ├── runbooks/ # runbooks from systems-designer
│ └── architecture/ # component diagrams, data models
└── .ssd/ # gitignored working directory
├── project.yml # language, framework, platform, profile
├── current.yml # v2 schema: machine-managed workstream state
├── current.notes.yml # free-form context for next agent / human
├── features/
│ └── <slug>/
│ ├── 00-brief.md # epic-level for multi-iter features
│ ├── 01-architect.md
│ ├── 02-systems-designer.md
│ ├── 03-coder-status.md # — single-cycle features only
│ ├── 04-code-review.md # — single-cycle features only
│ ├── 04-code-review-round-2.md # — multi-round gate output
│ ├── 05-deploy.md
│ └── iterations/ # — multi-iteration features only (opt-in)
│ └── <iter-id>/ # e.g., 3a, 3b, auth-flow
│ ├── brief.md
│ ├── coder-status.md
│ ├── code-review/
│ │ ├── round-1.md
│ │ └── round-2.md
│ ├── deferred.yml # carry-over ledger
│ └── deploy.md
├── milestones/
│ └── YYYY-MM-DD-<topic>/
│ ├── sha-before
│ ├── metrics-before.yml
│ ├── skeptic-before.md
│ ├── refactor-plan.md
│ ├── refactor-prs.md
│ ├── skeptic-after.md
│ └── verification.md
├── audits/
│ └── YYYY-MM-DD-<scope>/
│ └── standards-report.md
└── archive/ # closed feature + milestone directories
Every primary output carries YAML frontmatter (skill, version, produced_at, scope, consumed_by). Review outputs add finding_counts and a computed gate_pass. Design outputs add a deliverables block. This is what makes /ssd gate mechanically checkable and milestone verification a frontmatter diff rather than prose reconciliation.
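A computed gate_pass could be derived from finding_counts along these lines. The rule itself (no BLOCKER or MAJOR findings) comes from this guide; the dictionary shape, the severity key spellings, and the example frontmatter values are assumptions for illustration.

```python
from typing import Dict

# Severities that block a merge, per the gate rule in this guide.
BLOCKING_SEVERITIES = ("BLOCKER", "MAJOR")


def gate_pass(finding_counts: Dict[str, int]) -> bool:
    """True when the review recorded no BLOCKER and no MAJOR findings."""
    return all(finding_counts.get(sev, 0) == 0 for sev in BLOCKING_SEVERITIES)


# Hypothetical frontmatter for a review output, already parsed to a dict.
frontmatter = {
    "skill": "code-reviewer",
    "scope": "my-feature",
    "finding_counts": {"BLOCKER": 0, "MAJOR": 1, "SUGGESTION": 4},
}
frontmatter["gate_pass"] = gate_pass(frontmatter["finding_counts"])
```

Because gate_pass is a pure function of the counts, two runs over the same frontmatter always agree, which is what lets verification compare frontmatter instead of prose.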
current.yml v2 carries schema_version: 2 with per-workstream slug, phase, iteration, budget_hours, elapsed_hours, gate_rounds, rail_deviations, and blockers. The free-form sidecar current.notes.yml holds anything that doesn't fit the schema (handoff notes, scope changes, open questions). Legacy v1 files are read in compatibility mode with an opt-in prompted migration that writes current.yml.bak first — no silent rewrites.
Session Continuity
On invocation, /ssd reads .ssd/current.yml + .ssd/current.notes.yml. Each active workstream carries a budget in hours. The orchestrator flags entries that are over budget ("suggest scope reduction, not more work") and entries last-touched more than 3 days ago ("stale work that may need a fresh audit"). Closing a workstream archives its artifacts under .ssd/archive/features/<slug>/; matching notes move to .ssd/archive/features/<slug>/notes.yml so historical context stays with the work.
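The two session-start checks described above can be sketched as follows. budget_hours and elapsed_hours are fields named in the current.yml v2 description; last_touched, the Workstream class, and the function name are hypothetical, and the real schema may record recency differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Workstream:
    slug: str
    budget_hours: float
    elapsed_hours: float
    last_touched: date  # hypothetical field name


def session_warnings(ws: Workstream, today: date,
                     stale_after_days: int = 3) -> List[str]:
    """Flag a workstream that is over budget or untouched for too long."""
    warnings = []
    if ws.elapsed_hours > ws.budget_hours:
        warnings.append("over budget: suggest scope reduction, not more work")
    if (today - ws.last_touched).days > stale_after_days:
        warnings.append("stale work that may need a fresh audit")
    return warnings


checkout = Workstream("checkout", budget_hours=8.0, elapsed_hours=9.5,
                      last_touched=date(2024, 5, 1))
flags = session_warnings(checkout, today=date(2024, 5, 6))
```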
Methodology-Backed Gate Enforcement
As of v1.4.0 of the orchestrator, gate enforcement is an executable shell script, not LLM-internal checks. Before /ssd gate passes, the orchestrator invokes:
bash methodology/gate-rules.sh --base <base-branch> --json
Each rule emits PASS | FAIL | SKIP. Any FAIL exits non-zero and the gate refuses to pass. The same script is invocable from CI for parity.
| Rule (script) | What it checks | Source |
|---|---|---|
| wip-commits | git log <base>..HEAD --grep='WIP\|checkpoint\|TODO.*tomorrow\|FIXME.*later' -i is empty | core.md §4 |
| tests-pass | Project's test_command from .ssd/project.yml exits 0 | core.md §1 |
| feature-flag-present | Project's feature_flag_marker appears in non-doc changed files (skipped for doc/config-only diffs) | core.md §3 |
| adr-delta | If architectural diff > 200 lines outside test/doc/migration scope, docs/decisions/ has a new or modified ADR | core.md §2 |
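The actual enforcement is the bash script named above. To illustrate the logic only, here is a Python sketch of the wip-commits pattern and of how per-rule results roll up into an exit code. The function names are hypothetical, and the commit subjects are passed in as a list rather than read from git.

```python
import re
from typing import Dict, List

# Same alternation as the wip-commits rule, matched case-insensitively.
# It is a substring match, so a subject containing "swipe" would also hit.
WIP_PATTERN = re.compile(
    r"WIP|checkpoint|TODO.*tomorrow|FIXME.*later", re.IGNORECASE
)


def wip_commits_rule(subjects: List[str]) -> str:
    """PASS when no commit subject matches the banned pattern."""
    offenders = [s for s in subjects if WIP_PATTERN.search(s)]
    return "FAIL" if offenders else "PASS"


def gate_exit_code(results: Dict[str, str]) -> int:
    """Any FAIL makes the gate exit non-zero; SKIP does not."""
    return 1 if "FAIL" in results.values() else 0


results = {
    "wip-commits": wip_commits_rule(["Add checkout flag", "Fix rounding bug"]),
    "adr-delta": "SKIP",
}
code = gate_exit_code(results)
```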
"I know better" is not an override — use /ssd ship --force (logged) if the team has a deliberate exception. See ADR-0005 for why this is a bash script rather than orchestrator-internal LLM checks.
Hard Rules
1. No merge without a clean /ssd gate
No BLOCKER or MAJOR findings from the code-reviewer. No exceptions.
2. No incomplete work on main without a feature flag
WIP commits on main are banned. Use a feature flag or a local stash.
3. Tests must pass before and after every change
"I'll fix the tests tomorrow" is not a shippable state.
4. Refactor only after shipping
Separate PRs, never mixed with feature work. Milestones run after shipping, never instead of it.
5. Deploy beats perfection
Reduce scope rather than delay a deploy.
6. Production parity from day one
If you haven't deployed to production yet, that is your next task.