# GitHub PR Roundtrip Test Harness Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Build an end-to-end integration test harness that exercises `crit pull` and `crit push` against a real GitHub repo + real PR + real `gh api` comment posts, exhausting every state-transition variant of the local-review-file ↔ GitHub-PR roundtrip so regressions like "deleting comments and recreating them whole" are caught automatically. **Architecture:** - One dedicated sandbox GitHub repo (created once, reused) with a stable `main` branch containing a fixed set of files. - A Go integration test file behind build tag `e2e_github` that, per scenario, spins up a fresh feature branch + PR, runs the `crit` binary against it, posts/edits/deletes comments via `gh api`, and asserts on both (a) the contents of the local review JSON file (`~/.crit/reviews/.json`) and (b) the live PR comment list fetched via `gh`. Each scenario uses a unique throwaway branch name and closes its PR + deletes its branch in `t.Cleanup`. - Tests share a small set of helpers (in a sibling `_test.go`) for: cloning the sandbox into a temp dir, opening a PR with seeded files, running the compiled `crit` binary in that workdir, posting comments via `gh api`, reading the review file, and listing PR comments. **Tech Stack:** - Go (`testing` package, table-driven where it fits) - `gh` CLI (assume installed and authenticated; harness skips otherwise — same pattern as `pr_integration_test.go`) - The compiled `./crit` binary (built once by Makefile target before running) - A real GitHub repo under the user's personal account (not committed to crit/) **Scope discipline — testing only.** This plan adds the harness. It does NOT fix any bug the harness exposes. When a scenario fails, file the failure as a GitHub issue with the verbatim test output and move on. Each fix lives in its own follow-up plan/PR so the harness lands clean and we keep diff scope tight. **Out of scope:** - crit-web sharing flow (already covered by `share_integration_test.go`) - Sapling/`sl` backend (this harness is git-only) - The browser UI (covered by Playwright e2e) - Fixing any production bug the harness uncovers — file as a follow-up issue **NOT in CI.** This suite hits the live GitHub API, costs network round-trips, eats rate-limit budget, and creates real PRs. It is local-developer-only: - Build tag `e2e_github` keeps it out of `go test ./...`. - The runner script hard-fails when `CRIT_ROUNDTRIP_REPO` is unset, so even if CI accidentally ran the tag, it would skip cleanly. - No GitHub Actions workflow is added. - `crit/.github/workflows/test.yml` is **not** modified by this plan — verify in self-review that nothing else triggers `e2e_github`. **Anyone can run it.** The sandbox slug is per-developer via the `CRIT_ROUNDTRIP_REPO` environment variable. A contributor wanting to run the suite creates their own `crit-roundtrip-sandbox` repo under any account they control, exports `CRIT_ROUNDTRIP_REPO=/crit-roundtrip-sandbox`, and runs `make e2e-roundtrip`. Task 1's bootstrap procedure is reproducible; the README spells this out. --- ## Reported bug class to reproduce A user reports `crit pull` and `crit push` "delete comments and recreate them whole." Plausible causes the harness must surface: 1. `push` posts a new GitHub comment for a comment that already has a `GitHubID` (idempotency broken). 2. `pull` overwrites a local comment in place, losing its `GitHubID`, so the next `push` re-posts it. 3. `pull` doesn't recognize a comment it just pushed (because GitHub assigns a new ID different from the local one) and re-imports it as a duplicate root. 4. Replies lose their parent linkage and get pushed as new root comments. 5. Switching branches and returning corrupts the review file. The matrix of scenarios below is designed so any of these would fail at least one assertion. --- ## File Structure Files to create (all under `crit/`): | Path | Responsibility | | --- | --- | | `crit/roundtrip_integration_test.go` | All scenarios — one Go test func per scenario, build tag `e2e_github`. | | `crit/roundtrip_helpers_test.go` | Shared helpers (clone sandbox, openPR, runCrit, postRemoteComment, readReviewFile, listPRComments, cleanup). Build tag `e2e_github`. | | `crit/scripts/e2e-roundtrip.sh` | Runner: builds `crit`, exports env (sandbox repo slug), invokes `go test -tags e2e_github -run TestRoundtrip`. | | `crit/test/roundtrip/README.md` | Setup steps (one-time sandbox repo creation, env vars, how to run, how to add scenarios). | Files to modify: | Path | Change | | --- | --- | | `crit/Makefile` | Add `e2e-roundtrip` target. | | `crit/scripts/CLAUDE.md` (or `AGENTS.md`) | One sentence pointing at `e2e-roundtrip.sh`. | The plan does NOT touch any production source under `crit/` — the harness is purely additive. If a scenario surfaces a real bug, fixing it is a separate change handled outside this plan. --- ## One-time prerequisites (Task 1 only) The engineer running Task 1 needs: - A GitHub account with `gh auth status` green - Permission to create a public repo under an account or org they own - The repo will be named `crit-roundtrip-sandbox`. **Default owner is the `crit-md` org** (keeps personal-account noise low). Anyone can run the suite against any sandbox they own — pass `--owner ` to `gh repo create` in Task 1 Step 3 if you don't have access to `crit-md`. - Public is fine — content is dummy code. Public also avoids collaborator/org gotchas. --- ## Worktree setup Before Task 1, create the worktree: ```bash cd /Users/tomasztomczyk/Server/side/crit-mono/crit git fetch origin main wt switch -c -b origin/main pr-roundtrip-harness git log --oneline origin/main..HEAD # expect empty ``` All subsequent tasks operate inside that worktree. --- ## Task 1: Create sandbox GitHub repo **Files:** - Create: (no files in repo) — produces a real GitHub repo `/crit-roundtrip-sandbox` with seeded content on `main`. - [ ] **Step 1: Decide the repo owner + slug** Default owner is the `crit-md` org. Override via `OWNER=` if you don't have access: ```bash OWNER="${OWNER:-crit-md}" echo "$OWNER/crit-roundtrip-sandbox" ``` - [ ] **Step 2: Check the repo doesn't already exist** Run: ```bash gh repo view "$OWNER/crit-roundtrip-sandbox" 2>/dev/null && echo EXISTS || echo NEW ``` If `EXISTS`: stop, talk to the user before reusing or deleting. If `NEW`: continue. - [ ] **Step 3: Create the repo and seed content** Run from a scratch dir (NOT inside the worktree): ```bash TMP=$(mktemp -d) cd "$TMP" mkdir crit-roundtrip-sandbox && cd crit-roundtrip-sandbox git init -q -b main cat > README.md <<'EOF' # crit roundtrip sandbox Throwaway repo used by crit's `e2e-roundtrip` integration tests. Branches and PRs here are created and closed automatically — do not rely on any branch existing. EOF cat > sample.go <<'EOF' package sample func Add(a, b int) int { return a + b } func Sub(a, b int) int { return a - b } func Mul(a, b int) int { return a * b } func Div(a, b int) int { return a / b } EOF cat > sample.md <<'EOF' # Sample plan ## Section A First paragraph in section A. ## Section B First paragraph in section B. ## Section C First paragraph in section C. EOF git add . git -c user.email=me@tomasztomczyk.com -c user.name="crit-roundtrip-bot" commit -q -m "seed sandbox content" gh repo create "$OWNER/crit-roundtrip-sandbox" \ --public \ --source=. \ --remote=origin \ --push \ --description "Sandbox for crit pull/push integration tests" ``` - [ ] **Step 4: Verify the repo is reachable** Run: ```bash gh api "repos/$OWNER/crit-roundtrip-sandbox" --jq .default_branch ``` Expected: `main` - [ ] **Step 5: Export the slug for your shell** Add to your shell rc (or a `.envrc.local` if you use direnv): ```bash export CRIT_ROUNDTRIP_REPO=/crit-roundtrip-sandbox ``` Or just export it ad-hoc before running `make e2e-roundtrip`. The runner expects this env var to be set; no config files are involved. - [ ] **Step 6: No commit (the worktree is unchanged)** This task created a remote repo. No code change yet. --- ## Task 2: Test runner script + Makefile target **Files:** - Create: `crit/scripts/e2e-roundtrip.sh` - Modify: `crit/Makefile` (add `e2e-roundtrip` PHONY target) - [ ] **Step 1: Write the runner** Create `crit/scripts/e2e-roundtrip.sh`: ```bash #!/usr/bin/env bash # End-to-end integration test runner for the crit ↔ GitHub PR roundtrip. # Builds crit, then runs Go tests behind build tag e2e_github against a real # GitHub PR in the sandbox repo configured via CRIT_ROUNDTRIP_REPO. # # Usage: # ./scripts/e2e-roundtrip.sh # all scenarios # ./scripts/e2e-roundtrip.sh -run TestX # one scenario # ./scripts/e2e-roundtrip.sh -v -count=1 # extra go test args set -euo pipefail CRIT_DIR="$(cd "$(dirname "$0")/.." && pwd)" if command -v /opt/homebrew/bin/mise >/dev/null 2>&1; then eval "$(/opt/homebrew/bin/mise env -s bash -C "$CRIT_DIR" 2>/dev/null)" || true fi if [ -z "${CRIT_ROUNDTRIP_REPO:-}" ]; then echo "✗ CRIT_ROUNDTRIP_REPO not set." >&2 echo " Export it (e.g. CRIT_ROUNDTRIP_REPO=crit-md/crit-roundtrip-sandbox)." >&2 echo " See crit/test/roundtrip/README.md for one-time setup." >&2 exit 1 fi if ! command -v gh >/dev/null 2>&1; then echo "✗ gh not installed" >&2; exit 1 fi if ! gh auth status >/dev/null 2>&1; then echo "✗ gh not authenticated" >&2; exit 1 fi echo "→ Building crit..." make -C "$CRIT_DIR" build -j export CRIT_ROUNDTRIP_REPO export CRIT_BINARY="$CRIT_DIR/crit" cd "$CRIT_DIR" exec go test -tags e2e_github -count=1 -timeout 20m -run TestRoundtrip "$@" ./... ``` Make it executable: ```bash chmod +x crit/scripts/e2e-roundtrip.sh ``` - [ ] **Step 2: Add Makefile target** In `crit/Makefile`, add after the `e2e-share` rule: ```make e2e-roundtrip: build ./scripts/e2e-roundtrip.sh ``` And add `e2e-roundtrip` to the `.PHONY:` list at the bottom. - [ ] **Step 3: Smoke test the runner (it has nothing to run yet)** Run: ```bash make -C crit e2e-roundtrip ``` Expected: builds `crit`, then `go test ... -run TestRoundtrip ./...` exits with `no tests to run` (exit 0). If it errors on env/auth, fix and retry. - [ ] **Step 4: Commit** ```bash cd crit git add scripts/e2e-roundtrip.sh Makefile git -c core.hooksPath=/dev/null commit -m "test: add e2e-roundtrip script and Makefile target" ``` (Pre-commit hook bypassed per repo memory: the crit pre-commit hook has been observed corrupting worktrees. Run `go test ./...` and `gofmt -l .` manually instead — both should be clean since no Go changed in this task.) --- ## Task 3: Helpers — write the file with no scenarios yet **Files:** - Create: `crit/roundtrip_helpers_test.go` - [ ] **Step 1: Write the helpers file** Create `crit/roundtrip_helpers_test.go`: ```go //go:build e2e_github package main import ( "bytes" "context" "encoding/json" "fmt" "os" "os/exec" "path/filepath" "strings" "testing" "time" ) // roundtripEnv holds the per-scenario state. type roundtripEnv struct { t *testing.T repoSlug string // e.g. "tomasztomczyk/crit-roundtrip-sandbox" branch string // unique per scenario prNumber int workDir string // temp clone dir critBinary string } // newRoundtripEnv clones the sandbox into a temp dir, creates a unique // branch with a tiny diff against main, opens a PR, and registers cleanup // to close the PR + delete the branch. Skips when prerequisites are missing. func newRoundtripEnv(t *testing.T) *roundtripEnv { t.Helper() if _, err := exec.LookPath("gh"); err != nil { t.Skip("gh not installed") } if err := exec.Command("gh", "auth", "status").Run(); err != nil { t.Skip("gh not authenticated") } repo := os.Getenv("CRIT_ROUNDTRIP_REPO") if repo == "" { t.Skip("CRIT_ROUNDTRIP_REPO not set") } bin := os.Getenv("CRIT_BINARY") if bin == "" { t.Skip("CRIT_BINARY not set") } dir := t.TempDir() branch := fmt.Sprintf("rt-%s-%d", sanitizeName(t.Name()), time.Now().UnixNano()) mustRun(t, dir, "git", "clone", "--depth", "1", fmt.Sprintf("https://github.com/%s.git", repo), ".") mustRun(t, dir, "git", "config", "user.email", "me@tomasztomczyk.com") mustRun(t, dir, "git", "config", "user.name", "crit-roundtrip-bot") mustRun(t, dir, "git", "checkout", "-b", branch) // Apply a tiny diff so the PR has lines to comment on. if err := appendLine(filepath.Join(dir, "sample.go"), "\nfunc Mod(a, b int) int { return a % b }\n"); err != nil { t.Fatal(err) } if err := appendLine(filepath.Join(dir, "sample.md"), "\n## Section D\nFirst paragraph in section D.\n"); err != nil { t.Fatal(err) } mustRun(t, dir, "git", "commit", "-am", "add Mod and section D") mustRun(t, dir, "git", "push", "-u", "origin", branch) prURL := mustOutput(t, dir, "gh", "pr", "create", "--repo", repo, "--title", "roundtrip: "+t.Name(), "--body", "automated test PR", "--head", branch, "--base", "main") prNum := mustParsePRNumber(t, prURL) env := &roundtripEnv{ t: t, repoSlug: repo, branch: branch, prNumber: prNum, workDir: dir, critBinary: bin, } t.Cleanup(func() { // Best-effort: close the PR (GitHub does not allow deleting PRs). _ = exec.Command("gh", "pr", "close", "--repo", repo, fmt.Sprint(prNum), "--delete-branch").Run() }) return env } // runCrit runs the crit binary inside the env workdir with the given args // and returns combined stdout+stderr. Fails the test on non-zero exit. func (e *roundtripEnv) runCrit(args ...string) string { e.t.Helper() cmd := exec.Command(e.critBinary, args...) cmd.Dir = e.workDir cmd.Env = append(os.Environ(), "GH_NO_UPDATE_NOTIFIER=1") var out bytes.Buffer cmd.Stdout = &out cmd.Stderr = &out if err := cmd.Run(); err != nil { e.t.Fatalf("crit %v failed: %v\noutput:\n%s", args, err, out.String()) } return out.String() } // runCritExpectExit runs crit and returns (output, exit code). Does not // fail the test on non-zero. Use for negative cases. func (e *roundtripEnv) runCritExpectExit(args ...string) (string, int) { e.t.Helper() cmd := exec.Command(e.critBinary, args...) cmd.Dir = e.workDir var out bytes.Buffer cmd.Stdout = &out cmd.Stderr = &out err := cmd.Run() code := 0 if exitErr, ok := err.(*exec.ExitError); ok { code = exitErr.ExitCode() } else if err != nil { e.t.Fatalf("crit %v: %v", args, err) } return out.String(), code } // reviewFile reads and unmarshals the review file crit wrote for this branch. // Resolves it via `crit status --json`. func (e *roundtripEnv) reviewFile() CritJSON { e.t.Helper() out := e.runCrit("status", "--json") var status struct { ReviewPath string `json:"review_path"` } if err := json.Unmarshal([]byte(out), &status); err != nil { e.t.Fatalf("parse status JSON: %v\noutput:\n%s", err, out) } if status.ReviewPath == "" { e.t.Fatalf("status JSON had no review_path:\n%s", out) } data, err := os.ReadFile(status.ReviewPath) if err != nil { e.t.Fatalf("read review file %s: %v", status.ReviewPath, err) } var cj CritJSON if err := json.Unmarshal(data, &cj); err != nil { e.t.Fatalf("parse review file: %v", err) } return cj } // allComments flattens the review file into a slice of (file, comment). type localComment struct { Path string Comment Comment } func (e *roundtripEnv) allLocalComments() []localComment { cj := e.reviewFile() var out []localComment for path, f := range cj.Files { for _, c := range f.Comments { out = append(out, localComment{Path: path, Comment: c}) } } return out } // remoteComment is a slim view of the GitHub review-comment API payload. type remoteComment struct { ID int64 `json:"id"` InReplyTo int64 `json:"in_reply_to_id"` Path string `json:"path"` Line int `json:"line"` StartLine int `json:"start_line"` Body string `json:"body"` UserLogin string `json:"-"` rawUser struct { Login string `json:"login"` } `json:"user"` } // listRemoteComments fetches all review comments for the PR via gh api. func (e *roundtripEnv) listRemoteComments() []remoteComment { e.t.Helper() out := mustOutput(e.t, e.workDir, "gh", "api", fmt.Sprintf("repos/%s/pulls/%d/comments?per_page=100", e.repoSlug, e.prNumber)) var raw []remoteComment if err := json.Unmarshal([]byte(out), &raw); err != nil { e.t.Fatalf("parse remote comments: %v\n%s", err, out) } for i := range raw { raw[i].UserLogin = raw[i].rawUser.Login } return raw } // postRemoteComment posts a brand-new top-level review comment on the PR // via gh api (simulates a reviewer commenting in the GitHub web UI). // Returns the new comment's GitHub ID. func (e *roundtripEnv) postRemoteComment(path string, line int, body string) int64 { e.t.Helper() headSHA := strings.TrimSpace(mustOutput(e.t, e.workDir, "git", "rev-parse", "HEAD")) payload := map[string]any{ "body": body, "commit_id": headSHA, "path": path, "line": line, "side": "RIGHT", } buf, _ := json.Marshal(payload) cmd := exec.Command("gh", "api", fmt.Sprintf("repos/%s/pulls/%d/comments", e.repoSlug, e.prNumber), "--method", "POST", "--input", "-") cmd.Stdin = bytes.NewReader(buf) cmd.Dir = e.workDir out, err := cmd.CombinedOutput() if err != nil { e.t.Fatalf("post remote comment: %v\n%s", err, out) } var resp struct { ID int64 `json:"id"` } if err := json.Unmarshal(out, &resp); err != nil { e.t.Fatalf("parse remote-comment response: %v\n%s", err, out) } return resp.ID } // postRemoteReply posts a reply to an existing review comment. func (e *roundtripEnv) postRemoteReply(parentID int64, body string) int64 { e.t.Helper() payload := map[string]any{"body": body, "in_reply_to": parentID} buf, _ := json.Marshal(payload) cmd := exec.Command("gh", "api", fmt.Sprintf("repos/%s/pulls/%d/comments", e.repoSlug, e.prNumber), "--method", "POST", "--input", "-") cmd.Stdin = bytes.NewReader(buf) cmd.Dir = e.workDir out, err := cmd.CombinedOutput() if err != nil { e.t.Fatalf("post remote reply: %v\n%s", err, out) } var resp struct { ID int64 `json:"id"` } if err := json.Unmarshal(out, &resp); err != nil { e.t.Fatalf("parse remote-reply response: %v\n%s", err, out) } return resp.ID } // --- low-level helpers --- func mustRun(t *testing.T, dir, name string, args ...string) { t.Helper() cmd := exec.CommandContext(context.Background(), name, args...) cmd.Dir = dir if out, err := cmd.CombinedOutput(); err != nil { t.Fatalf("%s %v failed in %s: %v\n%s", name, args, dir, err, out) } } func mustOutput(t *testing.T, dir, name string, args ...string) string { t.Helper() cmd := exec.Command(name, args...) cmd.Dir = dir out, err := cmd.CombinedOutput() if err != nil { t.Fatalf("%s %v failed in %s: %v\n%s", name, args, dir, err, out) } return string(out) } func appendLine(path, text string) error { f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644) if err != nil { return err } defer f.Close() _, err = f.WriteString(text) return err } func mustParsePRNumber(t *testing.T, prURL string) int { t.Helper() prURL = strings.TrimSpace(prURL) idx := strings.LastIndex(prURL, "/") if idx < 0 { t.Fatalf("cannot parse PR number from %q", prURL) } var n int if _, err := fmt.Sscanf(prURL[idx+1:], "%d", &n); err != nil { t.Fatalf("cannot parse PR number from %q: %v", prURL, err) } return n } func sanitizeName(s string) string { s = strings.ReplaceAll(s, "/", "-") s = strings.ReplaceAll(s, " ", "-") if len(s) > 30 { s = s[:30] } return s } ``` - [ ] **Step 2: Compile the test file** Run: ```bash cd crit go test -tags e2e_github -run NONEXISTENT -count=1 ./... ``` Expected: builds successfully (`ok` or `no tests to run`). If a name doesn't compile (e.g. `Comment` field name mismatch), fix by inspecting `review_file.go` for the actual struct names. - [ ] **Step 3: Commit** ```bash git add roundtrip_helpers_test.go git -c core.hooksPath=/dev/null commit -m "test: add roundtrip e2e helpers (no scenarios yet)" ``` --- ## Task 4: Scenario — fresh PR, idempotent push (the bug class) **Files:** - Create: `crit/roundtrip_integration_test.go` This is THE scenario that should reproduce the reported bug if it exists. We add local comments, push, then push again and assert no duplicate appears on the PR. - [ ] **Step 1: Write the failing scenario** Create `crit/roundtrip_integration_test.go`: ```go //go:build e2e_github package main import ( "strings" "testing" ) // TestRoundtrip_PushIsIdempotent: a clean PR + local comments. After two // pushes, GitHub should hold exactly N comments. This is the canonical // "delete and recreate" regression test. func TestRoundtrip_PushIsIdempotent(t *testing.T) { e := newRoundtripEnv(t) // Two local comments on different files. e.runCrit("comment", "sample.go:1", "Comment on sample.go line 1") e.runCrit("comment", "sample.md:1", "Comment on sample.md line 1") // First push. out1 := e.runCrit("push") if !strings.Contains(out1, "Posted") && !strings.Contains(out1, "review") { t.Logf("push #1 output:\n%s", out1) } remoteAfter1 := e.listRemoteComments() if len(remoteAfter1) != 2 { t.Fatalf("after push #1: want 2 remote comments, got %d:\n%s", len(remoteAfter1), dumpRemote(remoteAfter1)) } // Second push — must not create duplicates and must not delete anything. out2 := e.runCrit("push") t.Logf("push #2 output:\n%s", out2) remoteAfter2 := e.listRemoteComments() if len(remoteAfter2) != 2 { t.Fatalf("after push #2: want 2 remote comments, got %d:\n%s", len(remoteAfter2), dumpRemote(remoteAfter2)) } // IDs must match across the two pushes — no deletions, no recreations. idsBefore := commentIDs(remoteAfter1) idsAfter := commentIDs(remoteAfter2) if !sameIDs(idsBefore, idsAfter) { t.Fatalf("comment IDs changed between pushes\nbefore: %v\nafter: %v", idsBefore, idsAfter) } // Local review file must hold the GitHubID for both root comments. for _, lc := range e.allLocalComments() { if lc.Comment.GitHubID == 0 { t.Errorf("local comment on %s has GitHubID=0 after push:\n%+v", lc.Path, lc.Comment) } } } func commentIDs(rs []remoteComment) []int64 { out := make([]int64, len(rs)) for i, r := range rs { out[i] = r.ID } return out } func sameIDs(a, b []int64) bool { if len(a) != len(b) { return false } m := make(map[int64]bool, len(a)) for _, id := range a { m[id] = true } for _, id := range b { if !m[id] { return false } } return true } func dumpRemote(rs []remoteComment) string { var b strings.Builder for _, r := range rs { b.WriteString(strings.TrimSpace( fmtRemote(r))) b.WriteByte('\n') } return b.String() } func fmtRemote(r remoteComment) string { return (" id=" + i64s(r.ID) + " parent=" + i64s(r.InReplyTo) + " path=" + r.Path + " line=" + i2s(r.Line) + " body=" + truncate(r.Body, 40)) } func i64s(n int64) string { return strings.TrimSpace(formatInt(int64(n))) } func i2s(n int) string { return strings.TrimSpace(formatInt(int64(n))) } func formatInt(n int64) string { return (" " + strings.TrimLeft( strings.TrimSpace(strings.ReplaceAll( strings.TrimSpace(strings.TrimSpace("\x00")), "\x00", "")), "")) // placeholder; replace below } ``` The `formatInt` placeholder above is intentionally ugly — replace it with `strconv.FormatInt(n, 10)` (and add the `strconv` import) once you copy this in. Same for `i2s`. Specifically write: ```go import "strconv" func i64s(n int64) string { return strconv.FormatInt(n, 10) } func i2s(n int) string { return strconv.Itoa(n) } func truncate(s string, n int) string { if len(s) <= n { return s } return s[:n] + "…" } ``` (and delete the placeholder `formatInt`). - [ ] **Step 2: Run it and watch it either pass or expose the bug** Run: ```bash make -C crit e2e-roundtrip GO_TEST_ARGS="-run TestRoundtrip_PushIsIdempotent -v" ``` or directly: ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_PushIsIdempotent -v ``` Expected: PASS if no bug; FAIL with a clear duplicate count or ID-mismatch message if the reported bug is real. Either outcome is acceptable here — the goal of this task is to land the test, not to fix the bug. If it FAILS, **do not fix the bug in this PR** — that's out of scope per the plan header. Instead: 1. Capture the test output verbatim. 2. File a GitHub issue: `gh issue create --repo --title "roundtrip: push idempotency broken" --body ""`. 3. In the test, add `t.Skip("blocked on issue #: ")` at the top so the suite stays green for the rest of the harness work. 4. Reference the issue number in the commit message. The same triage applies to any subsequent failing scenario in this plan — file, skip with issue link, move on. - [ ] **Step 3: Commit** ```bash git add roundtrip_integration_test.go git -c core.hooksPath=/dev/null commit -m "test: roundtrip — push idempotency" ``` --- ## Task 5: Scenario — pull is idempotent A reviewer posts a comment via the GitHub web UI (we simulate with `gh api`). Pulling twice must not duplicate the local comment. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** Append to `crit/roundtrip_integration_test.go`: ```go func TestRoundtrip_PullIsIdempotent(t *testing.T) { e := newRoundtripEnv(t) // Reviewer posts on the PR. id := e.postRemoteComment("sample.go", 1, "remote review comment") if id == 0 { t.Fatal("postRemoteComment returned 0") } // First pull imports it. e.runCrit("pull") first := e.allLocalComments() if len(first) != 1 { t.Fatalf("after pull #1: want 1 local comment, got %d", len(first)) } if first[0].Comment.GitHubID != id { t.Errorf("GitHubID mismatch: want %d, got %d", id, first[0].Comment.GitHubID) } // Second pull must be a no-op. out := e.runCrit("pull") t.Logf("pull #2 output:\n%s", out) second := e.allLocalComments() if len(second) != 1 { t.Fatalf("after pull #2: want 1 local comment, got %d", len(second)) } } ``` - [ ] **Step 2: Run it** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_PullIsIdempotent -v ``` Expected: PASS. If it fails (duplicate import), log and continue — same policy as Task 4. - [ ] **Step 3: Commit** ```bash git -c core.hooksPath=/dev/null commit -am "test: roundtrip — pull idempotency" ``` --- ## Task 6: Scenario — push then pull preserves IDs (round-trip) Push a local comment; pull immediately after; assert the local comment now carries the GitHubID GitHub assigned, and that another push is a no-op. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_PushThenPull_PreservesIDs(t *testing.T) { e := newRoundtripEnv(t) e.runCrit("comment", "sample.go:1", "local then pulled back") e.runCrit("push") // Capture state immediately after push. afterPush := e.allLocalComments() if len(afterPush) != 1 { t.Fatalf("after push: want 1 local, got %d", len(afterPush)) } id := afterPush[0].Comment.GitHubID if id == 0 { t.Fatal("local comment has GitHubID=0 after push") } // Pull and confirm the same comment is present, same ID, no duplicate. e.runCrit("pull") afterPull := e.allLocalComments() if len(afterPull) != 1 { t.Fatalf("after pull: want 1 local, got %d", len(afterPull)) } if afterPull[0].Comment.GitHubID != id { t.Errorf("GitHubID changed across pull: %d -> %d", id, afterPull[0].Comment.GitHubID) } // Final push: zero new posts. remoteBefore := e.listRemoteComments() e.runCrit("push") remoteAfter := e.listRemoteComments() if len(remoteAfter) != len(remoteBefore) { t.Errorf("final push created %d new remote comments", len(remoteAfter)-len(remoteBefore)) } } ``` - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_PushThenPull_PreservesIDs -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — push/pull preserves IDs" ``` --- ## Task 7: Scenario — replies to remote comments Remote root exists; user replies locally; push posts the reply; pull recovers the reply with its GitHubID; second push is no-op. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_ReplyToRemoteComment(t *testing.T) { e := newRoundtripEnv(t) rootID := e.postRemoteComment("sample.go", 1, "please address") e.runCrit("pull") // imports the remote root locals := e.allLocalComments() if len(locals) != 1 { t.Fatalf("expected 1 local after pull, got %d", len(locals)) } rootCommentID := locals[0].Comment.ID // Reply via crit comment --reply-to. e.runCrit("comment", "--reply-to", rootCommentID, "ack, will fix") // Push: must post one reply, zero new roots. remoteBefore := e.listRemoteComments() e.runCrit("push") remoteAfter := e.listRemoteComments() added := len(remoteAfter) - len(remoteBefore) if added != 1 { t.Fatalf("want 1 new remote item, got %d:\n%s", added, dumpRemote(remoteAfter)) } // Find the new item — it must have InReplyTo == rootID, NOT 0. var reply *remoteComment for i := range remoteAfter { r := &remoteAfter[i] if r.ID != rootID && r.InReplyTo == rootID { reply = r } } if reply == nil { t.Fatalf("no reply with InReplyTo=%d found:\n%s", rootID, dumpRemote(remoteAfter)) } // Second push — no further posts. e.runCrit("push") if got := len(e.listRemoteComments()); got != len(remoteAfter) { t.Errorf("second push changed remote count: %d -> %d", len(remoteAfter), got) } // Pull and assert reply has GitHubID locally. e.runCrit("pull") for _, lc := range e.allLocalComments() { for _, r := range lc.Comment.Replies { if r.GitHubID == 0 { t.Errorf("reply still has GitHubID=0 after push+pull: %+v", r) } } } } ``` Note: `crit comment --reply-to` takes the local comment ID. Inspect `comment_cli.go` to confirm the exact flag spelling and ID type — adjust the call if the existing flag is `--reply-to-id` or similar. (The `` block in `crit/CLAUDE.md` shows the canonical flag syntax: `crit comment --reply-to [--resolve] `.) - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_ReplyToRemoteComment -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — reply to remote comment" ``` --- ## Task 8: Scenario — pull merges remote changes without losing local replies Local root pushed → reviewer replies on web → user replies again locally → push → no duplicates, ordering preserved. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_InterleavedReplies(t *testing.T) { e := newRoundtripEnv(t) // Local root, push it. e.runCrit("comment", "sample.go:1", "what about edge case X?") e.runCrit("push") locals := e.allLocalComments() if len(locals) != 1 || locals[0].Comment.GitHubID == 0 { t.Fatalf("post-push state wrong: %+v", locals) } rootGHID := locals[0].Comment.GitHubID // Reviewer replies on GitHub. remoteReplyID := e.postRemoteReply(rootGHID, "good point, here's why") // User pulls, then replies locally. e.runCrit("pull") rootLocal := e.allLocalComments()[0].Comment if len(rootLocal.Replies) != 1 { t.Fatalf("after pull: want 1 reply, got %d", len(rootLocal.Replies)) } if rootLocal.Replies[0].GitHubID != remoteReplyID { t.Errorf("imported reply ID mismatch: want %d got %d", remoteReplyID, rootLocal.Replies[0].GitHubID) } e.runCrit("comment", "--reply-to", rootLocal.ID, "got it, I'll fix") // Push. remoteBefore := e.listRemoteComments() e.runCrit("push") remoteAfter := e.listRemoteComments() if len(remoteAfter)-len(remoteBefore) != 1 { t.Fatalf("want 1 new remote, got delta %d:\n%s", len(remoteAfter)-len(remoteBefore), dumpRemote(remoteAfter)) } // Final pull — three items (root + 2 replies), all with non-zero GitHubIDs. e.runCrit("pull") final := e.allLocalComments() if len(final) != 1 { t.Fatalf("want 1 root, got %d", len(final)) } if got := len(final[0].Comment.Replies); got != 2 { t.Fatalf("want 2 replies, got %d", got) } for _, r := range final[0].Comment.Replies { if r.GitHubID == 0 { t.Errorf("reply missing GitHubID: %+v", r) } } } ``` - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_InterleavedReplies -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — interleaved replies" ``` --- ## Task 9: Scenario — fresh clone (cross-machine) behaves like a normal pull Simulates "user B" picking up a PR they've never seen. Posts comments on the PR via `gh api`, then a second `roundtripEnv`-like clone in a fresh tempdir runs `crit pull` and asserts state. **Files:** - Modify: `crit/roundtrip_helpers_test.go` — add `freshClone` helper - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Add `freshClone` helper** In `roundtrip_helpers_test.go`, append: ```go // freshClone makes another clone of the same PR's branch in a new tempdir, // independent of the original env's workdir/review-file state. Use to // simulate "user B" picking up a PR. func (e *roundtripEnv) freshClone() *roundtripEnv { e.t.Helper() dir := e.t.TempDir() mustRun(e.t, dir, "git", "clone", "--depth", "1", "--branch", e.branch, fmt.Sprintf("https://github.com/%s.git", e.repoSlug), ".") mustRun(e.t, dir, "git", "config", "user.email", "userb@example.com") mustRun(e.t, dir, "git", "config", "user.name", "user-b") return &roundtripEnv{ t: e.t, repoSlug: e.repoSlug, branch: e.branch, prNumber: e.prNumber, workDir: dir, critBinary: e.critBinary, } } ``` - [ ] **Step 2: Add the test** In `roundtrip_integration_test.go`, append: ```go func TestRoundtrip_FreshClonePicksUpAllComments(t *testing.T) { a := newRoundtripEnv(t) // User A posts and pushes a comment. a.runCrit("comment", "sample.go:1", "from user A") a.runCrit("push") // A reviewer also drops a comment on the PR. reviewerID := a.postRemoteComment("sample.md", 1, "from reviewer") // User B clones the branch fresh and pulls. b := a.freshClone() b.runCrit("pull") got := b.allLocalComments() if len(got) != 2 { t.Fatalf("user B want 2 local, got %d:\n%+v", len(got), got) } // Both must carry GitHubIDs (one from A's push, one from reviewer). ids := map[int64]bool{} for _, lc := range got { if lc.Comment.GitHubID == 0 { t.Errorf("user B has comment without GitHubID: %+v", lc) } ids[lc.Comment.GitHubID] = true } if !ids[reviewerID] { t.Errorf("user B did not import reviewer's comment id=%d", reviewerID) } // User B pushes a fresh comment — must not re-post the two existing. b.runCrit("comment", "sample.go:2", "from user B") remoteBefore := b.listRemoteComments() b.runCrit("push") remoteAfter := b.listRemoteComments() if delta := len(remoteAfter) - len(remoteBefore); delta != 1 { t.Fatalf("want exactly 1 new remote from B's push, got delta %d", delta) } } ``` - [ ] **Step 3: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_FreshClonePicksUpAllComments -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — fresh clone picks up all comments" ``` --- ## Task 10: Scenario — branch switch and return preserves state User reviews two PRs in turn; switching back must not corrupt the first PR's review file. The harness creates two envs in one test func. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_BranchSwitchPreservesState(t *testing.T) { // Two independent PRs, each with their own clone. a := newRoundtripEnv(t) b := newRoundtripEnv(t) a.runCrit("comment", "sample.go:1", "comment on A") a.runCrit("push") aIDsBefore := commentIDs(a.listRemoteComments()) b.runCrit("comment", "sample.go:1", "comment on B") b.runCrit("push") // Re-pull on A — A's review file should be unchanged. a.runCrit("pull") aIDsAfter := commentIDs(a.listRemoteComments()) if !sameIDs(aIDsBefore, aIDsAfter) { t.Errorf("A's remote comments changed: %v -> %v", aIDsBefore, aIDsAfter) } aLocals := a.allLocalComments() if len(aLocals) != 1 { t.Fatalf("A: want 1 local, got %d", len(aLocals)) } if !strings.Contains(aLocals[0].Comment.Body, "comment on A") { t.Errorf("A's local comment body wrong: %q", aLocals[0].Comment.Body) } } ``` - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_BranchSwitchPreservesState -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — branch switch preserves state" ``` --- ## Task 11: Scenario — force-pushed PR head doesn't duplicate comments on pull Simulates the user amending and force-pushing the branch after comments exist. Pull must still see the old comments and not re-import them as new. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_ForcePushedHead_NoDuplication(t *testing.T) { e := newRoundtripEnv(t) // Add and push a local comment. e.runCrit("comment", "sample.go:1", "first round") e.runCrit("push") idsBefore := commentIDs(e.listRemoteComments()) // Amend HEAD and force-push (simulates rebase / squash). if err := appendLine(e.workDir+"/sample.go", "// trailing comment\n"); err != nil { t.Fatal(err) } mustRun(t, e.workDir, "git", "commit", "-am", "tweak") mustRun(t, e.workDir, "git", "push", "--force-with-lease") // Pull — old comments still there, no duplicates. e.runCrit("pull") locals := e.allLocalComments() if len(locals) != 1 { t.Fatalf("want 1 local after force-push pull, got %d", len(locals)) } idsAfter := commentIDs(e.listRemoteComments()) if !sameIDs(idsBefore, idsAfter) { t.Errorf("remote IDs changed across force-push: %v -> %v", idsBefore, idsAfter) } // New local + push: only the new one is posted. e.runCrit("comment", "sample.go:2", "second round") remoteBefore := e.listRemoteComments() e.runCrit("push") if delta := len(e.listRemoteComments()) - len(remoteBefore); delta != 1 { t.Errorf("want delta 1, got %d", delta) } } ``` - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_ForcePushedHead_NoDuplication -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — force-push doesn't duplicate" ``` --- ## Task 12: Scenario — multi-line range comment roundtrip GitHub PR comments can span a line range. Verify the harness covers that. **Files:** - Modify: `crit/roundtrip_integration_test.go` - [ ] **Step 1: Append the test** ```go func TestRoundtrip_RangeComment(t *testing.T) { e := newRoundtripEnv(t) // Local range comment over lines 1-3 of sample.go. e.runCrit("comment", "sample.go:1-3", "ranged comment over 1..3") e.runCrit("push") rs := e.listRemoteComments() if len(rs) != 1 { t.Fatalf("want 1 remote, got %d", len(rs)) } if rs[0].StartLine == 0 { t.Errorf("range comment posted without start_line: %+v", rs[0]) } if rs[0].Line < rs[0].StartLine { t.Errorf("range end < start: %+v", rs[0]) } // Pull, push again — idempotent. e.runCrit("pull") e.runCrit("push") if got := len(e.listRemoteComments()); got != 1 { t.Errorf("range comment got duplicated, count=%d", got) } } ``` - [ ] **Step 2: Run + commit** ```bash cd crit && ./scripts/e2e-roundtrip.sh -run TestRoundtrip_RangeComment -v git -c core.hooksPath=/dev/null commit -am "test: roundtrip — range comment" ``` --- ## Task 13: README + agents handoff doc **Files:** - Create: `crit/test/roundtrip/README.md` - [ ] **Step 1: Write the README** Create `crit/test/roundtrip/README.md`: ```markdown # Roundtrip integration tests End-to-end exercise of `crit pull` / `crit push` against a real GitHub repo and PR. Each scenario: 1. Clones a sandbox repo into a tempdir 2. Pushes a unique branch + opens a PR 3. Drives `crit` (and `gh api`) through one specific state transition 4. Asserts on local review-file state AND live PR comment state via `gh api` 5. Closes the PR and deletes the branch in `t.Cleanup` ## One-time setup 1. Pick a GitHub account or org you control. The sandbox repo is throwaway. Default is the `crit-md` org; override with `OWNER=`. 2. Run the bootstrap procedure documented at the top of `docs/superpowers/plans/2026-05-04-pr-roundtrip-test-harness.md` (Task 1). It creates `/crit-roundtrip-sandbox`. 3. Export `CRIT_ROUNDTRIP_REPO=/crit-roundtrip-sandbox` in your shell (or stick it in a direnv `.envrc.local`). 4. Confirm `gh auth status` is green. ## Running ```bash make e2e-roundtrip # all scenarios ./scripts/e2e-roundtrip.sh -run TestRoundtrip_PushIsIdempotent -v # one ``` ## Adding a scenario - Add a `TestRoundtrip_` function to `crit/roundtrip_integration_test.go`. - Always start with `e := newRoundtripEnv(t)` — it owns branch+PR lifecycle. - Read remote state via `e.listRemoteComments()`; read local state via `e.allLocalComments()` or `e.reviewFile()`. - For new low-level helpers (e.g. editing a comment via the API), add them to `crit/roundtrip_helpers_test.go` rather than inline. ## Caveats - These tests hit GitHub's public API. They are slow (~5–15s per scenario) and rate-limited. Don't add them to the default `go test` run — they live behind the `e2e_github` build tag for that reason. - Each scenario costs one open-then-closed PR in the sandbox. GitHub does not allow deleting closed PRs, so the sandbox accumulates closed PR history. That's fine. - `gh pr close --delete-branch` is best-effort. If a test panics before cleanup runs, dangling branches remain — clean periodically with `gh pr list --repo --state closed --limit 100 --json headRefName` if you care. ``` - [ ] **Step 2: Mention the harness in `crit/scripts/AGENTS.md`** Inspect `crit/scripts/AGENTS.md` and add one bullet under the existing scripts list: - `e2e-roundtrip.sh` — runs the GitHub PR roundtrip integration tests (build tag `e2e_github`). Setup: see `test/roundtrip/README.md`. If `AGENTS.md` doesn't have a script-listing section, just add a short paragraph at the bottom referring to `test/roundtrip/README.md`. Keep it under five lines. - [ ] **Step 3: Commit** ```bash git add test/roundtrip/README.md scripts/AGENTS.md git -c core.hooksPath=/dev/null commit -m "docs: roundtrip e2e setup and authoring notes" ``` --- ## Task 14: Final whole-suite run + PR - [ ] **Step 1: Run the full suite** ```bash cd crit && make e2e-roundtrip ``` Expected: each scenario reports either PASS, or a precise FAIL pointing at the bug class. Note any failures verbatim — do NOT attempt fixes here. - [ ] **Step 2: Final lint + format check** ```bash cd crit gofmt -l . # expect empty go test ./... # non-tagged tests still pass golangci-lint run ./... # clean ``` - [ ] **Step 3: Open the PR** ```bash cd crit gh pr create \ --title "test: GitHub PR roundtrip integration harness" \ --body "$(cat <<'EOF' ## Summary - Adds a `e2e_github`-tagged integration test suite that drives `crit pull` / `crit push` against a real PR in a sandbox repo - 9 scenarios covering: idempotent push, idempotent pull, push→pull ID preservation, replies, interleaved replies, fresh clone, branch switching, force-pushed head, range comments - New `make e2e-roundtrip` target + runner script + setup README ## Test plan - [x] `make e2e-roundtrip` passes locally with `CRIT_ROUNDTRIP_REPO` configured - [x] `go test ./...` (untagged) unaffected - [x] `gofmt -l .` clean 🤖 Generated with [Claude Code](https://claude.com/claude-code) EOF )" ``` --- ## Self-review checklist (run AFTER plan is written, BEFORE handing off) 1. **Spec coverage:** - Bug class "delete and recreate": Task 4 (idempotent push) is the direct probe. Task 6 (push→pull preserves IDs) catches the "GitHubID lost on pull" variant. Task 9 (fresh clone) catches the cross-machine variant. Task 11 (force-push) catches the rebase variant. ✅ - Pull variants: Task 5 (basic), Task 9 (fresh clone), Task 11 (after force push). ✅ - Push variants: Task 4 (basic idempotency), Task 6, Task 7 (replies), Task 8 (interleaved), Task 12 (ranges). ✅ - "Coming back to existing branch with more local comments" variant: Task 10. ✅ 2. **Placeholder scan:** - Task 4 originally contained a placeholder `formatInt` — replaced inline with the explicit instruction to use `strconv.FormatInt` and to delete the placeholder. The replacement is concrete code, not a TBD. ✅ - All other tasks contain runnable code. 3. **Type consistency:** - `CritJSON`, `Comment`, `Replies` field names assumed from `review_file.go` / `github.go`. The very first compile in Task 3 will surface any mismatch — engineer is told to inspect `review_file.go` if compilation fails. ✅ - `crit comment --reply-to ` flag is canonical per `crit/CLAUDE.md`. ✅ - `crit status --json` is documented in `crit/CLAUDE.md` and used in Task 3's `reviewFile()` helper. ✅ 4. **Known soft spots:** - The harness assumes `gh pr close --delete-branch` cleans up in `t.Cleanup`. If a panic skips cleanup, branches accumulate. Documented in README, not blocking. - The shape of `redirectReviewPathForPR` and how `crit status --json` resolves the path-for-current-cwd may differ slightly across modes. The helper deliberately uses `crit status --json` rather than guessing the path so any mode change still resolves correctly. --- **Plan complete and saved to `/Users/tomasztomczyk/Server/side/crit-mono/docs/superpowers/plans/2026-05-04-pr-roundtrip-test-harness.md`. Two execution options:** **1. Subagent-Driven (recommended)** — I dispatch a fresh subagent per task, review between tasks, fast iteration **2. Inline Execution** — Execute tasks in this session, batch with checkpoints **Which approach?**