feat: generate snapshots on-demand when not cached#160

Draft
nssherpa wants to merge 2 commits into main from nsherpa/snapshot

@nssherpa nssherpa commented Feb 25, 2026

Summary

  • The snapshot endpoint (GET /git/{host}/{path}/snapshot.tar.zst) previously returned 404 if no snapshot was cached
  • It now triggers a mirror clone (if the repo hasn't been cloned yet) and generates the snapshot on-demand, blocking until complete
  • A new ensureCloneReady helper handles the StateEmpty → clone → StateReady and StateCloning → poll → StateReady transitions

Test plan

  • TestSnapshotHTTPEndpoint — cache hit path unchanged; cache miss with no clonable repo returns 503
  • TestSnapshotOnDemandGenerationViaHTTP — snapshot endpoint generates and serves a snapshot when none is cached but mirror is ready
  • TestSnapshotGenerationViaLocalClone / TestSnapshotRemoteURLUsesServerURL — existing generation tests unaffected

Works without cloning the repo first.

time sh -c 'mkdir ~/Development/gitclones/cachew-stream-1 &&  curl http://localhost:8080/git/github.com/block/cachew.git/snapshot.tar.zst | zstd -dc | tar -xpf - -C ~/Development/gitclones/cachew-stream-1'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11.2M    0 11.2M    0     0  7741k      0 --:--:--  0:00:01 --:--:-- 7736k
sh -c   0.01s user 0.08s system 5% cpu 1.608 total

The snapshot endpoint now triggers a clone and generates a snapshot
in a single request rather than returning 404. If the repo mirror
is not yet cloned, the request blocks until the clone completes
before generating and serving the snapshot.
@nssherpa nssherpa marked this pull request as ready for review February 25, 2026 23:30
@nssherpa nssherpa requested a review from a team as a code owner February 25, 2026 23:30
@nssherpa nssherpa requested review from alecthomas and removed request for a team February 25, 2026 23:30
@nssherpa nssherpa marked this pull request as draft February 25, 2026 23:36
Replace scheduler-based serialization with a per-repo sync.Mutex.
This prevents concurrent snapshot runs for the same repo without
blocking on the scheduler's global worker pool, so on-demand HTTP
requests are not delayed by unrelated repos' background jobs.