gitserver

self-hosted git server tooling
git clone https://git.ryansepassi.com/git/gitserver.git

commit 6544bb0d45254cf3e788b66dd3a924f8dc1525d4
parent 822b6157bf24c7088b52915e545b6f7821555c5d
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Fri, 17 Apr 2026 16:46:26 -0700

add remove-repo + publish-public-prune; preserve orphans in manifest

publish-public now keeps manifest entries for paths no longer present
locally, so a subsequent prune can DELETE them from Bunny. Previously
stagit-update's tail call to publish-public would clobber those entries
before prune could see them.
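
For illustration, here is a minimal sketch of the merge rule this message describes. It reuses the awk program from `bin/publish-public` in this commit, but the hashes, paths, and temporary files are made up, and nothing is uploaded. It assumes the old manifest is non-empty, since the `NR==FNR` idiom relies on the first input file having at least one line.

```sh
#!/bin/sh
# Toy illustration of the manifest merge; all hashes and paths are hypothetical.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# OLD: what the previous run believed was on the remote ("<hash>\t<path>").
printf 'aaa\tindex.html\nbbb\tgone/log.html\nccc\tstyle.css\n' > "$work/old"
# CUR: what exists locally now. gone/log.html was deleted; style.css changed.
printf 'aaa\tindex.html\nddd\tstyle.css\n' > "$work/cur"
# FAILED: paths whose upload failed this run (here: style.css).
printf 'style.css\n' > "$work/failed"

merged=$(awk -F'\t' -v failed="$work/failed" '
  BEGIN { while ((getline p < failed) > 0) bad[p] = 1 }
  NR == FNR { old[$2] = $0; next }        # first file: index OLD by path
  {
    if ($2 in bad) {
      # Upload failed: the remote still holds the old content, if any.
      if ($2 in old) { print old[$2]; delete old[$2] }
    } else {
      print                               # uploaded or unchanged: fresh entry
      delete old[$2]
    }
  }
  END { for (p in old) print old[p] }     # orphans: keep so a prune can see them
' "$work/old" "$work/cur" | LC_ALL=C sort)

printf '%s\n' "$merged"
```

In this toy run the orphan `gone/log.html` stays in the merged manifest, and `style.css` keeps its old hash because its upload is marked as failed, so a later run would retry it.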

Diffstat:
M README.md | 72 +++++++++++++++++++++++++++++++++++-------------------------------------
M bin/publish-public | 65 ++++++++++++++++++++++++++++++++++++++---------------------------
A bin/publish-public-prune | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A bin/remove-repo | 32 ++++++++++++++++++++++++++++++++
4 files changed, 183 insertions(+), 64 deletions(-)

diff --git a/README.md b/README.md
@@ -39,13 +39,19 @@ Everything on the server lives under `~/repos/`:
 6. **Bunny Pull Zone** (once, via dashboard):
    - Create Pull Zone pointing at the Storage Zone.
    - Attach custom hostname (`PUBLIC_DOMAIN`), enable TLS.
-   - **Edge Rule** — for git ref + branch-archive freshness, force no-cache on:
-     - `*/HEAD`
-     - `*/info/refs`
-     - `*/objects/info/packs`
-     - `*/archive/refs/heads/*`
-     - Actions: `Override Cache Time = 0` + `Override Browser Cache Time = 0`.
-   - Everything else (content-addressed objects + sha/tag archives) caches normally.
+   - **Edge Rules** — two rules, both with
+     `Override Cache Time = 0` + `Override Browser Cache Time = 0`:
+     - `git no cache` — matches `*/HEAD`, `*/info/refs`, `*/objects/info/packs`
+       (dumb-HTTP clone freshness).
+     - `stagit no cache` — matches `*/index.html`, `*/log.html`, `*/files.html`,
+       `*/refs.html`, `*/atom.xml`, `*/tags.xml`, `*/file/*`, `*style.css`, and
+       the site root (`https://PUBLIC_DOMAIN` / `https://PUBLIC_DOMAIN/`).
+   - Content-addressed objects (`/objects/<sha>/...`) and commit/sha/tag
+     archive tarballs cache normally.
+   - Note: branch archives (`*/archive/refs/heads/*`) are rewritten on every
+     push to that branch, so they will be stale at the edge until TTL
+     expires. Add a third rule matching `*/archive/refs/heads/*` if you
+     care about branch-tarball freshness.
 7. **DNS** (once): CNAME `PUBLIC_DOMAIN` → `<zone>.b-cdn.net`.
 
 ## Day to day
@@ -80,6 +86,17 @@ Normal `git push`. The `post-receive` hook runs:
 
 All automatic; no manual step.
 
+### Remove a repo
+
+```sh
+ssh $GIT_HOST ~/repos/bin/remove-repo <name>
+```
+
+Deletes the bare repo, all stagit HTML (private + public), the public clone
+mirror, archive tarballs, and stagit caches. Rebuilds indexes and runs
+`publish-public-prune` to drop Bunny orphans. Requires typing the repo name
+to confirm.
+
 ### Flip a repo public / private
 
 Public requires: the `public` marker file, and (for clean clone URL display) a
@@ -98,8 +115,8 @@ ssh $GIT_HOST 'rm ~/repos/<name>.git/public \
 ```
 
 `stagit-update` removes the public mirror when the marker is gone; the next
-`publish-public` run on the next push will still have stale files on Bunny —
-see "Cleanup" below.
+`publish-public` run on the next push will still have stale files on Bunny.
+Run `~/repos/bin/publish-public-prune` to clean them up.
 
 ### Update tooling
 
@@ -115,7 +132,9 @@ reloads caddy, and runs `stagit-update` if stagit is already installed.
 | `git push <repo>` to the server | `post-receive` → `update-server-info` + `stagit-update` + `publish-public` |
 | `bringup` on the server | one-time install (stagit, caddy snippet) |
 | `add-repo` | creates bare repo + hook + initial HTML |
-| Bunny CDN | serves public repos automatically; edge rules enforce no-cache on git refs |
+| `remove-repo` | deletes bare repo + HTML + archive + Bunny orphans |
+| `publish-public-prune` | deletes Bunny objects absent from local `www-public/` |
+| Bunny CDN | serves public repos automatically; edge rules enforce no-cache on git refs + stagit HTML |
 
 **Nothing runs on a schedule.** There are no cron jobs, no systemd timers.
 Everything is push-driven.
@@ -137,40 +156,19 @@ Everything is push-driven.
 
 `publish-public` PUTs only — it doesn't delete remote files that are absent
 locally. Orphans accumulate when files move or repos get un-publicized.
-
-**Full rebuild of `www-public/`** (safe; purely derived from `~/repos/*.git`):
+`remove-repo` handles this automatically; otherwise run:
 
 ```sh
-ssh $GIT_HOST 'rm -rf ~/repos/www-public/* && ~/repos/bin/stagit-update'
+ssh $GIT_HOST ~/repos/bin/publish-public-prune
 ```
 
-**Delete Bunny orphans** (when local was just rebuilt):
+**Full rebuild of `www-public/`** (safe; purely derived from `~/repos/*.git`):
 
 ```sh
 ssh $GIT_HOST '
-  set -eu
-  . ~/repos/config.env
-  KEY=$(cat ~/repos/bunny.key)
-  EP=https://storage.bunnycdn.com/$BUNNY_ZONE
-  cd ~/repos/www-public
-  find -L . -type f | sed "s|^\./||" | sort > /tmp/local.list
-  python3 - <<PY > /tmp/remote.list
-import json, urllib.request, os
-KEY = open(os.path.expanduser("~/repos/bunny.key")).read().strip()
-BASE = "'"$EP"'"
-def ls(p):
-    r = urllib.request.Request(BASE + "/" + p, headers={"AccessKey": KEY, "Accept": "application/json"})
-    return json.loads(urllib.request.urlopen(r).read())
-def walk(prefix):
-    for o in ls(prefix):
-        rel = (prefix + o["ObjectName"]).lstrip("/")
-        if o["IsDirectory"]: walk(rel + "/")
-        else: print(rel)
-walk("")
-PY
-  comm -23 <(sort /tmp/remote.list) /tmp/local.list | while read -r f; do
-    curl -sS -X DELETE "$EP/$f" -H "AccessKey: $KEY" -o /dev/null -w "DEL %{http_code} $f\n"
-  done
+  rm -rf ~/repos/www-public/* &&
+  ~/repos/bin/stagit-update &&
+  ~/repos/bin/publish-public-prune
 '
 ```
 
diff --git a/bin/publish-public b/bin/publish-public
@@ -27,11 +27,13 @@ mkdir -p "$STATE"
 touch "$MANIFEST"
 
 CUR=$(mktemp)
+OLD=$(mktemp)
 TODO=$(mktemp)
 FAILED=$(mktemp)
-trap 'rm -f "$CUR" "$TODO" "$FAILED"' EXIT
+FAILED_SORTED=$(mktemp)
+trap 'rm -f "$CUR" "$OLD" "$TODO" "$FAILED" "$FAILED_SORTED"' EXIT
 
-# Build current manifest: lines of "<sha256>\t<path>", byte-sorted so comm
+# Build current state: lines of "<sha256>\t<path>", byte-sorted so comm
 # is safe. Paths are unique, so full-line order is deterministic.
 cd "$SRC"
 find -L . -type f -print0 \
@@ -39,42 +41,51 @@ find -L . -type f -print0 \
   | sed 's| \./|\t|' \
   | LC_ALL=C sort > "$CUR"
 
-OLD=$(mktemp)
 LC_ALL=C sort "$MANIFEST" > "$OLD"
 LC_ALL=C comm -23 "$CUR" "$OLD" > "$TODO"
-rm -f "$OLD"
 
 n=$(wc -l < "$TODO" | tr -d ' ')
 total=$(wc -l < "$CUR" | tr -d ' ')
 echo "publish-public: $n/$total files to upload"
-if [ "$n" -eq 0 ]; then
-  # Manifest still represents current state — persist it in case paths were deleted.
-  cp "$CUR" "$MANIFEST"
-  exit 0
+
+if [ "$n" -gt 0 ]; then
+  export ENDPOINT KEY FAILED
+  awk -F'\t' '{print $2}' "$TODO" | xargs -P 8 -I{} sh -c '
+    f=$1
+    code=$(curl -sS -T "$f" "$ENDPOINT/$f" \
+      -H "AccessKey: $KEY" -o /dev/null -w "%{http_code}")
+    case $code in
+      200|201) ;;
+      *) echo "PUT $code: $f" >&2; echo "$f" >> "$FAILED" ;;
+    esac
+  ' _ {}
 fi
 
-export ENDPOINT KEY FAILED
-awk -F'\t' '{print $2}' "$TODO" | xargs -P 8 -I{} sh -c '
-  f=$1
-  code=$(curl -sS -T "$f" "$ENDPOINT/$f" \
-    -H "AccessKey: $KEY" -o /dev/null -w "%{http_code}")
-  case $code in
-    200|201) ;;
-    *) echo "PUT $code: $f" >&2; echo "$f" >> "$FAILED" ;;
-  esac
-' _ {}
+[ -s "$FAILED" ] && LC_ALL=C sort -u "$FAILED" > "$FAILED_SORTED"
+
+# Rebuild manifest as "what we believe is on Bunny":
+#   - paths in CUR whose PUT succeeded (or were unchanged): fresh CUR entry
+#   - paths in CUR whose PUT failed: keep OLD entry (what's actually there)
+#   - paths in OLD but not CUR (orphans): keep OLD entry
+# Orphans get cleaned up later by publish-public-prune.
+awk -F'\t' -v failed="$FAILED_SORTED" '
+  BEGIN { while ((getline p < failed) > 0) bad[p]=1 }
+  NR==FNR { old[$2]=$0; next }
+  {
+    if (($2) in bad) {
+      if (($2) in old) { print old[$2]; delete old[$2] }
+    } else {
+      print
+      if (($2) in old) delete old[$2]
+    }
+  }
+  END { for (p in old) print old[p] }
+' "$OLD" "$CUR" > "$MANIFEST.new"
+mv "$MANIFEST.new" "$MANIFEST"
 
-# Persist the new manifest, but exclude any paths whose upload failed so they
-# get retried next run.
 if [ -s "$FAILED" ]; then
-  sort -u "$FAILED" > "$FAILED.sorted"
-  awk -F'\t' 'NR==FNR{bad[$0]=1; next} !($2 in bad)' \
-    "$FAILED.sorted" "$CUR" > "$MANIFEST.new"
-  mv "$MANIFEST.new" "$MANIFEST"
   fail_n=$(wc -l < "$FAILED" | tr -d ' ')
   echo "publish-public: $((n - fail_n)) uploaded, $fail_n failed"
   exit 1
-else
-  cp "$CUR" "$MANIFEST"
-  echo "publish-public: $n uploaded"
 fi
+[ "$n" -gt 0 ] && echo "publish-public: $n uploaded" || true
diff --git a/bin/publish-public-prune b/bin/publish-public-prune
@@ -0,0 +1,78 @@
+#!/bin/sh
+# Delete objects in the Bunny storage zone that publish-public uploaded but
+# that no longer exist in ~/repos/www-public/ (orphans from remove-repo,
+# flipping a repo private, or a www-public rebuild).
+#
+# Uses the publish-public manifest as the source of truth for "what's on
+# Bunny" — avoids walking the Bunny storage API directory-by-directory.
+# Trade-off: won't catch files that landed on Bunny outside publish-public
+# (e.g. manual uploads). For the normal case that's fine.
+set -eu
+
+. "$HOME/repos/config.env" 2>/dev/null || true
+: "${BUNNY_ZONE:?set BUNNY_ZONE in .env (laptop) and re-run push}"
+
+KEY_FILE=$HOME/repos/bunny.key
+if [ ! -r "$KEY_FILE" ]; then
+  echo "publish-public-prune: $KEY_FILE missing" >&2
+  exit 1
+fi
+KEY=$(cat "$KEY_FILE")
+ENDPOINT=https://storage.bunnycdn.com/$BUNNY_ZONE
+SRC=$HOME/repos/www-public
+MANIFEST=$HOME/repos/.publish-state/manifest.tsv
+
+if [ ! -s "$MANIFEST" ]; then
+  echo "publish-public-prune: no manifest — run publish-public first" >&2
+  exit 0
+fi
+
+UPLOADED=$(mktemp)
+LOCAL=$(mktemp)
+FAILED=$(mktemp)
+trap 'rm -f "$UPLOADED" "$LOCAL" "$FAILED" "$FAILED.sorted"' EXIT
+
+awk -F'\t' '{print $2}' "$MANIFEST" | LC_ALL=C sort -u > "$UPLOADED"
+( cd "$SRC" && find -L . -type f ) | sed 's|^\./||' | LC_ALL=C sort > "$LOCAL"
+
+TODO=$(mktemp)
+trap 'rm -f "$UPLOADED" "$LOCAL" "$FAILED" "$FAILED.sorted" "$TODO"' EXIT
+LC_ALL=C comm -23 "$UPLOADED" "$LOCAL" > "$TODO"
+
+n=$(wc -l < "$TODO" | tr -d ' ')
+if [ "$n" -eq 0 ]; then
+  echo "publish-public-prune: nothing to delete"
+  exit 0
+fi
+echo "publish-public-prune: deleting $n orphans"
+
+export ENDPOINT KEY FAILED
+xargs -P 8 -I{} sh -c '
+  f=$1
+  code=$(curl -sS -X DELETE "$ENDPOINT/$f" \
+    -H "AccessKey: $KEY" -o /dev/null -w "%{http_code}")
+  case $code in
+    200|204|404) ;;
+    *) echo "DEL $code: $f" >&2; echo "$f" >> "$FAILED" ;;
+  esac
+' _ {} < "$TODO"
+
+# Drop successfully-deleted paths from the manifest. Keep entries for paths
+# whose DELETE failed so they get retried on the next run.
+: > "$FAILED.sorted"
+[ -s "$FAILED" ] && LC_ALL=C sort -u "$FAILED" > "$FAILED.sorted"
+awk -F'\t' -v todo="$TODO" -v failed="$FAILED.sorted" '
+  BEGIN {
+    while ((getline p < todo) > 0) orphan[p]=1
+    while ((getline p < failed) > 0) keep[p]=1
+  }
+  !orphan[$2] || keep[$2]
+' "$MANIFEST" > "$MANIFEST.new"
+mv "$MANIFEST.new" "$MANIFEST"
+
+if [ -s "$FAILED" ]; then
+  fail_n=$(wc -l < "$FAILED" | tr -d ' ')
+  echo "publish-public-prune: $((n - fail_n)) deleted, $fail_n failed"
+  exit 1
+fi
+echo "publish-public-prune: $n deleted"
diff --git a/bin/remove-repo b/bin/remove-repo
@@ -0,0 +1,32 @@
+#!/bin/sh
+# Permanently delete a repo: bare dir, stagit HTML (public + private),
+# public git mirror, archive tarballs, stagit caches. Regenerates indexes
+# and prunes Bunny orphans afterward.
+# Usage: remove-repo <name>
+set -eu
+
+REPOS=$HOME/repos
+name=${1:-}
+[ -n "$name" ] || { echo "usage: remove-repo <name>" >&2; exit 2; }
+
+dir=$REPOS/$name.git
+[ -d "$dir" ] || { echo "no such repo: $dir" >&2; exit 1; }
+
+printf "Permanently delete %s and all derived HTML/archives.\nType '%s' to confirm: " "$dir" "$name"
+IFS= read -r confirm
+[ "$confirm" = "$name" ] || { echo "aborted"; exit 1; }
+
+rm -rf "$dir"
+rm -rf "$REPOS/www/$name"
+rm -rf "$REPOS/www-public/$name"
+rm -rf "$REPOS/www-public/git/$name.git"
+rm -f "$REPOS/.stagit-cache/$name.priv.cache"
+rm -f "$REPOS/.stagit-cache/$name.pub.cache"
+
+# Rebuild indexes (and, via its tail call, re-run publish-public).
+"$REPOS/bin/stagit-update"
+
+# Clean orphans off Bunny.
+[ -x "$REPOS/bin/publish-public-prune" ] && "$REPOS/bin/publish-public-prune" || true
+
+echo "ok: removed $name"
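
As a rough illustration, the orphan selection in `publish-public-prune` comes down to a byte-sorted set difference between the paths recorded in the manifest and the paths still present locally. The sketch below uses hypothetical file names and temporary files, and makes no network calls.

```sh
#!/bin/sh
# Toy data only: these paths are hypothetical, not taken from a real manifest.
set -eu
uploaded=$(mktemp)
local_files=$(mktemp)
trap 'rm -f "$uploaded" "$local_files"' EXIT

# Paths the manifest says were uploaded.
printf '%s\n' index.html old-repo/log.html style.css | LC_ALL=C sort -u > "$uploaded"
# Paths that still exist locally.
printf '%s\n' index.html style.css | LC_ALL=C sort > "$local_files"

# comm -23 keeps lines unique to the first file: uploaded but no longer local.
orphans=$(LC_ALL=C comm -23 "$uploaded" "$local_files")
printf 'would DELETE: %s\n' "$orphans"
```

Both inputs are sorted under `LC_ALL=C` because `comm` requires its inputs to share one collation order; byte order avoids locale-dependent surprises with paths.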