From 4d7a8780dea64aa11dce9120cf94d0ad7c18fd40 Mon Sep 17 00:00:00 2001
From: Peter
Date: Fri, 24 Apr 2026 00:09:01 +0200
Subject: [PATCH] Document enrich + export-swap + extend; add swap-ready usage guide

README.md now covers all seven subcommands (embed, cluster, refine, dedup,
extend, enrich, export-swap), an end-to-end pipeline recipe, the delta
recipe for merging a new source into an existing result, the quality-weight
formula used by export-swap, and the GFPGAN blend recommendation at swap
time (0.85, overriding roop-unleashed's 0.65 default).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 135 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 99 insertions(+), 36 deletions(-)

diff --git a/README.md b/README.md
index 400e2f0..2e0cd7c 100644
--- a/README.md
+++ b/README.md
@@ -1,56 +1,119 @@
 # face-sets
 
-Sort photos by similar face using InsightFace embeddings + agglomerative clustering, then refine into faceset-ready folders for downstream face-swap tooling (roop-unleashed, etc.).
+Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).
 
 ## Pipeline
 
-`sort_faces.py` is a single-file CLI with four subcommands:
+`sort_faces.py` is a single-file CLI with seven subcommands:
 
-| step    | what it does |
-|---------|------------------------------------------------------------------------------|
-| embed   | Recursively scan a source tree, detect + embed every face, write `.npz` cache |
-| cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` |
-| refine  | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/` |
-| dedup   | Post-hoc near-duplicate report: byte-identical groups + visual near-dupes (same face + same size within a tight cosine threshold) |
+| step        | what it does |
+|-------------|-------------------------------------------------------------------------------------------------------------|
+| embed       | Recursively scan a source tree, detect + embed every face, write `.npz` cache. Resumable; sha256-dedup. |
+| cluster     | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` with manifest. |
+| refine      | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/`. |
+| dedup       | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → `.duplicates.json`. |
+| extend      | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
+| enrich      | Re-detect each cached face to persist landmark_2d_106, landmark_3d_68, pose (pitch/yaw/roll) into cache. |
+| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + `.fsz` bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into `_candidates/`. |
 
-`embed` is resumable and incremental: it loads any existing cache at the target path and only hashes/embeds files it hasn't processed before. A periodic flush (default every 50 new files) writes the cache atomically, so a mid-run crash loses at most a few dozen embeddings.
+### Design principles
 
-Byte-identical duplicates are detected via sha256 during the listing phase. The canonical file is embedded once; other paths with the same hash are carried as `aliases` on the cache's top-level `path_aliases` dict. Every alias is materialized by `cluster`/`refine`, so each on-disk location ends up represented in the output.
+- **embed is resumable and incremental.** It loads any existing cache at the target path and only hashes / embeds files it has not seen. Atomic flush every 50 new files so a mid-run crash loses at most ~50 embeddings.
+- **Byte-identical duplicates are sha256-grouped at listing time.** The canonical file is embedded once; other paths with the same hash become `path_aliases` in the cache. Every alias is materialized by `cluster` / `refine` / `export-swap`, so each on-disk location is represented.
+- **`safe_dst_name` always flattens the absolute path.** This keeps output filenames stable across runs even as `src_root` changes between embed / extend / export invocations.
+- **Caches and outputs stay out of git** via `.gitignore`; defaults live under `work/`.
 
-Cache and outputs are kept out of the repo via `.gitignore`; defaults live under `work/`.
-
-## Typical run
+## Typical end-to-end run
 
 ```bash
-# 1. Embed (CPU; InsightFace buffalo_l). Caches faces + metadata. Resumable.
-python sort_faces.py embed /mnt/x/src/nl work/cache/nl_full.npz
+SRC=/mnt/x/src/nl
+CACHE=work/cache/nl_full.npz
+OUT=/mnt/e/temp_things/fcswp/nl_sorted
 
-# 2. Raw clusters (every multi-face cluster -> a person_NNN/ folder).
-python sort_faces.py cluster work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/raw_full
+# 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
+python sort_faces.py embed "$SRC" "$CACHE"
 
-# 3. Refined facesets (filters for faceset-ready quality).
-python sort_faces.py refine work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/facesets_full
+# 2. Raw clusters (one person_NNN/ per multi-face cluster).
+python sort_faces.py cluster "$CACHE" "$OUT/raw_full"
 
-# 4. (Optional) report on byte-identical + visual near-duplicates.
-python sort_faces.py dedup work/cache/nl_full.npz
+# 3. Refined facesets (quality-gated per-identity sets).
+python sort_faces.py refine "$CACHE" "$OUT/facesets_full"
+
+# 4. Near-duplicate report (byte + visual).
+python sort_faces.py dedup "$CACHE"
+
+# 5. Enrich the cache with landmarks + pose (needed by export-swap).
+python sort_faces.py enrich "$CACHE"
+
+# 6. Export roop-unleashed-ready bundles.
+python sort_faces.py export-swap "$CACHE" \
+  "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
+  --raw-manifest "$OUT/raw_full/manifest.json" --candidates
 ```
 
-## Refine defaults
+### Merging a new source into an existing result
 
-| flag | default | meaning |
-|---|---|---|
-| `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering |
-| `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters |
-| `--outlier-threshold` | 0.55 | drop face if cosine dist from cluster centroid exceeds this (only if cluster ≥ 4) |
-| `--min-faces` | 15 | minimum unique images per faceset |
-| `--min-short` | 90 | minimum short-edge pixels of face bbox |
-| `--min-blur` | 40.0 | Laplacian-variance blur gate |
-| `--min-det-score` | 0.6 | InsightFace detector score gate |
-| `--mode` | copy | copy / move / symlink |
+```bash
+# Embed new source into the same cache (resume from existing embeddings + aliases).
+python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"
 
-## Prior runs (as of 2026-04-22)
+# Fold new faces into raw_full + facesets_full without renumbering.
+python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"
 
-- `work/cache/kos11.npz` — 181 images, 333 faces from `Kos '11/` → `kos11_sorted/`
-- `work/cache/nl_all.npz` — 916 images, 1396 faces from `Neuer Ordner (2)/New Folder/` → `nl_sorted/raw/`, refined to 6 facesets (197, 120, 91, 47, 23, 18 images)
+# Refresh the swap-ready export to reflect the merge.
+python sort_faces.py enrich "$CACHE"
+python sort_faces.py export-swap "$CACHE" \
+  "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
+  --raw-manifest "$OUT/raw_full/manifest.json" --candidates
+```
 
-Output lives outside the repo at `/mnt/e/temp_things/fcswp/`.
+## Key defaults
+
+`refine`:
+
+| flag                    | default | meaning |
+|-------------------------|--------:|---------|
+| `--initial-threshold`   |    0.55 | cosine distance for stage-1 clustering |
+| `--merge-threshold`     |    0.40 | centroid-level merge of over-split clusters |
+| `--outlier-threshold`   |    0.55 | drop face if cosine dist from centroid exceeds this (only if cluster ≥ 4) |
+| `--min-faces`           |      15 | minimum unique images per faceset |
+| `--min-short`           |      90 | minimum short-edge pixels of face bbox |
+| `--min-blur`            |    40.0 | Laplacian-variance blur gate |
+| `--min-det-score`       |     0.6 | InsightFace detector score gate |
+
+`export-swap`:
+
+| flag                          | default | meaning |
+|-------------------------------|--------:|---------|
+| `--top-n`                     |      30 | size of the `_topN.fsz` bundle |
+| `--outlier-threshold`         |    0.45 | tighter than refine; trims cluster boundary for averaging |
+| `--pad-ratio`                 |     0.5 | padding around face bbox for PNG crop |
+| `--out-size`                  |     512 | PNG output is square `out_size × out_size` |
+| `--min-face-short`            |     100 | export gate; stricter than refine's 90 |
+| `--candidates`                |     off | rescue `_singletons/` into `_candidates/` for manual review |
+| `--candidate-match-threshold` |    0.55 | cos-dist cutoff for singleton → existing faceset |
+| `--candidate-min-score`       |    0.40 | composite-quality floor for candidates |
+
+The composite quality score in `export-swap` is `0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness`, each normalized to `[0, 1]`.
+
+## Downstream: roop-unleashed
+
+The `.fsz` bundles emitted by `export-swap` drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop, which matters because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding.
+
+Highly recommended at swap time: enable **Select post-processing = GFPGAN** with the **Original/Enhanced image blend ratio = 0.85** (the default, 0.65, is conservative). See `docs/analysis/facesets-downstream-refinement-evaluation.md` for the full evaluation.
+
+## Layout
+
+```
+/opt/face-sets/
+├─ README.md                 (this file)
+├─ sort_faces.py             (the tool)
+├─ docs/
+│  └─ analysis/
+│     └─ facesets-downstream-refinement-evaluation.md
+└─ work/                     (gitignored)
+   ├─ cache/
+   │  └─ nl_full.npz         (canonical cache + duplicates.json)
+   └─ logs/
+      └─ *.log               (every long step writes here)
+```
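
The composite quality score described in the README hunk above can be sketched in Python as a plain weighted sum. This is a minimal illustration, not the code in `sort_faces.py`: the weights and the 0.40 candidate floor are taken from the README text, while the function names, the dictionary keys, and the assumption that every component arrives already normalized to `[0, 1]` are hypothetical.

```python
# Hypothetical sketch of the export-swap composite quality score.
# Weights come from the README; all names here are illustrative only.
WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}


def composite_quality(components):
    """Return the weighted sum of components already normalized to [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside [0, 1]")
        score += weight * value
    return score


def passes_candidate_floor(components, min_score=0.40):
    """Mirror the idea of --candidate-min-score: keep faces at or above the floor."""
    return composite_quality(components) >= min_score


if __name__ == "__main__":
    face = {
        "frontality": 0.9,
        "det_score": 0.8,
        "landmark_symmetry": 0.7,
        "face_size": 0.5,
        "sharpness": 0.6,
    }
    print(round(composite_quality(face), 3), passes_candidate_floor(face))
```

Because the five weights sum to 1.0, the score stays in `[0, 1]` whenever the inputs do, which is what makes a fixed floor such as 0.40 comparable across identities.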