Document enrich + export-swap + extend; add swap-ready usage guide

README.md now covers all seven subcommands (embed, cluster, refine, dedup,
extend, enrich, export-swap), an end-to-end pipeline recipe, the delta
recipe for merging a new source into an existing result, the quality-
weight formula used by export-swap, and the GFPGAN blend recommendation
at swap time (0.85, overriding roop-unleashed's 0.65 default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:09:01 +02:00
parent d53ab9fbfc
commit 4d7a8780de

119
README.md

@@ -1,56 +1,119 @@
# face-sets # face-sets
Sort photos by similar face using InsightFace embeddings + agglomerative clustering, then refine into faceset-ready folders for downstream face-swap tooling (roop-unleashed, etc.). Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).
## Pipeline ## Pipeline
`sort_faces.py` is a single-file CLI with four subcommands: `sort_faces.py` is a single-file CLI with seven subcommands:
| step | what it does | | step | what it does |
|---------|------------------------------------------------------------------------------| |-------------|-------------------------------------------------------------------------------------------------------------|
| embed | Recursively scan a source tree, detect + embed every face, write `.npz` cache | | embed | Recursively scan a source tree, detect + embed every face, write `.npz` cache. Resumable; sha256-dedup. |
| cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` | | cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` with manifest. |
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/` | | refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/`. |
| dedup | Post-hoc near-duplicate report: byte-identical groups + visual near-dupes (same face + same size within a tight cosine threshold) | | dedup | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → `<cache>.duplicates.json`. |
| extend | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
| enrich | Re-detect each cached face to persist landmark_2d_106, landmark_3d_68, pose (pitch/yaw/roll) into cache. |
| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + `.fsz` bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into `_candidates/`. |
`embed` is resumable and incremental: it loads any existing cache at the target path and only hashes/embeds files it hasn't processed before. A periodic flush (default every 50 new files) writes the cache atomically, so a mid-run crash loses at most a few dozen embeddings. ### Design principles
Byte-identical duplicates are detected via sha256 during the listing phase. The canonical file is embedded once; other paths with the same hash are carried as `aliases` on the cache's top-level `path_aliases` dict. Every alias is materialized by `cluster`/`refine`, so each on-disk location ends up represented in the output. - **`embed` is resumable and incremental.** It loads any existing cache at the target path and only hashes / embeds files it has not seen. The cache is flushed atomically every 50 new files by default, so a mid-run crash loses at most the last ~50 files' embeddings.
- **Byte-identical duplicates are sha256-grouped at listing time.** The canonical file is embedded once; other paths with the same hash become `path_aliases` in the cache. Every alias is materialized by `cluster` / `refine` / `export-swap`, so each on-disk location is represented.
- **`safe_dst_name` always flattens the absolute path.** This keeps output filenames stable across runs even as `src_root` changes between embed / extend / export invocations.
- **Caches and outputs stay out of git** via `.gitignore`; defaults live under `work/`.
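The first two principles can be sketched with the standard library alone. This is a minimal illustration, not the code in `sort_faces.py`: the helper names `sha256_of`, `group_by_hash`, and `atomic_write` are hypothetical, and the real tool writes an `.npz` cache rather than raw bytes.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk=1 << 20):
    """Hash a file in chunks so large images are never loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def group_by_hash(paths):
    """Return (canonical_paths, aliases).

    The first path (in sorted order) per hash is the canonical file to embed;
    the others become aliases keyed by the canonical path. Assumed layout only.
    """
    by_hash = defaultdict(list)
    for p in sorted(str(p) for p in paths):
        by_hash[sha256_of(p)].append(p)
    canonical = [group[0] for group in by_hash.values()]
    aliases = {g[0]: g[1:] for g in by_hash.values() if len(g) > 1}
    return canonical, aliases


def atomic_write(target, payload: bytes):
    """Write to a temp file in the same directory, then rename over the target.

    os.replace is atomic on the same filesystem, so readers see either the old
    cache or the new one, never a half-written file.
    """
    target = Path(target)
    fd, tmp = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Writing the temp file next to the target matters: a rename across filesystems is not atomic.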
Cache and outputs are kept out of the repo via `.gitignore`; defaults live under `work/`. ## Typical end-to-end run
## Typical run
```bash ```bash
# 1. Embed (CPU; InsightFace buffalo_l). Caches faces + metadata. Resumable. SRC=/mnt/x/src/nl
python sort_faces.py embed /mnt/x/src/nl work/cache/nl_full.npz CACHE=work/cache/nl_full.npz
OUT=/mnt/e/temp_things/fcswp/nl_sorted
# 2. Raw clusters (every multi-face cluster -> a person_NNN/ folder). # 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
python sort_faces.py cluster work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/raw_full python sort_faces.py embed "$SRC" "$CACHE"
# 3. Refined facesets (filters for faceset-ready quality). # 2. Raw clusters (one person_NNN/ per multi-face cluster).
python sort_faces.py refine work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/facesets_full python sort_faces.py cluster "$CACHE" "$OUT/raw_full"
# 4. (Optional) report on byte-identical + visual near-duplicates. # 3. Refined facesets (quality-gated per-identity sets).
python sort_faces.py dedup work/cache/nl_full.npz python sort_faces.py refine "$CACHE" "$OUT/facesets_full"
# 4. Near-duplicate report (byte + visual).
python sort_faces.py dedup "$CACHE"
# 5. Enrich the cache with landmarks + pose (needed by export-swap).
python sort_faces.py enrich "$CACHE"
# 6. Export roop-unleashed-ready bundles.
python sort_faces.py export-swap "$CACHE" \
"$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
--raw-manifest "$OUT/raw_full/manifest.json" --candidates
``` ```
## Refine defaults ### Merging a new source into an existing result
```bash
# Embed new source into the same cache (resume from existing embeddings + aliases).
python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"
# Fold new faces into raw_full + facesets_full without renumbering.
python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"
# Refresh the swap-ready export to reflect the merge.
python sort_faces.py enrich "$CACHE"
python sort_faces.py export-swap "$CACHE" \
"$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
--raw-manifest "$OUT/raw_full/manifest.json" --candidates
```
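The `extend` step folds new embeddings into existing identities via the nearest person-centroid. A rough sketch of that assignment follows, assuming L2-normalized embeddings and cosine distance. The function name `assign_to_centroids` and the `max_dist` cutoff are hypothetical; the README does not state which threshold `extend` applies.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def assign_to_centroids(new_embs, centroids, max_dist=0.55):
    """Return (labels, distances) for each new embedding.

    labels[i] is the index of the nearest centroid, or -1 when even the
    nearest one is farther than max_dist (assumed cutoff, for illustration).
    """
    e = l2_normalize(new_embs)
    c = l2_normalize(centroids)
    dist = 1.0 - e @ c.T                      # cosine distance matrix
    nearest = dist.argmin(axis=1)
    best = dist[np.arange(len(e)), nearest]
    labels = np.where(best <= max_dist, nearest, -1)
    return labels, best
```

Because existing centroids keep their indices, new faces join `person_NNN/` folders without any renumbering; unmatched faces (label `-1`) would need separate handling.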
## Key defaults
`refine`:
| flag | default | meaning | | flag | default | meaning |
|---|---|---| |-------------------------|--------:|---------|
| `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering | | `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering |
| `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters | | `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters |
| `--outlier-threshold` | 0.55 | drop face if cosine dist from cluster centroid exceeds this (only if cluster ≥ 4) | | `--outlier-threshold` | 0.55 | drop face if cosine dist from centroid exceeds this (only if cluster ≥ 4) |
| `--min-faces` | 15 | minimum unique images per faceset | | `--min-faces` | 15 | minimum unique images per faceset |
| `--min-short` | 90 | minimum short-edge pixels of face bbox | | `--min-short` | 90 | minimum short-edge pixels of face bbox |
| `--min-blur` | 40.0 | Laplacian-variance blur gate | | `--min-blur` | 40.0 | Laplacian-variance blur gate |
| `--min-det-score` | 0.6 | InsightFace detector score gate | | `--min-det-score` | 0.6 | InsightFace detector score gate |
| `--mode` | copy | copy / move / symlink |
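The outlier gate in the `refine` table can be read as follows. This is a sketch under assumptions: the centroid is taken as the normalized mean of unit-length embeddings, and `reject_outliers` is a hypothetical name, not a function confirmed to exist in `sort_faces.py`.

```python
import numpy as np


def reject_outliers(embs, threshold=0.55, min_cluster=4):
    """Return a boolean keep-mask for one cluster's embeddings.

    Clusters smaller than min_cluster are left untouched, matching the
    "only if cluster >= 4" note in the table.
    """
    embs = np.asarray(embs, dtype=np.float64)
    if len(embs) < min_cluster:
        return np.ones(len(embs), dtype=bool)
    unit = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    centroid = unit.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    dist = 1.0 - unit @ centroid              # cosine distance to centroid
    return dist <= threshold
```

The same shape of check would apply to `export-swap`, which the table lists with a tighter 0.45 threshold.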
## Prior runs (as of 2026-04-22) `export-swap`:
- `work/cache/kos11.npz`: 181 images, 333 faces from `Kos '11/` → `kos11_sorted/` | flag | default | meaning |
- `work/cache/nl_all.npz`: 916 images, 1396 faces from `Neuer Ordner (2)/New Folder/` → `nl_sorted/raw/`, refined to 6 facesets (197, 120, 91, 47, 23, 18 images) |-------------------------------|--------:|---------|
| `--top-n` | 30 | size of the `<faceset>_topN.fsz` bundle |
| `--outlier-threshold` | 0.45 | tighter than refine; trims cluster boundary for averaging |
| `--pad-ratio` | 0.5 | padding around face bbox for PNG crop |
| `--out-size` | 512 | PNG output is square `out_size × out_size` |
| `--min-face-short` | 100 | export gate; stricter than refine's 90 |
| `--candidates` | off | rescue `_singletons/` into `_candidates/` for manual review |
| `--candidate-match-threshold` | 0.55 | cos-dist cutoff for singleton → existing faceset |
| `--candidate-min-score` | 0.40 | composite-quality floor for candidates |
Output lives outside the repo at `/mnt/e/temp_things/fcswp/`. The composite quality score in `export-swap` is `0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness`, each normalized to `[0, 1]`.
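As a worked illustration of the composite score, the weighted sum can be written as below. The weights are the ones stated above; the dictionary keys, the `composite_quality` name, and the clamping of inputs to `[0, 1]` are assumptions made for this sketch. How each component is normalized is not described here.

```python
# Weights as stated in the README; they sum to 1.0.
WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}


def composite_quality(components):
    """Weighted sum of five components, each expected in [0, 1].

    Values are clamped defensively (an assumption, not confirmed behavior).
    """
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, float(components[name])))
        score += weight * value
    return score


example = {
    "frontality": 0.8,
    "det_score": 0.9,
    "landmark_symmetry": 0.7,
    "face_size": 0.5,
    "sharpness": 0.6,
}
# 0.30*0.8 + 0.20*0.9 + 0.20*0.7 + 0.15*0.5 + 0.15*0.6
print(round(composite_quality(example), 3))
```

With frontality weighted highest, a sharp but strongly turned face ranks below a slightly softer frontal one, which suits averaging identities for swapping.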
## Downstream: roop-unleashed
The `.fsz` bundles emitted by `export-swap` drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop — critical, because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding.
Recommended at swap time: set **Select post-processing = GFPGAN** and **Original/Enhanced image blend ratio = 0.85** (roop-unleashed's default of 0.65 is conservative). See `docs/analysis/facesets-downstream-refinement-evaluation.md` for the full evaluation.
## Layout
```
/opt/face-sets/
├─ README.md (this file)
├─ sort_faces.py (the tool)
├─ docs/
│ └─ analysis/
│ └─ facesets-downstream-refinement-evaluation.md
└─ work/ (gitignored)
├─ cache/
│ └─ nl_full.npz (canonical cache + duplicates.json)
└─ logs/
└─ *.log (every long step writes here)
```