face-sets/README.md
Peter 4d7a8780de Document enrich + export-swap + extend; add swap-ready usage guide
README.md now covers all six subcommands (embed, cluster, refine, dedup,
extend, enrich, export-swap), an end-to-end pipeline recipe, the delta
recipe for merging a new source into an existing result, the quality-
weight formula used by export-swap, and the GFPGAN blend recommendation
at swap time (0.85, overriding roop-unleashed's 0.65 default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:09:01 +02:00

face-sets

Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).

Pipeline

sort_faces.py is a single-file CLI with seven subcommands:

| step | what it does |
| --- | --- |
| embed | Recursively scan a source tree, detect + embed every face, write .npz cache. Resumable; sha256-dedup. |
| cluster | Raw agglomerative clustering of the cache into person_NNN/ / _singletons/ / _noface/ with manifest. |
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → faceset_NNN/. |
| dedup | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → <cache>.duplicates.json. |
| extend | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
| enrich | Re-detect each cached face to persist landmark_2d_106, landmark_3d_68, pose (pitch/yaw/roll) into cache. |
| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + .fsz bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into _candidates/. |
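
The nearest person-centroid idea behind extend can be illustrated roughly as follows. This is a minimal sketch, not the code in sort_faces.py: it assumes embeddings are plain float lists compared by cosine distance, and the function names and the `max_dist` cutoff are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def assign_to_nearest_centroid(new_embedding, clusters, max_dist=0.55):
    """Pick the existing label whose centroid is closest to the new face.

    clusters maps a label such as "person_001" to its member embeddings.
    max_dist is an assumed cutoff; None means "no existing cluster fits".
    """
    best_label, best_dist = None, float("inf")
    for label, members in clusters.items():
        dist = cosine_distance(new_embedding, centroid(members))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_dist else None

clusters = {
    "person_001": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "person_002": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
match = assign_to_nearest_centroid([0.95, 0.05, 0.0], clusters)
no_match = assign_to_nearest_centroid([0.0, 0.0, 1.0], clusters)
```

Because existing labels are only looked up and never reassigned, a scheme like this leaves the person_NNN numbering untouched.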

Design principles

  • embed is resumable and incremental. It loads any existing cache at the target path and only hashes / embeds files it has not seen. The cache is flushed atomically every 50 new files, so a mid-run crash loses at most ~50 embeddings.
  • Byte-identical duplicates are sha256-grouped at listing time. The canonical file is embedded once; other paths with the same hash become path_aliases in the cache. Every alias is materialized by cluster / refine / export-swap, so each on-disk location is represented.
  • safe_dst_name always flattens the absolute path. This keeps output filenames stable across runs even as src_root changes between embed / extend / export invocations.
  • Caches and outputs stay out of git via .gitignore; defaults live under work/.
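
The principles above can be sketched with the standard library alone. The helpers below are illustrative assumptions, not the functions in sort_faces.py: `group_by_hash` picks the first path in sorted order as canonical, `flatten_path` joins path components with a hypothetical `__` separator (the real safe_dst_name scheme may differ), and `atomic_write_bytes` shows the usual write-to-temp-then-rename pattern that an atomic flush could rely on.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from pathlib import Path, PurePosixPath

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are not read in one go."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def group_by_hash(paths):
    """Return {canonical_path: [alias_paths]} for byte-identical files."""
    by_hash = defaultdict(list)
    for p in sorted(paths):
        by_hash[sha256_of(p)].append(p)
    # Assumption: the first path in sorted order is the canonical one.
    return {group[0]: group[1:] for group in by_hash.values()}

def flatten_path(abs_path):
    """Flatten an absolute POSIX path into one file name (hypothetical scheme)."""
    parts = [p for p in PurePosixPath(abs_path).parts if p != "/"]
    return "__".join(parts)

def atomic_write_bytes(dst, payload):
    """Write to a temp file in the same directory, then rename over dst."""
    dst = Path(dst)
    fd, tmp = tempfile.mkstemp(dir=dst.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, dst)  # readers see the old or the new file, never half
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise

with tempfile.TemporaryDirectory() as d:
    p_a, p_b, p_c = (os.path.join(d, n) for n in ("a.jpg", "b.jpg", "c.jpg"))
    atomic_write_bytes(p_a, b"same bytes")
    atomic_write_bytes(p_b, b"same bytes")
    atomic_write_bytes(p_c, b"different")
    groups = group_by_hash([p_a, p_b, p_c])
    alias_counts = sorted(len(aliases) for aliases in groups.values())

flat = flatten_path("/mnt/x/src/nl/a.jpg")
```

Here a.jpg ends up canonical with b.jpg as its alias, and the flattened name depends only on the absolute path, not on whichever source root was passed on the command line.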

Typical end-to-end run

SRC=/mnt/x/src/nl
CACHE=work/cache/nl_full.npz
OUT=/mnt/e/temp_things/fcswp/nl_sorted

# 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
python sort_faces.py embed "$SRC" "$CACHE"

# 2. Raw clusters (one person_NNN/ per multi-face cluster).
python sort_faces.py cluster "$CACHE" "$OUT/raw_full"

# 3. Refined facesets (quality-gated per-identity sets).
python sort_faces.py refine  "$CACHE" "$OUT/facesets_full"

# 4. Near-duplicate report (byte + visual).
python sort_faces.py dedup   "$CACHE"

# 5. Enrich the cache with landmarks + pose (needed by export-swap).
python sort_faces.py enrich  "$CACHE"

# 6. Export roop-unleashed-ready bundles.
python sort_faces.py export-swap "$CACHE" \
  "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
  --raw-manifest "$OUT/raw_full/manifest.json" --candidates

Merging a new source into an existing result

# Embed new source into the same cache (resume from existing embeddings + aliases).
python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"

# Fold new faces into raw_full + facesets_full without renumbering.
python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"

# Refresh the swap-ready export to reflect the merge.
python sort_faces.py enrich "$CACHE"
python sort_faces.py export-swap "$CACHE" \
  "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
  --raw-manifest "$OUT/raw_full/manifest.json" --candidates

Key defaults

refine:

| flag | default | meaning |
| --- | --- | --- |
| --initial-threshold | 0.55 | cosine distance for stage-1 clustering |
| --merge-threshold | 0.40 | centroid-level merge of over-split clusters |
| --outlier-threshold | 0.55 | drop a face if its cosine distance from the centroid exceeds this (only if the cluster has ≥ 4 members) |
| --min-faces | 15 | minimum unique images per faceset |
| --min-short | 90 | minimum short-edge pixels of face bbox |
| --min-blur | 40.0 | Laplacian-variance blur gate |
| --min-det-score | 0.6 | InsightFace detector score gate |
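
To make the interplay of these defaults concrete, here is a rough sketch of a quality gate followed by centroid-based outlier rejection. It is an illustration under stated assumptions rather than the refine implementation: the `Face` record, the inclusive comparisons, and the single-pass centroid are all hypothetical choices.

```python
import math
from dataclasses import dataclass

@dataclass
class Face:
    embedding: list      # identity embedding as plain floats
    bbox_short: float    # short edge of the face bbox, in pixels
    blur: float          # Laplacian variance (higher is sharper)
    det_score: float     # detector confidence

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def passes_quality_gate(face, min_short=90, min_blur=40.0, min_det_score=0.6):
    """Assumed semantics: a face must meet every gate (inclusive bounds)."""
    return (face.bbox_short >= min_short
            and face.blur >= min_blur
            and face.det_score >= min_det_score)

def reject_outliers(faces, outlier_threshold=0.55, min_cluster=4):
    """Drop faces far from the centroid; small clusters are left untouched."""
    if len(faces) < min_cluster:
        return list(faces)
    n = len(faces)
    centre = [sum(col) / n for col in zip(*(f.embedding for f in faces))]
    return [f for f in faces
            if cosine_distance(f.embedding, centre) <= outlier_threshold]

good = [Face([1.0, 0.0], 120, 80.0, 0.9) for _ in range(4)]
stray = Face([0.0, 1.0], 120, 80.0, 0.9)
blurry = Face([1.0, 0.0], 120, 10.0, 0.9)

kept = reject_outliers(good + [stray])             # stray face is dropped
small_kept = reject_outliers(good[:2] + [stray])   # fewer than 4 faces: no rejection
```

The minimum cluster size matters because a centroid computed from two or three faces is too noisy to justify throwing any of them away.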

export-swap:

| flag | default | meaning |
| --- | --- | --- |
| --top-n | 30 | size of the <faceset>_topN.fsz bundle |
| --outlier-threshold | 0.45 | tighter than refine; trims cluster boundary for averaging |
| --pad-ratio | 0.5 | padding around face bbox for PNG crop |
| --out-size | 512 | PNG output is square out_size × out_size |
| --min-face-short | 100 | export gate; stricter than refine's 90 |
| --candidates | off | rescue _singletons/ into _candidates/ for manual review |
| --candidate-match-threshold | 0.55 | cos-dist cutoff for singleton → existing faceset |
| --candidate-min-score | 0.40 | composite-quality floor for candidates |
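
One plausible reading of --pad-ratio and --out-size is sketched below. It assumes the padding is applied on every side as a fraction of the longer bbox edge and that the crop is centred on the bbox; the actual geometry in export-swap may differ, and `padded_square_box` is a hypothetical helper.

```python
def padded_square_box(bbox, pad_ratio=0.5):
    """Return a square crop box (x1, y1, x2, y2) centred on the face bbox.

    Assumption: the square side is the longer bbox edge grown by pad_ratio
    on each side, i.e. side = longest_edge * (1 + 2 * pad_ratio).
    """
    x1, y1, x2, y2 = bbox
    side = max(x2 - x1, y2 - y1) * (1.0 + 2.0 * pad_ratio)
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = side / 2.0
    return (cx - half, cy - half, cx + half, cy + half)

def resize_factor(box, out_size=512):
    """Scale needed to bring the square crop to out_size x out_size."""
    return out_size / (box[2] - box[0])

box = padded_square_box((100, 100, 200, 220))
scale = resize_factor(box)
```

A crop box computed this way can extend past the image border, so a real implementation would still need to clamp or pad it.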

The composite quality score in export-swap is 0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness, each normalized to [0, 1].
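
The weighted sum can be written out as a small function. The weights come from the formula above; how each component is normalized to [0, 1] is not specified here, so the sketch simply takes already-normalized values, and the function name is hypothetical.

```python
WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}

def composite_quality(components):
    """Weighted sum of pre-normalized components, each expected in [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    return sum(weight * components[name] for name, weight in WEIGHTS.items())

score = composite_quality({
    "frontality": 0.9,
    "det_score": 0.8,
    "landmark_symmetry": 0.7,
    "face_size": 0.5,
    "sharpness": 0.6,
})
perfect = composite_quality({name: 1.0 for name in WEIGHTS})
```

Since the weights sum to 1, the score also stays in [0, 1], which keeps it directly comparable with the --candidate-min-score floor of 0.40.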

Downstream: roop-unleashed

The .fsz bundles emitted by export-swap drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop. This is critical, because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding, so a stray second face would contaminate the identity.

Highly recommended at swap time: set Select post-processing = GFPGAN and Original/Enhanced image blend ratio = 0.85 (the roop-unleashed default of 0.65 is conservative). See docs/analysis/facesets-downstream-refinement-evaluation.md for the full evaluation.

Layout

/opt/face-sets/
├─ README.md                                     (this file)
├─ sort_faces.py                                 (the tool)
├─ docs/
│  └─ analysis/
│     └─ facesets-downstream-refinement-evaluation.md
└─ work/                                         (gitignored)
   ├─ cache/
   │  └─ nl_full.npz                             (canonical cache + duplicates.json)
   └─ logs/
      └─ *.log                                   (every long step writes here)