face-sets
Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).
Pipeline
sort_faces.py is a single-file CLI with seven subcommands:
| step | what it does |
|---|---|
| embed | Recursively scan a source tree, detect + embed every face, write .npz cache. Resumable; sha256-dedup. |
| cluster | Raw agglomerative clustering of the cache into person_NNN/ / _singletons/ / _noface/ with manifest. |
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → faceset_NNN/. |
| dedup | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → <cache>.duplicates.json. |
| extend | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
| enrich | Re-detect each cached face to persist landmark_2d_106, landmark_3d_68, pose (pitch/yaw/roll) into cache. |
| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + .fsz bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into _candidates/. |
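The nearest person-centroid assignment that extend performs can be illustrated with a minimal sketch. This is not the code in sort_faces.py: the function names, the pure-Python vectors, and the `max_distance` cutoff are assumptions made for illustration (the README does not state the cutoff extend uses).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign_to_nearest(embedding, centroids, max_distance):
    """Return the label of the closest centroid, or None if none is close enough.

    `centroids` maps an existing label (e.g. "person_001") to its centroid,
    so existing labels are reused and nothing is renumbered.
    `max_distance` is a hypothetical cutoff, not a documented default.
    """
    best_label, best_distance = None, float("inf")
    for label, center in centroids.items():
        distance = cosine_distance(embedding, center)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= max_distance else None


centroids = {
    "person_001": centroid([[1.0, 0.0], [0.9, 0.1]]),
    "person_002": centroid([[0.0, 1.0]]),
}
print(assign_to_nearest([1.0, 0.05], centroids, max_distance=0.55))
print(assign_to_nearest([-1.0, -1.0], centroids, max_distance=0.55))
```

A face that matches no centroid returns None here; what the real tool does with such faces is not covered by this sketch.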
Design principles
- embed is resumable and incremental. It loads any existing cache at the target path and only hashes / embeds files it has not seen. Atomic flush every 50 new files so a mid-run crash loses at most ~50 embeddings.
- Byte-identical duplicates are sha256-grouped at listing time. The canonical file is embedded once; other paths with the same hash become path_aliases in the cache. Every alias is materialized by cluster / refine / export-swap, so each on-disk location is represented.
- safe_dst_name always flattens the absolute path. This keeps output filenames stable across runs even as src_root changes between embed / extend / export invocations.
- Caches and outputs stay out of git via .gitignore; defaults live under work/.
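The sha256 grouping described above can be sketched roughly as follows. The helper names and the choice of the lexicographically first path as the canonical file are assumptions for illustration; the README does not say how sort_faces.py picks the canonical path.

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(paths):
    """Map each canonical path to the list of its byte-identical aliases.

    Assumption: the first path in sorted order is treated as canonical.
    Only the canonical file would be embedded; aliases are recorded so
    every on-disk location can still be materialized later.
    """
    by_hash = defaultdict(list)
    for path in sorted(str(p) for p in paths):
        by_hash[sha256_of(path)].append(path)
    return {group[0]: group[1:] for group in by_hash.values()}
```

Called on a listing that contains two copies of the same file, the result has one entry whose value lists the second copy as an alias; unique files map to an empty list.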
Typical end-to-end run
```bash
SRC=/mnt/x/src/nl
CACHE=work/cache/nl_full.npz
OUT=/mnt/e/temp_things/fcswp/nl_sorted

# 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
python sort_faces.py embed "$SRC" "$CACHE"

# 2. Raw clusters (one person_NNN/ per multi-face cluster).
python sort_faces.py cluster "$CACHE" "$OUT/raw_full"

# 3. Refined facesets (quality-gated per-identity sets).
python sort_faces.py refine "$CACHE" "$OUT/facesets_full"

# 4. Near-duplicate report (byte + visual).
python sort_faces.py dedup "$CACHE"

# 5. Enrich the cache with landmarks + pose (needed by export-swap).
python sort_faces.py enrich "$CACHE"

# 6. Export roop-unleashed-ready bundles.
python sort_faces.py export-swap "$CACHE" \
    "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
    --raw-manifest "$OUT/raw_full/manifest.json" --candidates
```
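For unattended runs, the same six steps can be driven from Python so that the pipeline stops at the first failing step. This is a convenience sketch, not part of the tool: `pipeline_commands` and `run_pipeline` are hypothetical helpers, and they use only the subcommands and flags shown in the recipe above.

```python
import subprocess
import sys


def pipeline_commands(src, cache, out):
    """Build the argv list for each step of the end-to-end recipe, in order."""
    tool = [sys.executable, "sort_faces.py"]
    refine_manifest = f"{out}/facesets_full/refine_manifest.json"
    raw_manifest = f"{out}/raw_full/manifest.json"
    return [
        tool + ["embed", src, cache],
        tool + ["cluster", cache, f"{out}/raw_full"],
        tool + ["refine", cache, f"{out}/facesets_full"],
        tool + ["dedup", cache],
        tool + ["enrich", cache],
        tool + ["export-swap", cache, refine_manifest,
                f"{out}/facesets_swap_ready",
                "--raw-manifest", raw_manifest, "--candidates"],
    ]


def run_pipeline(src, cache, out):
    """Run every step; check=True raises CalledProcessError on the first failure."""
    for command in pipeline_commands(src, cache, out):
        subprocess.run(command, check=True)
```

Because embed is resumable, re-running `run_pipeline` after a failure should not repeat the embedding work already in the cache.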
Merging a new source into an existing result
```bash
# Embed new source into the same cache (resume from existing embeddings + aliases).
python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"

# Fold new faces into raw_full + facesets_full without renumbering.
python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"

# Refresh the swap-ready export to reflect the merge.
python sort_faces.py enrich "$CACHE"
python sort_faces.py export-swap "$CACHE" \
    "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
    --raw-manifest "$OUT/raw_full/manifest.json" --candidates
```
Key defaults
refine:
| flag | default | meaning |
|---|---|---|
| --initial-threshold | 0.55 | cosine distance for stage-1 clustering |
| --merge-threshold | 0.40 | centroid-level merge of over-split clusters |
| --outlier-threshold | 0.55 | drop a face if its cosine distance from the centroid exceeds this value (only if the cluster has ≥ 4 faces) |
| --min-faces | 15 | minimum unique images per faceset |
| --min-short | 90 | minimum short-edge pixels of face bbox |
| --min-blur | 40.0 | Laplacian-variance blur gate |
| --min-det-score | 0.6 | InsightFace detector score gate |
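To show how these defaults interact, here is a minimal sketch of the quality gate and the outlier rejection. The `Face` record, the function names, and the inclusive comparisons (`>=`, `<=`) are assumptions; the real implementation in sort_faces.py may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Face:
    short_edge: int   # shorter side of the face bbox, in pixels
    blur: float       # Laplacian variance (higher means sharper)
    det_score: float  # InsightFace detector score


def passes_quality_gate(face, min_short=90, min_blur=40.0, min_det_score=0.6):
    """A face must clear all three per-face gates to be kept."""
    return (
        face.short_edge >= min_short
        and face.blur >= min_blur
        and face.det_score >= min_det_score
    )


def keep_after_outlier_rejection(distances, threshold=0.55, min_cluster=4):
    """Return the indices of faces kept, given each face's cosine distance
    from its cluster centroid. Clusters smaller than `min_cluster` are left
    untouched, matching the "only if cluster >= 4" rule in the table."""
    if len(distances) < min_cluster:
        return list(range(len(distances)))
    return [i for i, d in enumerate(distances) if d <= threshold]


print(passes_quality_gate(Face(short_edge=120, blur=55.0, det_score=0.8)))
print(keep_after_outlier_rejection([0.1, 0.7, 0.2, 0.3]))
```

The --min-faces size filter would then apply to whatever survives both steps.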
export-swap:
| flag | default | meaning |
|---|---|---|
| --top-n | 30 | size of the <faceset>_topN.fsz bundle |
| --outlier-threshold | 0.45 | tighter than refine; trims cluster boundary for averaging |
| --pad-ratio | 0.5 | padding around face bbox for PNG crop |
| --out-size | 512 | PNG output is square out_size × out_size |
| --min-face-short | 100 | export gate; stricter than refine's 90 |
| --candidates | off | rescue _singletons/ into _candidates/ for manual review |
| --candidate-match-threshold | 0.55 | cos-dist cutoff for singleton → existing faceset |
| --candidate-min-score | 0.40 | composite-quality floor for candidates |
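One plausible reading of --pad-ratio is sketched below: pad each side of the bbox by pad_ratio times its longer edge, take a square around the bbox center, and clamp to the image. This interpretation is an assumption, as is the `square_crop_box` helper; the README only says the ratio controls padding around the face bbox. The resulting region would then be resized to out_size × out_size.

```python
def square_crop_box(bbox, image_size, pad_ratio=0.5):
    """Return (left, top, right, bottom) for a padded square crop.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    Assumed semantics: side = longer bbox edge * (1 + 2 * pad_ratio).
    After clamping at an image border the box may no longer be square.
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    side = max(x2 - x1, y2 - y1) * (1 + 2 * pad_ratio)
    half = side / 2
    center_x = (x1 + x2) / 2
    center_y = (y1 + y2) / 2
    left = max(0, int(round(center_x - half)))
    top = max(0, int(round(center_y - half)))
    right = min(width, int(round(center_x + half)))
    bottom = min(height, int(round(center_y + half)))
    return left, top, right, bottom


print(square_crop_box((100, 100, 200, 200), (1000, 1000)))  # (50, 50, 250, 250)
```

Under this reading the default 0.5 yields a crop twice as wide as the face, which leaves room for hair and chin.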
The composite quality score in export-swap is 0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness, each normalized to [0, 1].
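As a worked example of the formula, the sketch below computes the score from already-normalized components. The weights come straight from the formula above; the `composite_quality` name and the dict-based input are illustrative, and how each component is normalized to [0, 1] is not specified here.

```python
# Weights as stated in the composite quality formula; they sum to 1.0.
QUALITY_WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}


def composite_quality(components):
    """Weighted sum of the five components, each expected in [0, 1]."""
    score = 0.0
    for name, weight in QUALITY_WEIGHTS.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1], got {value}")
        score += weight * value
    return score


example = {
    "frontality": 0.9,
    "det_score": 0.8,
    "landmark_symmetry": 0.7,
    "face_size": 0.5,
    "sharpness": 0.6,
}
# 0.27 + 0.16 + 0.14 + 0.075 + 0.09
print(round(composite_quality(example), 3))  # 0.735
```

With the default --candidate-min-score of 0.40, this example face would clear the candidate floor.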
Downstream: roop-unleashed
The .fsz bundles emitted by export-swap drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop — critical, because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding.
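Why a stray second face matters can be seen with a toy example. The sketch assumes the averaged identity is a plain mean of the per-face embeddings and uses 2-D vectors in place of real embeddings; both are simplifications, and the helper names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean, standing in for the averaged identity embedding."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


identity = [1.0, 0.0]                    # the person the faceset represents
clean_crops = [[1.0, 0.0], [0.98, 0.2]]  # single-face crops of that person
stray_face = [0.0, 1.0]                  # a bystander detected in one PNG

clean_sim = cosine_similarity(mean_vector(clean_crops), identity)
mixed_sim = cosine_similarity(mean_vector(clean_crops + [stray_face]), identity)

# The contaminated average drifts away from the true identity.
print(f"clean: {clean_sim:.3f}  with stray face: {mixed_sim:.3f}")
```

A single bystander among a handful of crops is enough to pull the average noticeably, which is the reason export-swap emits strictly single-face PNGs.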
Highly recommended at swap time: set Select post-processing to GFPGAN and the Original/Enhanced image blend ratio to 0.85 (roop-unleashed's default of 0.65 is conservative). See docs/analysis/facesets-downstream-refinement-evaluation.md for the full evaluation.
Layout
```
/opt/face-sets/
├─ README.md                  (this file)
├─ sort_faces.py              (the tool)
├─ docs/
│  └─ analysis/
│     └─ facesets-downstream-refinement-evaluation.md
└─ work/                      (gitignored)
   ├─ cache/
   │  └─ nl_full.npz          (canonical cache + duplicates.json)
   └─ logs/
      └─ *.log                (every long step writes here)
```