Peter d53ab9fbfc Add enrich + export-swap pipeline for downstream face-swap ready output
- enrich: re-detects each cached face with buffalo_l (detection +
  landmark_2d_106 + landmark_3d_68, recognition module skipped for speed)
  and persists landmarks + pose into the cache so per-face frontality and
  landmark-symmetry quality signals become available.
- compute_quality: composite score combining det_score, face short-edge,
  blur, frontality (from pose pitch/yaw), and 2D-landmark symmetry with
  tunable weights. Default weighting 0.30/0.20/0.20/0.15/0.15.
- export-swap: builds facesets_swap_ready/ from an existing refine
  manifest. Per identity: tighter outlier gate (default 0.45), visual-
  near-dupe collapse (keep best representative per group), multi-face-
  per-source-image collapse (keep best bbox), rank by composite score,
  single-face-per-PNG crops at 512x512 with 0.5 bbox padding, ready-to-
  drop .fsz bundles (top-N + full), per-faceset manifest.json, NAME.txt
  placeholder for the operator. The multi-face-per-PNG collapse is the
  critical fix: roop-unleashed's .fsz loader appends every detected face
  in each PNG to the FaceSet, so any multi-face crop would contaminate
  the averaged embedding.
- Optional --candidates rescues raw_full singletons: matches against the
  final per-faceset centroids and routes to _candidates/to_<faceset>/
  for manual review; orphaned singletons that still cluster among
  themselves land in _candidates/new_<NNN>/.
- docs/analysis/: evaluation document captures the evidence, downstream
  requirements (FaceSet averaging, inswapper_128), opportunity matrix
  (R1-R14), and the recommended target state this export implements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:37:32 +02:00

face-sets

Sort photos by similar face using InsightFace embeddings + agglomerative clustering, then refine into faceset-ready folders for downstream face-swap tooling (roop-unleashed, etc.).

Pipeline

sort_faces.py is a single-file CLI with four subcommands:

| step | what it does |
| --- | --- |
| embed | Recursively scan a source tree, detect + embed every face, write .npz cache |
| cluster | Raw agglomerative clustering of the cache into person_NNN/ / _singletons/ / _noface/ |
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → faceset_NNN/ |
| dedup | Post-hoc near-duplicate report: byte-identical groups + visual near-dupes (same face + same size within a tight cosine threshold) |

embed is resumable and incremental: it loads any existing cache at the target path and only hashes/embeds files it hasn't processed before. A periodic flush (default every 50 new files) writes the cache atomically, so a mid-run crash loses at most a few dozen embeddings.
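The resumable, atomically flushed behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in sort_faces.py: `atomic_write_bytes`, `embed_incrementally`, and the `embed_one` / `flush` callbacks are hypothetical names, and the real cache is an .npz file rather than raw bytes.

```python
import os
import tempfile


def atomic_write_bytes(path: str, payload: bytes) -> None:
    """Write payload so readers see either the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic swap of the directory entry
    except BaseException:
        # Remove the partial temp file; the previous cache is left untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def embed_incrementally(files, already_done, embed_one, flush, flush_every=50):
    """Skip files already in the cache; flush after every `flush_every` new files."""
    new_since_flush = 0
    for name in files:
        if name in already_done:
            continue  # processed in an earlier run
        already_done[name] = embed_one(name)
        new_since_flush += 1
        if new_since_flush >= flush_every:
            flush(already_done)
            new_since_flush = 0
    if new_since_flush:
        flush(already_done)  # final flush for the remainder
```

With this shape, a crash between flushes loses at most `flush_every` files of work, and the cache on disk is always a complete earlier snapshot.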

Byte-identical duplicates are detected via sha256 during the listing phase. The canonical file is embedded once; other paths with the same hash are recorded as aliases in the cache's top-level path_aliases dict. cluster and refine materialize every alias, so each on-disk location is represented in the output.
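As a rough sketch of that listing phase: hash every file, embed one canonical path per hash, and record the rest as aliases. The function names and the exact shape of the mapping (canonical path to list of alias paths) are assumptions for illustration; only the path_aliases name comes from the cache itself.

```python
import hashlib
from collections import defaultdict


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_canonical_and_aliases(paths):
    """Return (canonical_paths, path_aliases) for a list of file paths.

    Assumed layout: path_aliases maps each canonical path to the other
    paths that share its sha256. Paths are sorted first so the choice of
    canonical file is deterministic.
    """
    by_hash = defaultdict(list)
    for path in sorted(paths):
        by_hash[sha256_of(path)].append(path)

    canonical = []
    path_aliases = {}
    for group in by_hash.values():
        first, *rest = group
        canonical.append(first)  # only this path gets embedded
        if rest:
            path_aliases[first] = rest
    return canonical, path_aliases
```

Only the paths in `canonical` need to go through detection and embedding; the aliases are carried along as metadata and expanded again when output folders are written.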

Cache and outputs are kept out of the repo via .gitignore; defaults live under work/.

Typical run

```bash
# 1. Embed (CPU; InsightFace buffalo_l). Caches faces + metadata. Resumable.
python sort_faces.py embed /mnt/x/src/nl work/cache/nl_full.npz

# 2. Raw clusters (every multi-face cluster -> a person_NNN/ folder).
python sort_faces.py cluster work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/raw_full

# 3. Refined facesets (filters for faceset-ready quality).
python sort_faces.py refine  work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/facesets_full

# 4. (Optional) report on byte-identical + visual near-duplicates.
python sort_faces.py dedup   work/cache/nl_full.npz
```

Refine defaults

| flag | default | meaning |
| --- | --- | --- |
| --initial-threshold | 0.55 | cosine distance for stage-1 clustering |
| --merge-threshold | 0.40 | centroid-level merge of over-split clusters |
| --outlier-threshold | 0.55 | drop face if cosine dist from cluster centroid exceeds this (only if cluster ≥ 4) |
| --min-faces | 15 | minimum unique images per faceset |
| --min-short | 90 | minimum short-edge pixels of face bbox |
| --min-blur | 40.0 | Laplacian-variance blur gate |
| --min-det-score | 0.6 | InsightFace detector score gate |
| --mode | copy | copy / move / symlink |
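To make the --outlier-threshold rule concrete, here is a minimal pure-Python sketch of centroid-based outlier rejection. It assumes the centroid is the plain mean of the cluster's embeddings and that all embeddings are non-zero; `cosine_distance` and `reject_outliers` are illustrative names, and the actual refine step may compute the centroid or apply the size check differently.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def reject_outliers(embeddings, threshold=0.55, min_cluster=4):
    """Return indices of faces kept after the outlier gate.

    Clusters smaller than `min_cluster` are returned unchanged, mirroring
    the "only if cluster >= 4" condition in the table above.
    """
    if len(embeddings) < min_cluster:
        return list(range(len(embeddings)))
    dim = len(embeddings[0])
    # Assumption: the centroid is the unweighted mean embedding.
    centroid = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return [
        i for i, e in enumerate(embeddings)
        if cosine_distance(e, centroid) <= threshold
    ]
```

Skipping the gate for small clusters is a sensible guard: with only two or three faces, a single odd embedding shifts the centroid so much that the distances stop being meaningful.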

Prior runs (as of 2026-04-22)

  • work/cache/kos11.npz — 181 images, 333 faces from Kos '11/kos11_sorted/
  • work/cache/nl_all.npz — 916 images, 1396 faces from Neuer Ordner (2)/New Folder/nl_sorted/raw/, refined to 6 facesets (197, 120, 91, 47, 23, 18 images)

Output lives outside the repo at /mnt/e/temp_things/fcswp/.
