README.md now covers all seven subcommands (embed, cluster, refine, dedup, extend, enrich, export-swap), an end-to-end pipeline recipe, the delta recipe for merging a new source into an existing result, the quality-weight formula used by export-swap, and the GFPGAN blend recommendation at swap time (0.85, overriding roop-unleashed's 0.65 default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# face-sets

Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).

## Pipeline

`sort_faces.py` is a single-file CLI with seven subcommands:

| step        | what it does |
|-------------|--------------|
| embed       | Recursively scan a source tree, detect + embed every face, write `.npz` cache. Resumable; sha256-dedup. |
| cluster     | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` with manifest. |
| refine      | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/`. |
| dedup       | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → `<cache>.duplicates.json`. |
| extend      | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
| enrich      | Re-detect each cached face to persist `landmark_2d_106`, `landmark_3d_68`, pose (pitch/yaw/roll) into cache. |
| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + `.fsz` bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into `_candidates/`. |

### Design principles

- **embed is resumable and incremental.** It loads any existing cache at the target path and only hashes / embeds files it has not seen. The cache is flushed atomically every 50 new files, so a mid-run crash loses at most ~50 embeddings.
- **Byte-identical duplicates are sha256-grouped at listing time.** The canonical file is embedded once; other paths with the same hash become `path_aliases` in the cache. Every alias is materialized by `cluster` / `refine` / `export-swap`, so each on-disk location is represented.
- **`safe_dst_name` always flattens the absolute path.** This keeps output filenames stable across runs even as `src_root` changes between embed / extend / export invocations.
- **Caches and outputs stay out of git** via `.gitignore`; defaults live under `work/`.

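The first two principles can be illustrated with a minimal, stdlib-only sketch. It is not taken from `sort_faces.py`: the function names `sha256_of`, `group_by_hash`, and `atomic_write` are hypothetical, the choice of "first path in sorted order" as the canonical file is an assumption, and the real tool writes an `.npz` cache rather than raw bytes.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never fully loaded."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_hash(paths):
    """Return {canonical_path: [alias_paths]} for byte-identical files.

    Assumption: the first path in sorted order is treated as canonical.
    """
    by_hash = {}
    for path in sorted(Path(p) for p in paths):
        by_hash.setdefault(sha256_of(path), []).append(path)
    return {group[0]: group[1:] for group in by_hash.values()}


def atomic_write(target, payload):
    """Write to a temp file in the target's directory, then rename over it."""
    target = Path(target)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Only the canonical path of each group would need embedding; the aliases are carried along as metadata. Writing to a sibling temp file and renaming means a crash leaves either the old cache or the new one on disk, never a half-written file.
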
## Typical end-to-end run

```bash
SRC=/mnt/x/src/nl
CACHE=work/cache/nl_full.npz
OUT=/mnt/e/temp_things/fcswp/nl_sorted

# 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
python sort_faces.py embed "$SRC" "$CACHE"

# 2. Raw clusters (one person_NNN/ per multi-face cluster).
python sort_faces.py cluster "$CACHE" "$OUT/raw_full"

# 3. Refined facesets (quality-gated per-identity sets).
python sort_faces.py refine "$CACHE" "$OUT/facesets_full"

# 4. Near-duplicate report (byte + visual).
python sort_faces.py dedup "$CACHE"

# 5. Enrich the cache with landmarks + pose (needed by export-swap).
python sort_faces.py enrich "$CACHE"

# 6. Export roop-unleashed-ready bundles.
python sort_faces.py export-swap "$CACHE" \
    "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
    --raw-manifest "$OUT/raw_full/manifest.json" --candidates
```

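If the recipe needs to be driven from Python rather than a shell, the same six invocations could be assembled as argument lists. This is only a sketch: `pipeline_commands` and `run_pipeline` are hypothetical helpers, not part of `sort_faces.py`, and the sketch uses only the subcommands, positional arguments, and flags shown in the recipe above.

```python
import subprocess
import sys


def pipeline_commands(src, cache, out, script="sort_faces.py"):
    """Return the six CLI invocations of the recipe as argv lists, in order."""
    base = [sys.executable, script]
    refine_manifest = f"{out}/facesets_full/refine_manifest.json"
    raw_manifest = f"{out}/raw_full/manifest.json"
    return [
        base + ["embed", src, cache],
        base + ["cluster", cache, f"{out}/raw_full"],
        base + ["refine", cache, f"{out}/facesets_full"],
        base + ["dedup", cache],
        base + ["enrich", cache],
        base + ["export-swap", cache, refine_manifest,
                f"{out}/facesets_swap_ready",
                "--raw-manifest", raw_manifest, "--candidates"],
    ]


def run_pipeline(src, cache, out):
    """Run each step in order; check=True stops at the first failing step."""
    for argv in pipeline_commands(src, cache, out):
        subprocess.run(argv, check=True)
```

Because `embed` is resumable, re-running `run_pipeline` after a failure should not repeat embedding work that already reached the cache.
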
### Merging a new source into an existing result

```bash
# Embed new source into the same cache (resume from existing embeddings + aliases).
python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"

# Fold new faces into raw_full + facesets_full without renumbering.
python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"

# Refresh the swap-ready export to reflect the merge.
python sort_faces.py enrich "$CACHE"
python sort_faces.py export-swap "$CACHE" \
    "$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
    --raw-manifest "$OUT/raw_full/manifest.json" --candidates
```

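The nearest person-centroid idea behind `extend` can be sketched in a few lines of plain Python. This is an illustration rather than the tool's code: `cosine_distance` and `assign_to_nearest` are hypothetical names, `max_dist` is an assumed cutoff whose real value is not documented here, and vectors are assumed to be non-zero.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def assign_to_nearest(embedding, centroids, max_dist):
    """Return the existing person id closest to `embedding`, or None.

    `centroids` maps person id -> centroid vector. Existing ids are only
    looked up, never reassigned, so numbering stays stable.
    """
    best_id, best_dist = None, float("inf")
    for person_id, centroid in centroids.items():
        dist = cosine_distance(embedding, centroid)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id if best_dist <= max_dist else None
```

A face that returns `None` matches no existing identity closely enough; how the real tool files such faces is not specified in this README.
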
## Key defaults

`refine`:

| flag                    | default | meaning |
|-------------------------|--------:|---------|
| `--initial-threshold`   |    0.55 | cosine distance for stage-1 clustering |
| `--merge-threshold`     |    0.40 | centroid-level merge of over-split clusters |
| `--outlier-threshold`   |    0.55 | drop a face whose cosine distance from the centroid exceeds this (only if cluster size ≥ 4) |
| `--min-faces`           |      15 | minimum unique images per faceset |
| `--min-short`           |      90 | minimum short-edge pixels of face bbox |
| `--min-blur`            |    40.0 | Laplacian-variance blur gate |
| `--min-det-score`       |     0.6 | InsightFace detector score gate |

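As a rough sketch of how the per-face gates and the outlier rule in this table might combine, assuming the per-face measurements are already computed: the `Face` dataclass and both function names are hypothetical, and whether the real comparisons are inclusive (`>=`) or strict is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Face:
    short_edge: int    # shorter side of the face bbox, in pixels
    blur: float        # Laplacian variance; higher means sharper
    det_score: float   # detector confidence in [0, 1]


def passes_quality_gate(face, min_short=90, min_blur=40.0, min_det_score=0.6):
    """True if the face clears all three per-face gates (assumed inclusive)."""
    return (face.short_edge >= min_short
            and face.blur >= min_blur
            and face.det_score >= min_det_score)


def keep_after_outlier_rejection(centroid_dists, outlier_threshold=0.55,
                                 min_cluster=4):
    """Return indices of faces to keep.

    `centroid_dists[i]` is face i's cosine distance to its cluster centroid.
    Clusters smaller than `min_cluster` are returned untouched.
    """
    if len(centroid_dists) < min_cluster:
        return list(range(len(centroid_dists)))
    return [i for i, d in enumerate(centroid_dists) if d <= outlier_threshold]
```

Skipping outlier rejection for very small clusters avoids trimming against a centroid that only a few faces define.
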
`export-swap`:

| flag                          | default | meaning |
|-------------------------------|--------:|---------|
| `--top-n`                     |      30 | size of the `<faceset>_topN.fsz` bundle |
| `--outlier-threshold`         |    0.45 | tighter than refine; trims cluster boundary for averaging |
| `--pad-ratio`                 |     0.5 | padding around face bbox for PNG crop |
| `--out-size`                  |     512 | PNG output is square `out_size × out_size` |
| `--min-face-short`            |     100 | export gate; stricter than refine's 90 |
| `--candidates`                |     off | rescue `_singletons/` into `_candidates/` for manual review |
| `--candidate-match-threshold` |    0.55 | cos-dist cutoff for singleton → existing faceset |
| `--candidate-min-score`       |    0.40 | composite-quality floor for candidates |

The composite quality score in `export-swap` is `0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness`, each normalized to `[0, 1]`.

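The formula translates directly into a weighted sum. The sketch below assumes each component has already been normalized to `[0, 1]` (how the tool normalizes them is not described here) and clamps defensively; `WEIGHTS`, `composite_quality`, and `top_n` are illustrative names, not identifiers from `sort_faces.py`.

```python
WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}


def composite_quality(components):
    """Weighted sum of the five components, each clamped to [0, 1]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, components[name]))
        total += weight * value
    return total


def top_n(faces, n=30):
    """Return the n highest-scoring faces; each face is a components dict."""
    return sorted(faces, key=composite_quality, reverse=True)[:n]
```

The weights sum to 1, so the score itself stays in `[0, 1]`. Frontality carries the largest single weight, so at otherwise equal quality a frontal face ranks above a profile view.
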
## Downstream: roop-unleashed

The `.fsz` bundles emitted by `export-swap` drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop. This is critical because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding.

Highly recommended at swap time: enable **Select post-processing = GFPGAN** with **Original/Enhanced image blend ratio = 0.85** (the default of 0.65 is conservative). See `docs/analysis/facesets-downstream-refinement-evaluation.md` for the full evaluation.

## Layout

```
/opt/face-sets/
├─ README.md                          (this file)
├─ sort_faces.py                      (the tool)
├─ docs/
│  └─ analysis/
│     └─ facesets-downstream-refinement-evaluation.md
└─ work/                              (gitignored)
   ├─ cache/
   │  └─ nl_full.npz                  (canonical cache + duplicates.json)
   └─ logs/
      └─ *.log                        (every long step writes here)
```