Document enrich + export-swap + extend; add swap-ready usage guide
README.md now covers all seven subcommands (embed, cluster, refine, dedup, extend, enrich, export-swap), an end-to-end pipeline recipe, the delta recipe for merging a new source into an existing result, the quality-weight formula used by export-swap, and the GFPGAN blend recommendation at swap time (0.85, overriding roop-unleashed's 0.65 default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
119
README.md
@@ -1,56 +1,119 @@
|
|||||||
# face-sets
|
# face-sets
|
||||||
|
|
||||||
Sort photos by similar face using InsightFace embeddings + agglomerative clustering, then refine into faceset-ready folders for downstream face-swap tooling (roop-unleashed, etc.).
|
Sort photos by similar face using InsightFace embeddings + agglomerative clustering, refine into per-identity sets, and export ready-to-drop bundles for face-swap tooling (roop-unleashed, etc.).
|
||||||
|
|
||||||
## Pipeline
|
## Pipeline
|
||||||
|
|
||||||
`sort_faces.py` is a single-file CLI with four subcommands:
|
`sort_faces.py` is a single-file CLI with seven subcommands:
|
||||||
|
|
||||||
| step | what it does |
|
| step | what it does |
|
||||||
|---------|------------------------------------------------------------------------------|
|
|-------------|-------------------------------------------------------------------------------------------------------------|
|
||||||
| embed | Recursively scan a source tree, detect + embed every face, write `.npz` cache |
|
| embed | Recursively scan a source tree, detect + embed every face, write `.npz` cache. Resumable; sha256-dedup. |
|
||||||
| cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` |
|
| cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` with manifest. |
|
||||||
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/` |
|
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/`. |
|
||||||
| dedup | Post-hoc near-duplicate report: byte-identical groups + visual near-dupes (same face + same size within a tight cosine threshold) |
|
| dedup | Post-hoc near-duplicate report: byte-identical + visual near-dupe groups → `<cache>.duplicates.json`. |
|
||||||
|
| extend | Fold new embeddings into an existing raw/refine output via nearest person-centroid without renumbering. |
|
||||||
|
| enrich | Re-detect each cached face to persist landmark_2d_106, landmark_3d_68, pose (pitch/yaw/roll) into the cache. |
|
||||||
|
| export-swap | Per-identity export: tight outlier gate + visual-dupe collapse + composite quality rank + single-face PNG crops + `.fsz` bundles (top-N and full) ready for roop-unleashed. Optional singleton rescue into `_candidates/`. |
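The nearest person-centroid idea behind `extend` can be sketched roughly as follows. This is an illustration only, not the code in `sort_faces.py`: the function names, the dict-of-centroids layout, and the `max_distance` cutoff are all assumptions made for the example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def assign_to_nearest(embedding, centroids, max_distance):
    """Return (label, distance) for the closest centroid.

    `centroids` maps an existing label (e.g. "person_001") to its centroid
    vector. Existing labels are never renumbered; a face that is farther than
    `max_distance` (a hypothetical cutoff) from every centroid gets label None.
    """
    best_label, best_dist = None, float("inf")
    for label, centroid in centroids.items():
        dist = cosine_distance(embedding, centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist <= max_distance:
        return best_label, best_dist
    return None, best_dist
```

Because new faces are only attached to existing labels, folder names from earlier runs stay valid after a merge.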
|
||||||
|
|
||||||
`embed` is resumable and incremental: it loads any existing cache at the target path and only hashes/embeds files it hasn't processed before. A periodic flush (default every 50 new files) writes the cache atomically, so a mid-run crash loses at most a few dozen embeddings.
|
### Design principles
|
||||||
|
|
||||||
Byte-identical duplicates are detected via sha256 during the listing phase. The canonical file is embedded once; other paths with the same hash are carried as `aliases` on the cache's top-level `path_aliases` dict. Every alias is materialized by `cluster`/`refine`, so each on-disk location ends up represented in the output.
|
- **embed is resumable and incremental.** It loads any existing cache at the target path and only hashes / embeds files it has not seen. The cache is flushed atomically every 50 new files (the default), so a mid-run crash loses at most the last ~50 files' embeddings.
|
||||||
|
- **Byte-identical duplicates are sha256-grouped at listing time.** The canonical file is embedded once; other paths with the same hash become `path_aliases` in the cache. Every alias is materialized by `cluster` / `refine` / `export-swap`, so each on-disk location is represented.
|
||||||
|
- **`safe_dst_name` always flattens the absolute path.** This keeps output filenames stable across runs even as `src_root` changes between embed / extend / export invocations.
|
||||||
|
- **Caches and outputs stay out of git** via `.gitignore`; defaults live under `work/`.
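The sha256 grouping described in the principles above might look something like the sketch below. The helper names and the returned `{canonical: [aliases]}` shape are assumptions for illustration; the actual `path_aliases` layout in the cache may differ.

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never read into memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_aliases(paths):
    """Map each canonical path to the other paths with identical bytes.

    Paths are sorted first, so the canonical pick (the first path in each
    hash group) is deterministic across runs. Only the canonical file would
    need to be embedded; the aliases are carried along for output.
    """
    by_hash = defaultdict(list)
    for path in sorted(str(p) for p in paths):
        by_hash[sha256_of(path)].append(path)
    return {group[0]: group[1:] for group in by_hash.values()}
```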
|
||||||
|
|
||||||
Cache and outputs are kept out of the repo via `.gitignore`; defaults live under `work/`.
|
## Typical end-to-end run
|
||||||
|
|
||||||
## Typical run
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 1. Embed (CPU; InsightFace buffalo_l). Caches faces + metadata. Resumable.
|
SRC=/mnt/x/src/nl
|
||||||
python sort_faces.py embed /mnt/x/src/nl work/cache/nl_full.npz
|
CACHE=work/cache/nl_full.npz
|
||||||
|
OUT=/mnt/e/temp_things/fcswp/nl_sorted
|
||||||
|
|
||||||
# 2. Raw clusters (every multi-face cluster -> a person_NNN/ folder).
|
# 1. Embed (CPU; InsightFace buffalo_l). Resumable on re-run.
|
||||||
python sort_faces.py cluster work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/raw_full
|
python sort_faces.py embed "$SRC" "$CACHE"
|
||||||
|
|
||||||
# 3. Refined facesets (filters for faceset-ready quality).
|
# 2. Raw clusters (one person_NNN/ per multi-face cluster).
|
||||||
python sort_faces.py refine work/cache/nl_full.npz /mnt/e/temp_things/fcswp/nl_sorted/facesets_full
|
python sort_faces.py cluster "$CACHE" "$OUT/raw_full"
|
||||||
|
|
||||||
# 4. (Optional) report on byte-identical + visual near-duplicates.
|
# 3. Refined facesets (quality-gated per-identity sets).
|
||||||
python sort_faces.py dedup work/cache/nl_full.npz
|
python sort_faces.py refine "$CACHE" "$OUT/facesets_full"
|
||||||
|
|
||||||
|
# 4. Near-duplicate report (byte + visual).
|
||||||
|
python sort_faces.py dedup "$CACHE"
|
||||||
|
|
||||||
|
# 5. Enrich the cache with landmarks + pose (needed by export-swap).
|
||||||
|
python sort_faces.py enrich "$CACHE"
|
||||||
|
|
||||||
|
# 6. Export roop-unleashed-ready bundles.
|
||||||
|
python sort_faces.py export-swap "$CACHE" \
|
||||||
|
"$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
|
||||||
|
--raw-manifest "$OUT/raw_full/manifest.json" --candidates
|
||||||
```
|
```
|
||||||
|
|
||||||
## Refine defaults
|
### Merging a new source into an existing result
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Embed new source into the same cache (resume from existing embeddings + aliases).
|
||||||
|
python sort_faces.py embed /mnt/x/src/lzbkp_red "$CACHE"
|
||||||
|
|
||||||
|
# Fold new faces into raw_full + facesets_full without renumbering.
|
||||||
|
python sort_faces.py extend "$CACHE" "$OUT/raw_full" --refine-out "$OUT/facesets_full"
|
||||||
|
|
||||||
|
# Refresh the swap-ready export to reflect the merge.
|
||||||
|
python sort_faces.py enrich "$CACHE"
|
||||||
|
python sort_faces.py export-swap "$CACHE" \
|
||||||
|
"$OUT/facesets_full/refine_manifest.json" "$OUT/facesets_swap_ready" \
|
||||||
|
--raw-manifest "$OUT/raw_full/manifest.json" --candidates
|
||||||
|
```
|
||||||
|
|
||||||
|
## Key defaults
|
||||||
|
|
||||||
|
`refine`:
|
||||||
|
|
||||||
| flag | default | meaning |
|
| flag | default | meaning |
|
||||||
|---|---|---|
|
|-------------------------|--------:|---------|
|
||||||
| `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering |
|
| `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering |
|
||||||
| `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters |
|
| `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters |
|
||||||
| `--outlier-threshold` | 0.55 | drop face if cosine dist from cluster centroid exceeds this (only if cluster ≥ 4) |
|
| `--outlier-threshold` | 0.55 | drop face if cosine dist from centroid exceeds this (only if cluster ≥ 4) |
|
||||||
| `--min-faces` | 15 | minimum unique images per faceset |
|
| `--min-faces` | 15 | minimum unique images per faceset |
|
||||||
| `--min-short` | 90 | minimum short-edge pixels of face bbox |
|
| `--min-short` | 90 | minimum short-edge pixels of face bbox |
|
||||||
| `--min-blur` | 40.0 | Laplacian-variance blur gate |
|
| `--min-blur` | 40.0 | Laplacian-variance blur gate |
|
||||||
| `--min-det-score` | 0.6 | InsightFace detector score gate |
|
| `--min-det-score` | 0.6 | InsightFace detector score gate |
|
||||||
| `--mode` | copy | copy / move / symlink |
|
|
||||||
|
|
||||||
## Prior runs (as of 2026-04-22)
|
`export-swap`:
|
||||||
|
|
||||||
- `work/cache/kos11.npz` — 181 images, 333 faces from `Kos '11/` → `kos11_sorted/`
|
| flag | default | meaning |
|
||||||
- `work/cache/nl_all.npz` — 916 images, 1396 faces from `Neuer Ordner (2)/New Folder/` → `nl_sorted/raw/`, refined to 6 facesets (197, 120, 91, 47, 23, 18 images)
|
|-------------------------------|--------:|---------|
|
||||||
|
| `--top-n` | 30 | size of the `<faceset>_topN.fsz` bundle |
|
||||||
|
| `--outlier-threshold` | 0.45 | tighter than refine; trims cluster boundary for averaging |
|
||||||
|
| `--pad-ratio` | 0.5 | padding around face bbox for PNG crop |
|
||||||
|
| `--out-size` | 512 | PNG output is square `out_size × out_size` |
|
||||||
|
| `--min-face-short` | 100 | export gate; stricter than refine's 90 |
|
||||||
|
| `--candidates` | off | rescue `_singletons/` into `_candidates/` for manual review |
|
||||||
|
| `--candidate-match-threshold` | 0.55 | cos-dist cutoff for singleton → existing faceset |
|
||||||
|
| `--candidate-min-score` | 0.40 | composite-quality floor for candidates |
|
||||||
|
|
||||||
Output lives outside the repo at `/mnt/e/temp_things/fcswp/`.
|
The composite quality score in `export-swap` is `0.30·frontality + 0.20·det_score + 0.20·landmark_symmetry + 0.15·face_size + 0.15·sharpness`, each normalized to `[0, 1]`.
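As a worked illustration of that weighted sum, here is a minimal sketch. It assumes each component has already been normalized to `[0, 1]` (how `export-swap` normalizes them is not shown here); the function name and the defensive clamping are additions for the example, not taken from `sort_faces.py`.

```python
# Weights as stated in the formula above; they sum to 1.0.
WEIGHTS = {
    "frontality": 0.30,
    "det_score": 0.20,
    "landmark_symmetry": 0.20,
    "face_size": 0.15,
    "sharpness": 0.15,
}


def composite_quality(components):
    """Weighted sum of per-face quality components.

    `components` maps each name in WEIGHTS to a value expected in [0, 1].
    Values are clamped to that range (an assumption of this sketch), so the
    result also stays in [0, 1].
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, float(components[name])))
        total += weight * value
    return total
```

With these weights, frontality alone accounts for 0.30 of the score, so under this formula a sharp but strongly turned face cannot outrank a comparably sharp frontal one.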
|
||||||
|
|
||||||
|
## Downstream: roop-unleashed
|
||||||
|
|
||||||
|
The `.fsz` bundles emitted by `export-swap` drop straight into roop-unleashed's Face Swap tab. Each PNG inside is already a clean single-face crop — critical, because the roop-unleashed loader appends every face it re-detects in each PNG to the averaged identity embedding.
|
||||||
|
|
||||||
|
Highly recommended at swap time: enable **Select post-processing = GFPGAN** with the **Original/Enhanced image blend ratio = 0.85** (the roop-unleashed default of 0.65 is conservative). See `docs/analysis/facesets-downstream-refinement-evaluation.md` for the full evaluation.
|
||||||
|
|
||||||
|
## Layout
|
||||||
|
|
||||||
|
```
|
||||||
|
/opt/face-sets/
|
||||||
|
├─ README.md (this file)
|
||||||
|
├─ sort_faces.py (the tool)
|
||||||
|
├─ docs/
|
||||||
|
│ └─ analysis/
|
||||||
|
│ └─ facesets-downstream-refinement-evaluation.md
|
||||||
|
└─ work/ (gitignored)
|
||||||
|
├─ cache/
|
||||||
|
│ └─ nl_full.npz (canonical cache + duplicates.json)
|
||||||
|
└─ logs/
|
||||||
|
└─ *.log (every long step writes here)
|
||||||
|
```
|
||||||
|
|||||||