Add face-sort pipeline as the repo's base

Single-file CLI (embed / cluster / refine) using InsightFace buffalo_l embeddings and agglomerative clustering, migrated in from the ad-hoc /home/peter/face_sort/ directory so this repo is the canonical home for faceset preparation feeding roop-unleashed and similar tools.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
49
README.md
@@ -1 +1,48 @@
Dummy
# face-sets

Sort photos by face similarity using InsightFace embeddings + agglomerative clustering, then refine into faceset-ready folders for downstream face-swap tooling (roop-unleashed, etc.).

## Pipeline

`sort_faces.py` is a single-file CLI with three subcommands:

| step | what it does |
|---------|------------------------------------------------------------------------------|
| embed | Recursively scan a source tree, detect + embed every face, write `.npz` cache |
| cluster | Raw agglomerative clustering of the cache into `person_NNN/` / `_singletons/` / `_noface/` |
| refine | Initial cluster → centroid merge → quality gate → outlier rejection → size filter → `faceset_NNN/` |

Cache and outputs are kept out of the repo via `.gitignore`; defaults live under `work/`.
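
As a rough illustration of what the `cluster` step does conceptually, the sketch below runs threshold-based agglomerative clustering over embeddings using cosine distance. It is not the code in `sort_faces.py`: the function names (`cosine_distance`, `agglomerate`), the average-linkage choice, and the toy 2-D vectors are all assumptions made for this example, and the 0.55 cutoff simply mirrors the `--initial-threshold` default documented for `refine`.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def agglomerate(embeddings, threshold=0.55):
    """Average-linkage agglomerative clustering with a distance cutoff.

    Starts with one cluster per embedding and repeatedly merges the two
    closest clusters until no pair is closer than `threshold`.
    Returns a list of clusters, each a sorted list of input indices.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Average pairwise cosine distance between members of c1 and c2.
        total = sum(cosine_distance(embeddings[i], embeddings[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        dist, a, b = min(
            ((linkage(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if dist > threshold:
            break  # nothing left that is close enough to merge
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy 2-D "embeddings": two tight groups and one stray vector.
faces = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (-1.0, 0.0)]
groups = agglomerate(faces)
multi = [g for g in groups if len(g) > 1]    # would map to person_NNN/ folders
single = [g for g in groups if len(g) == 1]  # would map to _singletons/
```

This brute-force loop recomputes every pairwise linkage on each pass, so it is only suitable for small inputs. A real run over hundreds of faces would more plausibly rely on a library implementation.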

## Typical run

```bash
# 1. Embed (CPU; InsightFace buffalo_l). Caches faces + metadata.
python sort_faces.py embed "/mnt/x/src/nl/Neuer Ordner (2)/New Folder" work/cache/nl_all.npz

# 2. Raw clusters (every multi-face cluster -> a person_NNN/ folder).
python sort_faces.py cluster work/cache/nl_all.npz /mnt/e/temp_things/fcswp/nl_sorted/raw

# 3. Refined facesets (filters for faceset-ready quality).
python sort_faces.py refine work/cache/nl_all.npz /mnt/e/temp_things/fcswp/nl_sorted/facesets
```

## Refine defaults

| flag | default | meaning |
|---|---|---|
| `--initial-threshold` | 0.55 | cosine distance for stage-1 clustering |
| `--merge-threshold` | 0.40 | centroid-level merge of over-split clusters |
| `--outlier-threshold` | 0.55 | drop face if cosine dist from cluster centroid exceeds this (only if cluster ≥ 4) |
| `--min-faces` | 15 | minimum unique images per faceset |
| `--min-short` | 90 | minimum short-edge pixels of face bbox |
| `--min-blur` | 40.0 | Laplacian-variance blur gate |
| `--min-det-score` | 0.6 | InsightFace detector score gate |
| `--mode` | copy | copy / move / symlink |
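
To show how the quality gate and outlier rejection defaults above might combine, here is a minimal sketch. Everything in it is hypothetical: the `Face` record and its field names, the helper functions, the inclusive comparisons, and the use of a plain component-wise mean as the centroid are assumptions for illustration, not the actual logic of `sort_faces.py`. The blur value is assumed to be a precomputed Laplacian variance of the face crop.

```python
import math
from dataclasses import dataclass


@dataclass
class Face:
    """Hypothetical per-face record; field names are illustrative only."""
    embedding: tuple   # face embedding vector
    bbox_w: int        # bounding-box width in pixels
    bbox_h: int        # bounding-box height in pixels
    blur: float        # Laplacian variance of the face crop
    det_score: float   # detector confidence


def passes_quality(face, min_short=90, min_blur=40.0, min_det_score=0.6):
    """Quality gate: keep faces that are large, sharp, and confidently detected."""
    return (min(face.bbox_w, face.bbox_h) >= min_short
            and face.blur >= min_blur
            and face.det_score >= min_det_score)


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(len(vectors[0])))


def reject_outliers(faces, outlier_threshold=0.55, min_cluster=4):
    """Drop faces whose cosine distance from the centroid exceeds the threshold.

    Clusters with fewer than `min_cluster` faces are returned unchanged.
    """
    if len(faces) < min_cluster:
        return list(faces)
    center = centroid([f.embedding for f in faces])
    return [f for f in faces
            if cosine_distance(f.embedding, center) <= outlier_threshold]


good = Face((1.0, 0.0), 120, 140, 85.0, 0.91)
small = Face((1.0, 0.0), 60, 140, 85.0, 0.91)    # short edge 60 < 90
blurry = Face((1.0, 0.0), 120, 140, 12.0, 0.91)  # blur 12.0 < 40.0

cluster = [
    good,
    Face((0.9, 0.1), 110, 130, 70.0, 0.88),
    Face((0.95, -0.05), 100, 115, 64.0, 0.80),
    Face((0.0, 1.0), 105, 120, 90.0, 0.85),      # points away from the rest
]
gated = [f for f in cluster if passes_quality(f)]
kept = reject_outliers(gated)
```

In this toy cluster all four faces pass the gate, and the fourth is then dropped because its embedding sits far from the centroid of the group.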

## Prior runs (as of 2026-04-22)

- `work/cache/kos11.npz`: 181 images, 333 faces from `Kos '11/` → `kos11_sorted/`
- `work/cache/nl_all.npz`: 916 images, 1396 faces from `Neuer Ordner (2)/New Folder/` → `nl_sorted/raw/`, refined to 6 facesets (197, 120, 91, 47, 23, 18 images)

Output lives outside the repo at `/mnt/e/temp_things/fcswp/`.