From 03a0c755318f7761c06f787463ca3ec04738fda2 Mon Sep 17 00:00:00 2001
From: Peter
Date: Sun, 26 Apr 2026 12:08:25 +0200
Subject: [PATCH] Document hand-sorted-folder import + age-split workflow

- README: document work/build_folders.py (hand-sorted folder identities)
  and the new age-split workflow for splitting a long-running identity
  into era-specific facesets after clustering.
- Force-track work/age_split_001.py and work/check_faceset001_age.py;
  these are the worked example + readiness probe for faceset_001 and the
  template for splitting any other identity by EXIF era.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md                    |  95 ++++++-
 work/age_split_001.py        | 485 +++++++++++++++++++++++++++++++++++
 work/check_faceset001_age.py | 151 +++++++++++
 3 files changed, 729 insertions(+), 2 deletions(-)
 create mode 100644 work/age_split_001.py
 create mode 100644 work/check_faceset001_age.py

diff --git a/README.md b/README.md
index 2e0cd7c..f1fe9f7 100644
--- a/README.md
+++ b/README.md
@@ -67,6 +67,92 @@ python sort_faces.py export-swap "$CACHE" \
   --raw-manifest "$OUT/raw_full/manifest.json" --candidates
 ```
 
+### Importing hand-sorted folders as identities
+
+When source folders are already hand-sorted by person (one folder per identity), the
+clustering path is the wrong tool — the identity is asserted, not inferred. The
+orchestration script `work/build_folders.py` covers this case:
+
+- For each trusted folder, it filters cache records that fall under it, builds an
+  identity centroid via two-pass outlier rejection (cos-dist 0.55 → 0.45) so
+  bystanders in group photos drop out, and writes a synthetic `refine_manifest.json`.
+- It then routes each face record from a *mixed* folder (e.g. `osrc/`) into every
+  identity centroid within a tight cosine cutoff (default 0.45). A multi-identity
+  photo lands in multiple facesets; `export-swap`'s per-bbox outlier filter ensures
+  each faceset crops only its matching face.
+- Finally it invokes `cmd_export_swap` against the synthetic manifest, renames the
+  emitted `.fsz` bundles after the source folder, drops a `