work/cluster_osrc.py mirrors build_folders.py's shape (synthesize a
refine_manifest, hand off to cmd_export_swap, relocate, merge top-level
manifest) but discovers identities by clustering rather than asserting
them by folder. It drops faces already covered by existing identity
centroids, clusters the rest at 0.55, applies refine-equivalent gates
with min_faces=6, and numbers new facesets past the existing maximum so
that faceset_001..NNN are never disturbed.
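
A minimal sketch of the size gate and the "number past the existing
maximum" rule, for illustration only. The helper name
next_faceset_names and its arguments are hypothetical and are not taken
from cluster_osrc.py; only min_faces=6 and the faceset_NNN naming come
from the description above.

```python
import re

# Matches existing faceset directory names such as "faceset_019".
FACESET_RE = re.compile(r"^faceset_(\d+)$")


def next_faceset_names(existing, clusters, min_faces=6):
    """Name clusters that pass the size gate, starting after the highest
    existing faceset index so that no existing faceset is renumbered.

    `existing` is an iterable of directory names; `clusters` is a list of
    face lists. Both are illustrative inputs, not the script's real types.
    """
    indices = [int(m.group(1)) for name in existing
               if (m := FACESET_RE.match(name))]
    next_idx = max(indices, default=0) + 1

    named = {}
    for faces in clusters:
        if len(faces) < min_faces:
            continue  # too thin to become a faceset
        named[f"faceset_{next_idx:03d}"] = faces
        next_idx += 1
    return named


existing_dirs = ["faceset_001", "faceset_019", "_candidates"]
new_clusters = [list(range(6)), list(range(3)), list(range(10))]
named_clusters = next_faceset_names(existing_dirs, new_clusters)
print(sorted(named_clusters))
```

Because numbering starts from the highest existing index rather than
from 1, a re-run only ever appends.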
The 2026-04-26 run on /mnt/x/src/osrc produced faceset_020..025 (sizes
4-26 exported PNGs); analysis writeup in docs/analysis/.
README also notes the refine-renumbers caveat in passing: extend plus an
orchestration script is the safe pattern; cmd_refine is for fresh
clusters only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the 2026-04-26 split of faceset_001 (707 curated faces) into
6 substantive era buckets + 68 thin fragments, including the readiness
probe evidence, the anchor-based assignment rationale (replaces
transitive union-find that caused year-drift), and the re-run / apply-
to-other-identity workflow.
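
A minimal sketch of why anchor-based assignment avoids the drift that
transitive linking causes. The function names, the 2-D toy embeddings,
the era labels, and the 0.82 threshold are all assumptions made for
illustration; they are not the values or code used in the split.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def assign_to_anchors(faces, anchors, threshold):
    """Assign each face to its most similar anchor, or None when no
    anchor reaches the threshold. Faces are compared only to anchors,
    never to each other, so similarity cannot chain through neighbours."""
    assignment = {}
    for face_id, emb in faces.items():
        best_name, best_sim = None, threshold
        for name, anchor in anchors.items():
            sim = cosine(emb, anchor)
            if sim >= best_sim:
                best_name, best_sim = name, sim
        assignment[face_id] = best_name
    return assignment


def unit(deg):
    """Unit vector at the given angle, used as a toy embedding."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))


# Toy chain: each face is similar to its neighbour (cos 30 deg ~ 0.87),
# so transitive union-find at 0.82 would merge f0, f30, f60 and f90 into
# one group even though f0 and f90 are dissimilar (cosine ~ 0).
faces = {f"f{d}": unit(d) for d in (0, 30, 45, 60, 90)}
anchors = {"era_a": unit(0), "era_b": unit(90)}
eras = assign_to_anchors(faces, anchors, threshold=0.82)
print(eras)
```

In this toy case the two ends of the chain stay in separate buckets and
the ambiguous middle face is left unassigned rather than pulled into
either era.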
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- enrich: re-detects each cached face with buffalo_l (detection +
landmark_2d_106 + landmark_3d_68, recognition module skipped for speed)
and persists landmarks + pose into the cache so per-face frontality and
landmark-symmetry quality signals become available.
- compute_quality: composite score combining det_score, face short-edge,
blur, frontality (from pose pitch/yaw), and 2D-landmark symmetry with
tunable weights. Default weighting 0.30/0.20/0.20/0.15/0.15.
- export-swap: builds facesets_swap_ready/ from an existing refine
manifest. Per identity: tighter outlier gate (default 0.45), visual-
near-dupe collapse (keep best representative per group), multi-face-
per-source-image collapse (keep best bbox), rank by composite score,
single-face-per-PNG crops at 512x512 with 0.5 bbox padding, ready-to-
drop .fsz bundles (top-N + full), per-faceset manifest.json, NAME.txt
placeholder for the operator. Emitting exactly one face per PNG is the
critical fix: roop-unleashed's .fsz loader appends every detected face
in each PNG to the FaceSet, so any multi-face crop would contaminate
the averaged embedding.
- Optional --candidates rescues raw_full singletons: matches against the
final per-faceset centroids and routes to _candidates/to_<faceset>/
for manual review; orphaned singletons that still cluster among
themselves land in _candidates/new_<NNN>/.
- docs/analysis/: evaluation document captures the evidence, downstream
requirements (FaceSet averaging, inswapper_128), opportunity matrix
(R1-R14), and the recommended target state this export implements.
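
A minimal sketch of the composite score and the per-source-image
collapse described in the list above. It assumes every signal has
already been normalised to [0, 1] and that the default weights map to
the signals in the order they are listed; the Face class and both
function names are hypothetical, not the tool's actual API.

```python
from dataclasses import dataclass

# Assumed mapping of the default weights 0.30/0.20/0.20/0.15/0.15 to the
# signals in the order listed above.
WEIGHTS = {
    "det_score": 0.30,
    "short_edge": 0.20,
    "blur": 0.20,
    "frontality": 0.15,
    "symmetry": 0.15,
}


@dataclass
class Face:
    source: str    # path of the source image the face was detected in
    signals: dict  # signal name -> value, assumed normalised to [0, 1]


def composite(face, weights=WEIGHTS):
    """Weighted sum of the per-face quality signals."""
    return sum(w * face.signals[name] for name, w in weights.items())


def best_face_per_source(faces):
    """Keep only the highest-scoring face for each source image, then
    rank the survivors by composite score, best first."""
    best = {}
    for face in faces:
        current = best.get(face.source)
        if current is None or composite(face) > composite(current):
            best[face.source] = face
    return sorted(best.values(), key=composite, reverse=True)


def flat(value):
    """Toy signal dict with every signal set to the same value."""
    return {name: value for name in WEIGHTS}


a1 = Face("a.jpg", flat(0.9))
a2 = Face("a.jpg", flat(0.5))  # second face in the same source image
b1 = Face("b.jpg", flat(0.7))
kept = best_face_per_source([a2, b1, a1])
print([(f.source, round(composite(f), 2)) for f in kept])
```

Collapsing before cropping means each source image contributes at most
one PNG to the faceset.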
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>