Add target-side video preprocessing pipeline

Preprocesses a folder of video files into UUID-named clips suitable as
target inputs for roop-unleashed-style face-swap. Counterpart to the
faceset (source-side) tooling.

work/video_target_pipeline.py — orchestration with subcommands
  scan / scenes / stage / merge / track / score / cut / report. Quality
  gates default to permissive values, on the assumption that the face
  sets can handle side profiles (yaw<=75°, pitch<=45°, face_short>=80px,
  det>=0.5). Cross-track segment merge fuses tracks that are adjacent in
  time within the same scene, up to a 2s gap. Output is organized into
  <output_dir>/<source_stem>/<uuid>.mp4 plus a <uuid>.json sidecar with
  full provenance.
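
A minimal sketch of the cross-track merge rule described above, not the
code in video_target_pipeline.py. The Track type, its fields, and the
merge_adjacent_tracks name are hypothetical; the real merge may apply
further checks (for example on identity) that are not shown here.

```python
from dataclasses import dataclass


@dataclass
class Track:
    # Hypothetical shape: one face track inside one scene, times in seconds.
    scene_id: int
    start: float
    end: float
    track_ids: tuple


def merge_adjacent_tracks(tracks, max_gap=2.0):
    """Fuse tracks of the same scene whose time gap is at most max_gap."""
    merged = []
    # Sorting by scene, then start time, puts merge candidates side by side.
    for track in sorted(tracks, key=lambda t: (t.scene_id, t.start)):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.scene_id == track.scene_id
            and track.start - prev.end <= max_gap
        ):
            # Extend the previous segment and remember which tracks built it.
            merged[-1] = Track(
                prev.scene_id,
                prev.start,
                max(prev.end, track.end),
                prev.track_ids + track.track_ids,
            )
        else:
            merged.append(track)
    return merged
```

Under these assumptions, a segment whose track_ids holds two or more
entries corresponds to what the batch stats call "built from >=2 tracks".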

work/video_face_worker.py — Windows DML face detect+embed worker. Uses
  an append-only JSONL file (results.jsonl), a critical perf fix:
  re-serializing the monolithic 245MB results.json on every flush was
  the dominant cost in the first attempt, dropping throughput to 0.5
  fps. Append-only raised it to 13+ fps (~7.5 fps cumulative across the
  first 6.18h batch). Also seeks once per video and calls cap.grab()
  sequentially between samples, avoiding cv2's slow per-sample seeks on
  long H.264 files. A legacy results.json is auto-migrated to .jsonl on
  first load.
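
A minimal sketch of the append-only pattern and the one-time migration,
not the worker's actual code. The function names are hypothetical, and
the sketch assumes the legacy results.json holds a JSON array of records,
which the commit does not confirm.

```python
import json
import os


def append_result(jsonl_path, record):
    """Append one record as a single line; cost does not grow with file size."""
    with open(jsonl_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_results(jsonl_path, legacy_path=None):
    """Load all records, migrating a legacy JSON array on first load."""
    needs_migration = (
        legacy_path is not None
        and os.path.exists(legacy_path)
        and not os.path.exists(jsonl_path)
    )
    if needs_migration:
        # Assumption: the legacy file is one JSON array of record objects.
        with open(legacy_path, encoding="utf-8") as f:
            legacy_records = json.load(f)
        for record in legacy_records:
            append_result(jsonl_path, record)
    if not os.path.exists(jsonl_path):
        return []
    with open(jsonl_path, encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in f if line.strip()]
```

Each flush then writes only the new record instead of rewriting every
earlier result, which is the cost the monolithic file paid on each flush.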

work/run_video_pipeline.sh — generic chain driver, parameterized via
  WORK / INPUT_DIR / OUTPUT_DIR / FILTER_FROM / SKIP_PATTERN / MAX_DUR /
  IDENTITY env vars. work/status_video_pipeline.sh — generic status
  helper.

First production batch (ct_src_00050..00062, 13 files, 6.18h input):
600 emitted segments, 239.5min accepted content (64.6% of input), 254
segments built from >=2 tracks (cross-track merge), 1h43m wall clock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:38:50 +02:00
parent 49a43c7685
commit 998fa79f81
6 changed files with 1480 additions and 0 deletions


@@ -343,6 +343,7 @@ clean it up over time:
| `work/consolidate_facesets.py` | Merge duplicate identities (centroid cosine sim ≥ 0.55 with confident ≥ 0.65, **complete-linkage** to defeat single-link chaining). Pulls embeddings from cache, no GPU. See `docs/analysis/identity-consolidation-and-age-extend.md`. |
| `work/age_extend_001.py` | Slot newly-added PNGs into existing era buckets of `faceset_001` (anchor cosine distance ≤ 0.40 AND `|year_delta|` ≤ 5). Same anchor-fragment rule as `age_split_001.py`. |
| `work/dedup_optimize.py` (+ Windows `work/multiface_worker.py`) | (a) cross-family SHA256 byte-dedup, (b) within-faceset near-dup at cosine sim ≥ 0.95, (c) multi-face audit (re-detect via insightface, drop PNGs with face_count ≠ 1). Multi-face is the load-bearing roop invariant. See `docs/analysis/dedup-and-roop-optimization.md`. |
| `work/video_target_pipeline.py` (+ Windows `work/video_face_worker.py` + `work/run_video_pipeline.sh` chain) | Target-side preprocessing: scan a folder of videos → PySceneDetect shot-cuts → 2 fps frame sampling → DML face detection + embedding → IoU+embedding tracking → quality-gated segments (yaw≤75°, face≥80px, det≥0.5, ≥70% pass-rate, 1120s duration, 2s cross-track merge gap) → ffmpeg stream-copy into UUID-named clips with sidecar JSON. Output organized into per-source subfolders. See `docs/analysis/video-target-preprocessing.md`. |
All four operate idempotently and reversibly: dropped PNGs go to
`<faceset>/faces/_dropped/`, quarantined whole facesets go to
@@ -382,6 +383,10 @@ Highly recommended at swap time: enable **Select post-processing = GFPGAN** with
├─ consolidate_facesets.py (duplicate-identity merger; complete-linkage)
├─ dedup_optimize.py (byte + near-dup + multi-face audit driver)
├─ multiface_worker.py (Windows DML multi-face audit worker)
├─ video_target_pipeline.py (video → swappable segment cuts orchestration)
├─ video_face_worker.py (Windows DML per-frame face worker; JSONL append-only)
├─ run_video_pipeline.sh (generic chain driver: scenes → stage → worker → cut)
├─ status_video_pipeline.sh (status helper for any video_pipeline log)
├─ synthetic_*_manifest.json (per-run synthetic refine manifests)
├─ immich/
│ ├─ users.json (label -> userId map; gitignored)