Three-piece workflow that imports a self-hosted Immich library and emits
new facesets without disturbing existing identity numbering:

- work/immich_stage.py (WSL): pages /search/metadata, parallel-fetches
  /faces?id= per asset, prefilters by face_short>=90 against bbox scaled
  to original-image coords, downloads originals, sha256-dedups against
  nl_full.npz and same-run staged files. 8-worker ThreadPoolExecutor
  doing the full /faces->filter->/original chain per asset; resumable
  via state.json. API URL + key come from IMMICH_URL / IMMICH_API_KEY
  env vars, label->UUID map from work/immich/users.json (gitignored).
- work/embed_worker.py (Windows venv at C:\face_embed_venv): runs
  insightface.FaceAnalysis(buffalo_l) with the DmlExecutionProvider on
  AMD Radeon Vega via onnxruntime-directml. Produces a cache file in the
  same .npz schema as sort_faces.cmd_embed (loadable via load_cache).
  ~7.5x speedup over CPU end-to-end; embeddings bit-identical to CPU
  (cosine similarity 1.0000 across 8 sample faces).
- work/cluster_immich.py (WSL): mirrors cluster_osrc.py against an
  immich_<user>.npz. Builds existing identity centroids from canonical
  faceset_NNN/ in facesets_swap_ready/, drops matches at <=0.45,
  clusters the rest at 0.55, applies refine gates, hands off to
  cmd_export_swap. Numbers new facesets past the existing maximum.
- work/finalize_immich.sh: chains queue->Windows embed->cache copy->
  cluster_immich, with logging.

The 2026-04-26 run on https://fotos.computerliebe.org (Immich v2.7.2)
processed 53,842 admin-accessible assets, staged 10,261, embedded 19,462
face records on Vega DML in 64.6 min, matched 8,103 (42%) to existing
identities, and emitted 185 new facesets (faceset_026..264 with gaps).
facesets_swap_ready/ went from 31 to 216 substantive facesets.

Important caveat surfaced: /search/metadata's userIds filter is silently
ignored when the API key is bound to a different user, so this run can't
enumerate other users' libraries from the admin key. A per-user API key
would be required for nic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
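The sha256 dedup described for immich_stage.py runs in Python against nl_full.npz. As a rough shell-only illustration of the same idea (hash every staged file, keep the first file per digest, skip digests that are already known), a minimal sketch could look like the following. The function name dedup_by_sha256 and the plain-text known-hashes file are hypothetical stand-ins, not part of this commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# dedup_by_sha256 DIR [KNOWN_HASHES_FILE]   (hypothetical helper)
# Prints the path of each file in DIR whose sha256 digest has not been seen,
# either in KNOWN_HASHES_FILE (one hex digest per line; an assumption made
# for this sketch) or earlier in the same run.
dedup_by_sha256() {
    local dir="$1" known="${2:-}"
    local -A seen=()
    local digest path

    # Preload digests that already exist in the library.
    if [ -n "$known" ] && [ -f "$known" ]; then
        while read -r digest _; do
            if [ -n "$digest" ]; then
                seen["$digest"]=1
            fi
        done < "$known"
    fi

    # Walk the staged files in glob order and emit only first occurrences.
    for path in "$dir"/*; do
        [ -f "$path" ] || continue
        digest="$(sha256sum "$path" | cut -d' ' -f1)"
        if [ -z "${seen[$digest]:-}" ]; then
            seen["$digest"]=1
            printf '%s\n' "$path"
        fi
    done
}
```

Calling dedup_by_sha256 on a staging directory prints one path per unique file content, so byte-identical duplicates within a run and files matching a known digest are skipped.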
51 lines
2.4 KiB
Bash
Executable File
#!/usr/bin/env bash
# Finalize an Immich user's stage:
#   1. Copy queue.json to /mnt/c so the Windows embed worker can read it
#   2. Run the embed worker on Windows (DML)
#   3. Copy the resulting cache back to /opt/face-sets/work/cache/
#   4. Run cluster_immich.py to discover + emit new facesets
#
# Usage: ./work/finalize_immich.sh <user-label>
set -euo pipefail

USER_LABEL="${1:?usage: $0 <user-label>}"

REPO="$(cd "$(dirname "$0")/.." && pwd)"
WSL_QUEUE="$REPO/work/immich/$USER_LABEL/queue.json"
WIN_QUEUE_DIR="/mnt/c/face_embed_venv/work/immich/$USER_LABEL"
WIN_QUEUE="$WIN_QUEUE_DIR/queue.json"
WIN_QUEUE_FOR_PS="C:\\face_embed_venv\\work\\immich\\$USER_LABEL\\queue.json"

WIN_CACHE_DIR="/mnt/c/face_embed_venv/work/cache"
WIN_CACHE="$WIN_CACHE_DIR/immich_${USER_LABEL}.npz"
WIN_CACHE_FOR_PS="C:\\face_embed_venv\\work\\cache\\immich_${USER_LABEL}.npz"
WSL_CACHE="$REPO/work/cache/immich_${USER_LABEL}.npz"

LOG="$REPO/work/logs/immich_finalize_${USER_LABEL}.log"

[ -f "$WSL_QUEUE" ] || { echo "missing queue: $WSL_QUEUE" >&2; exit 1; }

# Create the log directory before the first `tee -a`; with pipefail set,
# a tee that cannot open the log file would abort the script.
mkdir -p "$(dirname "$LOG")" "$WIN_QUEUE_DIR" "$WIN_CACHE_DIR" "$REPO/work/cache"

echo "=== finalize: $USER_LABEL ===" | tee -a "$LOG"
date | tee -a "$LOG"

echo "[1/4] copying queue: $WSL_QUEUE -> $WIN_QUEUE" | tee -a "$LOG"
cp "$WSL_QUEUE" "$WIN_QUEUE"
# Pass the path via argv so quotes in the path cannot break the Python source.
echo " $(wc -c < "$WIN_QUEUE") bytes; $(/home/peter/face_sort_env/bin/python3 -c 'import json, sys; print(len(json.load(open(sys.argv[1]))))' "$WIN_QUEUE") entries"

echo "[2/4] running Windows DML embed worker" | tee -a "$LOG"
powershell.exe -NoProfile -Command "C:\\face_embed_venv\\Scripts\\python.exe C:\\face_embed_venv\\bench\\embed_worker.py '$WIN_QUEUE_FOR_PS' '$WIN_CACHE_FOR_PS'" 2>&1 | tee -a "$LOG"

[ -f "$WIN_CACHE" ] || { echo "embed produced no cache file at $WIN_CACHE" | tee -a "$LOG"; exit 1; }

echo "[3/4] copying cache back: $WIN_CACHE -> $WSL_CACHE" | tee -a "$LOG"
cp "$WIN_CACHE" "$WSL_CACHE"
# Repo root and cache path go in via argv instead of being spliced into the Python source.
echo " $(/home/peter/face_sort_env/bin/python3 -c 'import sys; sys.path.insert(0, sys.argv[1]); from sort_faces import load_cache; e, m, _, _, _ = load_cache(sys.argv[2]); n = sum(1 for x in m if x.get("noface")); print(f"{len(e)} embeddings, {n} noface, {len(m) - n} faces")' "$REPO" "$WSL_CACHE")"

echo "[4/4] running cluster_immich.py" | tee -a "$LOG"
/home/peter/face_sort_env/bin/python3 "$REPO/work/cluster_immich.py" "$WSL_CACHE" 2>&1 | tee -a "$LOG"

echo "=== finalize done: $USER_LABEL ===" | tee -a "$LOG"
date | tee -a "$LOG"
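
The script above spells out each Windows location twice: once as a /mnt/c path for WSL and once as a C:\ path for PowerShell. One way to derive the second form from the first is sketched below. to_win_path is a hypothetical helper, not part of the script, and it only handles paths under /mnt/<drive>. On WSL itself, `wslpath -w` performs the same conversion.

```shell
#!/usr/bin/env bash
set -euo pipefail

# to_win_path PATH   (hypothetical helper, for illustration only)
# Converts /mnt/<drive>/some/file into <DRIVE>:\some\file.
# Returns non-zero for paths that are not under /mnt/<drive>.
to_win_path() {
    local p="$1"
    if [[ "$p" =~ ^/mnt/([a-zA-Z])(/.*)?$ ]]; then
        local drive="${BASH_REMATCH[1]^^}"   # upper-case the drive letter
        local rest="${BASH_REMATCH[2]:-}"
        # tr reads '\\' as a single literal backslash.
        rest="$(printf '%s' "$rest" | tr '/' '\\')"
        printf '%s:%s\n' "$drive" "$rest"
    else
        printf 'not under /mnt/<drive>: %s\n' "$p" >&2
        return 1
    fi
}
```

With such a helper, WIN_CACHE_FOR_PS could be computed as "$(to_win_path "$WIN_CACHE")" instead of being maintained by hand next to WIN_CACHE.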