
shiyu2011/robotic_cv_stack


Robotic CV Stack — Occlusion-Aware Head Pose + RVM Matting + 6DRepNet + (optional) 3DMM

This is a ready-to-run reference stack for robust, real-time face/head pose estimation under occlusion:

  • MediaPipe Face Mesh → dense landmarks (468)
  • Occluder mask via RVM portrait matting (alpha) + quick skin mask inside face hull
  • Masked PnP (RANSAC) for full 6‑DoF pose (R + t), ignoring occluded points
  • 6DRepNet fallback (rotation-only) for when landmarks are unreliable
  • Kalman smoothing (per angle)
  • Optional: SynergyNet ONNX wrapper for 3DMM personalization (rotation + shape/expr), to run offline/occasionally
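The "ignoring occluded points" step in the masked PnP stage can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the repository's code: the function name `visible_landmark_indices`, the mask convention (`True` means occluded), and the `min_points` threshold are all hypothetical. The surviving indices would then select the 2D/3D correspondences handed to a RANSAC PnP solver such as OpenCV's `cv2.solvePnPRansac`.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def visible_landmark_indices(
    landmarks_px: Sequence[Point2D],
    occluder_mask: Sequence[Sequence[bool]],
    min_points: int = 6,
) -> List[int]:
    """Return indices of landmarks that land on unoccluded pixels.

    Assumed convention: occluder_mask[row][col] is True where an occluder
    covers the face. Landmarks outside the image are dropped. An empty list
    is returned when fewer than min_points survive, which a caller could
    treat as "landmarks unreliable, use the rotation-only fallback".
    """
    height = len(occluder_mask)
    width = len(occluder_mask[0]) if height else 0
    keep: List[int] = []
    for index, (x, y) in enumerate(landmarks_px):
        col, row = int(round(x)), int(round(y))
        inside = 0 <= col < width and 0 <= row < height
        if inside and not occluder_mask[row][col]:
            keep.append(index)
    return keep if len(keep) >= min_points else []
```

PnP needs at least four correspondences, so a threshold somewhat above that leaves RANSAC room to reject outliers; the value 6 here is an arbitrary placeholder.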

Quickstart:

    python -m venv .venv && source .venv/bin/activate
    pip install -r requirements.txt
    python app/run.py --source app/demo_data/sample.mp4 --mask-source rvm --show-mask

or

    python app/run.py --source webcam --mask-source rvm

Flags:

  • --source {webcam|/path/to/video}
  • --mask-source {skin|rvm|none}
  • --fov-deg 60.0
  • --show-mask
  • --use-6drepnet
  • --use-3dmm
  • --synergy-onnx /path/to/synergy.onnx
  • --save-out out.mp4
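PnP needs a camera matrix, and a field-of-view flag such as --fov-deg is commonly used to approximate one when no calibration is available. The sketch below shows that conversion for a pinhole model. It assumes the value is a horizontal FOV, square pixels, no lens distortion, and a principal point at the image center; the function name `intrinsics_from_fov` is hypothetical, and how app/run.py actually uses the flag is not stated in this README.

```python
import math
from typing import List


def intrinsics_from_fov(width: int, height: int, fov_deg: float = 60.0) -> List[List[float]]:
    """Build a 3x3 pinhole camera matrix from an assumed horizontal FOV."""
    if not 0.0 < fov_deg < 180.0:
        raise ValueError("fov_deg must be strictly between 0 and 180 degrees")
    # Half the image width subtends half the field of view.
    focal_px = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    cx, cy = width / 2.0, height / 2.0  # assumed principal point
    return [
        [focal_px, 0.0, cx],
        [0.0, focal_px, cy],
        [0.0, 0.0, 1.0],
    ]
```

Because the focal length is only estimated, the recovered translation scales with any error in the assumed FOV, while rotation is typically less sensitive to it.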

Datasets (optional):

  • BIWI Kinect Head Pose: app/datasets/get_biwi_sample.py
  • LaPa or CelebAMask-HQ: app/datasets/get_lapa_sample.py

Notes:

  • RVM is loaded via torch.hub (mobilenetv3). An internet connection is needed on the first run; the model is cached afterwards.
  • 6DRepNet returns (pitch, yaw, roll). We convert to (yaw, pitch, roll) for consistency.
  • Metric translation (x,y,z) comes from PnP or depth; 6DRepNet is rotation-only.
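The angle reordering and the rotation-only nature of the fallback described in these notes can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `HeadPose` container, `from_6drepnet`, and `choose_pose` are hypothetical names, and preferring PnP over 6DRepNet whenever a PnP result exists is an assumed policy.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HeadPose:
    yaw: float
    pitch: float
    roll: float
    # None when the source provides rotation only (no metric translation).
    translation: Optional[Tuple[float, float, float]]
    source: str


def from_6drepnet(pitch_yaw_roll: Tuple[float, float, float]) -> HeadPose:
    """Reorder (pitch, yaw, roll) into the stack's (yaw, pitch, roll) order."""
    pitch, yaw, roll = pitch_yaw_roll
    return HeadPose(yaw=yaw, pitch=pitch, roll=roll, translation=None, source="6drepnet")


def choose_pose(
    pnp_pose: Optional[HeadPose],
    repnet_angles: Optional[Tuple[float, float, float]],
) -> Optional[HeadPose]:
    """Prefer the full 6-DoF PnP pose; otherwise fall back to rotation-only."""
    if pnp_pose is not None:
        return pnp_pose
    if repnet_angles is not None:
        return from_6drepnet(repnet_angles)
    return None
```

Keeping `translation` optional makes it explicit to downstream code that a fallback pose carries no (x, y, z).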

