StreamSplat is a fully feed-forward framework that instantly transforms uncalibrated video streams of arbitrary length into dynamic 3D Gaussian Splatting representations in an online manner.
- Feed-forward inference: No per-scene optimization required
- Camera-free: Works directly with uncalibrated monocular videos
- Dynamic scene modeling: Handles both static and dynamic scene elements through polynomial motion modeling
- Probabilistic Gaussian prediction: Models Gaussian positions with a truncated Gaussian distribution for robustness (a rough sketch of both ideas follows this list)
- Two-stage training: Stage 1 trains the static encoder, Stage 2 trains the dynamic decoder
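To make the two modeling choices above concrete, here is a minimal sketch of polynomial motion and truncated-Gaussian position sampling; the polynomial degree, parameterization, and truncation bounds are illustrative assumptions, not the paper's exact formulation:

```python
import torch

def polynomial_positions(coeffs: torch.Tensor, t: float) -> torch.Tensor:
    """Evaluate per-Gaussian center trajectories as polynomials in time.

    coeffs: (N, K, 3) tensor of polynomial coefficients; coeffs[:, k]
    multiplies t**k, so K - 1 is the polynomial degree.
    Returns (N, 3) Gaussian centers at normalized time t.
    """
    K = coeffs.shape[1]
    powers = torch.tensor([t ** k for k in range(K)], dtype=coeffs.dtype)
    return (coeffs * powers.view(1, K, 1)).sum(dim=1)

def sample_truncated_gaussian(mu: torch.Tensor, sigma: torch.Tensor,
                              lo: float = -1.0, hi: float = 1.0) -> torch.Tensor:
    """Sample positions from per-axis truncated Gaussians via inverse-CDF sampling.

    mu, sigma: (N, 3) predicted means and scales; samples stay within [lo, hi].
    """
    normal = torch.distributions.Normal(0.0, 1.0)
    a = normal.cdf((lo - mu) / sigma)   # CDF mass below the lower bound
    b = normal.cdf((hi - mu) / sigma)   # CDF mass below the upper bound
    u = a + (b - a) * torch.rand_like(mu)
    return mu + sigma * normal.icdf(u.clamp(1e-6, 1 - 1e-6))

# Example: 1024 Gaussians with quadratic motion, sampled at mid-stream (t=0.5).
coeffs = torch.randn(1024, 3, 3) * 0.1
centers = polynomial_positions(coeffs, t=0.5)
positions = sample_truncated_gaussian(centers, torch.full_like(centers, 0.05))
```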
Demo videos: re10k-1.mp4 and re10k-2.mp4 (RealEstate10K), DAVIS.mp4 (DAVIS), vos.mp4 (YouTube-VOS).
- Create the conda environment:

```bash
conda env create -f environment.yml
conda activate StreamSplat
```

- Build the differentiable Gaussian rasterizer:

```bash
cd submodules/diff-gaussian-rasterization-orth
pip install .
```

- Download the pretrained depth model: download the Depth Anything V2 checkpoint and place it in the `checkpoints/` directory:

```bash
mkdir -p checkpoints
# Download depth_anything_v2_vitl.pth from https://github.com/DepthAnything/Depth-Anything-V2
# and place it at checkpoints/depth_anything_v2_vitl.pth
```

StreamSplat supports training on multiple datasets. All datasets require depth maps pre-computed with Depth Anything V2.
| Dataset | Type | Description |
|---|---|---|
| RealEstate10K | Static | Real-estate walkthrough videos |
| CO3Dv2 | Static | Object-centric multi-view videos |
| DAVIS | Dynamic | High-quality videos of dynamic scenes |
| YouTube-VOS | Dynamic | Large-scale collection of dynamic videos |
Use the provided script to pre-compute depth maps for DAVIS (similar scripts can be adapted for the other datasets; see the sketch below):

```bash
python preprocess_depth_davis.py --root_path /path/to/davis
```
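For the other datasets, a preprocessing loop along the following lines should work. The model construction and `infer_image` call follow the Depth Anything V2 README, but the frame glob pattern and the one-`.npy`-per-frame output layout are assumptions here, so match whatever format the provided DAVIS script actually writes:

```python
import glob
import os

import cv2
import numpy as np
import torch
from depth_anything_v2.dpt import DepthAnythingV2  # from the Depth-Anything-V2 repo

device = 'cuda' if torch.cuda.is_available() else 'cpu'
# ViT-L configuration from the Depth Anything V2 README.
model = DepthAnythingV2(encoder='vitl', features=256,
                        out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth',
                                 map_location='cpu'))
model = model.to(device).eval()

for frame_path in sorted(glob.glob('/path/to/dataset/**/*.jpg', recursive=True)):
    raw_img = cv2.imread(frame_path)    # BGR HxWx3 image, as infer_image expects
    depth = model.infer_image(raw_img)  # HxW float32 depth map (numpy)
    np.save(os.path.splitext(frame_path)[0] + '_depth.npy', depth)
```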
Edit `configs/options.py` and `configs/options_decoder.py` to set the dataset paths:

```python
root_path_re10k: str = "/path/to/re10k"
root_path_co3d: str = "/path/to/co3d"
root_path_davis: str = "/path/to/davis"
root_path_vos: str = "/path/to/youtube-vos"
```

Create an accelerate config file (or use the provided `acc_configs/gpu8.yaml`):

```bash
accelerate config
```
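For reference, an 8-GPU config in the spirit of `acc_configs/gpu8.yaml` might look like the following; the exact values in the provided file may differ (the mixed-precision setting in particular is an assumption):

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
num_machines: 1
num_processes: 8        # one process per GPU
gpu_ids: all
machine_rank: 0
mixed_precision: bf16   # assumption; check the provided file
main_training_function: main
```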
Train the static encoder on the combined datasets:

```bash
accelerate launch --config_file acc_configs/gpu8.yaml train.py combined \
    --workspace /path/to/workspace/encoder_exp
```

After Stage 1 completes, train the dynamic decoder with the encoder frozen:
```bash
accelerate launch --config_file acc_configs/gpu8.yaml train_decoder.py combined_rcvd \
    --workspace /path/to/workspace/decoder_exp \
    --encoder_path /path/to/workspace/encoder_exp/model.safetensors
```
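`--encoder_path` points Stage 2 at the Stage 1 weights. For intuition, loading and freezing an encoder from a safetensors file generally looks like the sketch below (this illustrates the pattern, not the repo's actual loading code):

```python
import torch
from safetensors.torch import load_file

def load_frozen_encoder(encoder: torch.nn.Module, path: str) -> torch.nn.Module:
    """Load Stage 1 weights into `encoder` and freeze it for Stage 2."""
    encoder.load_state_dict(load_file(path))
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)  # only the dynamic decoder stays trainable
    return encoder
```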
Training progress is logged to Weights & Biases, so set up wandb before training:

```bash
wandb login
```

Checkpoints are saved every 10 epochs and every 30 minutes to `checkpoint_latest/`.
If you find this work useful, please cite:
```bibtex
@article{wu2025streamsplat,
  title={StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams},
  author={Zike Wu and Qi Yan and Xuanyu Yi and Lele Wang and Renjie Liao},
  journal={arXiv preprint arXiv:2506.08862},
  year={2025},
}
```

This project builds upon several excellent works:
- 3D Gaussian Splatting for the differentiable rasterization
- diff-gaussian-rasterization for the depth & alpha rendering
- DINOv2 for vision features
- Depth Anything V2 for monocular depth estimation
- Gamba and MVGamba for the codebase and training framework
- Nutworld for orthographic rasterization
- edm for data augmentation
