
MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching

Tingman Yan, Tao Liu, Xilian Yang, Qunfei Zhao, Zeyang Xia

Please help Star this repo if you find it useful. Thank you!

Introduction

MatchAttention is a continuous and differentiable sliding-window attention mechanism that enables long-range connections and explicit matching with linear complexity. Applied to stereo matching and optical flow, it achieves real-time, state-of-the-art performance.
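As a rough illustration of the idea, the sketch below implements a windowed cross-attention in which each query attends to a small window of bilinearly sampled keys and values centered at a continuous per-pixel relative position, so the sampling stays differentiable and the cost is linear in image size. The function name, tensor layout, and window parameterization are illustrative assumptions only; the repository's actual PyTorch and CUDA implementations live under models/.

# Minimal PyTorch sketch of sliding-window cross-attention with continuous,
# differentiable relative positions (bilinear sampling). This is an
# illustration of the general idea, not the repository's implementation.
import torch
import torch.nn.functional as F

def windowed_match_attention(q, k, v, rpos, win=3):
    """q: (B, C, H, W) queries from view 0
       k, v: (B, C, H, W) keys/values from view 1
       rpos: (B, 2, H, W) continuous relative position (dx, dy) per query
       win: odd window size centered at the matched position."""
    B, C, H, W = q.shape
    device, dtype = q.device, q.dtype
    # Absolute sampling centers: pixel coordinate plus predicted relative position.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=dtype),
        torch.arange(W, device=device, dtype=dtype),
        indexing="ij",
    )
    cx = xs + rpos[:, 0]                                     # (B, H, W)
    cy = ys + rpos[:, 1]
    # Integer window offsets around each (continuous) center.
    r = win // 2
    offs = torch.arange(-r, r + 1, device=device, dtype=dtype)
    oy, ox = torch.meshgrid(offs, offs, indexing="ij")       # (win, win)
    sx = cx[:, None, None] + ox[None, :, :, None, None]      # (B, win, win, H, W)
    sy = cy[:, None, None] + oy[None, :, :, None, None]
    # Normalize to [-1, 1] and bilinearly sample keys/values at those positions.
    grid = torch.stack(
        (2 * sx / (W - 1) - 1, 2 * sy / (H - 1) - 1), dim=-1
    ).view(B, win * win * H, W, 2)
    k_s = F.grid_sample(k, grid, mode="bilinear", align_corners=True)
    v_s = F.grid_sample(v, grid, mode="bilinear", align_corners=True)
    k_s = k_s.view(B, C, win * win, H, W)
    v_s = v_s.view(B, C, win * win, H, W)
    # Scaled dot-product attention over the win*win sampled positions.
    attn = torch.einsum("bchw,bcnhw->bnhw", q, k_s) / C ** 0.5
    attn = attn.softmax(dim=1)
    return torch.einsum("bnhw,bcnhw->bchw", attn, v_s)

Because the sampled positions enter through bilinear interpolation, gradients flow back into rpos, which is what allows the relative positions (disparity or flow) to be matched end to end.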

FLOPs and memory consumption

[Figure: FLOPs and memory consumption comparison]

Zero-shot generalization

- Strong zero-shot generalization on real-world datasets when trained on the FSD Mix datasets
- High-resolution inference with fine-grained details
- Real-time inference (MatchStereo-T @ 1280x720 on an RTX 4060 Ti GPU)

Explainable occlusion handling

Top row shows the color image and GT occlusion mask from the Middlebury dataset (Playtable, 1852 x 2720). Bottom row shows the cross relative position $R_{pos}[..., 0]$ (disparity) and the self relative position $sR_{pos}[..., 0]$ predicted by MatchStereo-B trained on FSD Mix datasets. The visualization of $sR_{pos}[..., 0]$ demonstrates that the attention sampling positions for occluded regions lie within their non-occluded neighboring regions.

[Figure: visualization of the cross and self relative positions on Middlebury Playtable]
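A minimal sketch of producing such a visualization from a predicted relative-position channel and an occlusion mask is shown below; the array names and how the maps are obtained are assumptions, not code from this repository.

# Sketch: color-map a relative-position channel (e.g. R_pos[..., 0], the
# disparity-like x component) next to a ground-truth occlusion mask.
# Inputs are plain NumPy arrays; how they are produced is up to the caller.
import numpy as np
import matplotlib.pyplot as plt

def visualize_rpos(rpos_x: np.ndarray, occ_mask: np.ndarray | None = None):
    """rpos_x: (H, W) float array, one channel of a predicted relative position.
       occ_mask: optional (H, W) boolean occlusion mask."""
    ncols = 2 if occ_mask is not None else 1
    fig, axes = plt.subplots(1, ncols, figsize=(5 * ncols, 4))
    axes = np.atleast_1d(axes)
    im = axes[0].imshow(rpos_x, cmap="turbo")
    axes[0].set_title("relative position (x channel)")
    fig.colorbar(im, ax=axes[0], fraction=0.046)
    if occ_mask is not None:
        axes[1].imshow(occ_mask, cmap="gray")
        axes[1].set_title("GT occlusion mask")
    for ax in axes:
        ax.axis("off")
    plt.tight_layout()
    plt.show()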

Comparison with SOTA

- MatchStereo-B ranked 1st in average error on the public Middlebury benchmark (2025-05-10)
- State-of-the-art performance on four real-world benchmarks

Model Weights

Model          Params  Resolution  FLOPs  GPU Mem  Latency  Checkpoint
MatchStereo-T  8.78M   1536x1536   0.34T  1.45G    38ms     Hugging Face
MatchStereo-S  25.2M   1536x1536   0.98T  1.73G    45ms     Hugging Face
MatchStereo-B  75.5M   1536x1536   3.59T  2.94G    75ms     Hugging Face
MatchFlow-B    75.5M   1536x1536   3.60T  3.22G    77ms     Hugging Face

GPU memory and latency are measured on a single RTX 5090 GPU with torch.compile enabled and FP16 precision.
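As a rough guide to reproducing such numbers, the sketch below times a compiled model at 1536x1536 in FP16 with CUDA events and reads the peak allocated memory. The model's forward signature (a left/right image pair) is an assumption here, not a documented API of this repository.

# Sketch: benchmark latency and peak GPU memory for a stereo model at
# 1536x1536 with torch.compile and FP16 autocast. The two-tensor forward
# call is an assumption; adapt it to the actual model interface.
import torch

def benchmark(model: torch.nn.Module, height=1536, width=1536, iters=50):
    device = torch.device("cuda")
    model = torch.compile(model.eval().to(device))
    left = torch.randn(1, 3, height, width, device=device)
    right = torch.randn(1, 3, height, width, device=device)

    torch.cuda.reset_peak_memory_stats(device)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        for _ in range(10):              # warm-up, also triggers compilation
            model(left, right)
        torch.cuda.synchronize()
        start.record()
        for _ in range(iters):
            model(left, right)
        end.record()
        torch.cuda.synchronize()

    latency_ms = start.elapsed_time(end) / iters
    peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
    print(f"latency: {latency_ms:.1f} ms, peak memory: {peak_gb:.2f} GB")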

Setup

1. Clone MatchAttention

git clone https://github.com/TingmanYan/MatchAttention
cd MatchAttention

2. Installation

## Create environment
conda create -n matchstereo python=3.10
conda activate matchstereo
## For pytorch 2.5.1+cu124
conda install -c "nvidia/label/cuda-12.4.0" cuda-toolkit
conda install pytorch==2.5.1 torchvision==0.20.1 pytorch-cuda=12.4 -c pytorch -c nvidia
## For pytorch 2.7.1+cu128
conda install nvidia/label/cuda-12.8.1::cuda-toolkit
pip install torch==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128
## other dependencies
pip install -r requirements.txt
## (Optional) Install CUDA implementation of match attention
cd models
bash compile.sh

## Download model weights to ./checkpoints

Note

PyTorch 2.0+ is required for torch.compile
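After installation, the released checkpoints can be fetched into ./checkpoints with huggingface_hub, as in the sketch below. The repo_id is a placeholder; follow the Hugging Face links in the Model Weights table for the actual locations.

# Sketch: download a released checkpoint into ./checkpoints with huggingface_hub.
# The repo_id below is a placeholder, not the real Hugging Face location;
# the filename matches the one used by the inference commands in this README.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="<user>/<matchstereo-repo>",     # placeholder: see Model Weights table
    filename="matchstereo_tiny_fsd.pth",
    local_dir="checkpoints",
)
print("checkpoint saved to", ckpt_path)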

Inference

1. Command line

# on custom images
# stereo
python run_img.py --img0_dir images/left/ --img1_dir images/right/ --output_path outputs --checkpoint_path checkpoints/matchstereo_tiny_fsd.pth --no_compile
# flow
python run_img.py --img0_dir images/frame1/ --img1_dir images/frame2/ --output_path outputs --variant base --checkpoint_path checkpoints/matchflow_base_sintel.pth --mode flow --no_compile
# test on Middlebury
python run_img.py --middv3_dir images/MiddEval3/ --variant tiny --checkpoint_path checkpoints/matchstereo_tiny_fsd.pth --test_inference_time --inference_size 1536 1536 --mat_impl pytorch --precision fp16
python run_img.py --middv3_dir images/MiddEval3/ --variant small --checkpoint_path checkpoints/matchstereo_small_fsd.pth --test_inference_time --inference_size 2176 3840 --mat_impl cuda --precision fp16
python run_img.py --middv3_dir images/MiddEval3/ --variant base --checkpoint_path checkpoints/matchstereo_base_fsd.pth --mat_impl cuda --low_res_init --no_compile
# test on ETH3D
python run_img.py --eth3d_dir images/ETH3D/ --checkpoint_path checkpoints/matchstereo_tiny_fsd.pth --inference_size 416 832 --mat_impl pytorch --precision fp16 --device_id -1 # run on CPU
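For scripted experiments, the same interface can be driven from Python. The sketch below only wraps the command-line flags shown above and assumes it is executed from the repository root with the checkpoints in ./checkpoints; whether every flag combination is meaningful is up to run_img.py.

# Sketch: a small wrapper around run_img.py using only the flags documented above.
import subprocess
import sys

def run_stereo(img0_dir: str, img1_dir: str, output_path: str = "outputs",
               variant: str = "tiny",
               checkpoint: str = "checkpoints/matchstereo_tiny_fsd.pth",
               precision: str = "fp16", mat_impl: str = "pytorch"):
    cmd = [
        sys.executable, "run_img.py",
        "--img0_dir", img0_dir,
        "--img1_dir", img1_dir,
        "--output_path", output_path,
        "--variant", variant,
        "--checkpoint_path", checkpoint,
        "--precision", precision,
        "--mat_impl", mat_impl,
        "--no_compile",
    ]
    subprocess.run(cmd, check=True)

# Example: run the tiny stereo model on the sample folders used above.
# run_stereo("images/left/", "images/right/", "outputs")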

2. Local Gradio demo

python gradio_app.py

3. Real-time inference using a ZED camera

python zed_capture.py --checkpoint_path checkpoints/matchstereo_tiny_fsd.pth

Citation

Please cite our paper if you find it useful:

@article{yan2025matchattention,
  title={MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching},
  author={Tingman Yan and Tao Liu and Xilian Yang and Qunfei Zhao and Zeyang Xia},
  journal={arXiv preprint arXiv:2510.14260},
  year={2025}
}

Acknowledgement

We would like to thank the authors of UniMatch, RAFT-Stereo, MetaFormer, and TransNeXt for their code releases. Thanks to the author of FoundationStereo for releasing the FSD dataset.

Contact

Please reach out to Tingman Yan for questions.
