├── LICENSE
├── README.md
└── assets
    ├── teaser.png
    └── temporal_infer.png

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 SpyderZSY

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction


[ArXiv](https://arxiv.org/abs/2504.05692)

Songyan Zhang<sup>1*</sup>, Yongtao Ge<sup>2,3*</sup>, Jinyuan Tian<sup>2*</sup>, Hao Chen<sup>2†</sup>, Chen Lv<sup>1</sup>, Chunhua Shen<sup>2</sup>

<sup>1</sup>Nanyang Technological University, Singapore; <sup>2</sup>Zhejiang University, China; <sup>3</sup>The University of Adelaide, Australia

<sup>*</sup>Equal Contributions, <sup>†</sup>Corresponding Author

We present **POMATO**, a model that enables 3D reconstruction from an arbitrary dynamic video. Without relying on external modules, POMATO can directly perform 3D reconstruction along with temporal 3D point tracking and dynamic mask estimation.

# Code will come soon!

## 🚀 News

- `[Apr 2025]` Released the [paper](https://arxiv.org/abs/2504.05692) and initialized the GitHub repo.

## 🔨 TODO LIST

- [ ] Release the inference code and Hugging Face model.
- [ ] Release the visualization of 3D tracking.
- [ ] Release the training code.

## ✨ Highlights

🔥 We introduce a temporal motion module to facilitate the interactions of motion features along the temporal dimension.
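Since the official code is not yet released, the snippet below is only a generic sketch of what "interaction of features along the temporal dimension" can look like: plain scaled dot-product self-attention across frames for a single spatial location. It is an assumption for illustration, not the POMATO temporal motion module; the function name `temporal_self_attention` is hypothetical, and no learned projections are included.

```python
import math


def temporal_self_attention(features):
    """Mix per-frame feature vectors along the time axis.

    features: list of T vectors (each a list of C floats) for one spatial
    location, one vector per frame. Returns T vectors of the same size.

    NOTE: hypothetical illustration only, not the POMATO implementation.
    Queries, keys, and values are the raw features (no learned weights).
    """
    if not features:
        return []
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for query in features:
        # Scaled dot-product score between this frame and every frame.
        scores = [
            scale * sum(q * k for q, k in zip(query, key)) for key in features
        ]
        # Numerically stable softmax over the temporal dimension.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output frame is a weighted sum of all frames' features.
        mixed = [
            sum(w * value[c] for w, value in zip(weights, features))
            for c in range(dim)
        ]
        output.append(mixed)
    return output


if __name__ == "__main__":
    # Three frames with 2-channel features for one location.
    frames = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    for row in temporal_self_attention(frames):
        print(row)
```

Each output vector is a convex combination of the input frames, so information from every frame in the window can influence every other frame. A real module would typically add learned query/key/value projections and operate on batched tensors.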


Inference pipelines for point tracking, video depth, and multi-view reconstruction with the temporal motion module. t<sub>k</sub> indicates the keyframe in the sequence.

## 📌 Citation

If you find POMATO useful in your research or applications, please consider giving a star ⭐ and citing it using the following BibTeX:

```bibtex
@article{zhang2025pomatomarryingpointmapmatching,
  title={POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction},
  author={Songyan Zhang and Yongtao Ge and Jinyuan Tian and Guangkai Xu and Hao Chen and Chen Lv and Chunhua Shen},
  journal={arXiv preprint arXiv:2504.05692},
  year={2025},
}
```

--------------------------------------------------------------------------------
/assets/teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wyddmw/POMATO/e875633b8706294011c12da1c7c47e329e50a4c0/assets/teaser.png

--------------------------------------------------------------------------------
/assets/temporal_infer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wyddmw/POMATO/e875633b8706294011c12da1c7c47e329e50a4c0/assets/temporal_infer.png

--------------------------------------------------------------------------------