├── README.md
├── README_EN.md
├── com.png
├── model.png
└── sota.png
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |

4 |
5 |
6 |
7 |
8 |
9 | [简体中文](README.md) | [English](README_EN.md) | [Paper](https://arxiv.org/abs/2308.06743)
10 | # TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution
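11 | This is the official repository for the paper [TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution](https://arxiv.org/abs/2308.06743). TextDiff is a diffusion-based model for scene text image super-resolution (see the [paper](https://arxiv.org/abs/2308.06743) for details).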
12 |
13 |
14 | # Network Structure
15 |
16 |
17 |

18 |
19 |
20 | # News
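21 | - Pinned: our lab has developed a multi-functional, cross-platform OCR application. It covers common OCR features such as PDF-to-Word and PDF-to-Excel conversion, formula recognition, table recognition, and automatic watermark removal. Feel free to try it!
22 | - See the To-do lists for the latest information.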
23 |
24 | # User Guide
25 |
26 | ## Environment Configuration
27 | ### Deep Learning Environment
28 | - python >= 3.7
29 | - pytorch >= 1.7.0
30 | - torchvision >= 0.8.0
31 | - lmdb >= 0.98
32 | - pillow >= 7.1.2
33 | - numpy
34 | - six
35 | - tqdm
36 | - opencv-python
37 | - easydict
38 | - pyyaml
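
The version constraints above can be checked before training with a short script. This is a minimal sketch rather than part of the repository: the pip distribution names (`torch` for pytorch, `opencv-python`, `PyYAML`, `Pillow`) are assumptions, and `importlib.metadata` needs Python 3.8 or later even though the list allows 3.7.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onward

# Minimum versions copied from the list above; None means any version.
# The pip distribution names on the left are assumptions.
REQUIREMENTS = {
    "torch": "1.7.0",
    "torchvision": "0.8.0",
    "lmdb": "0.98",
    "Pillow": "7.1.2",
    "numpy": None,
    "six": None,
    "tqdm": None,
    "opencv-python": None,
    "easydict": None,
    "PyYAML": None,
}


def parse_version(text):
    """Turn '1.7.0+cu110' into (1, 7, 0), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check(requirements):
    """Return {package: status}, where status starts with 'ok', 'missing', or 'too old'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum and parse_version(installed) < parse_version(minimum):
            report[name] = "too old ({} < {})".format(installed, minimum)
        else:
            report[name] = "ok ({})".format(installed)
    return report


if __name__ == "__main__":
    print("python", "ok" if sys.version_info >= (3, 7) else "too old")
    for name, status in check(REQUIREMENTS).items():
        print(name, status)
```

Any package reported as `missing` or `too old` should be installed or upgraded before running the training or inference commands.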
39 |
40 | ### Dataset
41 | - Download the TextZoom dataset
42 |
43 | ### Related Weight Files
44 | - Download the Aster model weight file
45 | - Download the Moran model weight file
46 | - Download the CRNN model weight file
47 |
48 | ## Training
49 | 1. Installation
50 | ```
51 | git clone https://github.com/Lenubolim/TextDiff.git
52 | ```
53 | 2. Configuration
54 |
See the `config.yaml` file.
55 |
56 | 3. Training
57 | ```
58 | python train.py
59 | ```
60 | ## Inference
61 | ```
62 | python test.py
63 | ```
64 |
65 | # To-do lists
66 |
67 | - [ ] Add training code (to be released soon)
68 | - [ ] Add inference code (to be released soon)
69 | - [ ] Use DPM_solver to reduce the number of inference steps
70 |
71 |
72 | # Results
73 |
74 |
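75 | # Thanks
76 |
77 | - If you find TextDiff helpful, please give it a star. Thank you!
78 | - If you have any questions, feel free to open an issue. Issue notifications go to my email, so I will reply as soon as I see them.
79 | - If you use TextDiff as a baseline in your project, please cite our paper.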
80 |
81 |
82 | # References
83 |
84 |
85 | - [1] Scene Text Telescope:
86 | Text-Focused Scene Image Super-Resolution
87 | - [2] Activating More Pixels in Image Super-Resolution
88 | Transformer
89 | - [3] SRDiff: Single Image Super-Resolution
90 | with Diffusion Probabilistic Models
91 | - [4] DocDiff: Document Enhancement via Residual Diffusion Models
92 | - [5] Improving
93 | Scene Text Image Super-Resolution via Dual Prior Modulation Network
94 |
95 |
96 | # :book: Citation
97 | If you use (part of) my code or find my work helpful, please consider citing:
98 | ```
99 | @article{liu2023textdiff,
100 | title={TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution},
101 | author={Liu, Baolin and Yang, Zongyuan and Wang, Pengfei and Zhou, Junjie and Liu, Ziqi and Song, Ziyi and Liu, Yan and Xiong, Yongping},
102 | journal={arXiv preprint arXiv:2308.06743},
103 | year={2023}
104 | }
105 | ```
106 | # Acknowledgement
107 | This code builds on DocDiff and TATT. Thanks to the authors of both projects. DocDiff is my classmate's main research project, and I took part in some of that work.
108 |
--------------------------------------------------------------------------------
/README_EN.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |

4 |
5 |
6 |
7 |
8 |
9 | [简体中文](README.md) | [English](README_EN.md) | [Paper](https://arxiv.org/abs/2308.06743)
10 | # TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution
11 | This is the official repository for the paper [TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution](https://arxiv.org/abs/2308.06743). TextDiff is a diffusion-based model for scene text image super-resolution (see the [paper](https://arxiv.org/abs/2308.06743) for details).
12 |
13 |
14 | # Network Structure
15 |
16 |
17 |

18 |
19 |
20 | # User Guide
21 |
22 |
23 | ## Environment configuration
24 | ### Deep Learning Environment
25 | - python >= 3.7
26 | - pytorch >= 1.7.0
27 | - torchvision >= 0.8.0
28 | - lmdb >= 0.98
29 | - pillow >= 7.1.2
30 | - numpy
31 | - six
32 | - tqdm
33 | - opencv-python
34 | - easydict
35 | - pyyaml
36 |
37 | ### Dataset
38 | - Download the TextZoom dataset
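
Once the dataset is downloaded, a quick way to confirm it is readable is to pull one sample out of an LMDB split. The sketch below is an illustration, not code from this repository: it assumes the splits are LMDB directories that use the key layout `label-%09d`, `image_hr-%09d`, and `image_lr-%09d` with 1-based indices, and the helper names `sample_keys` and `read_sample` are hypothetical.

```python
import io


def sample_keys(index):
    """Build the LMDB keys for one 1-based sample index (assumed key layout)."""
    return {
        "label": b"label-%09d" % index,
        "hr": b"image_hr-%09d" % index,
        "lr": b"image_lr-%09d" % index,
    }


def read_sample(lmdb_dir, index):
    """Return (label, hr_image, lr_image) for one sample of an LMDB split."""
    import lmdb  # third-party; imported lazily so sample_keys works without it
    from PIL import Image

    env = lmdb.open(lmdb_dir, readonly=True, lock=False, readahead=False, meminit=False)
    try:
        with env.begin(write=False) as txn:
            raw = {name: txn.get(key) for name, key in sample_keys(index).items()}
    finally:
        env.close()

    missing = [name for name, value in raw.items() if value is None]
    if missing:
        raise KeyError("sample {} has no entry for: {}".format(index, ", ".join(missing)))

    label = raw["label"].decode("utf-8")
    hr = Image.open(io.BytesIO(raw["hr"])).convert("RGB")  # high-resolution image
    lr = Image.open(io.BytesIO(raw["lr"])).convert("RGB")  # low-resolution image
    return label, hr, lr
```

For example, `label, hr, lr = read_sample("path/to/split", 1)` would return the transcription and both images of the first sample, where `path/to/split` is a placeholder for wherever the dataset was unpacked. A `KeyError` here suggests that the key layout assumed above does not match the downloaded files.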
39 |
40 | ### Related weight files
41 | - Download the Aster model weight file
42 | - Download the Moran model weight file
43 | - Download the CRNN model weight file
44 |
45 | # To-do lists
46 |
47 | - [ ] Add training code
48 | - [ ] Add inference code
49 | - [ ] Use DPM_solver to reduce the number of inference steps
50 |
51 | # Results
52 |
53 |
54 | # Gratitude
55 |
56 | - If you find TextDiff helpful, please give it a star. Thank you!
57 | - If you have any questions, feel free to open an issue and I will reply as soon as possible.
58 | - If you use TextDiff as a baseline in your project, please cite our paper.
59 |
60 |
61 | # References
62 |
63 |
64 | - [1] Scene Text Telescope:
65 | Text-Focused Scene Image Super-Resolution
66 | - [2] Activating More Pixels in Image Super-Resolution
67 | Transformer
68 | - [3] SRDiff: Single Image Super-Resolution
69 | with Diffusion Probabilistic Models
70 | - [4] DocDiff: Document Enhancement via Residual Diffusion Models
71 | - [5] Improving
72 | Scene Text Image Super-Resolution via Dual Prior Modulation Network
73 |
74 |
75 | # :book: Citation
76 | If you use (part of) my code or find my work helpful, please consider citing:
77 | ```
78 | @article{liu2023textdiff,
79 | title={TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution},
80 | author={Liu, Baolin and Yang, Zongyuan and Wang, Pengfei and Zhou, Junjie and Liu, Ziqi and Song, Ziyi and Liu, Yan and Xiong, Yongping},
81 | journal={arXiv preprint arXiv:2308.06743},
82 | year={2023}
83 | }
84 | ```
85 |
86 | # Acknowledgement
87 | This code builds on DocDiff and TATT. Thanks to the authors of both projects. DocDiff is my classmate's main research project, and I took part in some of that work.
88 |
89 |
--------------------------------------------------------------------------------
/com.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lenubolim/TextDiff/b0264a94a240af2801e7bac1ca27ea77392473e7/com.png
--------------------------------------------------------------------------------
/model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lenubolim/TextDiff/b0264a94a240af2801e7bac1ca27ea77392473e7/model.png
--------------------------------------------------------------------------------
/sota.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lenubolim/TextDiff/b0264a94a240af2801e7bac1ca27ea77392473e7/sota.png
--------------------------------------------------------------------------------