├── example_CU.png
├── example_TU.png
├── example_encoder.png
├── README_CU_TU.md
└── README.md

--------------------------------------------------------------------------------
/example_CU.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_CU.png
--------------------------------------------------------------------------------
/example_TU.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_TU.png
--------------------------------------------------------------------------------
/example_encoder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_encoder.png
--------------------------------------------------------------------------------
/README_CU_TU.md:
--------------------------------------------------------------------------------
# Data Format for *Info_XX.dat* Files

In HEVC, compression artifacts are highly influenced by the quad-tree-based CTU partition. The proposed MIF approach therefore takes the partition information into account, and the corresponding data files are provided in our database.

The CU and TU depth information for all 182 video sequences is stored in 4,368 files, organized in the folders *HIF_LDP_Info/*, *HIF_LDB_Info/* and *HIF_RA_Info/* for the LDP, LDB and RA configurations, respectively. All files are named according to the following rule:

*Info\_*(configuration)*\_*(video name)*\_qp*(QP value)*\_nf*(number of frames)*\_*(*CUDepth* or *TUDepth*)*.dat*

The minimum CU and TU sizes in HEVC are 8×8 and 4×4, respectively. Because the quad-tree structure ensures that each aligned 16×16 block is either contained in a single CU or split into exactly four 8×8 CUs of equal depth, the pixels in each aligned 16×16 block share the same CU depth; likewise, the pixels in each aligned 8×8 block share the same TU depth. We therefore describe the CTU partition of each video sequence by storing it in two 3-D arrays:

(1) $N \times (H//16) \times (W//16)$ array: CU depth of the video sequence. Each element lies in $\{0, 1, 2, 3\}$, corresponding to the CU sizes $\{64\times64, 32\times32, 16\times16, 8\times8\}$.

(2) $N \times (H//8) \times (W//8)$ array: TU depth of the video sequence. Each element lies in $\{1, 2, 3, 4\}$, corresponding to the TU sizes $\{32\times32, 16\times16, 8\times8, 4\times4\}$.

Here, $N$ denotes the number of frames, and $H$ and $W$ denote the frame height and width, respectively. The symbol "$//$" denotes integer division, discarding the remainder. In each array, the first $(H//16)\times(W//16)$ (or $(H//8)\times(W//8)$) consecutive elements hold the CU (or TU) depth of the first frame, the next $(H//16)\times(W//16)$ (or $(H//8)\times(W//8)$) elements hold that of the second frame, and so on, until the information of all frames is stored.

The two arrays (stored as unsigned char, one byte per element) are then written to the corresponding files *Info\_*XX*\_*(*CUDepth*/*TUDepth*)*.dat*.

## Example

Video sequence *garden_sif.yuv*, 352×240 pixels, 115 frames in total.

For each frame, $(240//16)\times(352//16)=330$ values are needed to represent the CU partition, and $(240//8)\times(352//8)=1320$ values are needed to represent the TU partition.

Taking QP 32 as an example, the CU file *HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_CUDepth.dat* and the TU file *HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_TUDepth.dat* are 330×115 = 37,950 bytes and 1320×115 = 151,800 bytes in size, respectively.
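As a minimal sketch of how such a pair of files can be parsed (assuming Python with NumPy; this reader is not part of the database tools, and the function name `load_depth` is ours):

```python
import numpy as np

def load_depth(path, n_frames, height, width, block):
    """Load a *_CUDepth.dat (block=16) or *_TUDepth.dat (block=8) file
    into an (N, H//block, W//block) array of uint8 depth values."""
    h, w = height // block, width // block
    data = np.fromfile(path, dtype=np.uint8)    # one unsigned byte per element
    assert data.size == n_frames * h * w, "size must equal N*(H//block)*(W//block)"
    return data.reshape(n_frames, h, w)         # frames are stored consecutively

# garden_sif.yuv: 352x240, 115 frames, RA configuration, QP 32
cu = load_depth("HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_CUDepth.dat", 115, 240, 352, 16)
tu = load_depth("HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_TUDepth.dat", 115, 240, 352, 8)
print(cu.shape, tu.shape)   # (115, 15, 22) (115, 30, 44)
# CU size in pixels for each 16x16 block: 64 >> depth (0..3 -> 64, 32, 16, 8)
```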
The partition information of the 47th frame (POC 46) is shown below.

(1) CU partition: the corresponding data occupy bytes 15,181-15,510 (1-based, i.e., 46×330+1 through 47×330) of the CU file.

![](example_CU.png)

(2) TU partition: the corresponding data occupy bytes 60,721-62,040 (1-based, i.e., 46×1320+1 through 47×1320) of the TU file.

![](example_TU.png)
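With the loader sketched earlier, these maps are simply `cu[46]` and `tu[46]`. Equivalently, a single frame can be read directly by byte offset (again a Python/NumPy sketch; the helper name `read_frame_depth` is ours):

```python
import numpy as np

def read_frame_depth(path, poc, h_blocks, w_blocks):
    """Read the depth map of a single frame (0-based POC) by byte offset."""
    count = h_blocks * w_blocks
    with open(path, "rb") as f:
        f.seek(poc * count)   # CU file: 46 * 330 = 15,180 -> bytes 15,181-15,510 (1-based)
        return np.frombuffer(f.read(count), dtype=np.uint8).reshape(h_blocks, w_blocks)

cu_poc46 = read_frame_depth("HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_CUDepth.dat", 46, 15, 22)
tu_poc46 = read_frame_depth("HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_TUDepth.dat", 46, 30, 44)
```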
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# HIF-Database

A large-scale database for the HEVC in-loop filter (HIF).

To support learning the in-loop filter in HEVC, we constructed this database, which provides distorted frames both before and after the standard in-loop filter. The database has been used to train the proposed deep-learning-based multi-frame in-loop filter (MIF) [1], and may also facilitate related future work.

**Source code (both training and test)**: https://github.com/tianyili2017/MultiFrame-InLoop-Filter

Paper [1]: https://ieeexplore.ieee.org/document/8736997

Zhihu blog on [1] (in Chinese): https://zhuanlan.zhihu.com/p/78591265

## 1. Database Construction

The HIF database is constructed from 182 raw video sequences: 6 facial sequences from [2], 87 sequences from Xiph.org [3] and 89 sequences from the Consumer Digital Video Library [4] of the Video Quality Experts Group (VQEG) [5]. The 182 sequences were divided into non-overlapping training (120 sequences), validation (40 sequences) and test (22 sequences) sets. The corresponding copyright files are in the folder *Copyright/*. **Data in our database can be freely used for research without any commercial purpose, provided that the copyrights are appropriately respected.**

All sequences were encoded with HM 16.5 [6] at four QPs {22, 27, 32, 37} under the Low Delay P (LDP, using encoder\_lowdelay\_P\_main.cfg), Low Delay B (LDB, using encoder\_lowdelay\_main.cfg) and Random Access (RA, using encoder\_randomaccess\_main.cfg) configurations. During encoding, all unfiltered reconstructed frames (URFs) were extracted as the input of our MIF approach, with their corresponding raw frames serving as ground truth. In addition, the CU and TU partition results of all frames were extracted as auxiliary features, since compression artifacts are highly influenced by the block partition structure in HEVC. As a result, each frame-wise sample in the HIF database consists of four parts: a URF, its associated raw frame, and two matrices indicating the CU and TU depth throughout the frame. In total, 12 sub-databases were obtained, corresponding to the 4 QPs and 3 configurations. Each sub-database contains 51,335 frame-wise samples, so 616,020 frame-wise samples were collected for the whole HIF database. Note that each frame-wise sample can be split into multiple block-wise samples for data augmentation; moreover, the position of each block-wise sample within the frame can be varied, further increasing the variety of training samples in practice. The HIF database therefore provides sufficient data for our deep-learning-based MIF approach.

## 2. Data Access

All files can be downloaded from

Dropbox:

https://www.dropbox.com/sh/cpa4pca3jhvjhn7/AABrw4Oq4ZzvBYTyOziJispra?dl=0

or Baidu Cloud Disk:

https://pan.baidu.com/s/10uVms2eHP9CIEhBke6dB5A

The files from the two sources are identical; choose whichever is more convenient for you.

(1) The 182 video sequences are in the folder *YUV_HIF/*.

(2) The encoded bit-stream files (\*.bin) for QPs {22, 27, 32, 37} are in the folders *HIF_LDP_Bin/*, *HIF_LDB_Bin/* and *HIF_RA_Bin/*. With the decoder in *HM-16.5_HIF_Rec/*, reconstructed frames both with and without the standard in-loop filter can be generated. To run this decoder, please refer to *HM-16.5_HIF_Rec/bin/README.md*.

(3) The CU and TU partition results are stored in the folders *HIF_LDP_Info/*, *HIF_LDB_Info/* and *HIF_RA_Info/*, with the data format specified in *README_CU_TU.md*.

## 3. Subjective Quality Analysis

In addition to the HIF database itself, this GitHub project provides compressed files for evaluating the subjective visual quality of our MIF approach [1] on the 22 test sequences of the HIF database. Benefiting from the multi-frame design, in which a low-quality frame can be enhanced by its neighboring higher-quality frames, our approach may provide considerably better visual quality than the standard in-loop filter. Since the YUV files are too large to host, we provide the corresponding bit-stream files and decoders, as follows.

First, the sequences compressed by standard HEVC (i.e., with the standard deblocking filter (DBF) and sample adaptive offset (SAO)) can be generated from the bit-stream files in *HIF_LDP_Bin/*, *HIF_LDB_Bin/* and *HIF_RA_Bin/* with the decoder in *HM-16.5_HIF_Rec/*, the same as in **2. (2)**.

Then, the sequences compressed by our MIF approach can be generated from the bit-stream files in *Test_MIF-Net_RA_Bin/* and the log files in *Test_MIF-Net_RA_Log/*, using our adapted HM decoder in the folder *HM-16.5_MIF-Net/*. To run this decoder, please refer to *HM-16.5_MIF-Net/bin/README.md*.

## References

[1] Tianyi Li, Mai Xu, Ce Zhu, Ren Yang, Zulin Wang and Zhenyu Guan, "A Deep Learning Approach for Multi-Frame In-Loop Filter of HEVC," IEEE TIP, vol. 28, no. 11, pp. 5663-5678, Nov. 2019.

[2] Mai Xu, Xin Deng, Shengxi Li and Zulin Wang, "Region-of-Interest Based Conversational HEVC Coding with Hierarchical Perception Model of Face," IEEE JSTSP, vol. 8, no. 3, pp. 475-489, Jun. 2014.

[3] Xiph.org, "Xiph.org Video Test Media," https://media.xiph.org/video/derf, 2017.

[4] CDVL.org, "Consumer Digital Video Library," https://www.cdvl.org/index.php, 2019.

[5] VQEG, "VQEG Video Datasets and Organizations," https://www.its.bldrdoc.gov/vqeg/video-datasets-and-organizations.aspx/, 2017.

[6] JCT-VC, HM Software, [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/, [Accessed: 5-Nov.-2016], 2014.

--------------------------------------------------------------------------------