├── example_CU.png
├── example_TU.png
├── example_encoder.png
├── README_CU_TU.md
└── README.md
/example_CU.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_CU.png
--------------------------------------------------------------------------------
/example_TU.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_TU.png
--------------------------------------------------------------------------------
/example_encoder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tianyili2017/HIF-Database/HEAD/example_encoder.png
--------------------------------------------------------------------------------
/README_CU_TU.md:
--------------------------------------------------------------------------------
# Data Format for *Info_XX.dat* Files

In HEVC, compression artifacts are highly influenced by the quad-tree-based CTU partition. For the proposed MIF approach, we take this partition information into account and provide the corresponding data files in our database.

The CU and TU depth information for all 182 video sequences is stored in 4,368 files, organized in the folders *HIF_LDP_Info/*, *HIF_LDB_Info/* and *HIF_RA_Info/* for the LDP, LDB and RA configurations, respectively. All files are named according to the following rule:

*Info\_*(configuration)*\_*(video name)*\_*(QP value)*\_*(number of frames)*\_*(*CUDepth*/*TUDepth*)*.dat*

The minimum CU and TU sizes in HEVC are 8×8 and 4×4, respectively. Consequently, all pixels within each aligned 16×16 block share the same CU depth (a 16×16 area is either covered by a single CU or fully split into four 8×8 CUs of depth 3), and all pixels within each aligned 8×8 block share the same TU depth. We therefore describe the CTU partition of each video sequence by storing its depth values in two 3-D arrays:

(1) $N \times (H//16) \times (W//16)$ array: CU depth of a video sequence. Each element lies in $\{0, 1, 2, 3\}$, corresponding to the CU sizes $\{64×64, 32×32, 16×16, 8×8\}$.

(2) $N \times (H//8) \times (W//8)$ array: TU depth of a video sequence. Each element lies in $\{1, 2, 3, 4\}$, corresponding to the TU sizes $\{32×32, 16×16, 8×8, 4×4\}$.

Here, $N$ denotes the number of frames, while $H$ and $W$ stand for the frame height and width, respectively. The symbol "$//$" denotes integer division, discarding the remainder. In each of the above arrays, the first $(H//16) \times (W//16)$ (or $(H//8) \times (W//8)$) consecutive elements represent the CU (or TU) depth of the first frame, the next $(H//16) \times (W//16)$ (or $(H//8) \times (W//8)$) elements represent that of the second frame, and so on, until the information of all frames is stored.

The two arrays (stored as unsigned char, one byte per element) are then written to the corresponding files *Info\_*XX*\_*(*CUDepth*/*TUDepth*)*.dat*.

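As a minimal reading sketch (assuming NumPy; the helper name `read_depth` is ours, and the file names are taken from the example in the next section), such a file can be loaded back into a 3-D array as follows:

```python
import numpy as np

def read_depth(path, n_frames, height, width, block):
    """Read a *_CUDepth.dat / *_TUDepth.dat file into an
    (n_frames, height // block, width // block) uint8 array;
    block is 16 for CU depth and 8 for TU depth."""
    data = np.fromfile(path, dtype=np.uint8)  # one byte per element
    return data.reshape(n_frames, height // block, width // block)

# garden_sif.yuv: 352x240, 115 frames, RA configuration, QP 32
cu = read_depth('HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_CUDepth.dat',
                115, 240, 352, 16)   # shape (115, 15, 22)
tu = read_depth('HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_TUDepth.dat',
                115, 240, 352, 8)    # shape (115, 30, 44)
```
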
## Example

Video sequence *garden_sif.yuv*, 352$\times$240, 115 frames in total.

For each frame, $(240//16) \times (352//16) = 330$ values are needed to represent the CU partition, and $(240//8) \times (352//8) = 1320$ values are needed to represent the TU partition.

Taking QP 32 as an example, the CU file *HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_CUDepth.dat* and the TU file *HIF_RA_Info/Info_RA_garden_sif_qp32_nf115_TUDepth.dat* are 330×115 = 37,950 bytes and 1320×115 = 151,800 bytes in size, respectively.

The partition information of the 47th frame (POC 46) is shown below.

(1) CU partition: since each frame occupies 330 bytes, the data of the $n$-th frame occupy bytes $(n-1) \times 330 + 1$ through $n \times 330$; for $n = 47$, these are bytes 15,181–15,510 of the CU file.



(2) TU partition: likewise, the data of this frame occupy bytes $(46 \times 1320 + 1)$ through $(47 \times 1320)$, i.e., bytes 60,721–62,040 of the TU file.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# HIF-Database

A large-scale database for the HEVC in-loop filter (HIF).

To support learning the in-loop filter in HEVC, we constructed this database, which provides distorted frames both before and after the standard in-loop filter. It has been used to train our proposed deep-learning-based multi-frame in-loop filter (MIF) [1], and may also facilitate other related future work.

**Source code (both training and test)**: https://github.com/tianyili2017/MultiFrame-InLoop-Filter

Paper [1]: https://ieeexplore.ieee.org/document/8736997

Zhihu blog on [1] (in Chinese): https://zhuanlan.zhihu.com/p/78591265

## 1. Database Construction

The HIF database is constructed from 182 raw video sequences: 6 facial sequences from [2], 87 sequences from Xiph.org [3] and 89 sequences from the Consumer Digital Video Library [4] of the Video Quality Experts Group (VQEG) [5]. The 182 sequences were divided into non-overlapping training (120 sequences), validation (40 sequences) and test (22 sequences) sets. The corresponding copyright files are in the folder *Copyright/*. **Data in our database can be freely used for research without any commercial purpose, provided that the copyright terms are respected.**

All sequences were encoded by HM 16.5 [6] at four QPs {22, 27, 32, 37} under the Low Delay P (LDP, using encoder\_lowdelay\_P\_main.cfg), Low Delay B (LDB, using encoder\_lowdelay\_main.cfg) and Random Access (RA, using encoder\_randomaccess\_main.cfg) configurations. During encoding, all unfiltered reconstructed frames (URFs) were extracted as the input of our MIF approach, with the corresponding raw frames serving as ground truth. In addition, the CU and TU partition results of all frames were extracted as auxiliary features, since compression artifacts are highly influenced by the block partition structure in HEVC.

As a result, each frame-wise sample in the HIF database consists of four parts: a URF, its associated raw frame, and two matrices indicating the CU and TU depth throughout the frame. In total, 12 sub-databases were obtained, corresponding to the 4 QPs and 3 configurations. Each sub-database contains 51,335 frame-wise samples, so 616,020 frame-wise samples were collected for the whole HIF database. Note that each frame-wise sample can be split into multiple block-wise samples for data augmentation, and the position of each block-wise sample within the frame can be varied, further increasing the variety of training samples in practice. The HIF database therefore provides sufficient data for our deep-learning-based MIF approach.

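As a rough illustration only (not the training pipeline from [1]; it assumes the sequences are stored as planar YUV 4:2:0, and the helper name is ours), the pieces of one frame-wise sample could be assembled as follows:

```python
import numpy as np

def read_y_plane(path, frame_idx, height, width):
    """Read the luma plane of one frame, assuming planar YUV 4:2:0
    (each frame then occupies height * width * 3 / 2 bytes)."""
    frame_bytes = height * width * 3 // 2
    with open(path, 'rb') as f:
        f.seek(frame_idx * frame_bytes)
        y = np.frombuffer(f.read(height * width), dtype=np.uint8)
    return y.reshape(height, width)

# Illustrative: the raw frame below is the ground truth of one sample;
# the matching URF is obtained by decoding the bit-stream in HIF_RA_Bin/
# with the decoder in HM-16.5_HIF_Rec/, and the CU/TU depth matrices
# come from HIF_RA_Info/ (see README_CU_TU.md).
raw_frame = read_y_plane('YUV_HIF/garden_sif.yuv', 46, 240, 352)
```
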
## 2. Data Access

All files can be downloaded from

Dropbox:

https://www.dropbox.com/sh/cpa4pca3jhvjhn7/AABrw4Oq4ZzvBYTyOziJispra?dl=0

or Baidu Cloud Disk:

https://pan.baidu.com/s/10uVms2eHP9CIEhBke6dB5A

The files from the two sources are identical; choose whichever is more convenient for you.

(1) The 182 video sequences are in the folder *YUV_HIF/*.

(2) The encoded bit-stream files (\*.bin format) for QPs {22, 27, 32, 37} can be obtained from the folders *HIF_LDP_Bin/*, *HIF_LDB_Bin/* and *HIF_RA_Bin/*. With the decoder in *HM-16.5_HIF_Rec/*, reconstructed frames both with and without the standard in-loop filter can be generated. To run this decoder, please refer to the file *HM-16.5_HIF_Rec/bin/README.md*.

(3) The CU and TU partition results are stored in the folders *HIF_LDP_Info/*, *HIF_LDB_Info/* and *HIF_RA_Info/*, with the corresponding data format specified in *README_CU_TU.md*.

## 3. Subjective Quality Analysis

In addition to the HIF database itself, this GitHub project provides compressed files for evaluating the subjective visual quality of our MIF approach [1] on the 22 test sequences of the HIF database. Benefiting from the multi-frame design, where a low-quality frame can be enhanced by its neighboring higher-quality frames, our approach may provide considerably better visual quality than the standard in-loop filter. Since the YUV files are too large, we provide the corresponding bit-stream files and decoders instead, as follows.

First, the sequences compressed by standard HEVC (i.e., with the standard deblocking filter and SAO) can be generated from the bit-stream files in *HIF_LDP_Bin/*, *HIF_LDB_Bin/* and *HIF_RA_Bin/* with the decoder in *HM-16.5_HIF_Rec/*, as in **2. (2)**.

Then, the sequences compressed by our MIF approach can be generated from the bit-stream files in *Test_MIF-Net_RA_Bin/* and the log files in *Test_MIF-Net_RA_Log/*, using our adapted HM decoder in the folder *HM-16.5_MIF-Net/*. To run this decoder, please refer to the file *HM-16.5_MIF-Net/bin/README.md*.

## References

[1] Tianyi Li, Mai Xu, Ce Zhu, Ren Yang, Zulin Wang and Zhenyu Guan, “A Deep Learning Approach for Multi-Frame In-Loop Filter of HEVC,” IEEE TIP, vol. 28, no. 11, pp. 5663–5678, Nov. 2019.

[2] Mai Xu, Xin Deng, Shengxi Li and Zulin Wang, “Region-of-Interest Based Conversational HEVC Coding with Hierarchical Perception Model of Face,” IEEE JSTSP, vol. 8, no. 3, pp. 475–489, Jun. 2014.

[3] Xiph.org, “Xiph.org Video Test Media,” https://media.xiph.org/video/derf, 2017.

[4] CDVL.org, “Consumer Digital Video Library,” https://www.cdvl.org/index.php, 2019.

[5] VQEG, “VQEG Video Datasets and Organizations,” https://www.its.bldrdoc.gov/vqeg/video-datasets-and-organizations.aspx/, 2017.

[6] JCT-VC, HM Software, [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/ [Accessed 5 Nov. 2016], 2014.

--------------------------------------------------------------------------------