├── .gitignore
├── CONTRIBUTING.md
├── License.txt
├── README.md
├── dataset
│   ├── CSIQ
│   │   ├── csiq_video_quality_data.txt
│   │   ├── csiq_video_quality_seqs.txt
│   │   └── prep_csiq_score.py
│   ├── LIVE
│   │   ├── live_video_quality_data.txt
│   │   ├── live_video_quality_seqs.txt
│   │   └── prep_live_score.py
│   ├── NFLX
│   │   ├── NFLX_dataset_public.py
│   │   └── prep_NFLX_score.py
│   ├── VIDEOSET
│   │   └── videoset_subj_score_v2.json
│   └── dataset.py
├── eval.py
├── model
│   └── network.py
├── opts.py
├── requirements.txt
├── save
│   └── model_videoset_v3.pt
├── scripts
│   ├── eval.sh
│   ├── ft.sh
│   └── train.sh
├── tool
│   ├── decode_stream.py
│   └── draw.py
└── train.py
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | /runs
3 | /save/*
4 | !/save/model_videoset_v3.pt
5 | /log
6 | 
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ## Contributing to DVQA
2 | 
3 | Issues and pull requests are welcome.
4 | 
--------------------------------------------------------------------------------
/License.txt:
--------------------------------------------------------------------------------
1 | Tencent is pleased to support the open source community by making DVQA available.
2 | 
3 | Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved.
4 | 
5 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following
6 | conditions are met:
7 | 
8 | 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following
9 | disclaimer.
10 | 
11 | 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
12 | disclaimer in the documentation and/or other materials provided with the distribution.
13 | 
14 | 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products
15 | derived from this software without specific prior written permission.
16 | 
17 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
18 | BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
19 | SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
20 | CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
21 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
22 | TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
23 | POSSIBILITY OF SUCH DAMAGE.
24 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DVQA - Deep learning-based Video Quality Assessment
2 | 
3 | ## News
4 | 
5 | - 12/17/2019: added a pretrained model for PGC videos
6 | 
7 | ## Installation
8 | 
9 | We recommend running the code in a virtualenv. The code is developed with Python 3.
10 | 
11 | After activating the virtualenv, install the remaining prerequisites with the following command.
12 | 
13 | ```
14 | pip install -r requirements.txt
15 | ```
16 | All packages are required to run the code.
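A typical setup on Linux looks like the following; the environment name `env` is only an illustrative choice.

```
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
```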
17 | 
18 | ## Dataset
19 | 
20 | Please prepare a dataset if you want to evaluate in batches or train the model from scratch on your own GPUs. The dataset should be in JSON format, e.g. your\_dataset.json (a sketch of a script that generates such a file is given at the end of this README):
21 | 
22 | ```
23 | {
24 |     "test": {
25 |         "dis": ["dis_1.yuv", "dis_2.yuv"],
26 |         "ref": ["ref_1.yuv", "ref_2.yuv"],
27 |         "fps": [30, 24],
28 |         "mos": [94.2, 55.8],
29 |         "height": [1080, 720],
30 |         "width": [1920, 1280]
31 |     },
32 |     "train": {
33 |         "dis": ["dis_3.yuv", "dis_4.yuv"],
34 |         "ref": ["ref_3.yuv", "ref_4.yuv"],
35 |         "fps": [50, 24],
36 |         "mos": [85.2, 51.8],
37 |         "height": [320, 720],
38 |         "width": [640, 1280]
39 |     }
40 | }
41 | ```
42 | For the time being, only YUV input is supported. We will add modules to read bitstreams.
43 | 
44 | ## Evaluate a dataset
45 | 
46 | Put all YUV files (both dis and ref) in one folder and prepare your_dataset.json accordingly. Activate the virtualenv and run:
47 | 
48 | ```
49 | python eval.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --load_model ./save/model_pgc.pt
50 | ```
51 | 
52 | ## Train from scratch
53 | 
54 | Prepare a dataset as above and simply run:
55 | 
56 | ```
57 | python train.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --save_model ./save/your_new_trained.pt
58 | ```
59 | Please check train.sh and opts.py if you would like to tweak other hyper-parameters.
60 | 
61 | ## Known issues
62 | 
63 | The pretrained model was trained on 720p PGC videos compressed with H.264/AVC. It works well on videos with a resolution of 1920x1080 and below.
64 | 
65 | We are not sure about the performance in the following scenarios:
66 | 1. PGC with other distortion types, especially time-related distortions.
67 | 2. PGC with post-processing filters, like de-noising, super-resolution, artifact reduction, etc.
68 | 3. UGC videos with pre-processing filters.
69 | 4. UGC videos compressed with common codecs.
70 | 
71 | We will try to answer the above questions. Stay tuned.
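## Appendix: generating your\_dataset.json

The snippet below is a minimal sketch, not part of the repository, of one way to assemble a your\_dataset.json in the format described above; the clip names, MOS values, frame sizes and frame rates are the placeholder values from the example and should be replaced with your own metadata.

```
import json

# Placeholder metadata copied from the README example -- substitute your own
# clips, MOS values, frame sizes and frame rates. Within each split, all six
# lists must have the same length (one entry per distorted clip).
dataset = {
    "test": {
        "dis": ["dis_1.yuv", "dis_2.yuv"],
        "ref": ["ref_1.yuv", "ref_2.yuv"],
        "fps": [30, 24],
        "mos": [94.2, 55.8],
        "height": [1080, 720],
        "width": [1920, 1280],
    },
    "train": {
        "dis": ["dis_3.yuv", "dis_4.yuv"],
        "ref": ["ref_3.yuv", "ref_4.yuv"],
        "fps": [50, 24],
        "mos": [85.2, 51.8],
        "height": [320, 720],
        "width": [640, 1280],
    },
}

# Sanity check: the per-split lists must be aligned before writing the file.
for split, fields in dataset.items():
    lengths = {key: len(values) for key, values in fields.items()}
    assert len(set(lengths.values())) == 1, (split, lengths)

with open("your_dataset.json", "w") as f:
    json.dump(dataset, f, indent=4, sort_keys=True)
```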
72 | -------------------------------------------------------------------------------- /dataset/CSIQ/csiq_video_quality_data.txt: -------------------------------------------------------------------------------- 1 | 77.03659589 10.68229334 2 | 46.05828638 11.67556088 3 | 16.98977584 8.975408634 4 | 47.87233568 12.90558422 5 | 59.52295226 10.02854956 6 | 63.53092857 10.73559013 7 | 23.22137353 9.633084011 8 | 39.18686526 11.99896188 9 | 64.10516994 11.11996791 10 | 27.95086987 9.43386689 11 | 43.10871603 9.392493285 12 | 56.94803765 8.62708836 13 | 66.23185483 13.07712492 14 | 53.04012751 11.59132418 15 | 41.83434535 12.07339852 16 | 17.23781535 10.47034576 17 | 38.26370489 11.94198184 18 | 74.93389035 9.478265116 19 | 74.33437005 10.67530744 20 | 36.15574099 12.70897066 21 | 19.07448461 10.04962736 22 | 47.30351665 12.27800756 23 | 62.72479257 12.32255814 24 | 72.88094169 11.52198149 25 | 30.91529469 10.98965848 26 | 51.81943375 12.8503839 27 | 66.48359975 13.04969298 28 | 34.40822221 11.49189459 29 | 50.08198352 13.42102203 30 | 60.68369663 12.17457699 31 | 70.89327619 12.56400945 32 | 58.08476845 12.04455283 33 | 46.97396136 13.2067548 34 | 14.48228099 12.19790959 35 | 41.61453858 13.09358172 36 | 72.40638794 11.90062763 37 | 70.60577747 11.09584103 38 | 50.94476997 10.34591532 39 | 21.83559842 10.17884888 40 | 43.62532077 11.23140795 41 | 53.88891454 11.09043 42 | 68.49752895 11.34396889 43 | 27.23487322 9.630655265 44 | 44.89801346 11.45727039 45 | 57.64004442 11.25607956 46 | 23.41833438 10.43762768 47 | 50.18519535 11.51447251 48 | 58.36419382 11.38001498 49 | 65.78999421 13.90534333 50 | 53.5645593 12.63323682 51 | 37.6104527 10.03583007 52 | 19.73899899 9.910142045 53 | 50.35526658 12.11352316 54 | 69.59854381 11.33558692 55 | 62.87918978 11.41159937 56 | 33.85061428 13.54285106 57 | 14.92923679 6.828468632 58 | 35.11641687 10.43713833 59 | 52.4982942 10.03890222 60 | 66.25458562 9.139532485 61 | 27.63248197 9.302392498 62 | 45.15664539 9.434347477 63 | 64.43225628 7.952784986 64 | 27.05051504 9.326012858 65 | 48.15232603 8.41645693 66 | 66.51605678 8.095660709 67 | 66.867476 11.17665978 68 | 58.2501235 11.19802045 69 | 40.371758 11.97980241 70 | 16.86594618 9.316591046 71 | 51.11946157 9.534721293 72 | 73.61803456 9.139200391 73 | 80.24697721 8.114217872 74 | 51.52948607 12.85108112 75 | 27.38247179 9.107548075 76 | 41.03723316 13.0922217 77 | 52.44280478 13.01991713 78 | 64.34347279 8.270155808 79 | 34.09030457 9.096970961 80 | 52.25923635 9.005787357 81 | 67.07804333 9.447486342 82 | 33.97223008 8.9624807 83 | 54.37509447 10.30530786 84 | 63.59901217 9.302487853 85 | 68.82679581 11.81049835 86 | 55.33105232 12.76696602 87 | 44.32098298 10.0288545 88 | 15.32108178 8.188513903 89 | 43.19870312 15.83002924 90 | 70.90403793 13.0103436 91 | 53.17354596 11.59011479 92 | 34.04796718 10.02394747 93 | 17.78340253 7.4449613 94 | 37.01950616 11.11612199 95 | 46.51888825 9.612767176 96 | 60.364435 9.089100329 97 | 22.21005878 9.434656151 98 | 39.83489667 9.702219087 99 | 59.34890673 7.860333346 100 | 21.27320839 8.138559375 101 | 38.83293807 12.37495157 102 | 62.73739714 10.7909046 103 | 71.62032717 11.09895612 104 | 62.67867706 15.63234194 105 | 46.18218926 11.2882467 106 | 22.47093604 10.5649761 107 | 40.73906987 12.73258494 108 | 70.14176791 9.172305919 109 | 53.59894756 9.439882534 110 | 32.93455043 14.51621335 111 | 22.90368876 10.96273481 112 | 44.85004437 13.96989401 113 | 54.3517671 12.36393615 114 | 65.49892668 12.16442783 115 | 46.96523031 10.36704275 116 | 56.7929296 11.14494472 
117 | 66.4621166 10.45118447 118 | 30.84561652 12.57999828 119 | 54.58016742 12.10969375 120 | 72.48292161 13.96802526 121 | 68.00360471 15.1666297 122 | 63.96955028 14.97916725 123 | 43.79527584 11.12553103 124 | 17.67005082 12.09990922 125 | 41.02256363 15.78553188 126 | 71.55953909 12.15049909 127 | 75.887318 12.43790308 128 | 63.12511595 15.99820623 129 | 39.32328426 12.26853178 130 | 33.94465085 18.77570328 131 | 50.31336483 14.75787882 132 | 58.6132508 14.81632115 133 | 24.63740062 14.33033172 134 | 37.96128614 13.32445739 135 | 55.45947188 13.04601296 136 | 24.56694236 12.5741992 137 | 35.85982249 11.97389232 138 | 55.19559447 11.9475562 139 | 68.64805944 14.20983289 140 | 58.94759809 12.7464758 141 | 42.06015426 15.09457338 142 | 17.32961501 13.15849912 143 | 50.4331632 13.00156555 144 | 68.82964224 11.10942679 145 | 71.34861758 11.05052592 146 | 45.92218919 12.18816792 147 | 33.32167219 12.4417105 148 | 33.33259131 15.04303384 149 | 45.99884009 13.69187619 150 | 57.3619734 14.46617532 151 | 40.8293305 12.28396416 152 | 51.2924095 12.16139341 153 | 65.47287561 9.98770745 154 | 35.09312743 12.09728916 155 | 53.26328289 11.49583871 156 | 70.8765717 11.03342858 157 | 63.17684048 13.69483052 158 | 58.33602566 13.50846739 159 | 40.60143602 12.33208306 160 | 33.49377505 13.41019755 161 | 62.91420905 13.14503849 162 | 78.99350105 10.92069275 163 | 69.03073695 11.1086235 164 | 44.87401856 11.25229977 165 | 31.52565431 13.16048146 166 | 36.80488145 12.50257295 167 | 55.17661364 10.64543202 168 | 60.94209742 9.277534115 169 | 38.22349859 10.41976785 170 | 56.76359372 15.15577019 171 | 72.70164753 10.35491257 172 | 44.55409245 9.633922257 173 | 60.73065684 8.64755535 174 | 71.64790234 8.661880564 175 | 70.34737955 10.33569754 176 | 58.40100748 11.19599823 177 | 44.03370891 12.75386051 178 | 28.11075824 15.66130178 179 | 65.46549175 9.990327781 180 | 82.80008516 9.171781626 181 | 73.58225148 14.80830683 182 | 53.11354387 15.05058804 183 | 20.84597893 13.82228349 184 | 49.34783875 13.55053991 185 | 54.43910328 13.62899598 186 | 63.1863331 14.26525636 187 | 35.01431445 13.19332365 188 | 50.50521804 13.18986437 189 | 70.43787024 12.34679746 190 | 31.85484722 12.42544112 191 | 50.67216537 12.89324405 192 | 60.29534737 14.5254711 193 | 70.0192286 15.53322936 194 | 59.00027104 16.2270485 195 | 41.57339111 13.85798974 196 | 14.80210383 12.46163461 197 | 46.89729744 13.70177809 198 | 75.79427181 11.78725179 199 | 68.8800519 11.77675702 200 | 48.42246494 11.89779914 201 | 35.81287413 13.77085137 202 | 38.0605698 9.466687828 203 | 47.43617272 11.43293805 204 | 60.92460606 10.70242415 205 | 38.9098127 12.72318234 206 | 57.80180051 9.133746206 207 | 72.85497891 6.688103398 208 | 26.66715185 12.22327503 209 | 57.51913375 8.307480655 210 | 65.80745229 8.791192296 211 | 73.78412042 12.09659861 212 | 59.35873126 12.12424519 213 | 47.60257315 11.3749444 214 | 29.94580064 12.59422049 215 | 57.57219085 12.12436366 216 | 73.08194444 10.34255094 217 | -------------------------------------------------------------------------------- /dataset/CSIQ/csiq_video_quality_seqs.txt: -------------------------------------------------------------------------------- 1 | BQMall_832x480_dst_01.yuv 2 | BQMall_832x480_dst_02.yuv 3 | BQMall_832x480_dst_03.yuv 4 | BQMall_832x480_dst_04.yuv 5 | BQMall_832x480_dst_05.yuv 6 | BQMall_832x480_dst_06.yuv 7 | BQMall_832x480_dst_07.yuv 8 | BQMall_832x480_dst_08.yuv 9 | BQMall_832x480_dst_09.yuv 10 | BQMall_832x480_dst_10.yuv 11 | BQMall_832x480_dst_11.yuv 12 | BQMall_832x480_dst_12.yuv 13 | 
BQMall_832x480_dst_13.yuv 14 | BQMall_832x480_dst_14.yuv 15 | BQMall_832x480_dst_15.yuv 16 | BQMall_832x480_dst_16.yuv 17 | BQMall_832x480_dst_17.yuv 18 | BQMall_832x480_dst_18.yuv 19 | BQTerrace_832x480_dst_01.yuv 20 | BQTerrace_832x480_dst_02.yuv 21 | BQTerrace_832x480_dst_03.yuv 22 | BQTerrace_832x480_dst_04.yuv 23 | BQTerrace_832x480_dst_05.yuv 24 | BQTerrace_832x480_dst_06.yuv 25 | BQTerrace_832x480_dst_07.yuv 26 | BQTerrace_832x480_dst_08.yuv 27 | BQTerrace_832x480_dst_09.yuv 28 | BQTerrace_832x480_dst_10.yuv 29 | BQTerrace_832x480_dst_11.yuv 30 | BQTerrace_832x480_dst_12.yuv 31 | BQTerrace_832x480_dst_13.yuv 32 | BQTerrace_832x480_dst_14.yuv 33 | BQTerrace_832x480_dst_15.yuv 34 | BQTerrace_832x480_dst_16.yuv 35 | BQTerrace_832x480_dst_17.yuv 36 | BQTerrace_832x480_dst_18.yuv 37 | BasketballDrive_832x480_dst_01.yuv 38 | BasketballDrive_832x480_dst_02.yuv 39 | BasketballDrive_832x480_dst_03.yuv 40 | BasketballDrive_832x480_dst_04.yuv 41 | BasketballDrive_832x480_dst_05.yuv 42 | BasketballDrive_832x480_dst_06.yuv 43 | BasketballDrive_832x480_dst_07.yuv 44 | BasketballDrive_832x480_dst_08.yuv 45 | BasketballDrive_832x480_dst_09.yuv 46 | BasketballDrive_832x480_dst_10.yuv 47 | BasketballDrive_832x480_dst_11.yuv 48 | BasketballDrive_832x480_dst_12.yuv 49 | BasketballDrive_832x480_dst_13.yuv 50 | BasketballDrive_832x480_dst_14.yuv 51 | BasketballDrive_832x480_dst_15.yuv 52 | BasketballDrive_832x480_dst_16.yuv 53 | BasketballDrive_832x480_dst_17.yuv 54 | BasketballDrive_832x480_dst_18.yuv 55 | Cactus_832x480_dst_01.yuv 56 | Cactus_832x480_dst_02.yuv 57 | Cactus_832x480_dst_03.yuv 58 | Cactus_832x480_dst_04.yuv 59 | Cactus_832x480_dst_05.yuv 60 | Cactus_832x480_dst_06.yuv 61 | Cactus_832x480_dst_07.yuv 62 | Cactus_832x480_dst_08.yuv 63 | Cactus_832x480_dst_09.yuv 64 | Cactus_832x480_dst_10.yuv 65 | Cactus_832x480_dst_11.yuv 66 | Cactus_832x480_dst_12.yuv 67 | Cactus_832x480_dst_13.yuv 68 | Cactus_832x480_dst_14.yuv 69 | Cactus_832x480_dst_15.yuv 70 | Cactus_832x480_dst_16.yuv 71 | Cactus_832x480_dst_17.yuv 72 | Cactus_832x480_dst_18.yuv 73 | Carving_832x480_dst_01.yuv 74 | Carving_832x480_dst_02.yuv 75 | Carving_832x480_dst_03.yuv 76 | Carving_832x480_dst_04.yuv 77 | Carving_832x480_dst_05.yuv 78 | Carving_832x480_dst_06.yuv 79 | Carving_832x480_dst_07.yuv 80 | Carving_832x480_dst_08.yuv 81 | Carving_832x480_dst_09.yuv 82 | Carving_832x480_dst_10.yuv 83 | Carving_832x480_dst_11.yuv 84 | Carving_832x480_dst_12.yuv 85 | Carving_832x480_dst_13.yuv 86 | Carving_832x480_dst_14.yuv 87 | Carving_832x480_dst_15.yuv 88 | Carving_832x480_dst_16.yuv 89 | Carving_832x480_dst_17.yuv 90 | Carving_832x480_dst_18.yuv 91 | Chipmunks_832x480_dst_01.yuv 92 | Chipmunks_832x480_dst_02.yuv 93 | Chipmunks_832x480_dst_03.yuv 94 | Chipmunks_832x480_dst_04.yuv 95 | Chipmunks_832x480_dst_05.yuv 96 | Chipmunks_832x480_dst_06.yuv 97 | Chipmunks_832x480_dst_07.yuv 98 | Chipmunks_832x480_dst_08.yuv 99 | Chipmunks_832x480_dst_09.yuv 100 | Chipmunks_832x480_dst_10.yuv 101 | Chipmunks_832x480_dst_11.yuv 102 | Chipmunks_832x480_dst_12.yuv 103 | Chipmunks_832x480_dst_13.yuv 104 | Chipmunks_832x480_dst_14.yuv 105 | Chipmunks_832x480_dst_15.yuv 106 | Chipmunks_832x480_dst_16.yuv 107 | Chipmunks_832x480_dst_17.yuv 108 | Chipmunks_832x480_dst_18.yuv 109 | Flowervase_832x480_dst_01.yuv 110 | Flowervase_832x480_dst_02.yuv 111 | Flowervase_832x480_dst_03.yuv 112 | Flowervase_832x480_dst_04.yuv 113 | Flowervase_832x480_dst_05.yuv 114 | Flowervase_832x480_dst_06.yuv 115 | Flowervase_832x480_dst_07.yuv 116 | 
Flowervase_832x480_dst_08.yuv 117 | Flowervase_832x480_dst_09.yuv 118 | Flowervase_832x480_dst_10.yuv 119 | Flowervase_832x480_dst_11.yuv 120 | Flowervase_832x480_dst_12.yuv 121 | Flowervase_832x480_dst_13.yuv 122 | Flowervase_832x480_dst_14.yuv 123 | Flowervase_832x480_dst_15.yuv 124 | Flowervase_832x480_dst_16.yuv 125 | Flowervase_832x480_dst_17.yuv 126 | Flowervase_832x480_dst_18.yuv 127 | Keiba_832x480_dst_01.yuv 128 | Keiba_832x480_dst_02.yuv 129 | Keiba_832x480_dst_03.yuv 130 | Keiba_832x480_dst_04.yuv 131 | Keiba_832x480_dst_05.yuv 132 | Keiba_832x480_dst_06.yuv 133 | Keiba_832x480_dst_07.yuv 134 | Keiba_832x480_dst_08.yuv 135 | Keiba_832x480_dst_09.yuv 136 | Keiba_832x480_dst_10.yuv 137 | Keiba_832x480_dst_11.yuv 138 | Keiba_832x480_dst_12.yuv 139 | Keiba_832x480_dst_13.yuv 140 | Keiba_832x480_dst_14.yuv 141 | Keiba_832x480_dst_15.yuv 142 | Keiba_832x480_dst_16.yuv 143 | Keiba_832x480_dst_17.yuv 144 | Keiba_832x480_dst_18.yuv 145 | Kimono_832x480_dst_01.yuv 146 | Kimono_832x480_dst_02.yuv 147 | Kimono_832x480_dst_03.yuv 148 | Kimono_832x480_dst_04.yuv 149 | Kimono_832x480_dst_05.yuv 150 | Kimono_832x480_dst_06.yuv 151 | Kimono_832x480_dst_07.yuv 152 | Kimono_832x480_dst_08.yuv 153 | Kimono_832x480_dst_09.yuv 154 | Kimono_832x480_dst_10.yuv 155 | Kimono_832x480_dst_11.yuv 156 | Kimono_832x480_dst_12.yuv 157 | Kimono_832x480_dst_13.yuv 158 | Kimono_832x480_dst_14.yuv 159 | Kimono_832x480_dst_15.yuv 160 | Kimono_832x480_dst_16.yuv 161 | Kimono_832x480_dst_17.yuv 162 | Kimono_832x480_dst_18.yuv 163 | ParkScene_832x480_dst_01.yuv 164 | ParkScene_832x480_dst_02.yuv 165 | ParkScene_832x480_dst_03.yuv 166 | ParkScene_832x480_dst_04.yuv 167 | ParkScene_832x480_dst_05.yuv 168 | ParkScene_832x480_dst_06.yuv 169 | ParkScene_832x480_dst_07.yuv 170 | ParkScene_832x480_dst_08.yuv 171 | ParkScene_832x480_dst_09.yuv 172 | ParkScene_832x480_dst_10.yuv 173 | ParkScene_832x480_dst_11.yuv 174 | ParkScene_832x480_dst_12.yuv 175 | ParkScene_832x480_dst_13.yuv 176 | ParkScene_832x480_dst_14.yuv 177 | ParkScene_832x480_dst_15.yuv 178 | ParkScene_832x480_dst_16.yuv 179 | ParkScene_832x480_dst_17.yuv 180 | ParkScene_832x480_dst_18.yuv 181 | PartyScene_832x480_dst_01.yuv 182 | PartyScene_832x480_dst_02.yuv 183 | PartyScene_832x480_dst_03.yuv 184 | PartyScene_832x480_dst_04.yuv 185 | PartyScene_832x480_dst_05.yuv 186 | PartyScene_832x480_dst_06.yuv 187 | PartyScene_832x480_dst_07.yuv 188 | PartyScene_832x480_dst_08.yuv 189 | PartyScene_832x480_dst_09.yuv 190 | PartyScene_832x480_dst_10.yuv 191 | PartyScene_832x480_dst_11.yuv 192 | PartyScene_832x480_dst_12.yuv 193 | PartyScene_832x480_dst_13.yuv 194 | PartyScene_832x480_dst_14.yuv 195 | PartyScene_832x480_dst_15.yuv 196 | PartyScene_832x480_dst_16.yuv 197 | PartyScene_832x480_dst_17.yuv 198 | PartyScene_832x480_dst_18.yuv 199 | Timelapse_832x480_dst_01.yuv 200 | Timelapse_832x480_dst_02.yuv 201 | Timelapse_832x480_dst_03.yuv 202 | Timelapse_832x480_dst_04.yuv 203 | Timelapse_832x480_dst_05.yuv 204 | Timelapse_832x480_dst_06.yuv 205 | Timelapse_832x480_dst_07.yuv 206 | Timelapse_832x480_dst_08.yuv 207 | Timelapse_832x480_dst_09.yuv 208 | Timelapse_832x480_dst_10.yuv 209 | Timelapse_832x480_dst_11.yuv 210 | Timelapse_832x480_dst_12.yuv 211 | Timelapse_832x480_dst_13.yuv 212 | Timelapse_832x480_dst_14.yuv 213 | Timelapse_832x480_dst_15.yuv 214 | Timelapse_832x480_dst_16.yuv 215 | Timelapse_832x480_dst_17.yuv 216 | Timelapse_832x480_dst_18.yuv 217 | -------------------------------------------------------------------------------- /dataset/CSIQ/prep_csiq_score.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | 7 | 8 | def make_score_file(): 9 | dir_path = os.path.dirname(os.path.realpath(__file__)) 10 | 11 | seq_file_name = 'csiq_video_quality_seqs.txt' 12 | score_file_name = 'csiq_video_quality_data.txt' 13 | 14 | all_scenes = ['Keiba', 'Timelapse', 'BQTerrace', 'Carving', 'Chipmunks', 15 | 'Flowervase', 'ParkScene', 'PartyScene', 'BQMall', 'Cactus', 16 | 'Kimono','BasketballDrive'] 17 | test_scene = ['BQTerrace', 'ParkScene'] 18 | framerate = { 19 | 'Chipmunks_832x480_ref.yuv': 24, 20 | 'Kimono_832x480_ref.yuv': 24, 21 | 'ParkScene_832x480_ref.yuv': 24, 22 | 'Carving_832x480_ref.yuv': 25, 23 | 'Flowervase_832x480_ref.yuv': 30, 24 | 'Keiba_832x480_ref.yuv': 30, 25 | 'Timelapse_832x480_ref.yuv': 30, 26 | 'BasketballDrive_832x480_ref.yuv': 50, 27 | 'Cactus_832x480_ref.yuv': 50, 28 | 'PartyScene_832x480_ref.yuv': 50, 29 | 'BQMall_832x480_ref.yuv': 60, 30 | 'BQTerrace_832x480_ref.yuv': 60} 31 | 32 | width = 832 33 | height = 480 34 | 35 | seqs = np.genfromtxt(os.path.join(dir_path, seq_file_name), dtype='str') 36 | score = np.genfromtxt(os.path.join(dir_path, score_file_name), dtype='float') 37 | 38 | ret = OrderedDict() 39 | ret['train'] = OrderedDict() 40 | ret['test'] = OrderedDict() 41 | 42 | trn_dis = [] 43 | trn_ref = [] 44 | trn_mos = [] 45 | trn_height = [] 46 | trn_width = [] 47 | trn_fps = [] 48 | 49 | tst_dis = [] 50 | tst_ref = [] 51 | tst_mos = [] 52 | tst_height = [] 53 | tst_width = [] 54 | tst_fps = [] 55 | 56 | for clip, mos in zip(seqs, score): 57 | clip_info = re.split('_', clip) 58 | clip_name = clip_info[0] 59 | ref = clip.replace(clip[-10:-4], 'ref') 60 | fps = framerate[ref] 61 | 62 | if clip_name in test_scene: 63 | tst_dis.append(clip) 64 | tst_ref.append(ref) 65 | tst_mos.append(100.0 - float(mos[0])) 66 | tst_height.append(height) 67 | tst_width.append(width) 68 | tst_fps.append(fps) 69 | else: 70 | trn_dis.append(clip) 71 | trn_ref.append(ref) 72 | trn_mos.append(100.0 - float(mos[0])) 73 | trn_height.append(height) 74 | trn_width.append(width) 75 | trn_fps.append(fps) 76 | 77 | ret['train']['dis'] = trn_dis 78 | ret['train']['ref'] = trn_ref 79 | ret['train']['mos'] = trn_mos 80 | ret['train']['height'] = trn_height 81 | ret['train']['width'] = trn_width 82 | ret['train']['fps'] = trn_fps 83 | 84 | ret['test']['dis'] = tst_dis 85 | ret['test']['ref'] = tst_ref 86 | ret['test']['mos'] = tst_mos 87 | ret['test']['height'] = tst_height 88 | ret['test']['width'] = tst_width 89 | ret['test']['fps'] = tst_fps 90 | 91 | with open('csiq_subj_score_{}.json'.format('_'.join(test_scene)), 'w') as f: 92 | json.dump(ret, f, indent=4, sort_keys=True) 93 | 94 | 95 | if __name__ == "__main__": 96 | 97 | make_score_file() 98 | 99 | print('Done') 100 | -------------------------------------------------------------------------------- /dataset/LIVE/live_video_quality_data.txt: -------------------------------------------------------------------------------- 1 | 44.5104 12.2909 2 | 70.1054 8.4630 3 | 66.4280 10.9220 4 | 75.1225 8.7056 5 | 73.8803 5.7825 6 | 63.2564 8.8315 7 | 61.2726 10.6827 8 | 40.5551 8.4040 9 | 52.6111 9.8646 10 | 60.2534 9.0097 11 | 68.7186 9.3995 12 | 42.9784 8.5050 13 | 51.0530 8.0119 14 | 55.7020 9.3731 15 | 65.6457 10.8023 16 | 64.9369 12.4744 17 | 46.2446 9.8897 18 | 54.3732 12.0351 19 | 46.4907 10.9136 20 | 68.1064 10.4983 21 | 54.8101 13.2412 22 | 54.6555 12.2369 23 | 39.1978 
11.7595 24 | 43.6833 12.6685 25 | 55.8563 15.2382 26 | 63.5809 12.0636 27 | 38.8828 11.0500 28 | 45.6069 14.4528 29 | 48.0089 13.7996 30 | 47.5270 11.8475 31 | 68.1431 12.0123 32 | 63.5698 12.6835 33 | 48.0196 11.2378 34 | 51.4980 13.1559 35 | 55.2291 11.2665 36 | 62.3778 12.1601 37 | 42.6909 9.5547 38 | 37.8713 9.9518 39 | 45.4363 11.9058 40 | 53.6343 13.7169 41 | 62.9934 10.0094 42 | 31.4716 8.0896 43 | 42.8568 11.4820 44 | 52.0988 8.0925 45 | 62.2062 10.8021 46 | 71.2731 7.3171 47 | 72.1356 8.2769 48 | 64.6561 8.7193 49 | 53.1125 10.2891 50 | 73.4730 11.2189 51 | 55.3531 10.7032 52 | 52.4524 9.9872 53 | 38.6726 8.7816 54 | 47.7716 8.6263 55 | 56.9119 9.3595 56 | 63.7984 7.4827 57 | 33.4734 8.8625 58 | 42.5381 12.2394 59 | 56.1328 10.0524 60 | 65.7102 10.8513 61 | 65.6522 11.8297 62 | 61.3221 11.2218 63 | 44.0305 12.3100 64 | 41.4157 10.1887 65 | 58.4534 10.2342 66 | 44.2762 10.2308 67 | 48.3834 10.8759 68 | 40.7745 10.9440 69 | 46.5633 9.3641 70 | 52.3269 11.2327 71 | 56.0811 9.9024 72 | 36.5136 10.6661 73 | 42.9632 9.5615 74 | 49.1987 12.5682 75 | 57.4200 10.8714 76 | 54.9213 9.9593 77 | 63.2756 7.0135 78 | 56.8614 10.3063 79 | 49.2987 7.9941 80 | 59.3959 8.3076 81 | 44.8094 11.1511 82 | 39.1088 8.8315 83 | 32.6002 7.5710 84 | 44.0164 9.5158 85 | 54.9423 8.8703 86 | 57.1497 10.3586 87 | 40.9999 10.2129 88 | 44.6477 9.6876 89 | 49.2215 8.2303 90 | 53.7003 8.3839 91 | 68.9412 13.2694 92 | 52.9363 10.9429 93 | 51.0109 11.6969 94 | 55.9066 12.9653 95 | 61.7965 8.9395 96 | 45.9273 12.2075 97 | 40.9576 10.0565 98 | 31.9421 10.0953 99 | 36.6396 10.2083 100 | 38.6448 9.1071 101 | 52.1844 10.8366 102 | 32.7252 11.6010 103 | 43.9984 9.6540 104 | 50.5090 8.9686 105 | 53.4364 11.3882 106 | 81.1601 8.8839 107 | 70.5494 7.0154 108 | 54.9174 10.3442 109 | 49.6350 9.8661 110 | 55.5307 8.4316 111 | 61.2837 9.8106 112 | 46.2254 8.0034 113 | 36.2440 8.7969 114 | 40.8004 8.8023 115 | 51.6153 10.4552 116 | 66.3166 10.0913 117 | 37.0212 7.4451 118 | 44.0813 9.4971 119 | 57.5757 6.4381 120 | 62.0745 6.2390 121 | 78.3431 9.9876 122 | 69.2258 8.0969 123 | 59.5299 9.8755 124 | 57.8482 10.2606 125 | 73.3075 9.0790 126 | 58.5392 11.3208 127 | 54.0963 10.0428 128 | 47.3711 10.8012 129 | 48.7705 7.7892 130 | 57.6788 9.7494 131 | 67.8232 7.4454 132 | 30.9426 8.0339 133 | 40.5326 9.8009 134 | 52.5435 9.9240 135 | 64.8173 9.5076 136 | 61.3882 10.2155 137 | 66.3322 11.1123 138 | 45.4702 7.6892 139 | 45.3150 8.6377 140 | 55.3240 6.1770 141 | 56.1730 8.7040 142 | 44.6086 10.3585 143 | 39.8067 8.2885 144 | 53.7598 9.0671 145 | 59.8921 10.6386 146 | 77.2518 8.7931 147 | 39.7105 9.5447 148 | 46.8271 10.3513 149 | 54.4239 11.2077 150 | 61.8235 11.1164 151 | -------------------------------------------------------------------------------- /dataset/LIVE/live_video_quality_seqs.txt: -------------------------------------------------------------------------------- 1 | pa2_25fps.yuv 2 | pa3_25fps.yuv 3 | pa4_25fps.yuv 4 | pa5_25fps.yuv 5 | pa6_25fps.yuv 6 | pa7_25fps.yuv 7 | pa8_25fps.yuv 8 | pa9_25fps.yuv 9 | pa10_25fps.yuv 10 | pa11_25fps.yuv 11 | pa12_25fps.yuv 12 | pa13_25fps.yuv 13 | pa14_25fps.yuv 14 | pa15_25fps.yuv 15 | pa16_25fps.yuv 16 | rb2_25fps.yuv 17 | rb3_25fps.yuv 18 | rb4_25fps.yuv 19 | rb5_25fps.yuv 20 | rb6_25fps.yuv 21 | rb7_25fps.yuv 22 | rb8_25fps.yuv 23 | rb9_25fps.yuv 24 | rb10_25fps.yuv 25 | rb11_25fps.yuv 26 | rb12_25fps.yuv 27 | rb13_25fps.yuv 28 | rb14_25fps.yuv 29 | rb15_25fps.yuv 30 | rb16_25fps.yuv 31 | rh2_25fps.yuv 32 | rh3_25fps.yuv 33 | rh4_25fps.yuv 34 | rh5_25fps.yuv 35 | rh6_25fps.yuv 36 | 
rh7_25fps.yuv 37 | rh8_25fps.yuv 38 | rh9_25fps.yuv 39 | rh10_25fps.yuv 40 | rh11_25fps.yuv 41 | rh12_25fps.yuv 42 | rh13_25fps.yuv 43 | rh14_25fps.yuv 44 | rh15_25fps.yuv 45 | rh16_25fps.yuv 46 | tr2_25fps.yuv 47 | tr3_25fps.yuv 48 | tr4_25fps.yuv 49 | tr5_25fps.yuv 50 | tr6_25fps.yuv 51 | tr7_25fps.yuv 52 | tr8_25fps.yuv 53 | tr9_25fps.yuv 54 | tr10_25fps.yuv 55 | tr11_25fps.yuv 56 | tr12_25fps.yuv 57 | tr13_25fps.yuv 58 | tr14_25fps.yuv 59 | tr15_25fps.yuv 60 | tr16_25fps.yuv 61 | st2_25fps.yuv 62 | st3_25fps.yuv 63 | st4_25fps.yuv 64 | st5_25fps.yuv 65 | st6_25fps.yuv 66 | st7_25fps.yuv 67 | st8_25fps.yuv 68 | st9_25fps.yuv 69 | st10_25fps.yuv 70 | st11_25fps.yuv 71 | st12_25fps.yuv 72 | st13_25fps.yuv 73 | st14_25fps.yuv 74 | st15_25fps.yuv 75 | st16_25fps.yuv 76 | sf2_25fps.yuv 77 | sf3_25fps.yuv 78 | sf4_25fps.yuv 79 | sf5_25fps.yuv 80 | sf6_25fps.yuv 81 | sf7_25fps.yuv 82 | sf8_25fps.yuv 83 | sf9_25fps.yuv 84 | sf10_25fps.yuv 85 | sf11_25fps.yuv 86 | sf12_25fps.yuv 87 | sf13_25fps.yuv 88 | sf14_25fps.yuv 89 | sf15_25fps.yuv 90 | sf16_25fps.yuv 91 | bs2_25fps.yuv 92 | bs3_25fps.yuv 93 | bs4_25fps.yuv 94 | bs5_25fps.yuv 95 | bs6_25fps.yuv 96 | bs7_25fps.yuv 97 | bs8_25fps.yuv 98 | bs9_25fps.yuv 99 | bs10_25fps.yuv 100 | bs11_25fps.yuv 101 | bs12_25fps.yuv 102 | bs13_25fps.yuv 103 | bs14_25fps.yuv 104 | bs15_25fps.yuv 105 | bs16_25fps.yuv 106 | sh2_50fps.yuv 107 | sh3_50fps.yuv 108 | sh4_50fps.yuv 109 | sh5_50fps.yuv 110 | sh6_50fps.yuv 111 | sh7_50fps.yuv 112 | sh8_50fps.yuv 113 | sh9_50fps.yuv 114 | sh10_50fps.yuv 115 | sh11_50fps.yuv 116 | sh12_50fps.yuv 117 | sh13_50fps.yuv 118 | sh14_50fps.yuv 119 | sh15_50fps.yuv 120 | sh16_50fps.yuv 121 | mc2_50fps.yuv 122 | mc3_50fps.yuv 123 | mc4_50fps.yuv 124 | mc5_50fps.yuv 125 | mc6_50fps.yuv 126 | mc7_50fps.yuv 127 | mc8_50fps.yuv 128 | mc9_50fps.yuv 129 | mc10_50fps.yuv 130 | mc11_50fps.yuv 131 | mc12_50fps.yuv 132 | mc13_50fps.yuv 133 | mc14_50fps.yuv 134 | mc15_50fps.yuv 135 | mc16_50fps.yuv 136 | pr2_50fps.yuv 137 | pr3_50fps.yuv 138 | pr4_50fps.yuv 139 | pr5_50fps.yuv 140 | pr6_50fps.yuv 141 | pr7_50fps.yuv 142 | pr8_50fps.yuv 143 | pr9_50fps.yuv 144 | pr10_50fps.yuv 145 | pr11_50fps.yuv 146 | pr12_50fps.yuv 147 | pr13_50fps.yuv 148 | pr14_50fps.yuv 149 | pr15_50fps.yuv 150 | pr16_50fps.yuv 151 | -------------------------------------------------------------------------------- /dataset/LIVE/prep_live_score.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | 7 | def make_score_file(): 8 | dir_path = os.path.dirname(os.path.realpath(__file__)) 9 | 10 | seq_file_name = 'live_video_quality_seqs.txt' 11 | score_file_name = 'live_video_quality_data.txt' 12 | 13 | all_scenes = ['bs', "st", 'sh', "mc", "pa", 'sf', 'rb', 'tr', 'pr', 'rh'] 14 | test_scene = ['bs', 'st'] 15 | framerate = { 16 | 'pa01_25fps.yuv': 25, 17 | 'rb01_25fps.yuv': 25, 18 | 'rh01_25fps.yuv': 25, 19 | 'tr01_25fps.yuv': 25, 20 | 'st01_25fps.yuv': 25, 21 | 'sf01_25fps.yuv': 25, 22 | 'bs01_25fps.yuv': 25, 23 | 'sh01_50fps.yuv': 50, 24 | 'mc01_50fps.yuv': 50, 25 | 'pr01_50fps.yuv': 50} 26 | 27 | width = 768 28 | height = 432 29 | 30 | seqs = np.genfromtxt(os.path.join(dir_path, seq_file_name), dtype='str') 31 | score = np.genfromtxt(os.path.join(dir_path, score_file_name), dtype='float') 32 | 33 | ret = OrderedDict() 34 | ret['train'] = OrderedDict() 35 | ret['test'] = OrderedDict() 36 | 37 | trn_dis = [] 38 | trn_ref = [] 39 | trn_mos 
= [] 40 | trn_height = [] 41 | trn_width = [] 42 | trn_fps = [] 43 | 44 | tst_dis = [] 45 | tst_ref = [] 46 | tst_mos = [] 47 | tst_height = [] 48 | tst_width = [] 49 | tst_fps = [] 50 | 51 | for clip, mos in zip(seqs, score): 52 | clip_info = re.split('_', clip) 53 | clip_name = clip_info[0][0:2] 54 | dst = int(clip_info[0][2:]) 55 | clip = '' + clip_name + '{dst:02d}'.format(dst=dst) + '_' + clip_info[-1] 56 | ref = '' + clip_name + '01' + '_' + clip_info[-1] 57 | fps = framerate[ref] 58 | 59 | if clip_name in test_scene: 60 | tst_dis.append(clip) 61 | tst_ref.append(ref) 62 | tst_mos.append(100.0 - float(mos[0])) 63 | tst_height.append(height) 64 | tst_width.append(width) 65 | tst_fps.append(fps) 66 | else: 67 | trn_dis.append(clip) 68 | trn_ref.append(ref) 69 | trn_mos.append(100.0 - float(mos[0])) 70 | trn_height.append(height) 71 | trn_width.append(width) 72 | trn_fps.append(fps) 73 | 74 | ret['train']['dis'] = trn_dis 75 | ret['train']['ref'] = trn_ref 76 | ret['train']['mos'] = trn_mos 77 | ret['train']['height'] = trn_height 78 | ret['train']['width'] = trn_width 79 | ret['train']['fps'] = trn_fps 80 | 81 | ret['test']['dis'] = tst_dis 82 | ret['test']['ref'] = tst_ref 83 | ret['test']['mos'] = tst_mos 84 | ret['test']['height'] = tst_height 85 | ret['test']['width'] = tst_width 86 | ret['test']['fps'] = tst_fps 87 | 88 | with open('live_subj_score_{}.json'.format('_'.join(test_scene)), 'w') as f: 89 | json.dump(ret, f, indent=4, sort_keys=True) 90 | 91 | 92 | if __name__ == "__main__": 93 | 94 | make_score_file() 95 | 96 | print('Done') 97 | -------------------------------------------------------------------------------- /dataset/NFLX/NFLX_dataset_public.py: -------------------------------------------------------------------------------- 1 | dataset_name = 'NFLX_public' 2 | yuv_fmt = 'yuv420p' 3 | width = 1920 4 | height = 1080 5 | 6 | ref_dir = '[path to dataset videos]/ref' 7 | dis_dir = '[path to dataset videos]/dis' 8 | 9 | ref_videos = [ 10 | {'content_id': 0, 11 | 'content_name': 'BigBuckBunny', 12 | 'path': ref_dir + '/BigBuckBunny_25fps.yuv'}, 13 | {'content_id': 1, 14 | 'content_name': 'BirdsInCage', 15 | 'path': ref_dir + '/BirdsInCage_30fps.yuv'}, 16 | {'content_id': 2, 17 | 'content_name': 'CrowdRun', 18 | 'path': ref_dir + '/CrowdRun_25fps.yuv'}, 19 | {'content_id': 3, 20 | 'content_name': 'ElFuente1', 21 | 'path': ref_dir + '/ElFuente1_30fps.yuv'}, 22 | {'content_id': 4, 23 | 'content_name': 'ElFuente2', 24 | 'path': ref_dir + '/ElFuente2_30fps.yuv'}, 25 | {'content_id': 5, 26 | 'content_name': 'FoxBird', 27 | 'path': ref_dir + '/FoxBird_25fps.yuv'}, 28 | {'content_id': 6, 29 | 'content_name': 'OldTownCross', 30 | 'path': ref_dir + '/OldTownCross_25fps.yuv'}, 31 | {'content_id': 7, 32 | 'content_name': 'Seeking', 33 | 'path': ref_dir + '/Seeking_25fps.yuv'}, 34 | {'content_id': 8, 35 | 'content_name': 'Tennis', 36 | 'path': ref_dir + '/Tennis_24fps.yuv'} 37 | ] 38 | 39 | dis_videos = [ 40 | {'asset_id': 9, 41 | 'content_id': 0, 42 | 'dmos': 22.5, 43 | 'path': dis_dir + '/BigBuckBunny_20_288_375.yuv', 44 | }, 45 | {'asset_id': 10, 46 | 'content_id': 0, 47 | 'dmos': 35.0, 48 | 'path': dis_dir + '/BigBuckBunny_30_384_550.yuv', 49 | }, 50 | {'asset_id': 11, 51 | 'content_id': 0, 52 | 'dmos': 49.166666666666664, 53 | 'path': dis_dir + '/BigBuckBunny_40_384_750.yuv', 54 | }, 55 | {'asset_id': 12, 56 | 'content_id': 0, 57 | 'dmos': 61.666666666666664, 58 | 'path': dis_dir + '/BigBuckBunny_50_480_1050.yuv', 59 | }, 60 | {'asset_id': 13, 61 | 'content_id': 0, 62 | 'dmos': 
78.333333333333329, 63 | 'path': dis_dir + '/BigBuckBunny_55_480_1750.yuv', 64 | }, 65 | {'asset_id': 14, 66 | 'content_id': 0, 67 | 'dmos': 97.5, 68 | 'path': dis_dir + '/BigBuckBunny_75_720_3050.yuv', 69 | }, 70 | {'asset_id': 15, 71 | 'content_id': 0, 72 | 'dmos': 95.0, 73 | 'path': dis_dir + '/BigBuckBunny_80_720_4250.yuv', 74 | }, 75 | {'asset_id': 16, 76 | 'content_id': 0, 77 | 'dmos': 99.166666666666671, 78 | 'path': dis_dir + '/BigBuckBunny_85_1080_3800.yuv', 79 | }, 80 | {'asset_id': 17, 81 | 'content_id': 0, 82 | 'dmos': 103.33333333333333, 83 | 'path': dis_dir + '/BigBuckBunny_90_1080_4300.yuv', 84 | }, 85 | {'asset_id': 18, 86 | 'content_id': 0, 87 | 'dmos': 99.166666666666671, 88 | 'path': dis_dir + '/BigBuckBunny_95_1080_5800.yuv', 89 | }, 90 | {'asset_id': 19, 91 | 'content_id': 1, 92 | 'dmos': 38.333333333333336, 93 | 'path': dis_dir + '/BirdsInCage_40_288_375.yuv', 94 | }, 95 | {'asset_id': 20, 96 | 'content_id': 1, 97 | 'dmos': 40.0, 98 | 'path': dis_dir + '/BirdsInCage_50_288_550.yuv', 99 | }, 100 | {'asset_id': 21, 101 | 'content_id': 1, 102 | 'dmos': 52.5, 103 | 'path': dis_dir + '/BirdsInCage_60_384_550.yuv', 104 | }, 105 | {'asset_id': 22, 106 | 'content_id': 1, 107 | 'dmos': 55.0, 108 | 'path': dis_dir + '/BirdsInCage_65_384_750.yuv', 109 | }, 110 | {'asset_id': 23, 111 | 'content_id': 1, 112 | 'dmos': 70.0, 113 | 'path': dis_dir + '/BirdsInCage_80_480_750.yuv', 114 | }, 115 | {'asset_id': 24, 116 | 'content_id': 1, 117 | 'dmos': 92.5, 118 | 'path': dis_dir + '/BirdsInCage_85_720_1050.yuv', 119 | }, 120 | {'asset_id': 25, 121 | 'content_id': 1, 122 | 'dmos': 100.83333333333333, 123 | 'path': dis_dir + '/BirdsInCage_90_1080_1800.yuv', 124 | }, 125 | {'asset_id': 26, 126 | 'content_id': 1, 127 | 'dmos': 102.5, 128 | 'path': dis_dir + '/BirdsInCage_95_1080_3000.yuv', 129 | }, 130 | {'asset_id': 27, 131 | 'content_id': 2, 132 | 'dmos': 20.0, 133 | 'path': dis_dir + '/CrowdRun_30_288_375.yuv', 134 | }, 135 | {'asset_id': 28, 136 | 'content_id': 2, 137 | 'dmos': 40.0, 138 | 'path': dis_dir + '/CrowdRun_40_480_2350.yuv', 139 | }, 140 | {'asset_id': 29, 141 | 'content_id': 2, 142 | 'dmos': 58.333333333333336, 143 | 'path': dis_dir + '/CrowdRun_50_1080_4300.yuv', 144 | }, 145 | {'asset_id': 30, 146 | 'content_id': 2, 147 | 'dmos': 67.5, 148 | 'path': dis_dir + '/CrowdRun_65_1080_5800.yuv', 149 | }, 150 | {'asset_id': 31, 151 | 'content_id': 2, 152 | 'dmos': 81.666666666666671, 153 | 'path': dis_dir + '/CrowdRun_75_1080_7500.yuv', 154 | }, 155 | {'asset_id': 32, 156 | 'content_id': 2, 157 | 'dmos': 85.0, 158 | 'path': dis_dir + '/CrowdRun_80_1080_10000.yuv', 159 | }, 160 | {'asset_id': 33, 161 | 'content_id': 2, 162 | 'dmos': 94.166666666666671, 163 | 'path': dis_dir + '/CrowdRun_90_1080_15000.yuv', 164 | }, 165 | {'asset_id': 34, 166 | 'content_id': 3, 167 | 'dmos': 18.333333333333332, 168 | 'path': dis_dir + '/ElFuente1_10_288_375.yuv', 169 | }, 170 | {'asset_id': 35, 171 | 'content_id': 3, 172 | 'dmos': 29.166666666666668, 173 | 'path': dis_dir + '/ElFuente1_25_384_750.yuv', 174 | }, 175 | {'asset_id': 36, 176 | 'content_id': 3, 177 | 'dmos': 66.666666666666671, 178 | 'path': dis_dir + '/ElFuente1_50_480_1750.yuv', 179 | }, 180 | {'asset_id': 37, 181 | 'content_id': 3, 182 | 'dmos': 72.5, 183 | 'path': dis_dir + '/ElFuente1_60_720_2350.yuv', 184 | }, 185 | {'asset_id': 38, 186 | 'content_id': 3, 187 | 'dmos': 86.666666666666671, 188 | 'path': dis_dir + '/ElFuente1_70_1080_4300.yuv', 189 | }, 190 | {'asset_id': 39, 191 | 'content_id': 3, 192 | 'dmos': 95.0, 193 | 'path': 
dis_dir + '/ElFuente1_85_1080_5800.yuv', 194 | }, 195 | {'asset_id': 40, 196 | 'content_id': 3, 197 | 'dmos': 99.166666666666671, 198 | 'path': dis_dir + '/ElFuente1_90_1080_7500.yuv', 199 | }, 200 | {'asset_id': 41, 201 | 'content_id': 4, 202 | 'dmos': 25.0, 203 | 'path': dis_dir + '/ElFuente2_05_288_375.yuv', 204 | }, 205 | {'asset_id': 42, 206 | 'content_id': 4, 207 | 'dmos': 55.0, 208 | 'path': dis_dir + '/ElFuente2_30_480_1750.yuv', 209 | }, 210 | {'asset_id': 43, 211 | 'content_id': 4, 212 | 'dmos': 58.333333333333336, 213 | 'path': dis_dir + '/ElFuente2_50_720_3050.yuv', 214 | }, 215 | {'asset_id': 44, 216 | 'content_id': 4, 217 | 'dmos': 68.333333333333329, 218 | 'path': dis_dir + '/ElFuente2_60_1080_4300.yuv', 219 | }, 220 | {'asset_id': 45, 221 | 'content_id': 4, 222 | 'dmos': 75.833333333333329, 223 | 'path': dis_dir + '/ElFuente2_65_720_4250.yuv', 224 | }, 225 | {'asset_id': 46, 226 | 'content_id': 4, 227 | 'dmos': 82.5, 228 | 'path': dis_dir + '/ElFuente2_70_1080_5800.yuv', 229 | }, 230 | {'asset_id': 47, 231 | 'content_id': 4, 232 | 'dmos': 93.333333333333329, 233 | 'path': dis_dir + '/ElFuente2_80_1080_10000.yuv', 234 | }, 235 | {'asset_id': 48, 236 | 'content_id': 4, 237 | 'dmos': 96.666666666666671, 238 | 'path': dis_dir + '/ElFuente2_85_1080_15000.yuv', 239 | }, 240 | {'asset_id': 49, 241 | 'content_id': 4, 242 | 'dmos': 97.5, 243 | 'path': dis_dir + '/ElFuente2_90_1080_20000.yuv', 244 | }, 245 | {'asset_id': 50, 246 | 'content_id': 5, 247 | 'dmos': 34.166666666666664, 248 | 'path': dis_dir + '/FoxBird_20_288_375.yuv', 249 | }, 250 | {'asset_id': 51, 251 | 'content_id': 5, 252 | 'dmos': 60.0, 253 | 'path': dis_dir + '/FoxBird_40_384_750.yuv', 254 | }, 255 | {'asset_id': 52, 256 | 'content_id': 5, 257 | 'dmos': 64.166666666666671, 258 | 'path': dis_dir + '/FoxBird_55_480_750.yuv', 259 | }, 260 | {'asset_id': 53, 261 | 'content_id': 5, 262 | 'dmos': 83.333333333333329, 263 | 'path': dis_dir + '/FoxBird_65_480_1750.yuv', 264 | }, 265 | {'asset_id': 54, 266 | 'content_id': 5, 267 | 'dmos': 90.833333333333329, 268 | 'path': dis_dir + '/FoxBird_80_1080_2300.yuv', 269 | }, 270 | {'asset_id': 55, 271 | 'content_id': 5, 272 | 'dmos': 101.66666666666667, 273 | 'path': dis_dir + '/FoxBird_95_1080_5800.yuv', 274 | }, 275 | {'asset_id': 56, 276 | 'content_id': 6, 277 | 'dmos': 30.833333333333332, 278 | 'path': dis_dir + '/OldTownCross_20_288_375.yuv', 279 | }, 280 | {'asset_id': 57, 281 | 'content_id': 6, 282 | 'dmos': 45.0, 283 | 'path': dis_dir + '/OldTownCross_45_384_750.yuv', 284 | }, 285 | {'asset_id': 58, 286 | 'content_id': 6, 287 | 'dmos': 57.5, 288 | 'path': dis_dir + '/OldTownCross_55_480_750.yuv', 289 | }, 290 | {'asset_id': 59, 291 | 'content_id': 6, 292 | 'dmos': 75.0, 293 | 'path': dis_dir + '/OldTownCross_60_480_1750.yuv', 294 | }, 295 | {'asset_id': 60, 296 | 'content_id': 6, 297 | 'dmos': 99.166666666666671, 298 | 'path': dis_dir + '/OldTownCross_80_720_2350.yuv', 299 | }, 300 | {'asset_id': 61, 301 | 'content_id': 6, 302 | 'dmos': 99.166666666666671, 303 | 'path': dis_dir + '/OldTownCross_85_720_2950.yuv', 304 | }, 305 | {'asset_id': 62, 306 | 'content_id': 6, 307 | 'dmos': 109.16666666666667, 308 | 'path': dis_dir + '/OldTownCross_90_1080_4300.yuv', 309 | }, 310 | {'asset_id': 63, 311 | 'content_id': 7, 312 | 'dmos': 19.166666666666668, 313 | 'path': dis_dir + '/Seeking_10_288_375.yuv', 314 | }, 315 | {'asset_id': 64, 316 | 'content_id': 7, 317 | 'dmos': 41.666666666666664, 318 | 'path': dis_dir + '/Seeking_30_480_1050.yuv', 319 | }, 320 | {'asset_id': 65, 321 | 
'content_id': 7, 322 | 'dmos': 50.833333333333336, 323 | 'path': dis_dir + '/Seeking_45_480_1750.yuv', 324 | }, 325 | {'asset_id': 66, 326 | 'content_id': 7, 327 | 'dmos': 66.666666666666671, 328 | 'path': dis_dir + '/Seeking_50_720_2350.yuv', 329 | }, 330 | {'asset_id': 67, 331 | 'content_id': 7, 332 | 'dmos': 75.833333333333329, 333 | 'path': dis_dir + '/Seeking_60_720_3050.yuv', 334 | }, 335 | {'asset_id': 68, 336 | 'content_id': 7, 337 | 'dmos': 80.833333333333329, 338 | 'path': dis_dir + '/Seeking_65_1080_4300.yuv', 339 | }, 340 | {'asset_id': 69, 341 | 'content_id': 7, 342 | 'dmos': 91.666666666666671, 343 | 'path': dis_dir + '/Seeking_75_1080_5800.yuv', 344 | }, 345 | {'asset_id': 70, 346 | 'content_id': 7, 347 | 'dmos': 90.0, 348 | 'path': dis_dir + '/Seeking_85_1080_7500.yuv', 349 | }, 350 | {'asset_id': 71, 351 | 'content_id': 7, 352 | 'dmos': 91.666666666666671, 353 | 'path': dis_dir + '/Seeking_90_1080_15000.yuv', 354 | }, 355 | {'asset_id': 72, 356 | 'content_id': 7, 357 | 'dmos': 96.666666666666671, 358 | 'path': dis_dir + '/Seeking_95_1080_20000.yuv', 359 | }, 360 | {'asset_id': 73, 361 | 'content_id': 8, 362 | 'dmos': 33.333333333333336, 363 | 'path': dis_dir + '/Tennis_20_288_375.yuv', 364 | }, 365 | {'asset_id': 74, 366 | 'content_id': 8, 367 | 'dmos': 50.0, 368 | 'path': dis_dir + '/Tennis_40_384_750.yuv', 369 | }, 370 | {'asset_id': 75, 371 | 'content_id': 8, 372 | 'dmos': 71.666666666666671, 373 | 'path': dis_dir + '/Tennis_60_480_1050.yuv', 374 | }, 375 | {'asset_id': 76, 376 | 'content_id': 8, 377 | 'dmos': 68.333333333333329, 378 | 'path': dis_dir + '/Tennis_70_480_1750.yuv', 379 | }, 380 | {'asset_id': 77, 381 | 'content_id': 8, 382 | 'dmos': 94.166666666666671, 383 | 'path': dis_dir + '/Tennis_80_720_3050.yuv', 384 | }, 385 | {'asset_id': 78, 386 | 'content_id': 8, 387 | 'dmos': 99.166666666666671, 388 | 'path': dis_dir + '/Tennis_90_1080_4300.yuv', 389 | }] 390 | -------------------------------------------------------------------------------- /dataset/NFLX/prep_NFLX_score.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | from NFLX_dataset_public import dis_videos 7 | 8 | def make_score_file(): 9 | 10 | dir_path = os.path.dirname(os.path.realpath(__file__)) 11 | 12 | all_scenes = ['BigBuckBunny', 'BirdsInCage', 'CrowdRun', 'ElFuente1', 'ElFuente2', 13 | 'FoxBird', 'OldTownCross', 'Seeking', 'Tennis'] 14 | test_scene = all_scenes 15 | framerate = { 16 | 'BigBuckBunny_25fps.yuv': 25, 17 | 'BirdsInCage_30fps.yuv': 30, 18 | 'CrowdRun_25fps.yuv': 25, 19 | 'ElFuente1_30fps.yuv': 30, 20 | 'ElFuente2_30fps.yuv': 30, 21 | 'FoxBird_25fps.yuv': 25, 22 | 'OldTownCross_25fps.yuv': 25, 23 | 'Seeking_25fps.yuv': 25, 24 | 'Tennis_24fps.yuv': 24} 25 | 26 | width = 1920 27 | height = 1080 28 | 29 | ret = OrderedDict() 30 | ret['train'] = OrderedDict() 31 | ret['test'] = OrderedDict() 32 | 33 | trn_dis = [] 34 | trn_ref = [] 35 | trn_mos = [] 36 | trn_height = [] 37 | trn_width = [] 38 | trn_fps = [] 39 | 40 | tst_dis = [] 41 | tst_ref = [] 42 | tst_mos = [] 43 | tst_height = [] 44 | tst_width = [] 45 | tst_fps = [] 46 | 47 | for pvs in dis_videos: 48 | 49 | clip_info = re.split('/', pvs['path']) 50 | clip_name = clip_info[-1] 51 | print(clip_name) 52 | scene = re.split('_', clip_name)[0] 53 | for k, v in framerate.items(): 54 | if scene in k: 55 | if scene in test_scene: 56 | tst_dis.append(clip_name) 57 | tst_ref.append(k) 58 | 
tst_mos.append(pvs['dmos'])
59 |                     tst_height.append(height)
60 |                     tst_width.append(width)
61 |                     tst_fps.append(v)
62 |                 else:
63 |                     trn_dis.append(clip_name)
64 |                     trn_ref.append(k)
65 |                     trn_mos.append(pvs['dmos'])
66 |                     trn_height.append(height)
67 |                     trn_width.append(width)
68 |                     trn_fps.append(v)
69 |                 break
70 | 
71 |     ret['train']['dis'] = trn_dis
72 |     ret['train']['ref'] = trn_ref
73 |     ret['train']['mos'] = trn_mos
74 |     ret['train']['height'] = trn_height
75 |     ret['train']['width'] = trn_width
76 |     ret['train']['fps'] = trn_fps
77 | 
78 |     ret['test']['dis'] = tst_dis
79 |     ret['test']['ref'] = tst_ref
80 |     ret['test']['mos'] = tst_mos
81 |     ret['test']['height'] = tst_height
82 |     ret['test']['width'] = tst_width
83 |     ret['test']['fps'] = tst_fps
84 | 
85 |     with open('NFLX_subj_score.json', 'w') as f:
86 |         json.dump(ret, f, indent=4, sort_keys=True)
87 | 
88 | 
89 | if __name__ == "__main__":
90 | 
91 |     make_score_file()
92 | 
93 |     print('Done')
94 | 
--------------------------------------------------------------------------------
/dataset/dataset.py:
--------------------------------------------------------------------------------
1 | import os
2 | import re
3 | import json
4 | import numpy as np
5 | import subprocess
6 | import torch
7 | from torch.utils.data import DataLoader, Dataset
8 | 
9 | 
10 | class CropSegment(object):
11 |     r"""
12 |     Crop a clip along the spatial axes, i.e. h, w
13 |     DO NOT crop along the temporal axis
14 | 
15 |     args:
16 |         size_x: horizontal dimension of a segment
17 |         size_y: vertical dimension of a segment
18 |         stride_x: horizontal stride between segments
19 |         stride_y: vertical stride between segments
20 |     return:
21 |         clip (tensor): dim = (N, C, D, H=size_y, W=size_x), where N is the number of segments produced by sliding a window of the given size and stride over H and W
22 |     """
23 | 
24 |     def __init__(self, size_x, size_y, stride_x, stride_y):
25 | 
26 |         self.size_x = size_x
27 |         self.size_y = size_y
28 |         self.stride_x = stride_x
29 |         self.stride_y = stride_y
30 | 
31 |     def __call__(self, clip):
32 | 
33 |         # input dimension [C, D, H, W]
34 |         channel = clip.shape[0]
35 |         depth = clip.shape[1]
36 | 
37 |         clip = clip.unfold(2, self.size_y, self.stride_y)  # slide window over H (vertical)
38 |         clip = clip.unfold(3, self.size_x, self.stride_x)  # slide window over W (horizontal)
39 |         clip = clip.permute(2, 3, 0, 1, 4, 5)
40 |         clip = clip.contiguous().view(-1, channel, depth, self.size_y, self.size_x)
41 | 
42 |         return clip
43 | 
44 | 
45 | class VideoDataset(Dataset):
46 |     r"""
47 |     A Dataset for a folder of videos
48 | 
49 |     args:
50 |         subj_score_file (str): path to the subjective score file. It contains the train/test split, ref list, dis list, fps list and mos list
51 |         directory (str): the path to the directory containing all videos
52 |         mode (str, optional): determines whether to read train/test data
53 |         channel (int, optional): number of channels of a sample
54 |         size_x: horizontal dimension of a segment
55 |         size_y: vertical dimension of a segment
56 |         stride_x: horizontal stride between segments
57 |         stride_y: vertical stride between segments
58 |     """
59 | 
60 |     def __init__(self, subj_score_file, directory, mode='train', channel=1, size_x=112, size_y=112, stride_x=80, stride_y=80, transform=None):
61 | 
62 |         with open(subj_score_file, "r") as f:
63 |             data = json.load(f)
64 |         self.video_dir = directory
65 |         data = data[mode]
66 |         self.ref = data['ref']
67 |         self.dis = data['dis']
68 |         self.label = data['mos']
69 |         self.framerate = data['fps']
70 |         self.frame_height = data['height']
71 |         self.frame_width = data['width']
72 |         self.channel = channel
73 |         self.size_x = size_x
74 |         self.size_y = size_y
75 |         self.stride_x = stride_x
76 |         self.stride_y = stride_y
77 |         self.transform = transform
78 | 
79 |     def __getitem__(self, index):
80 | 
81 |         ref = os.path.join(self.video_dir, self.ref[index])
82 |         dis = os.path.join(self.video_dir, self.dis[index])
83 |         label = float(self.label[index])
84 |         framerate = int(self.framerate[index])
85 |         frame_height = int(self.frame_height[index])
86 |         frame_width = int(self.frame_width[index])
87 | 
88 |         if framerate <= 30:
89 |             stride_t = 2
90 |         elif framerate <= 60:
91 |             stride_t = 4
92 |         else:
93 |             raise ValueError('Unsupported fps')
94 | 
95 |         if ref.endswith(('.YUV', '.yuv')):
96 |             ref = self.load_yuv(ref, frame_height, frame_width, stride_t)
97 |         elif ref.endswith('.mp4'):
98 |             ref = self.load_encode(ref, frame_height, frame_width, stride_t)
99 |         else:
100 |             raise ValueError('Unsupported video format')
101 | 
102 |         if dis.endswith(('.YUV', '.yuv')):
103 |             dis = self.load_yuv(dis, frame_height, frame_width, stride_t)
104 |         elif dis.endswith('.mp4'):
105 |             dis = self.load_encode(dis, frame_height, frame_width, stride_t)
106 |         else:
107 |             raise ValueError('Unsupported video format')
108 | 
109 |         offset_v = (frame_height - self.size_y) % self.stride_y  # rows not covered by the crop grid
110 |         offset_t = int(offset_v / 4 * 2)  # split the margin between top and bottom
111 |         offset_b = offset_v - offset_t
112 |         offset_h = (frame_width - self.size_x) % self.stride_x  # columns not covered by the crop grid
113 |         offset_l = int(offset_h / 4 * 2)
114 |         offset_r = offset_h - offset_l
115 | 
116 |         ref = ref[:, :, offset_t:frame_height-offset_b, offset_l:frame_width-offset_r]
117 |         dis = dis[:, :, offset_t:frame_height-offset_b, offset_l:frame_width-offset_r]
118 | 
119 |         spatial_crop = CropSegment(self.size_x, self.size_y, self.stride_x, self.stride_y)
120 |         ref = spatial_crop(ref)
121 |         dis = spatial_crop(dis)
122 | 
123 |         ref = torch.from_numpy(np.asarray(ref))
124 |         dis = torch.from_numpy(np.asarray(dis))
125 |         label = torch.from_numpy(np.asarray(label))
126 | 
127 |         return ref, dis, label
128 | 
129 |     def load_yuv(self, file_path, frame_height, frame_width, stride_t, start=0):
130 |         r"""
131 |         Load frames on-demand from raw video, currently supports only yuv420p
132 | 
133 |         args:
134 |             file_path (str): path to yuv file
135 |             frame_height
136 |             frame_width
137 |             stride_t (int): sample the 1st frame from every stride_t frames
138 |             start (int): index of the 1st sampled frame
139 |         return:
140 |             ret (tensor): contains sampled frames (Y channel). dim = (C, D, H, W)
141 |         """
142 | 
143 |         bytes_per_frame = int(frame_height * frame_width * 1.5)
144 |         frame_count = os.path.getsize(file_path) // bytes_per_frame
145 | 
146 |         ret = []
147 |         count = 0
148 | 
149 |         with open(file_path, 'rb') as f:
150 |             while count < frame_count:
151 |                 if count % stride_t == 0:
152 |                     offset = count * bytes_per_frame
153 |                     f.seek(offset, 0)
154 |                     frame = f.read(frame_height * frame_width)  # read the Y plane only
155 |                     frame = np.frombuffer(frame, "uint8")
156 |                     frame = frame.astype('float32') / 255.
157 |                     frame = frame.reshape(1, 1, frame_height, frame_width)
158 |                     ret.append(frame)
159 |                 count += 1
160 | 
161 |         ret = np.concatenate(ret, axis=1)
162 |         ret = torch.from_numpy(np.asarray(ret))
163 | 
164 |         return ret
165 | 
166 |     def load_encode(self, file_path, frame_height, frame_width, stride_t, start=0):
167 |         r"""
168 |         Load frames on-demand from an encoded bitstream
169 | 
170 |         args:
171 |             file_path (str): path to the encoded video file
172 |             frame_height
173 |             frame_width
174 |             stride_t (int): sample the 1st frame from every stride_t frames
175 |             start (int): index of the 1st sampled frame
176 |         return:
177 |             ret (array): contains sampled frames. dim = (C, D, H, W)
178 |         """
179 | 
180 |         enc_path = file_path
181 |         enc_name = re.split('/', enc_path)[-1]
182 | 
183 |         yuv_name = enc_name.replace('.mp4', '.yuv')
184 |         yuv_path = os.path.join('/dockerdata/tmp/', yuv_name)
185 |         cmd = "ffmpeg -y -i {src} -f rawvideo -pix_fmt yuv420p -vsync 0 -an {dst}".format(src=enc_path, dst=yuv_path)
186 |         subprocess.run(cmd, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
187 | 
188 |         ret = self.load_yuv(yuv_path, frame_height, frame_width, stride_t, start=0)
189 | 
190 |         return ret
191 | 
192 |     def __len__(self):
193 |         return len(self.dis)
194 | 
195 | 
196 | if __name__ == '__main__':
197 | 
198 |     root_dir = os.path.dirname(os.path.realpath(__file__))
199 |     subj_score_file = os.path.join(root_dir, 'csiq_subj_score.json')
200 |     video_dir = '/dockerdata/CSIQ_YUV'
201 |     csiq_dataset = VideoDataset(subj_score_file, video_dir)
202 |     print(len(csiq_dataset))
203 | 
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import torch
4 | import json
5 | import numpy as np
6 | import torch.nn as nn
7 | from dataset.dataset import VideoDataset
8 | from model.network import C3DVQANet
9 | from scipy.stats import spearmanr, pearsonr
10 | from opts import parse_opts
11 | from tool.draw import mos_scatter
12 | 
13 | def test_model(model, device, criterion, dataloaders):
14 | 
15 |     phase = 'test'
16 |     model.eval()
17 | 
18 |     epoch_labels = []
19 |     epoch_preds = []
20 | 
21 |     for ref, dis, labels in dataloaders[phase]:
22 | 
23 |         ref = ref.to(device)
24 |         dis = dis.to(device)
25 |         labels = labels.to(device).float()
26 | 
27 |         # dim: [batch=1, P, C, D, H, W]
28 |         ref = ref.reshape(-1, ref.shape[2], ref.shape[3], ref.shape[4], ref.shape[5])
29 |         dis = dis.reshape(-1, dis.shape[2], dis.shape[3], dis.shape[4], dis.shape[5])
30 | 
31 |         with torch.no_grad():
32 |             preds = model(ref, dis)
33 |             preds = torch.mean(preds, 0, keepdim=True)  # average the per-segment predictions
34 | 
35 |         epoch_labels.append(labels.flatten())
36 |         epoch_preds.append(preds.flatten())
37 | 
38 |     epoch_labels = torch.cat(epoch_labels).flatten().data.cpu().numpy()
39 |     epoch_preds = torch.cat(epoch_preds).flatten().data.cpu().numpy()
40 | 
41 |     ret = {}
42 |     ret['MOS'] = epoch_labels.tolist()
43 |     ret['PRED'] = epoch_preds.tolist()
44 | 
45 |     # print(json.dumps(ret))
46 | 
47 |     epoch_rmse = np.sqrt(np.mean((epoch_labels - epoch_preds)**2))
48 |     print("{phase} RMSE: {rmse:.4f}".format(phase=phase, rmse=epoch_rmse))
49 | 
50 |     if len(epoch_labels) > 5:
51 |         epoch_plcc = pearsonr(epoch_labels, epoch_preds)[0]
52 |         epoch_srocc = spearmanr(epoch_labels, epoch_preds)[0]
53 | 
54 |         print("{phase}:\t PLCC: {plcc:.4f}\t SROCC: {srocc:.4f}".format(phase=phase, plcc=epoch_plcc, srocc=epoch_srocc))
55 | 
56 | 
57 | if __name__ == '__main__':
58 | 
59 |     opt = parse_opts()
60 | 
61 |     video_path = opt.video_dir
62 |     subj_dataset = opt.score_file_path
63 |     load_checkpoint = opt.load_model
64 |     MULTI_GPU_MODE = opt.multi_gpu
65 |     channel = opt.channel
66 |     size_x = opt.size_x
67 |     size_y = opt.size_y
68 |     stride_x = opt.stride_x
69 |     stride_y = opt.stride_y
70 | 
71 |     video_dataset = {x: VideoDataset(subj_dataset, video_path, x, channel, size_x, size_y, stride_x, stride_y) for x in ['test']}
72 |     dataloaders = {x: torch.utils.data.DataLoader(video_dataset[x], batch_size=1, shuffle=False, num_workers=4, drop_last=False) for x in ['test']}
73 | 
74 |     device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
75 |     checkpoint = torch.load(load_checkpoint)
76 | 
77 |     model = C3DVQANet().to(device)
78 |     model.load_state_dict(checkpoint['model_state_dict'])
79 | 
80 |     if torch.cuda.device_count() > 1 and MULTI_GPU_MODE:
81 |         device_ids = range(0, torch.cuda.device_count())
82 |         model = torch.nn.DataParallel(model, device_ids=device_ids)
83 |         print("multi-gpu mode enabled, use {0:d} gpus".format(torch.cuda.device_count()))
84 |     else:
85 |         print('use {0}'.format('cuda' if torch.cuda.is_available() else 'cpu'))
86 | 
87 |     criterion = nn.MSELoss()
88 | 
89 |     test_model(model, device, criterion, dataloaders)
90 | 
--------------------------------------------------------------------------------
/model/network.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import numpy as np
4 | 
5 | 
6 | class ResidualFrame(object):
7 | 
8 |     def __init__(self, eps=1.0):
9 |         super(ResidualFrame, self).__init__()
10 | 
11 |         self.eps = eps
12 |         self.log_255 = np.float32(2 * np.log(255.0))
13 |         self.log_max = np.float32(self.log_255 - np.log(self.eps))
14 | 
15 |     def __call__(self, x, y):
16 | 
17 |         d = torch.pow(255.0 * (x - y), 2)  # squared error on the 8-bit scale
18 |         residual = self.log_255 - torch.log(d + self.eps)  # log-inverted error map
19 |         residual = residual / self.log_max  # normalized to (0, 1]
20 | 
21 |         return residual
22 | 
23 | 
24 | class DownsampleConv3D(nn.Module):
25 |     r"""
26 |     Downsample by 2 over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes
27 | 
28 |     args:
29 |         in_channels (int): number of channels in the input tensor
30 |         out_channels (int): number of channels produced by the convolution
31 |         kernel_size (int or tuple): size of the convolution kernel
32 |         stride (int or tuple): stride
33 |         padding (int or tuple): zero-padding
34 |         bias (bool, optional): whether to add a learnable bias
35 |     """
36 | 
37 |     def __init__(self, in_channels, out_channels, kernel_size=(1, 5, 5), stride=(1, 2, 2), padding=(0, 2, 2), dilation=1, groups=1, bias=False):
38 |         super(DownsampleConv3D, self).__init__()
39 | 
40 |         k = np.float32([1, 4, 6, 4, 1])
41 |         k = np.outer(k, k)
42 |         k5x5 = (k/k.sum()).reshape((1, 1, 1, 5, 5))  # fixed binomial low-pass kernel
43 | 
44 |         conv1 = nn.Conv3d(in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias)
45 | 
46 |         with torch.no_grad():
47 |             conv1.weight = nn.Parameter(torch.from_numpy(k5x5))
48 | 49 | self.conv1 = conv1 50 | 51 | def forward(self, x): 52 | 53 | x = self.conv1(x) 54 | 55 | return x 56 | 57 | 58 | class UpsampleConv3D(nn.Module): 59 | r""" 60 | Upsample by 2 over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 61 | 62 | args: 63 | in_channels (int): number of channels in the input tensor 64 | out_channels (int): number of channels produced by the convolution 65 | kernel_size (int or tuple): size of the convolution kernel 66 | stride (int or tuple): stride 67 | padding (int or tuple): zero-padding 68 | bias (bool, optional): whether to add a learnable bias 69 | """ 70 | 71 | def __init__(self, in_channels, out_channels, kernel_size=(1, 5, 5), stride=(1, 2, 2), padding=(0, 2, 2), dilation=1, groups=1, bias=False): 72 | super(UpsampleConv3D, self).__init__() 73 | 74 | k = np.float32([1, 4, 6, 4, 1]) 75 | k = np.outer(k, k) 76 | k5x5 = (k/k.sum()).reshape((1, 1, 1, 5, 5)) 77 | 78 | conv1 = nn.ConvTranspose3d(in_channels, out_channels, kernel_size, stride, padding, output_padding=(0, 1, 1), bias=bias) 79 | 80 | with torch.no_grad(): 81 | conv1.weight = nn.Parameter(torch.from_numpy(k5x5)) 82 | 83 | self.conv1 = conv1 84 | 85 | def forward(self, x): 86 | 87 | x = self.conv1(x) 88 | 89 | return x 90 | 91 | 92 | class SpatialConv3D(nn.Module): 93 | r""" 94 | Apply 3D conv. over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 95 | 96 | args: 97 | in_channels (int): number of channels in the input tensor 98 | out_channels (int): number of channels produced by the convolution 99 | kernel_size (int or tuple): size of the convolution kernel 100 | stride (int or tuple): stride 101 | padding (int or tuple): zero-padding 102 | """ 103 | 104 | def __init__(self, in_channels, out_channels, kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)): 105 | super(SpatialConv3D, self).__init__() 106 | 107 | self.conv1 = nn.Conv3d(in_channels, 16, kernel_size, stride, padding) 108 | self.reLu1 = nn.LeakyReLU(inplace=True) 109 | self.conv2 = nn.Conv3d(16, out_channels, kernel_size, stride, padding) 110 | self.reLu2 = nn.LeakyReLU(inplace=True) 111 | 112 | def forward(self, x): 113 | 114 | x = self.conv1(x) 115 | x = self.reLu1(x) 116 | x = self.conv2(x) 117 | x = self.reLu2(x) 118 | 119 | return x 120 | 121 |
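# Shape sanity check for SpatialConv3D (an illustrative sketch; both convs use
# stride (1, 2, 2), so H and W are halved twice while the temporal depth D is preserved):
#
#     x = torch.randn(2, 1, 8, 112, 112)      # (N, C, D, H, W)
#     y = SpatialConv3D(1, 16)(x)
#     assert y.shape == (2, 16, 8, 28, 28)    # 112 -> 56 -> 28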
122 | class SpatialTemporalConv3D(nn.Module): 123 | r""" 124 | Apply 3D conv. over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 125 | 126 | args: 127 | in_channels (int): number of channels in the input tensor 128 | out_channels (int): number of channels produced by the convolution 129 | kernel_size (int or tuple): size of the convolution kernel 130 | stride (int or tuple): stride 131 | padding (int or tuple): zero-padding 132 | """ 133 | 134 | def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1): 135 | super(SpatialTemporalConv3D, self).__init__() 136 | 137 | self.conv1 = nn.Conv3d(in_channels, 64, kernel_size, stride, padding) 138 | self.relu1 = nn.LeakyReLU(inplace=True) 139 | self.conv2 = nn.Conv3d(64, 64, kernel_size, stride, padding) 140 | self.relu2 = nn.LeakyReLU(inplace=True) 141 | self.conv3 = nn.Conv3d(64, 32, kernel_size, stride, padding) 142 | self.relu3 = nn.LeakyReLU(inplace=True) 143 | self.conv4 = nn.Conv3d(32, out_channels, kernel_size, stride, padding) 144 | self.relu4 = nn.LeakyReLU(inplace=True) 145 | 146 | def forward(self, x): 147 | 148 | x = self.conv1(x) 149 | x = self.relu1(x) 150 | x = self.conv2(x) 151 | x = self.relu2(x) 152 | x = self.conv3(x) 153 | x = self.relu3(x) 154 | x = self.conv4(x) 155 | x = self.relu4(x) 156 | 157 | return x 158 | 159 | 160 | class C3DVQANet(nn.Module): 161 | 162 | def __init__(self): 163 | super(C3DVQANet, self).__init__() 164 | 165 | self.diff = ResidualFrame(eps=1.0) 166 | 167 | self.conv1_1 = DownsampleConv3D(1, 1) 168 | self.conv1_2 = UpsampleConv3D(1, 1) 169 | 170 | self.conv2_1 = SpatialConv3D(1, 16) 171 | self.conv2_2 = SpatialConv3D(1, 16) 172 | 173 | self.conv3 = SpatialTemporalConv3D(32, 1) 174 | 175 | self.pool = nn.AdaptiveAvgPool3d(1) 176 | 177 | self.fc1 = nn.Linear(1, 4) 178 | self.relu1 = nn.LeakyReLU(inplace=True) 179 | self.fc2 = nn.Linear(4, 1) 180 | self.relu2 = nn.LeakyReLU(inplace=True) 181 | 182 | def forward(self, ref, dis): 183 | 184 | err1 = self.diff(ref, dis) # normalized log-difference between ref and dis 185 | 186 | err2 = self.conv1_1(err1) # 112x112 -> 56x56 187 | err2 = self.conv1_1(err2) # 56x56 -> 28x28 188 | 189 | err3 = self.conv2_1(err1) # 112x112 -> 28x28 (SpatialConv3D halves H and W twice) 190 | 191 | lo = dis # low-pass copy of dis: downsample 3x, then upsample 3x 192 | for i in range(3): 193 | lo = self.conv1_1(lo) 194 | for i in range(3): 195 | lo = self.conv1_2(lo) 196 | dis = dis - lo # keep only the high-frequency detail 197 | 198 | dis = self.conv2_2(dis) # 112x112 -> 28x28 (SpatialConv3D halves H and W twice) 199 | 200 | sens = torch.cat([dis, err3], dim=1) 201 | sens = self.conv3(sens) # fuse detail and error into a sensitivity map 202 | 203 | res = err2 * sens # weight the downsampled error by the learned sensitivity 204 | res = res[:, :, :, 4:-4, 4:-4] # crop a 4-pixel border before pooling 205 | res = self.pool(res) 206 | 207 | res = self.fc1(res) 208 | res = self.relu1(res) 209 | res = self.fc2(res) 210 | res = self.relu2(res) 211 | res = torch.squeeze(res) 212 | 213 | return res 214 | -------------------------------------------------------------------------------- /opts.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | def parse_opts(): 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('--video_dir', default='/dockerdata/CSIQ_YUV', type=str, help='Path to input videos') 7 | parser.add_argument('--score_file_path', default='./dataset/csiq_subj_score.json', type=str, help='Path to input subjective score') 8 | parser.add_argument('--load_model', default='', type=str, help='Path to load checkpoint') 9 | parser.add_argument('--save_model', default='./save/model_csiq.pt', type=str, help='Path to save checkpoint') 10 | parser.add_argument('--log_file_name', default='./log/run.log', type=str, help='Path to save log')
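# With the patch defaults below (size 112, stride 80), neighbouring patches overlap
# by 112 - 80 = 32 pixels; eval.py and train.py then average the per-patch network
# outputs back into a single score per video (see torch.mean(preds, 0, ...) there).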
11 | 12 | parser.add_argument('--channel', default=1, type=int, help='channel number of input data, 1 for Y channel, 3 for YUV') 13 | parser.add_argument('--size_x', default=112, type=int, help='patch size x of segment') 14 | parser.add_argument('--size_y', default=112, type=int, help='patch size y of segment') 15 | parser.add_argument('--stride_x', default=80, type=int, help='patch stride x between segments') 16 | parser.add_argument('--stride_y', default=80, type=int, help='patch stride y between segments') 17 | 18 | parser.add_argument('--learning_rate', default=3e-4, type=float, help='learning rate') 19 | parser.add_argument('--weight_decay', default=1e-2, type=float, help='L2 regularization') 20 | parser.add_argument('--epochs', default=20, type=int, help='epochs to train') 21 | parser.add_argument('--multi_gpu', action='store_true', help='whether to use all GPUs') 22 | 23 | args = parser.parse_args() 24 | 25 | return args 26 | 27 | if __name__ == '__main__': 28 | 29 | args = parse_opts() 30 | print(args) 31 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==0.9.0 2 | cachetools==4.1.0 3 | certifi==2020.4.5.1 4 | chardet==3.0.4 5 | cycler==0.10.0 6 | google-auth==1.13.1 7 | google-auth-oauthlib==0.4.1 8 | grpcio==1.28.1 9 | idna==2.9 10 | kiwisolver==1.1.0 11 | Markdown==3.2.1 12 | matplotlib==3.0.3 13 | numpy==1.18.1 14 | oauthlib==3.1.0 15 | protobuf==3.11.3 16 | pyasn1==0.4.8 17 | pyasn1-modules==0.2.8 18 | pyparsing==2.4.6 19 | python-dateutil==2.8.1 20 | requests==2.23.0 21 | requests-oauthlib==1.3.0 22 | rsa==4.0 23 | scipy==1.4.1 24 | six==1.14.0 25 | tensorboard==2.2.0 26 | tensorboard-plugin-wit==1.6.0.post3 27 | torch==1.4.0 28 | tqdm==4.43.0 29 | urllib3==1.25.8 30 | Werkzeug==1.0.1 31 | -------------------------------------------------------------------------------- /save/model_videoset_v3.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tencent/DVQA/21727333a6b41d54ad1a8beca1fcbe00a69ed347/save/model_videoset_v3.pt -------------------------------------------------------------------------------- /scripts/eval.sh: -------------------------------------------------------------------------------- 1 | python ./eval.py --multi_gpu --video_dir /apdcephfs/private_tommyhqwang/YUV/PGC_YUV --score_file_path ./dataset/VIDEOSET/videoset_subj_score_v2.json --load_model ./save/model_videoset_v3.pt_88 2 | -------------------------------------------------------------------------------- /scripts/ft.sh: -------------------------------------------------------------------------------- 1 | python ./train.py --multi_gpu --video_dir yuv_dir --score_file_path ./dataset/*.json --load_model ./save/*.pt --save_model ./save/*.pt --size_x 112 --size_y 112 --stride_x 80 --stride_y 80 --learning_rate 3e-3 --weight_decay 1e-2 --epochs 100 2 | -------------------------------------------------------------------------------- /scripts/train.sh: -------------------------------------------------------------------------------- 1 | python ./train.py --multi_gpu --video_dir /apdcephfs/private_tommyhqwang/YUV/PGC_YUV --score_file_path ./dataset/VIDEOSET/videoset_subj_score_v2.json --save_model ./save/model_videoset_v3.pt --log_file_name ./log/videoset_v3.log --size_x 112 --size_y 112 --stride_x 80 --stride_y 80 --learning_rate 3e-4 --weight_decay 1e-2 --epochs 100 2 |
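# Note: train.py appends the epoch index to every checkpoint it saves
# ('{pt}_{epoch}'.format(...)), so training with --save_model ./save/model_videoset_v3.pt
# produces files like ./save/model_videoset_v3.pt_88, which is the file eval.sh above loads.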
-------------------------------------------------------------------------------- /tool/decode_stream.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import re 4 | 5 | 6 | def decode_bitstreams(src_dir, dst_dir, rule=None): 7 | 8 | cmd_to_exec = [] 9 | for dirpath, dirs, files in os.walk(src_dir, topdown=False): 10 | for f in files: 11 | if not f.startswith('.') and f.endswith('.mp4'): 12 | src_path = os.path.join(dirpath, f) # use the directory being walked, so files in sub-folders resolve correctly 13 | dst_path = os.path.join(dst_dir, re.sub(r'\.mp4$', '.yuv', f)) # escape the dot and anchor the match at the end of the name 14 | print('decoding {f}'.format(f=src_path)) 15 | cmd = "ffmpeg -i " + src_path + " -f rawvideo -pix_fmt yuv420p -an -vsync 0 -y " + dst_path 16 | cmd_to_exec.append(cmd) 17 | print(cmd) 18 | 19 | with open('decoding.sh', 'wt') as f: 20 | f.write("#!/bin/sh\n\n") 21 | for item in cmd_to_exec: 22 | f.write("%s\n" % item) 23 | 24 | 25 | if __name__ == "__main__": 26 | 27 | src_dir = sys.argv[1] 28 | dst_dir = sys.argv[2] 29 | 30 | decode_bitstreams(src_dir, dst_dir) 31 | -------------------------------------------------------------------------------- /tool/draw.py: -------------------------------------------------------------------------------- 1 | import matplotlib 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | 5 | 6 | def mos_scatter(mos, pred): 7 | 8 | fig = plt.figure(figsize=(48,48)) 9 | plt.scatter(mos, pred, alpha=0.8) 10 | plt.xlabel('MOS') 11 | plt.ylabel('PRED') 12 | plt.plot([0, 100], [0, 100]) # identity line: perfect predictions fall on it 13 | 14 | return fig 15 | 16 | 17 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import json 4 | import numpy as np 5 | import logging 6 | import torch 7 | import torch.nn as nn 8 | from torch.utils.tensorboard import SummaryWriter 9 | from tqdm import tqdm 10 | from scipy.stats import spearmanr, pearsonr 11 | from opts import parse_opts 12 | from model.network import C3DVQANet 13 | from dataset.dataset import VideoDataset 14 | from tool.draw import mos_scatter 15 | 16 | writer = SummaryWriter() 17 | 18 | def train_model(model, device, criterion, optimizer, scheduler, dataloaders, save_checkpoint, epoch_resume=1, num_epochs=25): 19 | 20 | for epoch in tqdm(range(epoch_resume, num_epochs+epoch_resume), unit='epoch', initial=epoch_resume, total=num_epochs+epoch_resume): 21 | for phase in ['train', 'test']: 22 | epoch_labels = [] 23 | epoch_preds = [] 24 | epoch_loss = 0.0 25 | epoch_size = 0 26 | 27 | if phase == 'train': 28 | model.train() 29 | else: 30 | model.eval() 31 | 32 | for ref, dis, labels in dataloaders[phase]: 33 | ref = ref.to(device) 34 | dis = dis.to(device) 35 | labels = labels.to(device).float() 36 | 37 | ref = ref.reshape(-1, ref.shape[2], ref.shape[3], ref.shape[4], ref.shape[5]) 38 | dis = dis.reshape(-1, dis.shape[2], dis.shape[3], dis.shape[4], dis.shape[5]) 39 | 40 | optimizer.zero_grad() 41 | 42 | with torch.set_grad_enabled(phase == 'train'): 43 | preds = model(ref, dis) 44 | preds = torch.mean(preds, 0, keepdim=True) 45 | loss = criterion(preds, labels) 46 | 47 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: # MULTI_GPU_MODE is a module-level flag set in __main__ 48 | loss = torch.mean(loss) 49 | 50 | if phase == 'train': 51 | loss.backward() 52 | optimizer.step() 53 | 54 | epoch_loss += loss.item() * labels.size(0) 55 | epoch_size += labels.size(0) 56 | epoch_labels.append(labels.flatten()) 57 | epoch_preds.append(preds.flatten()) 58 | 59 | epoch_loss = epoch_loss / epoch_size 60 |
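# epoch_loss above is a running sum of loss.item() weighted by batch size, so the
# division by epoch_size yields a mean per video; a tiny illustrative check
# (a sketch, not repo code; with batch_size=1 every weight is 1):
#
#     losses, sizes = [0.5, 0.3], [1, 1]
#     mean = sum(l * s for l, s in zip(losses, sizes)) / sum(sizes)
#     assert abs(mean - 0.4) < 1e-12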
61 | if phase == 'train': 62 | scheduler.step(epoch_loss) 63 | 64 | epoch_labels = torch.cat(epoch_labels).flatten().data.cpu().numpy() 65 | epoch_preds = torch.cat(epoch_preds).flatten().data.cpu().numpy() 66 | 67 | logging.info('epoch_labels: {}'.format(epoch_labels)) 68 | logging.info('epoch_preds: {}'.format(epoch_preds)) 69 | 70 | epoch_plcc = pearsonr(epoch_labels, epoch_preds)[0] 71 | epoch_srocc = spearmanr(epoch_labels, epoch_preds)[0] 72 | epoch_rmse = np.sqrt(np.mean((epoch_labels - epoch_preds)**2)) 73 | 74 | logging.info("{phase}-Loss: {loss:.4f}\t RMSE: {rmse:.4f}\t PLCC: {plcc:.4f}\t SROCC: {srocc:.4f}".format(phase=phase, loss=epoch_loss, rmse=epoch_rmse, plcc=epoch_plcc, srocc=epoch_srocc)) 75 | 76 | if phase == 'train': 77 | writer.add_scalar('Loss/train', epoch_loss, epoch) 78 | writer.add_scalar('RMSE/train', epoch_rmse, epoch) 79 | writer.add_scalar('PLCC/train', epoch_plcc, epoch) 80 | writer.add_scalar('SROCC/train', epoch_srocc, epoch) 81 | else: 82 | writer.add_scalar('Loss/test', epoch_loss, epoch) 83 | writer.add_scalar('RMSE/test', epoch_rmse, epoch) 84 | writer.add_scalar('PLCC/test', epoch_plcc, epoch) 85 | writer.add_scalar('SROCC/test', epoch_srocc, epoch) 86 | writer.add_figure('Pred vs. MOS', mos_scatter(epoch_labels, epoch_preds), epoch) 87 | 88 | if phase == 'test' and save_checkpoint: 89 | _checkpoint = '{pt}_{epoch}'.format(pt=save_checkpoint, epoch=epoch) # one checkpoint per epoch 90 | torch.save({'epoch': epoch, 'model_state_dict': model.module.state_dict() if isinstance(model, torch.nn.DataParallel) else model.state_dict(), 'optimizer_state_dict': optimizer.state_dict()}, _checkpoint) # model.module only exists under DataParallel 91 | 92 | 93 | if __name__ == '__main__': 94 | 95 | opt = parse_opts() 96 | 97 | video_path = opt.video_dir 98 | subj_dataset = opt.score_file_path 99 | save_checkpoint = opt.save_model 100 | load_checkpoint = opt.load_model 101 | log_file_name = opt.log_file_name 102 | LEARNING_RATE = opt.learning_rate 103 | L2_REGULARIZATION = opt.weight_decay 104 | NUM_EPOCHS = opt.epochs 105 | MULTI_GPU_MODE = opt.multi_gpu 106 | channel = opt.channel 107 | size_x = opt.size_x 108 | size_y = opt.size_y 109 | stride_x = opt.stride_x 110 | stride_y = opt.stride_y 111 | 112 | logging.basicConfig(filename=log_file_name, filemode='w', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.DEBUG) 113 | logging.info('OK parse options') 114 | 115 | video_dataset = {x: VideoDataset(subj_dataset, video_path, x, channel, size_x, size_y, stride_x, stride_y) for x in ['train', 'test']} 116 | dataloaders = {x: torch.utils.data.DataLoader(video_dataset[x], batch_size=1, shuffle=True, num_workers=8, drop_last=True) for x in ['train', 'test']} 117 | 118 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 119 | 120 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: 121 | device_ids = range(0, torch.cuda.device_count()) 122 | model = torch.nn.DataParallel(C3DVQANet().to(device), device_ids=device_ids) 123 | logging.info("multi-gpu mode enabled, using {0:d} GPUs".format(torch.cuda.device_count())) 124 | else: 125 | model = C3DVQANet().to(device) 126 | logging.info('use {0}'.format('cuda' if torch.cuda.is_available() else 'cpu')) 127 | 128 | criterion = nn.MSELoss() 129 | optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=L2_REGULARIZATION) 130 | scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.9, patience=5) 131 | epoch_resume = 1 132 | 133 | if os.path.exists(load_checkpoint): 134 | checkpoint = torch.load(load_checkpoint) 135 | logging.info("loading checkpoint") 136 |
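# nn.DataParallel registers the wrapped network under the attribute .module and
# prefixes every state_dict key with 'module.', so a checkpoint has to be loaded
# through model.module in multi-GPU mode; the branch below mirrors the save path
# in train_model above.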
137 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: 138 | model.module.load_state_dict(checkpoint['model_state_dict']) 139 | else: 140 | model.load_state_dict(checkpoint['model_state_dict']) 141 | 142 | optimizer.load_state_dict(checkpoint['optimizer_state_dict']) 143 | epoch_resume = checkpoint['epoch'] + 1 # the stored epoch has already finished, so resume at the next one 144 | 145 | train_model(model, device, criterion, optimizer, scheduler, dataloaders, save_checkpoint, epoch_resume, num_epochs=NUM_EPOCHS) 146 | --------------------------------------------------------------------------------