├── .gitignore
├── CONTRIBUTING.md
├── License.txt
├── README.md
├── dataset
│   ├── CSIQ
│   │   ├── csiq_video_quality_data.txt
│   │   ├── csiq_video_quality_seqs.txt
│   │   └── prep_csiq_score.py
│   ├── LIVE
│   │   ├── live_video_quality_data.txt
│   │   ├── live_video_quality_seqs.txt
│   │   └── prep_live_score.py
│   ├── NFLX
│   │   ├── NFLX_dataset_public.py
│   │   └── prep_NFLX_score.py
│   ├── VIDEOSET
│   │   └── videoset_subj_score_v2.json
│   └── dataset.py
├── eval.py
├── model
│   └── network.py
├── opts.py
├── requirements.txt
├── save
│   └── model_videoset_v3.pt
├── scripts
│   ├── eval.sh
│   ├── ft.sh
│   └── train.sh
├── tool
│   ├── decode_stream.py
│   └── draw.py
└── train.py
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | /runs
3 | /save/*
4 | !/save/model_videoset_v3.pt
5 | /log
6 | 
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ## Contributing to DVQA
2 | 
3 | Issues and pull requests are welcome.
4 | 
--------------------------------------------------------------------------------
/License.txt:
--------------------------------------------------------------------------------
1 | Tencent is pleased to support the open source community by making DVQA available.
2 | 
3 | Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved.
4 | 
5 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following
6 | conditions are met:
7 | 
8 | 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following
9 | disclaimer.
10 | 
11 | 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
12 | disclaimer in the documentation and/or other materials provided with the distribution.
13 | 
14 | 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products
15 | derived from this software without specific prior written permission.
16 | 
17 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
18 | BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
19 | SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
20 | CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
21 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
22 | TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
23 | POSSIBILITY OF SUCH DAMAGE.
24 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DVQA - Deep learning-based Video Quality Assessment
2 | 
3 | ## News
4 | 
5 | - 12/17/2019: added a pretrained model for PGC videos
6 | 
7 | ## Installation
8 | 
9 | We recommend running the code in a virtualenv. The code is developed with Python 3.
10 | 
11 | After activating the virtualenv, install the remaining prerequisites with the following command.
12 | 
13 | ```
14 | pip install -r requirements.txt
15 | ```
16 | All packages are required to run the code.
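A typical setup on Linux looks like the following; the environment name `env` is only an illustrative choice.

```
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
```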
17 | 
18 | ## Dataset
19 | 
20 | Please prepare a dataset if you want to evaluate in batches or train the model from scratch on your own GPUs. The dataset should be in JSON format, e.g. your\_dataset.json (a sketch of a script that generates such a file is given at the end of this README):
21 | 
22 | ```
23 | {
24 |     "test": {
25 |         "dis": ["dis_1.yuv", "dis_2.yuv"],
26 |         "ref": ["ref_1.yuv", "ref_2.yuv"],
27 |         "fps": [30, 24],
28 |         "mos": [94.2, 55.8],
29 |         "height": [1080, 720],
30 |         "width": [1920, 1280]
31 |     },
32 |     "train": {
33 |         "dis": ["dis_3.yuv", "dis_4.yuv"],
34 |         "ref": ["ref_3.yuv", "ref_4.yuv"],
35 |         "fps": [50, 24],
36 |         "mos": [85.2, 51.8],
37 |         "height": [320, 720],
38 |         "width": [640, 1280]
39 |     }
40 | }
41 | ```
42 | For the time being, only YUV input is supported. We will add modules to read bitstreams.
43 | 
44 | ## Evaluate a dataset
45 | 
46 | Put all YUV files (both dis and ref) in one folder and prepare your_dataset.json accordingly. Activate the virtualenv and run:
47 | 
48 | ```
49 | python eval.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --load_model ./save/model_pgc.pt
50 | ```
51 | 
52 | ## Train from scratch
53 | 
54 | Prepare a dataset as above and simply run:
55 | 
56 | ```
57 | python train.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --save_model ./save/your_new_trained.pt
58 | ```
59 | Please check train.sh and opts.py if you would like to tweak other hyper-parameters.
60 | 
61 | ## Known issues
62 | 
63 | The pretrained model was trained on 720p PGC videos compressed with H.264/AVC. It works well on videos with a resolution of 1920x1080 and below.
64 | 
65 | We are not sure about the performance in the following scenarios:
66 | 1. PGC with other distortion types, especially time-related distortions.
67 | 2. PGC with post-processing filters, like de-noising, super-resolution, artifact reduction, etc.
68 | 3. UGC videos with pre-processing filters.
69 | 4. UGC videos compressed with common codecs.
70 | 
71 | We will try to answer the above questions. Stay tuned.
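## Appendix: generating your\_dataset.json

The snippet below is a minimal sketch, not part of the repository, of one way to assemble a your\_dataset.json in the format described above; the clip names, MOS values, frame sizes and frame rates are the placeholder values from the example and should be replaced with your own metadata.

```
import json

# Placeholder metadata copied from the README example -- substitute your own
# clips, MOS values, frame sizes and frame rates. Within each split, all six
# lists must have the same length (one entry per distorted clip).
dataset = {
    "test": {
        "dis": ["dis_1.yuv", "dis_2.yuv"],
        "ref": ["ref_1.yuv", "ref_2.yuv"],
        "fps": [30, 24],
        "mos": [94.2, 55.8],
        "height": [1080, 720],
        "width": [1920, 1280],
    },
    "train": {
        "dis": ["dis_3.yuv", "dis_4.yuv"],
        "ref": ["ref_3.yuv", "ref_4.yuv"],
        "fps": [50, 24],
        "mos": [85.2, 51.8],
        "height": [320, 720],
        "width": [640, 1280],
    },
}

# Sanity check: the per-split lists must be aligned before writing the file.
for split, fields in dataset.items():
    lengths = {key: len(values) for key, values in fields.items()}
    assert len(set(lengths.values())) == 1, (split, lengths)

with open("your_dataset.json", "w") as f:
    json.dump(dataset, f, indent=4, sort_keys=True)
```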
72 | -------------------------------------------------------------------------------- /dataset/CSIQ/csiq_video_quality_data.txt: -------------------------------------------------------------------------------- 1 | 77.03659589 10.68229334 2 | 46.05828638 11.67556088 3 | 16.98977584 8.975408634 4 | 47.87233568 12.90558422 5 | 59.52295226 10.02854956 6 | 63.53092857 10.73559013 7 | 23.22137353 9.633084011 8 | 39.18686526 11.99896188 9 | 64.10516994 11.11996791 10 | 27.95086987 9.43386689 11 | 43.10871603 9.392493285 12 | 56.94803765 8.62708836 13 | 66.23185483 13.07712492 14 | 53.04012751 11.59132418 15 | 41.83434535 12.07339852 16 | 17.23781535 10.47034576 17 | 38.26370489 11.94198184 18 | 74.93389035 9.478265116 19 | 74.33437005 10.67530744 20 | 36.15574099 12.70897066 21 | 19.07448461 10.04962736 22 | 47.30351665 12.27800756 23 | 62.72479257 12.32255814 24 | 72.88094169 11.52198149 25 | 30.91529469 10.98965848 26 | 51.81943375 12.8503839 27 | 66.48359975 13.04969298 28 | 34.40822221 11.49189459 29 | 50.08198352 13.42102203 30 | 60.68369663 12.17457699 31 | 70.89327619 12.56400945 32 | 58.08476845 12.04455283 33 | 46.97396136 13.2067548 34 | 14.48228099 12.19790959 35 | 41.61453858 13.09358172 36 | 72.40638794 11.90062763 37 | 70.60577747 11.09584103 38 | 50.94476997 10.34591532 39 | 21.83559842 10.17884888 40 | 43.62532077 11.23140795 41 | 53.88891454 11.09043 42 | 68.49752895 11.34396889 43 | 27.23487322 9.630655265 44 | 44.89801346 11.45727039 45 | 57.64004442 11.25607956 46 | 23.41833438 10.43762768 47 | 50.18519535 11.51447251 48 | 58.36419382 11.38001498 49 | 65.78999421 13.90534333 50 | 53.5645593 12.63323682 51 | 37.6104527 10.03583007 52 | 19.73899899 9.910142045 53 | 50.35526658 12.11352316 54 | 69.59854381 11.33558692 55 | 62.87918978 11.41159937 56 | 33.85061428 13.54285106 57 | 14.92923679 6.828468632 58 | 35.11641687 10.43713833 59 | 52.4982942 10.03890222 60 | 66.25458562 9.139532485 61 | 27.63248197 9.302392498 62 | 45.15664539 9.434347477 63 | 64.43225628 7.952784986 64 | 27.05051504 9.326012858 65 | 48.15232603 8.41645693 66 | 66.51605678 8.095660709 67 | 66.867476 11.17665978 68 | 58.2501235 11.19802045 69 | 40.371758 11.97980241 70 | 16.86594618 9.316591046 71 | 51.11946157 9.534721293 72 | 73.61803456 9.139200391 73 | 80.24697721 8.114217872 74 | 51.52948607 12.85108112 75 | 27.38247179 9.107548075 76 | 41.03723316 13.0922217 77 | 52.44280478 13.01991713 78 | 64.34347279 8.270155808 79 | 34.09030457 9.096970961 80 | 52.25923635 9.005787357 81 | 67.07804333 9.447486342 82 | 33.97223008 8.9624807 83 | 54.37509447 10.30530786 84 | 63.59901217 9.302487853 85 | 68.82679581 11.81049835 86 | 55.33105232 12.76696602 87 | 44.32098298 10.0288545 88 | 15.32108178 8.188513903 89 | 43.19870312 15.83002924 90 | 70.90403793 13.0103436 91 | 53.17354596 11.59011479 92 | 34.04796718 10.02394747 93 | 17.78340253 7.4449613 94 | 37.01950616 11.11612199 95 | 46.51888825 9.612767176 96 | 60.364435 9.089100329 97 | 22.21005878 9.434656151 98 | 39.83489667 9.702219087 99 | 59.34890673 7.860333346 100 | 21.27320839 8.138559375 101 | 38.83293807 12.37495157 102 | 62.73739714 10.7909046 103 | 71.62032717 11.09895612 104 | 62.67867706 15.63234194 105 | 46.18218926 11.2882467 106 | 22.47093604 10.5649761 107 | 40.73906987 12.73258494 108 | 70.14176791 9.172305919 109 | 53.59894756 9.439882534 110 | 32.93455043 14.51621335 111 | 22.90368876 10.96273481 112 | 44.85004437 13.96989401 113 | 54.3517671 12.36393615 114 | 65.49892668 12.16442783 115 | 46.96523031 10.36704275 116 | 56.7929296 11.14494472 
117 | 66.4621166 10.45118447 118 | 30.84561652 12.57999828 119 | 54.58016742 12.10969375 120 | 72.48292161 13.96802526 121 | 68.00360471 15.1666297 122 | 63.96955028 14.97916725 123 | 43.79527584 11.12553103 124 | 17.67005082 12.09990922 125 | 41.02256363 15.78553188 126 | 71.55953909 12.15049909 127 | 75.887318 12.43790308 128 | 63.12511595 15.99820623 129 | 39.32328426 12.26853178 130 | 33.94465085 18.77570328 131 | 50.31336483 14.75787882 132 | 58.6132508 14.81632115 133 | 24.63740062 14.33033172 134 | 37.96128614 13.32445739 135 | 55.45947188 13.04601296 136 | 24.56694236 12.5741992 137 | 35.85982249 11.97389232 138 | 55.19559447 11.9475562 139 | 68.64805944 14.20983289 140 | 58.94759809 12.7464758 141 | 42.06015426 15.09457338 142 | 17.32961501 13.15849912 143 | 50.4331632 13.00156555 144 | 68.82964224 11.10942679 145 | 71.34861758 11.05052592 146 | 45.92218919 12.18816792 147 | 33.32167219 12.4417105 148 | 33.33259131 15.04303384 149 | 45.99884009 13.69187619 150 | 57.3619734 14.46617532 151 | 40.8293305 12.28396416 152 | 51.2924095 12.16139341 153 | 65.47287561 9.98770745 154 | 35.09312743 12.09728916 155 | 53.26328289 11.49583871 156 | 70.8765717 11.03342858 157 | 63.17684048 13.69483052 158 | 58.33602566 13.50846739 159 | 40.60143602 12.33208306 160 | 33.49377505 13.41019755 161 | 62.91420905 13.14503849 162 | 78.99350105 10.92069275 163 | 69.03073695 11.1086235 164 | 44.87401856 11.25229977 165 | 31.52565431 13.16048146 166 | 36.80488145 12.50257295 167 | 55.17661364 10.64543202 168 | 60.94209742 9.277534115 169 | 38.22349859 10.41976785 170 | 56.76359372 15.15577019 171 | 72.70164753 10.35491257 172 | 44.55409245 9.633922257 173 | 60.73065684 8.64755535 174 | 71.64790234 8.661880564 175 | 70.34737955 10.33569754 176 | 58.40100748 11.19599823 177 | 44.03370891 12.75386051 178 | 28.11075824 15.66130178 179 | 65.46549175 9.990327781 180 | 82.80008516 9.171781626 181 | 73.58225148 14.80830683 182 | 53.11354387 15.05058804 183 | 20.84597893 13.82228349 184 | 49.34783875 13.55053991 185 | 54.43910328 13.62899598 186 | 63.1863331 14.26525636 187 | 35.01431445 13.19332365 188 | 50.50521804 13.18986437 189 | 70.43787024 12.34679746 190 | 31.85484722 12.42544112 191 | 50.67216537 12.89324405 192 | 60.29534737 14.5254711 193 | 70.0192286 15.53322936 194 | 59.00027104 16.2270485 195 | 41.57339111 13.85798974 196 | 14.80210383 12.46163461 197 | 46.89729744 13.70177809 198 | 75.79427181 11.78725179 199 | 68.8800519 11.77675702 200 | 48.42246494 11.89779914 201 | 35.81287413 13.77085137 202 | 38.0605698 9.466687828 203 | 47.43617272 11.43293805 204 | 60.92460606 10.70242415 205 | 38.9098127 12.72318234 206 | 57.80180051 9.133746206 207 | 72.85497891 6.688103398 208 | 26.66715185 12.22327503 209 | 57.51913375 8.307480655 210 | 65.80745229 8.791192296 211 | 73.78412042 12.09659861 212 | 59.35873126 12.12424519 213 | 47.60257315 11.3749444 214 | 29.94580064 12.59422049 215 | 57.57219085 12.12436366 216 | 73.08194444 10.34255094 217 | -------------------------------------------------------------------------------- /dataset/CSIQ/csiq_video_quality_seqs.txt: -------------------------------------------------------------------------------- 1 | BQMall_832x480_dst_01.yuv 2 | BQMall_832x480_dst_02.yuv 3 | BQMall_832x480_dst_03.yuv 4 | BQMall_832x480_dst_04.yuv 5 | BQMall_832x480_dst_05.yuv 6 | BQMall_832x480_dst_06.yuv 7 | BQMall_832x480_dst_07.yuv 8 | BQMall_832x480_dst_08.yuv 9 | BQMall_832x480_dst_09.yuv 10 | BQMall_832x480_dst_10.yuv 11 | BQMall_832x480_dst_11.yuv 12 | BQMall_832x480_dst_12.yuv 13 | 
BQMall_832x480_dst_13.yuv 14 | BQMall_832x480_dst_14.yuv 15 | BQMall_832x480_dst_15.yuv 16 | BQMall_832x480_dst_16.yuv 17 | BQMall_832x480_dst_17.yuv 18 | BQMall_832x480_dst_18.yuv 19 | BQTerrace_832x480_dst_01.yuv 20 | BQTerrace_832x480_dst_02.yuv 21 | BQTerrace_832x480_dst_03.yuv 22 | BQTerrace_832x480_dst_04.yuv 23 | BQTerrace_832x480_dst_05.yuv 24 | BQTerrace_832x480_dst_06.yuv 25 | BQTerrace_832x480_dst_07.yuv 26 | BQTerrace_832x480_dst_08.yuv 27 | BQTerrace_832x480_dst_09.yuv 28 | BQTerrace_832x480_dst_10.yuv 29 | BQTerrace_832x480_dst_11.yuv 30 | BQTerrace_832x480_dst_12.yuv 31 | BQTerrace_832x480_dst_13.yuv 32 | BQTerrace_832x480_dst_14.yuv 33 | BQTerrace_832x480_dst_15.yuv 34 | BQTerrace_832x480_dst_16.yuv 35 | BQTerrace_832x480_dst_17.yuv 36 | BQTerrace_832x480_dst_18.yuv 37 | BasketballDrive_832x480_dst_01.yuv 38 | BasketballDrive_832x480_dst_02.yuv 39 | BasketballDrive_832x480_dst_03.yuv 40 | BasketballDrive_832x480_dst_04.yuv 41 | BasketballDrive_832x480_dst_05.yuv 42 | BasketballDrive_832x480_dst_06.yuv 43 | BasketballDrive_832x480_dst_07.yuv 44 | BasketballDrive_832x480_dst_08.yuv 45 | BasketballDrive_832x480_dst_09.yuv 46 | BasketballDrive_832x480_dst_10.yuv 47 | BasketballDrive_832x480_dst_11.yuv 48 | BasketballDrive_832x480_dst_12.yuv 49 | BasketballDrive_832x480_dst_13.yuv 50 | BasketballDrive_832x480_dst_14.yuv 51 | BasketballDrive_832x480_dst_15.yuv 52 | BasketballDrive_832x480_dst_16.yuv 53 | BasketballDrive_832x480_dst_17.yuv 54 | BasketballDrive_832x480_dst_18.yuv 55 | Cactus_832x480_dst_01.yuv 56 | Cactus_832x480_dst_02.yuv 57 | Cactus_832x480_dst_03.yuv 58 | Cactus_832x480_dst_04.yuv 59 | Cactus_832x480_dst_05.yuv 60 | Cactus_832x480_dst_06.yuv 61 | Cactus_832x480_dst_07.yuv 62 | Cactus_832x480_dst_08.yuv 63 | Cactus_832x480_dst_09.yuv 64 | Cactus_832x480_dst_10.yuv 65 | Cactus_832x480_dst_11.yuv 66 | Cactus_832x480_dst_12.yuv 67 | Cactus_832x480_dst_13.yuv 68 | Cactus_832x480_dst_14.yuv 69 | Cactus_832x480_dst_15.yuv 70 | Cactus_832x480_dst_16.yuv 71 | Cactus_832x480_dst_17.yuv 72 | Cactus_832x480_dst_18.yuv 73 | Carving_832x480_dst_01.yuv 74 | Carving_832x480_dst_02.yuv 75 | Carving_832x480_dst_03.yuv 76 | Carving_832x480_dst_04.yuv 77 | Carving_832x480_dst_05.yuv 78 | Carving_832x480_dst_06.yuv 79 | Carving_832x480_dst_07.yuv 80 | Carving_832x480_dst_08.yuv 81 | Carving_832x480_dst_09.yuv 82 | Carving_832x480_dst_10.yuv 83 | Carving_832x480_dst_11.yuv 84 | Carving_832x480_dst_12.yuv 85 | Carving_832x480_dst_13.yuv 86 | Carving_832x480_dst_14.yuv 87 | Carving_832x480_dst_15.yuv 88 | Carving_832x480_dst_16.yuv 89 | Carving_832x480_dst_17.yuv 90 | Carving_832x480_dst_18.yuv 91 | Chipmunks_832x480_dst_01.yuv 92 | Chipmunks_832x480_dst_02.yuv 93 | Chipmunks_832x480_dst_03.yuv 94 | Chipmunks_832x480_dst_04.yuv 95 | Chipmunks_832x480_dst_05.yuv 96 | Chipmunks_832x480_dst_06.yuv 97 | Chipmunks_832x480_dst_07.yuv 98 | Chipmunks_832x480_dst_08.yuv 99 | Chipmunks_832x480_dst_09.yuv 100 | Chipmunks_832x480_dst_10.yuv 101 | Chipmunks_832x480_dst_11.yuv 102 | Chipmunks_832x480_dst_12.yuv 103 | Chipmunks_832x480_dst_13.yuv 104 | Chipmunks_832x480_dst_14.yuv 105 | Chipmunks_832x480_dst_15.yuv 106 | Chipmunks_832x480_dst_16.yuv 107 | Chipmunks_832x480_dst_17.yuv 108 | Chipmunks_832x480_dst_18.yuv 109 | Flowervase_832x480_dst_01.yuv 110 | Flowervase_832x480_dst_02.yuv 111 | Flowervase_832x480_dst_03.yuv 112 | Flowervase_832x480_dst_04.yuv 113 | Flowervase_832x480_dst_05.yuv 114 | Flowervase_832x480_dst_06.yuv 115 | Flowervase_832x480_dst_07.yuv 116 | 
Flowervase_832x480_dst_08.yuv 117 | Flowervase_832x480_dst_09.yuv 118 | Flowervase_832x480_dst_10.yuv 119 | Flowervase_832x480_dst_11.yuv 120 | Flowervase_832x480_dst_12.yuv 121 | Flowervase_832x480_dst_13.yuv 122 | Flowervase_832x480_dst_14.yuv 123 | Flowervase_832x480_dst_15.yuv 124 | Flowervase_832x480_dst_16.yuv 125 | Flowervase_832x480_dst_17.yuv 126 | Flowervase_832x480_dst_18.yuv 127 | Keiba_832x480_dst_01.yuv 128 | Keiba_832x480_dst_02.yuv 129 | Keiba_832x480_dst_03.yuv 130 | Keiba_832x480_dst_04.yuv 131 | Keiba_832x480_dst_05.yuv 132 | Keiba_832x480_dst_06.yuv 133 | Keiba_832x480_dst_07.yuv 134 | Keiba_832x480_dst_08.yuv 135 | Keiba_832x480_dst_09.yuv 136 | Keiba_832x480_dst_10.yuv 137 | Keiba_832x480_dst_11.yuv 138 | Keiba_832x480_dst_12.yuv 139 | Keiba_832x480_dst_13.yuv 140 | Keiba_832x480_dst_14.yuv 141 | Keiba_832x480_dst_15.yuv 142 | Keiba_832x480_dst_16.yuv 143 | Keiba_832x480_dst_17.yuv 144 | Keiba_832x480_dst_18.yuv 145 | Kimono_832x480_dst_01.yuv 146 | Kimono_832x480_dst_02.yuv 147 | Kimono_832x480_dst_03.yuv 148 | Kimono_832x480_dst_04.yuv 149 | Kimono_832x480_dst_05.yuv 150 | Kimono_832x480_dst_06.yuv 151 | Kimono_832x480_dst_07.yuv 152 | Kimono_832x480_dst_08.yuv 153 | Kimono_832x480_dst_09.yuv 154 | Kimono_832x480_dst_10.yuv 155 | Kimono_832x480_dst_11.yuv 156 | Kimono_832x480_dst_12.yuv 157 | Kimono_832x480_dst_13.yuv 158 | Kimono_832x480_dst_14.yuv 159 | Kimono_832x480_dst_15.yuv 160 | Kimono_832x480_dst_16.yuv 161 | Kimono_832x480_dst_17.yuv 162 | Kimono_832x480_dst_18.yuv 163 | ParkScene_832x480_dst_01.yuv 164 | ParkScene_832x480_dst_02.yuv 165 | ParkScene_832x480_dst_03.yuv 166 | ParkScene_832x480_dst_04.yuv 167 | ParkScene_832x480_dst_05.yuv 168 | ParkScene_832x480_dst_06.yuv 169 | ParkScene_832x480_dst_07.yuv 170 | ParkScene_832x480_dst_08.yuv 171 | ParkScene_832x480_dst_09.yuv 172 | ParkScene_832x480_dst_10.yuv 173 | ParkScene_832x480_dst_11.yuv 174 | ParkScene_832x480_dst_12.yuv 175 | ParkScene_832x480_dst_13.yuv 176 | ParkScene_832x480_dst_14.yuv 177 | ParkScene_832x480_dst_15.yuv 178 | ParkScene_832x480_dst_16.yuv 179 | ParkScene_832x480_dst_17.yuv 180 | ParkScene_832x480_dst_18.yuv 181 | PartyScene_832x480_dst_01.yuv 182 | PartyScene_832x480_dst_02.yuv 183 | PartyScene_832x480_dst_03.yuv 184 | PartyScene_832x480_dst_04.yuv 185 | PartyScene_832x480_dst_05.yuv 186 | PartyScene_832x480_dst_06.yuv 187 | PartyScene_832x480_dst_07.yuv 188 | PartyScene_832x480_dst_08.yuv 189 | PartyScene_832x480_dst_09.yuv 190 | PartyScene_832x480_dst_10.yuv 191 | PartyScene_832x480_dst_11.yuv 192 | PartyScene_832x480_dst_12.yuv 193 | PartyScene_832x480_dst_13.yuv 194 | PartyScene_832x480_dst_14.yuv 195 | PartyScene_832x480_dst_15.yuv 196 | PartyScene_832x480_dst_16.yuv 197 | PartyScene_832x480_dst_17.yuv 198 | PartyScene_832x480_dst_18.yuv 199 | Timelapse_832x480_dst_01.yuv 200 | Timelapse_832x480_dst_02.yuv 201 | Timelapse_832x480_dst_03.yuv 202 | Timelapse_832x480_dst_04.yuv 203 | Timelapse_832x480_dst_05.yuv 204 | Timelapse_832x480_dst_06.yuv 205 | Timelapse_832x480_dst_07.yuv 206 | Timelapse_832x480_dst_08.yuv 207 | Timelapse_832x480_dst_09.yuv 208 | Timelapse_832x480_dst_10.yuv 209 | Timelapse_832x480_dst_11.yuv 210 | Timelapse_832x480_dst_12.yuv 211 | Timelapse_832x480_dst_13.yuv 212 | Timelapse_832x480_dst_14.yuv 213 | Timelapse_832x480_dst_15.yuv 214 | Timelapse_832x480_dst_16.yuv 215 | Timelapse_832x480_dst_17.yuv 216 | Timelapse_832x480_dst_18.yuv 217 | -------------------------------------------------------------------------------- /dataset/CSIQ/prep_csiq_score.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | 7 | 8 | def make_score_file(): 9 | dir_path = os.path.dirname(os.path.realpath(__file__)) 10 | 11 | seq_file_name = 'csiq_video_quality_seqs.txt' 12 | score_file_name = 'csiq_video_quality_data.txt' 13 | 14 | all_scenes = ['Keiba', 'Timelapse', 'BQTerrace', 'Carving', 'Chipmunks', 15 | 'Flowervase', 'ParkScene', 'PartyScene', 'BQMall', 'Cactus', 16 | 'Kimono','BasketballDrive'] 17 | test_scene = ['BQTerrace', 'ParkScene'] 18 | framerate = { 19 | 'Chipmunks_832x480_ref.yuv': 24, 20 | 'Kimono_832x480_ref.yuv': 24, 21 | 'ParkScene_832x480_ref.yuv': 24, 22 | 'Carving_832x480_ref.yuv': 25, 23 | 'Flowervase_832x480_ref.yuv': 30, 24 | 'Keiba_832x480_ref.yuv': 30, 25 | 'Timelapse_832x480_ref.yuv': 30, 26 | 'BasketballDrive_832x480_ref.yuv': 50, 27 | 'Cactus_832x480_ref.yuv': 50, 28 | 'PartyScene_832x480_ref.yuv': 50, 29 | 'BQMall_832x480_ref.yuv': 60, 30 | 'BQTerrace_832x480_ref.yuv': 60} 31 | 32 | width = 832 33 | height = 480 34 | 35 | seqs = np.genfromtxt(os.path.join(dir_path, seq_file_name), dtype='str') 36 | score = np.genfromtxt(os.path.join(dir_path, score_file_name), dtype='float') 37 | 38 | ret = OrderedDict() 39 | ret['train'] = OrderedDict() 40 | ret['test'] = OrderedDict() 41 | 42 | trn_dis = [] 43 | trn_ref = [] 44 | trn_mos = [] 45 | trn_height = [] 46 | trn_width = [] 47 | trn_fps = [] 48 | 49 | tst_dis = [] 50 | tst_ref = [] 51 | tst_mos = [] 52 | tst_height = [] 53 | tst_width = [] 54 | tst_fps = [] 55 | 56 | for clip, mos in zip(seqs, score): 57 | clip_info = re.split('_', clip) 58 | clip_name = clip_info[0] 59 | ref = clip.replace(clip[-10:-4], 'ref') 60 | fps = framerate[ref] 61 | 62 | if clip_name in test_scene: 63 | tst_dis.append(clip) 64 | tst_ref.append(ref) 65 | tst_mos.append(100.0 - float(mos[0])) 66 | tst_height.append(height) 67 | tst_width.append(width) 68 | tst_fps.append(fps) 69 | else: 70 | trn_dis.append(clip) 71 | trn_ref.append(ref) 72 | trn_mos.append(100.0 - float(mos[0])) 73 | trn_height.append(height) 74 | trn_width.append(width) 75 | trn_fps.append(fps) 76 | 77 | ret['train']['dis'] = trn_dis 78 | ret['train']['ref'] = trn_ref 79 | ret['train']['mos'] = trn_mos 80 | ret['train']['height'] = trn_height 81 | ret['train']['width'] = trn_width 82 | ret['train']['fps'] = trn_fps 83 | 84 | ret['test']['dis'] = tst_dis 85 | ret['test']['ref'] = tst_ref 86 | ret['test']['mos'] = tst_mos 87 | ret['test']['height'] = tst_height 88 | ret['test']['width'] = tst_width 89 | ret['test']['fps'] = tst_fps 90 | 91 | with open('csiq_subj_score_{}.json'.format('_'.join(test_scene)), 'w') as f: 92 | json.dump(ret, f, indent=4, sort_keys=True) 93 | 94 | 95 | if __name__ == "__main__": 96 | 97 | make_score_file() 98 | 99 | print('Done') 100 | -------------------------------------------------------------------------------- /dataset/LIVE/live_video_quality_data.txt: -------------------------------------------------------------------------------- 1 | 44.5104 12.2909 2 | 70.1054 8.4630 3 | 66.4280 10.9220 4 | 75.1225 8.7056 5 | 73.8803 5.7825 6 | 63.2564 8.8315 7 | 61.2726 10.6827 8 | 40.5551 8.4040 9 | 52.6111 9.8646 10 | 60.2534 9.0097 11 | 68.7186 9.3995 12 | 42.9784 8.5050 13 | 51.0530 8.0119 14 | 55.7020 9.3731 15 | 65.6457 10.8023 16 | 64.9369 12.4744 17 | 46.2446 9.8897 18 | 54.3732 12.0351 19 | 46.4907 10.9136 20 | 68.1064 10.4983 21 | 54.8101 13.2412 22 | 54.6555 12.2369 23 | 39.1978 
11.7595 24 | 43.6833 12.6685 25 | 55.8563 15.2382 26 | 63.5809 12.0636 27 | 38.8828 11.0500 28 | 45.6069 14.4528 29 | 48.0089 13.7996 30 | 47.5270 11.8475 31 | 68.1431 12.0123 32 | 63.5698 12.6835 33 | 48.0196 11.2378 34 | 51.4980 13.1559 35 | 55.2291 11.2665 36 | 62.3778 12.1601 37 | 42.6909 9.5547 38 | 37.8713 9.9518 39 | 45.4363 11.9058 40 | 53.6343 13.7169 41 | 62.9934 10.0094 42 | 31.4716 8.0896 43 | 42.8568 11.4820 44 | 52.0988 8.0925 45 | 62.2062 10.8021 46 | 71.2731 7.3171 47 | 72.1356 8.2769 48 | 64.6561 8.7193 49 | 53.1125 10.2891 50 | 73.4730 11.2189 51 | 55.3531 10.7032 52 | 52.4524 9.9872 53 | 38.6726 8.7816 54 | 47.7716 8.6263 55 | 56.9119 9.3595 56 | 63.7984 7.4827 57 | 33.4734 8.8625 58 | 42.5381 12.2394 59 | 56.1328 10.0524 60 | 65.7102 10.8513 61 | 65.6522 11.8297 62 | 61.3221 11.2218 63 | 44.0305 12.3100 64 | 41.4157 10.1887 65 | 58.4534 10.2342 66 | 44.2762 10.2308 67 | 48.3834 10.8759 68 | 40.7745 10.9440 69 | 46.5633 9.3641 70 | 52.3269 11.2327 71 | 56.0811 9.9024 72 | 36.5136 10.6661 73 | 42.9632 9.5615 74 | 49.1987 12.5682 75 | 57.4200 10.8714 76 | 54.9213 9.9593 77 | 63.2756 7.0135 78 | 56.8614 10.3063 79 | 49.2987 7.9941 80 | 59.3959 8.3076 81 | 44.8094 11.1511 82 | 39.1088 8.8315 83 | 32.6002 7.5710 84 | 44.0164 9.5158 85 | 54.9423 8.8703 86 | 57.1497 10.3586 87 | 40.9999 10.2129 88 | 44.6477 9.6876 89 | 49.2215 8.2303 90 | 53.7003 8.3839 91 | 68.9412 13.2694 92 | 52.9363 10.9429 93 | 51.0109 11.6969 94 | 55.9066 12.9653 95 | 61.7965 8.9395 96 | 45.9273 12.2075 97 | 40.9576 10.0565 98 | 31.9421 10.0953 99 | 36.6396 10.2083 100 | 38.6448 9.1071 101 | 52.1844 10.8366 102 | 32.7252 11.6010 103 | 43.9984 9.6540 104 | 50.5090 8.9686 105 | 53.4364 11.3882 106 | 81.1601 8.8839 107 | 70.5494 7.0154 108 | 54.9174 10.3442 109 | 49.6350 9.8661 110 | 55.5307 8.4316 111 | 61.2837 9.8106 112 | 46.2254 8.0034 113 | 36.2440 8.7969 114 | 40.8004 8.8023 115 | 51.6153 10.4552 116 | 66.3166 10.0913 117 | 37.0212 7.4451 118 | 44.0813 9.4971 119 | 57.5757 6.4381 120 | 62.0745 6.2390 121 | 78.3431 9.9876 122 | 69.2258 8.0969 123 | 59.5299 9.8755 124 | 57.8482 10.2606 125 | 73.3075 9.0790 126 | 58.5392 11.3208 127 | 54.0963 10.0428 128 | 47.3711 10.8012 129 | 48.7705 7.7892 130 | 57.6788 9.7494 131 | 67.8232 7.4454 132 | 30.9426 8.0339 133 | 40.5326 9.8009 134 | 52.5435 9.9240 135 | 64.8173 9.5076 136 | 61.3882 10.2155 137 | 66.3322 11.1123 138 | 45.4702 7.6892 139 | 45.3150 8.6377 140 | 55.3240 6.1770 141 | 56.1730 8.7040 142 | 44.6086 10.3585 143 | 39.8067 8.2885 144 | 53.7598 9.0671 145 | 59.8921 10.6386 146 | 77.2518 8.7931 147 | 39.7105 9.5447 148 | 46.8271 10.3513 149 | 54.4239 11.2077 150 | 61.8235 11.1164 151 | -------------------------------------------------------------------------------- /dataset/LIVE/live_video_quality_seqs.txt: -------------------------------------------------------------------------------- 1 | pa2_25fps.yuv 2 | pa3_25fps.yuv 3 | pa4_25fps.yuv 4 | pa5_25fps.yuv 5 | pa6_25fps.yuv 6 | pa7_25fps.yuv 7 | pa8_25fps.yuv 8 | pa9_25fps.yuv 9 | pa10_25fps.yuv 10 | pa11_25fps.yuv 11 | pa12_25fps.yuv 12 | pa13_25fps.yuv 13 | pa14_25fps.yuv 14 | pa15_25fps.yuv 15 | pa16_25fps.yuv 16 | rb2_25fps.yuv 17 | rb3_25fps.yuv 18 | rb4_25fps.yuv 19 | rb5_25fps.yuv 20 | rb6_25fps.yuv 21 | rb7_25fps.yuv 22 | rb8_25fps.yuv 23 | rb9_25fps.yuv 24 | rb10_25fps.yuv 25 | rb11_25fps.yuv 26 | rb12_25fps.yuv 27 | rb13_25fps.yuv 28 | rb14_25fps.yuv 29 | rb15_25fps.yuv 30 | rb16_25fps.yuv 31 | rh2_25fps.yuv 32 | rh3_25fps.yuv 33 | rh4_25fps.yuv 34 | rh5_25fps.yuv 35 | rh6_25fps.yuv 36 | 
rh7_25fps.yuv 37 | rh8_25fps.yuv 38 | rh9_25fps.yuv 39 | rh10_25fps.yuv 40 | rh11_25fps.yuv 41 | rh12_25fps.yuv 42 | rh13_25fps.yuv 43 | rh14_25fps.yuv 44 | rh15_25fps.yuv 45 | rh16_25fps.yuv 46 | tr2_25fps.yuv 47 | tr3_25fps.yuv 48 | tr4_25fps.yuv 49 | tr5_25fps.yuv 50 | tr6_25fps.yuv 51 | tr7_25fps.yuv 52 | tr8_25fps.yuv 53 | tr9_25fps.yuv 54 | tr10_25fps.yuv 55 | tr11_25fps.yuv 56 | tr12_25fps.yuv 57 | tr13_25fps.yuv 58 | tr14_25fps.yuv 59 | tr15_25fps.yuv 60 | tr16_25fps.yuv 61 | st2_25fps.yuv 62 | st3_25fps.yuv 63 | st4_25fps.yuv 64 | st5_25fps.yuv 65 | st6_25fps.yuv 66 | st7_25fps.yuv 67 | st8_25fps.yuv 68 | st9_25fps.yuv 69 | st10_25fps.yuv 70 | st11_25fps.yuv 71 | st12_25fps.yuv 72 | st13_25fps.yuv 73 | st14_25fps.yuv 74 | st15_25fps.yuv 75 | st16_25fps.yuv 76 | sf2_25fps.yuv 77 | sf3_25fps.yuv 78 | sf4_25fps.yuv 79 | sf5_25fps.yuv 80 | sf6_25fps.yuv 81 | sf7_25fps.yuv 82 | sf8_25fps.yuv 83 | sf9_25fps.yuv 84 | sf10_25fps.yuv 85 | sf11_25fps.yuv 86 | sf12_25fps.yuv 87 | sf13_25fps.yuv 88 | sf14_25fps.yuv 89 | sf15_25fps.yuv 90 | sf16_25fps.yuv 91 | bs2_25fps.yuv 92 | bs3_25fps.yuv 93 | bs4_25fps.yuv 94 | bs5_25fps.yuv 95 | bs6_25fps.yuv 96 | bs7_25fps.yuv 97 | bs8_25fps.yuv 98 | bs9_25fps.yuv 99 | bs10_25fps.yuv 100 | bs11_25fps.yuv 101 | bs12_25fps.yuv 102 | bs13_25fps.yuv 103 | bs14_25fps.yuv 104 | bs15_25fps.yuv 105 | bs16_25fps.yuv 106 | sh2_50fps.yuv 107 | sh3_50fps.yuv 108 | sh4_50fps.yuv 109 | sh5_50fps.yuv 110 | sh6_50fps.yuv 111 | sh7_50fps.yuv 112 | sh8_50fps.yuv 113 | sh9_50fps.yuv 114 | sh10_50fps.yuv 115 | sh11_50fps.yuv 116 | sh12_50fps.yuv 117 | sh13_50fps.yuv 118 | sh14_50fps.yuv 119 | sh15_50fps.yuv 120 | sh16_50fps.yuv 121 | mc2_50fps.yuv 122 | mc3_50fps.yuv 123 | mc4_50fps.yuv 124 | mc5_50fps.yuv 125 | mc6_50fps.yuv 126 | mc7_50fps.yuv 127 | mc8_50fps.yuv 128 | mc9_50fps.yuv 129 | mc10_50fps.yuv 130 | mc11_50fps.yuv 131 | mc12_50fps.yuv 132 | mc13_50fps.yuv 133 | mc14_50fps.yuv 134 | mc15_50fps.yuv 135 | mc16_50fps.yuv 136 | pr2_50fps.yuv 137 | pr3_50fps.yuv 138 | pr4_50fps.yuv 139 | pr5_50fps.yuv 140 | pr6_50fps.yuv 141 | pr7_50fps.yuv 142 | pr8_50fps.yuv 143 | pr9_50fps.yuv 144 | pr10_50fps.yuv 145 | pr11_50fps.yuv 146 | pr12_50fps.yuv 147 | pr13_50fps.yuv 148 | pr14_50fps.yuv 149 | pr15_50fps.yuv 150 | pr16_50fps.yuv 151 | -------------------------------------------------------------------------------- /dataset/LIVE/prep_live_score.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | 7 | def make_score_file(): 8 | dir_path = os.path.dirname(os.path.realpath(__file__)) 9 | 10 | seq_file_name = 'live_video_quality_seqs.txt' 11 | score_file_name = 'live_video_quality_data.txt' 12 | 13 | all_scenes = ['bs', "st", 'sh', "mc", "pa", 'sf', 'rb', 'tr', 'pr', 'rh'] 14 | test_scene = ['bs', 'st'] 15 | framerate = { 16 | 'pa01_25fps.yuv': 25, 17 | 'rb01_25fps.yuv': 25, 18 | 'rh01_25fps.yuv': 25, 19 | 'tr01_25fps.yuv': 25, 20 | 'st01_25fps.yuv': 25, 21 | 'sf01_25fps.yuv': 25, 22 | 'bs01_25fps.yuv': 25, 23 | 'sh01_50fps.yuv': 50, 24 | 'mc01_50fps.yuv': 50, 25 | 'pr01_50fps.yuv': 50} 26 | 27 | width = 768 28 | height = 432 29 | 30 | seqs = np.genfromtxt(os.path.join(dir_path, seq_file_name), dtype='str') 31 | score = np.genfromtxt(os.path.join(dir_path, score_file_name), dtype='float') 32 | 33 | ret = OrderedDict() 34 | ret['train'] = OrderedDict() 35 | ret['test'] = OrderedDict() 36 | 37 | trn_dis = [] 38 | trn_ref = [] 39 | trn_mos 
= [] 40 | trn_height = [] 41 | trn_width = [] 42 | trn_fps = [] 43 | 44 | tst_dis = [] 45 | tst_ref = [] 46 | tst_mos = [] 47 | tst_height = [] 48 | tst_width = [] 49 | tst_fps = [] 50 | 51 | for clip, mos in zip(seqs, score): 52 | clip_info = re.split('_', clip) 53 | clip_name = clip_info[0][0:2] 54 | dst = int(clip_info[0][2:]) 55 | clip = '' + clip_name + '{dst:02d}'.format(dst=dst) + '_' + clip_info[-1] 56 | ref = '' + clip_name + '01' + '_' + clip_info[-1] 57 | fps = framerate[ref] 58 | 59 | if clip_name in test_scene: 60 | tst_dis.append(clip) 61 | tst_ref.append(ref) 62 | tst_mos.append(100.0 - float(mos[0])) 63 | tst_height.append(height) 64 | tst_width.append(width) 65 | tst_fps.append(fps) 66 | else: 67 | trn_dis.append(clip) 68 | trn_ref.append(ref) 69 | trn_mos.append(100.0 - float(mos[0])) 70 | trn_height.append(height) 71 | trn_width.append(width) 72 | trn_fps.append(fps) 73 | 74 | ret['train']['dis'] = trn_dis 75 | ret['train']['ref'] = trn_ref 76 | ret['train']['mos'] = trn_mos 77 | ret['train']['height'] = trn_height 78 | ret['train']['width'] = trn_width 79 | ret['train']['fps'] = trn_fps 80 | 81 | ret['test']['dis'] = tst_dis 82 | ret['test']['ref'] = tst_ref 83 | ret['test']['mos'] = tst_mos 84 | ret['test']['height'] = tst_height 85 | ret['test']['width'] = tst_width 86 | ret['test']['fps'] = tst_fps 87 | 88 | with open('live_subj_score_{}.json'.format('_'.join(test_scene)), 'w') as f: 89 | json.dump(ret, f, indent=4, sort_keys=True) 90 | 91 | 92 | if __name__ == "__main__": 93 | 94 | make_score_file() 95 | 96 | print('Done') 97 | -------------------------------------------------------------------------------- /dataset/NFLX/NFLX_dataset_public.py: -------------------------------------------------------------------------------- 1 | dataset_name = 'NFLX_public' 2 | yuv_fmt = 'yuv420p' 3 | width = 1920 4 | height = 1080 5 | 6 | ref_dir = '[path to dataset videos]/ref' 7 | dis_dir = '[path to dataset videos]/dis' 8 | 9 | ref_videos = [ 10 | {'content_id': 0, 11 | 'content_name': 'BigBuckBunny', 12 | 'path': ref_dir + '/BigBuckBunny_25fps.yuv'}, 13 | {'content_id': 1, 14 | 'content_name': 'BirdsInCage', 15 | 'path': ref_dir + '/BirdsInCage_30fps.yuv'}, 16 | {'content_id': 2, 17 | 'content_name': 'CrowdRun', 18 | 'path': ref_dir + '/CrowdRun_25fps.yuv'}, 19 | {'content_id': 3, 20 | 'content_name': 'ElFuente1', 21 | 'path': ref_dir + '/ElFuente1_30fps.yuv'}, 22 | {'content_id': 4, 23 | 'content_name': 'ElFuente2', 24 | 'path': ref_dir + '/ElFuente2_30fps.yuv'}, 25 | {'content_id': 5, 26 | 'content_name': 'FoxBird', 27 | 'path': ref_dir + '/FoxBird_25fps.yuv'}, 28 | {'content_id': 6, 29 | 'content_name': 'OldTownCross', 30 | 'path': ref_dir + '/OldTownCross_25fps.yuv'}, 31 | {'content_id': 7, 32 | 'content_name': 'Seeking', 33 | 'path': ref_dir + '/Seeking_25fps.yuv'}, 34 | {'content_id': 8, 35 | 'content_name': 'Tennis', 36 | 'path': ref_dir + '/Tennis_24fps.yuv'} 37 | ] 38 | 39 | dis_videos = [ 40 | {'asset_id': 9, 41 | 'content_id': 0, 42 | 'dmos': 22.5, 43 | 'path': dis_dir + '/BigBuckBunny_20_288_375.yuv', 44 | }, 45 | {'asset_id': 10, 46 | 'content_id': 0, 47 | 'dmos': 35.0, 48 | 'path': dis_dir + '/BigBuckBunny_30_384_550.yuv', 49 | }, 50 | {'asset_id': 11, 51 | 'content_id': 0, 52 | 'dmos': 49.166666666666664, 53 | 'path': dis_dir + '/BigBuckBunny_40_384_750.yuv', 54 | }, 55 | {'asset_id': 12, 56 | 'content_id': 0, 57 | 'dmos': 61.666666666666664, 58 | 'path': dis_dir + '/BigBuckBunny_50_480_1050.yuv', 59 | }, 60 | {'asset_id': 13, 61 | 'content_id': 0, 62 | 'dmos': 
78.333333333333329, 63 | 'path': dis_dir + '/BigBuckBunny_55_480_1750.yuv', 64 | }, 65 | {'asset_id': 14, 66 | 'content_id': 0, 67 | 'dmos': 97.5, 68 | 'path': dis_dir + '/BigBuckBunny_75_720_3050.yuv', 69 | }, 70 | {'asset_id': 15, 71 | 'content_id': 0, 72 | 'dmos': 95.0, 73 | 'path': dis_dir + '/BigBuckBunny_80_720_4250.yuv', 74 | }, 75 | {'asset_id': 16, 76 | 'content_id': 0, 77 | 'dmos': 99.166666666666671, 78 | 'path': dis_dir + '/BigBuckBunny_85_1080_3800.yuv', 79 | }, 80 | {'asset_id': 17, 81 | 'content_id': 0, 82 | 'dmos': 103.33333333333333, 83 | 'path': dis_dir + '/BigBuckBunny_90_1080_4300.yuv', 84 | }, 85 | {'asset_id': 18, 86 | 'content_id': 0, 87 | 'dmos': 99.166666666666671, 88 | 'path': dis_dir + '/BigBuckBunny_95_1080_5800.yuv', 89 | }, 90 | {'asset_id': 19, 91 | 'content_id': 1, 92 | 'dmos': 38.333333333333336, 93 | 'path': dis_dir + '/BirdsInCage_40_288_375.yuv', 94 | }, 95 | {'asset_id': 20, 96 | 'content_id': 1, 97 | 'dmos': 40.0, 98 | 'path': dis_dir + '/BirdsInCage_50_288_550.yuv', 99 | }, 100 | {'asset_id': 21, 101 | 'content_id': 1, 102 | 'dmos': 52.5, 103 | 'path': dis_dir + '/BirdsInCage_60_384_550.yuv', 104 | }, 105 | {'asset_id': 22, 106 | 'content_id': 1, 107 | 'dmos': 55.0, 108 | 'path': dis_dir + '/BirdsInCage_65_384_750.yuv', 109 | }, 110 | {'asset_id': 23, 111 | 'content_id': 1, 112 | 'dmos': 70.0, 113 | 'path': dis_dir + '/BirdsInCage_80_480_750.yuv', 114 | }, 115 | {'asset_id': 24, 116 | 'content_id': 1, 117 | 'dmos': 92.5, 118 | 'path': dis_dir + '/BirdsInCage_85_720_1050.yuv', 119 | }, 120 | {'asset_id': 25, 121 | 'content_id': 1, 122 | 'dmos': 100.83333333333333, 123 | 'path': dis_dir + '/BirdsInCage_90_1080_1800.yuv', 124 | }, 125 | {'asset_id': 26, 126 | 'content_id': 1, 127 | 'dmos': 102.5, 128 | 'path': dis_dir + '/BirdsInCage_95_1080_3000.yuv', 129 | }, 130 | {'asset_id': 27, 131 | 'content_id': 2, 132 | 'dmos': 20.0, 133 | 'path': dis_dir + '/CrowdRun_30_288_375.yuv', 134 | }, 135 | {'asset_id': 28, 136 | 'content_id': 2, 137 | 'dmos': 40.0, 138 | 'path': dis_dir + '/CrowdRun_40_480_2350.yuv', 139 | }, 140 | {'asset_id': 29, 141 | 'content_id': 2, 142 | 'dmos': 58.333333333333336, 143 | 'path': dis_dir + '/CrowdRun_50_1080_4300.yuv', 144 | }, 145 | {'asset_id': 30, 146 | 'content_id': 2, 147 | 'dmos': 67.5, 148 | 'path': dis_dir + '/CrowdRun_65_1080_5800.yuv', 149 | }, 150 | {'asset_id': 31, 151 | 'content_id': 2, 152 | 'dmos': 81.666666666666671, 153 | 'path': dis_dir + '/CrowdRun_75_1080_7500.yuv', 154 | }, 155 | {'asset_id': 32, 156 | 'content_id': 2, 157 | 'dmos': 85.0, 158 | 'path': dis_dir + '/CrowdRun_80_1080_10000.yuv', 159 | }, 160 | {'asset_id': 33, 161 | 'content_id': 2, 162 | 'dmos': 94.166666666666671, 163 | 'path': dis_dir + '/CrowdRun_90_1080_15000.yuv', 164 | }, 165 | {'asset_id': 34, 166 | 'content_id': 3, 167 | 'dmos': 18.333333333333332, 168 | 'path': dis_dir + '/ElFuente1_10_288_375.yuv', 169 | }, 170 | {'asset_id': 35, 171 | 'content_id': 3, 172 | 'dmos': 29.166666666666668, 173 | 'path': dis_dir + '/ElFuente1_25_384_750.yuv', 174 | }, 175 | {'asset_id': 36, 176 | 'content_id': 3, 177 | 'dmos': 66.666666666666671, 178 | 'path': dis_dir + '/ElFuente1_50_480_1750.yuv', 179 | }, 180 | {'asset_id': 37, 181 | 'content_id': 3, 182 | 'dmos': 72.5, 183 | 'path': dis_dir + '/ElFuente1_60_720_2350.yuv', 184 | }, 185 | {'asset_id': 38, 186 | 'content_id': 3, 187 | 'dmos': 86.666666666666671, 188 | 'path': dis_dir + '/ElFuente1_70_1080_4300.yuv', 189 | }, 190 | {'asset_id': 39, 191 | 'content_id': 3, 192 | 'dmos': 95.0, 193 | 'path': 
dis_dir + '/ElFuente1_85_1080_5800.yuv', 194 | }, 195 | {'asset_id': 40, 196 | 'content_id': 3, 197 | 'dmos': 99.166666666666671, 198 | 'path': dis_dir + '/ElFuente1_90_1080_7500.yuv', 199 | }, 200 | {'asset_id': 41, 201 | 'content_id': 4, 202 | 'dmos': 25.0, 203 | 'path': dis_dir + '/ElFuente2_05_288_375.yuv', 204 | }, 205 | {'asset_id': 42, 206 | 'content_id': 4, 207 | 'dmos': 55.0, 208 | 'path': dis_dir + '/ElFuente2_30_480_1750.yuv', 209 | }, 210 | {'asset_id': 43, 211 | 'content_id': 4, 212 | 'dmos': 58.333333333333336, 213 | 'path': dis_dir + '/ElFuente2_50_720_3050.yuv', 214 | }, 215 | {'asset_id': 44, 216 | 'content_id': 4, 217 | 'dmos': 68.333333333333329, 218 | 'path': dis_dir + '/ElFuente2_60_1080_4300.yuv', 219 | }, 220 | {'asset_id': 45, 221 | 'content_id': 4, 222 | 'dmos': 75.833333333333329, 223 | 'path': dis_dir + '/ElFuente2_65_720_4250.yuv', 224 | }, 225 | {'asset_id': 46, 226 | 'content_id': 4, 227 | 'dmos': 82.5, 228 | 'path': dis_dir + '/ElFuente2_70_1080_5800.yuv', 229 | }, 230 | {'asset_id': 47, 231 | 'content_id': 4, 232 | 'dmos': 93.333333333333329, 233 | 'path': dis_dir + '/ElFuente2_80_1080_10000.yuv', 234 | }, 235 | {'asset_id': 48, 236 | 'content_id': 4, 237 | 'dmos': 96.666666666666671, 238 | 'path': dis_dir + '/ElFuente2_85_1080_15000.yuv', 239 | }, 240 | {'asset_id': 49, 241 | 'content_id': 4, 242 | 'dmos': 97.5, 243 | 'path': dis_dir + '/ElFuente2_90_1080_20000.yuv', 244 | }, 245 | {'asset_id': 50, 246 | 'content_id': 5, 247 | 'dmos': 34.166666666666664, 248 | 'path': dis_dir + '/FoxBird_20_288_375.yuv', 249 | }, 250 | {'asset_id': 51, 251 | 'content_id': 5, 252 | 'dmos': 60.0, 253 | 'path': dis_dir + '/FoxBird_40_384_750.yuv', 254 | }, 255 | {'asset_id': 52, 256 | 'content_id': 5, 257 | 'dmos': 64.166666666666671, 258 | 'path': dis_dir + '/FoxBird_55_480_750.yuv', 259 | }, 260 | {'asset_id': 53, 261 | 'content_id': 5, 262 | 'dmos': 83.333333333333329, 263 | 'path': dis_dir + '/FoxBird_65_480_1750.yuv', 264 | }, 265 | {'asset_id': 54, 266 | 'content_id': 5, 267 | 'dmos': 90.833333333333329, 268 | 'path': dis_dir + '/FoxBird_80_1080_2300.yuv', 269 | }, 270 | {'asset_id': 55, 271 | 'content_id': 5, 272 | 'dmos': 101.66666666666667, 273 | 'path': dis_dir + '/FoxBird_95_1080_5800.yuv', 274 | }, 275 | {'asset_id': 56, 276 | 'content_id': 6, 277 | 'dmos': 30.833333333333332, 278 | 'path': dis_dir + '/OldTownCross_20_288_375.yuv', 279 | }, 280 | {'asset_id': 57, 281 | 'content_id': 6, 282 | 'dmos': 45.0, 283 | 'path': dis_dir + '/OldTownCross_45_384_750.yuv', 284 | }, 285 | {'asset_id': 58, 286 | 'content_id': 6, 287 | 'dmos': 57.5, 288 | 'path': dis_dir + '/OldTownCross_55_480_750.yuv', 289 | }, 290 | {'asset_id': 59, 291 | 'content_id': 6, 292 | 'dmos': 75.0, 293 | 'path': dis_dir + '/OldTownCross_60_480_1750.yuv', 294 | }, 295 | {'asset_id': 60, 296 | 'content_id': 6, 297 | 'dmos': 99.166666666666671, 298 | 'path': dis_dir + '/OldTownCross_80_720_2350.yuv', 299 | }, 300 | {'asset_id': 61, 301 | 'content_id': 6, 302 | 'dmos': 99.166666666666671, 303 | 'path': dis_dir + '/OldTownCross_85_720_2950.yuv', 304 | }, 305 | {'asset_id': 62, 306 | 'content_id': 6, 307 | 'dmos': 109.16666666666667, 308 | 'path': dis_dir + '/OldTownCross_90_1080_4300.yuv', 309 | }, 310 | {'asset_id': 63, 311 | 'content_id': 7, 312 | 'dmos': 19.166666666666668, 313 | 'path': dis_dir + '/Seeking_10_288_375.yuv', 314 | }, 315 | {'asset_id': 64, 316 | 'content_id': 7, 317 | 'dmos': 41.666666666666664, 318 | 'path': dis_dir + '/Seeking_30_480_1050.yuv', 319 | }, 320 | {'asset_id': 65, 321 | 
'content_id': 7, 322 | 'dmos': 50.833333333333336, 323 | 'path': dis_dir + '/Seeking_45_480_1750.yuv', 324 | }, 325 | {'asset_id': 66, 326 | 'content_id': 7, 327 | 'dmos': 66.666666666666671, 328 | 'path': dis_dir + '/Seeking_50_720_2350.yuv', 329 | }, 330 | {'asset_id': 67, 331 | 'content_id': 7, 332 | 'dmos': 75.833333333333329, 333 | 'path': dis_dir + '/Seeking_60_720_3050.yuv', 334 | }, 335 | {'asset_id': 68, 336 | 'content_id': 7, 337 | 'dmos': 80.833333333333329, 338 | 'path': dis_dir + '/Seeking_65_1080_4300.yuv', 339 | }, 340 | {'asset_id': 69, 341 | 'content_id': 7, 342 | 'dmos': 91.666666666666671, 343 | 'path': dis_dir + '/Seeking_75_1080_5800.yuv', 344 | }, 345 | {'asset_id': 70, 346 | 'content_id': 7, 347 | 'dmos': 90.0, 348 | 'path': dis_dir + '/Seeking_85_1080_7500.yuv', 349 | }, 350 | {'asset_id': 71, 351 | 'content_id': 7, 352 | 'dmos': 91.666666666666671, 353 | 'path': dis_dir + '/Seeking_90_1080_15000.yuv', 354 | }, 355 | {'asset_id': 72, 356 | 'content_id': 7, 357 | 'dmos': 96.666666666666671, 358 | 'path': dis_dir + '/Seeking_95_1080_20000.yuv', 359 | }, 360 | {'asset_id': 73, 361 | 'content_id': 8, 362 | 'dmos': 33.333333333333336, 363 | 'path': dis_dir + '/Tennis_20_288_375.yuv', 364 | }, 365 | {'asset_id': 74, 366 | 'content_id': 8, 367 | 'dmos': 50.0, 368 | 'path': dis_dir + '/Tennis_40_384_750.yuv', 369 | }, 370 | {'asset_id': 75, 371 | 'content_id': 8, 372 | 'dmos': 71.666666666666671, 373 | 'path': dis_dir + '/Tennis_60_480_1050.yuv', 374 | }, 375 | {'asset_id': 76, 376 | 'content_id': 8, 377 | 'dmos': 68.333333333333329, 378 | 'path': dis_dir + '/Tennis_70_480_1750.yuv', 379 | }, 380 | {'asset_id': 77, 381 | 'content_id': 8, 382 | 'dmos': 94.166666666666671, 383 | 'path': dis_dir + '/Tennis_80_720_3050.yuv', 384 | }, 385 | {'asset_id': 78, 386 | 'content_id': 8, 387 | 'dmos': 99.166666666666671, 388 | 'path': dis_dir + '/Tennis_90_1080_4300.yuv', 389 | }] 390 | -------------------------------------------------------------------------------- /dataset/NFLX/prep_NFLX_score.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import OrderedDict 4 | import numpy as np 5 | import json 6 | from NFLX_dataset_public import dis_videos 7 | 8 | def make_score_file(): 9 | 10 | dir_path = os.path.dirname(os.path.realpath(__file__)) 11 | 12 | all_scenes = ['BigBuckBunny', 'BirdsInCage', 'CrowdRun', 'ElFuente1', 'ElFuente2', 13 | 'FoxBird', 'OldTownCross', 'Seeking', 'Tennis'] 14 | test_scene = all_scenes 15 | framerate = { 16 | 'BigBuckBunny_25fps.yuv': 25, 17 | 'BirdsInCage_30fps.yuv': 30, 18 | 'CrowdRun_25fps.yuv': 25, 19 | 'ElFuente1_30fps.yuv': 30, 20 | 'ElFuente2_30fps.yuv': 30, 21 | 'FoxBird_25fps.yuv': 25, 22 | 'OldTownCross_25fps.yuv': 25, 23 | 'Seeking_25fps.yuv': 25, 24 | 'Tennis_24fps.yuv': 24} 25 | 26 | width = 1920 27 | height = 1080 28 | 29 | ret = OrderedDict() 30 | ret['train'] = OrderedDict() 31 | ret['test'] = OrderedDict() 32 | 33 | trn_dis = [] 34 | trn_ref = [] 35 | trn_mos = [] 36 | trn_height = [] 37 | trn_width = [] 38 | trn_fps = [] 39 | 40 | tst_dis = [] 41 | tst_ref = [] 42 | tst_mos = [] 43 | tst_height = [] 44 | tst_width = [] 45 | tst_fps = [] 46 | 47 | for pvs in dis_videos: 48 | 49 | clip_info = re.split('/', pvs['path']) 50 | clip_name = clip_info[-1] 51 | print(clip_name) 52 | scene = re.split('_', clip_name)[0] 53 | for k, v in framerate.items(): 54 | if scene in k: 55 | if scene in test_scene: 56 | tst_dis.append(clip_name) 57 | tst_ref.append(k) 58 | 
tst_mos.append(pvs['dmos'])
59 |                     tst_height.append(height)
60 |                     tst_width.append(width)
61 |                     tst_fps.append(v)
62 |                 else:
63 |                     trn_dis.append(clip_name)
64 |                     trn_ref.append(k)
65 |                     trn_mos.append(pvs['dmos'])
66 |                     trn_height.append(height)
67 |                     trn_width.append(width)
68 |                     trn_fps.append(v)
69 |                 break
70 | 
71 |     ret['train']['dis'] = trn_dis
72 |     ret['train']['ref'] = trn_ref
73 |     ret['train']['mos'] = trn_mos
74 |     ret['train']['height'] = trn_height
75 |     ret['train']['width'] = trn_width
76 |     ret['train']['fps'] = trn_fps
77 | 
78 |     ret['test']['dis'] = tst_dis
79 |     ret['test']['ref'] = tst_ref
80 |     ret['test']['mos'] = tst_mos
81 |     ret['test']['height'] = tst_height
82 |     ret['test']['width'] = tst_width
83 |     ret['test']['fps'] = tst_fps
84 | 
85 |     with open('NFLX_subj_score.json', 'w') as f:
86 |         json.dump(ret, f, indent=4, sort_keys=True)
87 | 
88 | 
89 | if __name__ == "__main__":
90 | 
91 |     make_score_file()
92 | 
93 |     print('Done')
94 | 
--------------------------------------------------------------------------------
/dataset/dataset.py:
--------------------------------------------------------------------------------
1 | import os
2 | import re
3 | import json
4 | import numpy as np
5 | import subprocess
6 | import torch
7 | from torch.utils.data import DataLoader, Dataset
8 | 
9 | 
10 | class CropSegment(object):
11 |     r"""
12 |     Crop a clip along the spatial axes, i.e. h, w
13 |     DO NOT crop along the temporal axis
14 | 
15 |     args:
16 |         size_x: horizontal dimension of a segment
17 |         size_y: vertical dimension of a segment
18 |         stride_x: horizontal stride between segments
19 |         stride_y: vertical stride between segments
20 |     return:
21 |         clip (tensor): dim = (N, C, D, H=size_y, W=size_x), where N is the number of segments produced by sliding a window of the given size and stride over H and W
22 |     """
23 | 
24 |     def __init__(self, size_x, size_y, stride_x, stride_y):
25 | 
26 |         self.size_x = size_x
27 |         self.size_y = size_y
28 |         self.stride_x = stride_x
29 |         self.stride_y = stride_y
30 | 
31 |     def __call__(self, clip):
32 | 
33 |         # input dimension [C, D, H, W]
34 |         channel = clip.shape[0]
35 |         depth = clip.shape[1]
36 | 
37 |         clip = clip.unfold(2, self.size_y, self.stride_y)  # slide window over H (vertical)
38 |         clip = clip.unfold(3, self.size_x, self.stride_x)  # slide window over W (horizontal)
39 |         clip = clip.permute(2, 3, 0, 1, 4, 5)
40 |         clip = clip.contiguous().view(-1, channel, depth, self.size_y, self.size_x)
41 | 
42 |         return clip
43 | 
44 | 
45 | class VideoDataset(Dataset):
46 |     r"""
47 |     A Dataset for a folder of videos
48 | 
49 |     args:
50 |         subj_score_file (str): path to the subjective score file. It contains the train/test split, ref list, dis list, fps list and mos list
51 |         directory (str): the path to the directory containing all videos
52 |         mode (str, optional): determines whether to read train/test data
53 |         channel (int, optional): number of channels of a sample
54 |         size_x: horizontal dimension of a segment
55 |         size_y: vertical dimension of a segment
56 |         stride_x: horizontal stride between segments
57 |         stride_y: vertical stride between segments
58 |     """
59 | 
60 |     def __init__(self, subj_score_file, directory, mode='train', channel=1, size_x=112, size_y=112, stride_x=80, stride_y=80, transform=None):
61 | 
62 |         with open(subj_score_file, "r") as f:
63 |             data = json.load(f)
64 |         self.video_dir = directory
65 |         data = data[mode]
66 |         self.ref = data['ref']
67 |         self.dis = data['dis']
68 |         self.label = data['mos']
69 |         self.framerate = data['fps']
70 |         self.frame_height = data['height']
71 |         self.frame_width = data['width']
72 |         self.channel = channel
73 |         self.size_x = size_x
74 |         self.size_y = size_y
75 |         self.stride_x = stride_x
76 |         self.stride_y = stride_y
77 |         self.transform = transform
78 | 
79 |     def __getitem__(self, index):
80 | 
81 |         ref = os.path.join(self.video_dir, self.ref[index])
82 |         dis = os.path.join(self.video_dir, self.dis[index])
83 |         label = float(self.label[index])
84 |         framerate = int(self.framerate[index])
85 |         frame_height = int(self.frame_height[index])
86 |         frame_width = int(self.frame_width[index])
87 | 
88 |         if framerate <= 30:
89 |             stride_t = 2
90 |         elif framerate <= 60:
91 |             stride_t = 4
92 |         else:
93 |             raise ValueError('Unsupported fps')
94 | 
95 |         if ref.endswith(('.YUV', '.yuv')):
96 |             ref = self.load_yuv(ref, frame_height, frame_width, stride_t)
97 |         elif ref.endswith('.mp4'):
98 |             ref = self.load_encode(ref, frame_height, frame_width, stride_t)
99 |         else:
100 |             raise ValueError('Unsupported video format')
101 | 
102 |         if dis.endswith(('.YUV', '.yuv')):
103 |             dis = self.load_yuv(dis, frame_height, frame_width, stride_t)
104 |         elif dis.endswith('.mp4'):
105 |             dis = self.load_encode(dis, frame_height, frame_width, stride_t)
106 |         else:
107 |             raise ValueError('Unsupported video format')
108 | 
109 |         offset_v = (frame_height - self.size_y) % self.stride_y  # rows not covered by the crop grid
110 |         offset_t = int(offset_v / 4 * 2)  # split the margin between top and bottom
111 |         offset_b = offset_v - offset_t
112 |         offset_h = (frame_width - self.size_x) % self.stride_x  # columns not covered by the crop grid
113 |         offset_l = int(offset_h / 4 * 2)
114 |         offset_r = offset_h - offset_l
115 | 
116 |         ref = ref[:, :, offset_t:frame_height-offset_b, offset_l:frame_width-offset_r]
117 |         dis = dis[:, :, offset_t:frame_height-offset_b, offset_l:frame_width-offset_r]
118 | 
119 |         spatial_crop = CropSegment(self.size_x, self.size_y, self.stride_x, self.stride_y)
120 |         ref = spatial_crop(ref)
121 |         dis = spatial_crop(dis)
122 | 
123 |         ref = torch.from_numpy(np.asarray(ref))
124 |         dis = torch.from_numpy(np.asarray(dis))
125 |         label = torch.from_numpy(np.asarray(label))
126 | 
127 |         return ref, dis, label
128 | 
129 |     def load_yuv(self, file_path, frame_height, frame_width, stride_t, start=0):
130 |         r"""
131 |         Load frames on-demand from raw video, currently supports only yuv420p
132 | 
133 |         args:
134 |             file_path (str): path to yuv file
135 |             frame_height
136 |             frame_width
137 |             stride_t (int): sample the 1st frame from every stride_t frames
138 |             start (int): index of the 1st sampled frame
139 |         return:
140 |             ret (tensor): contains sampled frames (Y channel). dim = (C, D, H, W)
141 |         """
142 | 
143 |         bytes_per_frame = int(frame_height * frame_width * 1.5)
144 |         frame_count = os.path.getsize(file_path) // bytes_per_frame
145 | 
146 |         ret = []
147 |         count = 0
148 | 
149 |         with open(file_path, 'rb') as f:
150 |             while count < frame_count:
151 |                 if count % stride_t == 0:
152 |                     offset = count * bytes_per_frame
153 |                     f.seek(offset, 0)
154 |                     frame = f.read(frame_height * frame_width)  # read the Y plane only
155 |                     frame = np.frombuffer(frame, "uint8")
156 |                     frame = frame.astype('float32') / 255.
157 |                     frame = frame.reshape(1, 1, frame_height, frame_width)
158 |                     ret.append(frame)
159 |                 count += 1
160 | 
161 |         ret = np.concatenate(ret, axis=1)
162 |         ret = torch.from_numpy(np.asarray(ret))
163 | 
164 |         return ret
165 | 
166 |     def load_encode(self, file_path, frame_height, frame_width, stride_t, start=0):
167 |         r"""
168 |         Load frames on-demand from an encoded bitstream
169 | 
170 |         args:
171 |             file_path (str): path to the encoded video file
172 |             frame_height
173 |             frame_width
174 |             stride_t (int): sample the 1st frame from every stride_t frames
175 |             start (int): index of the 1st sampled frame
176 |         return:
177 |             ret (array): contains sampled frames. dim = (C, D, H, W)
178 |         """
179 | 
180 |         enc_path = file_path
181 |         enc_name = re.split('/', enc_path)[-1]
182 | 
183 |         yuv_name = enc_name.replace('.mp4', '.yuv')
184 |         yuv_path = os.path.join('/dockerdata/tmp/', yuv_name)
185 |         cmd = "ffmpeg -y -i {src} -f rawvideo -pix_fmt yuv420p -vsync 0 -an {dst}".format(src=enc_path, dst=yuv_path)
186 |         subprocess.run(cmd, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
187 | 
188 |         ret = self.load_yuv(yuv_path, frame_height, frame_width, stride_t, start=0)
189 | 
190 |         return ret
191 | 
192 |     def __len__(self):
193 |         return len(self.dis)
194 | 
195 | 
196 | if __name__ == '__main__':
197 | 
198 |     root_dir = os.path.dirname(os.path.realpath(__file__))
199 |     subj_score_file = os.path.join(root_dir, 'csiq_subj_score.json')
200 |     video_dir = '/dockerdata/CSIQ_YUV'
201 |     csiq_dataset = VideoDataset(subj_score_file, video_dir)
202 |     print(len(csiq_dataset))
203 | 
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import torch
4 | import json
5 | import numpy as np
6 | import torch.nn as nn
7 | from dataset.dataset import VideoDataset
8 | from model.network import C3DVQANet
9 | from scipy.stats import spearmanr, pearsonr
10 | from opts import parse_opts
11 | from tool.draw import mos_scatter
12 | 
13 | def test_model(model, device, criterion, dataloaders):
14 | 
15 |     phase = 'test'
16 |     model.eval()
17 | 
18 |     epoch_labels = []
19 |     epoch_preds = []
20 | 
21 |     for ref, dis, labels in dataloaders[phase]:
22 | 
23 |         ref = ref.to(device)
24 |         dis = dis.to(device)
25 |         labels = labels.to(device).float()
26 | 
27 |         # dim: [batch=1, P, C, D, H, W]
28 |         ref = ref.reshape(-1, ref.shape[2], ref.shape[3], ref.shape[4], ref.shape[5])
29 |         dis = dis.reshape(-1, dis.shape[2], dis.shape[3], dis.shape[4], dis.shape[5])
30 | 
31 |         with torch.no_grad():
32 |             preds = model(ref, dis)
33 |             preds = torch.mean(preds, 0, keepdim=True)  # average the per-segment predictions
34 | 
35 |         epoch_labels.append(labels.flatten())
36 |         epoch_preds.append(preds.flatten())
37 | 
38 |     epoch_labels = torch.cat(epoch_labels).flatten().data.cpu().numpy()
39 |     epoch_preds = torch.cat(epoch_preds).flatten().data.cpu().numpy()
40 | 
41 |     ret = {}
42 |     ret['MOS'] = epoch_labels.tolist()
43 |     ret['PRED'] = epoch_preds.tolist()
44 | 
45 |     # print(json.dumps(ret))
46 | 
47 |     epoch_rmse = np.sqrt(np.mean((epoch_labels - epoch_preds)**2))
48 |     print("{phase} RMSE: {rmse:.4f}".format(phase=phase, rmse=epoch_rmse))
49 | 
50 |     if len(epoch_labels) > 5:
51 |         epoch_plcc = pearsonr(epoch_labels, epoch_preds)[0]
52 |         epoch_srocc = spearmanr(epoch_labels, epoch_preds)[0]
53 | 
54 |         print("{phase}:\t PLCC: {plcc:.4f}\t SROCC: {srocc:.4f}".format(phase=phase, plcc=epoch_plcc, srocc=epoch_srocc))
55 | 
56 | 
57 | if __name__ == '__main__':
58 | 
59 |     opt = parse_opts()
60 | 
61 |     video_path = opt.video_dir
62 |     subj_dataset = opt.score_file_path
63 |     load_checkpoint = opt.load_model
64 |     MULTI_GPU_MODE = opt.multi_gpu
65 |     channel = opt.channel
66 |     size_x = opt.size_x
67 |     size_y = opt.size_y
68 |     stride_x = opt.stride_x
69 |     stride_y = opt.stride_y
70 | 
71 |     video_dataset = {x: VideoDataset(subj_dataset, video_path, x, channel, size_x, size_y, stride_x, stride_y) for x in ['test']}
72 |     dataloaders = {x: torch.utils.data.DataLoader(video_dataset[x], batch_size=1, shuffle=False, num_workers=4, drop_last=False) for x in ['test']}
73 | 
74 |     device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
75 |     checkpoint = torch.load(load_checkpoint)
76 | 
77 |     model = C3DVQANet().to(device)
78 |     model.load_state_dict(checkpoint['model_state_dict'])
79 | 
80 |     if torch.cuda.device_count() > 1 and MULTI_GPU_MODE:
81 |         device_ids = range(0, torch.cuda.device_count())
82 |         model = torch.nn.DataParallel(model, device_ids=device_ids)
83 |         print("multi-gpu mode enabled, use {0:d} gpus".format(torch.cuda.device_count()))
84 |     else:
85 |         print('use {0}'.format('cuda' if torch.cuda.is_available() else 'cpu'))
86 | 
87 |     criterion = nn.MSELoss()
88 | 
89 |     test_model(model, device, criterion, dataloaders)
90 | 
--------------------------------------------------------------------------------
/model/network.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import numpy as np
4 | 
5 | 
6 | class ResidualFrame(object):
7 | 
8 |     def __init__(self, eps=1.0):
9 |         super(ResidualFrame, self).__init__()
10 | 
11 |         self.eps = eps
12 |         self.log_255 = np.float32(2 * np.log(255.0))
13 |         self.log_max = np.float32(self.log_255 - np.log(self.eps))
14 | 
15 |     def __call__(self, x, y):
16 | 
17 |         d = torch.pow(255.0 * (x - y), 2)  # squared error on the 8-bit scale
18 |         residual = self.log_255 - torch.log(d + self.eps)  # log-inverted error map
19 |         residual = residual / self.log_max  # normalized to (0, 1]
20 | 
21 |         return residual
22 | 
23 | 
24 | class DownsampleConv3D(nn.Module):
25 |     r"""
26 |     Downsample by 2 over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes
27 | 
28 |     args:
29 |         in_channels (int): number of channels in the input tensor
30 |         out_channels (int): number of channels produced by the convolution
31 |         kernel_size (int or tuple): size of the convolution kernel
32 |         stride (int or tuple): stride
33 |         padding (int or tuple): zero-padding
34 |         bias (bool, optional): whether to add a learnable bias
35 |     """
36 | 
37 |     def __init__(self, in_channels, out_channels, kernel_size=(1, 5, 5), stride=(1, 2, 2), padding=(0, 2, 2), dilation=1, groups=1, bias=False):
38 |         super(DownsampleConv3D, self).__init__()
39 | 
40 |         k = np.float32([1, 4, 6, 4, 1])
41 |         k = np.outer(k, k)
42 |         k5x5 = (k/k.sum()).reshape((1, 1, 1, 5, 5))  # fixed binomial low-pass kernel
43 | 
44 |         conv1 = nn.Conv3d(in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias)
45 | 
46 |         with torch.no_grad():
47 |             conv1.weight = nn.Parameter(torch.from_numpy(k5x5))
48 | 49 | self.conv1 = conv1 50 | 51 | def forward(self, x): 52 | 53 | x = self.conv1(x) 54 | 55 | return x 56 | 57 | 58 | class UpsampleConv3D(nn.Module): 59 | r""" 60 | Upsample by 2 over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 61 | 62 | args: 63 | in_channels (int): number of channels in the input tensor 64 | out_channels (int): number of channels produced by the convolution 65 | kernel_size (int or tuple): size of the convolution kernel 66 | stride (int or tuple): stride 67 | padding (int or tuple): zero-padding 68 | bias (bool, optional): whether to add a learnable bias 69 | """ 70 | 71 | def __init__(self, in_channels, out_channels, kernel_size=(1, 5, 5), stride=(1, 2, 2), padding=(0, 2, 2), dilation=1, groups=1, bias=False): 72 | super(UpsampleConv3D, self).__init__() 73 | 74 | k = np.float32([1, 4, 6, 4, 1]) 75 | k = np.outer(k, k) 76 | k5x5 = (k/k.sum()).reshape((1, 1, 1, 5, 5)) 77 | 78 | conv1 = nn.ConvTranspose3d(in_channels, out_channels, kernel_size, stride, padding, output_padding=(0, 1, 1), bias=bias) 79 | 80 | with torch.no_grad(): 81 | conv1.weight = nn.Parameter(torch.from_numpy(k5x5)) 82 | 83 | self.conv1 = conv1 84 | 85 | def forward(self, x): 86 | 87 | x = self.conv1(x) 88 | 89 | return x 90 | 91 | 92 | class SpatialConv3D(nn.Module): 93 | r""" 94 | Apply 3D conv. over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 95 | 96 | args: 97 | in_channels (int): number of channels in the input tensor 98 | out_channels (int): number of channels produced by the convolution 99 | kernel_size (int or tuple): size of the convolution kernel 100 | stride (int or tuple): stride 101 | padding (int or tuple): zero-padding 102 | """ 103 | 104 | def __init__(self, in_channels, out_channels, kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1)): 105 | super(SpatialConv3D, self).__init__() 106 | 107 | self.conv1 = nn.Conv3d(in_channels, 16, kernel_size, stride, padding) 108 | self.reLu1 = nn.LeakyReLU(inplace=True) 109 | self.conv2 = nn.Conv3d(16, out_channels, kernel_size, stride, padding) 110 | self.reLu2 = nn.LeakyReLU(inplace=True) 111 | 112 | def forward(self, x): 113 | 114 | x = self.conv1(x) 115 | x = self.reLu1(x) 116 | x = self.conv2(x) 117 | x = self.reLu2(x) 118 | 119 | return x 120 | 121 |
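# Shape sanity check for SpatialConv3D (an illustrative sketch; both convs use
# stride (1, 2, 2), so H and W are halved twice while the temporal depth D is preserved):
#
#     x = torch.randn(2, 1, 8, 112, 112)      # (N, C, D, H, W)
#     y = SpatialConv3D(1, 16)(x)
#     assert y.shape == (2, 16, 8, 28, 28)    # 112 -> 56 -> 28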
122 | class SpatialTemporalConv3D(nn.Module): 123 | r""" 124 | Apply 3D conv. over an input signal composed of several input planes with distinct spatial and time axes, by performing 3D convolution over the spatiotemporal axes 125 | 126 | args: 127 | in_channels (int): number of channels in the input tensor 128 | out_channels (int): number of channels produced by the convolution 129 | kernel_size (int or tuple): size of the convolution kernel 130 | stride (int or tuple): stride 131 | padding (int or tuple): zero-padding 132 | """ 133 | 134 | def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1): 135 | super(SpatialTemporalConv3D, self).__init__() 136 | 137 | self.conv1 = nn.Conv3d(in_channels, 64, kernel_size, stride, padding) 138 | self.relu1 = nn.LeakyReLU(inplace=True) 139 | self.conv2 = nn.Conv3d(64, 64, kernel_size, stride, padding) 140 | self.relu2 = nn.LeakyReLU(inplace=True) 141 | self.conv3 = nn.Conv3d(64, 32, kernel_size, stride, padding) 142 | self.relu3 = nn.LeakyReLU(inplace=True) 143 | self.conv4 = nn.Conv3d(32, out_channels, kernel_size, stride, padding) 144 | self.relu4 = nn.LeakyReLU(inplace=True) 145 | 146 | def forward(self, x): 147 | 148 | x = self.conv1(x) 149 | x = self.relu1(x) 150 | x = self.conv2(x) 151 | x = self.relu2(x) 152 | x = self.conv3(x) 153 | x = self.relu3(x) 154 | x = self.conv4(x) 155 | x = self.relu4(x) 156 | 157 | return x 158 | 159 | 160 | class C3DVQANet(nn.Module): 161 | 162 | def __init__(self): 163 | super(C3DVQANet, self).__init__() 164 | 165 | self.diff = ResidualFrame(eps=1.0) 166 | 167 | self.conv1_1 = DownsampleConv3D(1, 1) 168 | self.conv1_2 = UpsampleConv3D(1, 1) 169 | 170 | self.conv2_1 = SpatialConv3D(1, 16) 171 | self.conv2_2 = SpatialConv3D(1, 16) 172 | 173 | self.conv3 = SpatialTemporalConv3D(32, 1) 174 | 175 | self.pool = nn.AdaptiveAvgPool3d(1) 176 | 177 | self.fc1 = nn.Linear(1, 4) 178 | self.relu1 = nn.LeakyReLU(inplace=True) 179 | self.fc2 = nn.Linear(4, 1) 180 | self.relu2 = nn.LeakyReLU(inplace=True) 181 | 182 | def forward(self, ref, dis): 183 | 184 | err1 = self.diff(ref, dis) # normalized log-difference between ref and dis 185 | 186 | err2 = self.conv1_1(err1) # 112x112 -> 56x56 187 | err2 = self.conv1_1(err2) # 56x56 -> 28x28 188 | 189 | err3 = self.conv2_1(err1) # 112x112 -> 28x28 (SpatialConv3D halves H and W twice) 190 | 191 | lo = dis # low-pass copy of dis: downsample 3x, then upsample 3x 192 | for i in range(3): 193 | lo = self.conv1_1(lo) 194 | for i in range(3): 195 | lo = self.conv1_2(lo) 196 | dis = dis - lo # keep only the high-frequency detail 197 | 198 | dis = self.conv2_2(dis) # 112x112 -> 28x28 (SpatialConv3D halves H and W twice) 199 | 200 | sens = torch.cat([dis, err3], dim=1) 201 | sens = self.conv3(sens) # fuse detail and error into a sensitivity map 202 | 203 | res = err2 * sens # weight the downsampled error by the learned sensitivity 204 | res = res[:, :, :, 4:-4, 4:-4] # crop a 4-pixel border before pooling 205 | res = self.pool(res) 206 | 207 | res = self.fc1(res) 208 | res = self.relu1(res) 209 | res = self.fc2(res) 210 | res = self.relu2(res) 211 | res = torch.squeeze(res) 212 | 213 | return res 214 | -------------------------------------------------------------------------------- /opts.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | def parse_opts(): 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('--video_dir', default='/dockerdata/CSIQ_YUV', type=str, help='Path to input videos') 7 | parser.add_argument('--score_file_path', default='./dataset/csiq_subj_score.json', type=str, help='Path to input subjective score') 8 | parser.add_argument('--load_model', default='', type=str, help='Path to load checkpoint') 9 | parser.add_argument('--save_model', default='./save/model_csiq.pt', type=str, help='Path to save checkpoint') 10 | parser.add_argument('--log_file_name', default='./log/run.log', type=str, help='Path to save log')
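# With the patch defaults below (size 112, stride 80), neighbouring patches overlap
# by 112 - 80 = 32 pixels; eval.py and train.py then average the per-patch network
# outputs back into a single score per video (see torch.mean(preds, 0, ...) there).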
11 | 12 | parser.add_argument('--channel', default=1, type=int, help='channel number of input data, 1 for Y channel, 3 for YUV') 13 | parser.add_argument('--size_x', default=112, type=int, help='patch size x of segment') 14 | parser.add_argument('--size_y', default=112, type=int, help='patch size y of segment') 15 | parser.add_argument('--stride_x', default=80, type=int, help='patch stride x between segments') 16 | parser.add_argument('--stride_y', default=80, type=int, help='patch stride y between segments') 17 | 18 | parser.add_argument('--learning_rate', default=3e-4, type=float, help='learning rate') 19 | parser.add_argument('--weight_decay', default=1e-2, type=float, help='L2 regularization') 20 | parser.add_argument('--epochs', default=20, type=int, help='epochs to train') 21 | parser.add_argument('--multi_gpu', action='store_true', help='whether to use all GPUs') 22 | 23 | args = parser.parse_args() 24 | 25 | return args 26 | 27 | if __name__ == '__main__': 28 | 29 | args = parse_opts() 30 | print(args) 31 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==0.9.0 2 | cachetools==4.1.0 3 | certifi==2020.4.5.1 4 | chardet==3.0.4 5 | cycler==0.10.0 6 | google-auth==1.13.1 7 | google-auth-oauthlib==0.4.1 8 | grpcio==1.28.1 9 | idna==2.9 10 | kiwisolver==1.1.0 11 | Markdown==3.2.1 12 | matplotlib==3.0.3 13 | numpy==1.18.1 14 | oauthlib==3.1.0 15 | protobuf==3.11.3 16 | pyasn1==0.4.8 17 | pyasn1-modules==0.2.8 18 | pyparsing==2.4.6 19 | python-dateutil==2.8.1 20 | requests==2.23.0 21 | requests-oauthlib==1.3.0 22 | rsa==4.0 23 | scipy==1.4.1 24 | six==1.14.0 25 | tensorboard==2.2.0 26 | tensorboard-plugin-wit==1.6.0.post3 27 | torch==1.4.0 28 | tqdm==4.43.0 29 | urllib3==1.25.8 30 | Werkzeug==1.0.1 31 | -------------------------------------------------------------------------------- /save/model_videoset_v3.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tencent/DVQA/21727333a6b41d54ad1a8beca1fcbe00a69ed347/save/model_videoset_v3.pt -------------------------------------------------------------------------------- /scripts/eval.sh: -------------------------------------------------------------------------------- 1 | python ./eval.py --multi_gpu --video_dir /apdcephfs/private_tommyhqwang/YUV/PGC_YUV --score_file_path ./dataset/VIDEOSET/videoset_subj_score_v2.json --load_model ./save/model_videoset_v3.pt_88 2 | -------------------------------------------------------------------------------- /scripts/ft.sh: -------------------------------------------------------------------------------- 1 | python ./train.py --multi_gpu --video_dir yuv_dir --score_file_path ./dataset/*.json --load_model ./save/*.pt --save_model ./save/*.pt --size_x 112 --size_y 112 --stride_x 80 --stride_y 80 --learning_rate 3e-3 --weight_decay 1e-2 --epochs 100 2 | -------------------------------------------------------------------------------- /scripts/train.sh: -------------------------------------------------------------------------------- 1 | python ./train.py --multi_gpu --video_dir /apdcephfs/private_tommyhqwang/YUV/PGC_YUV --score_file_path ./dataset/VIDEOSET/videoset_subj_score_v2.json --save_model ./save/model_videoset_v3.pt --log_file_name ./log/videoset_v3.log --size_x 112 --size_y 112 --stride_x 80 --stride_y 80 --learning_rate 3e-4 --weight_decay 1e-2 --epochs 100 2 |
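# Note: train.py appends the epoch index to every checkpoint it saves
# ('{pt}_{epoch}'.format(...)), so training with --save_model ./save/model_videoset_v3.pt
# produces files like ./save/model_videoset_v3.pt_88, which is the file eval.sh above loads.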
-------------------------------------------------------------------------------- /tool/decode_stream.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import re 4 | 5 | 6 | def decode_bitstreams(src_dir, dst_dir, rule=None): 7 | 8 | cmd_to_exec = [] 9 | for dirpath, dirs, files in os.walk(src_dir, topdown=False): 10 | for f in files: 11 | if not f.startswith('.') and f.endswith('.mp4'): 12 | src_path = os.path.join(dirpath, f) # use the directory being walked, so files in sub-folders resolve correctly 13 | dst_path = os.path.join(dst_dir, re.sub(r'\.mp4$', '.yuv', f)) # escape the dot and anchor the match at the end of the name 14 | print('decoding {f}'.format(f=src_path)) 15 | cmd = "ffmpeg -i " + src_path + " -f rawvideo -pix_fmt yuv420p -an -vsync 0 -y " + dst_path 16 | cmd_to_exec.append(cmd) 17 | print(cmd) 18 | 19 | with open('decoding.sh', 'wt') as f: 20 | f.write("#!/bin/sh\n\n") 21 | for item in cmd_to_exec: 22 | f.write("%s\n" % item) 23 | 24 | 25 | if __name__ == "__main__": 26 | 27 | src_dir = sys.argv[1] 28 | dst_dir = sys.argv[2] 29 | 30 | decode_bitstreams(src_dir, dst_dir) 31 | -------------------------------------------------------------------------------- /tool/draw.py: -------------------------------------------------------------------------------- 1 | import matplotlib 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | 5 | 6 | def mos_scatter(mos, pred): 7 | 8 | fig = plt.figure(figsize=(48,48)) 9 | plt.scatter(mos, pred, alpha=0.8) 10 | plt.xlabel('MOS') 11 | plt.ylabel('PRED') 12 | plt.plot([0, 100], [0, 100]) # identity line: perfect predictions fall on it 13 | 14 | return fig 15 | 16 | 17 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import json 4 | import numpy as np 5 | import logging 6 | import torch 7 | import torch.nn as nn 8 | from torch.utils.tensorboard import SummaryWriter 9 | from tqdm import tqdm 10 | from scipy.stats import spearmanr, pearsonr 11 | from opts import parse_opts 12 | from model.network import C3DVQANet 13 | from dataset.dataset import VideoDataset 14 | from tool.draw import mos_scatter 15 | 16 | writer = SummaryWriter() 17 | 18 | def train_model(model, device, criterion, optimizer, scheduler, dataloaders, save_checkpoint, epoch_resume=1, num_epochs=25): 19 | 20 | for epoch in tqdm(range(epoch_resume, num_epochs+epoch_resume), unit='epoch', initial=epoch_resume, total=num_epochs+epoch_resume): 21 | for phase in ['train', 'test']: 22 | epoch_labels = [] 23 | epoch_preds = [] 24 | epoch_loss = 0.0 25 | epoch_size = 0 26 | 27 | if phase == 'train': 28 | model.train() 29 | else: 30 | model.eval() 31 | 32 | for ref, dis, labels in dataloaders[phase]: 33 | ref = ref.to(device) 34 | dis = dis.to(device) 35 | labels = labels.to(device).float() 36 | 37 | ref = ref.reshape(-1, ref.shape[2], ref.shape[3], ref.shape[4], ref.shape[5]) 38 | dis = dis.reshape(-1, dis.shape[2], dis.shape[3], dis.shape[4], dis.shape[5]) 39 | 40 | optimizer.zero_grad() 41 | 42 | with torch.set_grad_enabled(phase == 'train'): 43 | preds = model(ref, dis) 44 | preds = torch.mean(preds, 0, keepdim=True) 45 | loss = criterion(preds, labels) 46 | 47 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: # MULTI_GPU_MODE is a module-level flag set in __main__ 48 | loss = torch.mean(loss) 49 | 50 | if phase == 'train': 51 | loss.backward() 52 | optimizer.step() 53 | 54 | epoch_loss += loss.item() * labels.size(0) 55 | epoch_size += labels.size(0) 56 | epoch_labels.append(labels.flatten()) 57 | epoch_preds.append(preds.flatten()) 58 | 59 | epoch_loss = epoch_loss / epoch_size 60 |
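# epoch_loss above is a running sum of loss.item() weighted by batch size, so the
# division by epoch_size yields a mean per video; a tiny illustrative check
# (a sketch, not repo code; with batch_size=1 every weight is 1):
#
#     losses, sizes = [0.5, 0.3], [1, 1]
#     mean = sum(l * s for l, s in zip(losses, sizes)) / sum(sizes)
#     assert abs(mean - 0.4) < 1e-12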
61 | if phase == 'train': 62 | scheduler.step(epoch_loss) 63 | 64 | epoch_labels = torch.cat(epoch_labels).flatten().data.cpu().numpy() 65 | epoch_preds = torch.cat(epoch_preds).flatten().data.cpu().numpy() 66 | 67 | logging.info('epoch_labels: {}'.format(epoch_labels)) 68 | logging.info('epoch_preds: {}'.format(epoch_preds)) 69 | 70 | epoch_plcc = pearsonr(epoch_labels, epoch_preds)[0] 71 | epoch_srocc = spearmanr(epoch_labels, epoch_preds)[0] 72 | epoch_rmse = np.sqrt(np.mean((epoch_labels - epoch_preds)**2)) 73 | 74 | logging.info("{phase}-Loss: {loss:.4f}\t RMSE: {rmse:.4f}\t PLCC: {plcc:.4f}\t SROCC: {srocc:.4f}".format(phase=phase, loss=epoch_loss, rmse=epoch_rmse, plcc=epoch_plcc, srocc=epoch_srocc)) 75 | 76 | if phase == 'train': 77 | writer.add_scalar('Loss/train', epoch_loss, epoch) 78 | writer.add_scalar('RMSE/train', epoch_rmse, epoch) 79 | writer.add_scalar('PLCC/train', epoch_plcc, epoch) 80 | writer.add_scalar('SROCC/train', epoch_srocc, epoch) 81 | else: 82 | writer.add_scalar('Loss/test', epoch_loss, epoch) 83 | writer.add_scalar('RMSE/test', epoch_rmse, epoch) 84 | writer.add_scalar('PLCC/test', epoch_plcc, epoch) 85 | writer.add_scalar('SROCC/test', epoch_srocc, epoch) 86 | writer.add_figure('Pred vs. MOS', mos_scatter(epoch_labels, epoch_preds), epoch) 87 | 88 | if phase == 'test' and save_checkpoint: 89 | _checkpoint = '{pt}_{epoch}'.format(pt=save_checkpoint, epoch=epoch) # one checkpoint per epoch 90 | torch.save({'epoch': epoch, 'model_state_dict': model.module.state_dict() if isinstance(model, torch.nn.DataParallel) else model.state_dict(), 'optimizer_state_dict': optimizer.state_dict()}, _checkpoint) # model.module only exists under DataParallel 91 | 92 | 93 | if __name__ == '__main__': 94 | 95 | opt = parse_opts() 96 | 97 | video_path = opt.video_dir 98 | subj_dataset = opt.score_file_path 99 | save_checkpoint = opt.save_model 100 | load_checkpoint = opt.load_model 101 | log_file_name = opt.log_file_name 102 | LEARNING_RATE = opt.learning_rate 103 | L2_REGULARIZATION = opt.weight_decay 104 | NUM_EPOCHS = opt.epochs 105 | MULTI_GPU_MODE = opt.multi_gpu 106 | channel = opt.channel 107 | size_x = opt.size_x 108 | size_y = opt.size_y 109 | stride_x = opt.stride_x 110 | stride_y = opt.stride_y 111 | 112 | logging.basicConfig(filename=log_file_name, filemode='w', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.DEBUG) 113 | logging.info('OK parse options') 114 | 115 | video_dataset = {x: VideoDataset(subj_dataset, video_path, x, channel, size_x, size_y, stride_x, stride_y) for x in ['train', 'test']} 116 | dataloaders = {x: torch.utils.data.DataLoader(video_dataset[x], batch_size=1, shuffle=True, num_workers=8, drop_last=True) for x in ['train', 'test']} 117 | 118 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 119 | 120 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: 121 | device_ids = range(0, torch.cuda.device_count()) 122 | model = torch.nn.DataParallel(C3DVQANet().to(device), device_ids=device_ids) 123 | logging.info("multi-gpu mode enabled, using {0:d} GPUs".format(torch.cuda.device_count())) 124 | else: 125 | model = C3DVQANet().to(device) 126 | logging.info('use {0}'.format('cuda' if torch.cuda.is_available() else 'cpu')) 127 | 128 | criterion = nn.MSELoss() 129 | optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=L2_REGULARIZATION) 130 | scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.9, patience=5) 131 | epoch_resume = 1 132 | 133 | if os.path.exists(load_checkpoint): 134 | checkpoint = torch.load(load_checkpoint) 135 | logging.info("loading checkpoint") 136 |
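# nn.DataParallel registers the wrapped network under the attribute .module and
# prefixes every state_dict key with 'module.', so a checkpoint has to be loaded
# through model.module in multi-GPU mode; the branch below mirrors the save path
# in train_model above.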
137 | if torch.cuda.device_count() > 1 and MULTI_GPU_MODE: 138 | model.module.load_state_dict(checkpoint['model_state_dict']) 139 | else: 140 | model.load_state_dict(checkpoint['model_state_dict']) 141 | 142 | optimizer.load_state_dict(checkpoint['optimizer_state_dict']) 143 | epoch_resume = checkpoint['epoch'] + 1 # the stored epoch has already finished, so resume at the next one 144 | 145 | train_model(model, device, criterion, optimizer, scheduler, dataloaders, save_checkpoint, epoch_resume, num_epochs=NUM_EPOCHS) 146 | --------------------------------------------------------------------------------