├── Ensemble.ipynb
├── README.md
├── Numpy_Extraction_for_Month_Start_Month_End.ipynb
├── Numpy_Extration_for_25_Periods.ipynb
└── Field_Aggregation_Mean.ipynb
/Ensemble.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "Ensemble.ipynb",
7 | "provenance": [],
8 | "collapsed_sections": [],
9 | "include_colab_link": true
10 | },
11 | "kernelspec": {
12 | "name": "python3",
13 | "display_name": "Python 3"
14 | },
15 | "language_info": {
16 | "name": "python"
17 | }
18 | },
19 | "cells": [
20 | {
21 | "cell_type": "markdown",
22 | "metadata": {
23 | "id": "view-in-github",
24 | "colab_type": "text"
25 | },
26 | "source": [
27 | ""
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "metadata": {
33 | "id": "5XcPHquLXoba"
34 | },
35 | "source": [
36 | "# Import Libraries\n",
37 | "import pandas as pd\n",
38 | "import numpy as np"
39 | ],
40 | "execution_count": null,
41 | "outputs": []
42 | },
43 | {
44 | "cell_type": "code",
45 | "metadata": {
46 | "id": "K6UuiZ2hYCe4"
47 | },
48 | "source": [
49 | "# Function that creates the submission file; Field IDs come from the global `catboost` DataFrame loaded in the next cell\n",
50 | "def sub_creator(mpreds, name):\n",
51 | "    ss_cols = ['Field ID', 'Crop_Lucerne/Medics', 'Crop_Planted pastures (perennial)', 'Crop_Fallow', 'Crop_Wine grapes', 'Crop_Weeds',\n",
52 | "               'Crop_Small grain grazing', 'Crop_Wheat', 'Crop_Canola', 'Crop_Rooibos']\n",
53 | "    ss = pd.DataFrame(mpreds, columns=ss_cols[1:])\n",
54 | "    ss['Field ID'] = catboost['Field ID'].astype(int)\n",
55 | "    ss = ss[ss_cols]\n",
56 | "    ss.to_csv(f'{name}.csv', index=False)\n",
57 | "    return ss.head()"
58 | ],
59 | "execution_count": null,
60 | "outputs": []
61 | },
62 | {
63 | "cell_type": "code",
64 | "metadata": {
65 | "id": "06xlb3o6Xutl"
66 | },
67 | "source": [
68 | "# Ensemble pytorch catboost and lightgbm models\n",
69 | "catboost = pd.read_csv('catboost_models.csv')\n",
70 | "lgbm = pd.read_csv('lgbm_models.csv')\n",
71 | "pytorch = pd.read_csv('pytorch_models.csv')\n",
72 | "\n",
73 | "ensemble = (((catboost.iloc[:, 1:].values * 0.7) + (lgbm.iloc[:, 1:].values * 0.3))*0.5) + (pytorch.iloc[:, 1:].values * 0.5)\n",
74 | "sub_creator(ensemble, 'final_submission')"
75 | ],
76 | "execution_count": null,
77 | "outputs": []
78 | }
79 | ]
80 | }
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [Team Ensemble](https://zindi.africa/competitions/radiant-earth-spot-the-crop-xl-challenge/leaderboard/teams/ensemble) [Radiant Earth Spot the Crop XL-Challenge Solution](https://zindi.africa/competitions/radiant-earth-spot-the-crop-xl-challenge)
2 |
3 | 
4 |
5 | Team Ensemble: [Brainiac](https://www.linkedin.com/in/dariusmoruri/) | [Dr Fad](https://www.linkedin.com/in/yinka-fadahunsi-846b4531/?originalSubdomain=ke)
6 |
7 | ### ***Many thanks to the organizers for such an exciting challenge!***
8 |
9 | ## Objective
10 | The main objective of this challenge was to use time series of Sentinel-1 and Sentinel-2 multispectral data to classify crops in the Western Cape of South Africa. The task was to build a machine learning model that predicts crop type classes for the test dataset. The training dataset was generated by the Radiant Earth Foundation team, using ground reference data collected and provided by the Western Cape Department of Agriculture.
11 |
12 | ## Metrics of success
13 | The evaluation metric for this challenge was cross-entropy with a binary outcome for each crop.
14 |
15 |
16 | 
17 |
18 |
19 |
20 | In which:
21 |
22 | - j indicates the field number (j=1 to N)
23 | - N indicates total number of fields in the dataset (87,347 in the train and 35,389 in the test)
24 | - i indicates the crop type (i=1 to 9)
25 | - y_j,i is the binary (0, 1) indicator for crop type i in field j (each field has only one correct crop type)
26 | - p_j,i is the predicted probability (between 0 and 1) for crop type i in field j
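
A minimal sketch of this metric, assuming the usual form `-(1/N) * sum_j sum_i y_j,i * log(p_j,i)`. The averaging over N, the function name `cross_entropy`, and the clipping constant `eps` are assumptions for illustration and are not taken from the challenge page.

```python
import numpy as np

def cross_entropy(y_true, p_pred, eps=1e-15):
    """Mean cross-entropy over N fields.

    y_true: (N, C) one-hot indicators, one correct crop per field.
    p_pred: (N, C) predicted probabilities.
    """
    y = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0); eps is an illustrative choice.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    # Sum over crop types i, then average over fields j.
    return float(-(y * np.log(p)).sum(axis=1).mean())

# Two fields, three crop types (toy numbers, not competition data).
y = [[1, 0, 0], [0, 1, 0]]
p = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1]]
print(cross_entropy(y, p))
```

Only the probability assigned to the correct crop contributes to each field's term, so confident wrong predictions are penalised heavily.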
27 |
28 | ## Hardware resources
29 | - Google Colab Pro
30 |
31 | ## Solution Approach
32 | #### ***Data Download and Manipulation***
33 | - Images were downloaded in batches to avoid out-of-memory errors, as the Colab TPU runtime has a maximum of 35 GB of RAM.
34 | - The images were zipped and stored in Google Drive.
35 | - Images at a 10-day frequency were used to get the raw image pixels.
36 | - Images at the start and end of every month were also processed to raw NumPy values.
37 | - PySpark was used to compute the mean of the pixel values for each field. PySpark was chosen because the data was very large and compute resources were limited.
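
The per-field mean can be sketched on a small table as follows. This uses pandas rather than PySpark so that it runs anywhere; it is an illustration of the aggregation, not the team's code. The `fields` column and the `4_1_B04`-style band column names follow the extraction notebooks, while the function name `field_means` and the treatment of field id 0 as "no field" are assumptions.

```python
import pandas as pd

def field_means(pixels, field_col="fields"):
    """Average every band column over the pixels of each field."""
    # Assumption: field id 0 marks pixels that belong to no field.
    inside = pixels[pixels[field_col] != 0]
    return inside.groupby(field_col).mean().reset_index()

# Toy pixel table: one row per pixel, one column per date/band.
pixels = pd.DataFrame({
    "fields": [0, 1, 1, 2],
    "4_1_B04": [9.0, 1.0, 3.0, 5.0],
})
print(field_means(pixels))
```

The same group-by-and-average pattern carries over to a distributed DataFrame when the pixel table no longer fits in memory.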
38 |
39 | #### ***Feature Engineering and Preprocessing***
40 | - Reduced skewness with a square-root transform
41 | - Vegetation indices calculations
42 | - Vegetation Indices aggregation - mean
43 | - Vegetation indices differences between different periods
44 | - Quantiles
45 | - Filled missing and infinite numbers with -999999
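
A few of these steps can be sketched with NumPy. NDVI is used here only as an example of a vegetation index; this README does not say which indices were computed, and the helper names (`sqrt_transform`, `ndvi`, `clean`) are hypothetical.

```python
import numpy as np

FILL_VALUE = -999999  # sentinel for missing and infinite values, as listed above

def sqrt_transform(x):
    """Square-root transform to reduce right skew."""
    return np.sqrt(np.asarray(x, dtype=float))

def ndvi(nir, red):
    """Normalised difference vegetation index (example index, an assumption)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    # 0/0 produces NaN here; it is replaced later by clean().
    with np.errstate(divide="ignore", invalid="ignore"):
        return (nir - red) / (nir + red)

def clean(features):
    """Replace NaN and +/-inf with the fill value."""
    out = np.array(features, dtype=float)
    out[~np.isfinite(out)] = FILL_VALUE
    return out

# Toy reflectances for two fields at two periods.
ndvi_t1 = ndvi([0.8, 0.0], [0.2, 0.0])
ndvi_t2 = ndvi([0.6, 0.5], [0.4, 0.5])
diff = clean(ndvi_t2 - ndvi_t1)  # difference between periods
print(clean(ndvi_t1), diff)
```

A large negative sentinel works with tree-based models, which can split it away from valid values; it would need different handling for models that are sensitive to feature scale.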
46 |
47 | #### ***Model Training***
48 | - CatBoost classifier trained on vegetation indices data using a 10-fold stratified cross-validation strategy
49 | - LGBM classifier trained on vegetation indices data using a 10-fold stratified cross-validation strategy
50 | - A PyTorch classifier trained on raw image pixels
51 |
52 | #### ***Final Model***
53 | - The final model is an ensemble of boosted trees (LGBM and CatBoost) and a PyTorch classifier
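
The blend in Ensemble.ipynb can be sketched as a two-stage weighted average. The weights below are the ones in that notebook; the function name `blend` and the toy probabilities are illustrative.

```python
import numpy as np

def blend(catboost_p, lgbm_p, pytorch_p):
    """Two-stage weighted average of class-probability arrays."""
    # Stage 1: combine the two boosted-tree models (0.7 / 0.3).
    trees = 0.7 * np.asarray(catboost_p) + 0.3 * np.asarray(lgbm_p)
    # Stage 2: average the tree blend with the PyTorch model (0.5 / 0.5).
    return 0.5 * trees + 0.5 * np.asarray(pytorch_p)

# One field, two classes (toy numbers).
out = blend([[1.0, 0.0]], [[0.0, 1.0]], [[0.5, 0.5]])
print(out)
```

The effective weights are 0.35 for CatBoost, 0.15 for LGBM, and 0.5 for PyTorch, so each blended row still sums to 1 when the inputs do.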
54 |
55 |
56 | ## To reproduce the leaderboard score, follow these instructions
57 | 1. Upload the Feature_Engineering_&_CATBOOST.ipynb notebook to colab.
58 | - Enable GPU runtime
59 | - Run all to get the catboost_models file
60 |
61 | 2. Upload the Feature_Engineering_&_LGBM.ipynb notebook to colab.
62 | - Enable TPU runtime
63 | - Run all to get the lgbm_models file
64 |
65 | 3. Upload the Pixel_Features-Pytorch.ipynb notebook to colab
66 | - Enable GPU runtime
67 | - Run all to get the pytorch_models file
68 |
69 | 4. Finally upload the Ensemble.ipynb notebook to colab
70 | - Upload the lgbm_models file
71 | - Upload the catboost_models file
72 | - Upload the pytorch_models file
73 | - Run all to get the final submission file
74 |
--------------------------------------------------------------------------------
/Numpy_Extraction_for_Month_Start_Month_End.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "Numpy_Extraction_for_Month_Start_Month_End.ipynb",
7 | "provenance": [],
8 | "collapsed_sections": [],
9 | "machine_shape": "hm",
10 | "include_colab_link": true
11 | },
12 | "kernelspec": {
13 | "name": "python3",
14 | "display_name": "Python 3"
15 | },
16 | "language_info": {
17 | "name": "python"
18 | },
19 | "accelerator": "TPU"
20 | },
21 | "cells": [
22 | {
23 | "cell_type": "markdown",
24 | "metadata": {
25 | "id": "view-in-github",
26 | "colab_type": "text"
27 | },
28 | "source": [
29 | ""
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "metadata": {
35 | "colab": {
36 | "base_uri": "https://localhost:8080/"
37 | },
38 | "id": "vemh8H3vZ9VO",
39 | "outputId": "fef28028-28ff-43a8-b9c5-30d3513dd6a7"
40 | },
41 | "source": [
42 | "!pip -qq install rasterio tifffile"
43 | ],
44 | "execution_count": null,
45 | "outputs": []
55 | },
56 | {
57 | "cell_type": "code",
58 | "metadata": {
59 | "id": "3mKYb4JmMBEZ"
60 | },
61 | "source": [
62 | "import os\n",
63 | "import glob\n",
64 | "import shutil\n",
65 | "import gc\n",
66 | "from joblib import Parallel, delayed\n",
67 | "from tqdm import tqdm_notebook\n",
68 | "import h5py\n",
69 | "\n",
70 | "import pandas as pd\n",
71 | "import numpy as np\n",
72 | "import datetime as dt\n",
73 | "import matplotlib.pyplot as plt\n",
74 | "\n",
75 | "\n",
76 | "import rasterio\n",
77 | "import tifffile as tiff\n",
78 | "\n",
79 | "%matplotlib inline\n",
80 | "pd.set_option('display.max_colwidth', None)"
81 | ],
82 | "execution_count": null,
83 | "outputs": []
84 | },
85 | {
86 | "cell_type": "code",
87 | "metadata": {
88 | "id": "MiS1dNGpRHBe"
89 | },
90 | "source": [
91 | "%%time\n",
92 | "# os.mkdir('radiant')\n",
93 | "shutil.unpack_archive('/content/drive/MyDrive/CompeData/Radiant/Radiant_Data.zip', '/content/radiant')\n",
94 | "gc.collect()"
95 | ],
96 | "execution_count": null,
97 | "outputs": []
98 | },
99 | {
100 | "cell_type": "code",
101 | "metadata": {
102 | "id": "O0_JwAo0bMdV"
103 | },
104 | "source": [
105 | "train = pd.concat([pd.read_csv(f'/content/radiant/train{i}.csv', parse_dates=['datetime']) for i in range(1, 5)]).reset_index(drop = True)\n",
106 | "test = pd.concat([pd.read_csv(f'/content/radiant/test{i}.csv', parse_dates=['datetime']) for i in range(1, 5)]).reset_index(drop = True)\n",
107 | "train.file_path = train.file_path.apply(lambda x: '/'.join(['/content', 'radiant'] + x.split('/')[2:]))\n",
108 | "test.file_path = test.file_path.apply(lambda x: '/'.join(['/content', 'radiant'] + x.split('/')[2:]))\n",
109 | "train.datetime, test.datetime = pd.to_datetime(train.datetime.dt.date), pd.to_datetime(test.datetime.dt.date)\n",
110 | "train['month'], test['month'] = train.datetime.dt.month, test.datetime.dt.month\n",
111 | "train.head()"
112 | ],
113 | "execution_count": null,
114 | "outputs": []
115 | },
116 | {
117 | "cell_type": "code",
118 | "metadata": {
119 | "id": "ph__jwFtDmR6"
120 | },
121 | "source": [
122 | "train.month.unique()"
123 | ],
124 | "execution_count": null,
125 | "outputs": []
126 | },
127 | {
128 | "cell_type": "code",
129 | "metadata": {
130 | "id": "ofOR8cvvB7YW"
131 | },
132 | "source": [
133 | "train.tile_id.unique()[50:60]"
134 | ],
135 | "execution_count": null,
136 | "outputs": []
137 | },
138 | {
139 | "cell_type": "code",
140 | "metadata": {
141 | "id": "idcCxN2tAbQz"
142 | },
143 | "source": [
144 | "bands = ['B01','B02','B03','B04','B05','B06','B07','B08','B8A','B09','B11','B12','CLM']"
145 | ],
146 | "execution_count": null,
147 | "outputs": []
148 | },
149 | {
150 | "cell_type": "code",
151 | "metadata": {
152 | "id": "CwkqspiVAgSi"
153 | },
154 | "source": [
155 | "date_cols = []\n",
156 | "for i in range(4, 12):\n",
157 | "    for x in range(1, 3):\n",
158 | "        date_cols.append(str(i) + '_' + str(x))\n",
159 | "date_cols"
160 | ],
161 | "execution_count": null,
162 | "outputs": []
163 | },
164 | {
165 | "cell_type": "code",
166 | "metadata": {
167 | "id": "JJhxiJcXAbNN"
168 | },
169 | "source": [
170 | "def process_tile_train(tile):\n",
171 | "    tile_df = train[(train.tile_id == tile)].reset_index(drop = True)\n",
172 | "\n",
173 | "    y = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'labels'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
174 | "    fields = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'field_ids'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
175 | "\n",
176 | "    tile_df = train[(train.tile_id == tile) & (train.satellite_platform == 's2')].reset_index(drop = True)\n",
177 | "\n",
178 | "    dates = []\n",
179 | "    for month in range(4, 12):\n",
180 | "        dates.append(tile_df[tile_df.month == month].datetime.sort_values().tolist()[0])\n",
181 | "        dates.append(tile_df[tile_df.month == month].datetime.sort_values().tolist()[-1])\n",
182 | "\n",
183 | "    X_tile = np.empty((256 * 256, 0))\n",
184 | "\n",
185 | "    colls = []\n",
186 | "    for date, datec in zip(dates, date_cols):\n",
187 | "        for band in bands:\n",
188 | "            tif_file = tile_df[(tile_df.asset == band) & (tile_df.datetime == date)].file_path.values[0]\n",
189 | "            X_tile = np.append(X_tile, (np.expand_dims(rasterio.open(tif_file).read(1).flatten(), axis = 1)), axis = 1)\n",
190 | "            colls.append(datec + '_' + band)\n",
191 | "    df = pd.DataFrame(X_tile, columns = colls)\n",
192 | "    df['y'], df['fields'] = y, fields\n",
193 | "    return df"
194 | ],
195 | "execution_count": null,
196 | "outputs": []
197 | },
198 | {
199 | "cell_type": "code",
200 | "metadata": {
201 | "id": "7hQOhSMPb-qI"
202 | },
203 | "source": [
204 | "tiles = train.tile_id.unique()\n",
205 | "chunks = [tiles[x:x+265] for x in range(0, len(tiles), 265)]\n",
206 | "[len(x) for x in chunks]"
207 | ],
208 | "execution_count": null,
209 | "outputs": []
210 | },
211 | {
212 | "cell_type": "code",
213 | "metadata": {
214 | "id": "iULhCd36KrGG"
215 | },
216 | "source": [
217 | "for i in range(len(chunks)):\n",
218 | "    pd.DataFrame(np.vstack(Parallel(n_jobs=-1, verbose=1, backend=\"multiprocessing\")(map(delayed(process_tile_train), [x for x in chunks[i]])))).to_csv(f'/content/drive/MyDrive/CompeData/Radiant/Start_end/train{i}.csv', index = False)\n",
219 | "    gc.collect()\n",
220 | "    print(i)"
221 | ],
222 | "execution_count": null,
223 | "outputs": []
224 | },
225 | {
226 | "cell_type": "code",
227 | "metadata": {
228 | "id": "BZD9TPDRLAuQ"
229 | },
230 | "source": [
231 | "def process_tile_test(tile):\n",
232 | "    tile_df = test[(test.tile_id == tile)].reset_index(drop = True)\n",
233 | "\n",
234 | "    # Test tiles have no labels, so only the field ids are read\n",
235 | "    fields = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'field_ids'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
236 | "\n",
237 | "    tile_df = test[(test.tile_id == tile) & (test.satellite_platform == 's2')].reset_index(drop = True)\n",
238 | "\n",
239 | "    dates = []\n",
240 | "    for month in range(4, 12):\n",
241 | "        dates.append(tile_df[tile_df.month == month].datetime.sort_values().tolist()[0])\n",
242 | "        dates.append(tile_df[tile_df.month == month].datetime.sort_values().tolist()[-1])\n",
243 | "\n",
244 | "    X_tile = np.empty((256 * 256, 0))\n",
245 | "\n",
246 | "    colls = []\n",
247 | "    for date, datec in zip(dates, date_cols):\n",
248 | "        for band in bands:\n",
249 | "            tif_file = tile_df[(tile_df.asset == band) & (tile_df.datetime == date)].file_path.values[0]\n",
250 | "            X_tile = np.append(X_tile, (np.expand_dims(rasterio.open(tif_file).read(1).flatten(), axis = 1)), axis = 1)\n",
251 | "            colls.append(datec + '_' + band)\n",
252 | "    df = pd.DataFrame(X_tile, columns = colls)\n",
253 | "    df['fields'] = fields\n",
254 | "    return df"
255 | ],
256 | "execution_count": null,
257 | "outputs": []
258 | },
259 | {
260 | "cell_type": "code",
261 | "metadata": {
262 | "id": "wbk6YD0akhTo"
263 | },
264 | "source": [
265 | "tiles = test.tile_id.unique()\n",
266 | "chunks = [tiles[x:x+265] for x in range(0, len(tiles), 265)]\n",
267 | "[len(x) for x in chunks]"
268 | ],
269 | "execution_count": null,
270 | "outputs": []
271 | },
272 | {
273 | "cell_type": "code",
274 | "metadata": {
275 | "id": "uQS4UBXKkhRF"
276 | },
277 | "source": [
278 | "for i in range(len(chunks)):\n",
279 | "    pd.DataFrame(np.vstack(Parallel(n_jobs=-1, verbose=1, backend=\"multiprocessing\")(map(delayed(process_tile_test), [x for x in chunks[i]])))).to_csv(f'/content/drive/MyDrive/CompeData/Radiant/Start_end/test{i}.csv', index = False)\n",
280 | "    gc.collect()\n",
281 | "    print(i)"
282 | ],
283 | "execution_count": null,
284 | "outputs": []
285 | },
286 | {
287 | "cell_type": "code",
288 | "metadata": {
289 | "id": "ilCskgRkkhMg"
290 | },
291 | "source": [
292 | ""
293 | ],
294 | "execution_count": null,
295 | "outputs": []
296 | }
297 | ]
298 | }
--------------------------------------------------------------------------------
/Numpy_Extration_for_25_Periods.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "Numpy_Extration_for_25_Periods.ipynb",
7 | "provenance": [],
8 | "collapsed_sections": [],
9 | "machine_shape": "hm",
10 | "include_colab_link": true
11 | },
12 | "kernelspec": {
13 | "name": "python3",
14 | "display_name": "Python 3"
15 | },
16 | "language_info": {
17 | "name": "python"
18 | },
19 | "accelerator": "GPU"
20 | },
21 | "cells": [
22 | {
23 | "cell_type": "markdown",
24 | "metadata": {
25 | "id": "view-in-github",
26 | "colab_type": "text"
27 | },
28 | "source": [
29 | ""
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "metadata": {
35 | "id": "vemh8H3vZ9VO"
36 | },
37 | "source": [
38 | "# Install libraries\n",
39 | "!pip -qq install rasterio tifffile"
40 | ],
41 | "execution_count": null,
42 | "outputs": []
43 | },
44 | {
45 | "cell_type": "code",
46 | "metadata": {
47 | "id": "3mKYb4JmMBEZ"
48 | },
49 | "source": [
50 | "# Import libraries\n",
51 | "import os\n",
52 | "import glob\n",
53 | "import shutil\n",
54 | "import gc\n",
55 | "from joblib import Parallel, delayed\n",
56 | "from tqdm import tqdm_notebook\n",
57 | "import h5py\n",
58 | "\n",
59 | "import pandas as pd\n",
60 | "import numpy as np\n",
61 | "import datetime as dt\n",
62 | "from datetime import datetime, timedelta\n",
63 | "import matplotlib.pyplot as plt\n",
64 | "\n",
65 | "\n",
66 | "import rasterio\n",
67 | "import tifffile as tiff\n",
68 | "\n",
69 | "%matplotlib inline\n",
70 | "pd.set_option('display.max_colwidth', None)"
71 | ],
72 | "execution_count": null,
73 | "outputs": []
74 | },
75 | {
76 | "cell_type": "code",
77 | "metadata": {
78 | "id": "MAFPT_D_ovYr"
79 | },
80 | "source": [
81 | "# Build the 25 dates spaced 10 days apart, starting at start_date\n",
82 | "def date_finder(start_date):\n",
83 | "    season_dates = []\n",
84 | "    m = str(start_date)[:10]\n",
85 | "    s = str(start_date)[:10]\n",
86 | "    for i in range(24):\n",
87 | "        date = datetime.strptime(s, \"%Y-%m-%d\")\n",
88 | "        s = str(date + timedelta(days = 10))[:10]\n",
89 | "        season_dates.append(datetime.strptime(s, \"%Y-%m-%d\"))\n",
90 | "    seasons_dates = [datetime.strptime(m, \"%Y-%m-%d\")] + season_dates\n",
91 | "    seasons_dates = [np.datetime64(x) for x in seasons_dates]\n",
92 | "    return list(seasons_dates)\n",
93 | "\n",
94 | "# If a date is not on the 10-day grid, find the nearest available date\n",
95 | "def nearest(items, pivot):\n",
96 | "    return min(items, key=lambda x: abs(x - pivot))"
97 | ],
98 | "execution_count": null,
99 | "outputs": []
100 | },
101 | {
102 | "cell_type": "code",
103 | "metadata": {
104 | "id": "MiS1dNGpRHBe",
105 | "colab": {
106 | "base_uri": "https://localhost:8080/"
107 | },
108 | "outputId": "d0ef8c1b-1d0d-4fe7-8998-e8e3f1d9118f"
109 | },
110 | "source": [
111 | "%%time\n",
112 | "# Unpack data saved in gdrive to colab\n",
113 | "shutil.unpack_archive('/content/drive/MyDrive/CompeData/Radiant/Radiant_Data.zip', '/content/radiant')\n",
114 | "gc.collect()"
115 | ],
116 | "execution_count": null,
117 | "outputs": [
118 | {
119 | "output_type": "stream",
120 | "name": "stdout",
121 | "text": [
122 | "CPU times: user 13min 43s, sys: 4min 22s, total: 18min 5s\n",
123 | "Wall time: 27min 23s\n"
124 | ]
125 | }
126 | ]
127 | },
128 | {
129 | "cell_type": "code",
130 | "metadata": {
131 | "id": "O0_JwAo0bMdV",
132 | "colab": {
133 | "base_uri": "https://localhost:8080/",
134 | "height": 203
135 | },
136 | "outputId": "0b21aed2-dc7b-414c-fcd6-8340d6ca3f8c"
137 | },
138 | "source": [
139 | "# Load files\n",
140 | "train = pd.concat([pd.read_csv(f'/content/radiant/train{i}.csv', parse_dates=['datetime']) for i in range(1, 5)]).reset_index(drop = True)\n",
141 | "test = pd.concat([pd.read_csv(f'/content/radiant/test{i}.csv', parse_dates=['datetime']) for i in range(1, 5)]).reset_index(drop = True)\n",
142 | "train.file_path = train.file_path.apply(lambda x: '/'.join(['/content', 'radiant'] + x.split('/')[2:]))\n",
143 | "test.file_path = test.file_path.apply(lambda x: '/'.join(['/content', 'radiant'] + x.split('/')[2:]))\n",
144 | "train.datetime, test.datetime = pd.to_datetime(train.datetime.dt.date), pd.to_datetime(test.datetime.dt.date)\n",
145 | "train['month'], test['month'] = train.datetime.dt.month, test.datetime.dt.month\n",
146 | "train.head()"
147 | ],
148 | "execution_count": null,
149 | "outputs": [
150 | {
151 | "output_type": "execute_result",
152 | "data": {
153 | "text/html": [
154 | "<table>\n",
155 | "  <thead>\n",
156 | "    <tr><th></th><th>tile_id</th><th>datetime</th><th>satellite_platform</th><th>asset</th><th>file_path</th><th>month</th></tr>\n",
157 | "  </thead>\n",
158 | "  <tbody>\n",
159 | "    <tr><th>0</th><td>2587</td><td>NaT</td><td>NaN</td><td>documentation</td><td>/content/radiant/ref_south_africa_crops_competition_v1_train_labels/_common/documentation.pdf</td><td>NaN</td></tr>\n",
160 | "    <tr><th>1</th><td>2587</td><td>NaT</td><td>NaN</td><td>field_ids</td><td>/content/radiant/ref_south_africa_crops_competition_v1_train_labels/ref_south_africa_crops_competition_v1_train_labels_2587/field_ids.tif</td><td>NaN</td></tr>\n",
161 | "    <tr><th>2</th><td>2587</td><td>NaT</td><td>NaN</td><td>field_info_train</td><td>/content/radiant/ref_south_africa_crops_competition_v1_train_labels/_common/field_info_train.csv</td><td>NaN</td></tr>\n",
162 | "    <tr><th>3</th><td>2587</td><td>NaT</td><td>NaN</td><td>labels</td><td>/content/radiant/ref_south_africa_crops_competition_v1_train_labels/ref_south_africa_crops_competition_v1_train_labels_2587/labels.tif</td><td>NaN</td></tr>\n",
163 | "    <tr><th>4</th><td>2587</td><td>NaT</td><td>NaN</td><td>raster_values</td><td>/content/radiant/ref_south_africa_crops_competition_v1_train_labels/_common/raster_values.json</td><td>NaN</td></tr>\n",
164 | "  </tbody>\n",
165 | "</table>"
229 | ],
230 | "text/plain": [
231 | " tile_id ... month\n",
232 | "0 2587 ... NaN\n",
233 | "1 2587 ... NaN\n",
234 | "2 2587 ... NaN\n",
235 | "3 2587 ... NaN\n",
236 | "4 2587 ... NaN\n",
237 | "\n",
238 | "[5 rows x 6 columns]"
239 | ]
240 | },
241 | "metadata": {},
242 | "execution_count": 4
243 | }
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "metadata": {
249 | "id": "ph__jwFtDmR6",
250 | "colab": {
251 | "base_uri": "https://localhost:8080/"
252 | },
253 | "outputId": "61c45381-5b4f-47de-f2ff-4d264fb2eb7d"
254 | },
255 | "source": [
256 | "# Unique months\n",
257 | "train.month.unique()"
258 | ],
259 | "execution_count": null,
260 | "outputs": [
261 | {
262 | "output_type": "execute_result",
263 | "data": {
264 | "text/plain": [
265 | "array([nan, 4., 5., 6., 7., 8., 9., 10., 11.])"
266 | ]
267 | },
268 | "metadata": {},
269 | "execution_count": 5
270 | }
271 | ]
272 | },
273 | {
274 | "cell_type": "code",
275 | "metadata": {
276 | "id": "idcCxN2tAbQz"
277 | },
278 | "source": [
279 | "# Bands\n",
280 | "bands = ['B01','B02','B03','B04','B05','B06','B07','B08','B8A','B09','B11','B12','CLM']"
281 | ],
282 | "execution_count": null,
283 | "outputs": []
284 | },
285 | {
286 | "cell_type": "code",
287 | "metadata": {
288 | "id": "JJhxiJcXAbNN"
289 | },
290 | "source": [
291 | "# Function to load tile and extract fields data into a numpy array and convert the same to a dataframe\n",
292 | "# Train\n",
293 | "def process_tile_train(tile):\n",
294 | "    tile_df = train[(train.tile_id == tile)].reset_index(drop = True)\n",
295 | "\n",
296 | "    y = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'labels'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
297 | "    fields = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'field_ids'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
298 | "\n",
299 | "    tile_df = train[(train.tile_id == tile) & (train.satellite_platform == 's2')].reset_index(drop = True)\n",
300 | "\n",
301 | "    unique_dates = list(tile_df.datetime.unique())\n",
302 | "    start_date = tile_df.datetime.unique()[0]\n",
303 | "    # Expected 10-day dates that have no image for this tile\n",
304 | "    diff = set([str(x)[:10] for x in date_finder(start_date)]) - set([str(x)[:10] for x in unique_dates])\n",
305 | "    if len(diff) > 0:\n",
306 | "        missing = list(set([str(x)[:10] for x in date_finder(start_date)]) - set(diff))\n",
307 | "        for d in diff:\n",
308 | "            missing.append(str(nearest(unique_dates, np.datetime64(d)))[:10])\n",
309 | "        dates = sorted([np.datetime64(x) for x in missing])\n",
310 | "    else:\n",
311 | "        dates = date_finder(start_date)\n",
312 | "\n",
313 | "    X_tile = np.empty((256 * 256, 0))\n",
314 | "\n",
315 | "    colls = []\n",
316 | "    for date, datec in zip(dates, range(25)):\n",
317 | "        for band in bands:\n",
318 | "            tif_file = tile_df[(tile_df.asset == band) & (tile_df.datetime == date)].file_path.values[0]\n",
319 | "            X_tile = np.append(X_tile, (np.expand_dims(rasterio.open(tif_file).read(1).flatten(), axis = 1)), axis = 1)\n",
320 | "            colls.append(str(datec) + '_' + band)\n",
321 | "    df = pd.DataFrame(X_tile, columns = colls)\n",
322 | "    df['y'], df['fields'] = y, fields\n",
323 | "    return df"
324 | ],
325 | "execution_count": null,
326 | "outputs": []
327 | },
328 | {
329 | "cell_type": "code",
330 | "metadata": {
331 | "id": "7hQOhSMPb-qI",
332 | "colab": {
333 | "base_uri": "https://localhost:8080/"
334 | },
335 | "outputId": "8a2ab606-a1b3-440f-8df4-546cc0be0b2d"
336 | },
337 | "source": [
338 | "# Preprocessing the data in chunks to avoid out-of-memory errors\n",
339 | "# Train\n",
340 | "tiles = train.tile_id.unique()\n",
341 | "chunks = [tiles[x:x+50] for x in range(0, len(tiles), 50)]\n",
342 | "[len(x) for x in chunks], len(chunks)"
343 | ],
344 | "execution_count": null,
345 | "outputs": [
346 | {
347 | "output_type": "execute_result",
348 | "data": {
349 | "text/plain": [
350 | "([50,\n",
351 | " 50,\n",
352 | " 50,\n",
353 | " 50,\n",
354 | " 50,\n",
355 | " 50,\n",
356 | " 50,\n",
357 | " 50,\n",
358 | " 50,\n",
359 | " 50,\n",
360 | " 50,\n",
361 | " 50,\n",
362 | " 50,\n",
363 | " 50,\n",
364 | " 50,\n",
365 | " 50,\n",
366 | " 50,\n",
367 | " 50,\n",
368 | " 50,\n",
369 | " 50,\n",
370 | " 50,\n",
371 | " 50,\n",
372 | " 50,\n",
373 | " 50,\n",
374 | " 50,\n",
375 | " 50,\n",
376 | " 50,\n",
377 | " 50,\n",
378 | " 50,\n",
379 | " 50,\n",
380 | " 50,\n",
381 | " 50,\n",
382 | " 50,\n",
383 | " 50,\n",
384 | " 50,\n",
385 | " 50,\n",
386 | " 50,\n",
387 | " 50,\n",
388 | " 50,\n",
389 | " 50,\n",
390 | " 50,\n",
391 | " 50,\n",
392 | " 50,\n",
393 | " 50,\n",
394 | " 50,\n",
395 | " 50,\n",
396 | " 50,\n",
397 | " 50,\n",
398 | " 50,\n",
399 | " 50,\n",
400 | " 50,\n",
401 | " 50,\n",
402 | " 50],\n",
403 | " 53)"
404 | ]
405 | },
406 | "metadata": {},
407 | "execution_count": 43
408 | }
409 | ]
410 | },
411 | {
412 | "cell_type": "code",
413 | "metadata": {
414 | "id": "iULhCd36KrGG"
415 | },
416 | "source": [
417 | "# Preprocessing the tiles without storing them in memory but saving them as csvs in gdrive\n",
418 | "# Train\n",
419 | "for i in range(len(chunks)):\n",
420 | "    pd.DataFrame(np.vstack(Parallel(n_jobs=-1, verbose=1, backend=\"multiprocessing\")(map(delayed(process_tile_train), [x for x in chunks[i]])))).to_csv(f'/content/drive/MyDrive/CompeData/Radiant/Seasonality/train/train{i}.csv', index = False)\n",
421 | "    gc.collect()"
422 | ],
423 | "execution_count": null,
424 | "outputs": []
425 | },
426 | {
427 | "cell_type": "code",
428 | "metadata": {
429 | "id": "scZ6uxWaz0Fq"
430 | },
431 | "source": [
432 | "# Function to load tile and extract fields data into a numpy array and convert the same to a dataframe\n",
433 | "# Test\n",
434 | "def process_tile_test(tile):\n",
435 | "    tile_df = test[(test.tile_id == tile)].reset_index(drop = True)\n",
436 | "\n",
437 | "    fields = np.expand_dims(rasterio.open(tile_df[tile_df.asset == 'field_ids'].file_path.values[0]).read(1).flatten(), axis = 1)\n",
438 | "\n",
439 | "    tile_df = test[(test.tile_id == tile) & (test.satellite_platform == 's2')].reset_index(drop = True)\n",
440 | "\n",
441 | "    unique_dates = list(tile_df.datetime.unique())\n",
442 | "    start_date = tile_df.datetime.unique()[0]\n",
443 | "    # Expected 10-day dates that have no image for this tile\n",
444 | "    diff = set([str(x)[:10] for x in date_finder(start_date)]) - set([str(x)[:10] for x in unique_dates])\n",
445 | "    if len(diff) > 0:\n",
446 | "        missing = list(set([str(x)[:10] for x in date_finder(start_date)]) - set(diff))\n",
447 | "        for d in diff:\n",
448 | "            missing.append(str(nearest(unique_dates, np.datetime64(d)))[:10])\n",
449 | "        dates = sorted([np.datetime64(x) for x in missing])\n",
450 | "    else:\n",
451 | "        dates = date_finder(start_date)\n",
452 | "\n",
453 | "    X_tile = np.empty((256 * 256, 0))\n",
454 | "\n",
455 | "    colls = []\n",
456 | "    for date, datec in zip(dates, range(25)):\n",
457 | "        for band in bands:\n",
458 | "            tif_file = tile_df[(tile_df.asset == band) & (tile_df.datetime == date)].file_path.values[0]\n",
459 | "            X_tile = np.append(X_tile, (np.expand_dims(rasterio.open(tif_file).read(1).flatten(), axis = 1)), axis = 1)\n",
460 | "            colls.append(str(datec) + '_' + band)\n",
461 | "    df = pd.DataFrame(X_tile, columns = colls)\n",
462 | "    df['fields'] = fields\n",
463 | "    return df"
464 | ],
465 | "execution_count": null,
466 | "outputs": []
467 | },
468 | {
469 | "cell_type": "code",
470 | "metadata": {
471 | "id": "q2-rj6T9z0Co"
472 | },
473 | "source": [
474 | "# Preprocessing the data in chunks to avoid out-of-memory errors\n",
475 | "# Test\n",
476 | "tiles = test.tile_id.unique()\n",
477 | "chunks = [tiles[x:x+50] for x in range(0, len(tiles), 50)]\n",
478 | "[len(x) for x in chunks], len(chunks)"
479 | ],
480 | "execution_count": null,
481 | "outputs": []
482 | },
483 | {
484 | "cell_type": "code",
485 | "metadata": {
486 | "id": "ifNd5fgxzz_1"
487 | },
488 | "source": [
489 | "# Preprocessing the tiles without storing them in memory but saving them as csvs in gdrive\n",
490 | "# Test\n",
491 | "for i in range(len(chunks)):\n",
492 | "    pd.DataFrame(np.vstack(Parallel(n_jobs=-1, verbose=1, backend=\"multiprocessing\")(map(delayed(process_tile_test), [x for x in chunks[i]])))).to_csv(f'/content/drive/MyDrive/CompeData/Radiant/Seasonality/test/test{i}.csv', index = False)\n",
493 | "    gc.collect()"
494 | ],
495 | "execution_count": null,
496 | "outputs": []
497 | }
498 | ]
499 | }
--------------------------------------------------------------------------------
/Field_Aggregation_Mean.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "kernelspec": {
6 | "display_name": "Python 3",
7 | "language": "python",
8 | "name": "python3"
9 | },
10 | "language_info": {
11 | "codemirror_mode": {
12 | "name": "ipython",
13 | "version": 3
14 | },
15 | "file_extension": ".py",
16 | "mimetype": "text/x-python",
17 | "name": "python",
18 | "nbconvert_exporter": "python",
19 | "pygments_lexer": "ipython3",
20 | "version": "3.7.4"
21 | },
22 | "colab": {
23 | "name": "Field_Aggregation_Mean.ipynb",
24 | "provenance": [],
25 | "collapsed_sections": [],
26 | "machine_shape": "hm",
27 | "include_colab_link": true
28 | },
29 | "widgets": {
30 | "application/vnd.jupyter.widget-state+json": {
31 | "7b430419c8e04c91acafc35a53118cf3": {
32 | "model_module": "@jupyter-widgets/controls",
33 | "model_name": "HBoxModel",
34 | "model_module_version": "1.5.0",
35 | "state": {
36 | "_view_name": "HBoxView",
37 | "_dom_classes": [],
38 | "_model_name": "HBoxModel",
39 | "_view_module": "@jupyter-widgets/controls",
40 | "_model_module_version": "1.5.0",
41 | "_view_count": null,
42 | "_view_module_version": "1.5.0",
43 | "box_style": "",
44 | "layout": "IPY_MODEL_8466a8ec4bc347b4b69f220d31679012",
45 | "_model_module": "@jupyter-widgets/controls",
46 | "children": [
47 | "IPY_MODEL_2bce2a23961544b1960e268693e33947",
48 | "IPY_MODEL_89380e275f19406a8614dace6af1f99b",
49 | "IPY_MODEL_54724ef8ab44408c8fcc616610dd9bdb"
50 | ]
51 | }
52 | },
53 | "8466a8ec4bc347b4b69f220d31679012": {
54 | "model_module": "@jupyter-widgets/base",
55 | "model_name": "LayoutModel",
56 | "model_module_version": "1.2.0",
57 | "state": {
58 | "_view_name": "LayoutView",
59 | "grid_template_rows": null,
60 | "right": null,
61 | "justify_content": null,
62 | "_view_module": "@jupyter-widgets/base",
63 | "overflow": null,
64 | "_model_module_version": "1.2.0",
65 | "_view_count": null,
66 | "flex_flow": null,
67 | "width": null,
68 | "min_width": null,
69 | "border": null,
70 | "align_items": null,
71 | "bottom": null,
72 | "_model_module": "@jupyter-widgets/base",
73 | "top": null,
74 | "grid_column": null,
75 | "overflow_y": null,
76 | "overflow_x": null,
77 | "grid_auto_flow": null,
78 | "grid_area": null,
79 | "grid_template_columns": null,
80 | "flex": null,
81 | "_model_name": "LayoutModel",
82 | "justify_items": null,
83 | "grid_row": null,
84 | "max_height": null,
85 | "align_content": null,
86 | "visibility": null,
87 | "align_self": null,
88 | "height": null,
89 | "min_height": null,
90 | "padding": null,
91 | "grid_auto_rows": null,
92 | "grid_gap": null,
93 | "max_width": null,
94 | "order": null,
95 | "_view_module_version": "1.2.0",
96 | "grid_template_areas": null,
97 | "object_position": null,
98 | "object_fit": null,
99 | "grid_auto_columns": null,
100 | "margin": null,
101 | "display": null,
102 | "left": null
103 | }
104 | },
105 | "2bce2a23961544b1960e268693e33947": {
106 | "model_module": "@jupyter-widgets/controls",
107 | "model_name": "HTMLModel",
108 | "model_module_version": "1.5.0",
109 | "state": {
110 | "_view_name": "HTMLView",
111 | "style": "IPY_MODEL_1bd7d3e972804c65bed5fdc30011d9f1",
112 | "_dom_classes": [],
113 | "description": "",
114 | "_model_name": "HTMLModel",
115 | "placeholder": "",
116 | "_view_module": "@jupyter-widgets/controls",
117 | "_model_module_version": "1.5.0",
118 | "value": "100%",
119 | "_view_count": null,
120 | "_view_module_version": "1.5.0",
121 | "description_tooltip": null,
122 | "_model_module": "@jupyter-widgets/controls",
123 | "layout": "IPY_MODEL_b771437f48194bad9dbcc254019b6803"
124 | }
125 | },
126 | "89380e275f19406a8614dace6af1f99b": {
127 | "model_module": "@jupyter-widgets/controls",
128 | "model_name": "FloatProgressModel",
129 | "model_module_version": "1.5.0",
130 | "state": {
131 | "_view_name": "ProgressView",
132 | "style": "IPY_MODEL_e8fa90fbc2b741ad988e6b5d13af6d67",
133 | "_dom_classes": [],
134 | "description": "",
135 | "_model_name": "FloatProgressModel",
136 | "bar_style": "success",
137 | "max": 10,
138 | "_view_module": "@jupyter-widgets/controls",
139 | "_model_module_version": "1.5.0",
140 | "value": 10,
141 | "_view_count": null,
142 | "_view_module_version": "1.5.0",
143 | "orientation": "horizontal",
144 | "min": 0,
145 | "description_tooltip": null,
146 | "_model_module": "@jupyter-widgets/controls",
147 | "layout": "IPY_MODEL_f649ba6bc4b14c7ab06d08da5fe3b7de"
148 | }
149 | },
150 | "54724ef8ab44408c8fcc616610dd9bdb": {
151 | "model_module": "@jupyter-widgets/controls",
152 | "model_name": "HTMLModel",
153 | "model_module_version": "1.5.0",
154 | "state": {
155 | "_view_name": "HTMLView",
156 | "style": "IPY_MODEL_e5be528350c54c34b702a87be4c16b40",
157 | "_dom_classes": [],
158 | "description": "",
159 | "_model_name": "HTMLModel",
160 | "placeholder": "",
161 | "_view_module": "@jupyter-widgets/controls",
162 | "_model_module_version": "1.5.0",
163 | "value": " 10/10 [00:11<00:00, 1.58it/s]",
164 | "_view_count": null,
165 | "_view_module_version": "1.5.0",
166 | "description_tooltip": null,
167 | "_model_module": "@jupyter-widgets/controls",
168 | "layout": "IPY_MODEL_355037c622754c2bba10bd49c8f25754"
169 | }
170 | },
171 | "1bd7d3e972804c65bed5fdc30011d9f1": {
172 | "model_module": "@jupyter-widgets/controls",
173 | "model_name": "DescriptionStyleModel",
174 | "model_module_version": "1.5.0",
175 | "state": {
176 | "_view_name": "StyleView",
177 | "_model_name": "DescriptionStyleModel",
178 | "description_width": "",
179 | "_view_module": "@jupyter-widgets/base",
180 | "_model_module_version": "1.5.0",
181 | "_view_count": null,
182 | "_view_module_version": "1.2.0",
183 | "_model_module": "@jupyter-widgets/controls"
184 | }
185 | },
186 | "b771437f48194bad9dbcc254019b6803": {
187 | "model_module": "@jupyter-widgets/base",
188 | "model_name": "LayoutModel",
189 | "model_module_version": "1.2.0",
190 | "state": {
191 | "_view_name": "LayoutView",
192 | "grid_template_rows": null,
193 | "right": null,
194 | "justify_content": null,
195 | "_view_module": "@jupyter-widgets/base",
196 | "overflow": null,
197 | "_model_module_version": "1.2.0",
198 | "_view_count": null,
199 | "flex_flow": null,
200 | "width": null,
201 | "min_width": null,
202 | "border": null,
203 | "align_items": null,
204 | "bottom": null,
205 | "_model_module": "@jupyter-widgets/base",
206 | "top": null,
207 | "grid_column": null,
208 | "overflow_y": null,
209 | "overflow_x": null,
210 | "grid_auto_flow": null,
211 | "grid_area": null,
212 | "grid_template_columns": null,
213 | "flex": null,
214 | "_model_name": "LayoutModel",
215 | "justify_items": null,
216 | "grid_row": null,
217 | "max_height": null,
218 | "align_content": null,
219 | "visibility": null,
220 | "align_self": null,
221 | "height": null,
222 | "min_height": null,
223 | "padding": null,
224 | "grid_auto_rows": null,
225 | "grid_gap": null,
226 | "max_width": null,
227 | "order": null,
228 | "_view_module_version": "1.2.0",
229 | "grid_template_areas": null,
230 | "object_position": null,
231 | "object_fit": null,
232 | "grid_auto_columns": null,
233 | "margin": null,
234 | "display": null,
235 | "left": null
236 | }
237 | },
238 | "e8fa90fbc2b741ad988e6b5d13af6d67": {
239 | "model_module": "@jupyter-widgets/controls",
240 | "model_name": "ProgressStyleModel",
241 | "model_module_version": "1.5.0",
242 | "state": {
243 | "_view_name": "StyleView",
244 | "_model_name": "ProgressStyleModel",
245 | "description_width": "",
246 | "_view_module": "@jupyter-widgets/base",
247 | "_model_module_version": "1.5.0",
248 | "_view_count": null,
249 | "_view_module_version": "1.2.0",
250 | "bar_color": null,
251 | "_model_module": "@jupyter-widgets/controls"
252 | }
253 | },
254 | "f649ba6bc4b14c7ab06d08da5fe3b7de": {
255 | "model_module": "@jupyter-widgets/base",
256 | "model_name": "LayoutModel",
257 | "model_module_version": "1.2.0",
258 | "state": {
259 | "_view_name": "LayoutView",
260 | "grid_template_rows": null,
261 | "right": null,
262 | "justify_content": null,
263 | "_view_module": "@jupyter-widgets/base",
264 | "overflow": null,
265 | "_model_module_version": "1.2.0",
266 | "_view_count": null,
267 | "flex_flow": null,
268 | "width": null,
269 | "min_width": null,
270 | "border": null,
271 | "align_items": null,
272 | "bottom": null,
273 | "_model_module": "@jupyter-widgets/base",
274 | "top": null,
275 | "grid_column": null,
276 | "overflow_y": null,
277 | "overflow_x": null,
278 | "grid_auto_flow": null,
279 | "grid_area": null,
280 | "grid_template_columns": null,
281 | "flex": null,
282 | "_model_name": "LayoutModel",
283 | "justify_items": null,
284 | "grid_row": null,
285 | "max_height": null,
286 | "align_content": null,
287 | "visibility": null,
288 | "align_self": null,
289 | "height": null,
290 | "min_height": null,
291 | "padding": null,
292 | "grid_auto_rows": null,
293 | "grid_gap": null,
294 | "max_width": null,
295 | "order": null,
296 | "_view_module_version": "1.2.0",
297 | "grid_template_areas": null,
298 | "object_position": null,
299 | "object_fit": null,
300 | "grid_auto_columns": null,
301 | "margin": null,
302 | "display": null,
303 | "left": null
304 | }
305 | },
306 | "e5be528350c54c34b702a87be4c16b40": {
307 | "model_module": "@jupyter-widgets/controls",
308 | "model_name": "DescriptionStyleModel",
309 | "model_module_version": "1.5.0",
310 | "state": {
311 | "_view_name": "StyleView",
312 | "_model_name": "DescriptionStyleModel",
313 | "description_width": "",
314 | "_view_module": "@jupyter-widgets/base",
315 | "_model_module_version": "1.5.0",
316 | "_view_count": null,
317 | "_view_module_version": "1.2.0",
318 | "_model_module": "@jupyter-widgets/controls"
319 | }
320 | },
321 | "355037c622754c2bba10bd49c8f25754": {
322 | "model_module": "@jupyter-widgets/base",
323 | "model_name": "LayoutModel",
324 | "model_module_version": "1.2.0",
325 | "state": {
326 | "_view_name": "LayoutView",
327 | "grid_template_rows": null,
328 | "right": null,
329 | "justify_content": null,
330 | "_view_module": "@jupyter-widgets/base",
331 | "overflow": null,
332 | "_model_module_version": "1.2.0",
333 | "_view_count": null,
334 | "flex_flow": null,
335 | "width": null,
336 | "min_width": null,
337 | "border": null,
338 | "align_items": null,
339 | "bottom": null,
340 | "_model_module": "@jupyter-widgets/base",
341 | "top": null,
342 | "grid_column": null,
343 | "overflow_y": null,
344 | "overflow_x": null,
345 | "grid_auto_flow": null,
346 | "grid_area": null,
347 | "grid_template_columns": null,
348 | "flex": null,
349 | "_model_name": "LayoutModel",
350 | "justify_items": null,
351 | "grid_row": null,
352 | "max_height": null,
353 | "align_content": null,
354 | "visibility": null,
355 | "align_self": null,
356 | "height": null,
357 | "min_height": null,
358 | "padding": null,
359 | "grid_auto_rows": null,
360 | "grid_gap": null,
361 | "max_width": null,
362 | "order": null,
363 | "_view_module_version": "1.2.0",
364 | "grid_template_areas": null,
365 | "object_position": null,
366 | "object_fit": null,
367 | "grid_auto_columns": null,
368 | "margin": null,
369 | "display": null,
370 | "left": null
371 | }
372 | },
373 | "dfc9917c7f834a50a4243f9c2b0c7acf": {
374 | "model_module": "@jupyter-widgets/controls",
375 | "model_name": "HBoxModel",
376 | "model_module_version": "1.5.0",
377 | "state": {
378 | "_view_name": "HBoxView",
379 | "_dom_classes": [],
380 | "_model_name": "HBoxModel",
381 | "_view_module": "@jupyter-widgets/controls",
382 | "_model_module_version": "1.5.0",
383 | "_view_count": null,
384 | "_view_module_version": "1.5.0",
385 | "box_style": "",
386 | "layout": "IPY_MODEL_44459dfef2754dfe8c5c5a4ff37dcf4a",
387 | "_model_module": "@jupyter-widgets/controls",
388 | "children": [
389 | "IPY_MODEL_d0fcacb5d7624f04a939f93dc6ce2263",
390 | "IPY_MODEL_02bd460511784a6daf3dbe9d1ef63443",
391 | "IPY_MODEL_2f5def119b2745018845e600fd6478bf"
392 | ]
393 | }
394 | },
395 | "44459dfef2754dfe8c5c5a4ff37dcf4a": {
396 | "model_module": "@jupyter-widgets/base",
397 | "model_name": "LayoutModel",
398 | "model_module_version": "1.2.0",
399 | "state": {
400 | "_view_name": "LayoutView",
401 | "grid_template_rows": null,
402 | "right": null,
403 | "justify_content": null,
404 | "_view_module": "@jupyter-widgets/base",
405 | "overflow": null,
406 | "_model_module_version": "1.2.0",
407 | "_view_count": null,
408 | "flex_flow": null,
409 | "width": null,
410 | "min_width": null,
411 | "border": null,
412 | "align_items": null,
413 | "bottom": null,
414 | "_model_module": "@jupyter-widgets/base",
415 | "top": null,
416 | "grid_column": null,
417 | "overflow_y": null,
418 | "overflow_x": null,
419 | "grid_auto_flow": null,
420 | "grid_area": null,
421 | "grid_template_columns": null,
422 | "flex": null,
423 | "_model_name": "LayoutModel",
424 | "justify_items": null,
425 | "grid_row": null,
426 | "max_height": null,
427 | "align_content": null,
428 | "visibility": null,
429 | "align_self": null,
430 | "height": null,
431 | "min_height": null,
432 | "padding": null,
433 | "grid_auto_rows": null,
434 | "grid_gap": null,
435 | "max_width": null,
436 | "order": null,
437 | "_view_module_version": "1.2.0",
438 | "grid_template_areas": null,
439 | "object_position": null,
440 | "object_fit": null,
441 | "grid_auto_columns": null,
442 | "margin": null,
443 | "display": null,
444 | "left": null
445 | }
446 | },
447 | "d0fcacb5d7624f04a939f93dc6ce2263": {
448 | "model_module": "@jupyter-widgets/controls",
449 | "model_name": "HTMLModel",
450 | "model_module_version": "1.5.0",
451 | "state": {
452 | "_view_name": "HTMLView",
453 | "style": "IPY_MODEL_ce26a50eec88416384396ea39d569c61",
454 | "_dom_classes": [],
455 | "description": "",
456 | "_model_name": "HTMLModel",
457 | "placeholder": "",
458 | "_view_module": "@jupyter-widgets/controls",
459 | "_model_module_version": "1.5.0",
460 | "value": "100%",
461 | "_view_count": null,
462 | "_view_module_version": "1.5.0",
463 | "description_tooltip": null,
464 | "_model_module": "@jupyter-widgets/controls",
465 | "layout": "IPY_MODEL_0f9efdd1b2214514be75d26fcdb18385"
466 | }
467 | },
468 | "02bd460511784a6daf3dbe9d1ef63443": {
469 | "model_module": "@jupyter-widgets/controls",
470 | "model_name": "FloatProgressModel",
471 | "model_module_version": "1.5.0",
472 | "state": {
473 | "_view_name": "ProgressView",
474 | "style": "IPY_MODEL_cfcd541aac3f4c4d8b848d93599feaaf",
475 | "_dom_classes": [],
476 | "description": "",
477 | "_model_name": "FloatProgressModel",
478 | "bar_style": "success",
479 | "max": 10,
480 | "_view_module": "@jupyter-widgets/controls",
481 | "_model_module_version": "1.5.0",
482 | "value": 10,
483 | "_view_count": null,
484 | "_view_module_version": "1.5.0",
485 | "orientation": "horizontal",
486 | "min": 0,
487 | "description_tooltip": null,
488 | "_model_module": "@jupyter-widgets/controls",
489 | "layout": "IPY_MODEL_87cf6567215b4ba8b02ae7fe4e680e6d"
490 | }
491 | },
492 | "2f5def119b2745018845e600fd6478bf": {
493 | "model_module": "@jupyter-widgets/controls",
494 | "model_name": "HTMLModel",
495 | "model_module_version": "1.5.0",
496 | "state": {
497 | "_view_name": "HTMLView",
498 | "style": "IPY_MODEL_d66ad955b2c34d15bd37b2bde9d0439b",
499 | "_dom_classes": [],
500 | "description": "",
501 | "_model_name": "HTMLModel",
502 | "placeholder": "",
503 | "_view_module": "@jupyter-widgets/controls",
504 | "_model_module_version": "1.5.0",
505 | "value": " 10/10 [00:00<00:00, 89.67it/s]",
506 | "_view_count": null,
507 | "_view_module_version": "1.5.0",
508 | "description_tooltip": null,
509 | "_model_module": "@jupyter-widgets/controls",
510 | "layout": "IPY_MODEL_ccfe7c215f9942c99ce74be1931161c8"
511 | }
512 | },
513 | "ce26a50eec88416384396ea39d569c61": {
514 | "model_module": "@jupyter-widgets/controls",
515 | "model_name": "DescriptionStyleModel",
516 | "model_module_version": "1.5.0",
517 | "state": {
518 | "_view_name": "StyleView",
519 | "_model_name": "DescriptionStyleModel",
520 | "description_width": "",
521 | "_view_module": "@jupyter-widgets/base",
522 | "_model_module_version": "1.5.0",
523 | "_view_count": null,
524 | "_view_module_version": "1.2.0",
525 | "_model_module": "@jupyter-widgets/controls"
526 | }
527 | },
528 | "0f9efdd1b2214514be75d26fcdb18385": {
529 | "model_module": "@jupyter-widgets/base",
530 | "model_name": "LayoutModel",
531 | "model_module_version": "1.2.0",
532 | "state": {
533 | "_view_name": "LayoutView",
534 | "grid_template_rows": null,
535 | "right": null,
536 | "justify_content": null,
537 | "_view_module": "@jupyter-widgets/base",
538 | "overflow": null,
539 | "_model_module_version": "1.2.0",
540 | "_view_count": null,
541 | "flex_flow": null,
542 | "width": null,
543 | "min_width": null,
544 | "border": null,
545 | "align_items": null,
546 | "bottom": null,
547 | "_model_module": "@jupyter-widgets/base",
548 | "top": null,
549 | "grid_column": null,
550 | "overflow_y": null,
551 | "overflow_x": null,
552 | "grid_auto_flow": null,
553 | "grid_area": null,
554 | "grid_template_columns": null,
555 | "flex": null,
556 | "_model_name": "LayoutModel",
557 | "justify_items": null,
558 | "grid_row": null,
559 | "max_height": null,
560 | "align_content": null,
561 | "visibility": null,
562 | "align_self": null,
563 | "height": null,
564 | "min_height": null,
565 | "padding": null,
566 | "grid_auto_rows": null,
567 | "grid_gap": null,
568 | "max_width": null,
569 | "order": null,
570 | "_view_module_version": "1.2.0",
571 | "grid_template_areas": null,
572 | "object_position": null,
573 | "object_fit": null,
574 | "grid_auto_columns": null,
575 | "margin": null,
576 | "display": null,
577 | "left": null
578 | }
579 | },
580 | "cfcd541aac3f4c4d8b848d93599feaaf": {
581 | "model_module": "@jupyter-widgets/controls",
582 | "model_name": "ProgressStyleModel",
583 | "model_module_version": "1.5.0",
584 | "state": {
585 | "_view_name": "StyleView",
586 | "_model_name": "ProgressStyleModel",
587 | "description_width": "",
588 | "_view_module": "@jupyter-widgets/base",
589 | "_model_module_version": "1.5.0",
590 | "_view_count": null,
591 | "_view_module_version": "1.2.0",
592 | "bar_color": null,
593 | "_model_module": "@jupyter-widgets/controls"
594 | }
595 | },
596 | "87cf6567215b4ba8b02ae7fe4e680e6d": {
597 | "model_module": "@jupyter-widgets/base",
598 | "model_name": "LayoutModel",
599 | "model_module_version": "1.2.0",
600 | "state": {
601 | "_view_name": "LayoutView",
602 | "grid_template_rows": null,
603 | "right": null,
604 | "justify_content": null,
605 | "_view_module": "@jupyter-widgets/base",
606 | "overflow": null,
607 | "_model_module_version": "1.2.0",
608 | "_view_count": null,
609 | "flex_flow": null,
610 | "width": null,
611 | "min_width": null,
612 | "border": null,
613 | "align_items": null,
614 | "bottom": null,
615 | "_model_module": "@jupyter-widgets/base",
616 | "top": null,
617 | "grid_column": null,
618 | "overflow_y": null,
619 | "overflow_x": null,
620 | "grid_auto_flow": null,
621 | "grid_area": null,
622 | "grid_template_columns": null,
623 | "flex": null,
624 | "_model_name": "LayoutModel",
625 | "justify_items": null,
626 | "grid_row": null,
627 | "max_height": null,
628 | "align_content": null,
629 | "visibility": null,
630 | "align_self": null,
631 | "height": null,
632 | "min_height": null,
633 | "padding": null,
634 | "grid_auto_rows": null,
635 | "grid_gap": null,
636 | "max_width": null,
637 | "order": null,
638 | "_view_module_version": "1.2.0",
639 | "grid_template_areas": null,
640 | "object_position": null,
641 | "object_fit": null,
642 | "grid_auto_columns": null,
643 | "margin": null,
644 | "display": null,
645 | "left": null
646 | }
647 | },
648 | "d66ad955b2c34d15bd37b2bde9d0439b": {
649 | "model_module": "@jupyter-widgets/controls",
650 | "model_name": "DescriptionStyleModel",
651 | "model_module_version": "1.5.0",
652 | "state": {
653 | "_view_name": "StyleView",
654 | "_model_name": "DescriptionStyleModel",
655 | "description_width": "",
656 | "_view_module": "@jupyter-widgets/base",
657 | "_model_module_version": "1.5.0",
658 | "_view_count": null,
659 | "_view_module_version": "1.2.0",
660 | "_model_module": "@jupyter-widgets/controls"
661 | }
662 | },
663 | "ccfe7c215f9942c99ce74be1931161c8": {
664 | "model_module": "@jupyter-widgets/base",
665 | "model_name": "LayoutModel",
666 | "model_module_version": "1.2.0",
667 | "state": {
668 | "_view_name": "LayoutView",
669 | "grid_template_rows": null,
670 | "right": null,
671 | "justify_content": null,
672 | "_view_module": "@jupyter-widgets/base",
673 | "overflow": null,
674 | "_model_module_version": "1.2.0",
675 | "_view_count": null,
676 | "flex_flow": null,
677 | "width": null,
678 | "min_width": null,
679 | "border": null,
680 | "align_items": null,
681 | "bottom": null,
682 | "_model_module": "@jupyter-widgets/base",
683 | "top": null,
684 | "grid_column": null,
685 | "overflow_y": null,
686 | "overflow_x": null,
687 | "grid_auto_flow": null,
688 | "grid_area": null,
689 | "grid_template_columns": null,
690 | "flex": null,
691 | "_model_name": "LayoutModel",
692 | "justify_items": null,
693 | "grid_row": null,
694 | "max_height": null,
695 | "align_content": null,
696 | "visibility": null,
697 | "align_self": null,
698 | "height": null,
699 | "min_height": null,
700 | "padding": null,
701 | "grid_auto_rows": null,
702 | "grid_gap": null,
703 | "max_width": null,
704 | "order": null,
705 | "_view_module_version": "1.2.0",
706 | "grid_template_areas": null,
707 | "object_position": null,
708 | "object_fit": null,
709 | "grid_auto_columns": null,
710 | "margin": null,
711 | "display": null,
712 | "left": null
713 | }
714 | },
715 | "c05b441e442f4c2396c9ba920423e05a": {
716 | "model_module": "@jupyter-widgets/controls",
717 | "model_name": "HBoxModel",
718 | "model_module_version": "1.5.0",
719 | "state": {
720 | "_view_name": "HBoxView",
721 | "_dom_classes": [],
722 | "_model_name": "HBoxModel",
723 | "_view_module": "@jupyter-widgets/controls",
724 | "_model_module_version": "1.5.0",
725 | "_view_count": null,
726 | "_view_module_version": "1.5.0",
727 | "box_style": "",
728 | "layout": "IPY_MODEL_80ac4e5eaa574d1d9796dafb61c968b1",
729 | "_model_module": "@jupyter-widgets/controls",
730 | "children": [
731 | "IPY_MODEL_dbdc3f534915460d8c094fe3fbf6d3e5",
732 | "IPY_MODEL_a930671d0de84ca093dcb8b4e3cb1d0d",
733 | "IPY_MODEL_62b1340ef5b146a69f5426ddd31a132e"
734 | ]
735 | }
736 | },
737 | "80ac4e5eaa574d1d9796dafb61c968b1": {
738 | "model_module": "@jupyter-widgets/base",
739 | "model_name": "LayoutModel",
740 | "model_module_version": "1.2.0",
741 | "state": {
742 | "_view_name": "LayoutView",
743 | "grid_template_rows": null,
744 | "right": null,
745 | "justify_content": null,
746 | "_view_module": "@jupyter-widgets/base",
747 | "overflow": null,
748 | "_model_module_version": "1.2.0",
749 | "_view_count": null,
750 | "flex_flow": null,
751 | "width": null,
752 | "min_width": null,
753 | "border": null,
754 | "align_items": null,
755 | "bottom": null,
756 | "_model_module": "@jupyter-widgets/base",
757 | "top": null,
758 | "grid_column": null,
759 | "overflow_y": null,
760 | "overflow_x": null,
761 | "grid_auto_flow": null,
762 | "grid_area": null,
763 | "grid_template_columns": null,
764 | "flex": null,
765 | "_model_name": "LayoutModel",
766 | "justify_items": null,
767 | "grid_row": null,
768 | "max_height": null,
769 | "align_content": null,
770 | "visibility": null,
771 | "align_self": null,
772 | "height": null,
773 | "min_height": null,
774 | "padding": null,
775 | "grid_auto_rows": null,
776 | "grid_gap": null,
777 | "max_width": null,
778 | "order": null,
779 | "_view_module_version": "1.2.0",
780 | "grid_template_areas": null,
781 | "object_position": null,
782 | "object_fit": null,
783 | "grid_auto_columns": null,
784 | "margin": null,
785 | "display": null,
786 | "left": null
787 | }
788 | },
789 | "dbdc3f534915460d8c094fe3fbf6d3e5": {
790 | "model_module": "@jupyter-widgets/controls",
791 | "model_name": "HTMLModel",
792 | "model_module_version": "1.5.0",
793 | "state": {
794 | "_view_name": "HTMLView",
795 | "style": "IPY_MODEL_b760c18a75cb4a3785b0aa429ac21660",
796 | "_dom_classes": [],
797 | "description": "",
798 | "_model_name": "HTMLModel",
799 | "placeholder": "",
800 | "_view_module": "@jupyter-widgets/controls",
801 | "_model_module_version": "1.5.0",
802 | "value": "100%",
803 | "_view_count": null,
804 | "_view_module_version": "1.5.0",
805 | "description_tooltip": null,
806 | "_model_module": "@jupyter-widgets/controls",
807 | "layout": "IPY_MODEL_c02d098387c44ac899bcc1a5ef97aeaa"
808 | }
809 | },
810 | "a930671d0de84ca093dcb8b4e3cb1d0d": {
811 | "model_module": "@jupyter-widgets/controls",
812 | "model_name": "FloatProgressModel",
813 | "model_module_version": "1.5.0",
814 | "state": {
815 | "_view_name": "ProgressView",
816 | "style": "IPY_MODEL_81305ee731aa4b958c71a180aedcc7a2",
817 | "_dom_classes": [],
818 | "description": "",
819 | "_model_name": "FloatProgressModel",
820 | "bar_style": "success",
821 | "max": 5,
822 | "_view_module": "@jupyter-widgets/controls",
823 | "_model_module_version": "1.5.0",
824 | "value": 5,
825 | "_view_count": null,
826 | "_view_module_version": "1.5.0",
827 | "orientation": "horizontal",
828 | "min": 0,
829 | "description_tooltip": null,
830 | "_model_module": "@jupyter-widgets/controls",
831 | "layout": "IPY_MODEL_f37f5d60c87b4652b46b6175882c9caa"
832 | }
833 | },
834 | "62b1340ef5b146a69f5426ddd31a132e": {
835 | "model_module": "@jupyter-widgets/controls",
836 | "model_name": "HTMLModel",
837 | "model_module_version": "1.5.0",
838 | "state": {
839 | "_view_name": "HTMLView",
840 | "style": "IPY_MODEL_4843dbc1f2dd4701a136f87dfa1b940e",
841 | "_dom_classes": [],
842 | "description": "",
843 | "_model_name": "HTMLModel",
844 | "placeholder": "",
845 | "_view_module": "@jupyter-widgets/controls",
846 | "_model_module_version": "1.5.0",
847 | "value": " 5/5 [00:01<00:00, 2.52it/s]",
848 | "_view_count": null,
849 | "_view_module_version": "1.5.0",
850 | "description_tooltip": null,
851 | "_model_module": "@jupyter-widgets/controls",
852 | "layout": "IPY_MODEL_e7f81c8eee9e460faa0c2a8eba58f3c9"
853 | }
854 | },
855 | "b760c18a75cb4a3785b0aa429ac21660": {
856 | "model_module": "@jupyter-widgets/controls",
857 | "model_name": "DescriptionStyleModel",
858 | "model_module_version": "1.5.0",
859 | "state": {
860 | "_view_name": "StyleView",
861 | "_model_name": "DescriptionStyleModel",
862 | "description_width": "",
863 | "_view_module": "@jupyter-widgets/base",
864 | "_model_module_version": "1.5.0",
865 | "_view_count": null,
866 | "_view_module_version": "1.2.0",
867 | "_model_module": "@jupyter-widgets/controls"
868 | }
869 | },
870 | "c02d098387c44ac899bcc1a5ef97aeaa": {
871 | "model_module": "@jupyter-widgets/base",
872 | "model_name": "LayoutModel",
873 | "model_module_version": "1.2.0",
874 | "state": {
875 | "_view_name": "LayoutView",
876 | "grid_template_rows": null,
877 | "right": null,
878 | "justify_content": null,
879 | "_view_module": "@jupyter-widgets/base",
880 | "overflow": null,
881 | "_model_module_version": "1.2.0",
882 | "_view_count": null,
883 | "flex_flow": null,
884 | "width": null,
885 | "min_width": null,
886 | "border": null,
887 | "align_items": null,
888 | "bottom": null,
889 | "_model_module": "@jupyter-widgets/base",
890 | "top": null,
891 | "grid_column": null,
892 | "overflow_y": null,
893 | "overflow_x": null,
894 | "grid_auto_flow": null,
895 | "grid_area": null,
896 | "grid_template_columns": null,
897 | "flex": null,
898 | "_model_name": "LayoutModel",
899 | "justify_items": null,
900 | "grid_row": null,
901 | "max_height": null,
902 | "align_content": null,
903 | "visibility": null,
904 | "align_self": null,
905 | "height": null,
906 | "min_height": null,
907 | "padding": null,
908 | "grid_auto_rows": null,
909 | "grid_gap": null,
910 | "max_width": null,
911 | "order": null,
912 | "_view_module_version": "1.2.0",
913 | "grid_template_areas": null,
914 | "object_position": null,
915 | "object_fit": null,
916 | "grid_auto_columns": null,
917 | "margin": null,
918 | "display": null,
919 | "left": null
920 | }
921 | },
922 | "81305ee731aa4b958c71a180aedcc7a2": {
923 | "model_module": "@jupyter-widgets/controls",
924 | "model_name": "ProgressStyleModel",
925 | "model_module_version": "1.5.0",
926 | "state": {
927 | "_view_name": "StyleView",
928 | "_model_name": "ProgressStyleModel",
929 | "description_width": "",
930 | "_view_module": "@jupyter-widgets/base",
931 | "_model_module_version": "1.5.0",
932 | "_view_count": null,
933 | "_view_module_version": "1.2.0",
934 | "bar_color": null,
935 | "_model_module": "@jupyter-widgets/controls"
936 | }
937 | },
938 | "f37f5d60c87b4652b46b6175882c9caa": {
939 | "model_module": "@jupyter-widgets/base",
940 | "model_name": "LayoutModel",
941 | "model_module_version": "1.2.0",
942 | "state": {
943 | "_view_name": "LayoutView",
944 | "grid_template_rows": null,
945 | "right": null,
946 | "justify_content": null,
947 | "_view_module": "@jupyter-widgets/base",
948 | "overflow": null,
949 | "_model_module_version": "1.2.0",
950 | "_view_count": null,
951 | "flex_flow": null,
952 | "width": null,
953 | "min_width": null,
954 | "border": null,
955 | "align_items": null,
956 | "bottom": null,
957 | "_model_module": "@jupyter-widgets/base",
958 | "top": null,
959 | "grid_column": null,
960 | "overflow_y": null,
961 | "overflow_x": null,
962 | "grid_auto_flow": null,
963 | "grid_area": null,
964 | "grid_template_columns": null,
965 | "flex": null,
966 | "_model_name": "LayoutModel",
967 | "justify_items": null,
968 | "grid_row": null,
969 | "max_height": null,
970 | "align_content": null,
971 | "visibility": null,
972 | "align_self": null,
973 | "height": null,
974 | "min_height": null,
975 | "padding": null,
976 | "grid_auto_rows": null,
977 | "grid_gap": null,
978 | "max_width": null,
979 | "order": null,
980 | "_view_module_version": "1.2.0",
981 | "grid_template_areas": null,
982 | "object_position": null,
983 | "object_fit": null,
984 | "grid_auto_columns": null,
985 | "margin": null,
986 | "display": null,
987 | "left": null
988 | }
989 | },
990 | "4843dbc1f2dd4701a136f87dfa1b940e": {
991 | "model_module": "@jupyter-widgets/controls",
992 | "model_name": "DescriptionStyleModel",
993 | "model_module_version": "1.5.0",
994 | "state": {
995 | "_view_name": "StyleView",
996 | "_model_name": "DescriptionStyleModel",
997 | "description_width": "",
998 | "_view_module": "@jupyter-widgets/base",
999 | "_model_module_version": "1.5.0",
1000 | "_view_count": null,
1001 | "_view_module_version": "1.2.0",
1002 | "_model_module": "@jupyter-widgets/controls"
1003 | }
1004 | },
1005 | "e7f81c8eee9e460faa0c2a8eba58f3c9": {
1006 | "model_module": "@jupyter-widgets/base",
1007 | "model_name": "LayoutModel",
1008 | "model_module_version": "1.2.0",
1009 | "state": {
1010 | "_view_name": "LayoutView",
1011 | "grid_template_rows": null,
1012 | "right": null,
1013 | "justify_content": null,
1014 | "_view_module": "@jupyter-widgets/base",
1015 | "overflow": null,
1016 | "_model_module_version": "1.2.0",
1017 | "_view_count": null,
1018 | "flex_flow": null,
1019 | "width": null,
1020 | "min_width": null,
1021 | "border": null,
1022 | "align_items": null,
1023 | "bottom": null,
1024 | "_model_module": "@jupyter-widgets/base",
1025 | "top": null,
1026 | "grid_column": null,
1027 | "overflow_y": null,
1028 | "overflow_x": null,
1029 | "grid_auto_flow": null,
1030 | "grid_area": null,
1031 | "grid_template_columns": null,
1032 | "flex": null,
1033 | "_model_name": "LayoutModel",
1034 | "justify_items": null,
1035 | "grid_row": null,
1036 | "max_height": null,
1037 | "align_content": null,
1038 | "visibility": null,
1039 | "align_self": null,
1040 | "height": null,
1041 | "min_height": null,
1042 | "padding": null,
1043 | "grid_auto_rows": null,
1044 | "grid_gap": null,
1045 | "max_width": null,
1046 | "order": null,
1047 | "_view_module_version": "1.2.0",
1048 | "grid_template_areas": null,
1049 | "object_position": null,
1050 | "object_fit": null,
1051 | "grid_auto_columns": null,
1052 | "margin": null,
1053 | "display": null,
1054 | "left": null
1055 | }
1056 | },
1057 | "dc7d7680505d44999e165717debcd1a4": {
1058 | "model_module": "@jupyter-widgets/controls",
1059 | "model_name": "HBoxModel",
1060 | "model_module_version": "1.5.0",
1061 | "state": {
1062 | "_view_name": "HBoxView",
1063 | "_dom_classes": [],
1064 | "_model_name": "HBoxModel",
1065 | "_view_module": "@jupyter-widgets/controls",
1066 | "_model_module_version": "1.5.0",
1067 | "_view_count": null,
1068 | "_view_module_version": "1.5.0",
1069 | "box_style": "",
1070 | "layout": "IPY_MODEL_63cf9d3439294863a5728040904c8ad7",
1071 | "_model_module": "@jupyter-widgets/controls",
1072 | "children": [
1073 | "IPY_MODEL_2bfaac11391a4adfb91433e0f3368683",
1074 | "IPY_MODEL_788a1eafa3814b9399816accd124f6eb",
1075 | "IPY_MODEL_d984a74eb7ad432a82fde659554da5b9"
1076 | ]
1077 | }
1078 | },
1079 | "63cf9d3439294863a5728040904c8ad7": {
1080 | "model_module": "@jupyter-widgets/base",
1081 | "model_name": "LayoutModel",
1082 | "model_module_version": "1.2.0",
1083 | "state": {
1084 | "_view_name": "LayoutView",
1085 | "grid_template_rows": null,
1086 | "right": null,
1087 | "justify_content": null,
1088 | "_view_module": "@jupyter-widgets/base",
1089 | "overflow": null,
1090 | "_model_module_version": "1.2.0",
1091 | "_view_count": null,
1092 | "flex_flow": null,
1093 | "width": null,
1094 | "min_width": null,
1095 | "border": null,
1096 | "align_items": null,
1097 | "bottom": null,
1098 | "_model_module": "@jupyter-widgets/base",
1099 | "top": null,
1100 | "grid_column": null,
1101 | "overflow_y": null,
1102 | "overflow_x": null,
1103 | "grid_auto_flow": null,
1104 | "grid_area": null,
1105 | "grid_template_columns": null,
1106 | "flex": null,
1107 | "_model_name": "LayoutModel",
1108 | "justify_items": null,
1109 | "grid_row": null,
1110 | "max_height": null,
1111 | "align_content": null,
1112 | "visibility": null,
1113 | "align_self": null,
1114 | "height": null,
1115 | "min_height": null,
1116 | "padding": null,
1117 | "grid_auto_rows": null,
1118 | "grid_gap": null,
1119 | "max_width": null,
1120 | "order": null,
1121 | "_view_module_version": "1.2.0",
1122 | "grid_template_areas": null,
1123 | "object_position": null,
1124 | "object_fit": null,
1125 | "grid_auto_columns": null,
1126 | "margin": null,
1127 | "display": null,
1128 | "left": null
1129 | }
1130 | },
1131 | "2bfaac11391a4adfb91433e0f3368683": {
1132 | "model_module": "@jupyter-widgets/controls",
1133 | "model_name": "HTMLModel",
1134 | "model_module_version": "1.5.0",
1135 | "state": {
1136 | "_view_name": "HTMLView",
1137 | "style": "IPY_MODEL_50696b849220459495b9b9a091ce05bb",
1138 | "_dom_classes": [],
1139 | "description": "",
1140 | "_model_name": "HTMLModel",
1141 | "placeholder": "",
1142 | "_view_module": "@jupyter-widgets/controls",
1143 | "_model_module_version": "1.5.0",
1144 | "value": "100%",
1145 | "_view_count": null,
1146 | "_view_module_version": "1.5.0",
1147 | "description_tooltip": null,
1148 | "_model_module": "@jupyter-widgets/controls",
1149 | "layout": "IPY_MODEL_72efa9c92d01410dbca8b357bc581e56"
1150 | }
1151 | },
1152 | "788a1eafa3814b9399816accd124f6eb": {
1153 | "model_module": "@jupyter-widgets/controls",
1154 | "model_name": "FloatProgressModel",
1155 | "model_module_version": "1.5.0",
1156 | "state": {
1157 | "_view_name": "ProgressView",
1158 | "style": "IPY_MODEL_5be0c6d165d748fcbb017817fd29c708",
1159 | "_dom_classes": [],
1160 | "description": "",
1161 | "_model_name": "FloatProgressModel",
1162 | "bar_style": "success",
1163 | "max": 5,
1164 | "_view_module": "@jupyter-widgets/controls",
1165 | "_model_module_version": "1.5.0",
1166 | "value": 5,
1167 | "_view_count": null,
1168 | "_view_module_version": "1.5.0",
1169 | "orientation": "horizontal",
1170 | "min": 0,
1171 | "description_tooltip": null,
1172 | "_model_module": "@jupyter-widgets/controls",
1173 | "layout": "IPY_MODEL_f3591ac91f184068827b659c302bbe99"
1174 | }
1175 | },
1176 | "d984a74eb7ad432a82fde659554da5b9": {
1177 | "model_module": "@jupyter-widgets/controls",
1178 | "model_name": "HTMLModel",
1179 | "model_module_version": "1.5.0",
1180 | "state": {
1181 | "_view_name": "HTMLView",
1182 | "style": "IPY_MODEL_9bca2e1c229b41d7add2792d72015b3a",
1183 | "_dom_classes": [],
1184 | "description": "",
1185 | "_model_name": "HTMLModel",
1186 | "placeholder": "",
1187 | "_view_module": "@jupyter-widgets/controls",
1188 | "_model_module_version": "1.5.0",
1189 | "value": " 5/5 [00:00<00:00, 86.87it/s]",
1190 | "_view_count": null,
1191 | "_view_module_version": "1.5.0",
1192 | "description_tooltip": null,
1193 | "_model_module": "@jupyter-widgets/controls",
1194 | "layout": "IPY_MODEL_03f86a1edb104dd2b55a587599a8339e"
1195 | }
1196 | },
1197 | "50696b849220459495b9b9a091ce05bb": {
1198 | "model_module": "@jupyter-widgets/controls",
1199 | "model_name": "DescriptionStyleModel",
1200 | "model_module_version": "1.5.0",
1201 | "state": {
1202 | "_view_name": "StyleView",
1203 | "_model_name": "DescriptionStyleModel",
1204 | "description_width": "",
1205 | "_view_module": "@jupyter-widgets/base",
1206 | "_model_module_version": "1.5.0",
1207 | "_view_count": null,
1208 | "_view_module_version": "1.2.0",
1209 | "_model_module": "@jupyter-widgets/controls"
1210 | }
1211 | },
1212 | "72efa9c92d01410dbca8b357bc581e56": {
1213 | "model_module": "@jupyter-widgets/base",
1214 | "model_name": "LayoutModel",
1215 | "model_module_version": "1.2.0",
1216 | "state": {
1217 | "_view_name": "LayoutView",
1218 | "grid_template_rows": null,
1219 | "right": null,
1220 | "justify_content": null,
1221 | "_view_module": "@jupyter-widgets/base",
1222 | "overflow": null,
1223 | "_model_module_version": "1.2.0",
1224 | "_view_count": null,
1225 | "flex_flow": null,
1226 | "width": null,
1227 | "min_width": null,
1228 | "border": null,
1229 | "align_items": null,
1230 | "bottom": null,
1231 | "_model_module": "@jupyter-widgets/base",
1232 | "top": null,
1233 | "grid_column": null,
1234 | "overflow_y": null,
1235 | "overflow_x": null,
1236 | "grid_auto_flow": null,
1237 | "grid_area": null,
1238 | "grid_template_columns": null,
1239 | "flex": null,
1240 | "_model_name": "LayoutModel",
1241 | "justify_items": null,
1242 | "grid_row": null,
1243 | "max_height": null,
1244 | "align_content": null,
1245 | "visibility": null,
1246 | "align_self": null,
1247 | "height": null,
1248 | "min_height": null,
1249 | "padding": null,
1250 | "grid_auto_rows": null,
1251 | "grid_gap": null,
1252 | "max_width": null,
1253 | "order": null,
1254 | "_view_module_version": "1.2.0",
1255 | "grid_template_areas": null,
1256 | "object_position": null,
1257 | "object_fit": null,
1258 | "grid_auto_columns": null,
1259 | "margin": null,
1260 | "display": null,
1261 | "left": null
1262 | }
1263 | },
1264 | "5be0c6d165d748fcbb017817fd29c708": {
1265 | "model_module": "@jupyter-widgets/controls",
1266 | "model_name": "ProgressStyleModel",
1267 | "model_module_version": "1.5.0",
1268 | "state": {
1269 | "_view_name": "StyleView",
1270 | "_model_name": "ProgressStyleModel",
1271 | "description_width": "",
1272 | "_view_module": "@jupyter-widgets/base",
1273 | "_model_module_version": "1.5.0",
1274 | "_view_count": null,
1275 | "_view_module_version": "1.2.0",
1276 | "bar_color": null,
1277 | "_model_module": "@jupyter-widgets/controls"
1278 | }
1279 | },
1280 | "f3591ac91f184068827b659c302bbe99": {
1281 | "model_module": "@jupyter-widgets/base",
1282 | "model_name": "LayoutModel",
1283 | "model_module_version": "1.2.0",
1284 | "state": {
1285 | "_view_name": "LayoutView",
1286 | "grid_template_rows": null,
1287 | "right": null,
1288 | "justify_content": null,
1289 | "_view_module": "@jupyter-widgets/base",
1290 | "overflow": null,
1291 | "_model_module_version": "1.2.0",
1292 | "_view_count": null,
1293 | "flex_flow": null,
1294 | "width": null,
1295 | "min_width": null,
1296 | "border": null,
1297 | "align_items": null,
1298 | "bottom": null,
1299 | "_model_module": "@jupyter-widgets/base",
1300 | "top": null,
1301 | "grid_column": null,
1302 | "overflow_y": null,
1303 | "overflow_x": null,
1304 | "grid_auto_flow": null,
1305 | "grid_area": null,
1306 | "grid_template_columns": null,
1307 | "flex": null,
1308 | "_model_name": "LayoutModel",
1309 | "justify_items": null,
1310 | "grid_row": null,
1311 | "max_height": null,
1312 | "align_content": null,
1313 | "visibility": null,
1314 | "align_self": null,
1315 | "height": null,
1316 | "min_height": null,
1317 | "padding": null,
1318 | "grid_auto_rows": null,
1319 | "grid_gap": null,
1320 | "max_width": null,
1321 | "order": null,
1322 | "_view_module_version": "1.2.0",
1323 | "grid_template_areas": null,
1324 | "object_position": null,
1325 | "object_fit": null,
1326 | "grid_auto_columns": null,
1327 | "margin": null,
1328 | "display": null,
1329 | "left": null
1330 | }
1331 | },
1332 | "9bca2e1c229b41d7add2792d72015b3a": {
1333 | "model_module": "@jupyter-widgets/controls",
1334 | "model_name": "DescriptionStyleModel",
1335 | "model_module_version": "1.5.0",
1336 | "state": {
1337 | "_view_name": "StyleView",
1338 | "_model_name": "DescriptionStyleModel",
1339 | "description_width": "",
1340 | "_view_module": "@jupyter-widgets/base",
1341 | "_model_module_version": "1.5.0",
1342 | "_view_count": null,
1343 | "_view_module_version": "1.2.0",
1344 | "_model_module": "@jupyter-widgets/controls"
1345 | }
1346 | },
1347 | "03f86a1edb104dd2b55a587599a8339e": {
1348 | "model_module": "@jupyter-widgets/base",
1349 | "model_name": "LayoutModel",
1350 | "model_module_version": "1.2.0",
1351 | "state": {
1352 | "_view_name": "LayoutView",
1353 | "grid_template_rows": null,
1354 | "right": null,
1355 | "justify_content": null,
1356 | "_view_module": "@jupyter-widgets/base",
1357 | "overflow": null,
1358 | "_model_module_version": "1.2.0",
1359 | "_view_count": null,
1360 | "flex_flow": null,
1361 | "width": null,
1362 | "min_width": null,
1363 | "border": null,
1364 | "align_items": null,
1365 | "bottom": null,
1366 | "_model_module": "@jupyter-widgets/base",
1367 | "top": null,
1368 | "grid_column": null,
1369 | "overflow_y": null,
1370 | "overflow_x": null,
1371 | "grid_auto_flow": null,
1372 | "grid_area": null,
1373 | "grid_template_columns": null,
1374 | "flex": null,
1375 | "_model_name": "LayoutModel",
1376 | "justify_items": null,
1377 | "grid_row": null,
1378 | "max_height": null,
1379 | "align_content": null,
1380 | "visibility": null,
1381 | "align_self": null,
1382 | "height": null,
1383 | "min_height": null,
1384 | "padding": null,
1385 | "grid_auto_rows": null,
1386 | "grid_gap": null,
1387 | "max_width": null,
1388 | "order": null,
1389 | "_view_module_version": "1.2.0",
1390 | "grid_template_areas": null,
1391 | "object_position": null,
1392 | "object_fit": null,
1393 | "grid_auto_columns": null,
1394 | "margin": null,
1395 | "display": null,
1396 | "left": null
1397 | }
1398 | }
1399 | }
1400 | }
1401 | },
1402 | "cells": [
1403 | {
1404 | "cell_type": "markdown",
1405 | "metadata": {
1406 | "id": "view-in-github",
1407 | "colab_type": "text"
1408 | },
1409 | "source": [
1410 | ""
1411 | ]
1412 | },
1413 | {
1414 | "cell_type": "markdown",
1415 | "metadata": {
1416 | "id": "SmtqOeQpmANk"
1417 | },
1418 | "source": [
1419 | "#### Use this notebook to calculate the mean pixel value per field for:\n",
1420 | " - Sentinel-1 data\n",
1421 | " - Sentinel-2 data with 25 periods/dates\n",
1422 | " - Sentinel-2 data with month-start and month-end dates\n",
1423 | "\n",
1424 | "**All you have to change is the path to the files.**"
1425 | ]
1426 | },
1427 | {
1428 | "cell_type": "code",
1429 | "metadata": {
1430 | "id": "Mr9kNjYMfssv",
1431 | "colab": {
1432 | "base_uri": "https://localhost:8080/"
1433 | },
1434 | "outputId": "8b873aec-cba7-47df-82b2-ab1170e1062f"
1435 | },
1436 | "source": [
1437 | "!pip -qq install pyspark findspark\n",
1438 | "!apt-get install openjdk-8-jdk-headless -qq > /dev/null\n",
1439 | "!wget -q https://www-us.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz"
1440 | ],
1441 | "execution_count": null,
1442 | "outputs": [
1443 | {
1444 | "output_type": "stream",
1445 | "name": "stdout",
1446 | "text": [
1447 | "\u001b[K |████████████████████████████████| 212.4 MB 48 kB/s \n",
1448 | "\u001b[K |████████████████████████████████| 198 kB 50.4 MB/s \n",
1449 | "\u001b[?25h Building wheel for pyspark (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
1450 | ]
1451 | }
1452 | ]
1453 | },
1454 | {
1455 | "cell_type": "code",
1456 | "metadata": {
1457 | "id": "_0YW6UeGfiZo"
1458 | },
1459 | "source": [
1460 | "import pandas as pd\n",
1461 | "import numpy as np\n",
1462 | "import os\n",
1463 | "import glob\n",
1464 | "from tqdm.notebook import tqdm_notebook\n",
1465 | "import pyspark\n",
1466 | "from pyspark import SparkContext\n",
1467 | "from pyspark.sql import SparkSession\n",
1468 | "import pyspark.sql.functions as F\n",
1469 | "from pyspark.sql import Row, DataFrame, Column\n",
1470 | "from pyspark.sql.functions import lit\n",
1471 | "\n",
1472 | "from functools import reduce"
1473 | ],
1474 | "execution_count": null,
1475 | "outputs": []
1476 | },
1477 | {
1478 | "cell_type": "code",
1479 | "metadata": {
1480 | "id": "VeJYJfIefiZr"
1481 | },
1482 | "source": [
1483 | "os.environ[\"JAVA_HOME\"] = \"/usr/lib/jvm/java-8-openjdk-amd64\""
1484 | ],
1485 | "execution_count": null,
1486 | "outputs": []
1487 | },
1488 | {
1489 | "cell_type": "code",
1490 | "metadata": {
1491 | "id": "6Pr1lA3HEA7P"
1492 | },
1493 | "source": [
1494 | "spark = SparkSession.builder \\\n",
1495 | " .master('local[*]') \\\n",
1496 | " .config(\"spark.driver.memory\", \"15g\") \\\n",
1497 | " .appName('my-cool-app') \\\n",
1498 | " .getOrCreate()"
1499 | ],
1500 | "execution_count": null,
1501 | "outputs": []
1502 | },
1503 | {
1504 | "cell_type": "code",
1505 | "metadata": {
1506 | "colab": {
1507 | "base_uri": "https://localhost:8080/",
1508 | "height": 214
1509 | },
1510 | "id": "p-lXDoJ7smXI",
1511 | "outputId": "00dfad31-12f5-4b54-ae9b-c379af98eb47"
1512 | },
1513 | "source": [
1514 | "spark"
1515 | ],
1516 | "execution_count": null,
1517 | "outputs": [
1518 | {
1519 | "output_type": "execute_result",
1520 | "data": {
1521 | "text/html": [
1522 | "\n",
1523 | " \n",
1524 | "
SparkSession - in-memory
\n",
1525 | " \n",
1526 | "
\n",
1527 | "
SparkContext
\n",
1528 | "\n",
1529 | "
Spark UI
\n",
1530 | "\n",
1531 | "
\n",
1532 | " - Version
\n",
1533 | " v3.1.2 \n",
1534 | " - Master
\n",
1535 | " local[*] \n",
1536 | " - AppName
\n",
1537 | " my-cool-app \n",
1538 | "
\n",
1539 | "
\n",
1540 | " \n",
1541 | "
\n",
1542 | " "
1543 | ],
1544 | "text/plain": [
1545 | ""
1546 | ]
1547 | },
1548 | "metadata": {},
1549 | "execution_count": 5
1550 | }
1551 | ]
1552 | },
1553 | {
1554 | "cell_type": "code",
1555 | "metadata": {
1556 | "colab": {
1557 | "base_uri": "https://localhost:8080/"
1558 | },
1559 | "id": "zPAm0Gykyo6M",
1560 | "outputId": "b1ddfce2-eeb8-4936-8586-3eb6e85ab041"
1561 | },
1562 | "source": [
1563 | "train = pd.read_csv('/content/drive/MyDrive/CompeData/Radiant/train_file.csv')\n",
1564 | "field_ids = train.field.unique()\n",
1565 | "len(field_ids)"
1566 | ],
1567 | "execution_count": null,
1568 | "outputs": [
1569 | {
1570 | "output_type": "execute_result",
1571 | "data": {
1572 | "text/plain": [
1573 | "87114"
1574 | ]
1575 | },
1576 | "metadata": {},
1577 | "execution_count": 8
1578 | }
1579 | ]
1580 | },
1581 | {
1582 | "cell_type": "code",
1583 | "metadata": {
1584 | "colab": {
1585 | "base_uri": "https://localhost:8080/"
1586 | },
1587 | "id": "CpH-dnlrB1HD",
1588 | "outputId": "65b14ec0-9fcf-4837-c5c2-1d0dc1a264b2"
1589 | },
1590 | "source": [
1591 | "train.shape"
1592 | ],
1593 | "execution_count": null,
1594 | "outputs": [
1595 | {
1596 | "output_type": "execute_result",
1597 | "data": {
1598 | "text/plain": [
1599 | "(87114, 106)"
1600 | ]
1601 | },
1602 | "metadata": {},
1603 | "execution_count": 9
1604 | }
1605 | ]
1606 | },
1607 | {
1608 | "cell_type": "markdown",
1609 | "metadata": {
1610 | "id": "FzO9EQksO5sb"
1611 | },
1612 | "source": [
1613 | "## SAR"
1614 | ]
1615 | },
1616 | {
1617 | "cell_type": "code",
1618 | "metadata": {
1619 | "colab": {
1620 | "base_uri": "https://localhost:8080/",
1621 | "height": 82,
1622 | "referenced_widgets": [
1623 | "7b430419c8e04c91acafc35a53118cf3",
1624 | "8466a8ec4bc347b4b69f220d31679012",
1625 | "2bce2a23961544b1960e268693e33947",
1626 | "89380e275f19406a8614dace6af1f99b",
1627 | "54724ef8ab44408c8fcc616610dd9bdb",
1628 | "1bd7d3e972804c65bed5fdc30011d9f1",
1629 | "b771437f48194bad9dbcc254019b6803",
1630 | "e8fa90fbc2b741ad988e6b5d13af6d67",
1631 | "f649ba6bc4b14c7ab06d08da5fe3b7de",
1632 | "e5be528350c54c34b702a87be4c16b40",
1633 | "355037c622754c2bba10bd49c8f25754"
1634 | ]
1635 | },
1636 | "id": "-0HJ-fPHO4lC",
1637 | "outputId": "7034e8c0-d17c-4ef0-fcd6-2a0945299e24"
1638 | },
1639 | "source": [
1640 | "%%time\n",
1641 | "dfs = []\n",
1642 | "for i in tqdm_notebook(glob.glob('/content/drive/MyDrive/CompeData/Radiant/s1_all_data/s1_train*.csv')):\n",
1643 | " dfs.append(spark.read.csv(path = i, sep =',', encoding = 'UTF-8', comment = None, header = True))"
1644 | ],
1645 | "execution_count": null,
1646 | "outputs": [
1647 | {
1648 | "output_type": "display_data",
1649 | "data": {
1650 | "application/vnd.jupyter.widget-view+json": {
1651 | "model_id": "7b430419c8e04c91acafc35a53118cf3",
1652 | "version_minor": 0,
1653 | "version_major": 2
1654 | },
1655 | "text/plain": [
1656 | " 0%| | 0/10 [00:00, ?it/s]"
1657 | ]
1658 | },
1659 | "metadata": {}
1660 | },
1661 | {
1662 | "output_type": "stream",
1663 | "name": "stdout",
1664 | "text": [
1665 | "CPU times: user 165 ms, sys: 19.9 ms, total: 184 ms\n",
1666 | "Wall time: 11.1 s\n"
1667 | ]
1668 | }
1669 | ]
1670 | },
1671 | {
1672 | "cell_type": "code",
1673 | "metadata": {
1674 | "colab": {
1675 | "base_uri": "https://localhost:8080/"
1676 | },
1677 | "id": "B8WCJBwJO4Z1",
1678 | "outputId": "a342fc55-4f13-4675-8c50-1e7f89580080"
1679 | },
1680 | "source": [
1681 | "dfs[0].show(5)"
1682 | ],
1683 | "execution_count": null,
1684 | "outputs": [
1685 | {
1686 | "output_type": "stream",
1687 | "name": "stdout",
1688 | "text": [
1689 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+---+------+\n",
1690 | "|04_0_VV|04_0_VH|04_1_VV|04_1_VH|04_2_VV|04_2_VH|04_3_VV|04_3_VH|04_4_VV|04_4_VH|04_5_VV|04_5_VH|05_0_VV|05_0_VH|05_1_VV|05_1_VH|05_2_VV|05_2_VH|05_3_VV|05_3_VH|05_4_VV|05_4_VH|06_0_VV|06_0_VH|06_1_VV|06_1_VH|06_2_VV|06_2_VH|06_3_VV|06_3_VH|06_4_VV|06_4_VH|07_0_VV|07_0_VH|07_1_VV|07_1_VH|07_2_VV|07_2_VH|07_3_VV|07_3_VH|07_4_VV|07_4_VH|08_0_VV|08_0_VH|08_1_VV|08_1_VH|08_2_VV|08_2_VH|08_3_VV|08_3_VH|08_4_VV|08_4_VH|09_0_VV|09_0_VH|09_1_VV|09_1_VH|09_2_VV|09_2_VH|09_3_VV|09_3_VH|09_4_VV|09_4_VH|10_0_VV|10_0_VH|10_1_VV|10_1_VH|10_2_VV|10_2_VH|10_3_VV|10_3_VH|10_4_VV|10_4_VH|11_0_VV|11_0_VH|11_1_VV|11_1_VH|11_2_VV|11_2_VH|11_3_VV|11_3_VH|11_4_VV|11_4_VH| y|fields|\n",
1691 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+---+------+\n",
1692 | "| 7.0| 3.0| 7.0| 3.0| 16.0| 4.0| 9.0| 2.0| 8.0| 1.0| 24.0| 13.0| 4.0| 6.0| 4.0| 2.0| 6.0| 2.0| 42.0| 9.0| 15.0| 2.0| 11.0| 3.0| 16.0| 4.0| 35.0| 4.0| 12.0| 5.0| 26.0| 5.0| 6.0| 3.0| 9.0| 3.0| 10.0| 1.0| 13.0| 4.0| 13.0| 1.0| 7.0| 3.0| 17.0| 2.0| 24.0| 3.0| 24.0| 4.0| 14.0| 7.0| 14.0| 3.0| 20.0| 3.0| 3.0| 2.0| 12.0| 7.0| 8.0| 4.0| 35.0| 7.0| 19.0| 10.0| 71.0| 5.0| 8.0| 7.0| 11.0| 4.0| 23.0| 7.0| 4.0| 4.0| 5.0| 5.0| 12.0| 7.0| 18.0| 1.0| 0| 0|\n",
1693 | "| 11.0| 2.0| 6.0| 1.0| 9.0| 1.0| 7.0| 3.0| 8.0| 0.0| 19.0| 4.0| 6.0| 0.0| 5.0| 1.0| 9.0| 1.0| 44.0| 2.0| 6.0| 1.0| 9.0| 1.0| 23.0| 6.0| 24.0| 1.0| 13.0| 3.0| 30.0| 2.0| 8.0| 3.0| 3.0| 3.0| 19.0| 1.0| 16.0| 4.0| 14.0| 3.0| 7.0| 3.0| 36.0| 1.0| 36.0| 4.0| 24.0| 5.0| 19.0| 2.0| 14.0| 5.0| 24.0| 10.0| 3.0| 9.0| 8.0| 4.0| 18.0| 15.0| 18.0| 6.0| 20.0| 7.0| 73.0| 5.0| 7.0| 4.0| 12.0| 4.0| 11.0| 1.0| 8.0| 2.0| 5.0| 0.0| 12.0| 3.0| 14.0| 2.0| 0| 0|\n",
1694 | "| 12.0| 3.0| 6.0| 2.0| 13.0| 1.0| 6.0| 3.0| 5.0| 0.0| 19.0| 7.0| 6.0| 1.0| 5.0| 1.0| 11.0| 2.0| 32.0| 1.0| 3.0| 1.0| 7.0| 1.0| 21.0| 2.0| 6.0| 1.0| 6.0| 0.0| 19.0| 2.0| 10.0| 1.0| 3.0| 5.0| 3.0| 0.0| 13.0| 2.0| 7.0| 5.0| 5.0| 2.0| 44.0| 2.0| 31.0| 1.0| 16.0| 4.0| 11.0| 3.0| 9.0| 2.0| 13.0| 2.0| 8.0| 6.0| 10.0| 6.0| 19.0| 12.0| 10.0| 2.0| 9.0| 3.0| 35.0| 8.0| 10.0| 2.0| 9.0| 1.0| 31.0| 3.0| 12.0| 1.0| 8.0| 0.0| 11.0| 2.0| 14.0| 1.0| 0| 0|\n",
1695 | "| 12.0| 3.0| 3.0| 2.0| 13.0| 0.0| 4.0| 2.0| 13.0| 0.0| 27.0| 6.0| 6.0| 3.0| 6.0| 1.0| 12.0| 3.0| 20.0| 1.0| 12.0| 1.0| 7.0| 0.0| 16.0| 2.0| 8.0| 1.0| 7.0| 1.0| 9.0| 1.0| 1.0| 0.0| 8.0| 5.0| 2.0| 0.0| 2.0| 0.0| 4.0| 3.0| 7.0| 1.0| 37.0| 3.0| 19.0| 5.0| 18.0| 5.0| 10.0| 4.0| 6.0| 1.0| 4.0| 1.0| 6.0| 3.0| 10.0| 3.0| 10.0| 2.0| 11.0| 4.0| 11.0| 1.0| 29.0| 5.0| 17.0| 1.0| 8.0| 2.0| 18.0| 3.0| 7.0| 1.0| 4.0| 1.0| 8.0| 1.0| 7.0| 2.0| 0| 0|\n",
1696 | "| 17.0| 3.0| 2.0| 1.0| 6.0| 0.0| 3.0| 0.0| 10.0| 2.0| 72.0| 5.0| 6.0| 2.0| 3.0| 1.0| 16.0| 0.0| 14.0| 1.0| 10.0| 1.0| 15.0| 0.0| 36.0| 2.0| 17.0| 1.0| 9.0| 2.0| 8.0| 2.0| 6.0| 0.0| 16.0| 1.0| 7.0| 0.0| 2.0| 1.0| 9.0| 2.0| 9.0| 2.0| 32.0| 4.0| 23.0| 6.0| 6.0| 4.0| 4.0| 0.0| 11.0| 2.0| 8.0| 1.0| 11.0| 0.0| 5.0| 2.0| 10.0| 3.0| 12.0| 3.0| 15.0| 2.0| 38.0| 7.0| 21.0| 1.0| 9.0| 1.0| 25.0| 1.0| 5.0| 1.0| 4.0| 1.0| 5.0| 1.0| 3.0| 0.0| 0| 0|\n",
1697 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+---+------+\n",
1698 | "only showing top 5 rows\n",
1699 | "\n"
1700 | ]
1701 | }
1702 | ]
1703 | },
1704 | {
1705 | "cell_type": "code",
1706 | "metadata": {
1707 | "colab": {
1708 | "base_uri": "https://localhost:8080/",
1709 | "height": 82,
1710 | "referenced_widgets": [
1711 | "dfc9917c7f834a50a4243f9c2b0c7acf",
1712 | "44459dfef2754dfe8c5c5a4ff37dcf4a",
1713 | "d0fcacb5d7624f04a939f93dc6ce2263",
1714 | "02bd460511784a6daf3dbe9d1ef63443",
1715 | "2f5def119b2745018845e600fd6478bf",
1716 | "ce26a50eec88416384396ea39d569c61",
1717 | "0f9efdd1b2214514be75d26fcdb18385",
1718 | "cfcd541aac3f4c4d8b848d93599feaaf",
1719 | "87cf6567215b4ba8b02ae7fe4e680e6d",
1720 | "d66ad955b2c34d15bd37b2bde9d0439b",
1721 | "ccfe7c215f9942c99ce74be1931161c8"
1722 | ]
1723 | },
1724 | "id": "LOruCmAUPhk5",
1725 | "outputId": "5ffc21f2-944d-4793-8ade-671117d7bf30"
1726 | },
1727 | "source": [
1728 | "%%time\n",
1729 | "cols = set()\n",
1730 | "for j in tqdm_notebook(dfs):\n",
1731 | " for x in j.columns:\n",
1732 | " cols.add(x)\n",
1733 | "cols = sorted(cols)\n",
1734 | "\n",
1735 | "# Create a dictionary with all the dataframes\n",
1736 | "all_data = {}\n",
1737 | "for i, d in enumerate(dfs):\n",
1738 | " new_name = 'new_df' + str(i) # New name for the key, the dataframe is the value\n",
1739 | " all_data[new_name] = d\n",
1740 | " # Loop through all column names. Add the missing columns to the dataframe (with value 0)\n",
1741 | " for x in cols:\n",
1742 | " if x not in d.columns:\n",
1743 | " all_data[new_name] = all_data[new_name].withColumn(x, F.lit(0))\n",
1744 | " all_data[new_name] = all_data[new_name].select(cols) # Use 'select' to get the columns sorted"
1745 | ],
1746 | "execution_count": null,
1747 | "outputs": [
1748 | {
1749 | "output_type": "display_data",
1750 | "data": {
1751 | "application/vnd.jupyter.widget-view+json": {
1752 | "model_id": "dfc9917c7f834a50a4243f9c2b0c7acf",
1753 | "version_minor": 0,
1754 | "version_major": 2
1755 | },
1756 | "text/plain": [
1757 | " 0%| | 0/10 [00:00, ?it/s]"
1758 | ]
1759 | },
1760 | "metadata": {}
1761 | },
1762 | {
1763 | "output_type": "stream",
1764 | "name": "stdout",
1765 | "text": [
1766 | "CPU times: user 418 ms, sys: 97.2 ms, total: 515 ms\n",
1767 | "Wall time: 2.52 s\n"
1768 | ]
1769 | }
1770 | ]
1771 | },
1772 | {
1773 | "cell_type": "code",
1774 | "metadata": {
1775 | "colab": {
1776 | "base_uri": "https://localhost:8080/"
1777 | },
1778 | "id": "9hPrNtFRPzhA",
1779 | "outputId": "118716ce-43a3-4fc0-9444-ed1f85aa5094"
1780 | },
1781 | "source": [
1782 | "len(dfs)"
1783 | ],
1784 | "execution_count": null,
1785 | "outputs": [
1786 | {
1787 | "output_type": "execute_result",
1788 | "data": {
1789 | "text/plain": [
1790 | "10"
1791 | ]
1792 | },
1793 | "metadata": {},
1794 | "execution_count": 15
1795 | }
1796 | ]
1797 | },
1798 | {
1799 | "cell_type": "code",
1800 | "metadata": {
1801 | "colab": {
1802 | "base_uri": "https://localhost:8080/"
1803 | },
1804 | "id": "7RndGX9vP4mk",
1805 | "outputId": "124c2062-54f0-4965-be24-fbd067525a4e"
1806 | },
1807 | "source": [
1808 | "%%time\n",
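1809 | "all_data_dfs = [all_data['new_df'+str(i)] for i in range(len(dfs))]  # Follow the number of files loaded instead of hardcoding 10\n",
1810 | "final_data = reduce(DataFrame.union, all_data_dfs)  # union replaces the deprecated unionAll alias; both keep duplicates\n",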
1811 | "final_data.show(5)"
1812 | ],
1813 | "execution_count": null,
1814 | "outputs": [
1815 | {
1816 | "output_type": "stream",
1817 | "name": "stdout",
1818 | "text": [
1819 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+---+\n",
1820 | "|04_0_VH|04_0_VV|04_1_VH|04_1_VV|04_2_VH|04_2_VV|04_3_VH|04_3_VV|04_4_VH|04_4_VV|04_5_VH|04_5_VV|05_0_VH|05_0_VV|05_1_VH|05_1_VV|05_2_VH|05_2_VV|05_3_VH|05_3_VV|05_4_VH|05_4_VV|06_0_VH|06_0_VV|06_1_VH|06_1_VV|06_2_VH|06_2_VV|06_3_VH|06_3_VV|06_4_VH|06_4_VV|07_0_VH|07_0_VV|07_1_VH|07_1_VV|07_2_VH|07_2_VV|07_3_VH|07_3_VV|07_4_VH|07_4_VV|08_0_VH|08_0_VV|08_1_VH|08_1_VV|08_2_VH|08_2_VV|08_3_VH|08_3_VV|08_4_VH|08_4_VV|09_0_VH|09_0_VV|09_1_VH|09_1_VV|09_2_VH|09_2_VV|09_3_VH|09_3_VV|09_4_VH|09_4_VV|10_0_VH|10_0_VV|10_1_VH|10_1_VV|10_2_VH|10_2_VV|10_3_VH|10_3_VV|10_4_VH|10_4_VV|11_0_VH|11_0_VV|11_1_VH|11_1_VV|11_2_VH|11_2_VV|11_3_VH|11_3_VV|11_4_VH|11_4_VV|fields| y|\n",
1821 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+---+\n",
1822 | "| 3.0| 7.0| 3.0| 7.0| 4.0| 16.0| 2.0| 9.0| 1.0| 8.0| 13.0| 24.0| 6.0| 4.0| 2.0| 4.0| 2.0| 6.0| 9.0| 42.0| 2.0| 15.0| 3.0| 11.0| 4.0| 16.0| 4.0| 35.0| 5.0| 12.0| 5.0| 26.0| 3.0| 6.0| 3.0| 9.0| 1.0| 10.0| 4.0| 13.0| 1.0| 13.0| 3.0| 7.0| 2.0| 17.0| 3.0| 24.0| 4.0| 24.0| 7.0| 14.0| 3.0| 14.0| 3.0| 20.0| 2.0| 3.0| 7.0| 12.0| 4.0| 8.0| 7.0| 35.0| 10.0| 19.0| 5.0| 71.0| 7.0| 8.0| 4.0| 11.0| 7.0| 23.0| 4.0| 4.0| 5.0| 5.0| 7.0| 12.0| 1.0| 18.0| 0| 0|\n",
1823 | "| 2.0| 11.0| 1.0| 6.0| 1.0| 9.0| 3.0| 7.0| 0.0| 8.0| 4.0| 19.0| 0.0| 6.0| 1.0| 5.0| 1.0| 9.0| 2.0| 44.0| 1.0| 6.0| 1.0| 9.0| 6.0| 23.0| 1.0| 24.0| 3.0| 13.0| 2.0| 30.0| 3.0| 8.0| 3.0| 3.0| 1.0| 19.0| 4.0| 16.0| 3.0| 14.0| 3.0| 7.0| 1.0| 36.0| 4.0| 36.0| 5.0| 24.0| 2.0| 19.0| 5.0| 14.0| 10.0| 24.0| 9.0| 3.0| 4.0| 8.0| 15.0| 18.0| 6.0| 18.0| 7.0| 20.0| 5.0| 73.0| 4.0| 7.0| 4.0| 12.0| 1.0| 11.0| 2.0| 8.0| 0.0| 5.0| 3.0| 12.0| 2.0| 14.0| 0| 0|\n",
1824 | "| 3.0| 12.0| 2.0| 6.0| 1.0| 13.0| 3.0| 6.0| 0.0| 5.0| 7.0| 19.0| 1.0| 6.0| 1.0| 5.0| 2.0| 11.0| 1.0| 32.0| 1.0| 3.0| 1.0| 7.0| 2.0| 21.0| 1.0| 6.0| 0.0| 6.0| 2.0| 19.0| 1.0| 10.0| 5.0| 3.0| 0.0| 3.0| 2.0| 13.0| 5.0| 7.0| 2.0| 5.0| 2.0| 44.0| 1.0| 31.0| 4.0| 16.0| 3.0| 11.0| 2.0| 9.0| 2.0| 13.0| 6.0| 8.0| 6.0| 10.0| 12.0| 19.0| 2.0| 10.0| 3.0| 9.0| 8.0| 35.0| 2.0| 10.0| 1.0| 9.0| 3.0| 31.0| 1.0| 12.0| 0.0| 8.0| 2.0| 11.0| 1.0| 14.0| 0| 0|\n",
1825 | "| 3.0| 12.0| 2.0| 3.0| 0.0| 13.0| 2.0| 4.0| 0.0| 13.0| 6.0| 27.0| 3.0| 6.0| 1.0| 6.0| 3.0| 12.0| 1.0| 20.0| 1.0| 12.0| 0.0| 7.0| 2.0| 16.0| 1.0| 8.0| 1.0| 7.0| 1.0| 9.0| 0.0| 1.0| 5.0| 8.0| 0.0| 2.0| 0.0| 2.0| 3.0| 4.0| 1.0| 7.0| 3.0| 37.0| 5.0| 19.0| 5.0| 18.0| 4.0| 10.0| 1.0| 6.0| 1.0| 4.0| 3.0| 6.0| 3.0| 10.0| 2.0| 10.0| 4.0| 11.0| 1.0| 11.0| 5.0| 29.0| 1.0| 17.0| 2.0| 8.0| 3.0| 18.0| 1.0| 7.0| 1.0| 4.0| 1.0| 8.0| 2.0| 7.0| 0| 0|\n",
1826 | "| 3.0| 17.0| 1.0| 2.0| 0.0| 6.0| 0.0| 3.0| 2.0| 10.0| 5.0| 72.0| 2.0| 6.0| 1.0| 3.0| 0.0| 16.0| 1.0| 14.0| 1.0| 10.0| 0.0| 15.0| 2.0| 36.0| 1.0| 17.0| 2.0| 9.0| 2.0| 8.0| 0.0| 6.0| 1.0| 16.0| 0.0| 7.0| 1.0| 2.0| 2.0| 9.0| 2.0| 9.0| 4.0| 32.0| 6.0| 23.0| 4.0| 6.0| 0.0| 4.0| 2.0| 11.0| 1.0| 8.0| 0.0| 11.0| 2.0| 5.0| 3.0| 10.0| 3.0| 12.0| 2.0| 15.0| 7.0| 38.0| 1.0| 21.0| 1.0| 9.0| 1.0| 25.0| 1.0| 5.0| 1.0| 4.0| 1.0| 5.0| 0.0| 3.0| 0| 0|\n",
1827 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+---+\n",
1828 | "only showing top 5 rows\n",
1829 | "\n",
1830 | "CPU times: user 12.7 ms, sys: 122 µs, total: 12.8 ms\n",
1831 | "Wall time: 1.75 s\n"
1832 | ]
1833 | }
1834 | ]
1835 | },
1836 | {
1837 | "cell_type": "code",
1838 | "metadata": {
1839 | "id": "lVtaJ0btQEzO"
1840 | },
1841 | "source": [
1842 | "final_data = final_data.select(*(F.col(c).cast(\"float\").alias(c) for c in final_data.columns))\n",
1843 | "final_data.printSchema()"
1844 | ],
1845 | "execution_count": null,
1846 | "outputs": []
1847 | },
1848 | {
1849 | "cell_type": "code",
1850 | "metadata": {
1851 | "colab": {
1852 | "base_uri": "https://localhost:8080/"
1853 | },
1854 | "id": "6082CzASQ6qA",
1855 | "outputId": "0aae82aa-0ad6-4909-f3ba-b6a54705a877"
1856 | },
1857 | "source": [
1858 | "%%time\n",
1859 | "final_data_mean = final_data.groupBy('fields').mean().toPandas()\n",
1860 | "final_data_mean.head()"
1861 | ],
1862 | "execution_count": null,
1863 | "outputs": [
1864 | {
1865 | "output_type": "stream",
1866 | "name": "stdout",
1867 | "text": [
1868 | "CPU times: user 22.7 s, sys: 2.51 s, total: 25.2 s\n",
1869 | "Wall time: 59min 1s\n"
1870 | ]
1871 | }
1872 | ]
1873 | },
1874 | {
1875 | "cell_type": "code",
1876 | "metadata": {
1877 | "colab": {
1878 | "base_uri": "https://localhost:8080/",
1879 | "height": 252
1880 | },
1881 | "id": "4xTXELIwQ6kr",
1882 | "outputId": "83494c8c-882b-45e9-dde8-38e57ec15c7d"
1883 | },
1884 | "source": [
1885 | "final_data_mean.head()"
1886 | ],
1887 | "execution_count": null,
1888 | "outputs": [
1889 | {
1890 | "output_type": "execute_result",
1891 | "data": {
2420 | "text/plain": [
2421 | " fields avg(04_0_VH) avg(04_0_VV) ... avg(11_4_VV) avg(fields) avg(y)\n",
2422 | "0 87958.0 2.676827 19.940902 ... 18.101711 87958.0 6.0\n",
2423 | "1 26338.0 5.603060 26.801530 ... NaN 26338.0 7.0\n",
2424 | "2 55436.0 1.374696 12.523684 ... NaN 55436.0 8.0\n",
2425 | "3 101335.0 1.514648 9.358670 ... 10.286619 101335.0 7.0\n",
2426 | "4 13268.0 3.000000 19.500000 ... 23.166667 13268.0 2.0\n",
2427 | "\n",
2428 | "[5 rows x 85 columns]"
2429 | ]
2430 | },
2431 | "metadata": {},
2432 | "execution_count": 21
2433 | }
2434 | ]
2435 | },
2436 | {
2437 | "cell_type": "code",
2438 | "metadata": {
2439 | "colab": {
2440 | "base_uri": "https://localhost:8080/",
2441 | "height": 252
2442 | },
2443 | "id": "cTIe-_-jfNK_",
2444 | "outputId": "75d3f1b7-42f7-4828-dc25-52fbb1aa97a3"
2445 | },
2446 | "source": [
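2447 | "sar_train = final_data_mean.iloc[:, 1:].copy()  # drop the groupBy key column; copy to avoid SettingWithCopyWarning\n",
2448 | "sar_train.columns = final_data.columns  # replace the avg(...) names with the original column names\n",
2449 | "sar_train['y'] = sar_train['y'].round().astype(int)\n",
2450 | "sar_train = sar_train[sar_train['y'] != 0].reset_index(drop = True)\n",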
2451 | "sar_train.head()"
2452 | ],
2453 | "execution_count": null,
2454 | "outputs": [
2455 | {
2456 | "output_type": "execute_result",
2457 | "data": {
2986 | "text/plain": [
2987 | " 04_0_VH 04_0_VV 04_1_VH 04_1_VV ... 11_4_VH 11_4_VV fields y\n",
2988 | "0 2.676827 19.940902 2.954588 16.914774 ... 2.813997 18.101711 87958.0 6\n",
2989 | "1 5.603060 26.801530 2.649399 27.394098 ... NaN NaN 26338.0 7\n",
2990 | "2 1.374696 12.523684 1.226113 12.234818 ... NaN NaN 55436.0 8\n",
2991 | "3 1.514648 9.358670 1.258116 7.973872 ... 1.653998 10.286619 101335.0 7\n",
2992 | "4 3.000000 19.500000 5.166667 12.833333 ... 5.000000 23.166667 13268.0 2\n",
2993 | "\n",
2994 | "[5 rows x 84 columns]"
2995 | ]
2996 | },
2997 | "metadata": {},
2998 | "execution_count": 41
2999 | }
3000 | ]
3001 | },
3002 | {
3003 | "cell_type": "code",
3004 | "metadata": {
3005 | "colab": {
3006 | "base_uri": "https://localhost:8080/"
3007 | },
3008 | "id": "Ox20DSAEfyyG",
3009 | "outputId": "64bdf921-d532-42f8-c280-3df1b8061138"
3010 | },
3011 | "source": [
3012 | "sar_train.y.unique()"
3013 | ],
3014 | "execution_count": null,
3015 | "outputs": [
3016 | {
3017 | "output_type": "execute_result",
3018 | "data": {
3019 | "text/plain": [
3020 | "array([6, 7, 8, 2, 5, 4, 1, 3, 9])"
3021 | ]
3022 | },
3023 | "metadata": {},
3024 | "execution_count": 42
3025 | }
3026 | ]
3027 | },
3028 | {
3029 | "cell_type": "code",
3030 | "metadata": {
3031 | "id": "N5PwFjR0hKZY"
3032 | },
3033 | "source": [
3034 | "sar_train.to_csv('/content/drive/MyDrive/CompeData/Radiant/s1_all_data/sar_train.csv', index = False)"
3035 | ],
3036 | "execution_count": null,
3037 | "outputs": []
3038 | },
3039 | {
3040 | "cell_type": "code",
3041 | "metadata": {
3042 | "colab": {
3043 | "base_uri": "https://localhost:8080/",
3044 | "height": 82,
3045 | "referenced_widgets": [
3046 | "c05b441e442f4c2396c9ba920423e05a",
3047 | "80ac4e5eaa574d1d9796dafb61c968b1",
3048 | "dbdc3f534915460d8c094fe3fbf6d3e5",
3049 | "a930671d0de84ca093dcb8b4e3cb1d0d",
3050 | "62b1340ef5b146a69f5426ddd31a132e",
3051 | "b760c18a75cb4a3785b0aa429ac21660",
3052 | "c02d098387c44ac899bcc1a5ef97aeaa",
3053 | "81305ee731aa4b958c71a180aedcc7a2",
3054 | "f37f5d60c87b4652b46b6175882c9caa",
3055 | "4843dbc1f2dd4701a136f87dfa1b940e",
3056 | "e7f81c8eee9e460faa0c2a8eba58f3c9"
3057 | ]
3058 | },
3059 | "id": "lfwLN-mbhYhb",
3060 | "outputId": "2fa25d79-cfe8-4a5f-af3d-35ff314c90fb"
3061 | },
3062 | "source": [
3063 | "%%time\n",
3064 | "dfs = []\n",
3065 | "for i in tqdm_notebook(glob.glob('/content/drive/MyDrive/CompeData/Radiant/s1_all_data/s1_test*.csv')):\n",
3066 | "    dfs.append(spark.read.csv(path = i, sep =',', encoding = 'UTF-8', comment = None, header = True))"
3067 | ],
3068 | "execution_count": null,
3069 | "outputs": [
3070 | {
3071 | "output_type": "display_data",
3072 | "data": {
3073 | "application/vnd.jupyter.widget-view+json": {
3074 | "model_id": "c05b441e442f4c2396c9ba920423e05a",
3075 | "version_minor": 0,
3076 | "version_major": 2
3077 | },
3078 | "text/plain": [
3079 | "  0%|          | 0/5 [00:00<?, ?it/s]"
3080 | ]
3081 | },
3082 | "metadata": {}
3083 | },
3084 | {
3085 | "output_type": "stream",
3086 | "name": "stdout",
3087 | "text": [
3088 | "CPU times: user 89.9 ms, sys: 3.57 ms, total: 93.4 ms\n",
3089 | "Wall time: 1.99 s\n"
3090 | ]
3091 | }
3092 | ]
3093 | },
3094 | {
3095 | "cell_type": "code",
3096 | "metadata": {
3097 | "colab": {
3098 | "base_uri": "https://localhost:8080/",
3099 | "height": 82,
3100 | "referenced_widgets": [
3101 | "dc7d7680505d44999e165717debcd1a4",
3102 | "63cf9d3439294863a5728040904c8ad7",
3103 | "2bfaac11391a4adfb91433e0f3368683",
3104 | "788a1eafa3814b9399816accd124f6eb",
3105 | "d984a74eb7ad432a82fde659554da5b9",
3106 | "50696b849220459495b9b9a091ce05bb",
3107 | "72efa9c92d01410dbca8b357bc581e56",
3108 | "5be0c6d165d748fcbb017817fd29c708",
3109 | "f3591ac91f184068827b659c302bbe99",
3110 | "9bca2e1c229b41d7add2792d72015b3a",
3111 | "03f86a1edb104dd2b55a587599a8339e"
3112 | ]
3113 | },
3114 | "id": "-YNI_CPkhgHX",
3115 | "outputId": "f2f498b1-8514-4e31-c151-ffe28bc77a7a"
3116 | },
3117 | "source": [
3118 | "%%time\n",
3119 | "cols = set()\n",
3120 | "for j in tqdm_notebook(dfs):\n",
3121 | "    for x in j.columns:\n",
3122 | "        cols.add(x)\n",
3123 | "cols = sorted(cols)\n",
3124 | "\n",
3125 | "# Create a dictionary with all the dataframes\n",
3126 | "all_data = {}\n",
3127 | "for i, d in enumerate(dfs):\n",
3128 | "    new_name = 'new_df' + str(i)  # New name for the key, the dataframe is the value\n",
3129 | "    all_data[new_name] = d\n",
3130 | "    # Loop through all column names. Add the missing columns to the dataframe (with value 0)\n",
3131 | "    for x in cols:\n",
3132 | "        if x not in d.columns:\n",
3133 | "            all_data[new_name] = all_data[new_name].withColumn(x, F.lit(0))\n",
3134 | "    all_data[new_name] = all_data[new_name].select(cols)  # Use 'select' to get the columns sorted"
3135 | ],
3136 | "execution_count": null,
3137 | "outputs": [
3138 | {
3139 | "output_type": "display_data",
3140 | "data": {
3141 | "application/vnd.jupyter.widget-view+json": {
3142 | "model_id": "dc7d7680505d44999e165717debcd1a4",
3143 | "version_minor": 0,
3144 | "version_major": 2
3145 | },
3146 | "text/plain": [
3147 | "  0%|          | 0/5 [00:00<?, ?it/s]"
3148 | ]
3149 | },
3150 | "metadata": {}
3151 | },
3152 | {
3153 | "output_type": "stream",
3154 | "name": "stdout",
3155 | "text": [
3156 | "CPU times: user 199 ms, sys: 28.2 ms, total: 227 ms\n",
3157 | "Wall time: 1.02 s\n"
3158 | ]
3159 | }
3160 | ]
3161 | },
3162 | {
3163 | "cell_type": "code",
3164 | "metadata": {
3165 | "colab": {
3166 | "base_uri": "https://localhost:8080/"
3167 | },
3168 | "id": "83S0Xj9whgDh",
3169 | "outputId": "b28edf77-03f0-453d-b8e8-6c8ac87316f4"
3170 | },
3171 | "source": [
3172 | "len(dfs)"
3173 | ],
3174 | "execution_count": null,
3175 | "outputs": [
3176 | {
3177 | "output_type": "execute_result",
3178 | "data": {
3179 | "text/plain": [
3180 | "5"
3181 | ]
3182 | },
3183 | "metadata": {},
3184 | "execution_count": 46
3185 | }
3186 | ]
3187 | },
3188 | {
3189 | "cell_type": "code",
3190 | "metadata": {
3191 | "colab": {
3192 | "base_uri": "https://localhost:8080/"
3193 | },
3194 | "id": "7ggUilAahf-4",
3195 | "outputId": "56a82920-dc5f-4e4c-bed5-7929f73e4454"
3196 | },
3197 | "source": [
3198 | "%%time\n",
3199 | "all_data_dfs = [all_data['new_df' + str(i)] for i in range(len(all_data))]\n",
3200 | "final_data = reduce(DataFrame.unionAll, all_data_dfs)\n",
3201 | "final_data.show(5)"
3202 | ],
3203 | "execution_count": null,
3204 | "outputs": [
3205 | {
3206 | "output_type": "stream",
3207 | "name": "stdout",
3208 | "text": [
3209 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+\n",
3210 | "|04_0_VH|04_0_VV|04_1_VH|04_1_VV|04_2_VH|04_2_VV|04_3_VH|04_3_VV|04_4_VH|04_4_VV|04_5_VH|04_5_VV|05_0_VH|05_0_VV|05_1_VH|05_1_VV|05_2_VH|05_2_VV|05_3_VH|05_3_VV|05_4_VH|05_4_VV|06_0_VH|06_0_VV|06_1_VH|06_1_VV|06_2_VH|06_2_VV|06_3_VH|06_3_VV|06_4_VH|06_4_VV|07_0_VH|07_0_VV|07_1_VH|07_1_VV|07_2_VH|07_2_VV|07_3_VH|07_3_VV|07_4_VH|07_4_VV|08_0_VH|08_0_VV|08_1_VH|08_1_VV|08_2_VH|08_2_VV|08_3_VH|08_3_VV|08_4_VH|08_4_VV|09_0_VH|09_0_VV|09_1_VH|09_1_VV|09_2_VH|09_2_VV|09_3_VH|09_3_VV|09_4_VH|09_4_VV|10_0_VH|10_0_VV|10_1_VH|10_1_VV|10_2_VH|10_2_VV|10_3_VH|10_3_VV|10_4_VH|10_4_VV|11_0_VH|11_0_VV|11_1_VH|11_1_VV|11_2_VH|11_2_VV|11_3_VH|11_3_VV|11_4_VH|11_4_VV|fields|\n",
3211 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+\n",
3212 | "| 6.0| 14.0| 5.0| 35.0| 5.0| 15.0| 7.0| 16.0| 9.0| 23.0| 7.0| 14.0| 7.0| 31.0| 10.0| 32.0| 2.0| 26.0| 6.0| 28.0| 6.0| 37.0| 6.0| 28.0| 14.0| 21.0| 13.0| 20.0| 6.0| 16.0| 4.0| 12.0| 3.0| 37.0| 5.0| 14.0| 1.0| 27.0| 4.0| 17.0| 4.0| 23.0| 4.0| 15.0| 14.0| 21.0| 11.0| 21.0| 5.0| 29.0| 5.0| 10.0| 1.0| 16.0| 2.0| 19.0| 1.0| 27.0| 9.0| 26.0| 3.0| 27.0| 9.0| 27.0| 11.0| 33.0| 12.0| 36.0| 5.0| 9.0| 11.0| 20.0| 15.0| 21.0| 5.0| 23.0| 4.0| 9.0| 7.0| 28.0| 3.0| 20.0| 0|\n",
3213 | "| 4.0| 19.0| 8.0| 30.0| 3.0| 14.0| 4.0| 12.0| 6.0| 28.0| 10.0| 39.0| 10.0| 35.0| 9.0| 13.0| 1.0| 29.0| 9.0| 19.0| 5.0| 45.0| 3.0| 33.0| 11.0| 42.0| 8.0| 27.0| 10.0| 31.0| 3.0| 11.0| 3.0| 48.0| 7.0| 17.0| 2.0| 24.0| 3.0| 11.0| 8.0| 34.0| 4.0| 16.0| 3.0| 23.0| 13.0| 43.0| 10.0| 36.0| 6.0| 10.0| 9.0| 12.0| 6.0| 24.0| 1.0| 34.0| 4.0| 13.0| 3.0| 29.0| 3.0| 29.0| 8.0| 23.0| 10.0| 15.0| 6.0| 4.0| 9.0| 17.0| 4.0| 27.0| 5.0| 12.0| 3.0| 12.0| 8.0| 16.0| 7.0| 8.0| 0|\n",
3214 | "| 1.0| 12.0| 10.0| 10.0| 6.0| 19.0| 2.0| 8.0| 5.0| 30.0| 10.0| 22.0| 4.0| 18.0| 5.0| 7.0| 1.0| 25.0| 8.0| 7.0| 3.0| 28.0| 4.0| 42.0| 6.0| 36.0| 8.0| 27.0| 7.0| 27.0| 2.0| 21.0| 5.0| 34.0| 4.0| 14.0| 3.0| 15.0| 3.0| 9.0| 9.0| 22.0| 2.0| 23.0| 3.0| 23.0| 9.0| 37.0| 6.0| 33.0| 4.0| 12.0| 12.0| 8.0| 3.0| 20.0| 4.0| 14.0| 4.0| 10.0| 4.0| 16.0| 10.0| 29.0| 5.0| 28.0| 8.0| 14.0| 9.0| 10.0| 8.0| 16.0| 5.0| 20.0| 6.0| 10.0| 1.0| 7.0| 7.0| 17.0| 10.0| 16.0| 0|\n",
3215 | "| 5.0| 6.0| 4.0| 13.0| 10.0| 15.0| 4.0| 12.0| 7.0| 21.0| 17.0| 22.0| 5.0| 6.0| 6.0| 12.0| 2.0| 14.0| 9.0| 16.0| 5.0| 9.0| 6.0| 30.0| 3.0| 19.0| 5.0| 17.0| 7.0| 18.0| 8.0| 29.0| 4.0| 27.0| 4.0| 23.0| 6.0| 16.0| 7.0| 5.0| 7.0| 10.0| 3.0| 8.0| 3.0| 19.0| 5.0| 30.0| 9.0| 19.0| 2.0| 12.0| 8.0| 17.0| 1.0| 18.0| 9.0| 9.0| 4.0| 10.0| 5.0| 10.0| 15.0| 24.0| 5.0| 19.0| 3.0| 15.0| 9.0| 20.0| 6.0| 9.0| 5.0| 26.0| 6.0| 6.0| 6.0| 6.0| 8.0| 15.0| 8.0| 10.0| 0|\n",
3216 | "| 10.0| 9.0| 5.0| 21.0| 10.0| 18.0| 7.0| 13.0| 4.0| 18.0| 9.0| 24.0| 6.0| 10.0| 10.0| 15.0| 6.0| 10.0| 9.0| 16.0| 11.0| 18.0| 10.0| 21.0| 2.0| 17.0| 8.0| 15.0| 6.0| 23.0| 11.0| 19.0| 7.0| 29.0| 8.0| 17.0| 8.0| 18.0| 9.0| 6.0| 8.0| 9.0| 5.0| 11.0| 4.0| 27.0| 6.0| 25.0| 7.0| 15.0| 8.0| 16.0| 4.0| 23.0| 2.0| 13.0| 12.0| 5.0| 3.0| 6.0| 2.0| 6.0| 19.0| 14.0| 4.0| 9.0| 2.0| 15.0| 8.0| 17.0| 5.0| 12.0| 3.0| 35.0| 5.0| 31.0| 8.0| 17.0| 16.0| 16.0| 9.0| 13.0| 0|\n",
3217 | "+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------+\n",
3218 | "only showing top 5 rows\n",
3219 | "\n",
3220 | "CPU times: user 5.53 ms, sys: 2.15 ms, total: 7.68 ms\n",
3221 | "Wall time: 781 ms\n"
3222 | ]
3223 | }
3224 | ]
3225 | },
3226 | {
3227 | "cell_type": "code",
3228 | "metadata": {
3229 | "id": "D6fzj2Rshf6s"
3230 | },
3231 | "source": [
3232 | "final_data = final_data.select(*(F.col(c).cast(\"float\").alias(c) for c in final_data.columns))  # cast every column, including 'fields', to float before averaging\n",
3233 | "final_data.printSchema()"
3234 | ],
3235 | "execution_count": null,
3236 | "outputs": []
3237 | },
3238 | {
3239 | "cell_type": "code",
3240 | "metadata": {
3241 | "colab": {
3242 | "base_uri": "https://localhost:8080/"
3243 | },
3244 | "id": "4Z_rdapMhqdo",
3245 | "outputId": "cc794914-81ff-484e-fefc-7559d23e07f2"
3246 | },
3247 | "source": [
3248 | "%%time\n",
3249 | "final_data_mean = final_data.groupBy('fields').mean().toPandas()  # per-field mean of every column, collected into a pandas DataFrame\n",
3250 | "final_data_mean.head()"
3251 | ],
3252 | "execution_count": null,
3253 | "outputs": [
3254 | {
3255 | "output_type": "stream",
3256 | "name": "stdout",
3257 | "text": [
3258 | "CPU times: user 10.1 s, sys: 1.04 s, total: 11.1 s\n",
3259 | "Wall time: 25min 29s\n"
3260 | ]
3261 | }
3262 | ]
3263 | },
3264 | {
3265 | "cell_type": "code",
3266 | "metadata": {
3267 | "colab": {
3268 | "base_uri": "https://localhost:8080/",
3269 | "height": 252
3270 | },
3271 | "id": "VSQCX0ZshqaK",
3272 | "outputId": "1f78a9bc-c87b-40a8-d39b-ffeecd5e1ece"
3273 | },
3274 | "source": [
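3275 | "sar_test = final_data_mean.iloc[:, 1:]  # drop the leading group-key column; avg(fields) remains as the last column\n",
3276 | "sar_test.columns = final_data.columns  # restore the original names in place of Spark's avg(...) labels\n",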
3277 | "sar_test.head()"
3278 | ],
3279 | "execution_count": null,
3280 | "outputs": [
3281 | {
3282 | "output_type": "execute_result",
3283 | "data": {
3284 | "text/html": [
3285 | "<div>\n",
3299 | "<table border=\"1\" class=\"dataframe\">\n",
3300 | "  <thead>\n",
3301 | "    <tr><th></th><th>04_0_VH</th><th>04_0_VV</th><th>04_1_VH</th><th>04_1_VV</th><th>04_2_VH</th><th>04_2_VV</th><th>04_3_VH</th><th>04_3_VV</th><th>04_4_VH</th><th>04_4_VV</th><th>04_5_VH</th><th>04_5_VV</th><th>05_0_VH</th><th>05_0_VV</th><th>05_1_VH</th><th>05_1_VV</th><th>05_2_VH</th><th>05_2_VV</th><th>05_3_VH</th><th>05_3_VV</th><th>05_4_VH</th><th>05_4_VV</th><th>06_0_VH</th><th>06_0_VV</th><th>06_1_VH</th><th>06_1_VV</th><th>06_2_VH</th><th>06_2_VV</th><th>06_3_VH</th><th>06_3_VV</th><th>06_4_VH</th><th>06_4_VV</th><th>07_0_VH</th><th>07_0_VV</th><th>07_1_VH</th><th>07_1_VV</th><th>07_2_VH</th><th>07_2_VV</th><th>07_3_VH</th><th>07_3_VV</th><th>...</th><th>08_0_VV</th><th>08_1_VH</th><th>08_1_VV</th><th>08_2_VH</th><th>08_2_VV</th><th>08_3_VH</th><th>08_3_VV</th><th>08_4_VH</th><th>08_4_VV</th><th>09_0_VH</th><th>09_0_VV</th><th>09_1_VH</th><th>09_1_VV</th><th>09_2_VH</th><th>09_2_VV</th><th>09_3_VH</th><th>09_3_VV</th><th>09_4_VH</th><th>09_4_VV</th><th>10_0_VH</th><th>10_0_VV</th><th>10_1_VH</th><th>10_1_VV</th><th>10_2_VH</th><th>10_2_VV</th><th>10_3_VH</th><th>10_3_VV</th><th>10_4_VH</th><th>10_4_VV</th><th>11_0_VH</th><th>11_0_VV</th><th>11_1_VH</th><th>11_1_VV</th><th>11_2_VH</th><th>11_2_VV</th><th>11_3_VH</th><th>11_3_VV</th><th>11_4_VH</th><th>11_4_VV</th><th>fields</th></tr>\n",
3385 | "  </thead>\n",
3386 | "  <tbody>\n",
3387 | "    <tr><th>0</th><td>0.816514</td><td>5.449541</td><td>0.651376</td><td>4.724771</td><td>0.917431</td><td>4.431193</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0.715596</td><td>3.256881</td><td>0.761468</td><td>4.458716</td><td>0.853211</td><td>4.458716</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0.825688</td><td>14.165138</td><td>0.954128</td><td>16.211009</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.284404</td><td>5.422018</td><td>0.880734</td><td>11.623853</td><td>0.853211</td><td>6.009174</td><td>NaN</td><td>NaN</td><td>...</td><td>23.899083</td><td>1.807339</td><td>16.972477</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.724771</td><td>9.348624</td><td>2.321101</td><td>7.706422</td><td>1.834862</td><td>8.174312</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.522936</td><td>6.715596</td><td>1.899083</td><td>10.045872</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.862385</td><td>7.807339</td><td>1.972477</td><td>8.275229</td><td>1.110092</td><td>7.440367</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>714.0</td></tr>\n",
3471 | "    <tr><th>1</th><td>0.954545</td><td>22.136364</td><td>1.636364</td><td>11.272727</td><td>2.863636</td><td>9.181818</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.545455</td><td>9.318182</td><td>2.363636</td><td>8.272727</td><td>1.818182</td><td>10.772727</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.590909</td><td>15.363636</td><td>2.000000</td><td>8.545455</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>2.363636</td><td>12.863636</td><td>3.181818</td><td>11.136364</td><td>2.590909</td><td>6.818182</td><td>NaN</td><td>NaN</td><td>...</td><td>7.454545</td><td>1.545455</td><td>10.590909</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.227273</td><td>11.136364</td><td>3.227273</td><td>9.772727</td><td>1.772727</td><td>12.136364</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>2.818182</td><td>11.409091</td><td>2.181818</td><td>9.954545</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.000000</td><td>9.136364</td><td>2.136364</td><td>9.318182</td><td>1.954545</td><td>9.590909</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>118859.0</td></tr>\n",
3555 | "    <tr><th>2</th><td>1.137255</td><td>7.758170</td><td>0.705882</td><td>4.725490</td><td>1.000000</td><td>7.254902</td><td>0.882353</td><td>5.594771</td><td>1.346405</td><td>7.431373</td><td>2.836601</td><td>16.104575</td><td>1.104575</td><td>6.882353</td><td>1.000000</td><td>5.496732</td><td>1.418301</td><td>7.549020</td><td>0.856209</td><td>6.248366</td><td>1.287582</td><td>7.006536</td><td>0.856209</td><td>6.241830</td><td>1.993464</td><td>23.379085</td><td>2.111111</td><td>16.274510</td><td>2.895425</td><td>19.019608</td><td>2.647059</td><td>14.287582</td><td>4.130719</td><td>17.235294</td><td>3.980392</td><td>14.522876</td><td>3.823529</td><td>22.444444</td><td>4.241830</td><td>17.372549</td><td>...</td><td>14.751634</td><td>4.640523</td><td>20.758170</td><td>4.431373</td><td>21.209150</td><td>5.947712</td><td>19.209150</td><td>5.875817</td><td>21.725490</td><td>4.901961</td><td>15.790850</td><td>5.248366</td><td>16.254902</td><td>3.254902</td><td>10.856209</td><td>4.143791</td><td>11.830065</td><td>2.732026</td><td>7.869281</td><td>5.267974</td><td>12.862745</td><td>2.732026</td><td>7.803922</td><td>2.150327</td><td>10.111111</td><td>2.294118</td><td>8.830065</td><td>2.359477</td><td>6.869281</td><td>2.235294</td><td>15.287582</td><td>2.339869</td><td>7.542484</td><td>1.777778</td><td>8.679739</td><td>1.627451</td><td>7.026144</td><td>2.424837</td><td>9.052288</td><td>97477.0</td></tr>\n",
3639 | "    <tr><th>3</th><td>2.223183</td><td>7.326990</td><td>1.904844</td><td>6.553633</td><td>1.930796</td><td>9.442907</td><td>1.871972</td><td>6.891003</td><td>2.129758</td><td>7.645329</td><td>3.726644</td><td>15.249135</td><td>1.762976</td><td>6.420415</td><td>1.610727</td><td>7.237024</td><td>1.946367</td><td>8.410035</td><td>2.178201</td><td>9.287197</td><td>2.411765</td><td>9.077855</td><td>1.896194</td><td>9.363322</td><td>2.603806</td><td>12.671280</td><td>2.429066</td><td>13.475779</td><td>2.859862</td><td>11.269896</td><td>2.169550</td><td>10.060554</td><td>2.768166</td><td>9.553633</td><td>2.131488</td><td>10.072664</td><td>2.695502</td><td>11.344291</td><td>2.266436</td><td>9.012111</td><td>...</td><td>9.608997</td><td>3.200692</td><td>17.339100</td><td>3.304498</td><td>12.577855</td><td>3.567474</td><td>13.972318</td><td>3.399654</td><td>11.550173</td><td>3.608997</td><td>10.935986</td><td>3.008651</td><td>9.434256</td><td>3.266436</td><td>10.474048</td><td>2.955017</td><td>10.346021</td><td>2.764706</td><td>11.133218</td><td>2.752595</td><td>9.804498</td><td>2.750865</td><td>10.935986</td><td>2.484429</td><td>11.025952</td><td>2.804498</td><td>10.377163</td><td>2.519031</td><td>9.332180</td><td>2.468858</td><td>13.237024</td><td>2.385813</td><td>9.250865</td><td>2.960208</td><td>9.309689</td><td>2.318339</td><td>8.731834</td><td>3.013841</td><td>10.859862</td><td>65754.0</td></tr>\n",
3723 | "    <tr><th>4</th><td>2.090909</td><td>14.045455</td><td>2.462121</td><td>11.409091</td><td>3.969697</td><td>19.333333</td><td>2.590909</td><td>12.227273</td><td>2.742424</td><td>12.318182</td><td>3.454545</td><td>12.780303</td><td>2.325758</td><td>16.598485</td><td>3.757576</td><td>22.780303</td><td>2.954545</td><td>21.439394</td><td>2.848485</td><td>21.765152</td><td>3.090909</td><td>23.909091</td><td>3.530303</td><td>39.219697</td><td>7.393939</td><td>92.030303</td><td>6.325758</td><td>65.325758</td><td>4.878788</td><td>64.098485</td><td>7.045455</td><td>52.803030</td><td>3.257576</td><td>39.045455</td><td>5.962121</td><td>51.613636</td><td>4.257576</td><td>51.265152</td><td>4.886364</td><td>34.128788</td><td>...</td><td>29.333333</td><td>6.909091</td><td>72.530303</td><td>8.939394</td><td>51.833333</td><td>9.431818</td><td>64.310606</td><td>3.659091</td><td>20.946970</td><td>3.484848</td><td>22.159091</td><td>5.333333</td><td>20.787879</td><td>3.469697</td><td>16.810606</td><td>3.984848</td><td>12.227273</td><td>4.583333</td><td>15.893939</td><td>4.325758</td><td>10.204545</td><td>2.856061</td><td>9.325758</td><td>6.515152</td><td>10.166667</td><td>3.840909</td><td>8.363636</td><td>4.325758</td><td>7.265152</td><td>9.537879</td><td>25.234848</td><td>4.272727</td><td>10.431818</td><td>3.462121</td><td>16.500000</td><td>3.075758</td><td>13.515152</td><td>3.606061</td><td>18.939394</td><td>120372.0</td></tr>\n",
3807 | "  </tbody>\n",
3808 | "</table>\n",
3809 | "<p>5 rows × 83 columns</p>\n",
3810 | "</div>"
3811 | ],
3812 | "text/plain": [
3813 | " 04_0_VH 04_0_VV 04_1_VH ... 11_4_VH 11_4_VV fields\n",
3814 | "0 0.816514 5.449541 0.651376 ... NaN NaN 714.0\n",
3815 | "1 0.954545 22.136364 1.636364 ... NaN NaN 118859.0\n",
3816 | "2 1.137255 7.758170 0.705882 ... 2.424837 9.052288 97477.0\n",
3817 | "3 2.223183 7.326990 1.904844 ... 3.013841 10.859862 65754.0\n",
3818 | "4 2.090909 14.045455 2.462121 ... 3.606061 18.939394 120372.0\n",
3819 | "\n",
3820 | "[5 rows x 83 columns]"
3821 | ]
3822 | },
3823 | "metadata": {},
3824 | "execution_count": 50
3825 | }
3826 | ]
3827 | },
3828 | {
3829 | "cell_type": "code",
3830 | "metadata": {
3831 | "id": "q_fhmm9WhqWH"
3832 | },
3833 | "source": [
3834 | "sar_test.to_csv('/content/drive/MyDrive/CompeData/Radiant/s1_all_data/sar_test.csv', index = False)"
3835 | ],
3836 | "execution_count": null,
3837 | "outputs": []
3838 | }
3839 | ]
3840 | }
--------------------------------------------------------------------------------