├── README.md ├── fsdr-1122.ipynb ├── heml0922-spartificial.ipynb ├── psgq-0922.ipynb ├── pvpda1122.ipynb ├── rspd-1222.ipynb ├── sdsi0922.ipynb └── spml0922.ipynb /README.md: -------------------------------------------------------------------------------- 1 | # YouTube Academic Projects 2 | 3 | 4 | YouTube Academic Projects Series is a YouTube video series by [Spartificial](https://spartificial.com/) that provides free, certified machine learning student projects focusing on space technology and sustainable development. 5 | 6 | Here is the [YouTube Playlist](https://youtube.com/playlist?list=PL7HQvd_RTCc3Vope7dkx4pggrH5f-uvZe) of all the project videos. Watch them for a thorough explanation. 7 | 8 | 9 |  10 | 11 | If you have completed the tasks for any of the projects, you can submit your solution notebook through this [form](https://docs.google.com/forms/d/e/1FAIpQLSd0TiEf7SsHMS7dvnkUzUZBiXKq-0Ctv8ejjNjbubR4LHfGtg/viewform). 12 | 13 | 14 | 1. [Hunting for Exoplanet with Machine Learning](https://github.com/Spartificial/yt-acad-projs/blob/main/heml0922-spartificial.ipynb) 15 | 2. [Ship Detection from Satellite Imagery](https://github.com/Spartificial/yt-acad-projs/blob/main/sdsi0922.ipynb) 16 | 3. [Sunspots Prediction using Machine Learning](https://github.com/Spartificial/yt-acad-projs/blob/main/spml0922.ipynb) 17 | 4. [India's PV Power Potential Data Analysis](https://github.com/Spartificial/yt-acad-projs/blob/main/pvpda1122.ipynb) 18 | 5. [Predicting Stars, Galaxies & Quasars with ML Model](https://github.com/Spartificial/yt-acad-projs/blob/main/psgq-0922.ipynb) 19 | 6. [Rooftop Solar Panel Detection using Deep Learning](https://github.com/Spartificial/yt-acad-projs/blob/main/rspd-1222.ipynb) 20 | 7. 
[Fish Detection using Deep Learning](https://github.com/Spartificial/yt-acad-projs/blob/main/fsdr-1122.ipynb) 21 | -------------------------------------------------------------------------------- /heml0922-spartificial.ipynb: -------------------------------------------------------------------------------- 1 | {"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.7.12","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"#
Presently, 1% of the electricity produced worldwide comes from solar energy. Projections for solar energy production indicate a possible 65-fold increase in output by 2050, which would make solar energy one of the world's greatest sources of energy at that point. Thirty percent of this energy is expected to be produced by solar photovoltaic (solar PV) power systems mounted on rooftops. Solar PV power has already started to take on a growing share of electricity generation in the US in recent years: solar energy production increased by 75,123 GWh, a 39-fold increase, between 2008 and 2017.\n", 58 | "\n", 59 | "
Here's an overview of the global growth:\n", 60 | "\n", 61 | "\n", 62 | "\n", 63 | "Credits: Bloomberg\n", 64 | "\n", 65 | "
Granular data on distributed rooftop solar PV is becoming increasingly important as solar photovoltaic (PV) becomes a significant segment of the energy industry. An imagery-based solar panel recognition algorithm that can be used to create detailed databases of installations and their power capacity would be extremely helpful to solar power suppliers and consumers, urban planners, grid system operators, and energy policy makers. Solar panel detection also matters because solar panel installers typically keep installation details to themselves. A reliable solar panel detection technique or algorithm is therefore urgently needed. However, little work has been done so far to identify solar panels in aerial or satellite photographs.\n", 66 | "\n", 67 | "
We first require a labelled data-set of satellite images in order to create an algorithm that can recognise solar panels from aerial or satellite imagery.\n" 68 | ], 69 | "metadata": { 70 | "id": "-VGVSOmgoqtq" 71 | } 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "source": [ 76 | "# Understanding the Dataset\n", 77 | "\n", 78 | "#####
When examining the photographs themselves, it is clear that solar panels frequently have rectangular shapes with distinct angles and borders. However, the full images that include solar PV do not necessarily share the same structure. The solar panels, which come in a range of sizes and hues, are not always at the centre of the image. Additionally, the background scenery in the photographs of the two classes is not uniform. Both classes contain examples of home swimming pools, pavement, grass, and rooftops. A model should also be able to predict the same class regardless of the orientation of each image." 87 | ], 88 | "metadata": { 89 | "id": "PKm9aP3Quf2O" 90 | } 91 | }, 92 | { 93 | "cell_type": "markdown", 94 | "source": [ 95 | "#Importing necessary libraries and modules for this notebook" 96 | ], 97 | "metadata": { 98 | "id": "qIy76Rc0BENn" 99 | } 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": { 105 | "id": "rI4GdpfgMmfe" 106 | }, 107 | "outputs": [], 108 | "source": [ 109 | "# IMPORT REQUIRED LIBRARIES AND FUNCTIONS\n", 110 | "\n", 111 | "\n", 112 | "'''Data Handling & Linear Algebra'''\n", 113 | "import numpy as np\n", 114 | "import pandas as pd\n", 115 | "\n", 116 | "'''Visualisation'''\n", 117 | "import matplotlib.pyplot as plt\n", 118 | "import matplotlib as mpl\n", 119 | "from pylab import rcParams\n", 120 | "import seaborn as sns\n", 121 | "\n", 122 | "'''Data Analysis'''\n", 123 | "from sklearn.model_selection import StratifiedKFold\n", 124 | "from sklearn.metrics import roc_auc_score\n", 125 | "from sklearn.metrics import roc_curve\n", 126 | "from sklearn.metrics import confusion_matrix\n", 127 | "\n", 128 | "'''Manipulating Data and Model Building'''\n", 129 | "from keras.layers import Conv2D\n", 130 | "from keras.layers import Dense\n", 131 | "from keras.layers import GlobalMaxPooling2D\n", 132 | "from keras.layers import MaxPooling2D\n", 133 | "from keras.layers import BatchNormalization\n", 134 | 
"from keras.layers import Add\n", 135 | "from keras.models import Sequential" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "source": [ 141 | "###Importing Google Drive for Dataset Access" 142 | ], 143 | "metadata": { 144 | "id": "bvIn56Q0ArnN" 145 | } 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "source": [ 150 | "- Download this dataset to your system.\n", 151 | "- Upload this 'data' folder directly in your 'Main Drive'." 152 | ], 153 | "metadata": { 154 | "id": "0x3uRuOYnyf7" 155 | } 156 | }, 157 | { 158 | "cell_type": "code", 159 | "source": [ 160 | "from google.colab import drive\n", 161 | "drive.mount('/content/drive')" 162 | ], 163 | "metadata": { 164 | "id": "MCFAQAs9YS93" 165 | }, 166 | "execution_count": null, 167 | "outputs": [] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": null, 172 | "metadata": { 173 | "id": "nazMB-iPHyhz" 174 | }, 175 | "outputs": [], 176 | "source": [ 177 | "# define dataset directories - the below links won't work if you haven't placed 'data' folder in your 'Main Drive'\n", 178 | "DIR_TRAIN_IMAGES = \"/content/drive/MyDrive/data/training/\"\n", 179 | "DIR_TRAIN_LABELS = \"/content/drive/MyDrive/data/labels_training.csv\"" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "source": [ 185 | "#Exploratory Analysis & Data Scaling\n", 186 | "\n", 187 | "\n" 188 | ], 189 | "metadata": { 190 | "id": "coIj9Hc42GzC" 191 | } 192 | }, 193 | { 194 | "cell_type": "code", 195 | "source": [ 196 | "pd.read_csv(DIR_TRAIN_LABELS).head()" 197 | ], 198 | "metadata": { 199 | "id": "XffklsVltSdU" 200 | }, 201 | "execution_count": null, 202 | "outputs": [] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "source": [ 207 | "- id are names of the image before tif\n", 208 | "- label has two values:\n", 209 | " - 0: No solar panels in the image\n", 210 | " - 1: Solar panels present in the image" 211 | ], 212 | "metadata": { 213 | "id": "iE5jH577tWAL" 214 | } 215 | }, 216 | { 217 | "cell_type": 
"code", 218 | "source": [ 219 | "# LOADING DATA AND PREPROCESSING\n", 220 | "\n", 221 | "def load_data(dir_data, dir_labels):\n", 222 | " '''\n", 223 | " dir_data: Data directory\n", 224 | " dir_labels: Respective csv file containing ids and labels\n", 225 | " returns: Array of all the image arrays and its respective labels\n", 226 | " '''\n", 227 | " labels_pd = pd.read_csv(dir_labels) # Read the csv file with labels and ids as we saw above\n", 228 | " ids = labels_pd.id.values # Extracting ids from the csv file\n", 229 | " data = [] # Initiating the empty list to store each image as numpy array\n", 230 | " for identifier in ids: # Looping into the desired folder\n", 231 | " fname = dir_data + identifier.astype(str) + '.tif' # Generating the file name\n", 232 | " image = mpl.image.imread(fname) # Reading image as numpy array using matplotlib\n", 233 | " data.append(image) # Appending this array into the empty list and repeat the above cycle\n", 234 | " data = np.array(data) # Now, convert the data list into data array\n", 235 | " labels = labels_pd.label.values # Extract labels from the csv file\n", 236 | " return data, labels # Return the array of data and respective labels" 237 | ], 238 | "metadata": { 239 | "id": "9YrdmgifoLoh" 240 | }, 241 | "execution_count": null, 242 | "outputs": [] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "metadata": { 248 | "id": "OszJkbgrH1SV" 249 | }, 250 | "outputs": [], 251 | "source": [ 252 | "# load train data - time consuming code cell\n", 253 | "X, y = load_data(DIR_TRAIN_IMAGES, DIR_TRAIN_LABELS)" 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "source": [ 259 | "# display the images with and without solar panels\n", 260 | "plt.figure(figsize = (13,8)) # Adjust the figure size\n", 261 | "for i in range(6): # For first 6 images in the data\n", 262 | " plt.subplot(2, 3, i+1) # Create subplots\n", 263 | " plt.imshow(X[i]) # Show the respective image in respective postion\n", 264 | " 
if y[i] == 0: # If label is 0\n", 265 | "  title = 'No Solar Panels in this image' # Set this as the title\n", 266 | " else: # Else label is 1\n", 267 | "  title = 'Solar Panels in this image' # Set this as the title\n", 268 | " plt.title(title, color = 'r', weight = 'bold') # Add a title to each image in the subplot\n", 269 | "plt.tight_layout() # Automatically adjust the spacing between images in the subplot grid\n", 270 | "plt.show() # Display the figure" 271 | ], 272 | "metadata": { 273 | "id": "ZBgNCx7hxbJK" 274 | }, 275 | "execution_count": null, 276 | "outputs": [] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "metadata": { 282 | "id": "uUGT3ZPBH5jd" 283 | }, 284 | "outputs": [], 285 | "source": [ 286 | "# print data shape\n", 287 | "print('X shape:\\n', X.shape)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "source": [ 293 | "- 1500 total images in the training data\n", 294 | "- Each image is of shape (101 x 101 x 3)" 295 | ], 296 | "metadata": { 297 | "id": "KRu_Ivtf2UlF" 298 | } 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": null, 303 | "metadata": { 304 | "id": "A5AYGyrHH7wF" 305 | }, 306 | "outputs": [], 307 | "source": [ 308 | "# check the class distribution (number of samples per label)\n", 309 | "print('Distribution of y', np.bincount(y))" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "source": [ 315 | "- Out of 1500 images:\n", 316 | "  - 995 images are without any solar panels\n", 317 | "  - 505 images are with solar panels" 318 | ], 319 | "metadata": { 320 | "id": "X4H5f-nr2k3b" 321 | } 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": { 327 | "id": "3Pn9Dq3dH829" 328 | }, 329 | "outputs": [], 330 | "source": [ 331 | "# scale pixel values between 0 and 1\n", 332 | "X = X / 255.0" 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "source": [ 338 | "#Building the CNN Model\n", 339 | "\n", 340 | "
A CNN is a type of neural network designed for data with a grid-like topology, such as images. The effectiveness of CNNs in computer vision applications including image classification, image clustering, and object identification is well recognised. Convolutional neural networks (CNNs) use convolution instead of general matrix multiplication in at least one of their layers. Like other neural networks, they are structured as a series of layers. In the layers of a CNN, neurons are arranged in three dimensions: width, height, and depth. Although there are many different CNN architectures, CNNs are generally a strong choice for image recognition because they process each pixel in relation to its surroundings.\n", 341 | "\n", 342 | "