├── synthetic_air_quality_data.xlsx
├── README.md
└── File1


/synthetic_air_quality_data.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dogiye12/Air-Quality-Prediction-Using-LSTM-Networks/HEAD/synthetic_air_quality_data.xlsx


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
📊 **Air Quality Prediction Using LSTM Networks**

This project demonstrates how to use Long Short-Term Memory (LSTM) networks to predict air quality levels, specifically PM2.5, from synthetic time-series data. It covers the full pipeline: generating synthetic data, training an LSTM model, and visualizing the predictions.
📁 Project Structure

.
├── synthetic_air_quality_data.xlsx   # Synthetic dataset (PM2.5 levels)
├── air_quality_lstm.py               # Python script with LSTM model
└── README.md                         # Project description

🔧 **Requirements**

To run this project, make sure the following Python packages are installed:

pip install numpy pandas matplotlib scikit-learn tensorflow openpyxl

🚀 How to Run

Generate or load data: The script generates synthetic PM2.5 data with a seasonal component and random noise.

Train LSTM model: The model learns patterns in the PM2.5 data using a sliding window. Each input is a sequence of 20 consecutive values, and the target is the value that follows.

Make predictions: The trained model predicts the next value for each window in the test set (the last 20% of the sequences), and the predictions are compared with the actual values.

Plot results: A comparison plot shows actual vs. predicted PM2.5 values.

📊 Output

Excel File: synthetic_air_quality_data.xlsx

Graph: Visual representation of predicted vs. actual PM2.5.

Model Performance: The model is trained with Mean Squared Error (MSE) as the loss.

📌 Notes

The data is synthetic.
You can swap in real air quality datasets (e.g., from OpenAQ or government environmental agencies).

The model uses only one feature (PM2.5); you can extend it to use multiple pollutants or meteorological features.

👨‍💻 Author

Amos Meremu Dogiye
GitHub: https://github.com/Dogiye12
LinkedIn: https://www.linkedin.com/in/meremu-amos-993333314/

If you use or adapt this project, consider citing or linking to this GitHub project.

--------------------------------------------------------------------------------
/File1:
--------------------------------------------------------------------------------
# Air Quality Prediction using LSTM with Synthetic Data

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# 1. Generate synthetic air quality data (e.g., PM2.5)
np.random.seed(42)
time_steps = 300
t = np.arange(0, time_steps)
# Synthetic PM2.5 data: baseline of 50, sinusoidal seasonality (period of 50 steps,
# amplitude of 10), and Gaussian noise (standard deviation of 2)
pm25 = 50 + 10 * np.sin(2 * np.pi * t / 50) + np.random.normal(0, 2, time_steps)

# Create DataFrame
df = pd.DataFrame({'PM2.5': pm25})

# 2. Normalize data to the [0, 1] range
# Note: the scaler is fit on the full series, so test-set values influence the scaling.
scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df)

# 3. Create sequences for LSTM
def create_sequences(data, seq_length):
    """Build sliding windows: each X holds seq_length steps, y is the step that follows."""
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

seq_length = 20
X, y = create_sequences(df_scaled, seq_length)

# Reshape input to the shape the LSTM expects: [samples, time_steps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

# 4. Split into train and test (chronological 80/20 split, no shuffling)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# 5. Define LSTM model: one LSTM layer with 50 units, then a single-output Dense layer
model = Sequential([
    LSTM(50, activation='relu', input_shape=(seq_length, 1)),
    Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.summary()

# 6. Train the model
# Note: the test set is also used as validation data here.
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_test, y_test), verbose=1)

# 7. Predict and inverse scale back to the original PM2.5 units
y_pred = model.predict(X_test)
y_pred_inv = scaler.inverse_transform(y_pred)
y_test_inv = scaler.inverse_transform(y_test)

# 8. Plot the results
plt.figure(figsize=(10, 5))
plt.plot(y_test_inv, label='Actual PM2.5')
plt.plot(y_pred_inv, label='Predicted PM2.5')
plt.title("Air Quality Prediction (PM2.5)")
plt.xlabel("Time Steps")
plt.ylabel("PM2.5")
plt.legend()
plt.show()

--------------------------------------------------------------------------------
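
The README names MSE as the loss, but the script ends at the plot and does not report an error figure for the test set. Below is a minimal sketch of one way to do that, using only NumPy. The helper `regression_errors` is hypothetical and not part of the original script, and the two small arrays are made-up stand-ins for `y_test_inv` and `y_pred_inv`, so the numbers say nothing about the model's actual performance.

```python
import numpy as np


def regression_errors(y_true, y_pred):
    """Return MSE, RMSE, and MAE between two arrays holding the same number of values."""
    # Flatten so that (n, 1) column vectors and plain (n,) vectors are both accepted.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must hold the same number of values")

    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # same units as PM2.5
        "mae": float(np.mean(np.abs(residuals))),
    }


# Illustrative stand-ins shaped like y_test_inv and y_pred_inv: (n_samples, 1).
actual = np.array([[50.0], [52.0], [48.0], [55.0]])
predicted = np.array([[51.0], [50.0], [48.0], [52.0]])

errors = regression_errors(actual, predicted)
print(errors)
```

In the script itself, the equivalent call would be `regression_errors(y_test_inv, y_pred_inv)` after step 7. Computing the errors on the inverse-scaled arrays keeps RMSE and MAE in the original PM2.5 units rather than in the scaler's [0, 1] range.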