├── synthetic_air_quality_data.xlsx
├── README.md
└── File1


/synthetic_air_quality_data.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dogiye12/Air-Quality-Prediction-Using-LSTM-Networks/HEAD/synthetic_air_quality_data.xlsx


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | 📊 **Air Quality Prediction Using LSTM Networks**
 2 | 
 3 | This project demonstrates how to use Long Short-Term Memory (LSTM) networks to predict air quality levels, specifically PM2.5, from synthetic time-series data. It covers the full pipeline: synthetic data generation, LSTM training, and visualization of predictions.
 4 | 📁 Project Structure
 5 | 
 6 | .
 7 | ├── synthetic_air_quality_data.xlsx  # Synthetic dataset (PM2.5 levels)
 8 | ├── air_quality_lstm.py              # Python script with LSTM model (stored as File1 in this repository)
 9 | └── README.md                        # Project description
10 | 
11 | 🔧 **Requirements**
12 | 
13 | To run this project, make sure the following Python packages are installed:
14 | 
15 | pip install numpy pandas matplotlib scikit-learn tensorflow openpyxl
16 | 
17 | 🚀 How to Run
18 | 
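19 |     Generate data: The script generates 300 time steps of synthetic PM2.5 data: a baseline of 50, a sinusoidal seasonal component (period of 50 steps), and Gaussian noise.
20 | 
21 |     Train LSTM model: The model learns patterns in the PM2.5 data using a sliding window. Each input is a sequence of 20 consecutive values, and the target is the value that follows.
22 | 
23 |     Make predictions: The model predicts the held-out test set (the final 20% of sequences), and the predictions are compared to the actual values.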
24 | 
25 |     Plot results: A comparison plot shows actual vs. predicted PM2.5 values.
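The sliding-window and train/test steps above can be sketched with NumPy alone. This is a minimal illustration, not the project's exact code: the helper name `make_windows` and the toy series are assumptions made here, while the 80/20 chronological split mirrors the one used in the script.

```python
import numpy as np

def make_windows(series, seq_length):
    """Return (X, y): X[i] holds seq_length consecutive values, y[i] is the value that follows."""
    series = np.asarray(series, dtype=float)
    n = len(series) - seq_length
    if n <= 0:
        raise ValueError("series must be longer than seq_length")
    X = np.stack([series[i:i + seq_length] for i in range(n)])
    y = series[seq_length:]
    # LSTM layers expect input shaped [samples, time_steps, features]
    return X[..., np.newaxis], y

# Toy series (hypothetical): 0, 1, ..., 9
series = np.arange(10, dtype=float)
X, y = make_windows(series, seq_length=3)

# Chronological split: the first 80% of windows train, the rest test
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(X.shape, y.shape)  # (7, 3, 1) (7,)
```

The split is not shuffled, so every test window lies after the training windows in time, which is usually what a forecasting evaluation needs.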
26 | 
27 | 📊 Output
28 | 
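29 |     Excel File: synthetic_air_quality_data.xlsx (included in the repository; the script generates its data in memory and does not write this file).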
30 | 
31 |     Graph: Visual representation of predicted vs. actual PM2.5.
32 | 
33 |     Model Performance: Mean Squared Error (MSE) is the training loss and is reported for each epoch during training.
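Because the model is trained on min-max scaled values, the reported MSE is in scaled units. The sketch below shows one way to express the error in original PM2.5 units instead. The helper names and the sample numbers are hypothetical. The inverse formula is the standard one for min-max scaling to the range [0, 1].

```python
import numpy as np

def inverse_min_max(scaled, data_min, data_max):
    """Undo min-max scaling to [0, 1]: x = scaled * (max - min) + min."""
    return np.asarray(scaled, dtype=float) * (data_max - data_min) + data_min

def mse(actual, predicted):
    """Mean squared error between two equally shaped arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

# Hypothetical scaler bounds and scaled values, for illustration only
data_min, data_max = 40.0, 60.0
y_test_scaled = np.array([0.0, 0.5, 1.0])
y_pred_scaled = np.array([0.1, 0.5, 0.9])

y_test_orig = inverse_min_max(y_test_scaled, data_min, data_max)
y_pred_orig = inverse_min_max(y_pred_scaled, data_min, data_max)

mse_scaled = mse(y_test_scaled, y_pred_scaled)
mse_orig = mse(y_test_orig, y_pred_orig)   # equals mse_scaled * (data_max - data_min) ** 2
rmse_orig = mse_orig ** 0.5                # same units as PM2.5

print(f"MSE (scaled): {mse_scaled:.4f}, RMSE (original units): {rmse_orig:.3f}")
```

RMSE is often easier to interpret than MSE here because it is expressed in the same units as the PM2.5 values.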
34 | 
35 | 📌 Notes
36 | 
37 |     The data is synthetic. You can swap in real air quality datasets (e.g., from OpenAQ or government environmental agencies).
38 | 
39 |     The model uses only one feature (PM2.5); you can extend it to use multiple pollutants or meteorological features.
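As a rough sketch of the multi-feature extension mentioned above, the windowing step can keep all feature columns in the input while predicting a single target column. The function name `make_multifeature_windows`, the extra features (temperature, wind speed), and the random placeholder data are assumptions for illustration and are not part of this project.

```python
import numpy as np

def make_multifeature_windows(data, seq_length, target_col=0):
    """Build windows over a 2-D array [time, features]; the target is one column's next value."""
    data = np.asarray(data, dtype=float)
    if data.ndim != 2:
        raise ValueError("data must have shape [time_steps, features]")
    n = data.shape[0] - seq_length
    if n <= 0:
        raise ValueError("data must have more rows than seq_length")
    X = np.stack([data[i:i + seq_length] for i in range(n)])
    y = data[seq_length:, target_col]
    return X, y

# Placeholder data (hypothetical): PM2.5, temperature, wind speed
rng = np.random.default_rng(0)
n_steps = 100
features = np.column_stack([
    rng.normal(50, 5, n_steps),   # PM2.5 (target column 0)
    rng.normal(20, 3, n_steps),   # temperature
    rng.normal(3, 1, n_steps),    # wind speed
])

X, y = make_multifeature_windows(features, seq_length=20)
print(X.shape, y.shape)  # (80, 20, 3) (80,)
```

With inputs shaped like this, the LSTM's input shape would become `(seq_length, 3)` instead of `(seq_length, 1)`, and each feature would typically be scaled separately.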
40 | 
41 | 👨‍💻 Author
42 | 
43 | Amos Meremu Dogiye
44 | GitHub: https://github.com/Dogiye12
45 | LinkedIn: https://www.linkedin.com/in/meremu-amos-993333314/
46 | 
47 | If you use or adapt this project, consider citing or linking to this GitHub project.
48 | 


--------------------------------------------------------------------------------
/File1:
--------------------------------------------------------------------------------
# Air Quality Prediction using LSTM with Synthetic Data

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# 1. Generate synthetic air quality data (e.g., PM2.5)
np.random.seed(42)
time_steps = 300
t = np.arange(0, time_steps)
# Synthetic PM2.5: baseline of 50, sinusoidal seasonality (period 50), Gaussian noise (std 2)
pm25 = 50 + 10 * np.sin(2 * np.pi * t / 50) + np.random.normal(0, 2, time_steps)

# Create DataFrame
df = pd.DataFrame({'PM2.5': pm25})

# 2. Normalize data to [0, 1]
# Fit the scaler only on the rows that feed the training sequences,
# so test-set statistics do not leak into the scaling.
seq_length = 20
train_size = int((len(df) - seq_length) * 0.8)
scaler = MinMaxScaler()
scaler.fit(df.iloc[:train_size + seq_length])
df_scaled = scaler.transform(df)

# 3. Create sequences for LSTM
def create_sequences(data, seq_length):
    """Pair each window of seq_length values with the value that follows it."""
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

X, y = create_sequences(df_scaled, seq_length)

# Make the LSTM input shape explicit: [samples, time_steps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

# 4. Split into train and test (chronological, no shuffling)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# 5. Define LSTM model
model = Sequential([
    LSTM(50, activation='relu', input_shape=(seq_length, 1)),
    Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.summary()

# 6. Train the model
# The test set doubles as validation data here; it is only used to monitor the loss.
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_test, y_test), verbose=1)

# 7. Predict and inverse scale back to original PM2.5 units
y_pred = model.predict(X_test)
y_pred_inv = scaler.inverse_transform(y_pred)
y_test_inv = scaler.inverse_transform(y_test)

# 8. Plot the results
plt.figure(figsize=(10, 5))
plt.plot(y_test_inv, label='Actual PM2.5')
plt.plot(y_pred_inv, label='Predicted PM2.5')
plt.title("Air Quality Prediction (PM2.5)")
plt.xlabel("Time Steps")
plt.ylabel("PM2.5")
plt.legend()
plt.show()


--------------------------------------------------------------------------------