├── air_quality_climate_data.xlsx
├── README.md
└── First code


/air_quality_climate_data.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Geraldine-Winston/Air-Quality-Prediction-Under-Changing-Climate-using-deep-ensemble-models./HEAD/air_quality_climate_data.xlsx


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Air Quality Prediction Under Changing Climate Using Deep Ensemble Models
 2 | 
 3 | ## Overview
 4 | Air quality is a growing global concern, and climate change is making it harder to manage: rising temperatures, shifting weather patterns, and increasing urbanization all contribute to higher pollution levels. This project predicts PM2.5 concentration, a key indicator of air quality, from climatic and pollutant measurements using a deep ensemble: several identically structured neural networks are trained independently and their predictions are averaged. Averaging is intended to reduce the effect of any single model's errors and to make predictions more robust across varying climatic conditions, giving policymakers and environmental agencies a tool for anticipating air quality trends and planning timely interventions.
 5 | 
 6 | ## Dataset
 7 | The dataset includes various environmental and climatic features:
 8 | - Temperature
 9 | - Humidity
10 | - Wind Speed
11 | - CO2
12 | - NO2
13 | - SO2
14 | - O3
15 | - PM2.5 (Target Variable)
16 | 
17 | The data is stored in `air_quality_climate_data.xlsx`.
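
A minimal sketch of loading the workbook and checking its columns. It assumes the header row uses exactly the feature names listed above; the actual column names in the file may differ. `EXPECTED_COLUMNS`, `validate_columns`, and `load_dataset` are hypothetical names used only for illustration. Note that `pd.read_excel` needs the `openpyxl` package to read `.xlsx` files.

```python
import pandas as pd

# Assumed column names, taken from the feature list above; adjust to the real header row.
EXPECTED_COLUMNS = ["Temperature", "Humidity", "Wind Speed", "CO2", "NO2", "SO2", "O3", "PM2.5"]


def validate_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected columns that are missing from df (empty list if none)."""
    return [col for col in expected if col not in df.columns]


def load_dataset(path="air_quality_climate_data.xlsx"):
    """Read the workbook and fail early if an expected column is absent."""
    df = pd.read_excel(path)
    missing = validate_columns(df)
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df
```

Failing early on a missing column gives a clearer error than the `KeyError` that would otherwise surface when the target column is dropped from the feature matrix.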
18 | 
19 | ## Requirements
20 | - Python 3.x
21 | - TensorFlow
22 | - Pandas (with openpyxl, needed to read `.xlsx` files)
23 | - NumPy
24 | - Scikit-learn
25 | - Joblib
26 | 
27 | Install the required libraries:
28 | ```bash
29 | pip install tensorflow pandas openpyxl numpy scikit-learn joblib
30 | ```
31 | 
32 | ## Usage
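33 | 1. Load the dataset from `air_quality_climate_data.xlsx`.
34 | 2. Split the data into training and test sets (80/20) and standardize the features, fitting the scaler on the training set only.
35 | 3. Train an ensemble of five deep learning models (100 epochs each, batch size 32).
36 | 4. Average the models' predictions on the test set and evaluate them with Mean Squared Error and R-squared.
37 | 5. Save the trained models and the scaler for future use.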
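
The numerical steps above (standardization, averaging, and evaluation) can be sketched with NumPy alone. This is an illustration, not the project's script: the function names are hypothetical, and `ensemble_predict` only assumes that each model exposes a `predict(X)` method, as Keras models do.

```python
import numpy as np


def fit_standardizer(X_train):
    """Compute per-feature mean and standard deviation from the training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses
    std[std == 0.0] = 1.0      # leave constant features unscaled instead of dividing by zero
    return mean, std


def standardize(X, mean, std):
    """Apply training-set statistics to any split (train or test)."""
    return (X - mean) / std


def ensemble_predict(models, X):
    """Average predictions of models whose predict(X) returns shape (n,) or (n, 1)."""
    preds = np.stack([np.asarray(m.predict(X)).reshape(-1) for m in models], axis=0)
    return preds.mean(axis=0)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Reusing the training-set mean and standard deviation on the test set keeps information about the test data out of preprocessing.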
38 | 
39 | ## Model Architecture
40 | Each model in the ensemble has the following architecture:
41 | - Dense layer (128 units, ReLU activation)
42 | - Dropout (30%)
43 | - Dense layer (64 units, ReLU activation)
44 | - Dropout (30%)
45 | - Dense layer (32 units, ReLU activation)
46 | - Output Dense layer (1 unit, no activation)
47 | 
48 | - Optimizer: Adam with learning rate 0.001
49 | - Loss: Mean Squared Error
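
As a rough size check, the number of trainable parameters per model follows directly from the layer widths: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and dropout layers add none. The sketch below assumes seven input features (the seven non-target columns listed under Dataset); `dense_params` and `count_params` are hypothetical helper names.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(n_features, hidden=(128, 64, 32), n_outputs=1):
    """Total trainable parameters of the stacked dense layers."""
    sizes = [n_features, *hidden, n_outputs]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


# Assuming 7 input features: 1024 + 8256 + 2080 + 33
print(count_params(7))  # 11393
```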
50 | 
51 | ## Output
52 | - Trained model files: `air_quality_model_0.h5` through `air_quality_model_4.h5` (one per ensemble member)
53 | - Scaler file: `scaler.pkl`
54 | - Evaluation metrics (Mean Squared Error and R-squared) printed to the console
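
One way the saved files might be reloaded for inference is sketched below. It assumes the file names listed above and five ensemble members; `model_paths`, `load_ensemble`, and `predict_pm25` are hypothetical helpers, not part of the repository. New inputs must have the same feature columns, in the same order, as the training data.

```python
import numpy as np


def model_paths(n_models=5, pattern="air_quality_model_{i}.h5"):
    """File names of the saved ensemble members, in training order."""
    return [pattern.format(i=i) for i in range(n_models)]


def load_ensemble(n_models=5, scaler_path="scaler.pkl"):
    """Load the fitted scaler and every saved model from the working directory."""
    # Imported lazily so the other helpers can be used without TensorFlow installed.
    import joblib
    from tensorflow.keras.models import load_model

    scaler = joblib.load(scaler_path)
    # compile=False: only inference is needed, so the optimizer and loss are not restored.
    models = [load_model(path, compile=False) for path in model_paths(n_models)]
    return models, scaler


def predict_pm25(models, scaler, X_raw):
    """Scale raw features with the saved scaler, then average the members' predictions."""
    X_scaled = scaler.transform(X_raw)
    preds = np.stack([np.asarray(m.predict(X_scaled)).reshape(-1) for m in models])
    return preds.mean(axis=0)
```

A typical call would be `models, scaler = load_ensemble()` followed by `predict_pm25(models, scaler, new_rows)`, where `new_rows` is a placeholder for unscaled feature values.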
55 | 
56 | ## Author
57 | **Ayebawanaemi Geraldine Winston**
58 | 


--------------------------------------------------------------------------------
/First code:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import pandas as pd
 3 | from sklearn.model_selection import train_test_split
 4 | from sklearn.preprocessing import StandardScaler
 5 | from sklearn.metrics import mean_squared_error, r2_score
 6 | import tensorflow as tf
 7 | from tensorflow.keras.models import Sequential
 8 | from tensorflow.keras.layers import Dense, Dropout
 9 | from tensorflow.keras.optimizers import Adam
10 | 
11 | # Load dataset from the Excel workbook shipped with the repository (requires openpyxl)
12 | df = pd.read_excel('air_quality_climate_data.xlsx')
13 | 
14 | # Assume the target variable is 'PM2.5' and features are all other columns
15 | X = df.drop(columns=['PM2.5'])
16 | y = df['PM2.5']
17 | 
18 | # Split into train and test sets
19 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
20 | 
21 | # Standardize the features
22 | scaler = StandardScaler()
23 | X_train_scaled = scaler.fit_transform(X_train)
24 | X_test_scaled = scaler.transform(X_test)
25 | 
26 | # Define a function to create a base model
27 | def create_model():
28 |     model = Sequential()
29 |     model.add(Dense(128, activation='relu', input_shape=(X_train_scaled.shape[1],)))
30 |     model.add(Dropout(0.3))
31 |     model.add(Dense(64, activation='relu'))
32 |     model.add(Dropout(0.3))
33 |     model.add(Dense(32, activation='relu'))
34 |     model.add(Dense(1))
35 |     model.compile(optimizer=Adam(learning_rate=0.001), loss='mse')
36 |     return model
37 | 
38 | # Create an ensemble of models
39 | n_models = 5
40 | models = []
41 | 
42 | for _ in range(n_models):
43 |     model = create_model()
44 |     model.fit(X_train_scaled, y_train, epochs=100, batch_size=32, verbose=0)
45 |     models.append(model)
46 | 
47 | # Make predictions and average them
48 | def ensemble_predict(models, X):
49 |     predictions = np.column_stack([model.predict(X, verbose=0).flatten() for model in models])
50 |     return np.mean(predictions, axis=1)
51 | 
52 | # Predict on the test set
53 | y_pred = ensemble_predict(models, X_test_scaled)
54 | 
55 | # Evaluate
56 | mse = mean_squared_error(y_test, y_pred)
57 | r2 = r2_score(y_test, y_pred)
58 | 
59 | print(f'Mean Squared Error: {mse}')
60 | print(f'R-squared: {r2}')
61 | 
62 | # Save the models
63 | for i, model in enumerate(models):
64 |     model.save(f'air_quality_model_{i}.h5')
65 | 
66 | # Save the scaler
67 | import joblib
68 | joblib.dump(scaler, 'scaler.pkl')
69 | 


--------------------------------------------------------------------------------