├── License.txt
├── airquality.gitignore
├── Predicted AQI Example.txt
├── Air Quality Prediction.py
├── README.md
└── AQP.PY.txt


/License.txt:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2024 Your Name
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction...
8 | 


--------------------------------------------------------------------------------
/airquality.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled files
 2 | __pycache__/
 3 | *.py[cod]
 4 | 
 5 | # Virtual environment
 6 | venv/
 7 | 
 8 | # Jupyter Notebook checkpoints
 9 | .ipynb_checkpoints/
10 | 
11 | # Logs and temporary files
12 | *.log
13 | *.tmp
14 | 
15 | # System files
16 | .DS_Store
17 | Thumbs.db
18 | 


--------------------------------------------------------------------------------
/Predicted AQI Example.txt:
--------------------------------------------------------------------------------
 1 | 
 2 | ---
 3 | 
 4 | ### 2. **`requirements.txt`**
 5 | - **Purpose**: Lists all Python dependencies for easy installation.
 6 | - **Content**: Output from `pip freeze` or a manually curated list of dependencies.
 7 | - Example:
 8 |   ```
 9 |   numpy==1.23.5
10 |   pandas==1.5.3
 11 |   scikit-learn==1.2.2
12 |   ```
13 | 
14 | Generate automatically:
15 | ```bash
16 | pip freeze > requirements.txt
 17 | ```
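
Pinned versions like the ones listed above can be compared against the active environment. The following is a minimal sketch using only the standard library. `parse_pins` and `check_pins` are hypothetical helper names, and the parser assumes every non-comment line has the simple `name==version` form.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (pinned, installed)} for every pin that does not match.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Illustrative input only; substitute the contents of your own requirements.txt.
example = """
numpy==1.23.5
pandas==1.5.3
"""
pins = parse_pins(example)
print(pins)
print(check_pins(pins))
```

An empty result from `check_pins` means every pinned package is installed at exactly the pinned version.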


--------------------------------------------------------------------------------
/Air Quality Prediction.py:
--------------------------------------------------------------------------------
 1 | import pandas as pd
 2 | import numpy as np
 3 | from sklearn.model_selection import train_test_split
 4 | from sklearn.ensemble import RandomForestRegressor
 5 | from sklearn.metrics import mean_absolute_error
 6 | 
 7 | # Step 1: Generate synthetic air quality data for five months
 8 | months = ["January", "February", "March", "April", "May"]
 9 | np.random.seed(42)  # For reproducibility
10 | 
11 | data = {
12 |     "Month": months,
13 |     "CO2": np.random.uniform(400, 1000, size=5).round(3),
14 |     "NO2": np.random.uniform(0.01, 0.2, size=5).round(3),
15 |     "SO2": np.random.uniform(0.01, 0.5, size=5).round(3),
16 |     "PM2.5": np.random.uniform(0.01, 0.05, size=5).round(3),
17 |     "AQI_CO": np.random.uniform(0, 150, size=5).round(2),
18 |     "AQI_NO2": np.random.uniform(0, 200, size=5).round(2),
19 |     "AQI_SO2": np.random.uniform(0, 150, size=5).round(2),
20 | }
21 | 
22 | # Convert data dictionary to a DataFrame
23 | df = pd.DataFrame(data)
24 | 
 25 | # Encode 'Month' as integers in calendar order (January=0, ..., May=4)
 26 | df['Month'] = pd.Categorical(df['Month'], categories=months).codes
27 | 
28 | # Display generated synthetic data
29 | print("Synthetic Air Quality Data:")
30 | print(df)
31 | 
32 | # Step 2: Prepare data for model training
33 | # Features (independent variables) and target variable (AQI values)
34 | X = df[["Month", "CO2", "NO2", "SO2", "PM2.5"]]
35 | y = df[["AQI_CO", "AQI_NO2", "AQI_SO2"]]
36 | 
37 | # Split data into training and test sets
38 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
39 | 
40 | # Step 3: Initialize and train the model
41 | model = RandomForestRegressor(n_estimators=100, random_state=0)
42 | model.fit(X_train, y_train)
43 | 
44 | # Step 4: Make predictions and evaluate model performance
45 | y_pred = model.predict(X_test)
46 | mae = mean_absolute_error(y_test, y_pred)
47 | print(f"\nMean Absolute Error: {mae}")
48 | 
49 | # Step 5: Predict future AQI for a new month (example: June)
50 | future_data = pd.DataFrame({
51 |     "Month": [5],  # For example, "June" could be encoded as 5
52 |     "CO2": [650],  # Replace with actual or expected CO2 level
53 |     "NO2": [0.07],
54 |     "SO2": [0.1],
55 |     "PM2.5": [0.03]
56 | })
57 | future_aqi = model.predict(future_data)
58 | print("\nPredicted AQI for June:", future_aqi)
59 | 
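
The script maps month names to integer codes and later treats June as 5. Whether the codes follow the calendar depends on how `pd.Categorical` is called: without an explicit `categories` argument it sorts the labels alphabetically. A small sketch of the difference (the variable names are illustrative):

```python
import pandas as pd

months = ["January", "February", "March", "April", "May"]

# Default behaviour: categories are sorted alphabetically before coding.
alphabetical = pd.Categorical(months)
alphabetical_codes = dict(zip(months, alphabetical.codes.tolist()))

# Explicit categories: codes follow the order of the list passed in.
chronological = pd.Categorical(months, categories=months)
chronological_codes = dict(zip(months, chronological.codes.tolist()))

print(alphabetical_codes)
# {'January': 2, 'February': 1, 'March': 3, 'April': 0, 'May': 4}
print(chronological_codes)
# {'January': 0, 'February': 1, 'March': 2, 'April': 3, 'May': 4}
```

Only the second encoding makes 5 a natural code for June.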


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Air Quality Prediction Model
  2 | 
  3 | Welcome to the **Air Quality Prediction Model** repository! This project uses machine learning to predict Air Quality Index (AQI) values from pollutant measurements, with the goal of supporting air quality monitoring.
  4 | 
  5 | ## Table of Contents
  6 | - [Introduction](#introduction)
  7 | - [Features](#features)
  8 | - [Installation](#installation)
  9 | - [Usage](#usage)
 10 | - [Dataset](#dataset)
 11 | - [Model](#model)
 12 | - [Contributing](#contributing)
 13 | - [License](#license)
 14 | - [Contact](#contact)
 15 | 
 16 | ## Introduction
 17 | Air quality is a critical factor affecting public health and the environment. This project leverages machine learning to predict air quality levels based on historical data and environmental factors such as temperature, humidity, pollutant concentrations, and more. The model can be used by researchers, policymakers, and the general public to make informed decisions about air quality management.
 18 | 
 19 | ## Features
 20 | - **Data Preprocessing**: Clean and preprocess air quality data for accurate predictions.
 21 | - **Machine Learning Models**: Implement various machine learning algorithms to predict air quality.
 22 | - **Visualization**: Generate visualizations to understand trends and patterns in air quality data.
 23 | - **Scalability**: The model can be scaled to incorporate additional features and larger datasets.
 24 | 
 25 | ## Installation
 26 | To get started with this project, follow these steps:
 27 | 
 28 | 1. **Clone the repository**:
 29 |    ```bash
 30 |    git clone https://github.com/Eteh1/Air-Quality-Prediction-Model.git
 31 |    cd Air-Quality-Prediction-Model
 32 |    ```
 33 | 
 34 | 2. **Set up a virtual environment** (optional but recommended):
 35 |    ```bash
 36 |    python -m venv venv
 37 |    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
 38 |    ```
 39 | 
 40 | 3. **Install the required dependencies**:
 41 |    ```bash
 42 |    pip install -r requirements.txt
 43 |    ```
 44 | 
 45 | ## Usage
 46 | To use the Air Quality Prediction Model, follow these steps:
 47 | 
 48 | 1. **Prepare your dataset**:
 49 |    - Ensure your dataset is in CSV format.
 50 |    - Place the dataset in the `data/` directory.
 51 | 
 52 | 2. **Run the preprocessing script**:
 53 |    ```bash
 54 |    python src/preprocess.py
 55 |    ```
 56 | 
 57 | 3. **Train the model**:
 58 |    ```bash
 59 |    python src/train.py
 60 |    ```
 61 | 
 62 | 4. **Make predictions**:
 63 |    ```bash
 64 |    python src/predict.py
 65 |    ```
 66 | 
 67 | 5. **Visualize the results**:
 68 |    ```bash
 69 |    python src/visualize.py
 70 |    ```
 71 | 
 72 | ## Dataset
 73 | The dataset used in this project contains historical air quality data, including various environmental factors. You can use your own dataset or download a sample dataset from [here](https://example.com/dataset).
 74 | 
 75 | ## Model
 76 | The model is built using popular machine learning libraries such as `scikit-learn`, `pandas`, and `numpy`. It includes the following algorithms:
 77 | - Linear Regression
 78 | - Random Forest
 79 | - Gradient Boosting
 80 | 
 81 | You can easily extend the model to include other algorithms or techniques.
 82 | 
 83 | ## Contributing
 84 | Contributions are welcome! If you'd like to contribute, please follow these steps:
 85 | 
 86 | 1. Fork the repository.
 87 | 2. Create a new branch (`git checkout -b feature/YourFeatureName`).
 88 | 3. Commit your changes (`git commit -m 'Add some feature'`).
 89 | 4. Push to the branch (`git push origin feature/YourFeatureName`).
 90 | 5. Open a pull request.
 91 | 
 92 | Please ensure your code follows the project's coding standards and includes appropriate documentation.
 93 | 
 94 | ## License
 95 | This project is licensed under the MIT License. See the [License.txt](License.txt) file for more details.
 96 | 
 97 | ## Contact
 98 | If you have any questions or suggestions, feel free to reach out:
 99 | 
100 | - **Name**: Desmond Eteh
101 | - **Email**: desmondeteh@gmail.com
102 | - **GitHub**: [Eteh1](https://github.com/Eteh1)
103 | 
104 | Thank you for visiting the Air Quality Prediction Model repository! We hope this project helps you in your efforts to monitor and improve air quality.
105 | 
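
The three algorithm families listed in the Model section can be compared on a held-out split. The sketch below is only an illustration: it uses made-up synthetic data (three anonymous features and an invented target formula), not the project's dataset, and the hyperparameters are arbitrary defaults rather than tuned values.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: an assumption for illustration only.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.uniform(0.0, 1.0, size=(n_samples, 3))
y = 50 + 80 * X[:, 0] + 30 * X[:, 1] ** 2 + rng.normal(0, 5, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the same split and record its mean absolute error.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

# Report the models from lowest to highest error.
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {mae:.2f}")
```

Evaluating every candidate on the same split keeps the comparison fair. With a real dataset, cross-validation would give a steadier estimate than a single split.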


--------------------------------------------------------------------------------
/AQP.PY.txt:
--------------------------------------------------------------------------------
  1 | # Import necessary libraries
  2 | import requests
  3 | import pandas as pd
  4 | import numpy as np
  5 | from sklearn.model_selection import train_test_split
  6 | from sklearn.preprocessing import StandardScaler
  7 | from sklearn.ensemble import RandomForestRegressor
  8 | from sklearn.metrics import mean_squared_error
  9 | 
 10 | # Step 1: Fetch Real-Time Air Quality Data
 11 | def fetch_air_quality_data(api_key, city, state, country):
 12 |     """
 13 |     Fetches real-time air quality data using the AirVisual API.
 14 |     """
 15 |     url = "http://api.airvisual.com/v2/city"  # requests URL-encodes the params; the timeout avoids hanging forever
 16 |     response = requests.get(url, params={"city": city, "state": state, "country": country, "key": api_key}, timeout=10)
 17 |     if response.status_code == 200:
 18 |         data = response.json()
 19 |         pollutants = data['data']['current']['pollution']
 20 |         return pollutants
 21 |     else:
 22 |         print(f"Failed to fetch data (HTTP {response.status_code})")
 23 |         return None
 24 | 
 25 | # Step 2: Load and Preprocess Historical Data
 26 | def load_and_preprocess_data(file_path):
 27 |     """
 28 |     Loads historical air quality data and preprocesses it for training.
 29 |     """
 30 |     # Load dataset
 31 |     data = pd.read_csv(file_path)
 32 |     
 33 |     # Features and target
 34 |     X = data[['PM2.5', 'PM10', 'NO2', 'CO', 'O3', 'SO2']]  # Features
 35 |     y = data['AQI']  # Target
 36 |     
 37 |     # Split data into training and testing sets
 38 |     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 39 |     
 40 |     # Standardize features
 41 |     scaler = StandardScaler()
 42 |     X_train = scaler.fit_transform(X_train)
 43 |     X_test = scaler.transform(X_test)
 44 |     
 45 |     return X_train, X_test, y_train, y_test, scaler
 46 | 
 47 | # Step 3: Train a Machine Learning Model
 48 | def train_model(X_train, y_train):
 49 |     """
 50 |     Trains a Random Forest Regressor model.
 51 |     """
 52 |     model = RandomForestRegressor(n_estimators=100, random_state=42)
 53 |     model.fit(X_train, y_train)
 54 |     return model
 55 | 
 56 | # Step 4: Evaluate the Model
 57 | def evaluate_model(model, X_test, y_test):
 58 |     """
 59 |     Evaluates the model using Mean Squared Error (MSE).
 60 |     """
 61 |     y_pred = model.predict(X_test)
 62 |     mse = mean_squared_error(y_test, y_pred)
 63 |     print(f"Mean Squared Error: {mse}")
 64 | 
 65 | # Step 5: Make Real-Time Predictions
 66 | def predict_real_time_aqi(model, scaler, pollutants):
 67 |     """
 68 |     Predicts AQI using real-time pollutant data.
 69 |     """
 70 |     # Prepare input data
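 71 |     values = [pollutants['pm25'], pollutants['pm10'], pollutants['no2'],
 72 |               pollutants['co'], pollutants['o3'], pollutants['so2']]
 73 |     # Reuse the training column names so the scaler sees the same feature names
 74 |     real_time_data = scaler.transform(pd.DataFrame([values], columns=scaler.feature_names_in_))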
 75 |     # Predict AQI
 76 |     predicted_aqi = model.predict(real_time_data)
 77 |     return predicted_aqi[0]
 78 | 
 79 | # Main Function
 80 | def main():
 81 |     # API key and location details
 82 |     api_key = 'your_api_key_here'  # Replace with your AirVisual API key
 83 |     city = 'Los Angeles'
 84 |     state = 'California'
 85 |     country = 'USA'
 86 |     
 87 |     # Step 1: Fetch real-time air quality data
 88 |     pollutants = fetch_air_quality_data(api_key, city, state, country)
 89 |     if pollutants:
 90 |         print("Fetched Real-Time Pollutants:", pollutants)
 91 |         
 92 |         # Step 2: Load and preprocess historical data
 93 |         file_path = 'air_quality_data.csv'  # Replace with your dataset path
 94 |         X_train, X_test, y_train, y_test, scaler = load_and_preprocess_data(file_path)
 95 |         
 96 |         # Step 3: Train the model
 97 |         model = train_model(X_train, y_train)
 98 |         
 99 |         # Step 4: Evaluate the model
100 |         evaluate_model(model, X_test, y_test)
101 |         
102 |         # Step 5: Make real-time predictions
103 |         predicted_aqi = predict_real_time_aqi(model, scaler, pollutants)
104 |         print(f"Predicted AQI: {predicted_aqi}")
105 |     else:
106 |         print("Failed to fetch real-time data. Please check your API key or network connection.")
107 | 
108 | # Run the program
109 | if __name__ == "__main__":
110 |     main()
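
`predict_real_time_aqi` indexes six keys in the `pollutants` dict directly, so a response that lacks any of them fails with a bare `KeyError`. Below is a hedged sketch of a stricter conversion step. `pollutants_to_features` and `FEATURE_KEYS` are hypothetical names, the key list simply mirrors the lookups in the script rather than any documented API schema, and the sample readings are invented.

```python
# Keys in the same order as the training columns
# ['PM2.5', 'PM10', 'NO2', 'CO', 'O3', 'SO2'] used by the script.
FEATURE_KEYS = ["pm25", "pm10", "no2", "co", "o3", "so2"]


def pollutants_to_features(pollutants, feature_keys=FEATURE_KEYS):
    """Return the readings as a list of floats in feature order.

    Raises ValueError naming every missing key, instead of failing on
    the first missing one with a bare KeyError.
    """
    missing = [key for key in feature_keys if key not in pollutants]
    if missing:
        raise ValueError(f"Missing pollutant readings: {missing}")
    return [float(pollutants[key]) for key in feature_keys]


# Invented sample readings, for illustration only.
sample = {"pm25": 12, "pm10": 20, "no2": 0.03, "co": 0.4, "o3": 0.05, "so2": 0.01}
print(pollutants_to_features(sample))

try:
    pollutants_to_features({"pm25": 12})
except ValueError as error:
    print(error)
```

The returned list can be wrapped in an outer list and passed to the scaler, in place of the hand-built list in `predict_real_time_aqi`.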


--------------------------------------------------------------------------------