├── result
├── code
└── README.md


/result:
--------------------------------------------------------------------------------
| temperature | humidity | wind | rain | fire\_risk |
| ----------- | -------- | ---- | ---- | ---------- |
| 35          | 30       | 20   | 0.0  | 1          |
| 22          | 80       | 5    | 2.5  | 0          |

--------------------------------------------------------------------------------
/code:
--------------------------------------------------------------------------------
# forest_fire_risk_prediction.py

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset (replace with your actual dataset)
# Example columns: 'temperature', 'humidity', 'wind', 'rain', 'fire_risk' (target: 0 = Low, 1 = High)
df = pd.read_csv('forest_fire_data.csv')

# Preview the data
print(df.head())

# Check for missing values
print("Missing values:\n", df.isnull().sum())

# Drop rows with missing values
df = df.dropna()

# Feature selection
features = ['temperature', 'humidity', 'wind', 'rain']
target = 'fire_risk'

X = df[features]
y = df[target]

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build and train the model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='YlOrBr')
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

# Feature importance plot
importances = model.feature_importances_
feat_importance = pd.Series(importances, index=features).sort_values(ascending=False)
feat_importance.plot(kind='bar', title='Feature Importances')
plt.ylabel('Importance Score')
plt.show()

# Predict new sample
new_data = pd.DataFrame({
    'temperature': [32],
    'humidity': [40],
    'wind': [15],
    'rain': [0]
})
prediction = model.predict(new_data)
print("Fire Risk Prediction for new sample:", "High" if prediction[0] == 1 else "Low")

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# 🔥 Forest Fire Risk Prediction

This project trains a machine learning model that predicts forest fire risk from four environmental parameters: temperature, humidity, wind speed, and rainfall.

## 📊 Dataset

The model uses a dataset with the following columns:

- `temperature`: Ambient temperature in °C
- `humidity`: Relative humidity in %
- `wind`: Wind speed in km/h
- `rain`: Rainfall in mm
- `fire_risk`: Binary label (0 = Low Risk, 1 = High Risk)

You can use the provided synthetic dataset generator or your own dataset.

## 🧠 Model

The model is a **Random Forest Classifier** trained on the four feature columns, with `fire_risk` as the target.

## 📁 File Structure

```
Forest-Fire-Risk-Prediction/
│
├── forest_fire_risk_prediction.py   # Main ML training and prediction script
├── generate_forest_fire_data.py     # Generates synthetic dataset
├── forest_fire_data.csv             # Dataset used for model training
└── README.md                        # This file
```

## 🚀 Usage

### 1. Install Dependencies

```bash
pip install pandas numpy scikit-learn matplotlib seaborn
```

### 2. Generate Synthetic Dataset

```bash
python generate_forest_fire_data.py
```

### 3. Run the Model

```bash
python forest_fire_risk_prediction.py
```

## 📈 Output

- Classification accuracy and metrics
- Confusion matrix heatmap
- Feature importance bar chart
- Fire risk prediction for a new input sample

---

### 🧪 `generate_forest_fire_data.py`

```python
# generate_forest_fire_data.py

import pandas as pd
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic data
n_samples = 1000
temperature = np.random.normal(loc=30, scale=5, size=n_samples)  # °C
humidity = np.random.normal(loc=50, scale=15, size=n_samples)    # %
wind = np.random.normal(loc=10, scale=3, size=n_samples)         # km/h
rain = np.random.exponential(scale=1.0, size=n_samples)          # mm

# Define fire risk: labeled high (1) only when all four conditions hold at once
# (temperature high, humidity low, rain low, wind high); otherwise low (0)
fire_risk = ((temperature > 33) & (humidity < 45) & (rain < 1.0) & (wind > 12)).astype(int)

# Create DataFrame
df = pd.DataFrame({
    'temperature': temperature.round(2),
    'humidity': humidity.round(2),
    'wind': wind.round(2),
    'rain': rain.round(2),
    'fire_risk': fire_risk
})

# Save to CSV
df.to_csv('forest_fire_data.csv', index=False)
print("Synthetic dataset saved as 'forest_fire_data.csv'")
```

Author Name: Otutu Anslem

GitHub: https://github.com/Otutu11/

LinkedIn: https://www.linkedin.com/in/otutu-anslem-53a687359/

--------------------------------------------------------------------------------
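
A note on the synthetic labels: the rule in `generate_forest_fire_data.py` marks a sample as high risk only when all four threshold conditions hold together. Under the generator's distributions that works out to roughly 1.6% of samples in expectation, so the classes are heavily imbalanced and a plain random split can leave very few high-risk rows in the test set. The sketch below replays the generator's draws (same seed and call order, features left unrounded) and shows one possible way to keep the class ratio equal across the split. The `stratify=` argument is not used in the original training script; it is an assumption added here for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Replay the generator's draws: same seed, same order of calls.
np.random.seed(42)
n_samples = 1000
temperature = np.random.normal(loc=30, scale=5, size=n_samples)  # °C
humidity = np.random.normal(loc=50, scale=15, size=n_samples)    # %
wind = np.random.normal(loc=10, scale=3, size=n_samples)         # km/h
rain = np.random.exponential(scale=1.0, size=n_samples)          # mm

# Same labeling rule as the generator: all four conditions must hold.
fire_risk = ((temperature > 33) & (humidity < 45) & (rain < 1.0) & (wind > 12)).astype(int)

# Share of high-risk samples in the full dataset.
positive_rate = fire_risk.mean()
print(f"High-risk share overall: {positive_rate:.3f}")

# Stratified split (an addition, not part of the original script):
# keeps the high-risk share approximately equal in train and test.
X = np.column_stack([temperature, humidity, wind, rain])
X_train, X_test, y_train, y_test = train_test_split(
    X, fire_risk, test_size=0.2, random_state=42, stratify=fire_risk
)
print(f"High-risk share in train: {y_train.mean():.3f}")
print(f"High-risk share in test:  {y_test.mean():.3f}")
```

With labels this rare, accuracy alone says little, since predicting "Low" for every sample already scores very high. The per-class precision and recall in the classification report are more informative, and passing `class_weight="balanced"` to `RandomForestClassifier` is one option worth trying.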