├── result
├── code
└── README.md


/result:
--------------------------------------------------------------------------------
| temperature | humidity | wind | rain | fire_risk |
| ----------- | -------- | ---- | ---- | --------- |
| 35          | 30       | 20   | 0.0  | 1         |
| 22          | 80       | 5    | 2.5  | 0         |


--------------------------------------------------------------------------------
/code:
--------------------------------------------------------------------------------
# forest_fire_risk_prediction.py

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset (replace with your actual dataset)
# Expected columns: 'temperature', 'humidity', 'wind', 'rain', 'fire_risk' (target: 0 = Low, 1 = High)
df = pd.read_csv('forest_fire_data.csv')

# Preview the data
print(df.head())

# Check for missing values
print("Missing values:\n", df.isnull().sum())

# Drop rows with missing values
df = df.dropna()

# Feature selection
features = ['temperature', 'humidity', 'wind', 'rain']
target = 'fire_risk'

X = df[features]
y = df[target]

# Split into train/test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build and train the model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

# Confusion matrix (rows = actual class, columns = predicted class)
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='YlOrBr')
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

# Feature importance plot
importances = model.feature_importances_
feat_importance = pd.Series(importances, index=features).sort_values(ascending=False)
feat_importance.plot(kind='bar', title='Feature Importances')
plt.ylabel('Importance Score')
plt.show()

# Predict new sample
new_data = pd.DataFrame({
    'temperature': [32],
    'humidity': [40],
    'wind': [15],
    'rain': [0]
})
prediction = model.predict(new_data)
print("Fire Risk Prediction for new sample:", "High" if prediction[0] == 1 else "Low")


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# 🔥 Forest Fire Risk Prediction

This project trains a machine learning model to predict forest fire risk from four environmental parameters: temperature, humidity, wind speed, and rainfall.

## 📊 Dataset

The model uses a dataset with the following columns:

- `temperature`: Ambient temperature in °C
- `humidity`: Relative humidity in %
- `wind`: Wind speed in km/h
- `rain`: Rainfall in mm
- `fire_risk`: Binary label (0 = Low Risk, 1 = High Risk)

You can generate a synthetic dataset with `generate_forest_fire_data.py` or supply your own CSV with the same columns.
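
If you bring your own dataset, it helps to check it against the column list above before training. The following is a minimal sketch, not part of the project: `validate_dataset` and `REQUIRED_COLUMNS` are hypothetical names, and the label check assumes `fire_risk` is stored as 0 or 1.

```python
import pandas as pd

# Columns the training script reads, as listed in the Dataset section.
REQUIRED_COLUMNS = ["temperature", "humidity", "wind", "rain", "fire_risk"]


def validate_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a cleaned copy or raise ValueError."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Keep only the expected columns and drop incomplete rows,
    # mirroring the dropna() step in the training script.
    clean = df[REQUIRED_COLUMNS].dropna()

    # The target must be binary: 0 = Low Risk, 1 = High Risk.
    unexpected = set(clean["fire_risk"].unique()) - {0, 1}
    if unexpected:
        raise ValueError(f"fire_risk must be 0 or 1, found: {unexpected}")
    return clean


# Tiny made-up sample in the expected layout, with one incomplete row.
sample = pd.DataFrame({
    "temperature": [35, 22, 28],
    "humidity": [30, 80, None],
    "wind": [20, 5, 12],
    "rain": [0.0, 2.5, 0.3],
    "fire_risk": [1, 0, 0],
})
clean_sample = validate_dataset(sample)
print(len(clean_sample))  # the row with missing humidity is dropped
```

Calling `validate_dataset(pd.read_csv("forest_fire_data.csv"))` before training would surface a misnamed column early instead of failing inside the model code.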

## 🧠 Model

The model is a scikit-learn `RandomForestClassifier` with 100 trees (`n_estimators=100`, `random_state=42`), trained on 80% of the rows and evaluated on the remaining 20%.
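
The training script uses a plain random split and default class weights. When high-risk rows are rare, a stratified split and balanced class weights are common adjustments. The sketch below is a variant under those assumptions, not the project's configuration: the data is made up inline, the two-condition label is illustrative only, and `stratify=y` and `class_weight="balanced"` do not appear in the script.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Made-up features in the same order as the project's columns.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(30, 5, n),     # temperature (°C)
    rng.normal(50, 15, n),    # humidity (%)
    rng.normal(10, 3, n),     # wind (km/h)
    rng.exponential(1.0, n),  # rain (mm)
])
# Illustrative imbalanced label (hot and dry), not the project's rule.
y = ((X[:, 0] > 33) & (X[:, 1] < 45)).astype(int)

# stratify=y keeps the share of high-risk rows equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# class_weight="balanced" reweights classes inversely to their frequency.
model = RandomForestClassifier(
    n_estimators=100, random_state=42, class_weight="balanced"
)
model.fit(X_train, y_train)

# Probability of the high-risk class for each test row.
proba_high = model.predict_proba(X_test)[:, 1]
print("High-risk share, train vs. test:", y_train.mean(), y_test.mean())
```

Using `predict_proba` instead of `predict` exposes a risk score, which lets you pick a decision threshold other than 0.5 if missed fires are costlier than false alarms.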

## 📁 File Structure

```
Forest-Fire-Risk-Prediction/
│
├── forest_fire_risk_prediction.py   # Main ML training and prediction script
├── generate_forest_fire_data.py     # Generates synthetic dataset
├── forest_fire_data.csv             # Dataset used for model training
└── README.md                        # This file
```

## 🚀 Usage

### 1. Install Dependencies

```bash
pip install pandas numpy scikit-learn matplotlib seaborn
```

### 2. Generate Synthetic Dataset

```bash
python generate_forest_fire_data.py
```

### 3. Run the Model

```bash
python forest_fire_risk_prediction.py
```

## 📈 Output

- Classification accuracy and metrics
- Confusion matrix heatmap
- Feature importance bar chart
- Fire risk prediction for a new input sample
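
Accuracy alone can look strong when one class dominates, so it is worth reading precision and recall for the high-risk class off the confusion matrix. A minimal sketch of that arithmetic, assuming scikit-learn's layout (rows are actual classes, columns are predicted classes, class order `[0, 1]`); `high_risk_metrics` is a hypothetical helper and the counts are made up for illustration:

```python
import numpy as np


def high_risk_metrics(cm):
    """Derive accuracy, precision and recall for class 1 from a 2x2 matrix."""
    cm = np.asarray(cm)
    # For a binary problem, ravel() yields TN, FP, FN, TP in that order.
    tn, fp, fn, tp = cm.ravel()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / cm.sum()
    return {
        "accuracy": float(accuracy),
        "precision": float(precision),
        "recall": float(recall),
    }


# Made-up counts: 185 actual low-risk rows, 15 actual high-risk rows.
example_cm = [[180, 5],
              [6, 9]]
metrics = high_risk_metrics(example_cm)
print(metrics)
```

With these made-up counts, accuracy is 0.945 while recall for the high-risk class is only 0.6: six of the fifteen high-risk rows are missed even though the headline number looks good.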

---

## 🧪 `generate_forest_fire_data.py`

```python
# generate_forest_fire_data.py

import pandas as pd
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic data
n_samples = 1000
temperature = np.random.normal(loc=30, scale=5, size=n_samples)  # °C
# Clip to physically valid ranges: a normal draw can fall outside them
humidity = np.random.normal(loc=50, scale=15, size=n_samples).clip(0, 100)  # %
wind = np.random.normal(loc=10, scale=3, size=n_samples).clip(min=0)        # km/h
rain = np.random.exponential(scale=1.0, size=n_samples)                     # mm

# Label a row high risk (1) only when all four conditions hold:
# high temperature, low humidity, little rain, and strong wind
fire_risk = ((temperature > 33) & (humidity < 45) & (rain < 1.0) & (wind > 12)).astype(int)

# Create DataFrame
df = pd.DataFrame({
    'temperature': temperature.round(2),
    'humidity': humidity.round(2),
    'wind': wind.round(2),
    'rain': rain.round(2),
    'fire_risk': fire_risk
})

# Save to CSV
df.to_csv('forest_fire_data.csv', index=False)
print("Synthetic dataset saved as 'forest_fire_data.csv'")
```
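
Because the labeling rule requires all four conditions at once, high-risk rows are likely to be a small minority of the generated data. A quick way to check is sketched here by re-drawing the features with the same seed and distributions rather than reading the CSV; the variable names `n_high`, `high_fraction`, and `conditions` are illustrative.

```python
import numpy as np

# Same seed, distributions, and draw order as the generator script.
np.random.seed(42)
n_samples = 1000
temperature = np.random.normal(loc=30, scale=5, size=n_samples)
humidity = np.random.normal(loc=50, scale=15, size=n_samples)
wind = np.random.normal(loc=10, scale=3, size=n_samples)
rain = np.random.exponential(scale=1.0, size=n_samples)

# Individual conditions of the labeling rule.
conditions = {
    "temperature > 33": temperature > 33,
    "humidity < 45": humidity < 45,
    "rain < 1.0": rain < 1.0,
    "wind > 12": wind > 12,
}

# A row is high risk only if every condition holds.
fire_risk = np.logical_and.reduce(list(conditions.values())).astype(int)

n_high = int(fire_risk.sum())
high_fraction = n_high / n_samples
print(f"High-risk rows: {n_high} of {n_samples} ({high_fraction:.1%})")

# Share of rows passing each condition alone, to see which ones are strict.
for name, mask in conditions.items():
    print(f"{name}: {mask.mean():.1%}")
```

If the fraction turns out small, a 20% test split holds only a handful of high-risk rows, so per-class metrics from a single run will be noisy.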

## Author

- Name: Otutu Anslem
- GitHub: https://github.com/Otutu11/
- LinkedIn: https://www.linkedin.com/in/otutu-anslem-53a687359/


--------------------------------------------------------------------------------