├── WhatsApp Image 2025-04-11 at 3.48.46 PM.jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM.jpeg
├── WhatsApp Image 2025-04-11 at 3.48.46 PM (1).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.46 PM (2).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.46 PM (3).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.46 PM (4).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM (1).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM (2).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM (3).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM (4).jpeg
├── WhatsApp Image 2025-04-11 at 3.48.47 PM (5).jpeg
├── .github
│   └── workflows
│       └── blank.yml
├── README.md
└── Code for data analysis


/WhatsApp Image 2025-04-11 at 3.48.46 PM.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.46 PM.jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM.jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.46 PM (1).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.46 PM (1).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.46 PM (2).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.46 PM (2).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.46 PM (3).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.46 PM (3).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.46 PM (4).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.46 PM (4).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM (1).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM (1).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM (2).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM (2).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM (3).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM (3).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM (4).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM (4).jpeg
--------------------------------------------------------------------------------
/WhatsApp Image 2025-04-11 at 3.48.47 PM (5).jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Rohit140220/Python-Project-by-Rohit-Sharma/HEAD/WhatsApp Image 2025-04-11 at 3.48.47 PM (5).jpeg
--------------------------------------------------------------------------------
/.github/workflows/blank.yml:
--------------------------------------------------------------------------------
# This is a basic workflow to help you get started with Actions

name: CI

# Controls when the workflow will run
on:
  # Triggers the workflow on push or pull request events but only for the "main" branch
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v4

      # Runs a single command using the runners shell
      - name: Run a one-line script
        run: echo Hello, world!

      # Runs a set of commands using the runners shell
      - name: Run a multi-line script
        run: |
          echo Add other actions to build,
          echo test, and deploy your project.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Python-Project-by-Rohit-Sharma

A Python project that uses Python libraries for data analysis.

🌍 Global Landslide Analysis Using Python & EDA

This project presents a comprehensive Exploratory Data Analysis (EDA) of global landslide occurrences using Python. Using data from the NASA Global Landslide Catalog, the analysis uncovers temporal, spatial, and statistical patterns that contribute to a better understanding and mitigation of landslide disasters.

🔍 Project Objectives
The following analytical tasks were performed:

Data Cleaning & Pre-processing

Handled missing values, standardized formats, and extracted features such as year and location.

Landslide Frequency Analysis

Examined how landslide occurrences vary over time and identified peak years/seasons.

Impact Analysis (Fatalities & Injuries)

Evaluated the human impact by analyzing trends in fatalities and injuries over the years.

Geospatial Analysis

Visualized landslide-prone areas on global maps using latitude/longitude data to identify hotspots.

Cause & Effect Analysis

Investigated triggering factors such as rainfall, earthquakes, and construction activities and their correlation with landslide severity.

Correlation Analysis: Frequency vs. Impact

Explored the relationship between how often landslides occur and their associated human impact using scatter plots and statistical summaries.

Outlier Detection Using Z-score Method

Identified extreme cases of fatalities and injuries using Z-score-based statistical outlier detection.
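
As a rough illustration of the Z-score step, the sketch below flags rows whose value lies more than a chosen number of standard deviations from the column mean. It is not the project's script: the helper name `zscore_outliers` and the toy data are invented for this example, and the threshold of 3 is simply a common convention. The Z-score is computed by hand with the population standard deviation (`ddof=0`), which is the default that `scipy.stats.zscore` also uses.

```python
import pandas as pd


def zscore_outliers(frame, columns, threshold=3.0):
    """Return rows where any listed column is more than `threshold`
    standard deviations away from that column's mean."""
    values = frame[columns].astype(float)
    # Population standard deviation (ddof=0), matching scipy.stats.zscore's default.
    z = (values - values.mean()) / values.std(ddof=0)
    # A constant column yields NaN Z-scores, which never exceed the threshold.
    mask = (z.abs() > threshold).any(axis=1)
    return frame[mask]


# Toy data, invented for illustration: 29 small events and one extreme event.
toy = pd.DataFrame({
    "fatality_count": [0] * 29 + [500],
    "injury_count": [1] * 30,
})

extreme = zscore_outliers(toy, ["fatality_count", "injury_count"])
print(extreme)
```

Only the single extreme row is returned here. On heavily skewed data such as casualty counts, a few very large events inflate the standard deviation, so the number of flagged rows depends strongly on the threshold chosen.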

📈 Visual Outputs
Time-series line plots with average trendlines

Heatmaps and scatter plots for geospatial analysis

Bar plots for causal factor impact

Gradient line plots showing frequency over time

Boxplots and Z-score outlier visuals

🛠️ Tools & Libraries
Python

pandas, numpy – data manipulation

matplotlib, seaborn – visual storytelling

scipy – statistical computation

💡 Key Insights
Landslides peak during monsoon months and in mountainous regions.

Rainfall and construction are major triggers behind high-impact events.

A total of 2,987 outliers were detected using the Z-score method (|z| > 3 for fatalities or injuries).

Visual analytics help support risk prediction and climate resilience initiatives.

--------------------------------------------------------------------------------
/Code for data analysis:
--------------------------------------------------------------------------------
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import zscore

# Load your dataset (adjust this path to wherever the CSV export is stored)
df = pd.read_csv("C:/Users/Rohit Sharma/Downloads/Global_Landslide_Catalog_Export.csv")

# 1. Data Cleaning and Pre-processing
df['event_date'] = pd.to_datetime(df['event_date'], errors='coerce')
df['fatality_count'] = df['fatality_count'].fillna(0)
df['injury_count'] = df['injury_count'].fillna(0)
# Drop rows with unparseable dates before extracting the year,
# so that 'year' is an integer column rather than a float with NaNs
df = df.dropna(subset=['event_date']).copy()
df['year'] = df['event_date'].dt.year

# 2. Landslide Frequency Analysis
freq_by_year = df['year'].value_counts().sort_index()
avg_freq = freq_by_year.mean()

# 3. Impact Analysis
impact_by_year = df.groupby('year')[['fatality_count', 'injury_count']].sum()
avg_fatalities = impact_by_year['fatality_count'].mean()
avg_injuries = impact_by_year['injury_count'].mean()

# 4. Geospatial Analysis (scatterplot-style; no precomputation needed, see Plot 3)

# 5. Cause and Effect Analysis
trigger_impact = df.groupby('landslide_trigger')[['fatality_count', 'injury_count']].mean().sort_values(by='fatality_count', ascending=False)

# 6. Relationship between Frequency and Impact
merged_data = freq_by_year.to_frame(name='frequency').join(impact_by_year)

# 7. Outlier Detection (Z-score): flag events more than 3 standard deviations from the mean
df['z_fatalities'] = zscore(df['fatality_count'])
df['z_injuries'] = zscore(df['injury_count'])
outliers = df[(df['z_fatalities'].abs() > 3) | (df['z_injuries'].abs() > 3)]

# ------------------ Visualization ------------------
sns.set(style="whitegrid")
plt.figure(figsize=(22, 24))

# Plot 1 - Landslide Frequency Over Years
plt.subplot(4, 2, 1)
sns.lineplot(x=freq_by_year.index, y=freq_by_year.values)
plt.axhline(avg_freq, color='red', linestyle='--', label='Average')
plt.title("1. Landslide Frequency Over Years")
plt.xlabel("Year")
plt.ylabel("Frequency")
plt.legend()

# Plot 2 - Impact Analysis (Fatalities & Injuries)
plt.subplot(4, 2, 2)
sns.lineplot(data=impact_by_year)
plt.axhline(avg_fatalities, color='red', linestyle='--', label='Avg Fatalities')
plt.axhline(avg_injuries, color='green', linestyle='--', label='Avg Injuries')
plt.title("2. Impact Analysis (Fatalities and Injuries)")
plt.ylabel("Count")
plt.legend()

# Plot 3 - Geospatial Distribution
plt.subplot(4, 2, 3)
sns.scatterplot(data=df, x='longitude', y='latitude', hue='fatality_count', size='injury_count',
                sizes=(10, 200), palette='Reds', alpha=0.6, legend=False)
plt.title("3. Geospatial Distribution of Landslides")
plt.xlabel("Longitude")
plt.ylabel("Latitude")

# Plot 4 - Trigger vs Average Impact
plt.subplot(4, 2, 4)
trigger_impact.plot(kind='bar', ax=plt.gca())
plt.title("4. Average Impact by Landslide Trigger")
plt.ylabel("Average Count")

# Plot 5 - Frequency vs Impact (one point per year)
plt.subplot(4, 2, 5)
sns.scatterplot(data=merged_data, x='frequency', y='fatality_count', label='Fatalities')
sns.scatterplot(data=merged_data, x='frequency', y='injury_count', label='Injuries')
plt.axhline(avg_fatalities, color='red', linestyle='--')
plt.axhline(avg_injuries, color='green', linestyle='--')
plt.title("5. Frequency vs Fatalities and Injuries")
plt.xlabel("Frequency")
plt.ylabel("Impact")
plt.legend()

# Plot 6 - Outliers Detected
plt.subplot(4, 2, 6)
sns.scatterplot(data=outliers, x='event_date', y='fatality_count', color='red', label='Fatality Outliers')
sns.scatterplot(data=outliers, x='event_date', y='injury_count', color='blue', label='Injury Outliers')
plt.title("6. Outliers in Fatalities & Injuries (Z-Score)")
plt.xlabel("Date")
plt.ylabel("Count")
plt.legend()

# Plot 7 - Gradient Line Plot (Frequency with color)
plt.subplot(4, 2, 7)
gradient_data = pd.DataFrame({
    'year': freq_by_year.index,
    'frequency': freq_by_year.values
})
# Bin the yearly frequencies into 5 equal-width bins, each labeled with a hex color
gradient_data['color'] = pd.cut(gradient_data['frequency'], bins=5, labels=sns.color_palette("coolwarm", 5).as_hex())
# Draw each year-to-year segment in the color of its end point
for i in range(1, len(gradient_data)):
    plt.plot(
        gradient_data['year'].iloc[i-1:i+1],
        gradient_data['frequency'].iloc[i-1:i+1],
        color=gradient_data['color'].iloc[i],
        linewidth=4
    )
plt.axhline(avg_freq, color='black', linestyle='--', label='Average')
plt.title("7. Gradient Line: Landslide Frequency Over Time")
plt.xlabel("Year")
plt.ylabel("Frequency")
plt.legend()

plt.tight_layout()
plt.show()

--------------------------------------------------------------------------------
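
The script joins yearly frequency with yearly impact and only plots the result. One possible statistical summary of that relationship is a Pearson correlation, sketched below. The small events table is invented purely for illustration and stands in for the catalog, so the resulting numbers carry no meaning; `DataFrame.corr` is the only step added on top of the frequency and impact aggregation used in the script.

```python
import pandas as pd

# Toy event records, invented for illustration (not taken from the catalog).
events = pd.DataFrame({
    "year": [2010, 2010, 2011, 2011, 2011, 2012],
    "fatality_count": [2, 0, 5, 1, 0, 3],
    "injury_count": [1, 0, 2, 2, 0, 0],
})

# Same aggregation pattern as the script: events per year and summed impact per year.
freq_by_year = events["year"].value_counts().sort_index()
impact_by_year = events.groupby("year")[["fatality_count", "injury_count"]].sum()
merged = freq_by_year.to_frame(name="frequency").join(impact_by_year)

# Pairwise Pearson correlation between yearly frequency and yearly impact.
corr = merged.corr(method="pearson")
print(corr.loc["frequency", ["fatality_count", "injury_count"]])
```

With only one row per year, a coefficient like this rests on very few points, so it is better read alongside the scatter plot than on its own.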