├── README.md
├── Screenshot (399).png
├── Screenshot (400).png
├── Screenshot (401).png
├── Screenshot (402).png
├── Screenshot (403).png
├── Screenshot (405).png
├── Screenshot (406).png
├── Screenshot (407).png
├── Screenshot (408).png
├── Screenshot (409).png
├── code.py
├── datasource
├── description
├── objectives of the project
└── tools used


/README.md:
--------------------------------------------------------------------------------
1 | # python-project
2 | GeoTherma: U.S. Low-Temperature Hydro Analysis
3 | 


--------------------------------------------------------------------------------
/Screenshot (399).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (399).png


--------------------------------------------------------------------------------
/Screenshot (400).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (400).png


--------------------------------------------------------------------------------
/Screenshot (401).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (401).png


--------------------------------------------------------------------------------
/Screenshot (402).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (402).png


--------------------------------------------------------------------------------
/Screenshot (403).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (403).png


--------------------------------------------------------------------------------
/Screenshot (405).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (405).png


--------------------------------------------------------------------------------
/Screenshot (406).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (406).png


--------------------------------------------------------------------------------
/Screenshot (407).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (407).png


--------------------------------------------------------------------------------
/Screenshot (408).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (408).png


--------------------------------------------------------------------------------
/Screenshot (409).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sakshisinha15/python-project/b7bf73d6ab9a077e0b2c00680ecb4f0709735443/Screenshot (409).png


--------------------------------------------------------------------------------
/code.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import pandas as pd
  3 | import matplotlib.pyplot as plt
  4 | import seaborn as sns
  5 | df=pd.read_csv('datasetproject.csv')
  6 | print("Data loaded successfully")
  7 | print("------------------------------------\n")
  8 | print("FIRST 10 ROWS OF THE DATASET")
  9 | # First 10 rows of the dataset
 10 | print("------------------------------------\n")
 11 | print(df.head(10))
 12 | print("------------------------------------\n")
 13 | print("LAST 10 ROWS OF THE DATASET")
 14 | # Last 10 rows of the dataset (tail() returns only 5 rows unless told otherwise)
 15 | print("------------------------------------\n")
 16 | print(df.tail(10))
 17 | print("------------------------------------\n")
 18 | print("INFO ABOUT THE DATASET")
 19 | # Info about the dataset; df.info() prints its report directly and returns None
 20 | print("------------------------------------\n")
 21 | df.info()
 22 | print("------------------------------------\n")
 23 | print("MEAN, STD, QUARTILES, MIN, MAX OF THE DATASET")
 24 | print("------------------------------------\n")
 25 | # Summary statistics: count, mean, standard deviation, min, quartiles (25%, 50%, 75%), max
 26 | print(df.describe())
 27 | print("------------------------------------------------------------\n")
 28 | print("                    DATA  CLEANING                            ")
 29 | print("------------------------------------------------------------\n")
 30 | print(df.isnull().sum())
 31 | # Fill missing values in these categorical columns with each column's most frequent value (mode)
 32 | df_filled=df.fillna({'geologic_province':df['geologic_province'].mode()[0],'country': df['country'].mode()[0],
 33 |                      'reference':df['reference'].mode()[0]})
 34 | print("\nDataframe after filling missing values:")
 35 | print(df_filled)
 36 | print(df_filled.isnull().sum())
 37 | 
 38 | #Dropping rows with missing values
 39 | df_dropped_rows=df.dropna()
 40 | print("\nDataFrame after dropping rows with missing values:")
 41 | print(df_dropped_rows)
 42 | #Dropping columns with missing values
 43 | df_dropped_cols=df.dropna(axis=1)
 44 | print("\nDataFrame after dropping cols with missing values:")
 45 | print(df_dropped_cols)
 46 | 
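 47 | # Save the filled data to a separate file so the raw CSV is not overwritten (the file name is an arbitrary choice)
 48 | cleaned_file_path="datasetproject_cleaned.csv"
 49 | df_filled.to_csv(cleaned_file_path,index=False)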
 50 | print(f"Cleaned data saved to: {cleaned_file_path}")
 51 | 
 52 | # OBJECTIVE 1: Rank geothermal sites by energy potential
 53 | # Rank geothermal areas by beneficial heat over 30 years (the 'beneficial_heat_mwh' column) to identify the most energy-rich locations in the U.S.
 54 | top_regions = df.sort_values(by='beneficial_heat_mwh', ascending=False).head(10)
 55 | plt.figure(figsize=(12, 6))
 56 | sns.barplot(x='beneficial_heat_mwh',y='geothermal_area',hue='geothermal_area',data=top_regions,palette='YlOrRd',dodge=False,legend=False)
 57 | plt.title('Top 10 Geothermal Areas by Energy Potential (MWh)')
 58 | plt.xlabel('Beneficial Heat (MWh)')
 59 | plt.ylabel('Geothermal Area')
 60 | plt.tight_layout()
 61 | plt.show()
 62 | # OBJECTIVE 2: Analyze the relationship between temperature and energy output
 63 | # Pearson correlations between reservoir temperature, energy output (MWh over 30 years), and the other numeric parameters. The matrix shows strength and direction only; it does not test statistical significance.
 64 | numeric_cols = ['temperature_c', 'beneficial_heat_mwh', 'wells', 'reservoir_area_km2', 'reservoir_volume_km3']
 65 | corr_matrix = df[numeric_cols].corr()
 66 | # Plot heatmap
 67 | plt.figure(figsize=(10, 6))
 68 | sns.heatmap(corr_matrix, annot=True, cmap='YlGnBu', fmt=".2f", linewidths=0.5)
 69 | plt.title('Correlation Heatmap of Geothermal Parameters')
 70 | plt.tight_layout()
 71 | plt.show()
 72 | # OBJECTIVE 3: Compare efficiency by system type
 73 | # Compare geothermal system types (e.g., sedimentary basin, delineated area) to see which offers the best energy yield per well.
 74 | df['heat_per_well'] = df['beneficial_heat_mwh'] / df['wells'].replace(0, np.nan)  # zero wells -> NaN instead of inf
 75 | efficiency = df.groupby('system_type')['heat_per_well'].mean().sort_values(ascending=False)
 76 | efficiency_df = efficiency.reset_index()
 77 | plt.figure(figsize=(10, 6))
 78 | sns.barplot(data=efficiency_df,x='heat_per_well',y='system_type',hue='system_type',palette='coolwarm',legend=False)
 79 | plt.title('Average Heat Output per Well by System Type')
 80 | plt.xlabel('Heat per Well (MWh)')
 81 | plt.ylabel('System Type')
 82 | plt.tight_layout()
 83 | plt.show()
 84 | # OBJECTIVE 4: Visualize geothermal energy by state
 85 | # Total beneficial heat for the top 10 entries. Note: the code groups by 'geothermal_area', so the plot is per geothermal area rather than per U.S. state.
 86 | area_heat = df.groupby('geothermal_area')['beneficial_heat_mwh'].sum().sort_values(ascending=False)
 87 | area_heat_df = area_heat.reset_index()
 88 | area_heat_df.columns = ['geothermal_area', 'total_heat']
 89 | top_n = 10
 90 | area_heat_df = area_heat_df.head(top_n)
 91 | plt.figure(figsize=(12, 6))
 92 | sns.lineplot(data=area_heat_df, x='geothermal_area', y='total_heat', marker='o', linewidth=2.5)
 93 | plt.title('Beneficial Heat Output by Geothermal Area (Top 10)', fontsize=14)
 94 | plt.xlabel('Geothermal Area', fontsize=12)
 95 | plt.ylabel('Total Heat (MWh)', fontsize=12)
 96 | plt.xticks(rotation=45)
 97 | plt.grid(True)
 98 | plt.tight_layout()
 99 | plt.show()
100 | # OBJECTIVE 5: Optimize well deployment for maximum output
101 | # Relate well density (wells per km²) to heat per well to guide future deployment strategies.
102 | df['heat_per_well'] = df['beneficial_heat_mwh'] / df['wells'].replace(0, np.nan)  # zero wells -> NaN instead of inf
103 | df['wells_per_km2'] = df['wells'] / df['reservoir_area_km2'].replace(0, np.nan)  # zero area -> NaN instead of inf
104 | plt.figure(figsize=(10, 6))
105 | sns.scatterplot(data=df, x='wells_per_km2', y='heat_per_well', hue='system_type')
106 | plt.title('Well Optimization: Heat per Well vs Wells per km²')
107 | plt.xlabel('Wells per km²')
108 | plt.ylabel('Heat per Well (MWh)')
109 | plt.tight_layout()
110 | plt.show()
111 | 
112 | 


--------------------------------------------------------------------------------
/datasource:
--------------------------------------------------------------------------------
1 | The dataset comes from data.gov, the U.S. government's open data catalog.
2 | Link to the dataset: https://catalog.data.gov/dataset/low-temperature-hydrothermal-resource-potential-estimate-17562
3 | 


--------------------------------------------------------------------------------
/description:
--------------------------------------------------------------------------------
1 | Description:
2 | GeoTherma is a data-driven project focused on analyzing low-temperature geothermal resources across the United States. Using Python, we processed, cleaned, and analyzed geospatial and thermal datasets to assess the potential of hydrothermal sites for sustainable energy generation.
3 | The analysis explores correlations between reservoir temperature, well count, reservoir area and volume, and estimated beneficial heat. We ranked geothermal areas by beneficial heat (MWh), compared system types by average heat per well, and plotted well density against heat per well.
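
The correlation heatmap in code.py reports correlation coefficients but no significance level. One way to check whether a temperature/heat correlation could be due to chance is a permutation test, sketched below using NumPy only. This is not part of the project's code: the function name `permutation_corr_test`, its parameters, and the synthetic values are assumptions made for illustration.

```python
import numpy as np


def permutation_corr_test(x, y, n_permutations=2000, seed=0):
    """Return (Pearson r, two-sided permutation p-value) for paired samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep only pairs where both values are present.
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]

    observed = np.corrcoef(x, y)[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real pairing between x and y.
        shuffled = rng.permutation(y)
        if abs(np.corrcoef(x, shuffled)[0, 1]) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic, made-up values purely for illustration (not from the dataset).
rng = np.random.default_rng(42)
temperature_c = rng.uniform(30, 90, size=60)
beneficial_heat_mwh = 1000 * temperature_c + rng.normal(0, 5000, size=60)

r, p = permutation_corr_test(temperature_c, beneficial_heat_mwh)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

With the real data, the same function could be called as `permutation_corr_test(df['temperature_c'], df['beneficial_heat_mwh'])`, assuming those columns are numeric as code.py expects.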
4 | 


--------------------------------------------------------------------------------
/objectives of the project:
--------------------------------------------------------------------------------
1 | Objectives of the project:
2 | 1. 🔥 Rank geothermal sites by energy potential to identify high-yield regions.
3 | 2. 🌡 Analyze the relationship between temperature and energy output.
4 | 3. ⚙️ Compare system types based on energy efficiency per well.
5 | 4. 🗺 Visualize geothermal resource distribution across U.S. states.
6 | 5. 🎯 Optimize well deployment to maximize heat output and minimize inefficiency.
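
Objectives 3 and 5 both rest on per-well and per-area ratios. The sketch below shows how those ratios could be computed so that a site with zero wells does not produce an infinite value. The column names match those used in code.py; the rows themselves are hypothetical and not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; the values are not from the dataset.
sites = pd.DataFrame({
    "system_type": ["delineated area", "delineated area",
                    "sedimentary basin", "sedimentary basin"],
    "beneficial_heat_mwh": [1200.0, 800.0, 3000.0, 500.0],
    "wells": [4, 2, 10, 0],
    "reservoir_area_km2": [2.0, 1.0, 5.0, 0.5],
})

# Treat zero wells as missing so the division yields NaN rather than inf.
wells = sites["wells"].replace(0, np.nan)
sites["heat_per_well"] = sites["beneficial_heat_mwh"] / wells
sites["wells_per_km2"] = sites["wells"] / sites["reservoir_area_km2"]

# mean() skips NaN by default, so the zero-well row does not distort the average.
efficiency = (
    sites.groupby("system_type")["heat_per_well"]
    .mean()
    .sort_values(ascending=False)
)
print(efficiency)
```

Replacing zeros with NaN is one possible choice; dropping such rows before the division would work as well, at the cost of losing them from the other plots.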
7 | 
8 |  
9 | 


--------------------------------------------------------------------------------
/tools used:
--------------------------------------------------------------------------------
1 | Tools & Libraries Used:
2 | Pandas – for data cleaning, manipulation, and tabular analysis
3 | NumPy – for numerical computations and array operations
4 | Matplotlib & Seaborn – for data visualization and exploratory plotting
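
As a small illustration of the pandas cleaning step, the sketch below fills missing categorical values with each column's mode and returns a new DataFrame instead of modifying the input. The helper name `fill_with_mode` and the sample rows are assumptions for illustration; only the `geologic_province` and `temperature_c` column names come from code.py.

```python
import numpy as np
import pandas as pd


def fill_with_mode(frame, columns):
    """Return a copy of frame with NaNs in the given columns replaced by each column's mode."""
    filled = frame.copy()
    for col in columns:
        modes = filled[col].mode()
        # An all-missing column has no mode; leave it untouched.
        if not modes.empty:
            filled[col] = filled[col].fillna(modes.iloc[0])
    return filled


# Illustrative rows only; the values are not taken from the dataset.
raw = pd.DataFrame({
    "geologic_province": ["Basin and Range", None, "Basin and Range", "Cascades"],
    "temperature_c": [65.0, 48.0, np.nan, 72.0],
})

cleaned = fill_with_mode(raw, ["geologic_province"])
print(cleaned.isnull().sum())
```

Numeric columns such as `temperature_c` are left alone here, since a mode is rarely a sensible fill value for continuous measurements. The result could then be written to a separate file, for example `cleaned.to_csv("datasetproject_cleaned.csv", index=False)`, where the file name is a hypothetical choice.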
5 | 


--------------------------------------------------------------------------------