├── .gitignore
├── README.md
├── healthcare
    ├── health_care_kg.py
    ├── health_care_langchain.py
    └── healthcare.csv
├── kgraph_rag
    └── roman_emp_graph_rag.py
├── note-neo4j-creds.md
├── prep_text_for_rag
    └── app.py
├── requirements.txt
└── simple_kg
    └── kg_simple.py


/.gitignore:
--------------------------------------------------------------------------------

# ignore the venv directory
venv/
# ignore the .vscode directory
.vscode/
# ignore the .idea directory
.idea/
# ignore the .env file
.env


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Knowledge Graph RAG System

An implementation of Retrieval-Augmented Generation (RAG) built on knowledge graphs stored in Neo4j. This repository shows how to build and query knowledge graphs and how to use them as retrieval context for LLM applications.

## Overview

This project showcases different approaches to building and using knowledge graphs for RAG systems:

1. **Simple Knowledge Graph** - Basic Neo4j implementation for creating and querying knowledge graphs
2. **Knowledge Graph RAG** - Advanced RAG system using knowledge graphs for structured information retrieval
3. **Healthcare Knowledge Graph** - Domain-specific implementation for healthcare data
4. **Text Preparation for RAG** - Tools for preparing and processing text data for RAG systems

## Features

- Neo4j integration for knowledge graph storage and querying
- LangChain integration for RAG pipeline implementation
- Entity extraction and relationship mapping
- Full-text search capabilities
- Hybrid retrieval combining structured and unstructured data
- Conversation history handling for contextual responses
- Domain-specific implementations (healthcare, Roman Empire)

## Project Structure

```
knowledge-graph-rag/
├── simple_kg/                    # Basic knowledge graph implementation
│   └── kg_simple.py              # Simple Neo4j operations
├── kgraph_rag/                   # Advanced RAG with knowledge graphs
│   └── roman_emp_graph_rag.py    # Roman Empire knowledge graph RAG
├── prep_text_for_rag/            # Text preparation tools
│   └── app.py                    # Text processing application
├── healthcare/                   # Healthcare domain implementation
│   ├── health_care_kg.py         # Loads healthcare.csv into the knowledge graph
│   ├── health_care_langchain.py  # Sample Cypher queries through LangChain's Neo4jGraph
│   └── healthcare.csv            # Sample healthcare data
├── requirements.txt              # Project dependencies
└── .env                          # Environment variables (not tracked in git)
```

## Prerequisites

- Python 3.8+
- Neo4j Aura instance or local Neo4j database
- OpenAI API key

## Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/Badribn0612/knowledge-graph-rag.git
   cd knowledge-graph-rag
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. Create a `.env` file with your credentials:
   ```
   NEO4J_URI=your_neo4j_uri
   NEO4J_USERNAME=your_neo4j_username
   NEO4J_PASSWORD=your_neo4j_password
   AURA_INSTANCENAME=your_aura_instance_name
   OPENAI_API_KEY=your_openai_api_key
   ```

## Usage

### Simple Knowledge Graph

```python
from simple_kg.kg_simple import connect_and_query

# Connect to Neo4j and run a simple query
connect_and_query()
```

### Knowledge Graph RAG

```python
from kgraph_rag.roman_emp_graph_rag import chain

# Ask a question about the Roman Empire
response = chain.invoke({
    "question": "Who was the first Roman emperor?",
})

print(response)
```

### Healthcare Knowledge Graph

```bash
# Load healthcare.csv into Neo4j, then run the sample Cypher queries
python healthcare/health_care_kg.py
python healthcare/health_care_langchain.py
```

## Key Components

### Knowledge Graph Creation

The system supports multiple approaches to knowledge graph creation:
- Manual entity and relationship creation
- Automated extraction from text using LLMs
- Import from structured data (CSV)

### RAG Implementation

The RAG system combines:
- Structured data retrieval from the knowledge graph
- Unstructured data retrieval using vector search
- Context-aware question answering with conversation history

### Entity Extraction

The system uses LLMs to extract entities from text, which are then stored in the knowledge graph for future retrieval.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
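
## Example: Combining Structured and Unstructured Context

As a rough illustration of the hybrid retrieval described under RAG Implementation, the sketch below merges graph facts and retrieved passages into a single context string for an LLM prompt. The function name `build_hybrid_context`, its parameters, and the sample fact and passage are hypothetical and chosen only for this example; the repository's own retriever may format its context differently.

```python
from typing import List


def build_hybrid_context(graph_facts: List[str], passages: List[str]) -> str:
    """Merge graph facts and retrieved passages into one context string."""
    # Structured side: one relationship per line, or a placeholder if empty.
    structured = "\n".join(graph_facts) if graph_facts else "(none)"
    # Unstructured side: prefix each passage so the LLM can tell them apart.
    unstructured = (
        "\n".join(f"#Document {p}" for p in passages) if passages else "(none)"
    )
    return f"Structured data:\n{structured}\nUnstructured data:\n{unstructured}"


# Hypothetical inputs standing in for graph and vector search results.
context = build_hybrid_context(
    ["Augustus - RULED -> Roman Empire"],
    ["Augustus was the first Roman emperor."],
)
print(context)
```

Keeping the two sources under separate labels lets the prompt tell the model which lines are graph relationships and which are free-text passages.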

## Acknowledgments

- Neo4j for the graph database
- LangChain for the RAG framework
- OpenAI for the language models


--------------------------------------------------------------------------------
/healthcare/health_care_kg.py:
--------------------------------------------------------------------------------
import csv
from dotenv import load_dotenv
import os
from neo4j import GraphDatabase

# Load environment variables from .env file
load_dotenv()

# Get environment variables
AURA_INSTANCENAME = os.environ["AURA_INSTANCENAME"]
NEO4J_URI = os.environ["NEO4J_URI"]
NEO4J_USERNAME = os.environ["NEO4J_USERNAME"]
NEO4J_PASSWORD = os.environ["NEO4J_PASSWORD"]
AUTH = (NEO4J_USERNAME, NEO4J_PASSWORD)

# Resolve the CSV next to this script instead of a machine-specific absolute path
CSV_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "healthcare.csv")


# Function to connect and run a Cypher query
def execute_query(driver, cypher_query, parameters=None):
    try:
        with driver.session() as session:  # database=NEO4J_DATABASE
            session.run(cypher_query, parameters)
    except Exception as e:
        print(f"Error: {e}")


# Function to create healthcare provider nodes
def create_healthcare_provider_node(driver, provider, bio):
    print("Creating healthcare provider node")
    create_provider_query = """
    MERGE (hp:HealthcareProvider {name: $provider, bio: $bio})
    """
    parameters = {"provider": provider, "bio": bio}
    execute_query(driver, create_provider_query, parameters)


# Function to create patient nodes
# Note: MERGE matches on the full property map, so the same patient name with a
# different age, gender, or condition becomes a separate Patient node.
def create_patient_node(
    driver, patient, patient_age, patient_gender, patient_condition
):
    print("Creating patient node")
    create_patient_query = """
    MERGE (p:Patient {name: $patient, age: $patient_age, gender: $patient_gender, condition: $patient_condition})
    """
    parameters = {
        "patient": patient,
        "patient_age": patient_age,
        "patient_gender": patient_gender,
        "patient_condition": patient_condition,
    }
    execute_query(driver, create_patient_query, parameters)


# Function to create specialization nodes
def create_specialization_node(driver, specialization):
    print("Creating specialization node")
    create_specialization_query = """
    MERGE (s:Specialization {name: $specialization})
    """
    parameters = {"specialization": specialization}
    execute_query(driver, create_specialization_query, parameters)


# Function to create location nodes
def create_location_node(driver, location):
    print("Creating location node")
    create_location_query = """
    MERGE (l:Location {name: $location})
    """
    parameters = {"location": location}
    execute_query(driver, create_location_query, parameters)


# Function to create relationships
def create_relationships(driver, provider, patient, specialization, location):
    print("Creating relationships")
    create_relationships_query = """
    MATCH (hp:HealthcareProvider {name: $provider}), (p:Patient {name: $patient})
    MERGE (hp)-[:TREATS]->(p)
    WITH hp
    MATCH (hp), (s:Specialization {name: $specialization})
    MERGE (hp)-[:SPECIALIZES_IN]->(s)
    WITH hp
    MATCH (hp), (l:Location {name: $location})
    MERGE (hp)-[:LOCATED_AT]->(l)
    """
    parameters = {
        "provider": provider,
        "patient": patient,
        "specialization": specialization,
        "location": location,
    }
    execute_query(driver, create_relationships_query, parameters)


# Main function to read the CSV file and populate the graph
def main():
    driver = GraphDatabase.driver(NEO4J_URI, auth=AUTH)

    with open(CSV_PATH, mode="r", newline="") as file:
        reader = csv.DictReader(file)
        print("Reading CSV file...")

        for row in reader:
104 | provider = row["Provider"] 105 | patient = row["Patient"] 106 | specialization = row["Specialization"] 107 | location = row["Location"] 108 | bio = row["Bio"] 109 | patient_age = row["Patient_Age"] 110 | patient_gender = row["Patient_Gender"] 111 | patient_condition = row["Patient_Condition"] 112 | 113 | create_healthcare_provider_node(driver, provider, bio) 114 | create_patient_node( 115 | driver, patient, patient_age, patient_gender, patient_condition 116 | ) 117 | create_specialization_node(driver, specialization) 118 | create_location_node(driver, location) 119 | create_relationships(driver, provider, patient, specialization, location) 120 | 121 | driver.close() 122 | print("Graph populated successfully!") 123 | 124 | 125 | # Run the main function 126 | if __name__ == "__main__": 127 | main() 128 | -------------------------------------------------------------------------------- /healthcare/health_care_langchain.py: -------------------------------------------------------------------------------- 1 | from dotenv import load_dotenv 2 | import os 3 | from langchain_neo4j import Neo4jGraph #GraphCypherQAChain 4 | # from langchain_community.graphs import Neo4jGraph 5 | 6 | load_dotenv() 7 | 8 | AURA_INSTANCENAME = os.environ["AURA_INSTANCENAME"] 9 | NEO4J_URI = os.environ["NEO4J_URI"] 10 | NEO4J_USERNAME = os.environ["NEO4J_USERNAME"] 11 | NEO4J_PASSWORD = os.environ["NEO4J_PASSWORD"] 12 | AUTH = (NEO4J_USERNAME, NEO4J_PASSWORD) 13 | 14 | 15 | kg = Neo4jGraph( 16 | url=NEO4J_URI, 17 | username=NEO4J_USERNAME, 18 | password=NEO4J_PASSWORD, 19 | ) # database=NEO4J_DATABASE, 20 | 21 | cypher = """ 22 | MATCH (n) 23 | RETURN count(n) as numberOfNodes 24 | """ 25 | 26 | result = kg.query(cypher) 27 | print(f"There are {result[0]['numberOfNodes']} nodes in this graph.") 28 | 29 | 30 | # Match only the Providers nodes by specifying the node label 31 | cypher = """ 32 | MATCH (n:HealthcareProvider) 33 | RETURN count(n) AS numberOfProviders 34 | """ 35 | res = 
kg.query(cypher) 36 | print(f"There are {res[0]['numberOfProviders']} Healthcare Providers in this graph.") 37 | 38 | 39 | # return the names of the Healthcare Providers 40 | cypher = """ 41 | MATCH (n:HealthcareProvider) 42 | RETURN n.name AS ProviderName 43 | """ 44 | res = kg.query(cypher) 45 | print("Healthcare Providers:") 46 | for r in res: 47 | print(r["ProviderName"]) 48 | 49 | # list all patients in the graph 50 | cypher = """ 51 | MATCH (n:Patient) 52 | RETURN n.name AS PatientName 53 | LIMIT 10 54 | """ 55 | res = kg.query(cypher) 56 | print("Patients:") 57 | for r in res: 58 | print(r["PatientName"]) 59 | 60 | 61 | # list all Specializations in the graph 62 | cypher = """ 63 | MATCH (n:Specialization) 64 | RETURN n.name AS SpecializationName 65 | """ 66 | res = kg.query(cypher) 67 | print("Specializations:") 68 | for r in res: 69 | print(r["SpecializationName"]) 70 | 71 | 72 | # list all Locations in the graph 73 | cypher = """ 74 | MATCH (n:Location) 75 | RETURN n.name AS LocationName 76 | """ 77 | res = kg.query(cypher) 78 | 79 | print("Locations:") 80 | for r in res: 81 | print(r["LocationName"]) 82 | 83 | 84 | # list all patients treated by a specific provider 85 | cypher = """ 86 | MATCH (hp:HealthcareProvider {name: 'Dr. Smith'})-[:TREATS]->(p:Patient) 87 | RETURN p.name AS PatientName 88 | """ 89 | res = kg.query(cypher) 90 | print("Patients treated by Dr. Smith:") 91 | for r in res: 92 | print(r["PatientName"]) 93 | 94 | # And More... 95 | # list all Specializations of a specific provider 96 | cypher = """ 97 | MATCH (hp:HealthcareProvider {name: 'Dr. Smith'})-[:SPECIALIZES_IN]->(s:Specialization) 98 | RETURN s.name AS SpecializationName 99 | """ 100 | res = kg.query(cypher) 101 | print("Specializations of Dr. Smith:") 102 | for r in res: 103 | print(r["SpecializationName"]) 104 | 105 | # 4. 
List All Healthcare Providers Located in a Specific Location 106 | cypher = """ 107 | MATCH (hp:HealthcareProvider)-[:LOCATED_AT]->(l:Location {name: 'Houston'}) 108 | RETURN hp.name AS ProviderName 109 | """ 110 | res = kg.query(cypher) 111 | print("Healthcare Providers located Houston:") 112 | for r in res: 113 | print(r["ProviderName"]) 114 | 115 | 116 | # 5. List All Patients Treated by a Provider Specializing in a Specific Specialization 117 | cypher = """ 118 | MATCH (hp:HealthcareProvider)-[:TREATS]->(p:Patient), 119 | (hp)-[:SPECIALIZES_IN]->(s:Specialization {name: 'Cardiology'}) 120 | RETURN p.name AS PatientName 121 | """ 122 | res = kg.query(cypher) 123 | print("Patients treated by a Cardiologist:") 124 | for r in res: 125 | print(r["PatientName"]) 126 | 127 | # 6. List All Healthcare Providers Located in a Specific Location Specializing in a Specific Specialization 128 | cypher = """ 129 | MATCH (hp:HealthcareProvider)-[:LOCATED_AT]->(l:Location {name: 'Houston'}), 130 | (hp)-[:SPECIALIZES_IN]->(s:Specialization {name: 'Cardiology'}) 131 | RETURN hp.name AS ProviderName 132 | """ 133 | res = kg.query(cypher) 134 | print("\nCardiologists located in Houston:") 135 | for r in res: 136 | print(r["ProviderName"]) 137 | 138 | # 7. 
List All Patients Treated by a Provider Specializing in a Specific Specialization Located in a Specific Location 139 | cypher = """ 140 | MATCH (hp:HealthcareProvider)-[:TREATS]->(p:Patient), 141 | (hp)-[:SPECIALIZES_IN]->(s:Specialization {name: 'Cardiology'}), 142 | (hp)-[:LOCATED_AT]->(l:Location {name: 'Houston'}) 143 | RETURN p.name AS PatientName 144 | """ 145 | res = kg.query(cypher) 146 | print("\nCardiology patients treated by providers in Houston:") 147 | for r in res: 148 | print(r["PatientName"]) 149 | 150 | # list all patients who have Parkinson's Disease 151 | cypher = """ 152 | MATCH (p:Patient {condition: 'Migraine'}) 153 | RETURN p.name AS PatientName 154 | """ 155 | res = kg.query(cypher) 156 | print("\n \n****Patients with Migrane: ***") 157 | for r in res: 158 | print(r["PatientName"]) 159 | -------------------------------------------------------------------------------- /healthcare/healthcare.csv: -------------------------------------------------------------------------------- 1 | Provider,Patient,Specialization,Location,Bio,Patient_Age,Patient_Gender,Patient_Condition 2 | Dr. Jessica Lee,Eva Blue,Pediatrics,Los Angeles,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,66,Male,Asthma 3 | Dr. Michael Brown,Alice Brown,Pediatrics,Los Angeles,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,59,Female,Osteoarthritis 4 | Dr. Jessica Lee,Grace Red,Pediatrics,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,72,Male,Eczema 5 | Dr. Emily Davis,Frank Yellow,Pediatrics,New York,Dr. Emily Davis specializes in neurology and has published numerous research papers.,78,Female,Epilepsy 6 | Dr. John Smith,David Black,Cardiology,Los Angeles,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,32,Female,Hypertension 7 | Dr. Jessica Lee,Bob White,Pediatrics,Phoenix,Dr. 
Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,99,Female,Psoriasis 8 | Dr. Sarah Johnson,Grace Red,Dermatology,Houston,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,90,Male,Acne 9 | Dr. Sarah Johnson,Grace Red,Pediatrics,Phoenix,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,54,Male,ADHD 10 | Dr. John Smith,Frank Yellow,Orthopedics,New York,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,11,Female,Scoliosis 11 | Dr. John Smith,Frank Yellow,Cardiology,New York,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,2,Female,Heart Murmur 12 | Dr. Emily Davis,Grace Red,Orthopedics,New York,Dr. Emily Davis specializes in neurology and has published numerous research papers.,16,Female,Back Pain 13 | Dr. John Smith,David Black,Pediatrics,Los Angeles,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,76,Female,Asthma 14 | Dr. Jessica Lee,Alice Brown,Neurology,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,38,Female,Migraine 15 | Dr. Emily Davis,Grace Red,Dermatology,Chicago,Dr. Emily Davis specializes in neurology and has published numerous research papers.,54,Male,Rosacea 16 | Dr. Jessica Lee,Frank Yellow,Cardiology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,16,Male,Hypertension 17 | Dr. Emily Davis,Eva Blue,Orthopedics,Chicago,Dr. Emily Davis specializes in neurology and has published numerous research papers.,52,Male,Arthritis 18 | Dr. Jessica Lee,Bob White,Dermatology,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,86,Female,Psoriasis 19 | Dr. John Smith,Alice Brown,Cardiology,New York,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,4,Female,Congenital Heart Disease 20 | Dr. John Smith,Frank Yellow,Cardiology,Phoenix,Dr. 
John Smith is a renowned cardiologist with over 20 years of experience.,51,Male,Coronary Artery Disease 21 | Dr. Emily Davis,Charlie Green,Pediatrics,Phoenix,Dr. Emily Davis specializes in neurology and has published numerous research papers.,49,Female,ADHD 22 | Dr. Jessica Lee,Frank Yellow,Orthopedics,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,27,Female,Fracture 23 | Dr. Jessica Lee,Bob White,Orthopedics,Los Angeles,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,73,Male,Osteoporosis 24 | Dr. Jessica Lee,David Black,Pediatrics,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,87,Female,Asthma 25 | Dr. Sarah Johnson,Bob White,Pediatrics,New York,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,53,Female,ADHD 26 | Dr. John Smith,Alice Brown,Cardiology,Los Angeles,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,36,Male,Hypertension 27 | Dr. Sarah Johnson,David Black,Cardiology,Houston,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,84,Male,Arrhythmia 28 | Dr. Michael Brown,Bob White,Neurology,New York,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,70,Male,Parkinson's Disease 29 | Dr. Sarah Johnson,Charlie Green,Neurology,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,6,Male,Epilepsy 30 | Dr. Sarah Johnson,Eva Blue,Cardiology,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,38,Female,Heart Disease 31 | Dr. Jessica Lee,Frank Yellow,Neurology,Houston,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,94,Male,Stroke 32 | Dr. Jessica Lee,Charlie Green,Dermatology,Los Angeles,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,47,Male,Eczema 33 | Dr. Jessica Lee,Frank Yellow,Cardiology,New York,Dr. 
Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,21,Female,Heart Failure 34 | Dr. Jessica Lee,Frank Yellow,Cardiology,Chicago,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,95,Male,Coronary Artery Disease 35 | Dr. Michael Brown,Frank Yellow,Neurology,Chicago,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,19,Male,Migraine 36 | Dr. Sarah Johnson,Frank Yellow,Orthopedics,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,57,Male,Arthritis 37 | Dr. Michael Brown,Frank Yellow,Orthopedics,New York,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,63,Female,Osteoporosis 38 | Dr. Jessica Lee,David Black,Orthopedics,Houston,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,23,Female,Scoliosis 39 | Dr. Emily Davis,Eva Blue,Orthopedics,Chicago,Dr. Emily Davis specializes in neurology and has published numerous research papers.,89,Male,Back Pain 40 | Dr. Michael Brown,David Black,Cardiology,Chicago,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,28,Female,Heart Disease 41 | Dr. Jessica Lee,Grace Red,Neurology,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,18,Female,Stroke 42 | Dr. Sarah Johnson,Alice Brown,Dermatology,Phoenix,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,20,Male,Acne 43 | Dr. Jessica Lee,David Black,Neurology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,95,Male,Parkinson's Disease 44 | Dr. Michael Brown,Alice Brown,Cardiology,Los Angeles,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,17,Male,Hypertension 45 | Dr. Emily Davis,David Black,Orthopedics,Phoenix,Dr. Emily Davis specializes in neurology and has published numerous research papers.,26,Male,Fracture 46 | Dr. 
Jessica Lee,Alice Brown,Cardiology,Chicago,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,11,Female,Heart Murmur 47 | Dr. Emily Davis,Grace Red,Dermatology,Los Angeles,Dr. Emily Davis specializes in neurology and has published numerous research papers.,24,Female,Rosacea 48 | Dr. Michael Brown,David Black,Cardiology,Los Angeles,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,54,Female,Arrhythmia 49 | Dr. John Smith,Eva Blue,Dermatology,Chicago,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,6,Female,Eczema 50 | Dr. John Smith,Alice Brown,Pediatrics,Phoenix,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,42,Female,Asthma 51 | Dr. Jessica Lee,Alice Brown,Cardiology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,85,Female,Heart Disease 52 | Dr. Jessica Lee,David Black,Neurology,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,9,Male,Epilepsy 53 | Dr. Michael Brown,Frank Yellow,Cardiology,Houston,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,45,Male,Coronary Artery Disease 54 | Dr. Sarah Johnson,Frank Yellow,Pediatrics,Houston,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,54,Male,ADHD 55 | Dr. Jessica Lee,Alice Brown,Neurology,New York,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,93,Male,Parkinson's Disease 56 | Dr. John Smith,Frank Yellow,Neurology,Chicago,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,84,Male,Stroke 57 | Dr. Michael Brown,Bob White,Pediatrics,Chicago,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,4,Female,Asthma 58 | Dr. John Smith,David Black,Cardiology,New York,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,49,Male,Hypertension 59 | Dr. 
John Smith,Frank Yellow,Cardiology,Los Angeles,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,81,Male,Heart Failure 60 | Dr. Jessica Lee,Bob White,Orthopedics,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,3,Female,Fracture 61 | Dr. Jessica Lee,Eva Blue,Cardiology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,86,Female,Coronary Artery Disease 62 | Dr. Sarah Johnson,Eva Blue,Neurology,Phoenix,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,57,Male,Epilepsy 63 | Dr. Jessica Lee,Bob White,Neurology,Los Angeles,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,30,Male,Parkinson's Disease 64 | Dr. Emily Davis,Charlie Green,Neurology,Houston,Dr. Emily Davis specializes in neurology and has published numerous research papers.,13,Female,Epilepsy 65 | Dr. Sarah Johnson,Eva Blue,Dermatology,Houston,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,16,Female,Acne 66 | Dr. Emily Davis,Grace Red,Cardiology,Chicago,Dr. Emily Davis specializes in neurology and has published numerous research papers.,59,Male,Heart Disease 67 | Dr. John Smith,Bob White,Pediatrics,Houston,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,91,Female,Asthma 68 | Dr. Emily Davis,Grace Red,Neurology,Phoenix,Dr. Emily Davis specializes in neurology and has published numerous research papers.,4,Male,Epilepsy 69 | Dr. Michael Brown,Frank Yellow,Neurology,New York,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,54,Female,Migraine 70 | Dr. Sarah Johnson,David Black,Neurology,Chicago,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,72,Male,Parkinson's Disease 71 | Dr. John Smith,Grace Red,Dermatology,Houston,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,44,Male,Eczema 72 | Dr. 
Sarah Johnson,David Black,Dermatology,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,93,Male,Rosacea 73 | Dr. John Smith,Alice Brown,Orthopedics,Los Angeles,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,39,Male,Osteoporosis 74 | Dr. Emily Davis,Eva Blue,Neurology,Phoenix,Dr. Emily Davis specializes in neurology and has published numerous research papers.,68,Male,Parkinson's Disease 75 | Dr. Sarah Johnson,Alice Brown,Pediatrics,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,35,Female,Asthma 76 | Dr. Sarah Johnson,David Black,Neurology,Phoenix,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,63,Male,Parkinson's Disease 77 | Dr. Emily Davis,Grace Red,Dermatology,Houston,Dr. Emily Davis specializes in neurology and has published numerous research papers.,1,Female,Acne 78 | Dr. Emily Davis,Frank Yellow,Dermatology,Chicago,Dr. Emily Davis specializes in neurology and has published numerous research papers.,16,Male,Rosacea 79 | Dr. Jessica Lee,David Black,Orthopedics,Chicago,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,5,Male,Scoliosis 80 | Dr. John Smith,Charlie Green,Dermatology,Phoenix,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,51,Female,Eczema 81 | Dr. Emily Davis,Frank Yellow,Neurology,Los Angeles,Dr. Emily Davis specializes in neurology and has published numerous research papers.,47,Male,Stroke 82 | Dr. Sarah Johnson,Frank Yellow,Neurology,Houston,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,81,Female,Epilepsy 83 | Dr. John Smith,Grace Red,Neurology,Chicago,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,56,Male,Stroke 84 | Dr. Sarah Johnson,Charlie Green,Pediatrics,Chicago,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,7,Male,ADHD 85 | Dr. Jessica Lee,Frank Yellow,Cardiology,New York,Dr. 
Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,27,Female,Heart Disease 86 | Dr. John Smith,Alice Brown,Dermatology,Chicago,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,72,Female,Eczema 87 | Dr. Jessica Lee,Eva Blue,Dermatology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,37,Male,Psoriasis 88 | Dr. Jessica Lee,Eva Blue,Neurology,Phoenix,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,32,Female,Epilepsy 89 | Dr. Jessica Lee,Grace Red,Orthopedics,Los Angeles,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,87,Male,Fracture 90 | Dr. John Smith,Eva Blue,Neurology,Phoenix,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,85,Female,Stroke 91 | Dr. Jessica Lee,Eva Blue,Pediatrics,Chicago,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,56,Male,Asthma 92 | Dr. Jessica Lee,Bob White,Neurology,Houston,Dr. Jessica Lee is a dermatologist focused on skin cancer treatment and prevention.,46,Male,Migraine 93 | Dr. John Smith,David Black,Orthopedics,Phoenix,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,69,Female,Fracture 94 | Dr. Emily Davis,Frank Yellow,Dermatology,Los Angeles,Dr. Emily Davis specializes in neurology and has published numerous research papers.,77,Male,Rosacea 95 | Dr. Emily Davis,Bob White,Neurology,Los Angeles,Dr. Emily Davis specializes in neurology and has published numerous research papers.,53,Male,Parkinson's Disease 96 | Dr. Sarah Johnson,Charlie Green,Dermatology,Phoenix,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,97,Male,Acne 97 | Dr. Sarah Johnson,Grace Red,Orthopedics,Los Angeles,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,88,Male,Osteoporosis 98 | Dr. John Smith,Grace Red,Cardiology,Phoenix,Dr. 
John Smith is a renowned cardiologist with over 20 years of experience.,13,Female,Heart Disease 99 | Dr. John Smith,Charlie Green,Orthopedics,Phoenix,Dr. John Smith is a renowned cardiologist with over 20 years of experience.,56,Female,Fracture 100 | Dr. Michael Brown,Charlie Green,Neurology,Houston,Dr. Michael Brown is an orthopedic surgeon with expertise in joint replacement.,95,Male,Parkinson's Disease 101 | Dr. Sarah Johnson,David Black,Cardiology,Chicago,Dr. Sarah Johnson is a pediatrician known for her compassionate care.,63,Male,Coronary Artery Disease 102 | -------------------------------------------------------------------------------- /kgraph_rag/roman_emp_graph_rag.py: -------------------------------------------------------------------------------- 1 | from dotenv import load_dotenv 2 | import os 3 | from langchain_neo4j import Neo4jGraph 4 | 5 | from langchain_core.runnables import ( 6 | RunnableBranch, 7 | RunnableLambda, 8 | RunnableParallel, 9 | RunnablePassthrough, 10 | ) 11 | from langchain_core.prompts import ChatPromptTemplate 12 | from langchain_core.prompts.prompt import PromptTemplate 13 | from pydantic import BaseModel, Field 14 | # from langchain_core.pydantic_v1 import BaseModel, Field 15 | from typing import Tuple, List 16 | from langchain_core.messages import AIMessage, HumanMessage 17 | from langchain_core.output_parsers import StrOutputParser 18 | from langchain_community.document_loaders import WikipediaLoader 19 | from langchain.text_splitter import TokenTextSplitter 20 | from langchain_openai import ChatOpenAI 21 | from langchain_experimental.graph_transformers import LLMGraphTransformer 22 | 23 | from langchain_neo4j import Neo4jVector 24 | from langchain_openai import OpenAIEmbeddings 25 | from langchain_neo4j.vectorstores.neo4j_vector import remove_lucene_chars 26 | 27 | load_dotenv() 28 | 29 | AURA_INSTANCENAME = os.environ["AURA_INSTANCENAME"] 30 | NEO4J_URI = os.environ["NEO4J_URI"] 31 | NEO4J_USERNAME = 
os.environ["NEO4J_USERNAME"] 32 | NEO4J_PASSWORD = os.environ["NEO4J_PASSWORD"] 33 | AUTH = (NEO4J_USERNAME, NEO4J_PASSWORD) 34 | 35 | OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") 36 | OPENAI_ENDPOINT = os.getenv("OPENAI_ENDPOINT") 37 | 38 | chat = ChatOpenAI(api_key=OPENAI_API_KEY, temperature=0, model="gpt-4o-mini") 39 | 40 | 41 | kg = Neo4jGraph( 42 | url=NEO4J_URI, 43 | username=NEO4J_USERNAME, 44 | password=NEO4J_PASSWORD, 45 | ) #database=NEO4J_DATABASE, 46 | 47 | # # # read the wikipedia page for the Roman Empire 48 | # raw_documents = WikipediaLoader(query="The Roman empire").load() 49 | 50 | # # # # # Define chunking strategy 51 | # text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=24) 52 | # documents = text_splitter.split_documents(raw_documents[:3]) 53 | # print(documents) 54 | 55 | # llm_transformer = LLMGraphTransformer(llm=chat) 56 | # graph_documents = llm_transformer.convert_to_graph_documents(documents) 57 | 58 | # # store to neo4j 59 | # res = kg.add_graph_documents( 60 | # graph_documents, 61 | # include_source=True, 62 | # baseEntityLabel=True, 63 | # ) 64 | 65 | # # MATCH (n) DETACH DELETE n - use this cyper command to delete the Graphs present in neo4j 66 | 67 | # Hybrid Retrieval for RAG 68 | # create vector index 69 | vector_index = Neo4jVector.from_existing_graph( 70 | OpenAIEmbeddings(), 71 | search_type="hybrid", 72 | node_label="Document", 73 | text_node_properties=["text"], 74 | embedding_node_property="embedding", 75 | ) 76 | 77 | 78 | # Extract entities from text 79 | class Entities(BaseModel): 80 | """Identifying information about entities.""" 81 | 82 | names: List[str] = Field( 83 | ..., 84 | description="All the person, organization, or business entities that " 85 | "appear in the text", 86 | ) 87 | 88 | 89 | prompt = ChatPromptTemplate.from_messages( 90 | [ 91 | ( 92 | "system", 93 | "You are extracting organization and person entities from the text.", 94 | ), 95 | ( 96 | "human", 97 | "Use the given format to 
extract information from the following " 98 | "input: {question}", 99 | ), 100 | ] 101 | ) 102 | entity_chain = prompt | chat.with_structured_output(Entities) 103 | 104 | # # Test it out: 105 | # res = entity_chain.invoke( 106 | # {"question": "In the year of 123 there was an emperor who did not like to rule"} 107 | # ).names 108 | # print(res) 109 | 110 | # Who is Caesar? 111 | # In the year of 123 there was an emperor who did not like to rule. 112 | 113 | # Retriever 114 | kg.query("CREATE FULLTEXT INDEX entity IF NOT EXISTS FOR (e:__Entity__) ON EACH [e.id]") 115 | 116 | 117 | def generate_full_text_query(input: str) -> str: 118 | """ 119 | Generate a full-text search query for a given input string. 120 | 121 | This function constructs a query string suitable for a full-text search. 122 | It processes the input string by splitting it into words and appending a 123 | similarity threshold (~2 changed characters) to each word, then combines 124 | them using the AND operator. Useful for mapping entities from user questions 125 | to database values, and allows for some misspellings. 
126 | """ 127 | full_text_query = "" 128 | words = [el for el in remove_lucene_chars(input).split() if el] 129 | for word in words[:-1]: 130 | full_text_query += f" {word}~2 AND" 131 | full_text_query += f" {words[-1]}~2" 132 | return full_text_query.strip() 133 | 134 | 135 | # Fulltext index query 136 | def structured_retriever(question: str) -> str: 137 | """ 138 | Collects the neighborhood of entities mentioned 139 | in the question 140 | """ 141 | result = "" 142 | entities = entity_chain.invoke({"question": question}) 143 | for entity in entities.names: 144 | print(f" Getting Entity: {entity}") 145 | response = kg.query( 146 | """CALL db.index.fulltext.queryNodes('entity', $query, {limit:2}) 147 | YIELD node,score 148 | CALL { 149 | WITH node 150 | MATCH (node)-[r:!MENTIONS]->(neighbor) 151 | RETURN node.id + ' - ' + type(r) + ' -> ' + neighbor.id AS output 152 | UNION ALL 153 | WITH node 154 | MATCH (node)<-[r:!MENTIONS]-(neighbor) 155 | RETURN neighbor.id + ' - ' + type(r) + ' -> ' + node.id AS output 156 | } 157 | RETURN output LIMIT 50 158 | """, 159 | {"query": generate_full_text_query(entity)}, 160 | ) 161 | # print(response) 162 | result += "\n".join([el["output"] for el in response]) 163 | return result 164 | 165 | 166 | # print(structured_retriever("Who is Aurelian?")) 167 | 168 | 169 | # Final retrieval step 170 | def retriever(question: str): 171 | print(f"Search query: {question}") 172 | structured_data = structured_retriever(question) 173 | unstructured_data = [ 174 | el.page_content for el in vector_index.similarity_search(question) 175 | ] 176 | final_data = f"""Structured data: 177 | {structured_data} 178 | Unstructured data: 179 | {"#Document ". 
join(unstructured_data)} 180 | """ 181 | print(f"\nFinal Data::: ==>{final_data}") 182 | return final_data 183 | 184 | 185 | # Define the RAG chain 186 | # Condense a chat history and follow-up question into a standalone question 187 | _template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, 188 | in its original language. 189 | Chat History: 190 | {chat_history} 191 | Follow Up Input: {question} 192 | Standalone question:""" # noqa: E501 193 | CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template) 194 | 195 | 196 | def _format_chat_history(chat_history: List[Tuple[str, str]]) -> List: 197 | buffer = [] 198 | for human, ai in chat_history: 199 | buffer.append(HumanMessage(content=human)) 200 | buffer.append(AIMessage(content=ai)) 201 | return buffer 202 | 203 | 204 | _search_query = RunnableBranch( 205 | # If input includes chat_history, we condense it with the follow-up question 206 | ( 207 | RunnableLambda(lambda x: bool(x.get("chat_history"))).with_config( 208 | run_name="HasChatHistoryCheck" 209 | ), # Condense follow-up question and chat into a standalone_question 210 | RunnablePassthrough.assign( 211 | chat_history=lambda x: _format_chat_history(x["chat_history"]) 212 | ) 213 | | CONDENSE_QUESTION_PROMPT 214 | | ChatOpenAI(temperature=0) 215 | | StrOutputParser(), 216 | ), 217 | # Else, we have no chat history, so just pass through the question 218 | RunnableLambda(lambda x: x["question"]), 219 | ) 220 | 221 | template = """Answer the question based only on the following context: 222 | {context} 223 | 224 | Question: {question} 225 | Use natural language and be concise. 
226 | Answer:""" 227 | prompt = ChatPromptTemplate.from_template(template) 228 | 229 | chain = ( 230 | RunnableParallel( 231 | { 232 | "context": _search_query | retriever, 233 | "question": RunnablePassthrough(), 234 | } 235 | ) 236 | | prompt 237 | | chat 238 | | StrOutputParser() 239 | ) 240 | 241 | # # TEST it all out! 242 | # res_simple = chain.invoke( 243 | # { 244 | # "question": "How did the Roman empire fall?", 245 | # } 246 | # ) 247 | 248 | # print(f"\n Results === {res_simple}\n\n") 249 | 250 | res_hist = chain.invoke( 251 | { 252 | "question": "When did he become the first emperor?", 253 | "chat_history": [ 254 | ("Who was the first emperor?", "Augustus was the first emperor.") 255 | ], 256 | } 257 | ) 258 | 259 | print(f"\n === {res_hist}\n\n") 260 | -------------------------------------------------------------------------------- /note-neo4j-creds.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # Wait 60 seconds before connecting using these details, or login to https://console.neo4j.io to validate the Aura Instance is available 4 | 5 | NEO4J_URI=neo4j+s://add-your-own.databases.neo4j.io 6 | NEO4J_USERNAME=neo4j 7 | NEO4J_PASSWORD=us-GREDDDDE-R-b0qMS1r0DYBj0jmochTnGYAtRo 8 | AURA_INSTANCEID=79f439b2 9 | AURA_INSTANCENAME=Instance01 10 | -------------------------------------------------------------------------------- /prep_text_for_rag/app.py: -------------------------------------------------------------------------------- 1 | from dotenv import load_dotenv 2 | import os 3 | from langchain_neo4j import Neo4jGraph 4 | from langchain_openai import ChatOpenAI 5 | 6 | load_dotenv() 7 | 8 | AURA_INSTANCENAME = os.environ["AURA_INSTANCENAME"] 9 | NEO4J_URI = os.environ["NEO4J_URI"] 10 | NEO4J_USERNAME = os.environ["NEO4J_USERNAME"] 11 | NEO4J_PASSWORD = os.environ["NEO4J_PASSWORD"] 12 | AUTH = (NEO4J_USERNAME, NEO4J_PASSWORD) 13 | 14 | OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") 15 | OPENAI_ENDPOINT = 
os.getenv("OPENAI_ENDPOINT") 16 | 17 | chat = ChatOpenAI(api_key=OPENAI_API_KEY) 18 | 19 | 20 | kg = Neo4jGraph( 21 | url=NEO4J_URI, 22 | username=NEO4J_USERNAME, 23 | password=NEO4J_PASSWORD, 24 | ) #database=NEO4J_DATABASE, 25 | 26 | # kg.query( 27 | # """ 28 | # CREATE VECTOR INDEX health_providers_embeddings IF NOT EXISTS 29 | # FOR (hp:HealthcareProvider) ON (hp.comprehensiveEmbedding) 30 | # OPTIONS { 31 | # indexConfig: { 32 | # `vector.dimensions`: 1536, 33 | # `vector.similarity_function`: 'cosine' 34 | # } 35 | # } 36 | # """ 37 | # ) 38 | 39 | # # test to see if the index was created 40 | # res = kg.query( 41 | # """ 42 | # SHOW VECTOR INDEXES 43 | # """ 44 | # ) 45 | # print(res) 46 | 47 | # kg.query( 48 | # """ 49 | # MATCH (hp:HealthcareProvider)-[:TREATS]->(p:Patient) 50 | # WHERE hp.bio IS NOT NULL 51 | # WITH hp, genai.vector.encode( 52 | # hp.bio, 53 | # "OpenAI", 54 | # { 55 | # token: $openAiApiKey, 56 | # endpoint: $openAiEndpoint 57 | # }) AS vector 58 | # WITH hp, vector 59 | # WHERE vector IS NOT NULL 60 | # CALL db.create.setNodeVectorProperty(hp, "comprehensiveEmbedding", vector) 61 | # """, 62 | # params={ 63 | # "openAiApiKey": OPENAI_API_KEY, 64 | # "openAiEndpoint": OPENAI_ENDPOINT, 65 | # }, 66 | # ) 67 | 68 | # result = kg.query( 69 | # """ 70 | # MATCH (hp:HealthcareProvider) 71 | # WHERE hp.bio IS NOT NULL 72 | # RETURN hp.bio, hp.name, hp.comprehensiveEmbedding 73 | # LIMIT 5 74 | # """ 75 | # ) 76 | # # loop through the results 77 | # for record in result: 78 | # print(f" bio: {record['hp.bio']}, name: {record['hp.name']}") 79 | 80 | # == Querying the graph for a healthcare provider 81 | question = "give me a list of healthcare providers in the area of dermatology" 82 | 83 | # # Execute the query 84 | result = kg.query( 85 | """ 86 | WITH genai.vector.encode( 87 | $question, 88 | "OpenAI", 89 | { 90 | token: $openAiApiKey, 91 | endpoint: $openAiEndpoint 92 | }) AS question_embedding 93 | CALL db.index.vector.queryNodes( 94 | 
'health_providers_embeddings', 95 | $top_k, 96 | question_embedding 97 | ) YIELD node AS healthcare_provider, score 98 | RETURN healthcare_provider.name, healthcare_provider.bio, score 99 | """, 100 | params={ 101 | "openAiApiKey": OPENAI_API_KEY, 102 | "openAiEndpoint": OPENAI_ENDPOINT, 103 | "question": question, 104 | "top_k": 3, 105 | }, 106 | ) 107 | 108 | # # # Print the encoded question vector for debugging 109 | # # print("Encoded question vector:", result) 110 | 111 | # # Print the result 112 | for record in result: 113 | print(f"Name: {record['healthcare_provider.name']}") 114 | print(f"Bio: {record['healthcare_provider.bio']}") 115 | # print(f"Specialization: {record['healthcare_provider.specialization']}") 116 | # print(f"Location: {record['healthcare_provider.location']}") 117 | print(f"Score: {record['score']}") 118 | print("---") 119 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | aiohappyeyeballs==2.6.1 2 | aiohttp==3.11.16 3 | aiosignal==1.3.2 4 | annotated-types==0.7.0 5 | anyio==4.9.0 6 | argon2-cffi==23.1.0 7 | argon2-cffi-bindings==21.2.0 8 | arrow==1.3.0 9 | asttokens==3.0.0 10 | async-lru==2.0.5 11 | attrs==25.3.0 12 | babel==2.17.0 13 | beautifulsoup4==4.13.3 14 | bleach==6.2.0 15 | certifi==2025.1.31 16 | cffi==1.17.1 17 | charset-normalizer==3.4.1 18 | colorama==0.4.6 19 | comm==0.2.2 20 | dataclasses-json==0.6.7 21 | debugpy==1.8.13 22 | decorator==5.2.1 23 | defusedxml==0.7.1 24 | distro==1.9.0 25 | executing==2.2.0 26 | fastjsonschema==2.21.1 27 | fqdn==1.5.1 28 | frozenlist==1.5.0 29 | fsspec==2024.12.0 30 | gitdb==4.0.12 31 | GitPython==3.1.44 32 | greenlet==3.1.1 33 | h11==0.14.0 34 | httpcore==1.0.7 35 | httpx==0.28.1 36 | httpx-sse==0.4.0 37 | idna==3.10 38 | ipykernel==6.29.5 39 | ipython==9.0.2 40 | ipython_pygments_lexers==1.1.1 41 | ipywidgets==8.1.5 42 | isoduration==20.11.0 43 | 
jedi==0.19.2 44 | Jinja2==3.1.6 45 | jiter==0.9.0 46 | json5==0.11.0 47 | json_repair==0.39.1 48 | jsonpatch==1.33 49 | jsonpointer==3.0.0 50 | jsonschema==4.23.0 51 | jsonschema-specifications==2024.10.1 52 | jupyter==1.1.1 53 | jupyter-console==6.6.3 54 | jupyter-events==0.12.0 55 | jupyter-lsp==2.2.5 56 | jupyter-server-mathjax==0.2.6 57 | jupyter_client==8.6.3 58 | jupyter_core==5.7.2 59 | jupyter_server==2.15.0 60 | jupyter_server_terminals==0.5.3 61 | jupyterlab==4.3.6 62 | jupyterlab_pygments==0.3.0 63 | jupyterlab_server==2.27.3 64 | jupyterlab_widgets==3.0.13 65 | langchain==0.3.23 66 | langchain-community==0.3.21 67 | langchain-core==0.3.51 68 | langchain-experimental==0.3.4 69 | langchain-neo4j==0.4.0 70 | langchain-openai==0.3.12 71 | langchain-text-splitters==0.3.8 72 | langsmith==0.3.24 73 | MarkupSafe==3.0.2 74 | marshmallow==3.26.1 75 | matplotlib-inline==0.1.7 76 | mistune==3.1.3 77 | multidict==6.3.2 78 | mypy-extensions==1.0.0 79 | nbclient==0.10.2 80 | nbconvert==7.16.6 81 | nbdime==4.0.2 82 | nbformat==5.10.4 83 | neo4j==5.28.1 84 | neo4j-graphrag==1.6.1 85 | nest-asyncio==1.6.0 86 | notebook==7.3.3 87 | notebook_shim==0.2.4 88 | numpy==2.2.4 89 | openai==1.70.0 90 | orjson==3.10.16 91 | overrides==7.7.0 92 | packaging==24.2 93 | pandas==2.2.3 94 | pandocfilters==1.5.1 95 | parso==0.8.4 96 | pexpect==4.9.0 97 | platformdirs==4.3.7 98 | prometheus_client==0.21.1 99 | prompt_toolkit==3.0.50 100 | propcache==0.3.1 101 | psutil==7.0.0 102 | ptyprocess==0.7.0 103 | pure_eval==0.2.3 104 | pycparser==2.22 105 | pydantic==2.11.2 106 | pydantic-settings==2.8.1 107 | pydantic_core==2.33.1 108 | Pygments==2.19.1 109 | pypdf==5.4.0 110 | python-dateutil==2.9.0.post0 111 | python-dotenv==1.1.0 112 | python-json-logger==3.3.0 113 | pytz==2025.2 114 | PyYAML==6.0.2 115 | pyzmq==26.3.0 116 | referencing==0.36.2 117 | regex==2024.11.6 118 | requests==2.32.3 119 | requests-toolbelt==1.0.0 120 | rfc3339-validator==0.1.4 121 | rfc3986-validator==0.1.1 122 | 
rpds-py==0.24.0 123 | Send2Trash==1.8.3 124 | setuptools==75.8.0 125 | six==1.17.0 126 | smmap==5.0.2 127 | sniffio==1.3.1 128 | soupsieve==2.6 129 | SQLAlchemy==2.0.40 130 | stack-data==0.6.3 131 | tenacity==9.1.2 132 | terminado==0.18.1 133 | tiktoken==0.9.0 134 | tinycss2==1.4.0 135 | tornado==6.4.2 136 | tqdm==4.67.1 137 | traitlets==5.14.3 138 | types-python-dateutil==2.9.0.20241206 139 | types-PyYAML==6.0.12.20250402 140 | typing-inspect==0.9.0 141 | typing-inspection==0.4.0 142 | typing_extensions==4.13.0 143 | tzdata==2025.2 144 | uri-template==1.3.0 145 | urllib3==2.3.0 146 | wcwidth==0.2.13 147 | webcolors==24.11.1 148 | webencodings==0.5.1 149 | websocket-client==1.8.0 150 | wheel==0.45.1 151 | widgetsnbextension==4.0.13 152 | wikipedia==1.4.0 153 | yarl==1.19.0 154 | zstandard==0.23.0 155 | -------------------------------------------------------------------------------- /simple_kg/kg_simple.py: -------------------------------------------------------------------------------- 1 | from dotenv import load_dotenv 2 | import os 3 | from neo4j import GraphDatabase 4 | 5 | load_dotenv() 6 | 7 | AURA_INSTANCENAME = os.environ["AURA_INSTANCENAME"] 8 | NEO4J_URI = os.environ["NEO4J_URI"] 9 | NEO4J_USERNAME = os.environ["NEO4J_USERNAME"] 10 | NEO4J_PASSWORD = os.environ["NEO4J_PASSWORD"] 11 | AUTH = (NEO4J_USERNAME, NEO4J_PASSWORD) 12 | 13 | driver = GraphDatabase.driver(NEO4J_URI, auth=AUTH) 14 | 15 | def connect_and_query(): 16 | try: 17 | with driver.session() as session: 18 | result = session.run("MATCH (n) RETURN count(n)") 19 | count = result.single().value() 20 | print(f"Number of nodes: {count}") 21 | except Exception as e: 22 | print(f"Error: {e}") 23 | finally: 24 | driver.close() 25 | 26 | def create_entities(tx): 27 | # Create Albert Einstein node 28 | tx.run("MERGE (a:Person {name: 'Albert Einstein'})") 29 | 30 | # Create other nodes 31 | tx.run("MERGE (p:Subject {name: 'Physics'})") 32 | tx.run("MERGE (n:NobelPrize {name: 'Nobel Prize in Physics'})") 
33 | tx.run("MERGE (g:Country {name: 'Germany'})") 34 | tx.run("MERGE (u:Country {name: 'USA'})") 35 | 36 | 37 | def create_relationships(tx): 38 | # Create studied relationship 39 | tx.run( 40 | """ 41 | MATCH (a:Person {name: 'Albert Einstein'}), (p:Subject {name: 'Physics'}) 42 | MERGE (a)-[:STUDIED]->(p) 43 | """ 44 | ) 45 | 46 | # Create won relationship 47 | tx.run( 48 | """ 49 | MATCH (a:Person {name: 'Albert Einstein'}), (n:NobelPrize {name: 'Nobel Prize in Physics'}) 50 | MERGE (a)-[:WON]->(n) 51 | """ 52 | ) 53 | 54 | # Create born in relationship 55 | tx.run( 56 | """ 57 | MATCH (a:Person {name: 'Albert Einstein'}), (g:Country {name: 'Germany'}) 58 | MERGE (a)-[:BORN_IN]->(g) 59 | """ 60 | ) 61 | 62 | # Create died in relationship 63 | tx.run( 64 | """ 65 | MATCH (a:Person {name: 'Albert Einstein'}), (u:Country {name: 'USA'}) 66 | MERGE (a)-[:DIED_IN]->(u) 67 | """ 68 | ) 69 | 70 | 71 | # Function to connect and run a simple Cypher query 72 | def query_graph_simple(cypher_query): 73 | driver = GraphDatabase.driver(NEO4J_URI, auth=AUTH) 74 | try: 75 | with driver.session() as session: #database=NEO4J_DATABASE 76 | result = session.run(cypher_query) 77 | for record in result: 78 | print(record["name"]) 79 | except Exception as e: 80 | print(f"Error: {e}") 81 | finally: 82 | driver.close() 83 | 84 | 85 | # Function to connect and run a Cypher query 86 | def query_graph(cypher_query): 87 | driver = GraphDatabase.driver(NEO4J_URI, auth=AUTH) 88 | try: 89 | with driver.session() as session: #database=NEO4J_DATABASE 90 | result = session.run(cypher_query) 91 | for record in result: 92 | print(record["path"]) 93 | except Exception as e: 94 | print(f"Error: {e}") 95 | finally: 96 | driver.close() 97 | 98 | 99 | def build_knowledge_graph(): 100 | # Open a session with the Neo4j database 101 | 102 | try: 103 | with driver.session() as session: #database=NEO4J_DATABASE 104 | # Create entities 105 | session.execute_write(create_entities) 106 | # Create relationships 
107 | session.execute_write(create_relationships) 108 | 109 | except Exception as e: 110 | print(f"Error: {e}") 111 | finally: 112 | driver.close() 113 | 114 | # if __name__ == "__main__": 115 | # build_knowledge_graph() 116 | 117 | # Cypher query to find paths related to Albert Einstein 118 | einstein_query = """ 119 | MATCH path=(a:Person {name: 'Albert Einstein'})-[:STUDIED]->(s:Subject) 120 | RETURN path 121 | UNION 122 | MATCH path=(a:Person {name: 'Albert Einstein'})-[:WON]->(n:NobelPrize) 123 | RETURN path 124 | UNION 125 | MATCH path=(a:Person {name: 'Albert Einstein'})-[:BORN_IN]->(g:Country) 126 | RETURN path 127 | UNION 128 | MATCH path=(a:Person {name: 'Albert Einstein'})-[:DIED_IN]->(u:Country) 129 | RETURN path 130 | """ 131 | 132 | # Simple Cypher query to find all node names 133 | simple_query = """ 134 | MATCH (n) 135 | RETURN n.name AS name 136 | """ 137 | 138 | if __name__ == "__main__": 139 | # Build the knowledge graph 140 | # build_knowledge_graph() 141 | 142 | # query_graph_simple( 143 | # simple_query 144 | # ) 145 | query_graph(einstein_query) 146 | 147 | 148 | # # Run this to see the entire graph in the neo4j browser/console 149 | # # MATCH (n)-[r]->(m) 150 | # # RETURN n, r, m; 151 | --------------------------------------------------------------------------------
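In `simple_kg/kg_simple.py`, `create_entities` and `create_relationships` issue one hand-written Cypher statement per node and relationship, with the values embedded as string literals. The sketch below is one possible alternative and is not part of the repository: it drives the same nodes and relationships from data and passes the names as query parameters. The names `ENTITIES`, `RELATIONSHIPS`, `ALLOWED_LABELS`, `ALLOWED_REL_TYPES`, `entity_statements`, `relationship_statements`, and `create_graph` are all hypothetical. It assumes only that `tx.run` accepts a query string followed by a parameter dictionary, as the Neo4j Python driver does.

```python
from typing import Any, Dict, List, Tuple

Statement = Tuple[str, Dict[str, Any]]

# Labels and relationship types are formatted into the query text rather than
# passed as $parameters, so they are checked against fixed allow-lists first.
ALLOWED_LABELS = {"Person", "Subject", "NobelPrize", "Country"}
ALLOWED_REL_TYPES = {"STUDIED", "WON", "BORN_IN", "DIED_IN"}

# Hypothetical data tables mirroring what kg_simple.py creates by hand.
ENTITIES = [
    ("Person", "Albert Einstein"),
    ("Subject", "Physics"),
    ("NobelPrize", "Nobel Prize in Physics"),
    ("Country", "Germany"),
    ("Country", "USA"),
]

# (source label, source name, relationship type, target label, target name)
RELATIONSHIPS = [
    ("Person", "Albert Einstein", "STUDIED", "Subject", "Physics"),
    ("Person", "Albert Einstein", "WON", "NobelPrize", "Nobel Prize in Physics"),
    ("Person", "Albert Einstein", "BORN_IN", "Country", "Germany"),
    ("Person", "Albert Einstein", "DIED_IN", "Country", "USA"),
]


def entity_statements(entities) -> List[Statement]:
    """Build one parameterized MERGE statement per (label, name) pair."""
    statements = []
    for label, name in entities:
        if label not in ALLOWED_LABELS:
            raise ValueError(f"Unexpected label: {label!r}")
        statements.append((f"MERGE (n:{label} {{name: $name}})", {"name": name}))
    return statements


def relationship_statements(relationships) -> List[Statement]:
    """Build one parameterized MATCH ... MERGE statement per relationship."""
    statements = []
    for src_label, src_name, rel_type, dst_label, dst_name in relationships:
        if src_label not in ALLOWED_LABELS or dst_label not in ALLOWED_LABELS:
            raise ValueError(f"Unexpected label in: {src_label!r}, {dst_label!r}")
        if rel_type not in ALLOWED_REL_TYPES:
            raise ValueError(f"Unexpected relationship type: {rel_type!r}")
        query = (
            f"MATCH (a:{src_label} {{name: $source}}), "
            f"(b:{dst_label} {{name: $target}}) "
            f"MERGE (a)-[:{rel_type}]->(b)"
        )
        statements.append((query, {"source": src_name, "target": dst_name}))
    return statements


def create_graph(tx) -> None:
    """Run all entity statements, then all relationship statements, on tx."""
    statements = entity_statements(ENTITIES) + relationship_statements(RELATIONSHIPS)
    for query, params in statements:
        tx.run(query, params)
```

With a real session this could be called as `session.execute_write(create_graph)`, in the same way `build_knowledge_graph` calls `create_entities`. Keeping the statement builders free of any driver dependency means they can be checked without a running database.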