├── README.md
├── docs
│   ├── README.md
│   ├── conceptnet_guide.md
│   ├── llm_interface_guide.md
│   └── sqlite_backend.md
├── examples
│   ├── HawkinDB_RAG.py
│   ├── basic_demo.py
│   ├── document.pdf
│   ├── file_rag.py
│   ├── hawkins_basic_demo.py
│   ├── hawkins_demo.py
│   ├── hawkinsdb_complete_example.py
│   ├── hawkinsdb_comprehensive.py
│   ├── hawkinsdb_demo.py
│   ├── hawkinsdb_full_example.py
│   ├── hawkinsdb_sqlite_example.py
│   └── sqlite_usage.py
├── hawkinsdb
│   ├── __init__.py
│   ├── base.py
│   ├── config.py
│   ├── core.py
│   ├── enrichment.py
│   ├── llm_interface.py
│   ├── openai_interface.py
│   ├── py.typed
│   ├── storage
│   │   ├── __init__.py
│   │   └── sqlite.py
│   └── types.py
├── setup.py
└── tests
    ├── __init__.py
    ├── document.pdf
    ├── file_rag.py
    ├── test_basic.py
    ├── test_conceptnet.py
    ├── test_enrichment.py
    ├── test_exmple_full.py
    ├── test_hawkinsdb_comprehensive.py
    ├── test_memory_specific.py
    ├── test_memory_types.py
    ├── test_openai.py
    ├── test_rag.py
    ├── test_readme_examples.py
    └── test_sqlite_storage.py

/README.md:
--------------------------------------------------------------------------------
1 | # 🧠 HawkinsDB: Neuroscience-Inspired Memory Layer for LLM Applications
2 | 
3 | Building smarter LLM applications isn't just about better models - it's about better memory. HawkinsDB is our take on giving AI systems a more human-like way to store and recall information, inspired by how our own brains work. Based on Jeff Hawkins' Thousand Brains Theory, it helps AI models manage complex information in a way that's both powerful and intuitive.
4 | 
5 | > 📌 **Note for RAG Users**: If you're specifically looking to implement Retrieval-Augmented Generation (RAG), consider using [HawkinsRAG](https://pypi.org/project/hawkins-rag/0.1.0/) - our dedicated package built on top of HawkinsDB that simplifies RAG implementation with support for 22+ data sources. Check out the [documentation](https://github.com/harishsg993010/HawkinsRAG/tree/main/docs) and [examples](https://github.com/harishsg993010/HawkinsRAG/tree/main/examples) for more details.
6 | 
7 | > 🤖 **Note for Agent Developers**: If you're interested in building AI agents, check out [Hawkins-Agent](https://pypi.org/project/hawkins-agent/) - our specialized framework built on HawkinsDB for creating intelligent agents. Visit our [GitHub repository](https://github.com/harishsg993010/HawkinsAgent) for implementation details.
8 | 
9 | ## Why HawkinsDB?
10 | 
11 | While vector databases and embeddings have revolutionized AI applications, they often miss the nuanced, multi-dimensional nature of information. Here's why we built HawkinsDB:
12 | 
13 | - **It's not just another vector database**: Instead of relying on fuzzy similarity searches, we enable precise, context-aware queries that understand the actual meaning and relationships of your data.
14 | 
15 | - **One memory system to rule them all**: We've unified different types of memory (semantic, episodic, and procedural) into a single framework. Think about a customer support AI that can simultaneously access product specs, past customer interactions, and troubleshooting guides - all working together seamlessly.
16 | 
17 | - **Inspired by the human brain**: We've based our architecture on neuroscience research, using concepts like Reference Frames and Cortical Columns to create a more robust and adaptable system.
18 | 
19 | - **You can actually understand what's happening**: Unlike black-box embeddings, our structured approach lets you see and understand how information is connected and why certain decisions are made.
20 | 
21 | ## Requirements
22 | 
23 | - Python 3.10 or higher
24 | - OpenAI API key (for LLM operations)
25 | - SQLite or JSON storage backend
26 | 
27 | ## Installation
28 | 
29 | ```bash
30 | # Basic installation
31 | pip install hawkinsdb
32 | 
33 | # Recommended installation with all features
34 | pip install hawkinsdb[all]
35 | 
36 | # Install specific features
37 | pip install hawkinsdb[conceptnet]  # ConceptNet tools
38 | ```
39 | 
40 | ## Quick Start
41 | 
42 | Here's a simple example showing the power of HawkinsDB:
43 | 
44 | ```python
45 | from hawkinsdb import HawkinsDB, LLMInterface
46 | 
47 | # Initialize
48 | db = HawkinsDB()
49 | llm = LLMInterface(db)
50 | 
51 | # Store knowledge with multiple perspectives
52 | db.add_entity({
53 |     "column": "Semantic",
54 |     "name": "Coffee Cup",
55 |     "properties": {
56 |         "type": "Container",
57 |         "material": "Ceramic",
58 |         "capacity": "350ml"
59 |     },
60 |     "relationships": {
61 |         "used_for": ["Drinking Coffee", "Hot Beverages"],
62 |         "found_in": ["Kitchen", "Coffee Shop"]
63 |     }
64 | })
65 | 
66 | # Query using natural language
67 | response = llm.query("What can you tell me about the coffee cup?")
68 | print(response)
69 | ```
70 | 
71 | For more examples, check out our [examples directory](examples).
72 | 
73 | ## How It Works
74 | 
75 | HawkinsDB manages information through three core concepts:
76 | 
77 | ### 🧩 Reference Frames
78 | Smart containers for information that capture what something is, its properties, relationships, and context. This enables natural handling of complex queries like "Find kitchen items related to coffee brewing."
79 | 
80 | ### 🌐 Cortical Columns
81 | Just like your brain processes information from multiple perspectives (visual, tactile, conceptual), our system stores knowledge in different "columns." This means an object isn't just stored as a single definition - it's understood from multiple angles.
82 | 
83 | ### 🗂️ Memory Types
84 | 
85 | We support three key types of memory, all using the same `add_entity` call (see the sketch after this list):
86 | 
87 | - **Semantic Memory**: For storing facts, concepts, and general knowledge
88 | - **Episodic Memory**: For keeping track of events and experiences over time
89 | - **Procedural Memory**: For capturing step-by-step processes and workflows
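Only the `column` field changes between memory types. A minimal sketch (the entity names, properties, and steps below are illustrative):

```python
import time

from hawkinsdb import HawkinsDB

db = HawkinsDB()

# Episodic: an event anchored to a point in time
db.add_entity({
    "column": "Episodic",
    "name": "first_brew",
    "properties": {
        "timestamp": time.time(),
        "action": "Brewed the first cup of coffee",
        "location": "Kitchen"
    }
})

# Procedural: a step-by-step workflow
db.add_entity({
    "column": "Procedural",
    "name": "brew_coffee",
    "properties": {
        "steps": [
            "Boil water",
            "Add grounds to filter",
            "Pour water over grounds",
            "Serve"
        ]
    }
})
```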
90 | 
91 | ### 💾 Storage Options
92 | 
93 | - **SQLite**: Rock-solid storage for production systems
94 | - **JSON**: Quick and easy for prototyping
95 | 
96 | ### 🔗 Smart Integrations
97 | ConceptNet integration for automatic knowledge enrichment and relationship discovery.
98 | 
99 | ## Contributing
100 | 
101 | We love contributions! Here's how to help:
102 | 
103 | 1. Fork the repository
104 | 2. Create your feature branch
105 | 3. Make your changes
106 | 4. Run the tests
107 | 5. Submit a pull request
108 | 
109 | ## Development
110 | 
111 | ```bash
112 | # Clone and set up
113 | git clone https://github.com/your-username/hawkinsdb.git
114 | cd hawkinsdb
115 | pip install -e ".[dev]"
116 | pytest tests/
117 | ```
118 | 
119 | ## 🗺️ Status and Roadmap
120 | 
121 | Currently under active development. Our focus areas:
122 | 
123 | - [ ] Enhanced multi-modal processing
124 | - [ ] Performance optimizations for large-scale deployments
125 | - [ ] Extended LLM provider support
126 | - [ ] Advanced querying capabilities
127 | - [ ] Improved documentation and examples
128 | 
129 | ## License
130 | 
131 | HawkinsDB is available under the MIT License. See [LICENSE](LICENSE) for details.
132 | 
133 | ---
134 | 
135 | Built by developers who believe memory matters: Harish Santhanalakshmi Ganesan, along with a few AI agents.
136 | 
--------------------------------------------------------------------------------
/docs/README.md:
--------------------------------------------------------------------------------
1 | # HawkinsDB Technical Documentation
2 | 
3 | ## Overview
4 | 
5 | HawkinsDB is a flexible memory system for storing semantic, episodic, and procedural memories, with built-in knowledge enrichment. It features SQLite backend support, LLM integration, and ConceptNet enrichment.
6 | 
7 | ## Table of Contents
8 | 1. [Installation](#installation)
9 | 2. [Core Features](#core-features)
10 | 3. [Basic Usage](#basic-usage)
11 | 4. [Memory Types](#memory-types)
12 | 5. [Storage Backend](#storage-backend)
13 | 6. [LLM Interface](#llm-interface)
14 | 7. [ConceptNet Integration](#conceptnet-integration)
15 | 8. [Error Handling](#error-handling)
16 | 9. [Best Practices](#best-practices)
17 | 
18 | ## Installation
19 | 
20 | ```bash
21 | pip install hawkinsdb
22 | ```
23 | 
24 | ## Core Features
25 | 
26 | - Multiple memory types (Semantic, Episodic, Procedural)
27 | - SQLite persistent storage with ACID compliance
28 | - Natural language interface via LLM
29 | - Automatic knowledge enrichment using ConceptNet
30 | - Property validation and type inference
31 | - Relationship management
32 | - Error handling and validation
33 | 
34 | ## Basic Usage
35 | 
36 | ### Initialization
37 | 
38 | ```python
39 | from hawkinsdb import HawkinsDB, LLMInterface
40 | 
41 | # Initialize with SQLite storage
42 | db = HawkinsDB(storage_type="sqlite", db_path="memory.db")
43 | llm = LLMInterface(db)
44 | ```
45 | 
46 | ### Adding Memories
47 | 
48 | ```python
49 | # Add semantic memory
50 | semantic_memory = {
51 |     "name": "cat",
52 |     "column": "Semantic",
53 |     "properties": {
54 |         "type": "animal",
55 |         "size": "medium",
56 |         "characteristics": ["furry", "agile", "carnivorous"]
57 |     },
58 |     "relationships": {
59 |         "habitat": ["homes", "outdoors"],
60 |         "behavior": ["hunting", "sleeping", "grooming"]
61 |     }
62 | }
63 | 
64 | result = db.add_entity(semantic_memory)
65 | response = llm.query(
66 |     "Explain the behaviors of cats"
67 | )
68 | 
69 | # Add episodic memory
70 | import time
71 | 
72 | episodic_memory = {
73 |     "name": "cat_observation",
74 |     "column": "Episodic",
75 |     "properties": {
76 |         "timestamp": time.time(),
77 |         "action": "Observed cat behavior",
78 |         "location": "Garden",
79 |         "details": "Cat was chasing a butterfly"
80 |     },
81 |     "relationships": {
82 |         "relates_to": ["cat"],
83 |         "observed_by": ["human"]
84 |     }
85 | }
86 | 
87 | result = db.add_entity(episodic_memory)
88 | ```
89 | 
90 | ### Querying Memories
91 | 
92 | ```python
93 | # Query specific entity
94 | cat_info = db.query_frames("cat")
95 | 
96 | # List all entities
97 | entities = db.list_entities()
98 | ```
99 | 
100 | ## Memory Types
101 | 
102 | ### Semantic Memory
103 | Stores conceptual knowledge and facts.
104 | 
105 | ```python
106 | semantic_data = {
107 |     "name": "Photosynthesis",
108 |     "column": "Semantic",
109 |     "properties": {
110 |         "type": "biological_process",
111 |         "location": "plant_cells",
112 |         "components": ["chlorophyll", "sunlight", "water", "carbon_dioxide"],
113 |         "products": ["glucose", "oxygen"]
114 |     },
115 |     "relationships": {
116 |         "occurs_in": ["plants", "algae"],
117 |         "requires": ["light_energy", "chloroplasts"],
118 |         "produces": ["chemical_energy", "organic_compounds"]
119 |     }
120 | }
121 | ```
122 | 
123 | ### Episodic Memory
124 | Stores event-based memories with temporal information.
125 | 
126 | ```python
127 | episodic_data = {
128 |     "name": "first_python_project",
129 |     "column": "Episodic",
130 |     "properties": {
131 |         "timestamp": time.time(),
132 |         "duration": "2 hours",
133 |         "location": "home_office",
134 |         "outcome": "successful"
135 |     },
136 |     "relationships": {
137 |         "involves": ["Python_Language"],
138 |         "followed_by": ["code_review"]
139 |     }
140 | }
141 | ```
142 | 
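### Procedural Memory
Stores step-by-step processes and workflows. A minimal sketch in the same `add_entity` format as the examples above; the procedure name, steps, and property values are illustrative:

```python
procedural_data = {
    "name": "morning_cat_care",
    "column": "Procedural",
    "properties": {
        "steps": [
            "Fill food bowl",
            "Refresh water",
            "Clean litter box"
        ],
        "frequency": "daily",
        "estimated_time": "10 minutes"
    },
    "relationships": {
        "applies_to": ["cat"],
        "requires": ["cat_food", "water"]
    }
}

result = db.add_entity(procedural_data)
```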
189 | """) 190 | print(f"Added entity: {result['entity_name']}") 191 | 192 | # Complex querying with context 193 | response = llm.query( 194 | "What are the main components of the respiratory system?", 195 | ) 196 | print(f"Response: {response}") 197 | 198 | 199 | 200 | 201 | ``` 202 | 203 | ## ConceptNet Integration 204 | 205 | ### Basic Enrichment 206 | 207 | ```python 208 | from hawkinsdb import ConceptNetEnricher 209 | 210 | # Initialize 211 | db = HawkinsDB() 212 | enricher = ConceptNetEnricher() 213 | 214 | # Add and enrich entity 215 | entity_data = { 216 | "name": "Dog", 217 | "column": "Semantic", 218 | "properties": { 219 | "type": "Animal", 220 | "category": "Pet" 221 | } 222 | } 223 | db.add_entity(entity_data) 224 | enriched_result = enricher.enrich_entity(db, "Dog", "Animal") 225 | ``` 226 | 227 | ### Custom Enrichment 228 | 229 | ```python 230 | class CustomEnricher(ConceptNetEnricher): 231 | def __init__(self): 232 | super().__init__() 233 | self.min_confidence = 0.7 234 | 235 | def filter_relations(self, relations): 236 | return [r for r in relations if r.weight >= self.min_confidence] 237 | 238 | # Use custom enricher 239 | custom_enricher = CustomEnricher() 240 | custom_enricher.enrich_entity(db, "Dog", "Animal") 241 | ``` 242 | 243 | ## Error Handling 244 | 245 | ```python 246 | from hawkinsdb import ValidationError 247 | 248 | try: 249 | # Add entity with validation 250 | result = db.add_entity({ 251 | "name": "Test", 252 | "column": "Semantic", 253 | "properties": { 254 | "age": "42" # Will be converted to integer 255 | } 256 | }) 257 | 258 | if result["success"]: 259 | print(f"Added: {result['entity_name']}") 260 | else: 261 | print(f"Error: {result['message']}") 262 | 263 | except ValidationError as e: 264 | print(f"Validation error: {str(e)}") 265 | except Exception as e: 266 | print(f"General error: {str(e)}") 267 | ``` 268 | 269 | ## Best Practices 270 | 271 | 1. Memory Organization 272 | - Use consistent naming conventions 273 | - Group related concepts 274 | - Include relevant metadata 275 | - Link memories using relationships 276 | 277 | 2. Performance Optimization 278 | - Use batch operations for multiple entities 279 | - Implement proper cleanup 280 | - Monitor memory usage 281 | - Cache frequently accessed data 282 | 283 | 3. Error Prevention 284 | - Validate data before adding 285 | - Implement proper error handling 286 | - Use type hints 287 | - Follow schema guidelines 288 | 289 | 4. Integration Tips 290 | - Test ConceptNet enrichment in development 291 | - Validate LLM responses 292 | - Monitor API usage 293 | - Keep security in mind 294 | 295 | For more detailed information about specific features, refer to the individual component guides in the documentation. 296 | -------------------------------------------------------------------------------- /docs/conceptnet_guide.md: -------------------------------------------------------------------------------- 1 | # ConceptNet Integration Guide 2 | 3 | ## Overview 4 | 5 | HawkinsDB's ConceptNet integration provides powerful knowledge enrichment capabilities by connecting to the ConceptNet knowledge graph. This guide explains how to effectively use this feature to enhance your semantic memories with common-sense knowledge and leverage the power of structured knowledge bases. 
6 | 7 | ## Features 8 | 9 | ### Core Capabilities 10 | - **Automatic concept enrichment**: Enhance entities with common-sense knowledge 11 | - **Property inference**: Discover new properties based on concept relationships 12 | - **Relationship discovery**: Identify and add meaningful connections 13 | - **Confidence scoring**: Quantify the reliability of enriched data 14 | - **Source tracking**: Monitor the origin of enriched information 15 | 16 | ### Key Benefits 17 | - Richer semantic understanding 18 | - Improved query capabilities 19 | - Better context awareness 20 | - Enhanced knowledge representation 21 | 22 | ## Basic Usage 23 | 24 | ### 1. Direct Enrichment 25 | 26 | ```python 27 | from hawkinsdb import HawkinsDB, ConceptNetEnricher 28 | 29 | # Initialize 30 | db = HawkinsDB() 31 | enricher = ConceptNetEnricher() 32 | 33 | # Add basic entity 34 | entity_data = { 35 | "name": "Dog", 36 | "column": "Semantic", 37 | "properties": { 38 | "type": "Animal", 39 | "category": "Pet" 40 | } 41 | } 42 | db.add_entity(entity_data) 43 | 44 | # Enrich the entity 45 | enriched_result = enricher.enrich_entity(db, "Dog", "Animal") 46 | print(f"Enrichment status: {enriched_result}") 47 | 48 | # Query enriched entity 49 | enriched_dog = db.query_frames("Dog") 50 | print("Enriched properties:", enriched_dog["Semantic"].properties) 51 | print("Enriched relationships:", enriched_dog["Semantic"].relationships) 52 | ``` 53 | 54 | ### 2. Automatic Enrichment via LLM Interface 55 | 56 | ```python 57 | from hawkinsdb import HawkinsDB, LLMInterface 58 | 59 | # Initialize with auto-enrichment 60 | db = HawkinsDB() 61 | llm = LLMInterface(db, auto_enrich=True) 62 | 63 | # Add entity with automatic enrichment 64 | result = llm.add_from_text( 65 | "A golden retriever is a friendly dog breed known for its golden coat" 66 | ) 67 | 68 | # Verify enrichment 69 | if result["success"]: 70 | entity_name = result["entity_name"] 71 | enriched_data = db.query_frames(entity_name) 72 | 73 | # Print enriched properties 74 | semantic_frame = enriched_data.get("Semantic") 75 | if semantic_frame: 76 | print("Enriched properties:", semantic_frame.properties) 77 | print("Added relationships:", semantic_frame.relationships) 78 | ``` 79 | 80 | ## Enrichment Process 81 | 82 | ### 1. 
Property Inference 83 | The enrichment process automatically discovers and adds relevant properties: 84 | 85 | a) Physical Characteristics 86 | ```python 87 | # Example of physical characteristics enrichment 88 | car_data = { 89 | "name": "Car", 90 | "column": "Semantic", 91 | "properties": {"type": "Vehicle"} 92 | } 93 | db.add_entity(car_data) 94 | enricher.enrich_properties(db, "Car", ["physical_attributes"]) 95 | ``` 96 | 97 | b) Common Behaviors 98 | ```python 99 | # Enriching with behavior information 100 | animal_data = { 101 | "name": "Cat", 102 | "column": "Semantic", 103 | "properties": {"type": "Pet"} 104 | } 105 | db.add_entity(animal_data) 106 | enricher.enrich_properties(db, "Cat", ["behaviors"]) 107 | ``` 108 | 109 | c) Typical Locations 110 | ```python 111 | # Location-based enrichment 112 | tool_data = { 113 | "name": "Hammer", 114 | "column": "Semantic", 115 | "properties": {"type": "Tool"} 116 | } 117 | db.add_entity(tool_data) 118 | enricher.enrich_properties(db, "Hammer", ["locations"]) 119 | ``` 120 | 121 | d) Related Concepts 122 | ```python 123 | # Concept relationship enrichment 124 | fruit_data = { 125 | "name": "Apple", 126 | "column": "Semantic", 127 | "properties": {"type": "Fruit"} 128 | } 129 | db.add_entity(fruit_data) 130 | enricher.enrich_properties(db, "Apple", ["related_concepts"]) 131 | ``` 132 | 133 | ### 2. Relationship Discovery 134 | The system automatically identifies and establishes various types of relationships: 135 | 136 | a) IsA Relationships 137 | ```python 138 | # Example of IsA relationship discovery 139 | computer_data = { 140 | "name": "Laptop", 141 | "column": "Semantic", 142 | "properties": { 143 | "type": "Device", 144 | "manufacturer": "Generic" 145 | } 146 | } 147 | db.add_entity(computer_data) 148 | enricher.enrich_relationships(db, "Laptop", relationship_types=["IsA"]) 149 | ``` 150 | 151 | b) HasA Relationships 152 | ```python 153 | # Discovering part-whole relationships 154 | car_data = { 155 | "name": "Car", 156 | "column": "Semantic", 157 | "properties": {"type": "Vehicle"} 158 | } 159 | db.add_entity(car_data) 160 | enricher.enrich_relationships(db, "Car", relationship_types=["HasA"]) 161 | ``` 162 | 163 | c) CapableOf Relationships 164 | ```python 165 | # Finding capability relationships 166 | robot_data = { 167 | "name": "Robot", 168 | "column": "Semantic", 169 | "properties": {"type": "Machine"} 170 | } 171 | db.add_entity(robot_data) 172 | enricher.enrich_relationships(db, "Robot", relationship_types=["CapableOf"]) 173 | ``` 174 | 175 | d) UsedFor Relationships 176 | ```python 177 | # Discovering utility relationships 178 | tool_data = { 179 | "name": "Screwdriver", 180 | "column": "Semantic", 181 | "properties": {"type": "Tool"} 182 | } 183 | db.add_entity(tool_data) 184 | enricher.enrich_relationships(db, "Screwdriver", relationship_types=["UsedFor"]) 185 | ``` 186 | 187 | ### 3. 
Confidence Scoring 188 | 189 | HawkinsDB implements a sophisticated confidence scoring system: 190 | 191 | a) ConceptNet Edge Weights 192 | ```python 193 | # Example of confidence-based filtering 194 | class CustomEnricher(ConceptNetEnricher): 195 | def __init__(self): 196 | super().__init__() 197 | self.min_confidence = 0.7 # Set minimum confidence threshold 198 | 199 | def filter_relations(self, relations): 200 | """Custom filtering of ConceptNet relations""" 201 | return [r for r in relations if r.weight >= self.min_confidence] 202 | ``` 203 | 204 | b) Multiple Source Validation 205 | ```python 206 | # Enrichment with multiple sources 207 | enricher = ConceptNetEnricher( 208 | validate_sources=True, 209 | min_sources=2 210 | ) 211 | enricher.enrich_entity(db, "Computer", "Device") 212 | ``` 213 | 214 | c) Context Relevance 215 | ```python 216 | # Context-aware enrichment 217 | enricher = ConceptNetEnricher( 218 | context_aware=True, 219 | domain="technology" 220 | ) 221 | enricher.enrich_entity(db, "Smartphone", "Device") 222 | ``` 223 | 224 | ## Advanced Usage 225 | 226 | ### 1. Custom Enrichment Rules 227 | 228 | ```python 229 | from hawkinsdb import ConceptNetEnricher 230 | 231 | class CustomEnricher(ConceptNetEnricher): 232 | def __init__(self): 233 | super().__init__() 234 | self.min_confidence = 0.7 # Set minimum confidence threshold 235 | 236 | def filter_relations(self, relations): 237 | """Custom filtering of ConceptNet relations""" 238 | return [r for r in relations if r.weight >= self.min_confidence] 239 | ``` 240 | 241 | ### 2. Selective Property Enrichment 242 | 243 | ```python 244 | # Enrich specific properties 245 | enricher.enrich_properties( 246 | db, 247 | entity_name="Car", 248 | properties=["parts", "capabilities", "location"] 249 | ) 250 | ``` 251 | 252 | ### 3. Batch Enrichment 253 | 254 | ```python 255 | # Enrich multiple related entities 256 | entities = ["Dog", "Cat", "Hamster"] 257 | entity_type = "Pet" 258 | 259 | for entity in entities: 260 | enricher.enrich_entity(db, entity, entity_type) 261 | ``` 262 | 263 | ## Best Practices 264 | 265 | 1. **Entity Preparation** 266 | - Provide clear entity types 267 | - Use consistent naming 268 | - Include basic properties 269 | 270 | 2. **Enrichment Strategy** 271 | - Start with core concepts 272 | - Enrich related entities 273 | - Validate enriched data 274 | - Monitor confidence scores 275 | 276 | 3. **Performance Optimization** 277 | - Batch similar entities 278 | - Cache common enrichments 279 | - Use selective enrichment 280 | - Set appropriate confidence thresholds 281 | 282 | ## Error Handling 283 | 284 | ```python 285 | try: 286 | enriched = enricher.enrich_entity(db, entity_name, entity_type) 287 | if enriched: 288 | print("Successfully enriched entity") 289 | 290 | # Verify enrichment 291 | result = db.query_frames(entity_name) 292 | if result: 293 | semantic_frame = result.get("Semantic") 294 | if semantic_frame: 295 | print("Enriched properties:", semantic_frame.properties) 296 | print("Enriched relationships:", semantic_frame.relationships) 297 | else: 298 | print("No enrichment data found") 299 | 300 | except Exception as e: 301 | print(f"Error during enrichment: {str(e)}") 302 | ``` 303 | 304 | ## Troubleshooting 305 | 306 | Common issues and solutions: 307 | 308 | 1. **No Enrichment Data** 309 | - Check entity name spelling 310 | - Verify entity type 311 | - Ensure ConceptNet connectivity 312 | - Check confidence thresholds 313 | 314 | 2. 
**Low Quality Enrichment** 315 | - Adjust confidence thresholds 316 | - Provide more specific entity types 317 | - Use custom filtering 318 | - Implement validation rules 319 | 320 | 3. **Performance Issues** 321 | - Use batch enrichment 322 | - Implement caching 323 | - Limit enrichment scope 324 | - Optimize query patterns 325 | 326 | ## Examples 327 | 328 | ### 1. Enriching a Technical Concept 329 | 330 | ```python 331 | # Add and enrich a technical concept 332 | computer_data = { 333 | "name": "Laptop", 334 | "column": "Semantic", 335 | "properties": { 336 | "type": "Computer", 337 | "category": "Device" 338 | } 339 | } 340 | db.add_entity(computer_data) 341 | enricher.enrich_entity(db, "Laptop", "Computer") 342 | ``` 343 | 344 | ### 2. Enriching a Natural Concept 345 | 346 | ```python 347 | # Add and enrich a natural concept 348 | tree_data = { 349 | "name": "Oak_Tree", 350 | "column": "Semantic", 351 | "properties": { 352 | "type": "Tree", 353 | "category": "Plant" 354 | } 355 | } 356 | db.add_entity(tree_data) 357 | enricher.enrich_entity(db, "Oak_Tree", "Tree") 358 | ``` 359 | 360 | ### 3. Enriching an Abstract Concept 361 | 362 | ```python 363 | # Add and enrich an abstract concept 364 | concept_data = { 365 | "name": "Happiness", 366 | "column": "Semantic", 367 | "properties": { 368 | "type": "Emotion", 369 | "category": "Feeling" 370 | } 371 | } 372 | db.add_entity(concept_data) 373 | enricher.enrich_entity(db, "Happiness", "Emotion") 374 | ``` 375 | 376 | For more examples and detailed API reference, see the main [documentation](README.md). 377 | -------------------------------------------------------------------------------- /docs/llm_interface_guide.md: -------------------------------------------------------------------------------- 1 | # LLM Interface Guide 2 | 3 | ## Overview 4 | 5 | The LLM Interface in HawkinsDB provides a natural language interface for interacting with the database. It enables seamless interaction with the memory system through natural language, including: 6 | - Adding new entities from text descriptions 7 | - Querying existing memories using natural language 8 | - Automatic property validation and type inference 9 | - Integration with ConceptNet enrichment 10 | - Confidence scoring for responses 11 | 12 | ## Features 13 | 14 | ### Core Capabilities 15 | 16 | 1. **Natural Language Entity Creation** 17 | - Convert unstructured text into structured memories 18 | - Automatic type inference for properties 19 | - Support for multiple memory types (Semantic, Episodic, Procedural) 20 | 21 | 2. **Intelligent Querying** 22 | - Natural language question answering 23 | - Context-aware responses 24 | - Multi-entity relationship understanding 25 | - Temporal query support for episodic memories 26 | 27 | 3. **Automatic Enrichment** 28 | - ConceptNet integration for knowledge expansion 29 | - Property inference from context 30 | - Relationship discovery 31 | - Source tracking 32 | 33 | 4. **Data Quality** 34 | - Confidence scoring for all properties 35 | - Automatic validation of property types 36 | - Inconsistency detection 37 | - Source attribution 38 | 39 | 5. **Integration Features** 40 | - Seamless connection with memory storage 41 | - Event tracking 42 | - Error handling 43 | - Query optimization 44 | 45 | ## Basic Usage 46 | 47 | ### 1. 
Initialization 48 | 49 | ```python 50 | from hawkinsdb import HawkinsDB, LLMInterface 51 | 52 | # Initialize database and LLM interface with auto-enrichment 53 | db = HawkinsDB() 54 | llm = LLMInterface(db, auto_enrich=True) 55 | 56 | # Or initialize without auto-enrichment for more control 57 | llm_manual = LLMInterface(db, auto_enrich=False) 58 | 59 | # Configure additional settings (optional) 60 | llm = LLMInterface( 61 | db, 62 | auto_enrich=True, 63 | confidence_threshold=0.7, # Minimum confidence for accepting properties 64 | max_enrichment_depth=2, # Maximum depth for ConceptNet enrichment 65 | validate_properties=True # Enable strict property validation 66 | ) 67 | ``` 68 | 69 | The LLM Interface provides a natural way to interact with HawkinsDB. When initialized: 70 | - It connects to your HawkinsDB instance 71 | - Sets up the natural language processing pipeline 72 | - Configures ConceptNet integration if auto-enrichment is enabled 73 | - Establishes validation rules for properties 74 | 75 | ### 2. Adding Entities from Text 76 | 77 | ```python 78 | # Add entity using natural language 79 | result = llm.add_from_text(""" 80 | A Tesla Model 3 is an electric car manufactured by Tesla. 81 | It has autopilot capabilities, a glass roof, and typically comes 82 | in various colors including red, white, and black. 83 | """) 84 | 85 | if result["success"]: 86 | print(f"Added entity: {result['entity_name']}") 87 | print(f"Enriched: {result['enriched']}") 88 | ``` 89 | 90 | ### 3. Querying with Natural Language 91 | 92 | ```python 93 | # Ask questions about stored entities 94 | response = llm.query("What features does the Tesla Model 3 have?") 95 | print(f"Answer: {response['response']}") 96 | 97 | # Query specific entity details 98 | details = llm.query_entity("Tesla_Model_3", include_metadata=True) 99 | print(f"Entity details: {details}") 100 | ``` 101 | 102 | ## Advanced Features 103 | 104 | ### 1. Property Validation 105 | 106 | ```python 107 | # The LLM interface automatically validates properties 108 | result = llm.add_from_text(""" 109 | The speed of light is approximately 299,792,458 meters per second. 110 | It is a fundamental physical constant represented by 'c'. 111 | """) 112 | 113 | # Properties are validated and properly typed 114 | print(result["entity_data"]["properties"]) 115 | ``` 116 | 117 | ### 2. Confidence Scoring 118 | 119 | ```python 120 | # Query with metadata to see confidence scores 121 | response = llm.query_entity( 122 | "Speed_of_Light", 123 | include_metadata=True 124 | ) 125 | 126 | # Check confidence scores for properties 127 | for prop, value in response["data"]["Semantic"]["properties"].items(): 128 | print(f"{prop}: {value[0]['confidence']}") 129 | ``` 130 | 131 | ### 3. Custom Entity Processing 132 | 133 | ```python 134 | from hawkinsdb.llm_interface import LLMInterface 135 | 136 | class CustomLLMInterface(LLMInterface): 137 | def _process_properties(self, properties): 138 | """Custom property processing""" 139 | processed = super()._process_properties(properties) 140 | # Add custom processing logic 141 | return processed 142 | ``` 143 | 144 | ## Best Practices 145 | 146 | ### 1. Input Formatting 147 | 148 | ```python 149 | # Good: Clear, specific descriptions 150 | result = llm.add_from_text(""" 151 | A MacBook Pro is a high-end laptop computer made by Apple. 
152 | It features: 153 | - Retina display 154 | - M1 or M2 processor 155 | - Up to 32GB RAM 156 | Location: Office desk 157 | """) 158 | 159 | # Bad: Vague or ambiguous descriptions 160 | result = llm.add_from_text("It's a computer that does stuff") 161 | ``` 162 | 163 | ### 2. Query Formulation 164 | 165 | ```python 166 | # Good: Specific, focused questions 167 | response = llm.query("What is the processor type in the MacBook Pro?") 168 | 169 | # Bad: Vague or compound questions 170 | response = llm.query("Tell me about computers and what they do") 171 | ``` 172 | 173 | ### 3. Error Handling 174 | 175 | ```python 176 | try: 177 | result = llm.add_from_text(text_description) 178 | if result["success"]: 179 | print(f"Added: {result['entity_name']}") 180 | if result["enriched"]: 181 | print("Entity was enriched with ConceptNet data") 182 | else: 183 | print(f"Error: {result['message']}") 184 | except Exception as e: 185 | print(f"Error processing text: {str(e)}") 186 | ``` 187 | 188 | ## Example Use Cases 189 | 190 | ### 1. Knowledge Base Population 191 | 192 | ```python 193 | # Add multiple related entities 194 | descriptions = [ 195 | "Python is a high-level programming language known for its readability", 196 | "JavaScript is a programming language used primarily for web development", 197 | "Java is a widely-used object-oriented programming language" 198 | ] 199 | 200 | for desc in descriptions: 201 | result = llm.add_from_text(desc) 202 | if result["success"]: 203 | print(f"Added programming language: {result['entity_name']}") 204 | ``` 205 | 206 | ### 2. Question-Answering System 207 | 208 | ```python 209 | # Build a simple QA system 210 | def answer_questions(questions): 211 | for question in questions: 212 | response = llm.query(question) 213 | if response["success"]: 214 | print(f"Q: {question}") 215 | print(f"A: {response['response']}") 216 | else: 217 | print(f"Could not answer: {question}") 218 | 219 | # Example usage 220 | questions = [ 221 | "What programming languages are in the database?", 222 | "What is Python used for?", 223 | "Compare JavaScript and Java" 224 | ] 225 | answer_questions(questions) 226 | ``` 227 | 228 | ### 3. Automated Documentation 229 | 230 | ```python 231 | # Generate structured documentation from text 232 | def document_system(description): 233 | # Add system description 234 | result = llm.add_from_text(description) 235 | if not result["success"]: 236 | return False 237 | 238 | # Query for important aspects 239 | components = llm.query("What are the main components?") 240 | features = llm.query("What are the key features?") 241 | requirements = llm.query("What are the system requirements?") 242 | 243 | return { 244 | "components": components["response"], 245 | "features": features["response"], 246 | "requirements": requirements["response"] 247 | } 248 | ``` 249 | 250 | ## Troubleshooting 251 | 252 | Common issues and solutions: 253 | 254 | 1. **Entity Not Added** 255 | - Check input text clarity 256 | - Verify required fields 257 | - Check validation rules 258 | - Review error messages 259 | 260 | 2. **Poor Query Responses** 261 | - Rephrase question 262 | - Check entity existence 263 | - Verify data completeness 264 | - Review context 265 | 266 | 3. 
**Performance Issues** 267 | - Batch similar operations 268 | - Optimize query patterns 269 | - Use caching when appropriate 270 | - Monitor API usage 271 | 272 | ## API Reference 273 | 274 | ### LLMInterface Methods 275 | 276 | ```python 277 | def add_entity(self, entity_json: Union[str, Dict]) -> Dict[str, Any]: 278 | """Add entity from structured data""" 279 | 280 | def add_from_text(self, text: str) -> Dict[str, Any]: 281 | """Add entity from natural language text""" 282 | 283 | def query(self, question: str) -> Dict[str, Any]: 284 | """Answer questions about entities""" 285 | 286 | def query_entity(self, name: str, include_metadata: bool = False) -> Dict[str, Any]: 287 | """Query specific entity details""" 288 | ``` 289 | 290 | For complete documentation and more examples, see the main [documentation](README.md). 291 | -------------------------------------------------------------------------------- /docs/sqlite_backend.md: -------------------------------------------------------------------------------- 1 | # Using SQLite Backend with HawkinsDB 2 | 3 | HawkinsDB supports SQLite as a persistent storage backend, providing robust data storage with ACID compliance. 4 | 5 | ## Configuration 6 | 7 | To use SQLite storage: 8 | 9 | ```python 10 | from hawkinsdb import HawkinsDB 11 | 12 | # Initialize database 13 | db = HawkinsDB() 14 | 15 | # Enable SQLite storage 16 | db.config.set_storage_backend('sqlite') 17 | 18 | # Optionally configure SQLite path (default: ./hawkins_memory.db) 19 | db.config.set_storage_path('path/to/your/database.db') 20 | ``` 21 | 22 | ## Key Features 23 | 24 | - **Persistent Storage**: Data remains available between sessions 25 | - **ACID Compliance**: Ensures data integrity 26 | - **Concurrent Access**: Safe for multi-threaded applications 27 | - **Automatic Schema Management**: Tables created and updated automatically 28 | 29 | ## Basic Operations 30 | 31 | ### Adding Entities 32 | 33 | ```python 34 | entity_data = { 35 | "name": "Tesla Model 3", 36 | "properties": { 37 | "color": "red", 38 | "year": 2023 39 | }, 40 | "relationships": { 41 | "located_in": ["garage"] 42 | } 43 | } 44 | 45 | result = db.add_entity(entity_data) 46 | ``` 47 | 48 | ### Querying Data 49 | 50 | ```python 51 | # Get all frames for an entity 52 | frames = db.query_frames("Tesla Model 3") 53 | 54 | # List all entities 55 | entities = db.list_entities() 56 | ``` 57 | 58 | ### Error Handling 59 | 60 | ```python 61 | try: 62 | result = db.add_entity(entity_data) 63 | except ValueError as e: 64 | print(f"Invalid data: {str(e)}") 65 | except Exception as e: 66 | print(f"Storage error: {str(e)}") 67 | ``` 68 | 69 | ## Advanced Usage 70 | 71 | ### Bulk Operations 72 | 73 | ```python 74 | entities = [ 75 | {"name": "Entity1", "properties": {...}}, 76 | {"name": "Entity2", "properties": {...}} 77 | ] 78 | 79 | for entity in entities: 80 | db.add_entity(entity) 81 | ``` 82 | 83 | ### Custom Queries 84 | 85 | The SQLite backend supports custom queries through the storage interface: 86 | 87 | ```python 88 | from hawkinsdb.storage import get_storage_backend 89 | 90 | storage = get_storage_backend('sqlite') 91 | storage.execute_query("SELECT * FROM entities WHERE name LIKE ?", ("%Tesla%",)) 92 | ``` 93 | 94 | ## Best Practices 95 | 96 | 1. **Enable SQLite Early**: Configure SQLite backend before any database operations 97 | 2. **Use Error Handling**: Always wrap database operations in try-except blocks 98 | 3. **Regular Backups**: SQLite files can be easily backed up by copying the database file 99 | 4. 
**Proper Cleanup**: Close database connections when finished:
100 | ```python
101 | db.cleanup()  # Closes connections and frees resources
102 | ```
103 | 
104 | ## Performance Considerations
105 | 
106 | - SQLite performs best with moderate-sized datasets
107 | - For very large datasets, consider using batch operations
108 | - Index frequently queried fields for better performance (see the sketch below)
109 | 
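A minimal sketch of adding an index through the custom-query interface shown under Custom Queries above. It assumes the storage backend exposes `execute_query` as illustrated there and that entity names live in an `entities` table; adjust the table and column names to the actual schema:

```python
from hawkinsdb.storage import get_storage_backend

storage = get_storage_backend('sqlite')

# Speed up name lookups such as the LIKE query shown earlier
# (table and column names are assumptions; verify against your schema)
storage.execute_query(
    "CREATE INDEX IF NOT EXISTS idx_entities_name ON entities (name)",
    ()
)
```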
110 | ## Troubleshooting
111 | 
112 | Common issues and solutions:
113 | 
114 | 1. **Database Locked**
115 |    - Ensure proper connection cleanup
116 |    - Reduce concurrent access if needed
117 | 
118 | 2. **Permission Errors**
119 |    - Check file and directory permissions
120 |    - Ensure write access to the database directory
121 | 
122 | 3. **Disk Space**
123 |    - Monitor available disk space
124 |    - Implement regular cleanup of unused data
125 | 
--------------------------------------------------------------------------------
/examples/HawkinDB_RAG.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import logging
4 | from typing import Dict, Any, Optional, List, Union
5 | from openai import OpenAI
6 | from hawkinsdb import HawkinsDB
7 | 
8 | os.environ["OPENAI_API_KEY"] = ""  # Set your OpenAI API key
9 | 
10 | logging.basicConfig(level=logging.INFO)
11 | logger = logging.getLogger(__name__)
12 | 
13 | class TextToHawkinsDB:
14 |     def __init__(self, api_key: Optional[str] = None):
15 |         """Initialize with OpenAI API key."""
16 |         self.api_key = api_key or os.getenv("OPENAI_API_KEY")
17 |         if not self.api_key:
18 |             raise ValueError("OpenAI API key is required")
19 |         self.client = OpenAI(api_key=self.api_key)
20 |         self.db = HawkinsDB(storage_type='sqlite')
21 | 
22 |     def text_to_json(self, text: str) -> Dict[str, Any]:
23 |         """Convert a text description to HawkinsDB-compatible JSON using an OpenAI chat model."""
24 |         prompt = """Convert the following text into a structured JSON format suitable for a memory database.
25 | 
26 |         Rules:
27 |         1. Extract key entity details, properties, and relationships
28 |         2. Use underscores for entity names (e.g., Python_Language)
29 |         3. Categorize memory as one of: Semantic, Episodic, or Procedural
30 |         4. Include relevant properties and relationships
31 | 
32 |         Required JSON format:
33 |         {
34 |             "column": "memory_type",
35 |             "name": "entity_name",
36 |             "properties": {
37 |                 "key1": "value1",
38 |                 "key2": ["value2a", "value2b"]
39 |             },
40 |             "relationships": {
41 |                 "related_to": ["entity1", "entity2"],
42 |                 "part_of": ["parent_entity"]
43 |             }
44 |         }
45 | 
46 |         Text to convert:
47 |         """
48 | 
49 |         try:
50 |             response = self.client.chat.completions.create(
51 |                 model="gpt-3.5-turbo",
52 |                 messages=[
53 |                     {"role": "system", "content": prompt},
54 |                     {"role": "user", "content": text}
55 |                 ],
56 |                 temperature=0.3,
57 |                 response_format={"type": "json_object"}
58 |             )
59 | 
60 |             json_str = response.choices[0].message.content
61 |             return json.loads(json_str)
62 | 
63 |         except Exception as e:
64 |             logger.error(f"Error converting text to JSON: {str(e)}")
65 |             raise
66 | 
67 |     def add_to_db(self, text: str) -> Dict[str, Any]:
68 |         """Convert text to JSON and add to HawkinsDB."""
69 |         try:
70 |             json_data = self.text_to_json(text)
71 |             logger.info(f"Converted JSON: {json.dumps(json_data, indent=2)}")
72 | 
73 |             result = self.db.add_entity(json_data)
74 |             return {
75 |                 "success": True,
76 |                 "message": "Successfully added to database",
77 |                 "entity_data": json_data,
78 |                 "db_result": result
79 |             }
80 | 
81 |         except Exception as e:
82 |             logger.error(f"Error adding to database: {str(e)}")
83 |             return {
84 |                 "success": False,
85 |                 "message": str(e),
86 |                 "entity_data": None,
87 |                 "db_result": None
88 |             }
89 | 
90 |     def query_entity(self, entity_name: str) -> Dict[str, Any]:
91 |         """Query specific entity by name."""
92 |         try:
93 |             frames = self.db.query_frames(entity_name)
94 |             if not frames:
95 |                 return {
96 |                     "success": False,
97 |                     "message": f"No entity found with name: {entity_name}",
98 |                     "data": None
99 |                 }
100 | 
101 |             return {
102 |                 "success": True,
103 |                 "message": "Entity found",
104 |                 "data": frames
105 |             }
106 | 
107 |         except Exception as e:
108 |             logger.error(f"Error querying entity: {str(e)}")
109 |             return {
110 |                 "success": False,
111 |                 "message": str(e),
112 |                 "data": None
113 |             }
114 | 
115 |     def query_by_text(self, query_text: str) -> Dict[str, Any]:
116 |         """Query database using natural language text."""
117 |         try:
118 |             # Get all entities for context
119 |             entities = self.db.list_entities()
120 |             if not entities:
121 |                 return {
122 |                     "success": True,
123 |                     "message": "Database is empty",
124 |                     "response": "No information available in the database."
125 |                 }
126 | 
127 |             # Build context from existing entities
128 |             context = []
129 |             for entity_name in entities[:5]:  # Limit to 5 most recent entities
130 |                 frames = self.db.query_frames(entity_name)
131 |                 if frames:
132 |                     context.append(json.dumps(frames, indent=2))
133 | 
134 |             # Create prompt with context
135 |             prompt = f"""You are a helpful assistant with access to a knowledge base.
136 |             Answer the following question based on this context:
137 | 
138 |             Context:
139 |             {' '.join(context)}
140 | 
141 |             Question: {query_text}
142 | 
143 |             Rules:
144 |             1. Only use information from the provided context
145 |             2. If information is not in the context, say so
146 |             3. Be specific and include details when available
147 |             4. Format numbers and dates clearly
148 |             """
149 | 
150 |             # Get response from the chat model
151 |             response = self.client.chat.completions.create(
152 |                 model="gpt-4o",
153 |                 messages=[
154 |                     {"role": "system", "content": prompt}
155 |                 ],
156 |                 temperature=0.3,
157 |                 max_tokens=500
158 |             )
159 | 
160 |             answer = response.choices[0].message.content
161 | 
162 |             return {
163 |                 "success": True,
164 |                 "message": "Query processed successfully",
165 |                 "response": answer
166 |             }
167 | 
168 |         except Exception as e:
169 |             logger.error(f"Error processing query: {str(e)}")
170 |             return {
171 |                 "success": False,
172 |                 "message": str(e),
173 |                 "response": None
174 |             }
175 | 
176 |     def list_all_entities(self) -> Dict[str, Any]:
177 |         """List all entities in the database."""
178 |         try:
179 |             entities = self.db.list_entities()
180 |             return {
181 |                 "success": True,
182 |                 "message": "Entities retrieved successfully",
183 |                 "entities": entities
184 |             }
185 |         except Exception as e:
186 |             logger.error(f"Error listing entities: {str(e)}")
187 |             return {
188 |                 "success": False,
189 |                 "message": str(e),
190 |                 "entities": None
191 |             }
192 | 
193 | def test_memory_examples():
194 |     """Test function to demonstrate usage."""
195 |     converter = TextToHawkinsDB()
196 | 
197 |     # Test adding entries
198 |     examples = [
199 |         """
200 |         Python is a programming language created by Guido van Rossum in 1991.
201 |         It supports object-oriented, imperative, and functional programming.
202 |         It's commonly used for web development, data science, and automation.
203 |         """,
204 |         """
205 |         Today I completed my first Python project in my home office.
206 |         It took 2 hours and was successful. I did a code review afterwards.
207 |         """,
208 |         """
209 |         The Tesla Model 3 is red, made in 2023, and parked in the garage.
210 |         It has a range of 358 miles and goes 0-60 mph in 3.1 seconds.
211 |         """
212 |     ]
213 | 
214 |     # Add examples to database
215 |     logger.info("\nAdding examples to database:")
216 |     for i, example in enumerate(examples, 1):
217 |         logger.info(f"\nAdding Example {i}")
218 |         logger.info("=" * 50)
219 |         result = converter.add_to_db(example)
220 |         logger.info(f"Result: {json.dumps(result, indent=2)}")
221 | 
222 |     # Test queries
223 |     logger.info("\nTesting queries:")
224 | 
225 |     # List all entities
226 |     logger.info("\nListing all entities:")
227 |     entities_result = converter.list_all_entities()
228 |     logger.info(f"Entities: {json.dumps(entities_result, indent=2)}")
229 | 
230 |     # Query specific entity
231 |     logger.info("\nQuerying specific entity:")
232 |     entity_result = converter.query_entity("Python_Language")
233 |     print(entity_result)
234 | 
235 |     # Test natural language queries
236 |     test_queries = [
237 |         "What programming language was created by Guido van Rossum?",
238 |         "Tell me about the Tesla Model 3's specifications.",
239 |         "What happened during the first Python project?"
240 | ] 241 | 242 | logger.info("\nTesting natural language queries:") 243 | for query in test_queries: 244 | logger.info(f"\nQuery: {query}") 245 | result = converter.query_by_text(query) 246 | logger.info(f"Response: {json.dumps(result, indent=2)}") 247 | 248 | if __name__ == "__main__": 249 | test_memory_examples() -------------------------------------------------------------------------------- /examples/basic_demo.py: -------------------------------------------------------------------------------- 1 | """Basic demonstration of HawkinsDB functionality.""" 2 | from hawkinsdb import HawkinsDB 3 | import time 4 | 5 | def main(): 6 | # Initialize the database 7 | print("Initializing HawkinsDB...") 8 | db = HawkinsDB() 9 | 10 | try: 11 | # Add a semantic memory 12 | print("\nAdding semantic memory...") 13 | cat_data = { 14 | "name": "Cat", 15 | "column": "Semantic", 16 | "properties": { 17 | "type": "Animal", 18 | "features": ["fur", "whiskers", "tail"], 19 | "diet": "carnivore" 20 | }, 21 | "relationships": { 22 | "preys_on": ["mice", "birds"], 23 | "related_to": ["tiger", "lion"] 24 | } 25 | } 26 | result = db.add_entity(cat_data) 27 | print(f"Semantic memory result: {result}") 28 | 29 | # Add an episodic memory 30 | print("\nAdding episodic memory...") 31 | event_data = { 32 | "name": "First Pet", 33 | "column": "Episodic", 34 | "properties": { 35 | "timestamp": str(time.time()), 36 | "action": "Got my first cat", 37 | "location": "Pet Store", 38 | "emotion": "happy", 39 | "participants": ["family", "pet store staff"] 40 | } 41 | } 42 | result = db.add_entity(event_data) 43 | print(f"Episodic memory result: {result}") 44 | 45 | # Add a procedural memory 46 | print("\nAdding procedural memory...") 47 | procedure_data = { 48 | "name": "Feed Cat", 49 | "column": "Procedural", 50 | "properties": { 51 | "steps": [ 52 | "Get cat food from cabinet", 53 | "Fill bowl with appropriate amount", 54 | "Add fresh water to water bowl", 55 | "Call cat for feeding" 56 | ], 57 | "frequency": "twice daily", 58 | "importance": "high" 59 | } 60 | } 61 | result = db.add_entity(procedure_data) 62 | print(f"Procedural memory result: {result}") 63 | 64 | # Query memories 65 | print("\nQuerying memories...") 66 | cat_memories = db.query_frames("Cat") 67 | print(f"Cat-related memories: {cat_memories}") 68 | 69 | feeding_memories = db.query_frames("Feed Cat") 70 | print(f"Feeding procedure: {feeding_memories}") 71 | 72 | # List all entities 73 | print("\nAll entities in database:") 74 | all_entities = db.list_entities() 75 | print(f"Entities: {all_entities}") 76 | 77 | except Exception as e: 78 | print(f"Error during demo: {str(e)}") 79 | raise 80 | 81 | finally: 82 | # Cleanup 83 | db.cleanup() 84 | print("\nDemo completed.") 85 | 86 | if __name__ == "__main__": 87 | main() -------------------------------------------------------------------------------- /examples/document.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/harishsg993010/HawkinsDB/5268d4ec11f55ab53f83d2c1ef29317901732e35/examples/document.pdf -------------------------------------------------------------------------------- /examples/file_rag.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | from typing import List, Dict, Any 4 | import PyPDF2 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | 8 | os.environ["OPENAI_API_KEY"]="" 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | 
logger = logging.getLogger(__name__) 12 | 13 | class PDFHawkinsRAG: 14 | def __init__(self, chunk_size: int = 500): 15 | """Initialize the RAG system.""" 16 | self.db = HawkinsDB(storage_type='sqlite',db_path="rag.db") 17 | self.llm_interface = LLMInterface(self.db,auto_enrich=True) 18 | self.chunk_size = chunk_size 19 | 20 | def extract_text_from_pdf(self, pdf_path: str) -> str: 21 | """Extract text content from a PDF file.""" 22 | try: 23 | with open(pdf_path, 'rb') as file: 24 | pdf_reader = PyPDF2.PdfReader(file) 25 | text = "" 26 | for page in pdf_reader.pages: 27 | text += page.extract_text() + "\n" 28 | return text 29 | except Exception as e: 30 | logger.error(f"Error extracting text from PDF: {str(e)}") 31 | raise 32 | 33 | def chunk_text(self, text: str, filename: str) -> List[Dict[str, Any]]: 34 | """Split text into chunks and prepare for database storage.""" 35 | chunks = [] 36 | words = text.split() 37 | current_chunk = [] 38 | chunk_number = 1 39 | 40 | for word in words: 41 | current_chunk.append(word) 42 | if len(current_chunk) >= self.chunk_size: 43 | chunk_text = " ".join(current_chunk) 44 | chunks.append({ 45 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 46 | "column": "Semantic", 47 | "properties": { 48 | "content": chunk_text, 49 | "source_file": filename, 50 | "chunk_number": chunk_number, 51 | }, 52 | "relationships": { 53 | "part_of": [filename], 54 | "next_chunk": [f"{Path(filename).stem}_chunk_{chunk_number + 1}"] if len(words) > self.chunk_size else [] 55 | } 56 | }) 57 | current_chunk = [] 58 | chunk_number += 1 59 | 60 | # Handle remaining text 61 | if current_chunk: 62 | chunk_text = " ".join(current_chunk) 63 | chunks.append({ 64 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 65 | "column": "Semantic", 66 | "properties": { 67 | "content": chunk_text, 68 | "source_file": filename, 69 | "chunk_number": chunk_number, 70 | }, 71 | "relationships": { 72 | "part_of": [filename] 73 | } 74 | }) 75 | 76 | return chunks 77 | 78 | def ingest_pdf(self, pdf_path: str) -> Dict[str, Any]: 79 | """Process and store PDF content in the database.""" 80 | try: 81 | # Extract text from PDF 82 | logger.info(f"Processing PDF: {pdf_path}") 83 | text = self.extract_text_from_pdf(pdf_path) 84 | 85 | # Create document metadata 86 | filename = Path(pdf_path).name 87 | doc_metadata = { 88 | "name": Path(pdf_path).stem, 89 | "column": "Semantic", 90 | "properties": { 91 | "file_type": "PDF", 92 | "file_path": pdf_path, 93 | "file_name": filename, 94 | }, 95 | "relationships": { 96 | "contains": [] 97 | } 98 | } 99 | 100 | # Store document metadata 101 | self.db.add_entity(doc_metadata) 102 | 103 | # Process and store chunks 104 | chunks = self.chunk_text(text, filename) 105 | chunk_names = [] 106 | for chunk in chunks: 107 | self.db.add_entity(chunk) 108 | chunk_names.append(chunk["name"]) 109 | 110 | # Update document metadata with chunk references 111 | doc_metadata["relationships"]["contains"] = chunk_names 112 | self.db.add_entity(doc_metadata) 113 | 114 | return { 115 | "success": True, 116 | "message": f"Successfully processed {filename}", 117 | "chunks_created": len(chunks) 118 | } 119 | 120 | except Exception as e: 121 | logger.error(f"Error ingesting PDF: {str(e)}") 122 | return { 123 | "success": False, 124 | "message": str(e) 125 | } 126 | 127 | def query(self, question: str) -> Dict[str, Any]: 128 | """Query the knowledge base with context from stored documents.""" 129 | try: 130 | return self.llm_interface.query(question) 131 | except Exception as e: 132 | 
logger.error(f"Error processing query: {str(e)}") 133 | return { 134 | "success": False, 135 | "message": str(e), 136 | "response": None 137 | } 138 | 139 | def test_pdf_rag(): 140 | """Test the PDF RAG system.""" 141 | # Initialize the system 142 | rag = PDFHawkinsRAG(chunk_size=500) 143 | 144 | # Test with sample PDF 145 | pdf_path = r"C:\Users\haris\Desktop\personal\AI-Agent\Hawin\tests\document.pdf" # Replace with actual PDF path 146 | if os.path.exists(pdf_path): 147 | # Ingest PDF 148 | logger.info("Ingesting PDF...") 149 | result = rag.ingest_pdf(pdf_path) 150 | logger.info(f"Ingestion result: {result}") 151 | 152 | if result["success"]: 153 | # Test queries 154 | test_queries = [ 155 | "What is the main topic of the document?", 156 | "Summarize the key points from the document.", 157 | "What are the main conclusions drawn in the document?", 158 | "what is silha center", 159 | "who is Charlotte Higgins", 160 | "Explain the lawsuits", 161 | "Explain OpenAI's Involvement", 162 | "who is Mike Masnick" 163 | ] 164 | 165 | logger.info("\nTesting queries:") 166 | for query in test_queries: 167 | logger.info(f"\nQuery: {query}") 168 | response = rag.query(query) 169 | logger.info(f"Response: {response}") 170 | else: 171 | logger.error(f"PDF file not found: {pdf_path}") 172 | 173 | if __name__ == "__main__": 174 | test_pdf_rag() -------------------------------------------------------------------------------- /examples/hawkins_basic_demo.py: -------------------------------------------------------------------------------- 1 | """Basic demonstration of HawkinsDB functionality.""" 2 | import time 3 | import logging 4 | from hawkinsdb.core import HawkinsDB 5 | from hawkinsdb.enrichment import ConceptNetEnricher 6 | 7 | # Set up logging 8 | logging.basicConfig(level=logging.INFO) 9 | logger = logging.getLogger(__name__) 10 | 11 | def main(): 12 | """Run the basic HawkinsDB demonstration.""" 13 | print("\nStarting HawkinsDB Basic Demo...") 14 | 15 | # Initialize HawkinsDB with SQLite storage 16 | db = HawkinsDB(storage_type="sqlite", db_path="demo_basic.db") 17 | 18 | # Create a semantic memory 19 | cat_data = { 20 | "name": "cat", 21 | "column": "Semantic", 22 | "properties": { 23 | "type": "animal", 24 | "size": "medium", 25 | "characteristics": ["furry", "agile", "carnivorous"] 26 | }, 27 | "relationships": { 28 | "habitat": ["homes", "outdoors"], 29 | "behavior": ["hunting", "sleeping", "grooming"] 30 | } 31 | } 32 | 33 | # Add basic semantic memory 34 | print("\nAdding basic semantic memory for 'cat'...") 35 | result = db.add_entity(cat_data) 36 | print(f"Result: {result}") 37 | 38 | # Add episodic memory 39 | current_time = time.time() 40 | episode = { 41 | "name": "cat_observation", 42 | "column": "Episodic", 43 | "properties": { 44 | "timestamp": current_time, 45 | "action": "Observed cat behavior", 46 | "location": "Garden", 47 | "details": "Cat was chasing a butterfly" 48 | }, 49 | "relationships": { 50 | "relates_to": ["cat"], 51 | "observed_by": ["human"] 52 | } 53 | } 54 | 55 | print("\nAdding episodic memory...") 56 | result = db.add_entity(episode) 57 | print(f"Result: {result}") 58 | 59 | # Demonstrate ConceptNet enrichment 60 | print("\nEnriching 'cat' with ConceptNet knowledge...") 61 | enricher = ConceptNetEnricher() 62 | enriched = enricher.enrich_entity(db, "cat", "cat") 63 | print("Enrichment completed") 64 | 65 | # Query and display results 66 | print("\nQuerying semantic memory for 'cat':") 67 | cat_info = db.query_frames("cat") 68 | print(cat_info) 69 | print(f"Retrieved 
information: {cat_info}")
70 | 
71 |     print("\nQuerying episodic memory:")
72 |     episode_info = db.query_frames("cat_observation")
73 |     print(f"Retrieved episode: {episode_info}")
74 | 
75 |     print("\nDemo completed successfully!")
76 | 
77 | if __name__ == "__main__":
78 |     main()
79 | 
--------------------------------------------------------------------------------
/examples/hawkins_demo.py:
--------------------------------------------------------------------------------
1 | """
2 | Comprehensive demonstration of HawkinsDB with both JSON and SQLite backends.
3 | """
4 | import logging
5 | import os
6 | import time
7 | from hawkinsdb import HawkinsDB, LLMInterface
8 | 
9 | # Set your OpenAI API key (never commit real keys to source control)
10 | os.environ["OPENAI_API_KEY"] = ""
11 | 
12 | logging.basicConfig(level=logging.INFO)
13 | logger = logging.getLogger(__name__)
14 | 
15 | def demonstrate_memory_operations(db: HawkinsDB):
16 |     """Demonstrate core memory operations with different memory types."""
17 | 
18 |     # Add semantic memory
19 |     logger.info("Adding semantic memory...")
20 |     db.add_entity({
21 |         "name": "Computer",
22 |         "column": "Semantic",
23 |         "properties": {
24 |             "type": "Electronic_Device",
25 |             "purpose": "Computing",
26 |             "components": ["CPU", "RAM", "Storage"],
27 |             "power_source": "Electricity"
28 |         },
29 |         "relationships": {
30 |             "found_in": ["Office", "Home"],
31 |             "used_for": ["Work", "Entertainment"]
32 |         },
33 |         "metadata": {
34 |             "confidence": 1.0,
35 |             "source": "manual",
36 |             "timestamp": time.time()
37 |         }
38 |     })
39 | 
40 |     # Add episodic memory
41 |     logger.info("Adding episodic memory...")
42 |     db.add_entity({
43 |         "name": "First_Computer_Use",
44 |         "column": "Episodic",
45 |         "action": "Setting up new computer",  # Required field for episodic memory
46 |         "properties": {
47 |             "timestamp": str(time.time()),
48 |             "location": "Home Office",
49 |             "duration": "2 hours",
50 |             "outcome": "Success",
51 |             "details": "Initial setup and configuration of the computer"
52 |         },
53 |         "relationships": {
54 |             "involves": ["Computer"],
55 |             "participants": ["User"],
56 |             "next_action": "Software Installation"
57 |         },
58 |         "metadata": {
59 |             "confidence": 1.0,
60 |             "source": "manual",
61 |             "timestamp": time.time()
62 |         }
63 |     })
64 | 
65 |     # Add procedural memory
66 |     logger.info("Adding procedural memory...")
67 |     db.add_entity({
68 |         "name": "Computer_Startup",
69 |         "column": "Procedural",
70 |         "properties": {
71 |             "steps": [
72 |                 "Press power button",
73 |                 "Wait for boot sequence",
74 |                 "Login to account",
75 |                 "Check system status"
76 |             ],
77 |             "difficulty": "Easy",
78 |             "estimated_time": "2 minutes"
79 |         },
80 |         "relationships": {
81 |             "requires": ["Computer"],
82 |             "prerequisites": ["Power_Supply"]
83 |         }
84 |     })
85 | 
86 |     # Query and display results
87 |     logger.info("\nQuerying memories...")
88 |     for entity_name in ["Computer", "First_Computer_Use", "Computer_Startup"]:
89 |         result = db.query_frames(entity_name)
90 |         logger.info(f"\nMemory frames for '{entity_name}':")
91 |         for column, frame in result.items():
92 |             logger.info(f"Column: {column}")
93 |             logger.info(f"Properties: {frame.properties}")
94 |             logger.info(f"Relationships: {frame.relationships}")
95 | 
96 | def demonstrate_llm_interface(db: HawkinsDB):
97 |     """Demonstrate LLM interface capabilities."""
98 |     logger.info("\n=== Testing LLM Interface ===")
99 | 
100 |     # Initialize LLM interface with auto-enrichment
llm = LLMInterface(db, auto_enrich=True) 102 | 103 | # Add entity using natural language 104 | description = """ 105 | Create a semantic memory with name MacBookPro_M2: 106 | - Type: Computer 107 | - Brand: Apple 108 | - Model: MacBook Pro 16" 109 | - Specifications: M2 chip, 32GB RAM, 1TB storage 110 | - Location: Office 111 | - Primary uses: Software development, Video editing 112 | """ 113 | 114 | logger.info("Adding entity using natural language...") 115 | result = llm.add_from_text(description) 116 | logger.info(f"LLM Add Result: {result}") 117 | 118 | # Query using natural language 119 | queries = [ 120 | "What are the specifications of the MacBook Pro?", 121 | "What memory types are stored in the database?", 122 | "How to start a computer according to the stored procedure?" 123 | ] 124 | 125 | for query in queries: 126 | logger.info(f"\nQuerying: {query}") 127 | response = llm.query(query) 128 | logger.info(f"Response: {response}") 129 | 130 | def main(): 131 | """Run the comprehensive demonstration.""" 132 | try: 133 | # Clean up any existing test files 134 | for file in ["demo_json.json", "demo_sqlite.db"]: 135 | if os.path.exists(file): 136 | os.remove(file) 137 | 138 | # Demo with JSON storage 139 | logger.info("\n=== Testing JSON Storage Backend ===") 140 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 141 | demonstrate_memory_operations(json_db) 142 | json_db.cleanup() 143 | 144 | # Demo with SQLite storage 145 | logger.info("\n=== Testing SQLite Storage Backend ===") 146 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 147 | demonstrate_memory_operations(sqlite_db) 148 | 149 | # Test LLM interface with SQLite backend 150 | demonstrate_llm_interface(sqlite_db) 151 | sqlite_db.cleanup() 152 | 153 | logger.info("\nDemonstration completed successfully!") 154 | 155 | except Exception as e: 156 | logger.error(f"Demonstration failed: {str(e)}") 157 | raise 158 | 159 | if __name__ == "__main__": 160 | main() -------------------------------------------------------------------------------- /examples/hawkinsdb_complete_example.py: -------------------------------------------------------------------------------- 1 | """Complete example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import os 3 | import logging 4 | from datetime import datetime 5 | from hawkinsdb import HawkinsDB 6 | 7 | # Configure logging 8 | logging.basicConfig(level=logging.INFO) 9 | logger = logging.getLogger(__name__) 10 | 11 | def setup_database(): 12 | """Initialize HawkinsDB with SQLite backend.""" 13 | try: 14 | # Initialize with SQLite backend 15 | db = HawkinsDB(storage_type='sqlite', db_path='example.db') 16 | logger.info("Database initialized successfully") 17 | return db 18 | except Exception as e: 19 | logger.error(f"Failed to initialize database: {str(e)}") 20 | raise 21 | 22 | def add_example_semantic_memory(db): 23 | """Add semantic memory examples.""" 24 | try: 25 | # Example semantic memory 26 | car_data = { 27 | "name": "Tesla Model 3", 28 | "properties": { 29 | "color": "red", 30 | "year": 2023, 31 | "features": ["autopilot", "glass roof"] 32 | }, 33 | "relationships": { 34 | "manufactured_by": ["Tesla"], 35 | "located_in": ["garage"] 36 | } 37 | } 38 | 39 | result = db.add_entity(car_data) 40 | logger.info(f"Added semantic memory: {result}") 41 | 42 | except ValueError as ve: 43 | logger.error(f"Validation error: {str(ve)}") 44 | except Exception as e: 45 | logger.error(f"Error adding semantic memory: {str(e)}") 46 | 47 | def 
add_example_episodic_memory(db): 48 | """Add episodic memory examples.""" 49 | try: 50 | # Example episodic memory with proper timestamp format 51 | current_time = datetime.now().isoformat() 52 | test_drive = { 53 | "name": "first_drive", 54 | "column": "Episodic", 55 | "properties": { 56 | "timestamp": current_time, 57 | "action": "test drive", 58 | "duration": "45 minutes" 59 | }, 60 | "relationships": { 61 | "involves": ["Tesla Model 3"], 62 | "location": ["dealership"] 63 | } 64 | } 65 | 66 | # add_entity routes the record to the Episodic column via its "column" field 67 | db.add_entity(test_drive) 68 | logger.info("Added episodic memory successfully") 69 | 70 | except Exception as e: 71 | logger.error(f"Error adding episodic memory: {str(e)}") 72 | 73 | def query_and_display_memory(db): 74 | """Query and display stored memories.""" 75 | try: 76 | # List all entities 77 | entities = db.list_entities() 78 | logger.info(f"\nStored entities: {entities}") 79 | 80 | # Query specific memories 81 | for entity_name in entities: 82 | frames = db.query_frames(entity_name) 83 | logger.info(f"\nMemory frames for '{entity_name}':") 84 | 85 | for column_name, frame in frames.items(): 86 | logger.info(f"\nColumn: {column_name}") 87 | logger.info(f"Properties: {frame.properties}") 88 | logger.info(f"Relationships: {frame.relationships}") 89 | if frame.history: 90 | logger.info("History:") 91 | for timestamp, event in frame.history: 92 | logger.info(f" {timestamp}: {event}") 93 | 94 | except Exception as e: 95 | logger.error(f"Error querying memory: {str(e)}") 96 | 97 | def main(): 98 | """Main execution function demonstrating HawkinsDB usage.""" 99 | try: 100 | # Setup database 101 | db = setup_database() 102 | 103 | # Add different types of memories 104 | add_example_semantic_memory(db) 105 | add_example_episodic_memory(db) 106 | 107 | # Query and display stored memories 108 | query_and_display_memory(db) 109 | 110 | except Exception as e: 111 | logger.error(f"Application error: {str(e)}") 112 | finally: 113 | if 'db' in locals(): 114 | db.cleanup() 115 | 116 | if __name__ == '__main__': 117 | main() 118 | -------------------------------------------------------------------------------- /examples/hawkinsdb_comprehensive.py: -------------------------------------------------------------------------------- 1 | """Comprehensive example demonstrating all major features of HawkinsDB.""" 2 | import logging 3 | import time 4 | from datetime import datetime 5 | from typing import Dict, Any 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | from hawkinsdb.types import CorticalColumn, ReferenceFrame, PropertyCandidate 8 | import json 9 | import os 10 | os.environ.setdefault("OPENAI_API_KEY", "your-openai-api-key")  # placeholder; never commit real keys 11 | 12 | # Configure logging 13 | logging.basicConfig(level=logging.INFO, 14 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 15 | logger = logging.getLogger(__name__) 16 | 17 | def demonstrate_basic_operations(): 18 | """Demonstrate basic database operations with both JSON and SQLite backends.""" 19 | # Initialize databases with proper configuration 20 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 21 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 22 | 23 | # Test data - A Tesla car entity 24 | car_data = { 25 | "name": 
"Tesla_Model_3", 26 | "column": "Semantic", 27 | "properties": { 28 | "color": "red", 29 | "year": "2023", 30 | "mileage": "1000 miles", 31 | "features": ["autopilot capabilities", "glass roof"], 32 | "type": "electric vehicle" 33 | }, 34 | "relationships": { 35 | "type_of": ["Vehicle"], 36 | "location": "garage" 37 | } 38 | } 39 | 40 | # Add to both databases 41 | logger.info("\nAdding car entity to JSON database...") 42 | json_db.add_entity(car_data) 43 | 44 | logger.info("\nAdding car entity to SQLite database...") 45 | sqlite_db.add_entity(car_data) 46 | 47 | # Query and verify data from both databases 48 | logger.info("\nQuerying JSON database...") 49 | json_result = json_db.query_frames("Tesla_Model_3") 50 | if isinstance(json_result, dict): 51 | formatted_json = {k: v.to_dict() if hasattr(v, 'to_dict') else str(v) 52 | for k, v in json_result.items()} 53 | logger.info(f"JSON DB Result: {json.dumps(formatted_json, indent=2)}") 54 | else: 55 | logger.info(f"JSON DB Result: {str(json_result)}") 56 | 57 | logger.info("\nQuerying SQLite database...") 58 | sqlite_result = sqlite_db.query_frames("Tesla_Model_3") 59 | if isinstance(sqlite_result, dict): 60 | formatted_sqlite = {k: v.to_dict() if hasattr(v, 'to_dict') else str(v) 61 | for k, v in sqlite_result.items()} 62 | logger.info(f"SQLite DB Result: {json.dumps(formatted_sqlite, indent=2)}") 63 | else: 64 | logger.info(f"SQLite DB Result: {str(sqlite_result)}") 65 | 66 | # Clean up resources 67 | 68 | return json_db, sqlite_db 69 | 70 | def demonstrate_conceptnet_enrichment(db: HawkinsDB): 71 | """Demonstrate ConceptNet integration and knowledge enrichment.""" 72 | logger.info("\n=== Demonstrating ConceptNet Integration ===") 73 | 74 | # Initialize ConceptNet interface 75 | from hawkinsdb import ConceptNetEnricher 76 | 77 | # Initialize enricher with the database instance 78 | enricher = ConceptNetEnricher() # ConceptNet's public API doesn't require a key 79 | 80 | # Create an entity with basic information 81 | entity_name = "Computer" 82 | entity_type = "Device" 83 | computer_data = { 84 | "name": entity_name, 85 | "column": "Semantic", 86 | "properties": { 87 | "type": entity_type, 88 | "purpose": "Computing", 89 | "location": "Office" 90 | } 91 | } 92 | 93 | # First add the entity to the database 94 | logger.info("Adding computer entity to database...") 95 | add_result = db.add_entity(computer_data) 96 | logger.info(f"Add result: {add_result}") 97 | 98 | # Then enrich it using ConceptNet 99 | logger.info("\nEnriching computer entity with ConceptNet data...") 100 | enriched_data = None 101 | 102 | ''' 103 | 104 | try: 105 | # Enrich the entity with both name and type 106 | enriched = enricher.enrich_entity( 107 | db=db, 108 | entity_name=entity_name, 109 | entity_type=entity_type 110 | ) 111 | 112 | if enriched: 113 | logger.info(f"Successfully enriched entity {entity_name} with ConceptNet data") 114 | 115 | # Query the enriched entity 116 | enriched_data = db.query_frames(entity_name) 117 | semantic_frame = enriched_data.get("Semantic") 118 | 119 | if semantic_frame: 120 | # Log properties 121 | logger.info("\nEnriched properties:") 122 | if hasattr(semantic_frame, 'properties'): 123 | for prop_name, candidates in semantic_frame.properties.items(): 124 | logger.info(f"\n{prop_name}:") 125 | if isinstance(candidates, list): 126 | for candidate in candidates: 127 | if hasattr(candidate, 'value'): 128 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 129 | else: 130 | logger.info(f" - {candidate}") 131 | 
132 | # Log relationships 133 | logger.info("\nEnriched relationships:") 134 | if hasattr(semantic_frame, 'relationships'): 135 | for rel_type, candidates in semantic_frame.relationships.items(): 136 | logger.info(f"\n{rel_type}:") 137 | if isinstance(candidates, list): 138 | for candidate in candidates: 139 | if hasattr(candidate, 'value'): 140 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 141 | else: 142 | logger.info(f" - {candidate}") 143 | else: 144 | logger.warning(f"No enrichment data found for entity {entity_name}") 145 | 146 | except Exception as e: 147 | logger.error(f"Error during ConceptNet enrichment: {str(e)}") 148 | 149 | return enriched_data 150 | 151 | ''' 152 | 153 | # Query and display the enriched entity 154 | try: 155 | # Query and display the enriched entity 156 | enriched_result = db.query_frames("computer") 157 | semantic_frame = enriched_result.get("Semantic") 158 | 159 | if semantic_frame: 160 | logger.info("\nQueried enriched entity:") 161 | logger.info("\nProperties:") 162 | if hasattr(semantic_frame, 'properties'): 163 | for prop_name, candidates in semantic_frame.properties.items(): 164 | logger.info(f"\n{prop_name}:") 165 | if isinstance(candidates, list): 166 | for candidate in candidates: 167 | if hasattr(candidate, 'value'): 168 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 169 | else: 170 | logger.info(f" - {candidate}") 171 | else: 172 | logger.info(f" - {candidates}") 173 | 174 | logger.info("\nRelationships:") 175 | if hasattr(semantic_frame, 'relationships'): 176 | for rel_type, candidates in semantic_frame.relationships.items(): 177 | logger.info(f"\n{rel_type}:") 178 | if isinstance(candidates, list): 179 | for candidate in candidates: 180 | if hasattr(candidate, 'value'): 181 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 182 | else: 183 | logger.info(f" - {candidate}") 184 | else: 185 | logger.info(f" - {candidates}") 186 | 187 | return enriched_result 188 | except Exception as e: 189 | logger.error(f"Error querying enriched entity: {str(e)}") 190 | return None 191 | 192 | def demonstrate_llm_interface(db: HawkinsDB): 193 | """Demonstrate LLM interface for natural language interactions.""" 194 | logger.info("\n=== Demonstrating LLM Interface ===") 195 | 196 | # Initialize LLM interface with auto-enrichment and proper error handling 197 | try: 198 | llm = LLMInterface(db, auto_enrich=True) 199 | except Exception as e: 200 | logger.error(f"Failed to initialize LLM interface: {str(e)}") 201 | return None, None 202 | 203 | # First, add a structured entity to query later 204 | laptop_entity = { 205 | "name": "MacBookPro_M3", 206 | "column": "Semantic", 207 | "properties": { 208 | "brand": "Apple", 209 | "model": "MacBook Pro", 210 | "year": "2024", 211 | "processor": "M3 chip", 212 | "ram": "16GB", 213 | "storage": "512GB SSD", 214 | "location": "home office" 215 | }, 216 | "relationships": { 217 | "type_of": ["Laptop", "Computer"], 218 | "manufactured_by": ["Apple"] 219 | } 220 | } 221 | 222 | # Add the entity directly first 223 | logger.info("\nAdding MacBook Pro entity...") 224 | db.add_entity(laptop_entity) 225 | enriched_data = db.query_frames("MacBookPro_M3") 226 | semantic_frame = enriched_data.get("Semantic") 227 | print(semantic_frame) 228 | # Now demonstrate natural language interaction 229 | logger.info("\nAdding new entity using natural language...") 230 | new_entity = { 231 | "name": "iPhone15Pro", 232 | "column": "Semantic", 233 | 
"properties": { 234 | "color": "Space Black", 235 | "storage": "256GB", 236 | "features": ["A17 Pro chip", "ProMotion display"], 237 | "location": "desk drawer", 238 | "type": "smartphone" 239 | }, 240 | "relationships": { 241 | "manufacturer": ["Apple"], 242 | "type_of": ["Mobile Device"] 243 | } 244 | } 245 | # Add entity directly first 246 | logger.info("Adding iPhone entity directly...") 247 | test = db.add_entity(new_entity) 248 | print(test) 249 | 250 | 251 | # Then demonstrate natural language interaction 252 | logger.info("\nQuerying using natural language about the new iPhone...") 253 | llm_result = llm.query("What are the features of the iPhone 15 Pro?") 254 | logger.info(f"LLM Query Result: {json.dumps(llm_result, indent=2)}") 255 | 256 | # Query existing entities using natural language 257 | logger.info("\nQuerying using natural language...") 258 | queries = [ 259 | "What are the specifications of the MacBook Pro?", 260 | "Where is the iPhone 15 Pro located?", 261 | "List all Apple devices and their features", 262 | "What is computer for what it is used for", 263 | "What is Explain about iphone", 264 | "Explain the Features of Tesla Model 3" 265 | ] 266 | 267 | query_results = [] 268 | for query in queries: 269 | logger.info(f"\nQuery: {query}") 270 | result = llm.query(query) 271 | logger.info(f"Response: {result}") 272 | query_results.append(result) 273 | 274 | return llm_result, query_results 275 | 276 | def main(): 277 | """Run comprehensive demonstration of HawkinsDB features.""" 278 | try: 279 | logger.info("Starting HawkinsDB comprehensive example...") 280 | 281 | # Basic operations with both backends 282 | json_db, sqlite_db = demonstrate_basic_operations() 283 | 284 | # ConceptNet integration (using SQLite backend for persistence) 285 | enriched_data = demonstrate_conceptnet_enrichment(sqlite_db) 286 | 287 | # LLM interface demonstration 288 | llm_results = demonstrate_llm_interface(sqlite_db) 289 | 290 | logger.info("Example completed successfully!") 291 | 292 | except Exception as e: 293 | logger.error(f"Example failed: {str(e)}") 294 | raise 295 | 296 | if __name__ == "__main__": 297 | main() -------------------------------------------------------------------------------- /examples/hawkinsdb_demo.py: -------------------------------------------------------------------------------- 1 | """ 2 | Complete demonstration of HawkinsDB functionality using both JSON and SQLite backends. 
3 | """ 4 | import logging 5 | import os 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.storage.sqlite import SQLiteStorage 8 | 9 | # Configure logging 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | def demo_memory_operations(db: HawkinsDB): 14 | """Demonstrate core memory operations.""" 15 | logger.info("Adding semantic memory...") 16 | db.add_entity({ 17 | "name": "Apple", 18 | "column": "Semantic", 19 | "properties": { 20 | "color": "red", 21 | "taste": "sweet", 22 | "category": "fruit" 23 | }, 24 | "relationships": { 25 | "grows_on": "tree", 26 | "belongs_to": ["fruits", "healthy_foods"] 27 | } 28 | }) 29 | 30 | logger.info("Adding episodic memory...") 31 | db.add_entity({ 32 | "name": "First_Apple_Experience", 33 | "column": "Episodic", 34 | "properties": { 35 | "timestamp": "2024-01-01T12:00:00", 36 | "action": "tasting an apple", 37 | "location": "kitchen" 38 | }, 39 | "relationships": { 40 | "involves": ["Apple", "Kitchen"] 41 | } 42 | }) 43 | 44 | logger.info("Adding procedural memory...") 45 | db.add_entity({ 46 | "name": "Apple_Pie_Recipe", 47 | "column": "Procedural", 48 | "properties": { 49 | "steps": [ 50 | "Peel and slice apples", 51 | "Mix with sugar and cinnamon", 52 | "Prepare pie crust", 53 | "Bake at 375°F for 45 minutes" 54 | ], 55 | "difficulty": "medium" 56 | }, 57 | "relationships": { 58 | "requires": ["Apple", "Sugar", "Flour"] 59 | } 60 | }) 61 | 62 | # Query and display results 63 | logger.info("Querying memories...") 64 | for entity_name in ["Apple", "First_Apple_Experience", "Apple_Pie_Recipe"]: 65 | frames = db.query_frames(entity_name) 66 | logger.info(f"\nFound frames for '{entity_name}':") 67 | for column_name, frame in frames.items(): 68 | logger.info(f"Column: {column_name}") 69 | logger.info(f"Properties: {frame.properties}") 70 | logger.info(f"Relationships: {frame.relationships}") 71 | 72 | def main(): 73 | """Run the demonstration with both storage backends.""" 74 | # Clean up any existing test files 75 | for file in ["demo_json.json", "demo_sqlite.db"]: 76 | if os.path.exists(file): 77 | os.remove(file) 78 | 79 | # Demo with JSON storage 80 | logger.info("\n=== Testing JSON Storage Backend ===") 81 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 82 | demo_memory_operations(json_db) 83 | json_db.cleanup() 84 | 85 | # Demo with SQLite storage 86 | logger.info("\n=== Testing SQLite Storage Backend ===") 87 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 88 | demo_memory_operations(sqlite_db) 89 | sqlite_db.cleanup() 90 | 91 | if __name__ == "__main__": 92 | main() 93 | -------------------------------------------------------------------------------- /examples/hawkinsdb_full_example.py: -------------------------------------------------------------------------------- 1 | """ 2 | Comprehensive example demonstrating all major features of HawkinsDB. 3 | This example showcases: 4 | 1. Basic CRUD operations 5 | 2. Advanced caching mechanisms 6 | 3. Different memory types (Semantic, Episodic, Procedural) 7 | 4. ConceptNet integration 8 | 5. Memory type validations 9 | 6. 
Performance monitoring 10 | """ 11 | 12 | import logging 13 | import time 14 | import json 15 | from hawkinsdb import HawkinsDB, LLMInterface 16 | from datetime import datetime 17 | 18 | logging.basicConfig(level=logging.INFO) 19 | logger = logging.getLogger(__name__) 20 | 21 | def demonstrate_memory_types(db: HawkinsDB): 22 | """Demonstrate different memory types and their validations.""" 23 | logger.info("\n=== Testing Different Memory Types ===") 24 | 25 | # Semantic Memory 26 | semantic_data = { 27 | "name": "Tesla_Model_3", 28 | "column": "Semantic", 29 | "properties": { 30 | "type": "electric_car", 31 | "manufacturer": "Tesla", 32 | "year": 2024, 33 | "features": ["autopilot", "battery_powered", "touch_screen"] 34 | }, 35 | "relationships": { 36 | "similar_to": ["Tesla_Model_Y", "Tesla_Model_S"], 37 | "competes_with": ["BMW_i4", "Polestar_2"] 38 | } 39 | } 40 | db.add_entity(semantic_data) 41 | logger.info("Added semantic memory: Tesla Model 3") 42 | 43 | # Episodic Memory 44 | episodic_data = { 45 | "name": "First_Tesla_Drive", 46 | "column": "Episodic", 47 | "properties": { 48 | "timestamp": datetime.now().isoformat(), 49 | "action": "test_drive", 50 | "location": {"city": "Palo Alto", "state": "CA"}, 51 | "duration": "45 minutes", 52 | "participants": ["customer", "sales_rep"] 53 | } 54 | } 55 | db.add_entity(episodic_data) 56 | logger.info("Added episodic memory: First Tesla Drive") 57 | 58 | # Procedural Memory 59 | procedural_data = { 60 | "name": "Tesla_Charging_Process", 61 | "column": "Procedural", 62 | "properties": { 63 | "steps": [ 64 | "Park near charging station", 65 | "Open charging port", 66 | "Connect charging cable", 67 | "Initiate charging via touchscreen", 68 | "Wait for desired charge level", 69 | "Disconnect charging cable" 70 | ], 71 | "required_tools": ["charging_cable", "Tesla_app"], 72 | "difficulty": "easy" 73 | } 74 | } 75 | db.add_entity(procedural_data) 76 | logger.info("Added procedural memory: Tesla Charging Process") 77 | 78 | # Function removed as caching is no longer supported 79 | 80 | def main(): 81 | """Run the comprehensive example.""" 82 | # Initialize database with SQLite storage 83 | db = HawkinsDB( 84 | storage_type='sqlite' 85 | ) 86 | 87 | try: 88 | # Test different memory types 89 | demonstrate_memory_types(db) 90 | 91 | # Test queries 92 | logger.info("\n=== Testing Queries ===") 93 | tesla_data = db.query_frames("Tesla_Model_3") 94 | # Convert ReferenceFrame objects to dictionaries before JSON serialization 95 | tesla_data_dict = { 96 | column_name: frame.to_dict() 97 | for column_name, frame in tesla_data.items() 98 | } 99 | logger.info(f"Query result for Tesla Model 3: {json.dumps(tesla_data_dict, indent=2)}") 100 | 101 | # List all entities 102 | logger.info("\n=== All Entities ===") 103 | all_entities = db.list_entities() 104 | logger.info(f"Total entities: {len(all_entities)}") 105 | logger.info(f"Entities: {json.dumps(all_entities, indent=2)}") 106 | 107 | except Exception as e: 108 | logger.error(f"Error during example execution: {e}") 109 | raise 110 | finally: 111 | db.cleanup() 112 | 113 | if __name__ == "__main__": 114 | main() 115 | -------------------------------------------------------------------------------- /examples/hawkinsdb_sqlite_example.py: -------------------------------------------------------------------------------- 1 | """Example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import logging 3 | from hawkinsdb import HawkinsDB 4 | 5 | # Configure logging 6 | logging.basicConfig(level=logging.INFO) 7 
| logger = logging.getLogger(__name__) 8 | 9 | def main(): 10 | # Initialize HawkinsDB with SQLite backend 11 | db = HawkinsDB(storage_type='sqlite', db_path='example.db') 12 | 13 | # Example 1: Adding Semantic Memory 14 | semantic_data = { 15 | "name": "Tesla Model 3", 16 | "properties": { 17 | "color": "red", 18 | "year": 2023, 19 | "features": ["autopilot", "glass roof"] 20 | }, 21 | "relationships": { 22 | "manufactured_by": ["Tesla"], 23 | "located_in": ["garage"] 24 | } 25 | } 26 | 27 | try: 28 | result = db.add_entity(semantic_data) 29 | logger.info(f"Added semantic memory: {result}") 30 | except Exception as e: 31 | logger.error(f"Error adding semantic memory: {e}") 32 | 33 | # Example 2: Adding Episodic Memory 34 | episodic_data = { 35 | "name": "first_drive", 36 | "column": "Episodic", 37 | "properties": { 38 | "timestamp": "2024-01-01T10:00:00", 39 | "action": "test drive", 40 | "duration": "45 minutes" 41 | }, 42 | "relationships": { 43 | "involves": ["Tesla Model 3"], 44 | "location": ["dealership"] 45 | } 46 | } 47 | 48 | try: 49 | # add_entity routes the record to the Episodic column via its "column" field 50 | db.add_entity(episodic_data) 51 | logger.info("Added episodic memory successfully") 52 | except Exception as e: 53 | logger.error(f"Error adding episodic memory: {e}") 54 | 55 | # Query and display stored information 56 | try: 57 | # List all entities 58 | entities = db.list_entities() 59 | logger.info(f"Stored entities: {entities}") 60 | 61 | # Query specific entity 62 | tesla_frames = db.query_frames("Tesla Model 3") 63 | for column, frame in tesla_frames.items(): 64 | logger.info(f"\nColumn: {column}") 65 | logger.info(f"Properties: {frame.properties}") 66 | logger.info(f"Relationships: {frame.relationships}") 67 | 68 | except Exception as e: 69 | logger.error(f"Error querying data: {e}") 70 | 71 | # Cleanup 72 | db.cleanup() 73 | 74 | if __name__ == "__main__": 75 | main() 76 | -------------------------------------------------------------------------------- /examples/sqlite_usage.py: -------------------------------------------------------------------------------- 1 | """Example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import os 3 | import logging 4 | from datetime import datetime 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.storage.sqlite import SQLiteStorage 8 | 9 | # Configure logging 10 | logging.basicConfig(level=logging.INFO, 11 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 12 | logger = logging.getLogger(__name__) 13 | 14 | def setup_database(): 15 | """Initialize HawkinsDB with SQLite backend.""" 16 | try: 17 | # Set custom SQLite path (optional) 18 | db_path = Path('./hawkins_memory.db').absolute() 19 | 20 | # Initialize database with SQLite backend explicitly 21 | db = HawkinsDB(storage_type='sqlite') # This is the recommended way 22 | 23 | # Alternatively, you can initialize with custom path: 24 | # storage = SQLiteStorage(db_path=str(db_path)) 25 | # db = HawkinsDB(storage=storage) 26 | 27 | logger.info("Database initialized with SQLite backend") 28 | return db 29 | 30 | except Exception as e: 31 | logger.error(f"Failed to initialize database: {str(e)}") 32 | raise 33 | # Example memory types supported by HawkinsDB: 34 | # 1. Semantic: For storing facts and properties about entities 35 | # 2. Episodic: For storing time-based events and experiences 36 | # 3. 
Procedural: For storing step-by-step procedures or workflows 37 | 38 | def add_example_entities(db): 39 | """Add example entities to the database.""" 40 | # Example entities 41 | entities = [ 42 | { 43 | "name": "Tesla Model 3", 44 | "properties": { 45 | "color": "red", 46 | "year": 2023, 47 | "mileage": 1000, 48 | "features": ["autopilot", "glass roof"] 49 | }, 50 | "relationships": { 51 | "located_in": ["garage"], 52 | "manufactured_by": ["Tesla"] 53 | } 54 | }, 55 | { 56 | "name": "Smart Home Hub", 57 | "properties": { 58 | "brand": "HomeKit", 59 | "connected_devices": 5, 60 | "firmware_version": "2.1.0" 61 | }, 62 | "relationships": { 63 | "controls": ["lights", "thermostat"], 64 | "connected_to": ["wifi_network"] 65 | } 66 | } 67 | ] 68 | 69 | for entity_data in entities: 70 | try: 71 | result = db.add_entity(entity_data) 72 | logger.info(f"Added entity: {result['entity_name']}") 73 | except ValueError as ve: 74 | logger.error(f"Invalid entity data: {str(ve)}") 75 | except Exception as e: 76 | logger.error(f"Error adding entity {entity_data['name']}: {str(e)}") 77 | 78 | def query_and_display_data(db): 79 | """Query and display stored data.""" 80 | try: 81 | # List all entities 82 | entities = db.list_entities() 83 | logger.info(f"Stored entities: {entities}") 84 | 85 | # Query frames for each entity 86 | for entity_name in entities: 87 | try: 88 | frames = db.query_frames(entity_name) 89 | logger.info(f"\nEntity: {entity_name}") 90 | 91 | for column, frame in frames.items(): 92 | logger.info(f"Column: {column}") 93 | logger.info(f"Properties: {frame.properties}") 94 | logger.info(f"Relationships: {frame.relationships}") 95 | 96 | except Exception as e: 97 | logger.error(f"Error querying frames for {entity_name}: {str(e)}") 98 | 99 | except Exception as e: 100 | logger.error(f"Error listing entities: {str(e)}") 101 | 102 | def main(): 103 | """Main execution function.""" 104 | try: 105 | # Setup database 106 | db = setup_database() 107 | 108 | # Add example entities 109 | add_example_entities(db) 110 | 111 | # Query and display data 112 | query_and_display_data(db) 113 | 114 | except Exception as e: 115 | logger.error(f"Application error: {str(e)}") 116 | finally: 117 | # Cleanup (if needed) 118 | if 'db' in locals(): 119 | db.cleanup() 120 | 121 | if __name__ == '__main__': 122 | main() 123 | -------------------------------------------------------------------------------- /hawkinsdb/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | HawkinsDB - A memory layer with SQLite backend and error handling. 
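Minimal usage sketch (see the examples directory for complete workflows): db = HawkinsDB(storage_type="sqlite"); db.add_entity({"name": "Apple", "column": "Semantic", "properties": {"color": "red"}}); frames = db.query_frames("Apple").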
3 | 4 | Core Features: 5 | - Multiple memory types (Semantic, Episodic, Procedural) 6 | - SQLite and JSON storage backends 7 | - Robust error handling and data validation 8 | """ 9 | 10 | from .core import HawkinsDB, JSONStorage 11 | 12 | __version__ = "1.0.1" 13 | __author__ = "HawkinsDB Contributors" 14 | __email__ = "hawkinsdb@example.com" 15 | __license__ = "MIT" 16 | 17 | # Core components only 18 | __all__ = [ 19 | 'HawkinsDB', 20 | 'JSONStorage' 21 | ] 22 | 23 | # Optional components loaded if dependencies are available 24 | try: 25 | from .enrichment import ConceptNetEnricher 26 | __all__.append('ConceptNetEnricher') 27 | except ImportError: 28 | pass 29 | 30 | try: 31 | from .llm_interface import LLMInterface 32 | __all__.append('LLMInterface') 33 | except ImportError: 34 | pass 35 | -------------------------------------------------------------------------------- /hawkinsdb/base.py: -------------------------------------------------------------------------------- 1 | """Base classes for HawkinsDB.""" 2 | from abc import ABC, abstractmethod 3 | from typing import ( 4 | Any, Dict, List, Optional, Sequence, TypeVar, Generic, 5 | Protocol, runtime_checkable 6 | ) 7 | from typing_extensions import TypeAlias 8 | from datetime import datetime 9 | import time 10 | 11 | # Type variables for generic implementations 12 | T = TypeVar('T') 13 | T_PropertyCandidate = TypeVar('T_PropertyCandidate', bound='PropertyCandidate') 14 | T_ReferenceFrame = TypeVar('T_ReferenceFrame', bound='ReferenceFrame') 15 | T_CorticalColumn = TypeVar('T_CorticalColumn', bound='CorticalColumn') 16 | 17 | @runtime_checkable 18 | class StorageBackend(Protocol[T_CorticalColumn]): 19 | """Protocol class for storage backends.""" 20 | 21 | def load_columns(self) -> Sequence[T_CorticalColumn]: 22 | """Load all columns from storage.""" 23 | ... 24 | 25 | def save_columns(self, columns: Sequence[T_CorticalColumn]) -> None: 26 | """Save all columns to storage.""" 27 | ... 28 | 29 | def initialize(self) -> None: 30 | """Initialize the storage backend.""" 31 | ... 32 | 33 | def cleanup(self) -> None: 34 | """Cleanup any resources.""" 35 | ... 
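# Illustrative sketch (hypothetical class, not part of the package): since
# StorageBackend is a runtime-checkable Protocol, any object exposing these four
# methods satisfies it structurally, e.g.:
#
#     class InMemoryStorage:
#         """Toy backend keeping columns in a plain list."""
#         def __init__(self): self._columns = []
#         def load_columns(self): return list(self._columns)
#         def save_columns(self, columns): self._columns = list(columns)
#         def initialize(self): self._columns = []
#         def cleanup(self): self._columns.clear()
#
#     isinstance(InMemoryStorage(), StorageBackend)  # -> True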
36 | 37 | class PropertyCandidate: 38 | """Represents a candidate value for a property.""" 39 | 40 | def __init__(self, value, confidence=1.0, sources=None): 41 | """Initialize a property candidate with value and metadata.""" 42 | if isinstance(value, dict): 43 | if 'value' in value: 44 | self.value = value['value'] 45 | self.confidence = float(value.get('confidence', confidence)) 46 | self.sources = list(value.get('sources', sources or [])) 47 | else: 48 | self.value = value 49 | self.confidence = float(confidence) 50 | self.sources = list(sources) if sources else [] 51 | elif isinstance(value, PropertyCandidate): 52 | self.value = value.value 53 | self.confidence = float(value.confidence) 54 | self.sources = list(value.sources) 55 | else: 56 | self.value = value 57 | self.confidence = float(confidence) 58 | self.sources = list(sources) if sources else [] 59 | 60 | if not (0.0 <= self.confidence <= 1.0): 61 | raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}") 62 | 63 | self.timestamp = time.time() 64 | 65 | def to_dict(self): 66 | """Convert to dictionary representation.""" 67 | return { 68 | "value": self.value, 69 | "confidence": self.confidence, 70 | "sources": self.sources, 71 | "timestamp": self.timestamp 72 | } 73 | 74 | @staticmethod 75 | def from_dict(d): 76 | """Create from dictionary representation.""" 77 | if not isinstance(d, dict): 78 | raise TypeError(f"Expected dict, got {type(d)}") 79 | 80 | if 'value' not in d: 81 | raise ValueError("Dictionary must contain 'value' key") 82 | 83 | return PropertyCandidate( 84 | value=d["value"], 85 | confidence=float(d.get("confidence", 1.0)), 86 | sources=list(d.get("sources", [])) 87 | ) 88 | 89 | class ReferenceFrame: 90 | """Represents a concept or object as a reference frame.""" 91 | 92 | def __init__(self, name, properties=None, relationships=None, location=None, history=None): 93 | """Initialize frame with proper dictionary handling.""" 94 | # Handle dictionary input 95 | if isinstance(name, dict): 96 | data = name 97 | self.name = data.get('name') 98 | properties = data.get('properties', {}) 99 | relationships = data.get('relationships', {}) 100 | location = data.get('location', {}) 101 | history = data.get('history', []) 102 | else: 103 | self.name = name 104 | 105 | self.properties = properties if properties is not None else {} 106 | self.relationships = relationships if relationships is not None else {} 107 | self.location = location if location is not None else {} 108 | self.history = history if history is not None else [] 109 | self.created_at = datetime.now().isoformat() 110 | self.updated_at = self.created_at 111 | 112 | def to_dict(self): 113 | """Convert frame to dictionary representation.""" 114 | try: 115 | return { 116 | "name": self.name, 117 | "properties": { 118 | k: [c.to_dict() if isinstance(c, PropertyCandidate) else {"value": c} for c in candidates] 119 | for k, candidates in self.properties.items() 120 | }, 121 | "relationships": { 122 | k: [c.to_dict() if isinstance(c, PropertyCandidate) else {"value": c} for c in candidates] 123 | for k, candidates in self.relationships.items() 124 | }, 125 | "location": self.location, 126 | "history": self.history, 127 | "created_at": self.created_at, 128 | "updated_at": self.updated_at 129 | } 130 | except Exception as e: 131 | raise ValueError(f"Error converting frame to dict: {str(e)}") 132 | 133 | def __str__(self): 134 | return f"ReferenceFrame(name={self.name})" 135 | 136 | def __repr__(self): 137 | return self.__str__() 138 | 139 | 
@staticmethod 140 | def from_dict(data): 141 | """Create frame from dictionary representation.""" 142 | try: 143 | if not isinstance(data, dict): 144 | if isinstance(data, str): 145 | # Handle string input by creating minimal frame 146 | return ReferenceFrame(name=data) 147 | raise TypeError(f"Input must be a dictionary or string, got {type(data)}") 148 | 149 | props = {} 150 | for k, vlist in data.get("properties", {}).items(): 151 | if isinstance(vlist, (list, tuple)): 152 | processed = [] 153 | for v in vlist: 154 | if isinstance(v, dict): 155 | processed.append(PropertyCandidate.from_dict(v)) 156 | else: 157 | processed.append(PropertyCandidate(v)) 158 | props[k] = processed 159 | else: 160 | props[k] = [PropertyCandidate(vlist)] 161 | 162 | rels = {} 163 | for k, vlist in data.get("relationships", {}).items(): 164 | if isinstance(vlist, (list, tuple)): 165 | processed = [] 166 | for v in vlist: 167 | if isinstance(v, dict): 168 | processed.append(PropertyCandidate.from_dict(v)) 169 | else: 170 | processed.append(PropertyCandidate(v)) 171 | rels[k] = processed 172 | else: 173 | rels[k] = [PropertyCandidate(vlist)] 174 | 175 | frame = ReferenceFrame( 176 | name=data.get("name", ""), 177 | properties=props, 178 | relationships=rels, 179 | location=data.get("location", {}), 180 | history=data.get("history", []) 181 | ) 182 | frame.created_at = data.get("created_at", frame.created_at) 183 | frame.updated_at = data.get("updated_at", frame.updated_at) 184 | return frame 185 | 186 | except Exception as e: 187 | raise ValueError(f"Error creating frame from dict: {str(e)}") 188 | 189 | class CorticalColumn: 190 | """Represents a collection of reference frames with error handling.""" 191 | 192 | def __init__(self, name, frames=None): 193 | """Initialize a cortical column with proper dictionary handling.""" 194 | # Handle dictionary input 195 | if isinstance(name, dict): 196 | data = name 197 | self.name = data.get('name') 198 | frames = data.get('frames', []) 199 | self.created_at = data.get('created_at', datetime.now().isoformat()) 200 | self.updated_at = data.get('updated_at', self.created_at) 201 | else: 202 | if not name: 203 | raise ValueError("Column name cannot be empty") 204 | self.name = name 205 | self.created_at = datetime.now().isoformat() 206 | self.updated_at = self.created_at 207 | 208 | self.frames = [] 209 | if frames: 210 | for frame in frames: 211 | if isinstance(frame, dict): 212 | self.frames.append(ReferenceFrame(frame)) 213 | else: 214 | self.frames.append(frame) 215 | 216 | def to_dict(self): 217 | """Convert column to dictionary representation with error handling.""" 218 | try: 219 | return { 220 | "name": self.name, 221 | "frames": [f.to_dict() for f in self.frames], 222 | "created_at": self.created_at, 223 | "updated_at": self.updated_at 224 | } 225 | except Exception as e: 226 | raise ValueError(f"Error converting column to dict: {str(e)}") 227 | 228 | @staticmethod 229 | def from_dict(data): 230 | """Create column from dictionary with validation and error handling.""" 231 | try: 232 | if not isinstance(data, dict): 233 | raise TypeError("Input must be a dictionary") 234 | 235 | if "name" not in data: 236 | raise ValueError("Column data must contain 'name' field") 237 | 238 | frames = [] 239 | for frame_data in data.get("frames", []): 240 | try: 241 | frame = ReferenceFrame.from_dict(frame_data) 242 | frames.append(frame) 243 | except Exception as e: 244 | raise ValueError(f"Error creating frame: {str(e)}") 245 | 246 | col = CorticalColumn(name=data["name"], 
frames=frames) 247 | col.created_at = data.get("created_at", col.created_at) 248 | col.updated_at = data.get("updated_at", col.updated_at) 249 | return col 250 | 251 | except Exception as e: 252 | raise ValueError(f"Error creating column from dict: {str(e)}") 253 | 254 | class BaseJSONStorage(StorageBackend[T_CorticalColumn]): 255 | """Base class for JSON storage implementation.""" 256 | def load_columns(self) -> Sequence[T_CorticalColumn]: 257 | raise NotImplementedError("Subclasses must implement load_columns") 258 | 259 | def save_columns(self, columns: Sequence[T_CorticalColumn]) -> None: 260 | raise NotImplementedError("Subclasses must implement save_columns") 261 | 262 | def initialize(self) -> None: 263 | raise NotImplementedError("Subclasses must implement initialize") 264 | 265 | def cleanup(self) -> None: 266 | raise NotImplementedError("Subclasses must implement cleanup") -------------------------------------------------------------------------------- /hawkinsdb/core.py: -------------------------------------------------------------------------------- 1 | """Core functionality for HawkinsDB.""" 2 | import os 3 | import json 4 | import time 5 | import logging 6 | from datetime import datetime 7 | from pathlib import Path 8 | from .base import PropertyCandidate, ReferenceFrame, CorticalColumn 9 | 10 | # Configure logging 11 | logging.basicConfig(level=logging.INFO) 12 | logger = logging.getLogger(__name__) 13 | 14 | try: 15 | from filelock import FileLock 16 | except ImportError: 17 | logger.warning("FileLock not available, using dummy implementation") 18 | class FileLock: 19 | def __init__(self, *args): pass 20 | def __enter__(self): return self 21 | def __exit__(self, *args): pass 22 | 23 | class EntityValidationError(Exception): 24 | """Raised when entity validation fails.""" 25 | pass 26 | 27 | # Make EntityValidationError available at module level 28 | __all__ = ['HawkinsDB', 'JSONStorage', 'EntityValidationError'] 29 | 30 | # Import storage backends; fail with a clear message if the SQLite backend is unavailable 31 | try: 32 | from .storage.sqlite import SQLiteStorage 33 | except ImportError: 34 | logger.error("Failed to import SQLiteStorage. 
Please check your installation.") 35 | raise ImportError("SQLiteStorage module is required but not available") 36 | 37 | class JSONStorage: 38 | """Handles persistence of the HawkinsDB data in JSON format.""" 39 | def __init__(self, path): 40 | self.path = Path(path) 41 | self.lock = FileLock(str(self.path) + ".lock") 42 | 43 | def initialize(self): 44 | if not self.path.exists(): 45 | self._write_data({"columns": []}) 46 | 47 | def cleanup(self): 48 | pass 49 | 50 | def _read_data(self): 51 | if not self.path.exists(): 52 | return {"columns": []} 53 | with open(self.path, "r", encoding="utf-8") as f: 54 | return json.load(f) 55 | 56 | def _write_data(self, data): 57 | with open(self.path, "w", encoding="utf-8") as f: 58 | json.dump(data, f, ensure_ascii=False, indent=4) 59 | 60 | def load_columns(self): 61 | with self.lock: 62 | data = self._read_data() 63 | return data.get("columns", []) 64 | 65 | def save_columns(self, columns): 66 | with self.lock: 67 | data = {"columns": columns} 68 | self._write_data(data) 69 | 70 | class HawkinsDB: 71 | """Main database interface.""" 72 | 73 | # Make EntityValidationError accessible via the class 74 | EntityValidationError = EntityValidationError 75 | def __init__(self, storage=None, db_path=None, storage_type='sqlite'): 76 | if storage is None: 77 | if storage_type == 'sqlite': 78 | db_path = db_path or "hawkins_memory.db" 79 | self.storage = SQLiteStorage(db_path=db_path) 80 | elif storage_type == 'json': 81 | db_path = db_path or "hawkins_db.json" 82 | self.storage = JSONStorage(path=db_path) 83 | else: 84 | raise ValueError(f"Unsupported storage type: {storage_type}") 85 | else: 86 | self.storage = storage 87 | 88 | self.storage.initialize() 89 | self.columns = {} 90 | self._load_columns() 91 | self._build_indexes() 92 | self._initialize_memory_types() 93 | 94 | def _load_columns(self): 95 | columns = self.storage.load_columns() 96 | self.columns = {c["name"]: c for c in columns} 97 | 98 | def _build_indexes(self): 99 | self.name_index = {} 100 | for col_name, col in self.columns.items(): 101 | for f in col["frames"]: 102 | name = f["name"].lower() 103 | if name not in self.name_index: 104 | self.name_index[name] = [] 105 | self.name_index[name].append((col_name, f)) 106 | 107 | def _initialize_memory_types(self): 108 | for memory_type in ["Semantic", "Episodic", "Procedural"]: 109 | if memory_type not in self.columns: 110 | self.create_column(memory_type) 111 | 112 | def cleanup(self): 113 | if hasattr(self, 'storage'): 114 | self.storage.cleanup() 115 | 116 | def create_column(self, column_name): 117 | if column_name not in self.columns: 118 | self.columns[column_name] = {"name": column_name, "frames": []} 119 | self._save() 120 | 121 | def _save(self): 122 | self.storage.save_columns(list(self.columns.values())) 123 | 124 | def add_entity(self, data): 125 | """Add an entity with validation.""" 126 | try: 127 | if not isinstance(data, dict): 128 | raise EntityValidationError("Entity data must be a dictionary") 129 | 130 | memory_type = data.get("column", "Semantic") 131 | name = data.get("name") 132 | 133 | if not name: 134 | raise EntityValidationError("Entity name is required") 135 | 136 | # Validate required fields based on memory type 137 | if memory_type == "Episodic": 138 | if "properties" not in data or "timestamp" not in data["properties"]: 139 | raise EntityValidationError("Episodic memories require a timestamp") 140 | 141 | elif memory_type == "Procedural": 142 | if "properties" not in data or "steps" not in data["properties"]: 143 | 
raise EntityValidationError("Procedural memories require steps") 144 | 145 | properties = data.get("properties", {}) 146 | relationships = data.get("relationships", {}) 147 | 148 | frame = { 149 | "name": name, 150 | "properties": properties, 151 | "relationships": relationships, 152 | "location": data.get("location", {}), 153 | "history": [] 154 | } 155 | 156 | if memory_type not in self.columns: 157 | self.create_column(memory_type) 158 | 159 | column = self.columns[memory_type] 160 | column["frames"].append(frame) 161 | name = name.lower() 162 | if name not in self.name_index: 163 | self.name_index[name] = [] 164 | self.name_index[name].append((memory_type, frame)) 165 | 166 | self._save() 167 | 168 | return { 169 | "success": True, 170 | "entity_name": name, 171 | "message": f"Successfully added {memory_type} memory: {name}" 172 | } 173 | 174 | except EntityValidationError as e: 175 | logger.error(f"Validation error: {str(e)}") 176 | raise 177 | except Exception as e: 178 | logger.error(f"Error adding entity: {str(e)}") 179 | return { 180 | "success": False, 181 | "message": str(e) 182 | } 183 | 184 | def query_frames(self, name): 185 | """Query frames by name and return dictionary of frames by column.""" 186 | try: 187 | name = name.lower() 188 | frames = self.name_index.get(name, []) 189 | result = {} 190 | for column_name, frame_data in frames: 191 | if column_name not in result: 192 | try: 193 | # Always convert to ReferenceFrame 194 | result[column_name] = ReferenceFrame.from_dict(frame_data) 195 | except Exception as frame_error: 196 | logger.error(f"Error converting frame data: {str(frame_error)}") 197 | continue 198 | return result 199 | except Exception as e: 200 | logger.error(f"Error querying frames: {str(e)}") 201 | return {} 202 | 203 | def list_entities(self): 204 | try: 205 | return sorted(list(self.name_index.keys())) 206 | except Exception: 207 | return [] 208 | -------------------------------------------------------------------------------- /hawkinsdb/enrichment.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import logging 3 | from .core import HawkinsDB 4 | from .base import PropertyCandidate 5 | from collections import defaultdict 6 | 7 | logger = logging.getLogger(__name__) 8 | 9 | class ConceptNetEnricher: 10 | """Handles auto-enrichment of entities using ConceptNet.""" 11 | 12 | def __init__(self, api_key=None, cache_enabled=False): 13 | """Initialize ConceptNet enricher. 14 | 15 | Args: 16 | api_key: Optional API key for ConceptNet (not required for basic usage) 17 | cache_enabled: Deprecated, kept for backwards compatibility 18 | """ 19 | self.api_key = api_key 20 | self.base_url = "http://api.conceptnet.io" 21 | if cache_enabled: 22 | logger.warning("Caching has been deprecated and will be removed in future versions") 23 | 24 | def get_basic_knowledge(self, concept): 25 | """ 26 | Retrieve basic knowledge about a concept from ConceptNet. 27 | Returns a dictionary with properties and relationships. 
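An illustrative (invented) return value: {"properties": {"materials": [{"value": "metal", "confidence": 0.8, "source": "ConceptNet:MadeOf"}]}, "relationships": {"categories": [{"value": "machine", "confidence": 0.9, "source": "ConceptNet:IsA"}]}}.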
28 | 29 | Args: 30 | concept: The concept to query (e.g., "car", "house") 31 | 32 | Returns: 33 | Dictionary containing properties and relationships enriched from ConceptNet 34 | """ 35 | # Removed cache implementation to simplify the code 36 | # Direct API call without caching 37 | 38 | if not concept: 39 | logger.warning("Empty concept provided") 40 | return {"properties": {}, "relationships": {}} 41 | 42 | try: 43 | # Normalize concept name for API 44 | concept_normalized = concept.lower().replace(" ", "_") 45 | # Query both direct and related concepts 46 | urls = [ 47 | f"{self.base_url}/c/en/{concept_normalized}", 48 | f"{self.base_url}/query?node=/c/en/{concept_normalized}&other=/c/en" 49 | ] 50 | 51 | headers = {} 52 | if self.api_key: 53 | headers['Authorization'] = f'Bearer {self.api_key}' 54 | 55 | all_data = [] 56 | for url in urls: 57 | response = requests.get(url, headers=headers, timeout=10) 58 | if response.status_code == 200: 59 | all_data.append(response.json()) 60 | else: 61 | logger.warning(f"Failed to get ConceptNet data from {url}: {response.status_code}") 62 | 63 | if not all_data: 64 | return {"properties": {}, "relationships": {}} 65 | 66 | properties = defaultdict(list) 67 | relationships = defaultdict(list) 68 | 69 | # Process edges to extract meaningful information from all responses 70 | edges = [] 71 | for response_data in all_data: 72 | edges.extend(response_data.get('edges', [])) 73 | 74 | for edge in edges: 75 | try: 76 | # Validate edge structure and language 77 | start_node = edge.get('start', {}) 78 | end_node = edge.get('end', {}) 79 | 80 | if not (start_node.get('language') == 'en' and 81 | end_node.get('language') == 'en'): 82 | continue 83 | 84 | # Enhanced weight validation with fallback 85 | try: 86 | weight = float(edge.get('weight', 0)) 87 | except (TypeError, ValueError): 88 | logger.warning(f"Invalid weight value in edge: {edge.get('weight')}") 89 | continue 90 | 91 | if weight < 0.5: # Filter out low confidence assertions 92 | continue 93 | 94 | # Check for malformed edge data 95 | if not all([start_node.get('label'), 96 | end_node.get('label'), 97 | edge.get('rel', {}).get('label')]): 98 | logger.warning(f"Malformed edge data: {edge}") 99 | continue 100 | 101 | except Exception as e: 102 | logger.warning(f"Error processing edge: {str(e)}") 103 | continue 104 | 105 | rel = edge.get('rel', {}).get('label') 106 | if not rel: 107 | continue 108 | 109 | end_term = edge.get('end', {}).get('label') 110 | if not end_term: 111 | continue 112 | 113 | # Convert ConceptNet weight to confidence score (0.5 to 1.0) 114 | confidence = 0.5 + (weight / 2) 115 | 116 | # Enhanced relation mapping with comprehensive categorization 117 | PROPERTY_RELATIONS = { 118 | 'HasProperty': 'properties', 119 | 'HasA': 'features', 120 | 'MadeOf': 'materials', 121 | 'PartOf': 'components', 122 | 'HasContext': 'contexts', 123 | 'HasSubevent': 'subevents', 124 | 'HasPrerequisite': 'prerequisites', 125 | 'HasFirstSubevent': 'first_steps', 126 | 'HasLastSubevent': 'last_steps', 127 | 'HasSize': 'size', 128 | 'HasShape': 'shape', 129 | 'HasColor': 'color', 130 | 'HasTexture': 'texture', 131 | 'HasWeight': 'weight', 132 | 'HasFeel': 'feel', 133 | 'HasSound': 'sound', 134 | 'HasTaste': 'taste', 135 | 'HasSmell': 'smell', 136 | } 137 | 138 | RELATIONSHIP_RELATIONS = { 139 | 'IsA': 'categories', 140 | 'CapableOf': 'capabilities', 141 | 'UsedFor': 'uses', 142 | 'AtLocation': 'locations', 143 | 'CreatedBy': 'creators', 144 | 'PartOf': 'parent_systems', 145 | 'HasEffect': 'effects', 146 | 
'MotivatedByGoal': 'goals', 147 | 'SimilarTo': 'similar_concepts', 148 | 'DerivedFrom': 'origins', 149 | 'SymbolOf': 'symbolism', 150 | 'ReceivesAction': 'actions_received', 151 | 'HasSubevent': 'related_events', 152 | 'HasPrerequisite': 'prerequisites', 153 | 'Causes': 'causes', 154 | 'HasFirstSubevent': 'initial_stages', 155 | 'HasLastSubevent': 'final_stages', 156 | 'RelatedTo': 'related_concepts', 157 | } 158 | 159 | if rel in PROPERTY_RELATIONS: 160 | prop_key = PROPERTY_RELATIONS[rel] 161 | properties[prop_key].append({ 162 | 'value': end_term, 163 | 'confidence': confidence, 164 | 'source': f"ConceptNet:{rel}" 165 | }) 166 | 167 | elif rel in RELATIONSHIP_RELATIONS: 168 | rel_key = RELATIONSHIP_RELATIONS[rel] 169 | relationships[rel_key].append({ 170 | 'value': end_term, 171 | 'confidence': confidence, 172 | 'source': f"ConceptNet:{rel}" 173 | }) 174 | 175 | # Filter and clean the data 176 | def filter_and_sort_by_confidence(items, min_confidence=0.6, max_items=5): 177 | """ 178 | Filter and sort knowledge items based on confidence and quality. 179 | 180 | Args: 181 | items: List of items to filter 182 | min_confidence: Minimum confidence threshold (default: 0.6) 183 | max_items: Maximum number of items to return (default: 5) 184 | 185 | Returns: 186 | List of filtered and sorted items 187 | """ 188 | seen = set() 189 | filtered = [] 190 | 191 | # Sort by confidence and filter 192 | sorted_items = sorted(items, key=lambda x: x['confidence'], reverse=True) 193 | 194 | for item in sorted_items: 195 | value = item['value'].lower() 196 | confidence = item['confidence'] 197 | 198 | # Apply quality filters 199 | if (confidence >= min_confidence and 200 | value not in seen and 201 | len(value.split()) <= 3 and # Keep concise terms 202 | len(value) >= 3 and # Avoid too short terms 203 | not any(c.isdigit() for c in value)): # Avoid numerical values 204 | 205 | seen.add(value) 206 | filtered.append(item) 207 | 208 | if len(filtered) >= max_items: 209 | break 210 | 211 | return filtered 212 | 213 | filtered_data = { 214 | "properties": { 215 | k: filter_and_sort_by_confidence(v) 216 | for k, v in properties.items() 217 | }, 218 | "relationships": { 219 | k: filter_and_sort_by_confidence(v) 220 | for k, v in relationships.items() 221 | } 222 | } 223 | 224 | # Return filtered data directly without caching 225 | return filtered_data 226 | 227 | except Exception as e: 228 | logger.error(f"Error enriching concept {concept}: {str(e)}") 229 | return {} 230 | 231 | def enrich_entity(self, db, entity_name, entity_type): 232 | """ 233 | Enrich an entity with ConceptNet knowledge. 
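Typical call (sketch, mirroring examples/hawkinsdb_comprehensive.py): ConceptNetEnricher().enrich_entity(db, entity_name="Computer", entity_type="Device"); the entity must already exist with a Semantic frame, otherwise None is returned.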
234 | 235 | Args: 236 | db: HawkinsDB instance to update 237 | entity_name: Name of the entity to enrich 238 | entity_type: Type of entity to query in ConceptNet 239 | 240 | Returns: 241 | Enriched entity data or None if enrichment failed 242 | """ 243 | knowledge = self.get_basic_knowledge(entity_type) 244 | if not knowledge: 245 | logger.warning(f"No knowledge found for entity type: {entity_type}") 246 | return 247 | 248 | try: 249 | # First, get existing entity data 250 | frames = db.query_frames(entity_name) 251 | if not frames: 252 | logger.warning(f"Entity {entity_name} not found in database") 253 | return None 254 | 255 | semantic_frame = frames.get("Semantic", {}) 256 | if not semantic_frame: 257 | logger.warning(f"No semantic frame found for entity {entity_name}") 258 | return None 259 | 260 | # Convert ReferenceFrame to dictionary if needed 261 | if hasattr(semantic_frame, 'to_dict'): 262 | semantic_frame = semantic_frame.to_dict() 263 | 264 | # Initialize with empty dicts if needed 265 | properties = semantic_frame.get('properties', {}) if isinstance(semantic_frame, dict) else {} 266 | relationships = semantic_frame.get('relationships', {}) if isinstance(semantic_frame, dict) else {} 267 | 268 | semantic_frame = { 269 | 'properties': properties, 270 | 'relationships': relationships 271 | } 272 | except Exception as e: 273 | logger.error(f"Error accessing entity data: {str(e)}") 274 | return None 275 | 276 | # Prepare enriched entity data 277 | enriched_entity = { 278 | "name": entity_name, 279 | "column": "Semantic", # Always add enrichment to semantic memory 280 | "properties": {}, 281 | "relationships": {} 282 | } 283 | 284 | # Add existing properties and relationships 285 | if semantic_frame.get('properties'): 286 | enriched_entity["properties"].update(semantic_frame['properties']) 287 | 288 | if semantic_frame.get('relationships'): 289 | enriched_entity["relationships"].update(semantic_frame['relationships']) 290 | 291 | # Add ConceptNet knowledge 292 | for prop_key, values in knowledge.get("properties", {}).items(): 293 | if prop_key not in enriched_entity["properties"]: 294 | enriched_entity["properties"][prop_key] = [] 295 | 296 | # Add new properties with ConceptNet source 297 | for value in values[:3]: # Limit to top 3 values 298 | if isinstance(value, dict) and "value" in value: 299 | val = value["value"] 300 | # Convert lists to string representation if needed 301 | if isinstance(val, (list, tuple)): 302 | # Convert each list item to a separate property 303 | for v in val: 304 | enriched_entity["properties"][prop_key].append({ 305 | "value": str(v), 306 | "confidence": 0.7, # Lower confidence for auto-enriched properties 307 | "sources": ["ConceptNet"] 308 | }) 309 | else: 310 | enriched_entity["properties"][prop_key].append({ 311 | "value": str(val), 312 | "confidence": 0.7, # Lower confidence for auto-enriched properties 313 | "sources": ["ConceptNet"] 314 | }) 315 | 316 | for rel_type, targets in knowledge.get("relationships", {}).items(): 317 | if rel_type not in enriched_entity["relationships"]: 318 | enriched_entity["relationships"][rel_type] = [] 319 | 320 | # Add new relationships with ConceptNet source 321 | for target in targets[:3]: # Limit to top 3 relationships 322 | if isinstance(target, dict) and "value" in target: 323 | val = target["value"] 324 | # Ensure relationship targets are strings 325 | # Always convert relationships to individual string entries 326 | if isinstance(val, (list, tuple)): 327 | for v in val: 328 | if v: # Skip empty values 329 | 
enriched_entity["relationships"][rel_type].append({ 330 | "value": str(v).strip(), 331 | "confidence": 0.7, 332 | "sources": ["ConceptNet"] 333 | }) 334 | elif val: # Skip empty values 335 | enriched_entity["relationships"][rel_type].append({ 336 | "value": str(val).strip(), 337 | "confidence": 0.7, 338 | "sources": ["ConceptNet"] 339 | }) 340 | 341 | # Update entity with enriched data 342 | if (enriched_entity["properties"] or enriched_entity["relationships"]): 343 | logger.info(f"Enriching {entity_name} with ConceptNet knowledge") 344 | db.add_entity(enriched_entity) # Use add_entity instead of propose_entity_update 345 | logger.info(f"Successfully enriched {entity_name} with ConceptNet knowledge") 346 | return enriched_entity 347 | return None -------------------------------------------------------------------------------- /hawkinsdb/py.typed: -------------------------------------------------------------------------------- 1 | # This file is required to mark the package as typed 2 | -------------------------------------------------------------------------------- /hawkinsdb/storage/__init__.py: -------------------------------------------------------------------------------- 1 | """Storage backends for HawkinsDB.""" 2 | from typing import List 3 | from ..base import StorageBackend, BaseJSONStorage 4 | from ..types import ( 5 | CorticalColumn, 6 | ReferenceFrame, 7 | PropertyCandidate 8 | ) 9 | 10 | # Import concrete implementations 11 | from .sqlite import SQLiteStorage 12 | 13 | __all__ = [ 14 | 'StorageBackend', 15 | 'BaseJSONStorage', 16 | 'SQLiteStorage', 17 | 'CorticalColumn', 18 | 'ReferenceFrame', 19 | 'PropertyCandidate' 20 | ] 21 | -------------------------------------------------------------------------------- /hawkinsdb/storage/sqlite.py: -------------------------------------------------------------------------------- 1 | """SQLite storage backend implementation.""" 2 | import os 3 | import json 4 | import logging 5 | import sqlite3 6 | from datetime import datetime 7 | from typing import List, Dict, Any, Optional 8 | from pathlib import Path 9 | 10 | logger = logging.getLogger(__name__) 11 | 12 | class SQLiteStorage: 13 | """Simple SQLite storage implementation.""" 14 | 15 | def __init__(self, db_path: str = "hawkins_memory.db"): 16 | """Initialize SQLite storage.""" 17 | try: 18 | # Convert to absolute path 19 | self.db_path = str(Path(db_path).absolute()) 20 | 21 | # Ensure directory exists 22 | directory = os.path.dirname(self.db_path) 23 | if directory: 24 | os.makedirs(directory, exist_ok=True) 25 | 26 | self._initialized = False 27 | 28 | # Remove existing database if it's corrupted 29 | try: 30 | if os.path.exists(self.db_path): 31 | with sqlite3.connect(self.db_path) as test_conn: 32 | test_conn.execute("SELECT 1") 33 | except sqlite3.DatabaseError: 34 | logger.warning(f"Removing corrupted database file: {self.db_path}") 35 | os.remove(self.db_path) 36 | 37 | # Create a new database connection 38 | with sqlite3.connect(self.db_path) as conn: 39 | # Set pragmas for better performance and reliability 40 | conn.execute("PRAGMA foreign_keys = ON") 41 | conn.execute("PRAGMA journal_mode = WAL") 42 | conn.execute("PRAGMA synchronous = NORMAL") 43 | conn.execute("PRAGMA busy_timeout = 5000") 44 | 45 | # Initialize schema in a transaction 46 | self.initialize() 47 | self._initialized = True 48 | 49 | # Verify tables after initialization 50 | cursor = conn.cursor() 51 | cursor.execute("SELECT name FROM sqlite_master WHERE type='table'") 52 | tables = {row[0] for row in 
cursor.fetchall()} 53 | 54 | required_tables = {'columns', 'frames'} 55 | if not required_tables.issubset(tables): 56 | missing = required_tables - tables 57 | raise RuntimeError(f"Failed to create tables: {missing}") 58 | 59 | logger.info(f"SQLite storage initialized successfully at {self.db_path}") 60 | 61 | except sqlite3.Error as e: 62 | logger.error(f"SQLite error during initialization: {str(e)}") 63 | self._initialized = False 64 | if os.path.exists(self.db_path): 65 | try: 66 | os.remove(self.db_path) 67 | except OSError: 68 | pass 69 | raise 70 | except Exception as e: 71 | logger.error(f"Failed to initialize SQLite storage: {str(e)}") 72 | self._initialized = False 73 | if os.path.exists(self.db_path): 74 | try: 75 | os.remove(self.db_path) 76 | except OSError: 77 | pass 78 | raise RuntimeError(f"SQLite storage initialization failed: {str(e)}") 79 | 80 | def get_connection(self): 81 | """Get a database connection with row factory.""" 82 | try: 83 | conn = sqlite3.connect(self.db_path, timeout=60) 84 | conn.row_factory = sqlite3.Row 85 | # Enable foreign keys 86 | conn.execute("PRAGMA foreign_keys = ON") 87 | return conn 88 | except sqlite3.Error as e: 89 | logger.error(f"Failed to establish database connection: {e}") 90 | raise 91 | 92 | def initialize(self): 93 | """Initialize database schema with proper error handling.""" 94 | if not os.path.exists(self.db_path): 95 | # Create new database file if it doesn't exist 96 | open(self.db_path, 'a').close() 97 | 98 | try: 99 | with self.get_connection() as conn: 100 | conn.execute("BEGIN IMMEDIATE TRANSACTION") 101 | 102 | try: 103 | # Create tables with proper constraints 104 | conn.executescript(''' 105 | CREATE TABLE IF NOT EXISTS columns ( 106 | id INTEGER PRIMARY KEY AUTOINCREMENT, 107 | name TEXT UNIQUE NOT NULL, 108 | created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 109 | updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP 110 | ); 111 | 112 | CREATE TABLE IF NOT EXISTS frames ( 113 | id INTEGER PRIMARY KEY AUTOINCREMENT, 114 | name TEXT NOT NULL, 115 | column_id INTEGER NOT NULL, 116 | properties TEXT NOT NULL DEFAULT '{}', 117 | relationships TEXT NOT NULL DEFAULT '{}', 118 | location TEXT DEFAULT '{}', 119 | history TEXT DEFAULT '[]', 120 | created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 121 | updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 122 | FOREIGN KEY (column_id) REFERENCES columns(id) 123 | ON DELETE CASCADE 124 | ON UPDATE CASCADE 125 | ); 126 | 127 | CREATE INDEX IF NOT EXISTS idx_frames_name ON frames(name); 128 | CREATE INDEX IF NOT EXISTS idx_frames_column_id ON frames(column_id); 129 | ''') 130 | 131 | # Verify tables were created 132 | cursor = conn.cursor() 133 | cursor.execute(""" 134 | SELECT name FROM sqlite_master 135 | WHERE type='table' AND name IN ('columns', 'frames') 136 | """) 137 | tables = {row[0] for row in cursor.fetchall()} 138 | 139 | required_tables = {'columns', 'frames'} 140 | if not required_tables.issubset(tables): 141 | missing = required_tables - tables 142 | raise RuntimeError(f"Failed to create required tables: {missing}") 143 | 144 | conn.commit() 145 | self._initialized = True 146 | logger.info("SQLite storage schema initialized successfully") 147 | 148 | except Exception as e: 149 | conn.rollback() 150 | logger.error(f"Error during schema initialization: {str(e)}") 151 | raise 152 | 153 | except sqlite3.Error as e: 154 | logger.error(f"SQLite error during schema initialization: {str(e)}") 155 | self._initialized = False 156 | raise 157 | except Exception as e: 158 | 
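            # Any unexpected failure leaves the backend flagged as uninitialized,
            # so later load/save calls fail fast instead of touching a half-built schema.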
logger.error(f"Failed to initialize SQLite storage schema: {str(e)}") 159 | self._initialized = False 160 | raise 161 | 162 | def load_columns(self) -> List[Dict[str, Any]]: 163 | """Load all columns from storage.""" 164 | if not self._initialized: 165 | raise RuntimeError("Storage not initialized") 166 | 167 | try: 168 | columns = [] 169 | with self.get_connection() as conn: 170 | # Get all columns 171 | cursor = conn.cursor() 172 | for col_row in cursor.execute('SELECT * FROM columns').fetchall(): 173 | frames = [] 174 | 175 | # Get frames for this column 176 | frame_rows = cursor.execute( 177 | 'SELECT * FROM frames WHERE column_id = ?', 178 | (col_row['id'],) 179 | ).fetchall() 180 | 181 | for frame_row in frame_rows: 182 | frame = { 183 | 'name': frame_row['name'], 184 | 'properties': json.loads(frame_row['properties']), 185 | 'relationships': json.loads(frame_row['relationships']), 186 | 'location': json.loads(frame_row['location']) if frame_row['location'] else {}, 187 | 'history': json.loads(frame_row['history']) if frame_row['history'] else [], 188 | 'created_at': frame_row['created_at'], 189 | 'updated_at': frame_row['updated_at'] 190 | } 191 | frames.append(frame) 192 | 193 | column = { 194 | 'name': col_row['name'], 195 | 'frames': frames, 196 | 'created_at': col_row['created_at'], 197 | 'updated_at': col_row['updated_at'] 198 | } 199 | columns.append(column) 200 | 201 | return columns 202 | 203 | except Exception as e: 204 | logger.error("Error loading columns: %s", str(e)) 205 | raise 206 | 207 | def save_columns(self, columns: List[Dict[str, Any]]) -> None: 208 | """Save columns to storage.""" 209 | if not self._initialized: 210 | raise RuntimeError("Storage not initialized") 211 | 212 | try: 213 | with self.get_connection() as conn: 214 | cursor = conn.cursor() 215 | 216 | # Clear existing data 217 | cursor.execute('DELETE FROM frames') 218 | cursor.execute('DELETE FROM columns') 219 | 220 | # Save new data 221 | for column in columns: 222 | now = datetime.now().isoformat() 223 | 224 | # Insert column 225 | cursor.execute( 226 | 'INSERT INTO columns (name, created_at, updated_at) VALUES (?, ?, ?)', 227 | (column['name'], 228 | column.get('created_at', now), 229 | column.get('updated_at', now)) 230 | ) 231 | column_id = cursor.lastrowid 232 | 233 | # Insert frames 234 | for frame in column.get('frames', []): 235 | cursor.execute(''' 236 | INSERT INTO frames ( 237 | name, column_id, properties, relationships, 238 | location, history, created_at, updated_at 239 | ) VALUES (?, ?, ?, ?, ?, ?, ?, ?) 
240 | ''', ( 241 | frame['name'], 242 | column_id, 243 | json.dumps(frame.get('properties', {})), 244 | json.dumps(frame.get('relationships', {})), 245 | json.dumps(frame.get('location', {})), 246 | json.dumps(frame.get('history', [])), 247 | frame.get('created_at', now), 248 | frame.get('updated_at', now) 249 | )) 250 | 251 | logger.info("Successfully saved %d columns", len(columns)) 252 | 253 | except Exception as e: 254 | logger.error("Error saving columns: %s", str(e)) 255 | raise 256 | 257 | def cleanup(self) -> None: 258 | """Clean up resources.""" 259 | logger.info("SQLite storage cleaned up successfully") 260 | -------------------------------------------------------------------------------- /hawkinsdb/types.py: -------------------------------------------------------------------------------- 1 | """Core classes for HawkinsDB memory management.""" 2 | import time 3 | from datetime import datetime 4 | 5 | class PropertyCandidate: 6 | """A property candidate with value and metadata.""" 7 | def __init__(self, value, confidence=1.0, sources=None, timestamp=None): 8 | self.value = value 9 | self.confidence = confidence 10 | self.sources = sources or [] 11 | self.timestamp = timestamp or time.time() 12 | 13 | def to_dict(self): 14 | """Convert to dictionary representation.""" 15 | return { 16 | "value": self.value, 17 | "confidence": self.confidence, 18 | "sources": self.sources, 19 | "timestamp": self.timestamp 20 | } 21 | 22 | @classmethod 23 | def from_dict(cls, data): 24 | """Create from dictionary.""" 25 | if isinstance(data, dict): 26 | return cls( 27 | data.get("value"), 28 | data.get("confidence", 1.0), 29 | data.get("sources", []), 30 | data.get("timestamp", time.time()) 31 | ) 32 | return cls(data) 33 | 34 | class ReferenceFrame: 35 | """Represents a single concept or object.""" 36 | def __init__(self, name, properties=None, relationships=None, location=None, history=None): 37 | self.name = name 38 | self.properties = properties or {} 39 | self.relationships = relationships or {} 40 | self.location = location or {} 41 | self.history = history or [] 42 | self.created_at = datetime.now().isoformat() 43 | self.updated_at = datetime.now().isoformat() 44 | 45 | def to_dict(self): 46 | """Convert to dictionary representation.""" 47 | return { 48 | "name": self.name, 49 | "properties": self.properties, 50 | "relationships": self.relationships, 51 | "location": self.location, 52 | "history": self.history 53 | } 54 | 55 | @classmethod 56 | def from_dict(cls, data): 57 | """Create from dictionary.""" 58 | return cls( 59 | data["name"], 60 | data.get("properties", {}), 61 | data.get("relationships", {}), 62 | data.get("location", {}), 63 | data.get("history", []) 64 | ) 65 | 66 | class CorticalColumn: 67 | """Base class for memory columns.""" 68 | def __init__(self, name, frames=None): 69 | self.name = name 70 | self.frames = frames or [] 71 | self.created_at = datetime.now().isoformat() 72 | self.updated_at = datetime.now().isoformat() 73 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages 2 | import os 3 | 4 | # Read long description from README.md 5 | with open("README.md", "r", encoding="utf-8") as fh: 6 | long_description = fh.read() 7 | 8 | # Read version from package 9 | def get_version(): 10 | with open(os.path.join("hawkinsdb", "__init__.py"), "r") as f: 11 | for line in f: 12 | if line.startswith("__version__"): 
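                # e.g. '__version__ = "1.0.1"' -> '1.0.1' after splitting on "="
                # and stripping whitespace plus surrounding quotes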
13 |                 return line.split("=")[1].strip().strip('"\'')
14 |     return "1.0.1" # Default version if not found
15 | 
16 | setup(
17 |     name="hawkinsdb",
18 |     version=get_version(),
19 |     packages=find_packages(exclude=["tests*", "examples*", "docs*"]),
20 |     install_requires=[
21 |         'requests>=2.25.1',
22 |         'sqlalchemy>=2.0.0',
23 |         'sqlalchemy-utils>=0.41.2',
24 |         'filelock>=3.0.0',
25 |         'typing-extensions>=4.0.0',
26 |         'python-dateutil>=2.8.2',
27 |         'setuptools>=42.0.0'
28 |     ],
29 |     extras_require={
30 |         'dev': [
31 |             'pytest>=7.0.0',
32 |             'pytest-cov>=4.1.0',
33 |             'pytest-asyncio>=0.23.0',
34 |             'black>=23.0.0',
35 |             'isort>=5.12.0',
36 |             'mypy>=1.0.0',
37 |             'ruff>=0.1.0'
38 |         ],
39 |         'docs': [
40 |             'sphinx>=7.0.0',
41 |             'sphinx-rtd-theme>=1.3.0',
42 |             'myst-parser>=2.0.0'
43 |         ],
44 |         'conceptnet': [
45 |             'networkx>=3.0.0',
46 |             'requests-cache>=1.1.0'
47 |         ],
48 |         'llm': [
49 |             'openai>=1.0.0',
50 |             'tenacity>=8.2.0',
51 |             'tiktoken>=0.5.0'
52 |         ],
53 |         'all': [
54 |             'networkx>=3.0.0',
55 |             'requests-cache>=1.1.0',
56 |             'openai>=1.0.0',
57 |             'tenacity>=8.2.0',
58 |             'tiktoken>=0.5.0'
59 |         ]
60 |     },
61 |     author="HawkinsDB Contributors",
62 |     author_email="hawkinsdb@example.com",
63 |     description="A memory layer with ConceptNet integration and LLM-friendly interfaces",
64 |     long_description=long_description,
65 |     long_description_content_type="text/markdown",
66 |     url="https://github.com/hawkinsdb/hawkinsdb",
67 |     project_urls={
68 |         "Bug Tracker": "https://github.com/hawkinsdb/hawkinsdb/issues",
69 |         "Documentation": "https://hawkinsdb.readthedocs.io/",
70 |         "Source Code": "https://github.com/hawkinsdb/hawkinsdb",
71 |     },
72 |     classifiers=[
73 |         "Development Status :: 3 - Alpha",
74 |         "Intended Audience :: Developers",
75 |         "Intended Audience :: Science/Research",
76 |         "License :: OSI Approved :: MIT License",
77 |         "Operating System :: OS Independent",
78 |         "Programming Language :: Python :: 3 :: Only",
79 |         "Programming Language :: Python :: 3.10",
80 |         "Programming Language :: Python :: 3.11",
81 |         "Topic :: Database",
82 |         "Topic :: Scientific/Engineering :: Artificial Intelligence",
83 |         "Topic :: Software Development :: Libraries :: Python Modules",
84 |         "Typing :: Typed"
85 |     ],
86 |     # Match the README, which requires Python 3.10 or higher
87 |     python_requires='>=3.10,<4',
88 |     package_data={
89 |         'hawkinsdb': [
90 |             'README.md',
91 |             'LICENSE',
92 |             'py.typed',
93 |             'storage/*.sql',
94 |             'storage/*.json',
95 |             'storage/*.db'
96 |         ],
97 |     },
98 |     include_package_data=True,
99 | )
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
1 | """
2 | Test package for HawkinsDB.
3 | Contains all unit tests, integration tests, and test utilities.
4 | """ 5 | -------------------------------------------------------------------------------- /tests/document.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/harishsg993010/HawkinsDB/5268d4ec11f55ab53f83d2c1ef29317901732e35/tests/document.pdf -------------------------------------------------------------------------------- /tests/file_rag.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | from typing import List, Dict, Any 4 | import PyPDF2 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | 8 | os.environ["OPENAI_API_KEY"]="" 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | class PDFHawkinsRAG: 14 | def __init__(self, chunk_size: int = 500): 15 | """Initialize the RAG system.""" 16 | self.db = HawkinsDB(storage_type='sqlite',db_path="rag.db") 17 | self.llm_interface = LLMInterface(self.db,auto_enrich=True) 18 | self.chunk_size = chunk_size 19 | 20 | def extract_text_from_pdf(self, pdf_path: str) -> str: 21 | """Extract text content from a PDF file.""" 22 | try: 23 | with open(pdf_path, 'rb') as file: 24 | pdf_reader = PyPDF2.PdfReader(file) 25 | text = "" 26 | for page in pdf_reader.pages: 27 | text += page.extract_text() + "\n" 28 | return text 29 | except Exception as e: 30 | logger.error(f"Error extracting text from PDF: {str(e)}") 31 | raise 32 | 33 | def chunk_text(self, text: str, filename: str) -> List[Dict[str, Any]]: 34 | """Split text into chunks and prepare for database storage.""" 35 | chunks = [] 36 | words = text.split() 37 | current_chunk = [] 38 | chunk_number = 1 39 | 40 | for word in words: 41 | current_chunk.append(word) 42 | if len(current_chunk) >= self.chunk_size: 43 | chunk_text = " ".join(current_chunk) 44 | chunks.append({ 45 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 46 | "column": "Semantic", 47 | "properties": { 48 | "content": chunk_text, 49 | "source_file": filename, 50 | "chunk_number": chunk_number, 51 | }, 52 | "relationships": { 53 | "part_of": [filename], 54 | "next_chunk": [f"{Path(filename).stem}_chunk_{chunk_number + 1}"] if len(words) > self.chunk_size else [] 55 | } 56 | }) 57 | current_chunk = [] 58 | chunk_number += 1 59 | 60 | # Handle remaining text 61 | if current_chunk: 62 | chunk_text = " ".join(current_chunk) 63 | chunks.append({ 64 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 65 | "column": "Semantic", 66 | "properties": { 67 | "content": chunk_text, 68 | "source_file": filename, 69 | "chunk_number": chunk_number, 70 | }, 71 | "relationships": { 72 | "part_of": [filename] 73 | } 74 | }) 75 | 76 | return chunks 77 | 78 | def ingest_pdf(self, pdf_path: str) -> Dict[str, Any]: 79 | """Process and store PDF content in the database.""" 80 | try: 81 | # Extract text from PDF 82 | logger.info(f"Processing PDF: {pdf_path}") 83 | text = self.extract_text_from_pdf(pdf_path) 84 | 85 | # Create document metadata 86 | filename = Path(pdf_path).name 87 | doc_metadata = { 88 | "name": Path(pdf_path).stem, 89 | "column": "Semantic", 90 | "properties": { 91 | "file_type": "PDF", 92 | "file_path": pdf_path, 93 | "file_name": filename, 94 | }, 95 | "relationships": { 96 | "contains": [] 97 | } 98 | } 99 | 100 | # Store document metadata 101 | self.db.add_entity(doc_metadata) 102 | 103 | # Process and store chunks 104 | chunks = self.chunk_text(text, filename) 105 | chunk_names = [] 106 | for chunk in chunks: 107 | 
self.db.add_entity(chunk) 108 | chunk_names.append(chunk["name"]) 109 | 110 | # Update document metadata with chunk references 111 | doc_metadata["relationships"]["contains"] = chunk_names 112 | self.db.add_entity(doc_metadata) 113 | 114 | return { 115 | "success": True, 116 | "message": f"Successfully processed {filename}", 117 | "chunks_created": len(chunks) 118 | } 119 | 120 | except Exception as e: 121 | logger.error(f"Error ingesting PDF: {str(e)}") 122 | return { 123 | "success": False, 124 | "message": str(e) 125 | } 126 | 127 | def query(self, question: str) -> Dict[str, Any]: 128 | """Query the knowledge base with context from stored documents.""" 129 | try: 130 | return self.llm_interface.query(question) 131 | except Exception as e: 132 | logger.error(f"Error processing query: {str(e)}") 133 | return { 134 | "success": False, 135 | "message": str(e), 136 | "response": None 137 | } 138 | 139 | def test_pdf_rag(): 140 | """Test the PDF RAG system.""" 141 | # Initialize the system 142 | rag = PDFHawkinsRAG(chunk_size=500) 143 | 144 | # Test with sample PDF 145 | pdf_path = r"C:\Users\haris\Desktop\personal\AI-Agent\Hawin\tests\document.pdf" # Replace with actual PDF path 146 | if os.path.exists(pdf_path): 147 | # Ingest PDF 148 | logger.info("Ingesting PDF...") 149 | result = rag.ingest_pdf(pdf_path) 150 | logger.info(f"Ingestion result: {result}") 151 | 152 | if result["success"]: 153 | # Test queries 154 | test_queries = [ 155 | "What is the main topic of the document?", 156 | "Summarize the key points from the document.", 157 | "What are the main conclusions drawn in the document?", 158 | "what is silha center", 159 | "who is Charlotte Higgins", 160 | "Explain the lawsuits", 161 | "Explain OpenAI's Involvement", 162 | "who is Mike Masnick" 163 | ] 164 | 165 | logger.info("\nTesting queries:") 166 | for query in test_queries: 167 | logger.info(f"\nQuery: {query}") 168 | response = rag.query(query) 169 | logger.info(f"Response: {response}") 170 | else: 171 | logger.error(f"PDF file not found: {pdf_path}") 172 | 173 | if __name__ == "__main__": 174 | test_pdf_rag() -------------------------------------------------------------------------------- /tests/test_basic.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from hawkinsdb import HawkinsDB, LLMInterface 3 | 4 | logging.basicConfig(level=logging.INFO) 5 | 6 | def test_basic_functionality(): 7 | # Initialize database and interface 8 | db = HawkinsDB() 9 | interface = LLMInterface(db, auto_enrich=True) 10 | 11 | # Test entity with comprehensive data 12 | test_entity = { 13 | "column": "Conceptual", 14 | "type": "Car", 15 | "name": "TestCar1", 16 | "properties": { 17 | "brand": "Tesla", 18 | "color": "red", 19 | "model": "Model 3" 20 | }, 21 | "relationships": { 22 | "type_of": ["Vehicle", "Transport"], 23 | "has_part": ["Engine", "Wheels"] 24 | }, 25 | "location": {"in": "Garage"} 26 | } 27 | 28 | # Add entity 29 | print("\nAdding test entity...") 30 | result = interface.add_entity(test_entity) 31 | print(f"Add result: {result}") 32 | 33 | if result['success']: 34 | # Query the enriched entity 35 | print("\nQuerying enriched entity...") 36 | query_result = interface.query_entity('TestCar1', include_metadata=True) 37 | print(f"Query result: {query_result}") 38 | 39 | if __name__ == "__main__": 40 | test_basic_functionality() 41 | -------------------------------------------------------------------------------- /tests/test_conceptnet.py: 
--------------------------------------------------------------------------------
1 | import logging
2 | from hawkinsdb import HawkinsDB, LLMInterface
3 | import json
4 | 
5 | logging.basicConfig(level=logging.INFO)
6 | logger = logging.getLogger(__name__)
7 | 
8 | def test_conceptnet_enrichment():
9 |     # Initialize database and interface with auto-enrichment enabled
10 |     db = HawkinsDB()
11 |     interface = LLMInterface(db, auto_enrich=True)
12 | 
13 |     # Test entity - Car
14 |     car_entity = {
15 |         "column": "Semantic",
16 |         "type": "Car",
17 |         "name": "TestCar",
18 |         "properties": {
19 |             "brand": "Tesla",
20 |             "model": "Model 3"
21 |         },
22 |         "relationships": {
23 |             "type_of": ["Vehicle"]
24 |         }
25 |     }
26 | 
27 |     # Add entity and get enriched data
28 |     print("\nAdding car entity with auto-enrichment...")
29 |     result = interface.add_entity(car_entity)
30 |     print(f"Add result: {json.dumps(result, indent=2)}")
31 | 
32 |     if result['success']:
33 |         # Query the enriched entity by the name it was stored under
34 |         print("\nQuerying enriched entity...")
35 |         query_result = interface.query_entity('TestCar', include_metadata=True)
36 |         print(f"Enriched entity data: {json.dumps(query_result, indent=2)}")
37 | 
38 |         # Verify enrichment in the Semantic column the entity was added to
39 |         if query_result['success']:
40 |             data = query_result['data']['Semantic']
41 |             print("\nEnriched properties:")
42 |             for prop_type, values in data['properties'].items():
43 |                 print(f"\n{prop_type}:")
44 |                 for value in values:
45 |                     if isinstance(value, dict):
46 |                         print(f" - {value['value']} (confidence: {value['confidence']}, sources: {value['sources']})")
47 |                     else:
48 |                         print(f" - {value}")
49 | 
50 |             print("\nEnriched relationships:")
51 |             for rel_type, values in data['relationships'].items():
52 |                 print(f"\n{rel_type}:")
53 |                 for value in values:
54 |                     if isinstance(value, dict):
55 |                         print(f" - {value['value']} (confidence: {value['confidence']}, sources: {value['sources']})")
56 |                     else:
57 |                         print(f" - {value}")
58 | 
59 | if __name__ == "__main__":
60 |     test_conceptnet_enrichment()
61 | 
--------------------------------------------------------------------------------
/tests/test_enrichment.py:
--------------------------------------------------------------------------------
1 | from hawkinsdb import HawkinsDB, LLMInterface
2 | import json
3 | import logging
4 | 
5 | logging.basicConfig(level=logging.INFO)
6 | 
7 | def test_enrichment():
8 |     # Initialize database and interface
9 |     db = HawkinsDB()
10 |     interface = LLMInterface(db, auto_enrich=True)
11 | 
12 |     # Test entity
13 |     car_entity = {
14 |         'column': 'Conceptual',
15 |         'type': 'Car',
16 |         'name': 'TestCar1',
17 |         'properties': {
18 |             'brand': 'Tesla',
19 |             'model': 'Model 3'
20 |         },
21 |         'relationships': {
22 |             'type_of': ['Vehicle']
23 |         }
24 |     }
25 | 
26 |     # Add entity
27 |     print("\nAdding test entity...")
28 |     result = interface.add_entity(car_entity)
29 |     print(f"Add result: {json.dumps(result, indent=2)}")
30 | 
31 |     if result['success']:
32 |         # Query the enriched entity
33 |         print("\nQuerying enriched entity...")
34 |         query_result = interface.query_entity('TestCar1', include_metadata=True)
35 |         print(f"Query result: {json.dumps(query_result, indent=2)}")
36 | 
37 | if __name__ == "__main__":
38 |     test_enrichment()
39 | 
--------------------------------------------------------------------------------
/tests/test_exmple_full.py:
--------------------------------------------------------------------------------
1 | """
2 | Comprehensive example demonstrating all major features of HawkinsDB.
3 | This example showcases:
4 | 1. Basic CRUD operations
5 | 2. Different memory types (Semantic, Episodic, Procedural)
6 | 3. Memory type validations
7 | 4. Entity queries with query_frames
8 | 5. Listing all stored entities
9 | 6. Cleanup and error handling
10 | """
11 | 
12 | import logging
13 | import time
14 | import json
15 | from hawkinsdb import HawkinsDB, LLMInterface
16 | from datetime import datetime
17 | 
18 | logging.basicConfig(level=logging.INFO)
19 | logger = logging.getLogger(__name__)
20 | 
21 | 
22 | def demonstrate_memory_types(db: HawkinsDB):
23 |     """Demonstrate different memory types and their validations."""
24 |     logger.info("\n=== Testing Different Memory Types ===")
25 | 
26 | 
27 |     laptop_entity = {
28 |         "name": "MacBookPro_M4",
29 |         "column": "Semantic",
30 |         "properties": {
31 |             "brand": "Apple",
32 |             "model": "MacBook Pro",
33 |             "year": "2024",
34 |             "processor": "M4 chip",
35 |             "ram": "16GB",
36 |             "storage": "512GB SSD",
37 |             "location": "home office"
38 |         },
39 |         "relationships": {
40 |             "type_of": ["Laptop", "Computer"],
41 |             "manufactured_by": ["Apple"]
42 |         }
43 |     }
44 | 
45 |     # Add the entity directly first
46 |     logger.info("\nAdding MacBook Pro entity...")
47 |     db.add_entity(laptop_entity)
48 |     # Semantic Memory
49 |     semantic_data = {
50 |         "name": "Tesla_Model_3",
51 |         "column": "Semantic",
52 |         "properties": {
53 |             "type": "electric_car",
54 |             "manufacturer": "Tesla",
55 |             "year": 2024,
56 |             "features": ["autopilot", "battery_powered", "touch_screen"]
57 |         },
58 |         "relationships": {
59 |             "similar_to": ["Tesla_Model_Y", "Tesla_Model_S"],
60 |             "competes_with": ["BMW_i4", "Polestar_2"]
61 |         }
62 |     }
63 |     db.add_entity(semantic_data)
64 |     logger.info("Added semantic memory: Tesla Model 3")
65 | 
66 |     # Episodic Memory
67 |     episodic_data = {
68 |         "name": "First_Tesla_Drive",
69 |         "column": "Episodic",
70 |         "properties": {
71 |             "timestamp": datetime.now().isoformat(),
72 |             "action": "test_drive",
73 |             "location": {
74 |                 "city": "Palo Alto",
75 |                 "state": "CA"
76 |             },
77 |             "duration": "45 minutes",
78 |             "participants": ["customer", "sales_rep"]
79 |         }
80 |     }
81 |     db.add_entity(episodic_data)
82 |     logger.info("Added episodic memory: First Tesla Drive")
83 | 
84 |     # Procedural Memory
85 |     procedural_data = {
86 |         "name": "Tesla_Charging_Process",
87 |         "column": "Procedural",
88 |         "properties": {
89 |             "steps": [
90 |                 "Park near charging station", "Open charging port",
91 |                 "Connect charging cable", "Initiate charging via touchscreen",
92 |                 "Wait for desired charge level", "Disconnect charging cable"
93 |             ],
94 |             "required_tools": ["charging_cable", "Tesla_app"],
95 |             "difficulty":
96 |                 "easy"
97 |         }
98 |     }
99 |     db.add_entity(procedural_data)
100 |     logger.info("Added procedural memory: Tesla Charging Process")
101 | 
102 | 
103 | # Function removed as caching is no longer supported
104 | 
105 | 
106 | def main():
107 |     """Run the comprehensive example."""
108 |     # Initialize database with SQLite storage
109 |     db = HawkinsDB(storage_type='sqlite')
110 | 
111 |     try:
112 |         # Test different memory types
113 |         demonstrate_memory_types(db)
114 | 
115 |         # Test queries
116 |         logger.info("\n=== Testing Queries ===")
117 |         tesla_data = db.query_frames("Tesla_Model_3")
118 |         # Convert ReferenceFrame objects to dictionaries before JSON serialization
119 |         tesla_data_dict = {
120 |             column_name: frame.to_dict()
121 |             for column_name, frame in tesla_data.items()
122 |         }
123 |         logger.info(
124 |             f"Query result for Tesla Model 3: {json.dumps(tesla_data_dict, indent=2)}"
125 |         )
126 | 
127 |         # List all entities
128 |         logger.info("\n=== All Entities ===")
129 |         all_entities = 
db.list_entities() 130 | logger.info(f"Total entities: {len(all_entities)}") 131 | logger.info(f"Entities: {json.dumps(all_entities, indent=2)}") 132 | 133 | except Exception as e: 134 | logger.error(f"Error during example execution: {e}") 135 | raise 136 | finally: 137 | db.cleanup() 138 | 139 | 140 | if __name__ == "__main__": 141 | main() 142 | -------------------------------------------------------------------------------- /tests/test_hawkinsdb_comprehensive.py: -------------------------------------------------------------------------------- 1 | """Comprehensive test suite for HawkinsDB.""" 2 | import logging 3 | import time 4 | import json 5 | from datetime import datetime 6 | import pytest 7 | from hawkinsdb import HawkinsDB 8 | from hawkinsdb.types import PropertyCandidate 9 | 10 | logging.basicConfig(level=logging.DEBUG) 11 | logger = logging.getLogger(__name__) 12 | 13 | class TestHawkinsDBComprehensive: 14 | """Comprehensive test suite for HawkinsDB functionality.""" 15 | 16 | @pytest.fixture 17 | def db(self): 18 | """Initialize test database.""" 19 | db = HawkinsDB() 20 | yield db 21 | db.cleanup() 22 | 23 | def test_property_handling(self, db): 24 | """Test property handling and validation.""" 25 | # Test various property formats and validations 26 | property_data = { 27 | "name": "TestProperty", 28 | "column": "Semantic", 29 | "properties": { 30 | # Test dictionary format with full metadata 31 | "color": [ 32 | {"value": "red", "confidence": 0.9, "sources": ["observation"]}, 33 | {"value": "crimson", "confidence": 0.7, "sources": ["inference"]} 34 | ], 35 | # Test direct value with defaults 36 | "size": "large", 37 | # Test list with mixed formats 38 | "tags": [ 39 | {"value": "test", "confidence": 0.8}, 40 | "important", 41 | {"value": "verified", "sources": ["validation"]} 42 | ], 43 | # Test complex value type conversion 44 | "metadata": {"key": "value", "nested": {"data": True}}, 45 | # Test empty sources 46 | "status": {"value": "active", "confidence": 1.0, "sources": []} 47 | } 48 | } 49 | 50 | result = db.add_entity(property_data) 51 | assert result["success"], f"Failed to add property test: {result.get('message')}" 52 | 53 | # Query and validate 54 | query_results = db.query_frames("TestProperty") 55 | assert "Semantic" in query_results, "Property test memory not found" 56 | frame = query_results["Semantic"] 57 | 58 | # Validate multi-value property with full metadata 59 | assert len(frame.properties["color"]) == 2 60 | assert frame.properties["color"][0].confidence == 0.9 61 | assert "observation" in frame.properties["color"][0].sources 62 | assert frame.properties["color"][1].value == "crimson" 63 | 64 | # Validate direct value conversion 65 | assert len(frame.properties["size"]) == 1 66 | assert frame.properties["size"][0].value == "large" 67 | assert frame.properties["size"][0].confidence == 1.0 68 | 69 | # Validate mixed format list 70 | assert len(frame.properties["tags"]) == 3 71 | assert frame.properties["tags"][0].confidence == 0.8 72 | assert frame.properties["tags"][1].value == "important" 73 | assert "validation" in frame.properties["tags"][2].sources 74 | 75 | # Validate complex value conversion 76 | assert isinstance(frame.properties["metadata"][0].value, str) 77 | 78 | # Validate empty sources handling 79 | assert frame.properties["status"][0].sources == [] 80 | 81 | def test_relationship_handling(self, db): 82 | """Test relationship handling and validation.""" 83 | # Setup test entities with relationships 84 | entities = [ 85 | { 86 | "name": "Dog", 87 
| "column": "Semantic", 88 | "properties": { 89 | "type": "Animal", 90 | "species": "Canis lupus familiaris" 91 | }, 92 | "relationships": { 93 | "is_a": ["Mammal", "Pet"], # Simple values get auto-wrapped 94 | "has_part": [ # Complex values with confidence and sources 95 | {"value": "Tail", "confidence": 1.0, "sources": ["anatomy"]}, 96 | {"value": "Legs", "confidence": 1.0, "sources": ["anatomy"]}, 97 | {"value": "Head", "confidence": 1.0, "sources": ["anatomy"]} 98 | ], 99 | "eats": [ 100 | {"value": "DogFood", "confidence": 0.95, "sources": ["observation"]}, 101 | {"value": "Meat", "confidence": 0.8, "sources": ["nature"]} 102 | ] 103 | } 104 | }, 105 | { 106 | "name": "Mammal", 107 | "column": "Semantic", 108 | "properties": { 109 | "type": "Classification", 110 | "characteristics": ["warm-blooded", "fur/hair", "mammary_glands"] 111 | }, 112 | "relationships": { 113 | "has_instance": [ 114 | {"value": "Dog", "confidence": 1.0}, 115 | {"value": "Cat", "confidence": 1.0} 116 | ] 117 | } 118 | } 119 | ] 120 | 121 | # Add entities 122 | for entity in entities: 123 | result = db.add_entity(entity) 124 | assert result["success"], f"Failed to add entity: {result.get('message')}" 125 | 126 | # Query and validate relationships 127 | dog_result = db.query_frames("Dog") 128 | mammal_result = db.query_frames("Mammal") 129 | 130 | assert "Semantic" in dog_result, "Dog entity not found" 131 | assert "Semantic" in mammal_result, "Mammal entity not found" 132 | 133 | dog_frame = dog_result["Semantic"] 134 | mammal_frame = mammal_result["Semantic"] 135 | 136 | # Validate bidirectional relationships 137 | assert any(v.value == "Mammal" for v in dog_frame.relationships["is_a"]), "Missing 'is_a' relationship" 138 | assert any(v.value == "Dog" for v in mammal_frame.relationships["has_instance"]), "Missing 'has_instance' relationship" 139 | 140 | # Validate relationship properties 141 | assert any(v.value == "Pet" and v.confidence == 0.9 for v in dog_frame.relationships["is_a"]) 142 | assert any(v.value == "DogFood" and v.confidence == 0.95 and "observation" in v.sources for v in dog_frame.relationships["eats"]) 143 | 144 | def test_query_and_update(self, db): 145 | """Test querying and updating functionality.""" 146 | # Add test data 147 | initial_data = { 148 | "name": "TestEntity", 149 | "column": "Semantic", 150 | "properties": { 151 | "status": "active", # Simple value gets auto-wrapped 152 | "tags": {"value": ["test", "initial"], "confidence": 1.0} # Complex value with confidence 153 | } 154 | } 155 | 156 | result = db.add_entity(initial_data) 157 | assert result["success"], "Failed to add initial entity" 158 | 159 | # Test querying 160 | query_result = db.query_frames("TestEntity") 161 | assert "Semantic" in query_result, "Entity not found in query results" 162 | frame = query_result["Semantic"] 163 | assert frame.properties["status"][0].value == "active" 164 | 165 | # Test updating 166 | update_data = { 167 | "name": "TestEntity", 168 | "column": "Semantic", 169 | "properties": { 170 | "status": PropertyCandidate("inactive", confidence=0.8), 171 | "tags": ["test", "updated"] 172 | } 173 | } 174 | 175 | update_result = db.update_entity(update_data) 176 | assert update_result["success"], "Failed to update entity" 177 | 178 | # Verify update 179 | updated_result = db.query_frames("TestEntity") 180 | updated_frame = updated_result["Semantic"] 181 | status_prop = updated_frame.properties.get("status", []) 182 | assert len(status_prop) > 0, "Status property not found" 183 | assert status_prop[0].value == 
"inactive", f"Expected 'inactive' but got {status_prop[0].value}" 184 | assert status_prop[0].confidence == 0.8, f"Expected confidence 0.8 but got {status_prop[0].confidence}" 185 | 186 | def test_error_handling(self, db): 187 | """Test error handling and validation.""" 188 | # Test invalid entity name 189 | invalid_name = { 190 | "name": "", # Empty name 191 | "column": "Semantic", 192 | "properties": {"test": "value"} 193 | } 194 | result = db.add_entity(invalid_name) 195 | assert not result["success"], "Should fail with empty name" 196 | assert "message" in result, "Error message should be present" 197 | 198 | # Test invalid column 199 | invalid_column = { 200 | "name": "TestInvalid", 201 | "column": "InvalidColumn", # Non-existent column 202 | "properties": {"test": "value"} 203 | } 204 | result = db.add_entity(invalid_column) 205 | assert not result["success"], "Should fail with invalid column" 206 | 207 | # Test invalid property format 208 | invalid_property = { 209 | "name": "TestInvalid", 210 | "column": "Semantic", 211 | "properties": None # Invalid properties 212 | } 213 | result = db.add_entity(invalid_property) 214 | assert not result["success"], "Should fail with invalid properties" 215 | 216 | # Test duplicate entity handling 217 | duplicate = { 218 | "name": "TestDuplicate", 219 | "column": "Semantic", 220 | "properties": {"test": "value"} 221 | } 222 | first_result = db.add_entity(duplicate) 223 | assert first_result["success"], "First addition should succeed" 224 | 225 | second_result = db.add_entity(duplicate) 226 | assert not second_result["success"], "Duplicate addition should fail" 227 | 228 | if __name__ == "__main__": 229 | pytest.main([__file__, "-v", "--log-cli-level=DEBUG"]) -------------------------------------------------------------------------------- /tests/test_memory_specific.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | import time 3 | from hawkinsdb import HawkinsDB 4 | from hawkinsdb.types import PropertyCandidate 5 | import logging 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger(__name__) 9 | 10 | class TestMemoryTypes(unittest.TestCase): 11 | def setUp(self): 12 | self.db = HawkinsDB(storage_type='sqlite', db_path=":memory:") 13 | def tearDown(self): 14 | """Clean up after each test.""" 15 | if hasattr(self, 'db'): 16 | self.db.cleanup() 17 | 18 | 19 | def test_procedural_memory_basic(self): 20 | """Test basic procedural memory creation and retrieval""" 21 | procedure = { 22 | "name": "TestProcedure", 23 | "column": "Procedural", 24 | "properties": { 25 | "steps": ["Step1: Initialize system", "Step2: Process data", "Step3: Generate output"], 26 | "difficulty": "Easy", 27 | "required_tools": ["Computer", "Software"], 28 | "duration": "5 minutes" 29 | }, 30 | "relationships": { 31 | "requires": ["BasicKnowledge"], 32 | "part_of": ["LargerProcess"] 33 | } 34 | } 35 | 36 | # Add procedure through entity API 37 | result = self.db.add_entity(procedure) 38 | 39 | # Query and verify 40 | frames = self.db.query_frames("TestProcedure") 41 | self.assertIn("Procedural", frames) 42 | 43 | frame = frames["Procedural"] 44 | self.assertEqual(frame.name.lower(), "testprocedure") # Name should be normalized 45 | self.assertTrue(any("steps" in prop for prop in frame.properties.keys())) 46 | self.assertTrue(any("required_tools" in prop for prop in frame.properties.keys())) 47 | 48 | def test_procedural_memory_validation(self): 49 | """Test validation for procedural memory""" 50 | 
# Test missing required fields 51 | invalid_procedure = { 52 | "name": "InvalidProcedure", 53 | "column": "Procedural", 54 | "properties": { 55 | "difficulty": "Easy" 56 | } 57 | } 58 | 59 | with self.assertRaises(HawkinsDB.EntityValidationError): 60 | self.db.add_entity(invalid_procedure) 61 | 62 | def test_episodic_memory_basic(self): 63 | """Test basic episodic memory creation and retrieval""" 64 | current_time = time.time() 65 | episode = { 66 | "name": "FirstExperience", 67 | "column": "Episodic", 68 | "properties": { 69 | "timestamp": current_time, 70 | "action": "Ran first test", 71 | "location": "TestLab", 72 | "participants": ["User1"], 73 | "outcome": "Success", 74 | "duration": "10 minutes" 75 | }, 76 | "relationships": { 77 | "related_to": ["TestProcedure"], 78 | "follows": ["Setup"] 79 | } 80 | } 81 | 82 | # Add episode through entity API 83 | result = self.db.add_entity(episode) 84 | 85 | # Query and verify 86 | frames = self.db.query_frames("FirstExperience") 87 | self.assertIn("Episodic", frames) 88 | 89 | frame = frames["Episodic"] 90 | self.assertEqual(frame.name.lower(), "firstexperience") 91 | self.assertTrue(any("timestamp" in prop for prop in frame.properties.keys())) 92 | self.assertTrue(any("location" in prop for prop in frame.properties.keys())) 93 | self.assertTrue(any("participants" in prop for prop in frame.properties.keys())) 94 | 95 | def test_episodic_memory_validation(self): 96 | """Test validation for episodic memory""" 97 | # Test missing required fields 98 | invalid_episode = { 99 | "name": "InvalidEpisode", 100 | "column": "Episodic", 101 | "properties": { 102 | "location": "TestLab" 103 | } 104 | } 105 | 106 | with self.assertRaises(HawkinsDB.EntityValidationError): 107 | self.db.add_entity(invalid_episode) 108 | 109 | def test_memory_links(self): 110 | """Test linking between procedural and episodic memories""" 111 | # Add a procedure first 112 | procedure = { 113 | "name": "LinkedProcedure", 114 | "column": "Procedural", 115 | "properties": { 116 | "steps": ["Step1", "Step2"], 117 | "difficulty": "Medium", 118 | "required_tools": ["TestTool"] 119 | }, 120 | "relationships": {} 121 | } 122 | 123 | # Add procedure through entity API 124 | result = self.db.add_entity(procedure) 125 | 126 | # Add an episode that references the procedure 127 | current_time = time.time() 128 | episode = { 129 | "name": "LinkedEpisode", 130 | "column": "Episodic", 131 | "properties": { 132 | "timestamp": current_time, 133 | "action": "Executed procedure", 134 | "location": "TestLocation", 135 | "participants": ["Tester"] 136 | }, 137 | "relationships": { 138 | "follows": ["LinkedProcedure"] 139 | } 140 | } 141 | 142 | # Add episode through entity API 143 | result = self.db.add_entity(episode) 144 | 145 | # Verify the link 146 | episode_frames = self.db.query_frames("LinkedEpisode") 147 | self.assertIn("Episodic", episode_frames) 148 | self.assertTrue( 149 | any("LinkedProcedure" in str(rel.value) 150 | for rel in episode_frames["Episodic"].relationships.get("follows", [])) 151 | ) 152 | 153 | def test_sequential_episodes(self): 154 | """Test creating and linking sequential episodes""" 155 | base_time = time.time() 156 | 157 | # Create a sequence of related episodes 158 | episodes = [ 159 | { 160 | "name": f"Episode_{i}", 161 | "column": "Episodic", 162 | "properties": { 163 | "timestamp": base_time + i * 3600, # Hour intervals 164 | "action": f"Action_{i}", 165 | "participants": ["Tester"] 166 | }, 167 | "relationships": { 168 | "follows": [f"Episode_{i-1}"] if i > 0 else [] 
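                    # Episode_0 starts the chain, so its "follows" list stays empty;
                    # each later episode links back to its immediate predecessor.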
169 | } 170 | } for i in range(3) 171 | ] 172 | 173 | # Add episodes through entity API 174 | for episode in episodes: 175 | result = self.db.add_entity(episode) 176 | 177 | # Verify sequential relationships 178 | for i in range(1, 3): 179 | frames = self.db.query_frames(f"Episode_{i}") 180 | self.assertTrue( 181 | any(f"Episode_{i-1}" in str(rel.value) 182 | for rel in frames["Episodic"].relationships.get("follows", [])) 183 | ) 184 | 185 | if __name__ == '__main__': 186 | unittest.main() 187 | -------------------------------------------------------------------------------- /tests/test_memory_types.py: -------------------------------------------------------------------------------- 1 | """Test different memory types and their validations.""" 2 | import logging 3 | import time 4 | import json 5 | from datetime import datetime 6 | import pytest 7 | from hawkinsdb import HawkinsDB 8 | from hawkinsdb.types import PropertyCandidate 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | class TestMemoryTypes: 14 | """Test class for different memory types and their validations.""" 15 | 16 | @pytest.fixture 17 | def db(self): 18 | """Initialize test database.""" 19 | db = HawkinsDB() 20 | yield db 21 | db.cleanup() 22 | 23 | def validate_episodic_memory(self, case, query_result): 24 | """Validate episodic memory specific requirements.""" 25 | assert 'timestamp' in query_result, "Episodic memory missing timestamp" 26 | assert isinstance(query_result.get('timestamp'), (int, float)), "Invalid timestamp type" 27 | assert 'action' in query_result, "Episodic memory missing action" 28 | if 'participants' in case: 29 | assert 'participants' in query_result, "Episodic memory missing participants" 30 | assert isinstance(query_result.get('participants', []), list), "Participants must be a list" 31 | 32 | def validate_procedural_memory(self, case, query_result): 33 | """Validate procedural memory specific requirements.""" 34 | assert 'steps' in query_result, "Procedural memory missing steps" 35 | assert isinstance(query_result.get('steps', []), list), "Steps must be a list" 36 | assert len(query_result.get('steps', [])) > 0, "Steps cannot be empty" 37 | if 'properties' in case and 'required_tools' in case['properties']: 38 | assert 'required_tools' in query_result, "Procedural memory missing required tools" 39 | assert isinstance(query_result.get('required_tools', []), list), "Required tools must be a list" 40 | 41 | def test_semantic_memory(self, db): 42 | """Test semantic memory creation and validation.""" 43 | semantic_data = { 44 | "name": "TestConcept1", 45 | "column": "Semantic", 46 | "properties": { 47 | "definition": "A test concept", 48 | "category": "Test" 49 | }, 50 | "relationships": { 51 | "related_to": ["AnotherConcept"], 52 | "part_of": ["LargerConcept"] 53 | } 54 | } 55 | result = db.add_entity(semantic_data) 56 | assert result["success"], f"Failed to add semantic memory: {result.get('message')}" 57 | 58 | query_results = db.query_frames("TestConcept1") 59 | assert "Semantic" in query_results, "Semantic memory not found in query results" 60 | 61 | frame = query_results["Semantic"] 62 | assert frame.name == "TestConcept1" 63 | assert "definition" in frame.properties 64 | assert "category" in frame.properties 65 | 66 | def test_episodic_memory(self, db): 67 | """Test episodic memory creation and validation.""" 68 | episodic_data = { 69 | "name": "TestEvent1", 70 | "column": "Episodic", 71 | "timestamp": time.time(), 72 | "action": "Created test", 73 
| "properties": { 74 | "location": "Test Environment", 75 | "duration": "10 minutes", 76 | "outcome": "Success", 77 | "participants": ["User1", "System"] 78 | } 79 | } 80 | result = db.add_entity(episodic_data) 81 | assert result["success"], f"Failed to add episodic memory: {result.get('message')}" 82 | 83 | query_results = db.query_frames("TestEvent1") 84 | assert "Episodic" in query_results, "Episodic memory not found in query results" 85 | 86 | frame = query_results["Episodic"] 87 | self.validate_episodic_memory(episodic_data, frame.to_dict()) 88 | 89 | def test_procedural_memory(self, db): 90 | """Test procedural memory creation and validation.""" 91 | procedural_data = { 92 | "name": "TestProcedure1", 93 | "column": "Procedural", 94 | "steps": [ 95 | "Step 1", 96 | "Step 2", 97 | "Step 3" 98 | ], 99 | "properties": { 100 | "purpose": "Test procedure execution", 101 | "difficulty": "Easy", 102 | "prerequisites": ["Required skill 1"], 103 | "success_criteria": ["Criterion 1"] 104 | } 105 | } 106 | result = db.add_entity(procedural_data) 107 | assert result["success"], f"Failed to add procedural memory: {result.get('message')}" 108 | 109 | query_results = db.query_frames("TestProcedure1") 110 | assert "Procedural" in query_results, "Procedural memory not found in query results" 111 | 112 | frame = query_results["Procedural"] 113 | self.validate_procedural_memory(procedural_data, frame.to_dict()) 114 | 115 | def test_invalid_memory_types(self, db): 116 | """Test invalid memory type validations.""" 117 | invalid_cases = [ 118 | # Missing name 119 | { 120 | "column": "Semantic", 121 | "properties": {"definition": "Should fail"} 122 | }, 123 | # Invalid timestamp type 124 | { 125 | "name": "InvalidEvent1", 126 | "column": "Episodic", 127 | "timestamp": "not a timestamp", 128 | "action": "Should fail" 129 | }, 130 | # Missing steps 131 | { 132 | "name": "InvalidProcedure1", 133 | "column": "Procedural" 134 | } 135 | ] 136 | 137 | for case in invalid_cases: 138 | result = db.add_entity(case) 139 | assert not result["success"], f"Invalid case should fail: {case}" 140 | assert "message" in result, "Error message should be present" -------------------------------------------------------------------------------- /tests/test_openai.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | import json 4 | import unittest 5 | from typing import Optional, Dict, Any 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.openai_interface import OpenAIInterface 8 | from openai import OpenAI, OpenAIError, BadRequestError 9 | from openai.types.chat import ChatCompletion 10 | 11 | # Configure logging with more detailed output 12 | logging.basicConfig( 13 | level=logging.DEBUG, # Changed to DEBUG for more detailed logs 14 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', 15 | force=True # Ensure our configuration takes precedence 16 | ) 17 | logger = logging.getLogger(__name__) 18 | 19 | class TestOpenAIInterface(unittest.TestCase): 20 | """Test OpenAI integration with HawkinsDB.""" 21 | 22 | def setUp(self): 23 | """Set up test environment with API key validation.""" 24 | try: 25 | # Initialize database 26 | self.db = HawkinsDB() 27 | 28 | # Get and validate API key 29 | self.api_key = os.getenv("OPENAI_API_KEY") 30 | if not self.api_key: 31 | self.skipTest("OPENAI_API_KEY environment variable not set") 32 | 33 | if not self.api_key.startswith('sk-'): 34 | self.skipTest("Invalid OpenAI API key format") 35 | 36 | # Initialize 
interface with test model
37 |             self.model = "gpt-3.5-turbo-1106"
38 |             self.interface = OpenAIInterface(self.db, model=self.model)
39 | 
40 |             # Verify API connection
41 |             try:
42 |                 self.interface._test_connection()
43 |             except Exception as e:
44 |                 self.skipTest(f"Failed to connect to OpenAI API: {str(e)}")
45 | 
46 |             # Set up test data
47 |             self.text_description = """
48 |             There's a red Tesla Model 3 in my garage. It's an electric vehicle
49 |             with autopilot capabilities and a glass roof. The car was manufactured
50 |             in 2023 and has only 1000 miles on it.
51 |             """
52 | 
60 |         except Exception as e:
61 |             logger.error(f"Test environment initialization failed: {str(e)}")
62 |             self.fail(f"Failed to initialize test environment: {str(e)}")
63 | 
64 |         logger.info("Test environment initialized successfully")
65 | 
66 |     def tearDown(self):
67 |         """Clean up test data and resources."""
68 |         try:
69 |             # Clean up sensitive data
70 |             if hasattr(self, 'db') and hasattr(self.db, 'config'):
71 |                 try:
72 |                     # Clear credentials
73 |                     self.db.config.clear_sensitive_data()
74 |                 except Exception as e:
75 |                     logger.warning(f"Failed to clear sensitive data: {str(e)}")
76 | 
77 |             # Clean up database test data
78 |             if hasattr(self, 'db'):
79 |                 try:
80 |                     self.db._perform_maintenance()
81 |                 except Exception as e:
82 |                     logger.warning(f"Database cleanup failed: {str(e)}")
83 | 
84 |             # Clear OpenAI interface
85 |             if hasattr(self, 'interface'):
86 |                 try:
87 |                     # Clear client and model references
88 |                     if hasattr(self.interface, 'client'):
89 |                         delattr(self.interface, 'client')
90 |                     delattr(self, 'interface')
91 |                 except Exception as e:
92 |                     logger.warning(f"Interface cleanup failed: {str(e)}")
93 | 
94 |         except Exception as e:
95 |             logger.error(f"Cleanup failed: {str(e)}")
96 |         finally:
97 |             # Force garbage collection
98 |             import gc
99 |             gc.collect()
100 | 
101 |     def test_parse_entity_from_text(self):
102 |         """Test parsing entity from natural language text with new API format."""
103 |         logger.info("\nTesting entity parsing from text...")
104 |         try:
105 |             result = self.interface.parse_entity_from_text(self.text_description)
106 | 
107 |             # Verify successful response
108 |             self.assertTrue(result['success'], f"Failed to parse entity: {result.get('message', 'Unknown error')}")
109 |             self.assertIn('entity_data', result, "Response missing entity_data")
110 |             self.assertIsNotNone(result['entity_data'], "Entity data is None")
111 | 
112 |             # Verify entity structure with new API format
113 |             entity_data = result['entity_data']
114 |             self.assertIn('name', entity_data, "Entity missing name field")
115 |             self.assertTrue(entity_data['name'].strip(), "Entity name should not be empty")
116 |             self.assertIn('properties', entity_data, "Entity missing properties field")
117 |             self.assertIsInstance(entity_data['properties'], dict, "Properties should be a dictionary")
118 | 
119 |             # Verify Tesla-specific properties with more flexible matching
120 |             props = entity_data['properties']
121 | 
122 |             # Check for color property
123 |             color_found = any(
124 |                 'color' in k.lower() or 'red' in str(v).lower() or
125 |                 any('red' in str(val).lower() for val in v if isinstance(v, list))
126 |                 for k, v in props.items()
127 |             )
128 |             self.assertTrue(color_found, "Color property missing or incorrect")
129 | 
130 |             # Check for 
model/make property with broader matching 131 | model_found = any( 132 | any(term in k.lower() or term in str(v).lower() or 133 | any(term in str(val).lower() for val in v if isinstance(v, list)) 134 | for term in ['model', 'tesla', 'make', 'vehicle']) 135 | for k, v in props.items() 136 | ) 137 | self.assertTrue(model_found, "Model/make property missing or incorrect") 138 | 139 | year_found = any('year' in k.lower() or '2023' in str(v) 140 | for k, v in props.items()) 141 | self.assertTrue(year_found, "Year property missing or incorrect") 142 | 143 | logger.info(f"Parsed entity: {json.dumps(result, indent=2)}") 144 | 145 | except OpenAIError as oe: 146 | self.skipTest(f"OpenAI API error: {str(oe)}") 147 | except Exception as e: 148 | self.fail(f"Test failed with unexpected exception: {str(e)}") 149 | 150 | def test_add_entity_to_database(self): 151 | """Test adding parsed entity to database.""" 152 | logger.info("\nTesting entity addition to database...") 153 | try: 154 | # First ensure we have a valid API key 155 | if not self.api_key: 156 | self.skipTest("No valid API key available") 157 | 158 | # Parse the entity 159 | parsed_result = self.interface.parse_entity_from_text(self.text_description) 160 | self.assertTrue(parsed_result['success'], 161 | f"Failed to parse entity: {parsed_result.get('message', 'Unknown error')}") 162 | self.assertIsNotNone(parsed_result.get('entity_data'), "No entity data returned") 163 | 164 | # Add to database 165 | add_result = self.db.add_entity(parsed_result['entity_data']) 166 | self.assertTrue(add_result['success'], 167 | f"Failed to add entity: {add_result.get('message', 'Unknown error')}") 168 | self.assertIsNotNone(add_result.get('entity_name'), "No entity name returned") 169 | 170 | # Verify entity was added correctly 171 | entity_name = add_result['entity_name'] 172 | frames = self.db.query_frames(entity_name) 173 | self.assertTrue(frames, f"Entity {entity_name} not found in database") 174 | 175 | logger.info(f"Entity addition result: {json.dumps(add_result, indent=2)}") 176 | 177 | except OpenAIError as oe: 178 | self.skipTest(f"OpenAI API error: {str(oe)}") 179 | except Exception as e: 180 | self.fail(f"Test failed with unexpected exception: {str(e)}") 181 | 182 | def test_answer_question(self): 183 | """Test querying the database using natural language.""" 184 | logger.info("\nTesting natural language query...") 185 | try: 186 | # First add an entity to query 187 | parsed_result = self.interface.parse_entity_from_text(self.text_description) 188 | self.assertTrue(parsed_result['success'], "Failed to parse entity for query test") 189 | 190 | add_result = self.db.add_entity(parsed_result['entity_data']) 191 | self.assertTrue(add_result['success'], "Failed to add entity for query test") 192 | 193 | # Test querying 194 | query = "What are the main features of this car and where is it located?" 
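            # The answer is generated from the entity stored above; the assertions
            # below check for key facts rather than exact wording, since LLM output
            # varies between runs.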
182 |     def test_answer_question(self):
183 |         """Test querying the database using natural language."""
184 |         logger.info("\nTesting natural language query...")
185 |         try:
186 |             # First add an entity to query
187 |             parsed_result = self.interface.parse_entity_from_text(self.text_description)
188 |             self.assertTrue(parsed_result['success'], "Failed to parse entity for query test")
189 | 
190 |             add_result = self.db.add_entity(parsed_result['entity_data'])
191 |             self.assertTrue(add_result['success'], "Failed to add entity for query test")
192 | 
193 |             # Test querying
194 |             query = "What are the main features of this car and where is it located?"
195 |             query_result = self.interface.answer_question(query)
196 | 
197 |             self.assertTrue(query_result['success'],
198 |                             f"Query failed: {query_result.get('message', 'Unknown error')}")
199 |             self.assertIsNotNone(query_result.get('response'), "No response returned")
200 | 
201 |             # Verify response content with more detailed assertions
202 |             response = query_result['response']
203 |             self.assertIn('Tesla', response, "Response should mention Tesla")
204 |             self.assertIn('garage', response, "Response should mention location (garage)")
205 | 
206 |             # Verify response structure
207 |             self.assertIsInstance(response, str, "Response should be a string")
208 |             self.assertTrue(len(response) > 20, "Response should be a meaningful length")
209 | 
210 |             # Log the actual response for debugging
211 |             logger.info(f"Query response: {response}")
212 | 
213 |             # Verify key information is present
214 |             key_terms = ['Model 3', 'electric', 'red']
215 |             found_terms = [term for term in key_terms if term.lower() in response.lower()]
216 |             self.assertTrue(found_terms, f"Response should contain at least one of: {key_terms}")
217 | 
218 |             logger.info(f"Query result: {json.dumps(query_result, indent=2)}")
219 | 
220 |         except OpenAIError as oe:
221 |             self.skipTest(f"OpenAI API error: {str(oe)}")
222 |         except Exception as e:
223 |             self.fail(f"Test failed with exception: {str(e)}")
224 | 
225 |     def test_error_handling(self):
226 |         """Test error handling with OpenAI API v1.0+."""
227 |         logger.info("\nTesting error handling...")
228 | 
229 |         # Test with empty input
230 |         result = self.interface.parse_entity_from_text("")
231 |         self.assertFalse(result['success'])
232 |         self.assertIn('message', result)
233 |         self.assertIsNone(result.get('entity_data'))
234 | 
235 |         # Test with empty query
236 |         query_result = self.interface.answer_question("")
237 |         self.assertFalse(query_result['success'])
238 |         self.assertIn('message', query_result)
239 |         self.assertIsNone(query_result.get('response'))
240 | 
241 |         # Test API key validation
242 |         def test_invalid_key():
243 |             # Test with invalid API key format
244 |             try:
245 |                 bad_interface = OpenAIInterface(self.db)
246 |                 bad_interface.client = OpenAI(api_key="sk_test_invalid123")
247 |                 result = bad_interface.parse_entity_from_text(self.text_description)
248 |                 self.assertFalse(result['success'])
249 |                 self.assertTrue('invalid' in result['message'].lower())
250 |             except Exception as e:
251 |                 self.fail(f"Unexpected exception: {str(e)}")
252 |         test_invalid_key()
253 | 
254 |         # Test various error scenarios with v1.0+ error patterns
255 |         original_client = self.interface.client
256 |         try:
257 |             # Test rate limit
258 |             def mock_rate_limit(*args, **kwargs):
259 |                 raise OpenAIError("rate_limit_exceeded: Please retry your request after 20s")
260 | 
261 |             self.interface.client.chat.completions.create = mock_rate_limit
262 |             result = self.interface.parse_entity_from_text(self.text_description)
263 |             self.assertFalse(result['success'])
264 |             self.assertTrue(
265 |                 any(term in result.get('message', '').lower()
266 |                     for term in ['rate limit', 'try again']),
267 |                 "Error message should indicate rate limit"
268 |             )
269 | 
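            # The remaining scenarios follow the same pattern as the rate-limit
            # case above: stub client.chat.completions.create with a function
            # that raises a specific OpenAIError, then verify the interface
            # surfaces a structured failure instead of propagating the exception.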
270 |             # Test quota exceeded
271 |             def mock_quota(*args, **kwargs):
272 |                 raise OpenAIError("insufficient_quota: You exceeded your current quota")
273 | 
274 |             self.interface.client.chat.completions.create = mock_quota
275 |             result = self.interface.parse_entity_from_text(self.text_description)
276 |             self.assertFalse(result['success'])
277 |             self.assertTrue(
278 |                 'quota' in result.get('message', '').lower(),
279 |                 "Error message should indicate quota exceeded"
280 |             )
281 | 
282 |             # Test model not found
283 |             def mock_model_error(*args, **kwargs):
284 |                 raise OpenAIError("model_not_found: The model does not exist")
285 | 
286 |             self.interface.client.chat.completions.create = mock_model_error
287 |             result = self.interface.parse_entity_from_text(self.text_description)
288 |             self.assertFalse(result['success'])
289 |             self.assertTrue(
290 |                 'model' in result.get('message', '').lower(),
291 |                 "Error message should indicate model not found"
292 |             )
293 | 
294 |             # Test timeout
295 |             def mock_timeout(*args, **kwargs):
296 |                 raise OpenAIError("request_timeout: Request timed out")
297 | 
298 |             self.interface.client.chat.completions.create = mock_timeout
299 |             result = self.interface.parse_entity_from_text(self.text_description)
300 |             self.assertFalse(result['success'])
301 |             self.assertTrue(
302 |                 'timeout' in result.get('message', '').lower(),
303 |                 "Error message should indicate timeout"
304 |             )
305 | 
306 |             # Test server error
307 |             def mock_server_error(*args, **kwargs):
308 |                 raise OpenAIError("server_error: Internal server error")
309 | 
310 |             self.interface.client.chat.completions.create = mock_server_error
311 |             result = self.interface.parse_entity_from_text(self.text_description)
312 |             self.assertFalse(result['success'])
313 |             self.assertTrue(
314 |                 'server error' in result.get('message', '').lower(),
315 |                 "Error message should indicate server error"
316 |             )
317 | 
318 |             # Test invalid response structure
319 |             class MockResponse:
320 |                 def __init__(self):
321 |                     self.choices = []
322 | 
323 |             def mock_invalid_response(*args, **kwargs):
324 |                 return MockResponse()
325 | 
326 |             self.interface.client.chat.completions.create = mock_invalid_response
327 |             result = self.interface.parse_entity_from_text(self.text_description)
328 |             self.assertFalse(result['success'])
329 |             self.assertTrue(
330 |                 any(term in result.get('message', '').lower()
331 |                     for term in ['invalid', 'error']),
332 |                 "Error message should indicate invalid response"
333 |             )
334 | 
335 |         finally:
336 |             self.interface.client = original_client
337 | 
338 | if __name__ == '__main__':
339 |     unittest.main(verbosity=2)
--------------------------------------------------------------------------------
/tests/test_rag.py:
--------------------------------------------------------------------------------
1 | import json
2 | import logging
3 | import os
4 | from hawkinsdb import HawkinsDB, LLMInterface
5 | from openai import OpenAI
6 | 
7 | # OPENAI_API_KEY must be set in the environment before running these tests.
8 | 
9 | 
10 | logging.basicConfig(level=logging.INFO)
11 | logger = logging.getLogger(__name__)
12 | 
13 | class HawkinsWrapper:
14 |     def __init__(self):
15 |         """Initialize HawkinsDB and its LLM interface."""
16 |         self.db = HawkinsDB(storage_type='sqlite')
17 |         self.llm_interface = LLMInterface(self.db, auto_enrich=True)
18 |         self.client = OpenAI()  # For pre-processing text
19 | 
20 |     def preprocess_text(self, text):
21 |         """Preprocess text to ensure proper entity structure."""
22 |         system_prompt = """Convert the text into a structured entity format, i.e. JSON. Follow these rules strictly:
23 | 
24 | 1. ALWAYS include a clear, unique entity name using underscores (e.g., Python_Language, First_Python_Project)
25 | 2. ALWAYS include a column type (Semantic, Episodic, or Procedural)
26 | 3. Ensure all required fields are present
27 | 
28 | Required format:
29 | {
30 |     "name": "Entity_Name",    // This is REQUIRED
31 |     "column": "Semantic",     // One of: Semantic, Episodic, Procedural
32 |     "type": "category_type",
33 |     "properties": {
34 |         "key1": "value1",
35 |         "key2": ["value2a", "value2b"]
36 |     },
37 |     "relationships": {
38 |         "related_to": ["entity1", "entity2"]
39 |     }
40 | }
41 | 
42 | Extract meaningful details and ensure the name field is properly set."""
43 | 
44 |         try:
45 |             response = self.client.chat.completions.create(
46 |                 model="gpt-3.5-turbo-1106",
47 |                 messages=[
48 |                     {"role": "system", "content": system_prompt},
49 |                     {"role": "user", "content": text}
50 |                 ],
51 |                 temperature=0.3,
52 |                 response_format={"type": "json_object"}
53 |             )
54 | 
55 |             result = json.loads(response.choices[0].message.content)
56 | 
57 |             # Verify required fields
58 |             if not result.get("name"):
59 |                 raise ValueError("Missing required field: name")
60 |             if not result.get("column"):
61 |                 result["column"] = "Semantic"  # Default to Semantic if not specified
62 | 
63 |             return result
64 | 
65 |         except Exception as e:
66 |             logger.error(f"Error in preprocessing: {str(e)}")
67 |             raise
68 | 
69 |     def add_from_text(self, text):
70 |         """Add entity from text description with preprocessing."""
71 |         try:
72 |             # First preprocess the text to ensure proper structure
73 |             processed_data = self.preprocess_text(text)
74 |             logger.info(f"Preprocessed data: {json.dumps(processed_data, indent=2)}")
75 | 
76 |             # Add to database using HawkinsDB's add_entity
77 |             result = self.db.add_entity(processed_data)
78 | 
79 |             return {
80 |                 "success": True,
81 |                 "message": "Successfully added to database",
82 |                 "entity_data": processed_data,
83 |                 "db_result": result,
84 |                 "entity_name": processed_data.get("name")
85 |             }
86 | 
87 |         except Exception as e:
88 |             logger.error(f"Error adding to database: {str(e)}")
89 |             return {
90 |                 "success": False,
91 |                 "message": str(e),
92 |                 "entity_data": None,
93 |                 "entity_name": None
94 |             }
95 | 
96 |     def query_entity(self, entity_name):
97 |         """Query specific entity by name."""
98 |         try:
99 |             frames = self.db.query_frames(entity_name)
100 |             if not frames:
101 |                 return {
102 |                     "success": False,
103 |                     "message": f"No entity found with name: {entity_name}",
104 |                     "data": None
105 |                 }
106 | 
107 |             return {
108 |                 "success": True,
109 |                 "message": "Entity found",
110 |                 "data": frames
111 |             }
112 | 
113 |         except Exception as e:
114 |             logger.error(f"Error querying entity: {str(e)}")
115 |             return {
116 |                 "success": False,
117 |                 "message": str(e),
118 |                 "data": None
119 |             }
120 | 
121 |     def query_by_text(self, query_text):
122 |         """Query database using natural language text."""
123 |         try:
124 |             result = self.llm_interface.query(query_text)
125 |             return result
126 | 
127 |         except Exception as e:
128 |             logger.error(f"Error processing query: {str(e)}")
129 |             return {
130 |                 "success": False,
131 |                 "message": str(e),
132 |                 "response": None
133 |             }
134 | 
135 |     def list_all_entities(self):
136 |         """List all entities in the database."""
137 |         try:
138 |             entities = self.db.list_entities()
139 |             return {
140 |                 "success": True,
141 |                 "message": "Entities retrieved successfully",
142 |                 "entities": entities
143 |             }
144 |         except Exception as e:
145 |             logger.error(f"Error listing entities: {str(e)}")
146 |             return {
147 |                 "success": False,
148 |                 "message": str(e),
149 |                 "entities": None
150 |             }
151 | 
152 | def test_memory_examples():
153 |     """Test function to demonstrate usage."""
154 |     hawkins = HawkinsWrapper()
155 | 
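    # Each raw-text example below should be normalized by preprocess_text into
    # the entity schema from the system prompt before storage. Illustratively,
    # the first example might come back as something like (actual model output
    # varies):
    # {
    #     "name": "Python_Language",
    #     "column": "Semantic",
    #     "type": "Programming_Language",
    #     "properties": {"created_by": "Guido van Rossum", "year": "1991"},
    #     "relationships": {"similar_to": ["Ruby", "JavaScript"]}
    # }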
156 |     # Test adding entries
157 |     examples = [
158 |         """
159 |         Python is a programming language created by Guido van Rossum in 1991.
160 |         It supports object-oriented, imperative, and functional programming.
161 |         It's commonly used for web development, data science, and automation.
162 |         Similar languages include Ruby and JavaScript.
163 |         """,
164 |         """
165 |         Today I completed my first Python project in my home office.
166 |         It took 2 hours and was successful. I did a code review afterwards.
167 |         The project helped me learn about functions and classes.
168 |         """,
169 |         """
170 |         The Tesla Model 3 is red, made in 2023, and parked in the garage.
171 |         It has a range of 358 miles and goes 0-60 mph in 3.1 seconds.
172 |         It features autopilot and a minimalist interior design.
173 |         """,
174 |         """
175 |         Visual Studio Code (VS Code) is a popular code editor developed by Microsoft.
176 |         It was first released in 2015 and is written in TypeScript and JavaScript.
177 |         It supports multiple programming languages through extensions, has integrated
178 |         Git control, and features intelligent code completion. It's commonly used
179 |         alongside Python, JavaScript, and Java development environments.
180 |         """,
181 |         """
182 |         C++ is a beautiful programming language
183 |         """
184 |     ]
185 | 
186 |     # Add examples to database
187 |     logger.info("\nAdding examples to database:")
188 |     for i, example in enumerate(examples, 1):
189 |         logger.info(f"\nAdding Example {i}")
190 |         logger.info("=" * 50)
191 |         logger.info(f"Input Text:\n{example}")
192 |         result = hawkins.add_from_text(example)
193 |         logger.info(f"Result: {json.dumps(result, indent=2)}")
194 | 
195 |     # List all entities
196 |     logger.info("\nListing all entities:")
197 |     entities_result = hawkins.list_all_entities()
198 |     logger.info(f"Entities: {json.dumps(entities_result, indent=2)}")
199 | 
200 |     # Test natural language queries
201 |     test_queries = [
202 |         "Which car has a range of 358 miles and goes 0-60 mph in 3.1 seconds?"
203 |     ]
204 | 
205 |     logger.info("\nTesting natural language queries:")
206 |     for query in test_queries:
207 |         logger.info(f"\nQuery: {query}")
208 |         result = hawkins.query_by_text(query)
209 |         logger.info(f"Response: {json.dumps(result, indent=2)}")
210 | 
211 | if __name__ == "__main__":
212 |     test_memory_examples()
--------------------------------------------------------------------------------
/tests/test_readme_examples.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import time
3 | from hawkinsdb import HawkinsDB, LLMInterface
4 | import os
5 | 
6 | 
7 | 
8 | logging.basicConfig(level=logging.INFO)
9 | logger = logging.getLogger(__name__)
10 | 
11 | def test_readme_examples():
12 |     # Initialize database
13 |     db = HawkinsDB(storage_type='sqlite')
14 | 
15 |     logger.info("\n=== Testing Semantic Memory Example ===")
16 |     # Test semantic memory
17 |     semantic_memory = {
18 |         "column": "Semantic",
19 |         "name": "Python_Language",
20 |         "properties": {
21 |             "type": "Programming_Language",
22 |             "paradigm": ["Object-oriented", "Imperative", "Functional"],
23 |             "created_by": "Guido van Rossum",
24 |             "year": 1991
25 |         },
26 |         "relationships": {
27 |             "used_for": ["Web Development", "Data Science", "Automation"],
28 |             "similar_to": ["Ruby", "JavaScript"]
29 |         }
30 |     }
31 | 
32 |     result = db.add_entity(semantic_memory)
33 |     logger.info(f"Semantic memory add result: {result}")
34 |     frames = db.query_frames("Python_Language")
35 |     logger.info(f"Semantic memory query result: {frames}")
36 | 
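    # Episodic entries describe events, so the next example carries
    # event-specific properties (timestamp, location, outcome, participants)
    # on top of the same name/column/properties/relationships schema.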
37 |     logger.info("\n=== Testing Episodic Memory Example ===")
38 |     # Test episodic memory
39 |     episodic_memory = {
40 |         "column": "Episodic",
41 |         "type": "Event",
42 |         "name": "First_Python_Project",
43 |         "properties": {
44 |             "location": "Home Office",
45 |             "duration": "2 hours",
46 |             "outcome": "Success",
47 |             "timestamp": time.time(),
48 |             "action": "Completed project",
49 |             "participants": ["User1"]
50 |         },
51 |         "relationships": {
52 |             "related_to": ["Python_Language"],
53 |             "followed_by": ["Code_Review"]
54 |         }
55 |     }
56 | 
57 |     result = db.add_entity(episodic_memory)
58 |     logger.info(f"Episodic memory add result: {result}")
59 |     frames = db.query_frames("First_Python_Project")
60 |     logger.info(f"Episodic memory query result: {frames}")
61 | 
62 |     logger.info("\n=== Testing Procedural Memory Example ===")
63 |     # Test procedural memory
64 |     procedural_memory = {
65 |         "column": "Procedural",
66 |         "type": "Procedure",
67 |         "name": "Git_Commit_Process",
68 |         "properties": {
69 |             "difficulty": "Beginner",
70 |             "required_tools": ["Git"],
71 |             "estimated_time": "5 minutes",
72 |             "steps": [
73 |                 "Stage changes using git add",
74 |                 "Review changes with git status",
75 |                 "Commit with descriptive message",
76 |                 "Push to remote repository"
77 |             ]
78 |         },
79 |         "relationships": {
80 |             "prerequisites": ["Git_Installation"],
81 |             "followed_by": ["Git_Push_Process"]
82 |         }
83 |     }
84 | 
85 |     result = db.add_entity(procedural_memory)
86 |     logger.info(f"Procedural memory add result: {result}")
87 |     frames = db.query_frames("Git_Commit_Process")
88 |     logger.info(f"Procedural memory query result: {frames}")
89 | 
90 |     logger.info("\n=== Testing LLM Interface Example ===")
91 |     # Test LLM interface
92 |     interface = LLMInterface(db)
93 | 
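    # The calls below mirror the README's natural-language usage:
    # add_from_text parses free text into an entity, and query answers
    # questions against whatever has been stored.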
94 |     # Test natural language addition
95 |     nl_result = interface.add_from_text("""
96 |         The Tesla Model 3 is a battery electric sedan car manufactured by Tesla.
97 |         It has a red exterior color, was manufactured in 2023, and is currently
98 |         located in the garage. It has an estimated range of 358 miles and
99 |         accelerates from 0 to 60 mph in 3.1 seconds.
100 |     """)
101 |     logger.info(f"LLM interface add result: {nl_result}")
102 | 
103 |     # Test natural language query
104 |     query_result = interface.query("Explain about First Python Project")
105 |     logger.info(f"LLM interface query result: {query_result}")
106 | 
107 | if __name__ == "__main__":
108 |     test_readme_examples()
109 | 
--------------------------------------------------------------------------------
/tests/test_sqlite_storage.py:
--------------------------------------------------------------------------------
1 | """Test suite for SQLite storage backend."""
2 | import os
3 | import unittest
4 | import tempfile
5 | from datetime import datetime
6 | from typing import Optional, Sequence, cast, Type, TypeVar
7 | 
8 | from hawkinsdb.storage.sqlite import SQLiteStorage
9 | from hawkinsdb.types import CorticalColumn, ReferenceFrame, PropertyCandidate
10 | from hawkinsdb.base import BaseCorticalColumn
11 | 
12 | # Type variable for CorticalColumn
13 | T_CorticalColumn = TypeVar('T_CorticalColumn', bound=BaseCorticalColumn)
14 | 
15 | class TestSQLiteStorage(unittest.TestCase):
16 |     """Test cases for SQLite storage backend."""
17 | 
18 |     def setUp(self):
19 |         """Set up test environment with temporary database."""
20 |         # Use temporary file for testing
21 |         self.temp_dir = tempfile.mkdtemp()
22 |         self.db_path = os.path.join(self.temp_dir, "test_hawkins.db")
23 |         self.storage = SQLiteStorage(db_path=self.db_path)
24 |         self.storage.initialize()
25 | 
26 |     def tearDown(self):
27 |         """Clean up test environment."""
28 |         self.storage.cleanup()
29 |         if os.path.exists(self.db_path):
30 |             os.remove(self.db_path)
31 |         os.rmdir(self.temp_dir)
32 | 
33 |     def test_initialize_and_cleanup(self):
34 |         """Test database initialization and cleanup."""
35 |         self.assertTrue(os.path.exists(self.db_path))
36 |         self.storage.cleanup()
37 | 
38 |     def test_save_and_load_columns(self):
39 |         """Test saving and loading columns with various data types."""
40 |         # Create test data
41 |         test_time = datetime.now().isoformat()
42 |         test_columns: Sequence[T_CorticalColumn] = cast(
43 |             Sequence[T_CorticalColumn],
44 |             [
45 |                 CorticalColumn(
46 |                     name="test_column",
47 |                     frames=[
48 |                         ReferenceFrame(
49 |                             name="test_frame",
50 |                             properties={
51 |                                 "color": [PropertyCandidate(value="red", confidence=0.9)],
52 |                                 "size": [PropertyCandidate(value=42, confidence=1.0)]
53 |                             },
54 |                             relationships={
55 |                                 "contains": [PropertyCandidate(value="item", confidence=0.8)]
56 |                             },
57 |                             location={"x": 0, "y": 0},
58 |                             history=[(test_time, "created"), (test_time, "updated")]
59 |                         )
60 |                     ]
61 |                 )
62 |             ]
63 |         )
64 | 
65 |         # Save columns
66 |         self.storage.save_columns(test_columns)
67 | 
68 |         # Load columns
69 |         loaded_columns = self.storage.load_columns()
70 | 
71 |         # Verify data
72 |         self.assertEqual(len(loaded_columns), 1)
73 |         self.assertEqual(loaded_columns[0].name, "test_column")
74 |         self.assertEqual(len(loaded_columns[0].frames), 1)
75 | 
76 |         loaded_frame = loaded_columns[0].frames[0]
77 |         self.assertEqual(loaded_frame.name, "test_frame")
78 |         self.assertEqual(loaded_frame.properties["color"][0].value, "red")
79 |         self.assertEqual(loaded_frame.properties["size"][0].value, 42)
80 |         self.assertEqual(loaded_frame.relationships["contains"][0].value, "item")
81 |         self.assertEqual(loaded_frame.location, {"x": 0, "y": 0})
82 |         self.assertEqual(loaded_frame.history, [(test_time, "created"), (test_time, "updated")])
83 | 
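    # The round-trip assertions above compare tuples and nested objects for
    # exact equality, so the backend is expected to restore value types
    # faithfully rather than returning JSON-decoded approximations.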
84 |     def test_error_handling(self):
85 |         """Test error handling for invalid operations."""
86 |         # Test saving invalid data (empty column name)
87 |         with self.assertRaises(ValueError):
88 |             invalid_columns: Sequence[T_CorticalColumn] = cast(
89 |                 Sequence[T_CorticalColumn],
90 |                 [CorticalColumn(name="", frames=[])]
91 |             )
92 |             self.storage.save_columns(invalid_columns)
93 | 
94 |         # Test empty database path
95 |         with self.assertRaises(ValueError) as cm:
96 |             SQLiteStorage(db_path="")
97 |         self.assertIn("Invalid database path", str(cm.exception))
98 | 
99 |         # Test non-existent directory creation
100 |         with tempfile.TemporaryDirectory() as temp_dir:
101 |             new_dir = os.path.join(temp_dir, "newdir")
102 |             db_path = os.path.join(new_dir, "test.db")
103 |             storage = SQLiteStorage(db_path=db_path)
104 |             self.assertTrue(os.path.exists(new_dir))
105 |             storage.cleanup()
106 | 
107 |         # Test invalid directory permissions
108 |         if os.name != 'nt':  # Skip on Windows
109 |             with tempfile.TemporaryDirectory() as temp_dir:
110 |                 # Create a read-only directory
111 |                 read_only_dir = os.path.join(temp_dir, "readonly")
112 |                 os.makedirs(read_only_dir)
113 |                 os.chmod(read_only_dir, 0o555)  # Read + execute only
114 | 
115 |                 db_path = os.path.join(read_only_dir, "test.db")
116 |                 with self.assertRaises(ValueError) as cm:
117 |                     SQLiteStorage(db_path=db_path)
118 |                 self.assertIn("write", str(cm.exception).lower())
119 | 
120 | if __name__ == '__main__':
121 |     unittest.main()
--------------------------------------------------------------------------------