├── README.md
├── docs
│   ├── README.md
│   ├── conceptnet_guide.md
│   ├── llm_interface_guide.md
│   └── sqlite_backend.md
├── examples
│   ├── HawkinDB_RAG.py
│   ├── basic_demo.py
│   ├── document.pdf
│   ├── file_rag.py
│   ├── hawkins_basic_demo.py
│   ├── hawkins_demo.py
│   ├── hawkinsdb_complete_example.py
│   ├── hawkinsdb_comprehensive.py
│   ├── hawkinsdb_demo.py
│   ├── hawkinsdb_full_example.py
│   ├── hawkinsdb_sqlite_example.py
│   └── sqlite_usage.py
├── hawkinsdb
│   ├── __init__.py
│   ├── base.py
│   ├── config.py
│   ├── core.py
│   ├── enrichment.py
│   ├── llm_interface.py
│   ├── openai_interface.py
│   ├── py.typed
│   ├── storage
│   │   ├── __init__.py
│   │   └── sqlite.py
│   └── types.py
├── setup.py
└── tests
    ├── __init__.py
    ├── document.pdf
    ├── file_rag.py
    ├── test_basic.py
    ├── test_conceptnet.py
    ├── test_enrichment.py
    ├── test_exmple_full.py
    ├── test_hawkinsdb_comprehensive.py
    ├── test_memory_specific.py
    ├── test_memory_types.py
    ├── test_openai.py
    ├── test_rag.py
    ├── test_readme_examples.py
    └── test_sqlite_storage.py

/README.md:
--------------------------------------------------------------------------------
1 | # 🧠 HawkinsDB: Neuroscience-Inspired Memory Layer for LLM Applications
2 | 
3 | Building smarter LLM applications isn't just about better models - it's about better memory. HawkinsDB is our take on giving AI systems a more human-like way to store and recall information, inspired by how our own brains work. Based on Jeff Hawkins' Thousand Brains Theory, it helps AI models manage complex information in a way that's both powerful and intuitive.
4 | 
5 | > 📌 **Note for RAG Users**: If you're specifically looking to implement Retrieval-Augmented Generation (RAG), consider using [HawkinsRAG](https://pypi.org/project/hawkins-rag/0.1.0/) - our dedicated package built on top of HawkinsDB that simplifies RAG implementation with support for 22+ data sources. Check out the [documentation](https://github.com/harishsg993010/HawkinsRAG/tree/main/docs) and [examples](https://github.com/harishsg993010/HawkinsRAG/tree/main/examples) for more details.
6 | 
7 | > 🤖 **Note for Agent Developers**: If you're interested in building AI agents, check out [Hawkins-Agent](https://pypi.org/project/hawkins-agent/) - our specialized framework built on HawkinsDB for creating intelligent agents. Visit our [GitHub repository](https://github.com/harishsg993010/HawkinsAgent) for implementation details.
8 | 
9 | ## Why HawkinsDB?
10 | 
11 | While vector databases and embeddings have revolutionized AI applications, they often miss the nuanced, multi-dimensional nature of information. Here's why we built HawkinsDB:
12 | 
13 | - **It's not just another vector database**: Instead of relying on fuzzy similarity searches, we enable precise, context-aware queries that understand the actual meaning and relationships of your data.
14 | 
15 | - **One memory system to rule them all**: We've unified different types of memory (semantic, episodic, and procedural) into a single framework. Think about a customer support AI that can simultaneously access product specs, past customer interactions, and troubleshooting guides - all working together seamlessly.
16 | 
17 | - **Inspired by the human brain**: We've based our architecture on neuroscience research, using concepts like Reference Frames and Cortical Columns to create a more robust and adaptable system.
18 | 
19 | - **You can actually understand what's happening**: Unlike black-box embeddings, our structured approach lets you see and understand how information is connected and why certain decisions are made.
20 | 
21 | ## Requirements
22 | 
23 | - Python 3.10 or higher
24 | - OpenAI API key (for LLM operations)
25 | - SQLite or JSON storage backend
26 | 
27 | ## Installation
28 | 
29 | ```bash
30 | # Basic installation
31 | pip install hawkinsdb
32 | 
33 | # Recommended installation with all features
34 | pip install hawkinsdb[all]
35 | 
36 | # Install specific features
37 | pip install hawkinsdb[conceptnet]  # ConceptNet tools
38 | ```
39 | 
40 | ## Quick Start
41 | 
42 | Here's a simple example showing the power of HawkinsDB:
43 | 
44 | ```python
45 | from hawkinsdb import HawkinsDB, LLMInterface
46 | 
47 | # Initialize
48 | db = HawkinsDB()
49 | llm = LLMInterface(db)
50 | 
51 | # Store knowledge with multiple perspectives
52 | db.add_entity({
53 |     "column": "Semantic",
54 |     "name": "Coffee Cup",
55 |     "properties": {
56 |         "type": "Container",
57 |         "material": "Ceramic",
58 |         "capacity": "350ml"
59 |     },
60 |     "relationships": {
61 |         "used_for": ["Drinking Coffee", "Hot Beverages"],
62 |         "found_in": ["Kitchen", "Coffee Shop"]
63 |     }
64 | })
65 | 
66 | # Query using natural language
67 | response = llm.query("What can you tell me about the coffee cup?")
68 | print(response)
69 | ```
70 | 
71 | For more examples, check out our [examples directory](examples).
72 | 
73 | ## How It Works
74 | 
75 | HawkinsDB manages information through three core concepts:
76 | 
77 | ### 🧩 Reference Frames
78 | Smart containers for information that capture what something is, its properties, relationships, and context. This enables natural handling of complex queries like "Find kitchen items related to coffee brewing."
79 | 
80 | ### 🌐 Cortical Columns
81 | Just like your brain processes information from multiple perspectives (visual, tactile, conceptual), our system stores knowledge in different "columns." This means an object isn't just stored as a single definition - it's understood from multiple angles.
82 | 
83 | ### 🗂️ Memory Types
84 | 
85 | We support three key types of memory, all using the same `add_entity` call (see the sketch after this list):
86 | 
87 | - **Semantic Memory**: For storing facts, concepts, and general knowledge
88 | - **Episodic Memory**: For keeping track of events and experiences over time
89 | - **Procedural Memory**: For capturing step-by-step processes and workflows
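Only the `column` field changes between memory types. A minimal sketch (the entity names, properties, and steps below are illustrative):

```python
import time

from hawkinsdb import HawkinsDB

db = HawkinsDB()

# Episodic: an event anchored to a point in time
db.add_entity({
    "column": "Episodic",
    "name": "first_brew",
    "properties": {
        "timestamp": time.time(),
        "action": "Brewed the first cup of coffee",
        "location": "Kitchen"
    }
})

# Procedural: a step-by-step workflow
db.add_entity({
    "column": "Procedural",
    "name": "brew_coffee",
    "properties": {
        "steps": [
            "Boil water",
            "Add grounds to filter",
            "Pour water over grounds",
            "Serve"
        ]
    }
})
```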
90 | 
91 | ### 💾 Storage Options
92 | 
93 | - **SQLite**: Rock-solid storage for production systems
94 | - **JSON**: Quick and easy for prototyping
95 | 
96 | ### 🔗 Smart Integrations
97 | ConceptNet integration for automatic knowledge enrichment and relationship discovery.
98 | 
99 | ## Contributing
100 | 
101 | We love contributions! Here's how to help:
102 | 
103 | 1. Fork the repository
104 | 2. Create your feature branch
105 | 3. Make your changes
106 | 4. Run the tests
107 | 5. Submit a pull request
108 | 
109 | ## Development
110 | 
111 | ```bash
112 | # Clone and set up
113 | git clone https://github.com/your-username/hawkinsdb.git
114 | cd hawkinsdb
115 | pip install -e ".[dev]"
116 | pytest tests/
117 | ```
118 | 
119 | ## 🗺️ Status and Roadmap
120 | 
121 | Currently under active development. Our focus areas:
122 | 
123 | - [ ] Enhanced multi-modal processing
124 | - [ ] Performance optimizations for large-scale deployments
125 | - [ ] Extended LLM provider support
126 | - [ ] Advanced querying capabilities
127 | - [ ] Improved documentation and examples
128 | 
129 | ## License
130 | 
131 | HawkinsDB is available under the MIT License. See [LICENSE](LICENSE) for details.
132 | 
133 | ---
134 | 
135 | Built by developers who believe memory matters: Harish Santhanalakshmi Ganesan, along with a few AI agents.
136 | 
--------------------------------------------------------------------------------
/docs/README.md:
--------------------------------------------------------------------------------
1 | # HawkinsDB Technical Documentation
2 | 
3 | ## Overview
4 | 
5 | HawkinsDB is a flexible memory system for storing semantic, episodic, and procedural memories, with built-in knowledge enrichment. It features SQLite backend support, LLM integration, and ConceptNet enrichment.
6 | 
7 | ## Table of Contents
8 | 1. [Installation](#installation)
9 | 2. [Core Features](#core-features)
10 | 3. [Basic Usage](#basic-usage)
11 | 4. [Memory Types](#memory-types)
12 | 5. [Storage Backend](#storage-backend)
13 | 6. [LLM Interface](#llm-interface)
14 | 7. [ConceptNet Integration](#conceptnet-integration)
15 | 8. [Error Handling](#error-handling)
16 | 9. [Best Practices](#best-practices)
17 | 
18 | ## Installation
19 | 
20 | ```bash
21 | pip install hawkinsdb
22 | ```
23 | 
24 | ## Core Features
25 | 
26 | - Multiple memory types (Semantic, Episodic, Procedural)
27 | - SQLite persistent storage with ACID compliance
28 | - Natural language interface via LLM
29 | - Automatic knowledge enrichment using ConceptNet
30 | - Property validation and type inference
31 | - Relationship management
32 | - Error handling and validation
33 | 
34 | ## Basic Usage
35 | 
36 | ### Initialization
37 | 
38 | ```python
39 | from hawkinsdb import HawkinsDB, LLMInterface
40 | 
41 | # Initialize with SQLite storage
42 | db = HawkinsDB(storage_type="sqlite", db_path="memory.db")
43 | llm = LLMInterface(db)
44 | ```
45 | 
46 | ### Adding Memories
47 | 
48 | ```python
49 | # Add semantic memory
50 | semantic_memory = {
51 |     "name": "cat",
52 |     "column": "Semantic",
53 |     "properties": {
54 |         "type": "animal",
55 |         "size": "medium",
56 |         "characteristics": ["furry", "agile", "carnivorous"]
57 |     },
58 |     "relationships": {
59 |         "habitat": ["homes", "outdoors"],
60 |         "behavior": ["hunting", "sleeping", "grooming"]
61 |     }
62 | }
63 | 
64 | result = db.add_entity(semantic_memory)
65 | response = llm.query(
66 |     "Explain the behaviors of cats"
67 | )
68 | 
69 | # Add episodic memory
70 | import time
71 | 
72 | episodic_memory = {
73 |     "name": "cat_observation",
74 |     "column": "Episodic",
75 |     "properties": {
76 |         "timestamp": time.time(),
77 |         "action": "Observed cat behavior",
78 |         "location": "Garden",
79 |         "details": "Cat was chasing a butterfly"
80 |     },
81 |     "relationships": {
82 |         "relates_to": ["cat"],
83 |         "observed_by": ["human"]
84 |     }
85 | }
86 | 
87 | result = db.add_entity(episodic_memory)
88 | ```
89 | 
90 | ### Querying Memories
91 | 
92 | ```python
93 | # Query specific entity
94 | cat_info = db.query_frames("cat")
95 | 
96 | # List all entities
97 | entities = db.list_entities()
98 | ```
99 | 
100 | ## Memory Types
101 | 
102 | ### Semantic Memory
103 | Stores conceptual knowledge and facts.
104 | 
105 | ```python
106 | semantic_data = {
107 |     "name": "Photosynthesis",
108 |     "column": "Semantic",
109 |     "properties": {
110 |         "type": "biological_process",
111 |         "location": "plant_cells",
112 |         "components": ["chlorophyll", "sunlight", "water", "carbon_dioxide"],
113 |         "products": ["glucose", "oxygen"]
114 |     },
115 |     "relationships": {
116 |         "occurs_in": ["plants", "algae"],
117 |         "requires": ["light_energy", "chloroplasts"],
118 |         "produces": ["chemical_energy", "organic_compounds"]
119 |     }
120 | }
121 | ```
122 | 
123 | ### Episodic Memory
124 | Stores event-based memories with temporal information.
125 | 
126 | ```python
127 | episodic_data = {
128 |     "name": "first_python_project",
129 |     "column": "Episodic",
130 |     "properties": {
131 |         "timestamp": time.time(),
132 |         "duration": "2 hours",
133 |         "location": "home_office",
134 |         "outcome": "successful"
135 |     },
136 |     "relationships": {
137 |         "involves": ["Python_Language"],
138 |         "followed_by": ["code_review"]
139 |     }
140 | }
141 | ```
142 | 
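### Procedural Memory
Stores step-by-step processes and workflows. A minimal sketch in the same `add_entity` format as the examples above; the procedure name, steps, and property values are illustrative:

```python
procedural_data = {
    "name": "morning_cat_care",
    "column": "Procedural",
    "properties": {
        "steps": [
            "Fill food bowl",
            "Refresh water",
            "Clean litter box"
        ],
        "frequency": "daily",
        "estimated_time": "10 minutes"
    },
    "relationships": {
        "applies_to": ["cat"],
        "requires": ["cat_food", "water"]
    }
}

result = db.add_entity(procedural_data)
```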
189 | """) 190 | print(f"Added entity: {result['entity_name']}") 191 | 192 | # Complex querying with context 193 | response = llm.query( 194 | "What are the main components of the respiratory system?", 195 | ) 196 | print(f"Response: {response}") 197 | 198 | 199 | 200 | 201 | ``` 202 | 203 | ## ConceptNet Integration 204 | 205 | ### Basic Enrichment 206 | 207 | ```python 208 | from hawkinsdb import ConceptNetEnricher 209 | 210 | # Initialize 211 | db = HawkinsDB() 212 | enricher = ConceptNetEnricher() 213 | 214 | # Add and enrich entity 215 | entity_data = { 216 | "name": "Dog", 217 | "column": "Semantic", 218 | "properties": { 219 | "type": "Animal", 220 | "category": "Pet" 221 | } 222 | } 223 | db.add_entity(entity_data) 224 | enriched_result = enricher.enrich_entity(db, "Dog", "Animal") 225 | ``` 226 | 227 | ### Custom Enrichment 228 | 229 | ```python 230 | class CustomEnricher(ConceptNetEnricher): 231 | def __init__(self): 232 | super().__init__() 233 | self.min_confidence = 0.7 234 | 235 | def filter_relations(self, relations): 236 | return [r for r in relations if r.weight >= self.min_confidence] 237 | 238 | # Use custom enricher 239 | custom_enricher = CustomEnricher() 240 | custom_enricher.enrich_entity(db, "Dog", "Animal") 241 | ``` 242 | 243 | ## Error Handling 244 | 245 | ```python 246 | from hawkinsdb import ValidationError 247 | 248 | try: 249 | # Add entity with validation 250 | result = db.add_entity({ 251 | "name": "Test", 252 | "column": "Semantic", 253 | "properties": { 254 | "age": "42" # Will be converted to integer 255 | } 256 | }) 257 | 258 | if result["success"]: 259 | print(f"Added: {result['entity_name']}") 260 | else: 261 | print(f"Error: {result['message']}") 262 | 263 | except ValidationError as e: 264 | print(f"Validation error: {str(e)}") 265 | except Exception as e: 266 | print(f"General error: {str(e)}") 267 | ``` 268 | 269 | ## Best Practices 270 | 271 | 1. Memory Organization 272 | - Use consistent naming conventions 273 | - Group related concepts 274 | - Include relevant metadata 275 | - Link memories using relationships 276 | 277 | 2. Performance Optimization 278 | - Use batch operations for multiple entities 279 | - Implement proper cleanup 280 | - Monitor memory usage 281 | - Cache frequently accessed data 282 | 283 | 3. Error Prevention 284 | - Validate data before adding 285 | - Implement proper error handling 286 | - Use type hints 287 | - Follow schema guidelines 288 | 289 | 4. Integration Tips 290 | - Test ConceptNet enrichment in development 291 | - Validate LLM responses 292 | - Monitor API usage 293 | - Keep security in mind 294 | 295 | For more detailed information about specific features, refer to the individual component guides in the documentation. 296 | -------------------------------------------------------------------------------- /docs/conceptnet_guide.md: -------------------------------------------------------------------------------- 1 | # ConceptNet Integration Guide 2 | 3 | ## Overview 4 | 5 | HawkinsDB's ConceptNet integration provides powerful knowledge enrichment capabilities by connecting to the ConceptNet knowledge graph. This guide explains how to effectively use this feature to enhance your semantic memories with common-sense knowledge and leverage the power of structured knowledge bases. 
6 | 7 | ## Features 8 | 9 | ### Core Capabilities 10 | - **Automatic concept enrichment**: Enhance entities with common-sense knowledge 11 | - **Property inference**: Discover new properties based on concept relationships 12 | - **Relationship discovery**: Identify and add meaningful connections 13 | - **Confidence scoring**: Quantify the reliability of enriched data 14 | - **Source tracking**: Monitor the origin of enriched information 15 | 16 | ### Key Benefits 17 | - Richer semantic understanding 18 | - Improved query capabilities 19 | - Better context awareness 20 | - Enhanced knowledge representation 21 | 22 | ## Basic Usage 23 | 24 | ### 1. Direct Enrichment 25 | 26 | ```python 27 | from hawkinsdb import HawkinsDB, ConceptNetEnricher 28 | 29 | # Initialize 30 | db = HawkinsDB() 31 | enricher = ConceptNetEnricher() 32 | 33 | # Add basic entity 34 | entity_data = { 35 | "name": "Dog", 36 | "column": "Semantic", 37 | "properties": { 38 | "type": "Animal", 39 | "category": "Pet" 40 | } 41 | } 42 | db.add_entity(entity_data) 43 | 44 | # Enrich the entity 45 | enriched_result = enricher.enrich_entity(db, "Dog", "Animal") 46 | print(f"Enrichment status: {enriched_result}") 47 | 48 | # Query enriched entity 49 | enriched_dog = db.query_frames("Dog") 50 | print("Enriched properties:", enriched_dog["Semantic"].properties) 51 | print("Enriched relationships:", enriched_dog["Semantic"].relationships) 52 | ``` 53 | 54 | ### 2. Automatic Enrichment via LLM Interface 55 | 56 | ```python 57 | from hawkinsdb import HawkinsDB, LLMInterface 58 | 59 | # Initialize with auto-enrichment 60 | db = HawkinsDB() 61 | llm = LLMInterface(db, auto_enrich=True) 62 | 63 | # Add entity with automatic enrichment 64 | result = llm.add_from_text( 65 | "A golden retriever is a friendly dog breed known for its golden coat" 66 | ) 67 | 68 | # Verify enrichment 69 | if result["success"]: 70 | entity_name = result["entity_name"] 71 | enriched_data = db.query_frames(entity_name) 72 | 73 | # Print enriched properties 74 | semantic_frame = enriched_data.get("Semantic") 75 | if semantic_frame: 76 | print("Enriched properties:", semantic_frame.properties) 77 | print("Added relationships:", semantic_frame.relationships) 78 | ``` 79 | 80 | ## Enrichment Process 81 | 82 | ### 1. 
Property Inference 83 | The enrichment process automatically discovers and adds relevant properties: 84 | 85 | a) Physical Characteristics 86 | ```python 87 | # Example of physical characteristics enrichment 88 | car_data = { 89 | "name": "Car", 90 | "column": "Semantic", 91 | "properties": {"type": "Vehicle"} 92 | } 93 | db.add_entity(car_data) 94 | enricher.enrich_properties(db, "Car", ["physical_attributes"]) 95 | ``` 96 | 97 | b) Common Behaviors 98 | ```python 99 | # Enriching with behavior information 100 | animal_data = { 101 | "name": "Cat", 102 | "column": "Semantic", 103 | "properties": {"type": "Pet"} 104 | } 105 | db.add_entity(animal_data) 106 | enricher.enrich_properties(db, "Cat", ["behaviors"]) 107 | ``` 108 | 109 | c) Typical Locations 110 | ```python 111 | # Location-based enrichment 112 | tool_data = { 113 | "name": "Hammer", 114 | "column": "Semantic", 115 | "properties": {"type": "Tool"} 116 | } 117 | db.add_entity(tool_data) 118 | enricher.enrich_properties(db, "Hammer", ["locations"]) 119 | ``` 120 | 121 | d) Related Concepts 122 | ```python 123 | # Concept relationship enrichment 124 | fruit_data = { 125 | "name": "Apple", 126 | "column": "Semantic", 127 | "properties": {"type": "Fruit"} 128 | } 129 | db.add_entity(fruit_data) 130 | enricher.enrich_properties(db, "Apple", ["related_concepts"]) 131 | ``` 132 | 133 | ### 2. Relationship Discovery 134 | The system automatically identifies and establishes various types of relationships: 135 | 136 | a) IsA Relationships 137 | ```python 138 | # Example of IsA relationship discovery 139 | computer_data = { 140 | "name": "Laptop", 141 | "column": "Semantic", 142 | "properties": { 143 | "type": "Device", 144 | "manufacturer": "Generic" 145 | } 146 | } 147 | db.add_entity(computer_data) 148 | enricher.enrich_relationships(db, "Laptop", relationship_types=["IsA"]) 149 | ``` 150 | 151 | b) HasA Relationships 152 | ```python 153 | # Discovering part-whole relationships 154 | car_data = { 155 | "name": "Car", 156 | "column": "Semantic", 157 | "properties": {"type": "Vehicle"} 158 | } 159 | db.add_entity(car_data) 160 | enricher.enrich_relationships(db, "Car", relationship_types=["HasA"]) 161 | ``` 162 | 163 | c) CapableOf Relationships 164 | ```python 165 | # Finding capability relationships 166 | robot_data = { 167 | "name": "Robot", 168 | "column": "Semantic", 169 | "properties": {"type": "Machine"} 170 | } 171 | db.add_entity(robot_data) 172 | enricher.enrich_relationships(db, "Robot", relationship_types=["CapableOf"]) 173 | ``` 174 | 175 | d) UsedFor Relationships 176 | ```python 177 | # Discovering utility relationships 178 | tool_data = { 179 | "name": "Screwdriver", 180 | "column": "Semantic", 181 | "properties": {"type": "Tool"} 182 | } 183 | db.add_entity(tool_data) 184 | enricher.enrich_relationships(db, "Screwdriver", relationship_types=["UsedFor"]) 185 | ``` 186 | 187 | ### 3. 
Confidence Scoring 188 | 189 | HawkinsDB implements a sophisticated confidence scoring system: 190 | 191 | a) ConceptNet Edge Weights 192 | ```python 193 | # Example of confidence-based filtering 194 | class CustomEnricher(ConceptNetEnricher): 195 | def __init__(self): 196 | super().__init__() 197 | self.min_confidence = 0.7 # Set minimum confidence threshold 198 | 199 | def filter_relations(self, relations): 200 | """Custom filtering of ConceptNet relations""" 201 | return [r for r in relations if r.weight >= self.min_confidence] 202 | ``` 203 | 204 | b) Multiple Source Validation 205 | ```python 206 | # Enrichment with multiple sources 207 | enricher = ConceptNetEnricher( 208 | validate_sources=True, 209 | min_sources=2 210 | ) 211 | enricher.enrich_entity(db, "Computer", "Device") 212 | ``` 213 | 214 | c) Context Relevance 215 | ```python 216 | # Context-aware enrichment 217 | enricher = ConceptNetEnricher( 218 | context_aware=True, 219 | domain="technology" 220 | ) 221 | enricher.enrich_entity(db, "Smartphone", "Device") 222 | ``` 223 | 224 | ## Advanced Usage 225 | 226 | ### 1. Custom Enrichment Rules 227 | 228 | ```python 229 | from hawkinsdb import ConceptNetEnricher 230 | 231 | class CustomEnricher(ConceptNetEnricher): 232 | def __init__(self): 233 | super().__init__() 234 | self.min_confidence = 0.7 # Set minimum confidence threshold 235 | 236 | def filter_relations(self, relations): 237 | """Custom filtering of ConceptNet relations""" 238 | return [r for r in relations if r.weight >= self.min_confidence] 239 | ``` 240 | 241 | ### 2. Selective Property Enrichment 242 | 243 | ```python 244 | # Enrich specific properties 245 | enricher.enrich_properties( 246 | db, 247 | entity_name="Car", 248 | properties=["parts", "capabilities", "location"] 249 | ) 250 | ``` 251 | 252 | ### 3. Batch Enrichment 253 | 254 | ```python 255 | # Enrich multiple related entities 256 | entities = ["Dog", "Cat", "Hamster"] 257 | entity_type = "Pet" 258 | 259 | for entity in entities: 260 | enricher.enrich_entity(db, entity, entity_type) 261 | ``` 262 | 263 | ## Best Practices 264 | 265 | 1. **Entity Preparation** 266 | - Provide clear entity types 267 | - Use consistent naming 268 | - Include basic properties 269 | 270 | 2. **Enrichment Strategy** 271 | - Start with core concepts 272 | - Enrich related entities 273 | - Validate enriched data 274 | - Monitor confidence scores 275 | 276 | 3. **Performance Optimization** 277 | - Batch similar entities 278 | - Cache common enrichments 279 | - Use selective enrichment 280 | - Set appropriate confidence thresholds 281 | 282 | ## Error Handling 283 | 284 | ```python 285 | try: 286 | enriched = enricher.enrich_entity(db, entity_name, entity_type) 287 | if enriched: 288 | print("Successfully enriched entity") 289 | 290 | # Verify enrichment 291 | result = db.query_frames(entity_name) 292 | if result: 293 | semantic_frame = result.get("Semantic") 294 | if semantic_frame: 295 | print("Enriched properties:", semantic_frame.properties) 296 | print("Enriched relationships:", semantic_frame.relationships) 297 | else: 298 | print("No enrichment data found") 299 | 300 | except Exception as e: 301 | print(f"Error during enrichment: {str(e)}") 302 | ``` 303 | 304 | ## Troubleshooting 305 | 306 | Common issues and solutions: 307 | 308 | 1. **No Enrichment Data** 309 | - Check entity name spelling 310 | - Verify entity type 311 | - Ensure ConceptNet connectivity 312 | - Check confidence thresholds 313 | 314 | 2. 
**Low Quality Enrichment** 315 | - Adjust confidence thresholds 316 | - Provide more specific entity types 317 | - Use custom filtering 318 | - Implement validation rules 319 | 320 | 3. **Performance Issues** 321 | - Use batch enrichment 322 | - Implement caching 323 | - Limit enrichment scope 324 | - Optimize query patterns 325 | 326 | ## Examples 327 | 328 | ### 1. Enriching a Technical Concept 329 | 330 | ```python 331 | # Add and enrich a technical concept 332 | computer_data = { 333 | "name": "Laptop", 334 | "column": "Semantic", 335 | "properties": { 336 | "type": "Computer", 337 | "category": "Device" 338 | } 339 | } 340 | db.add_entity(computer_data) 341 | enricher.enrich_entity(db, "Laptop", "Computer") 342 | ``` 343 | 344 | ### 2. Enriching a Natural Concept 345 | 346 | ```python 347 | # Add and enrich a natural concept 348 | tree_data = { 349 | "name": "Oak_Tree", 350 | "column": "Semantic", 351 | "properties": { 352 | "type": "Tree", 353 | "category": "Plant" 354 | } 355 | } 356 | db.add_entity(tree_data) 357 | enricher.enrich_entity(db, "Oak_Tree", "Tree") 358 | ``` 359 | 360 | ### 3. Enriching an Abstract Concept 361 | 362 | ```python 363 | # Add and enrich an abstract concept 364 | concept_data = { 365 | "name": "Happiness", 366 | "column": "Semantic", 367 | "properties": { 368 | "type": "Emotion", 369 | "category": "Feeling" 370 | } 371 | } 372 | db.add_entity(concept_data) 373 | enricher.enrich_entity(db, "Happiness", "Emotion") 374 | ``` 375 | 376 | For more examples and detailed API reference, see the main [documentation](README.md). 377 | -------------------------------------------------------------------------------- /docs/llm_interface_guide.md: -------------------------------------------------------------------------------- 1 | # LLM Interface Guide 2 | 3 | ## Overview 4 | 5 | The LLM Interface in HawkinsDB provides a natural language interface for interacting with the database. It enables seamless interaction with the memory system through natural language, including: 6 | - Adding new entities from text descriptions 7 | - Querying existing memories using natural language 8 | - Automatic property validation and type inference 9 | - Integration with ConceptNet enrichment 10 | - Confidence scoring for responses 11 | 12 | ## Features 13 | 14 | ### Core Capabilities 15 | 16 | 1. **Natural Language Entity Creation** 17 | - Convert unstructured text into structured memories 18 | - Automatic type inference for properties 19 | - Support for multiple memory types (Semantic, Episodic, Procedural) 20 | 21 | 2. **Intelligent Querying** 22 | - Natural language question answering 23 | - Context-aware responses 24 | - Multi-entity relationship understanding 25 | - Temporal query support for episodic memories 26 | 27 | 3. **Automatic Enrichment** 28 | - ConceptNet integration for knowledge expansion 29 | - Property inference from context 30 | - Relationship discovery 31 | - Source tracking 32 | 33 | 4. **Data Quality** 34 | - Confidence scoring for all properties 35 | - Automatic validation of property types 36 | - Inconsistency detection 37 | - Source attribution 38 | 39 | 5. **Integration Features** 40 | - Seamless connection with memory storage 41 | - Event tracking 42 | - Error handling 43 | - Query optimization 44 | 45 | ## Basic Usage 46 | 47 | ### 1. 
Initialization 48 | 49 | ```python 50 | from hawkinsdb import HawkinsDB, LLMInterface 51 | 52 | # Initialize database and LLM interface with auto-enrichment 53 | db = HawkinsDB() 54 | llm = LLMInterface(db, auto_enrich=True) 55 | 56 | # Or initialize without auto-enrichment for more control 57 | llm_manual = LLMInterface(db, auto_enrich=False) 58 | 59 | # Configure additional settings (optional) 60 | llm = LLMInterface( 61 | db, 62 | auto_enrich=True, 63 | confidence_threshold=0.7, # Minimum confidence for accepting properties 64 | max_enrichment_depth=2, # Maximum depth for ConceptNet enrichment 65 | validate_properties=True # Enable strict property validation 66 | ) 67 | ``` 68 | 69 | The LLM Interface provides a natural way to interact with HawkinsDB. When initialized: 70 | - It connects to your HawkinsDB instance 71 | - Sets up the natural language processing pipeline 72 | - Configures ConceptNet integration if auto-enrichment is enabled 73 | - Establishes validation rules for properties 74 | 75 | ### 2. Adding Entities from Text 76 | 77 | ```python 78 | # Add entity using natural language 79 | result = llm.add_from_text(""" 80 | A Tesla Model 3 is an electric car manufactured by Tesla. 81 | It has autopilot capabilities, a glass roof, and typically comes 82 | in various colors including red, white, and black. 83 | """) 84 | 85 | if result["success"]: 86 | print(f"Added entity: {result['entity_name']}") 87 | print(f"Enriched: {result['enriched']}") 88 | ``` 89 | 90 | ### 3. Querying with Natural Language 91 | 92 | ```python 93 | # Ask questions about stored entities 94 | response = llm.query("What features does the Tesla Model 3 have?") 95 | print(f"Answer: {response['response']}") 96 | 97 | # Query specific entity details 98 | details = llm.query_entity("Tesla_Model_3", include_metadata=True) 99 | print(f"Entity details: {details}") 100 | ``` 101 | 102 | ## Advanced Features 103 | 104 | ### 1. Property Validation 105 | 106 | ```python 107 | # The LLM interface automatically validates properties 108 | result = llm.add_from_text(""" 109 | The speed of light is approximately 299,792,458 meters per second. 110 | It is a fundamental physical constant represented by 'c'. 111 | """) 112 | 113 | # Properties are validated and properly typed 114 | print(result["entity_data"]["properties"]) 115 | ``` 116 | 117 | ### 2. Confidence Scoring 118 | 119 | ```python 120 | # Query with metadata to see confidence scores 121 | response = llm.query_entity( 122 | "Speed_of_Light", 123 | include_metadata=True 124 | ) 125 | 126 | # Check confidence scores for properties 127 | for prop, value in response["data"]["Semantic"]["properties"].items(): 128 | print(f"{prop}: {value[0]['confidence']}") 129 | ``` 130 | 131 | ### 3. Custom Entity Processing 132 | 133 | ```python 134 | from hawkinsdb.llm_interface import LLMInterface 135 | 136 | class CustomLLMInterface(LLMInterface): 137 | def _process_properties(self, properties): 138 | """Custom property processing""" 139 | processed = super()._process_properties(properties) 140 | # Add custom processing logic 141 | return processed 142 | ``` 143 | 144 | ## Best Practices 145 | 146 | ### 1. Input Formatting 147 | 148 | ```python 149 | # Good: Clear, specific descriptions 150 | result = llm.add_from_text(""" 151 | A MacBook Pro is a high-end laptop computer made by Apple. 
152 | It features: 153 | - Retina display 154 | - M1 or M2 processor 155 | - Up to 32GB RAM 156 | Location: Office desk 157 | """) 158 | 159 | # Bad: Vague or ambiguous descriptions 160 | result = llm.add_from_text("It's a computer that does stuff") 161 | ``` 162 | 163 | ### 2. Query Formulation 164 | 165 | ```python 166 | # Good: Specific, focused questions 167 | response = llm.query("What is the processor type in the MacBook Pro?") 168 | 169 | # Bad: Vague or compound questions 170 | response = llm.query("Tell me about computers and what they do") 171 | ``` 172 | 173 | ### 3. Error Handling 174 | 175 | ```python 176 | try: 177 | result = llm.add_from_text(text_description) 178 | if result["success"]: 179 | print(f"Added: {result['entity_name']}") 180 | if result["enriched"]: 181 | print("Entity was enriched with ConceptNet data") 182 | else: 183 | print(f"Error: {result['message']}") 184 | except Exception as e: 185 | print(f"Error processing text: {str(e)}") 186 | ``` 187 | 188 | ## Example Use Cases 189 | 190 | ### 1. Knowledge Base Population 191 | 192 | ```python 193 | # Add multiple related entities 194 | descriptions = [ 195 | "Python is a high-level programming language known for its readability", 196 | "JavaScript is a programming language used primarily for web development", 197 | "Java is a widely-used object-oriented programming language" 198 | ] 199 | 200 | for desc in descriptions: 201 | result = llm.add_from_text(desc) 202 | if result["success"]: 203 | print(f"Added programming language: {result['entity_name']}") 204 | ``` 205 | 206 | ### 2. Question-Answering System 207 | 208 | ```python 209 | # Build a simple QA system 210 | def answer_questions(questions): 211 | for question in questions: 212 | response = llm.query(question) 213 | if response["success"]: 214 | print(f"Q: {question}") 215 | print(f"A: {response['response']}") 216 | else: 217 | print(f"Could not answer: {question}") 218 | 219 | # Example usage 220 | questions = [ 221 | "What programming languages are in the database?", 222 | "What is Python used for?", 223 | "Compare JavaScript and Java" 224 | ] 225 | answer_questions(questions) 226 | ``` 227 | 228 | ### 3. Automated Documentation 229 | 230 | ```python 231 | # Generate structured documentation from text 232 | def document_system(description): 233 | # Add system description 234 | result = llm.add_from_text(description) 235 | if not result["success"]: 236 | return False 237 | 238 | # Query for important aspects 239 | components = llm.query("What are the main components?") 240 | features = llm.query("What are the key features?") 241 | requirements = llm.query("What are the system requirements?") 242 | 243 | return { 244 | "components": components["response"], 245 | "features": features["response"], 246 | "requirements": requirements["response"] 247 | } 248 | ``` 249 | 250 | ## Troubleshooting 251 | 252 | Common issues and solutions: 253 | 254 | 1. **Entity Not Added** 255 | - Check input text clarity 256 | - Verify required fields 257 | - Check validation rules 258 | - Review error messages 259 | 260 | 2. **Poor Query Responses** 261 | - Rephrase question 262 | - Check entity existence 263 | - Verify data completeness 264 | - Review context 265 | 266 | 3. 
**Performance Issues** 267 | - Batch similar operations 268 | - Optimize query patterns 269 | - Use caching when appropriate 270 | - Monitor API usage 271 | 272 | ## API Reference 273 | 274 | ### LLMInterface Methods 275 | 276 | ```python 277 | def add_entity(self, entity_json: Union[str, Dict]) -> Dict[str, Any]: 278 | """Add entity from structured data""" 279 | 280 | def add_from_text(self, text: str) -> Dict[str, Any]: 281 | """Add entity from natural language text""" 282 | 283 | def query(self, question: str) -> Dict[str, Any]: 284 | """Answer questions about entities""" 285 | 286 | def query_entity(self, name: str, include_metadata: bool = False) -> Dict[str, Any]: 287 | """Query specific entity details""" 288 | ``` 289 | 290 | For complete documentation and more examples, see the main [documentation](README.md). 291 | -------------------------------------------------------------------------------- /docs/sqlite_backend.md: -------------------------------------------------------------------------------- 1 | # Using SQLite Backend with HawkinsDB 2 | 3 | HawkinsDB supports SQLite as a persistent storage backend, providing robust data storage with ACID compliance. 4 | 5 | ## Configuration 6 | 7 | To use SQLite storage: 8 | 9 | ```python 10 | from hawkinsdb import HawkinsDB 11 | 12 | # Initialize database 13 | db = HawkinsDB() 14 | 15 | # Enable SQLite storage 16 | db.config.set_storage_backend('sqlite') 17 | 18 | # Optionally configure SQLite path (default: ./hawkins_memory.db) 19 | db.config.set_storage_path('path/to/your/database.db') 20 | ``` 21 | 22 | ## Key Features 23 | 24 | - **Persistent Storage**: Data remains available between sessions 25 | - **ACID Compliance**: Ensures data integrity 26 | - **Concurrent Access**: Safe for multi-threaded applications 27 | - **Automatic Schema Management**: Tables created and updated automatically 28 | 29 | ## Basic Operations 30 | 31 | ### Adding Entities 32 | 33 | ```python 34 | entity_data = { 35 | "name": "Tesla Model 3", 36 | "properties": { 37 | "color": "red", 38 | "year": 2023 39 | }, 40 | "relationships": { 41 | "located_in": ["garage"] 42 | } 43 | } 44 | 45 | result = db.add_entity(entity_data) 46 | ``` 47 | 48 | ### Querying Data 49 | 50 | ```python 51 | # Get all frames for an entity 52 | frames = db.query_frames("Tesla Model 3") 53 | 54 | # List all entities 55 | entities = db.list_entities() 56 | ``` 57 | 58 | ### Error Handling 59 | 60 | ```python 61 | try: 62 | result = db.add_entity(entity_data) 63 | except ValueError as e: 64 | print(f"Invalid data: {str(e)}") 65 | except Exception as e: 66 | print(f"Storage error: {str(e)}") 67 | ``` 68 | 69 | ## Advanced Usage 70 | 71 | ### Bulk Operations 72 | 73 | ```python 74 | entities = [ 75 | {"name": "Entity1", "properties": {...}}, 76 | {"name": "Entity2", "properties": {...}} 77 | ] 78 | 79 | for entity in entities: 80 | db.add_entity(entity) 81 | ``` 82 | 83 | ### Custom Queries 84 | 85 | The SQLite backend supports custom queries through the storage interface: 86 | 87 | ```python 88 | from hawkinsdb.storage import get_storage_backend 89 | 90 | storage = get_storage_backend('sqlite') 91 | storage.execute_query("SELECT * FROM entities WHERE name LIKE ?", ("%Tesla%",)) 92 | ``` 93 | 94 | ## Best Practices 95 | 96 | 1. **Enable SQLite Early**: Configure SQLite backend before any database operations 97 | 2. **Use Error Handling**: Always wrap database operations in try-except blocks 98 | 3. **Regular Backups**: SQLite files can be easily backed up by copying the database file 99 | 4. 
**Proper Cleanup**: Close database connections when finished:
100 | ```python
101 | db.cleanup()  # Closes connections and frees resources
102 | ```
103 | 
104 | ## Performance Considerations
105 | 
106 | - SQLite performs best with moderate-sized datasets
107 | - For very large datasets, consider using batch operations
108 | - Index frequently queried fields for better performance (see the sketch below)
109 | 
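A minimal sketch of adding an index through the custom-query interface shown under Custom Queries above. It assumes the storage backend exposes `execute_query` as illustrated there and that entity names live in an `entities` table; adjust the table and column names to the actual schema:

```python
from hawkinsdb.storage import get_storage_backend

storage = get_storage_backend('sqlite')

# Speed up name lookups such as the LIKE query shown earlier
# (table and column names are assumptions; verify against your schema)
storage.execute_query(
    "CREATE INDEX IF NOT EXISTS idx_entities_name ON entities (name)",
    ()
)
```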
110 | ## Troubleshooting
111 | 
112 | Common issues and solutions:
113 | 
114 | 1. **Database Locked**
115 |    - Ensure proper connection cleanup
116 |    - Reduce concurrent access if needed
117 | 
118 | 2. **Permission Errors**
119 |    - Check file and directory permissions
120 |    - Ensure write access to the database directory
121 | 
122 | 3. **Disk Space**
123 |    - Monitor available disk space
124 |    - Implement regular cleanup of unused data
125 | 
--------------------------------------------------------------------------------
/examples/HawkinDB_RAG.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import logging
4 | from typing import Dict, Any, Optional, List, Union
5 | from openai import OpenAI
6 | from hawkinsdb import HawkinsDB
7 | 
8 | os.environ["OPENAI_API_KEY"] = ""  # Set your OpenAI API key
9 | 
10 | logging.basicConfig(level=logging.INFO)
11 | logger = logging.getLogger(__name__)
12 | 
13 | class TextToHawkinsDB:
14 |     def __init__(self, api_key: Optional[str] = None):
15 |         """Initialize with OpenAI API key."""
16 |         self.api_key = api_key or os.getenv("OPENAI_API_KEY")
17 |         if not self.api_key:
18 |             raise ValueError("OpenAI API key is required")
19 |         self.client = OpenAI(api_key=self.api_key)
20 |         self.db = HawkinsDB(storage_type='sqlite')
21 | 
22 |     def text_to_json(self, text: str) -> Dict[str, Any]:
23 |         """Convert a text description to HawkinsDB-compatible JSON using an OpenAI chat model."""
24 |         prompt = """Convert the following text into a structured JSON format suitable for a memory database.
25 | 
26 |         Rules:
27 |         1. Extract key entity details, properties, and relationships
28 |         2. Use underscores for entity names (e.g., Python_Language)
29 |         3. Categorize memory as one of: Semantic, Episodic, or Procedural
30 |         4. Include relevant properties and relationships
31 | 
32 |         Required JSON format:
33 |         {
34 |             "column": "memory_type",
35 |             "name": "entity_name",
36 |             "properties": {
37 |                 "key1": "value1",
38 |                 "key2": ["value2a", "value2b"]
39 |             },
40 |             "relationships": {
41 |                 "related_to": ["entity1", "entity2"],
42 |                 "part_of": ["parent_entity"]
43 |             }
44 |         }
45 | 
46 |         Text to convert:
47 |         """
48 | 
49 |         try:
50 |             response = self.client.chat.completions.create(
51 |                 model="gpt-3.5-turbo",
52 |                 messages=[
53 |                     {"role": "system", "content": prompt},
54 |                     {"role": "user", "content": text}
55 |                 ],
56 |                 temperature=0.3,
57 |                 response_format={"type": "json_object"}
58 |             )
59 | 
60 |             json_str = response.choices[0].message.content
61 |             return json.loads(json_str)
62 | 
63 |         except Exception as e:
64 |             logger.error(f"Error converting text to JSON: {str(e)}")
65 |             raise
66 | 
67 |     def add_to_db(self, text: str) -> Dict[str, Any]:
68 |         """Convert text to JSON and add to HawkinsDB."""
69 |         try:
70 |             json_data = self.text_to_json(text)
71 |             logger.info(f"Converted JSON: {json.dumps(json_data, indent=2)}")
72 | 
73 |             result = self.db.add_entity(json_data)
74 |             return {
75 |                 "success": True,
76 |                 "message": "Successfully added to database",
77 |                 "entity_data": json_data,
78 |                 "db_result": result
79 |             }
80 | 
81 |         except Exception as e:
82 |             logger.error(f"Error adding to database: {str(e)}")
83 |             return {
84 |                 "success": False,
85 |                 "message": str(e),
86 |                 "entity_data": None,
87 |                 "db_result": None
88 |             }
89 | 
90 |     def query_entity(self, entity_name: str) -> Dict[str, Any]:
91 |         """Query specific entity by name."""
92 |         try:
93 |             frames = self.db.query_frames(entity_name)
94 |             if not frames:
95 |                 return {
96 |                     "success": False,
97 |                     "message": f"No entity found with name: {entity_name}",
98 |                     "data": None
99 |                 }
100 | 
101 |             return {
102 |                 "success": True,
103 |                 "message": "Entity found",
104 |                 "data": frames
105 |             }
106 | 
107 |         except Exception as e:
108 |             logger.error(f"Error querying entity: {str(e)}")
109 |             return {
110 |                 "success": False,
111 |                 "message": str(e),
112 |                 "data": None
113 |             }
114 | 
115 |     def query_by_text(self, query_text: str) -> Dict[str, Any]:
116 |         """Query database using natural language text."""
117 |         try:
118 |             # Get all entities for context
119 |             entities = self.db.list_entities()
120 |             if not entities:
121 |                 return {
122 |                     "success": True,
123 |                     "message": "Database is empty",
124 |                     "response": "No information available in the database."
125 |                 }
126 | 
127 |             # Build context from existing entities
128 |             context = []
129 |             for entity_name in entities[:5]:  # Limit to 5 most recent entities
130 |                 frames = self.db.query_frames(entity_name)
131 |                 if frames:
132 |                     context.append(json.dumps(frames, indent=2))
133 | 
134 |             # Create prompt with context
135 |             prompt = f"""You are a helpful assistant with access to a knowledge base.
136 |             Answer the following question based on this context:
137 | 
138 |             Context:
139 |             {' '.join(context)}
140 | 
141 |             Question: {query_text}
142 | 
143 |             Rules:
144 |             1. Only use information from the provided context
145 |             2. If information is not in the context, say so
146 |             3. Be specific and include details when available
147 |             4. Format numbers and dates clearly
148 |             """
149 | 
150 |             # Get response from the chat model
151 |             response = self.client.chat.completions.create(
152 |                 model="gpt-4o",
153 |                 messages=[
154 |                     {"role": "system", "content": prompt}
155 |                 ],
156 |                 temperature=0.3,
157 |                 max_tokens=500
158 |             )
159 | 
160 |             answer = response.choices[0].message.content
161 | 
162 |             return {
163 |                 "success": True,
164 |                 "message": "Query processed successfully",
165 |                 "response": answer
166 |             }
167 | 
168 |         except Exception as e:
169 |             logger.error(f"Error processing query: {str(e)}")
170 |             return {
171 |                 "success": False,
172 |                 "message": str(e),
173 |                 "response": None
174 |             }
175 | 
176 |     def list_all_entities(self) -> Dict[str, Any]:
177 |         """List all entities in the database."""
178 |         try:
179 |             entities = self.db.list_entities()
180 |             return {
181 |                 "success": True,
182 |                 "message": "Entities retrieved successfully",
183 |                 "entities": entities
184 |             }
185 |         except Exception as e:
186 |             logger.error(f"Error listing entities: {str(e)}")
187 |             return {
188 |                 "success": False,
189 |                 "message": str(e),
190 |                 "entities": None
191 |             }
192 | 
193 | def test_memory_examples():
194 |     """Test function to demonstrate usage."""
195 |     converter = TextToHawkinsDB()
196 | 
197 |     # Test adding entries
198 |     examples = [
199 |         """
200 |         Python is a programming language created by Guido van Rossum in 1991.
201 |         It supports object-oriented, imperative, and functional programming.
202 |         It's commonly used for web development, data science, and automation.
203 |         """,
204 |         """
205 |         Today I completed my first Python project in my home office.
206 |         It took 2 hours and was successful. I did a code review afterwards.
207 |         """,
208 |         """
209 |         The Tesla Model 3 is red, made in 2023, and parked in the garage.
210 |         It has a range of 358 miles and goes 0-60 mph in 3.1 seconds.
211 |         """
212 |     ]
213 | 
214 |     # Add examples to database
215 |     logger.info("\nAdding examples to database:")
216 |     for i, example in enumerate(examples, 1):
217 |         logger.info(f"\nAdding Example {i}")
218 |         logger.info("=" * 50)
219 |         result = converter.add_to_db(example)
220 |         logger.info(f"Result: {json.dumps(result, indent=2)}")
221 | 
222 |     # Test queries
223 |     logger.info("\nTesting queries:")
224 | 
225 |     # List all entities
226 |     logger.info("\nListing all entities:")
227 |     entities_result = converter.list_all_entities()
228 |     logger.info(f"Entities: {json.dumps(entities_result, indent=2)}")
229 | 
230 |     # Query specific entity
231 |     logger.info("\nQuerying specific entity:")
232 |     entity_result = converter.query_entity("Python_Language")
233 |     print(entity_result)
234 | 
235 |     # Test natural language queries
236 |     test_queries = [
237 |         "What programming language was created by Guido van Rossum?",
238 |         "Tell me about the Tesla Model 3's specifications.",
239 |         "What happened during the first Python project?"
240 | ] 241 | 242 | logger.info("\nTesting natural language queries:") 243 | for query in test_queries: 244 | logger.info(f"\nQuery: {query}") 245 | result = converter.query_by_text(query) 246 | logger.info(f"Response: {json.dumps(result, indent=2)}") 247 | 248 | if __name__ == "__main__": 249 | test_memory_examples() -------------------------------------------------------------------------------- /examples/basic_demo.py: -------------------------------------------------------------------------------- 1 | """Basic demonstration of HawkinsDB functionality.""" 2 | from hawkinsdb import HawkinsDB 3 | import time 4 | 5 | def main(): 6 | # Initialize the database 7 | print("Initializing HawkinsDB...") 8 | db = HawkinsDB() 9 | 10 | try: 11 | # Add a semantic memory 12 | print("\nAdding semantic memory...") 13 | cat_data = { 14 | "name": "Cat", 15 | "column": "Semantic", 16 | "properties": { 17 | "type": "Animal", 18 | "features": ["fur", "whiskers", "tail"], 19 | "diet": "carnivore" 20 | }, 21 | "relationships": { 22 | "preys_on": ["mice", "birds"], 23 | "related_to": ["tiger", "lion"] 24 | } 25 | } 26 | result = db.add_entity(cat_data) 27 | print(f"Semantic memory result: {result}") 28 | 29 | # Add an episodic memory 30 | print("\nAdding episodic memory...") 31 | event_data = { 32 | "name": "First Pet", 33 | "column": "Episodic", 34 | "properties": { 35 | "timestamp": str(time.time()), 36 | "action": "Got my first cat", 37 | "location": "Pet Store", 38 | "emotion": "happy", 39 | "participants": ["family", "pet store staff"] 40 | } 41 | } 42 | result = db.add_entity(event_data) 43 | print(f"Episodic memory result: {result}") 44 | 45 | # Add a procedural memory 46 | print("\nAdding procedural memory...") 47 | procedure_data = { 48 | "name": "Feed Cat", 49 | "column": "Procedural", 50 | "properties": { 51 | "steps": [ 52 | "Get cat food from cabinet", 53 | "Fill bowl with appropriate amount", 54 | "Add fresh water to water bowl", 55 | "Call cat for feeding" 56 | ], 57 | "frequency": "twice daily", 58 | "importance": "high" 59 | } 60 | } 61 | result = db.add_entity(procedure_data) 62 | print(f"Procedural memory result: {result}") 63 | 64 | # Query memories 65 | print("\nQuerying memories...") 66 | cat_memories = db.query_frames("Cat") 67 | print(f"Cat-related memories: {cat_memories}") 68 | 69 | feeding_memories = db.query_frames("Feed Cat") 70 | print(f"Feeding procedure: {feeding_memories}") 71 | 72 | # List all entities 73 | print("\nAll entities in database:") 74 | all_entities = db.list_entities() 75 | print(f"Entities: {all_entities}") 76 | 77 | except Exception as e: 78 | print(f"Error during demo: {str(e)}") 79 | raise 80 | 81 | finally: 82 | # Cleanup 83 | db.cleanup() 84 | print("\nDemo completed.") 85 | 86 | if __name__ == "__main__": 87 | main() -------------------------------------------------------------------------------- /examples/document.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/harishsg993010/HawkinsDB/5268d4ec11f55ab53f83d2c1ef29317901732e35/examples/document.pdf -------------------------------------------------------------------------------- /examples/file_rag.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | from typing import List, Dict, Any 4 | import PyPDF2 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | 8 | os.environ["OPENAI_API_KEY"]="" 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | 
logger = logging.getLogger(__name__) 12 | 13 | class PDFHawkinsRAG: 14 | def __init__(self, chunk_size: int = 500): 15 | """Initialize the RAG system.""" 16 | self.db = HawkinsDB(storage_type='sqlite',db_path="rag.db") 17 | self.llm_interface = LLMInterface(self.db,auto_enrich=True) 18 | self.chunk_size = chunk_size 19 | 20 | def extract_text_from_pdf(self, pdf_path: str) -> str: 21 | """Extract text content from a PDF file.""" 22 | try: 23 | with open(pdf_path, 'rb') as file: 24 | pdf_reader = PyPDF2.PdfReader(file) 25 | text = "" 26 | for page in pdf_reader.pages: 27 | text += page.extract_text() + "\n" 28 | return text 29 | except Exception as e: 30 | logger.error(f"Error extracting text from PDF: {str(e)}") 31 | raise 32 | 33 | def chunk_text(self, text: str, filename: str) -> List[Dict[str, Any]]: 34 | """Split text into chunks and prepare for database storage.""" 35 | chunks = [] 36 | words = text.split() 37 | current_chunk = [] 38 | chunk_number = 1 39 | 40 | for word in words: 41 | current_chunk.append(word) 42 | if len(current_chunk) >= self.chunk_size: 43 | chunk_text = " ".join(current_chunk) 44 | chunks.append({ 45 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 46 | "column": "Semantic", 47 | "properties": { 48 | "content": chunk_text, 49 | "source_file": filename, 50 | "chunk_number": chunk_number, 51 | }, 52 | "relationships": { 53 | "part_of": [filename], 54 | "next_chunk": [f"{Path(filename).stem}_chunk_{chunk_number + 1}"] if len(words) > self.chunk_size else [] 55 | } 56 | }) 57 | current_chunk = [] 58 | chunk_number += 1 59 | 60 | # Handle remaining text 61 | if current_chunk: 62 | chunk_text = " ".join(current_chunk) 63 | chunks.append({ 64 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 65 | "column": "Semantic", 66 | "properties": { 67 | "content": chunk_text, 68 | "source_file": filename, 69 | "chunk_number": chunk_number, 70 | }, 71 | "relationships": { 72 | "part_of": [filename] 73 | } 74 | }) 75 | 76 | return chunks 77 | 78 | def ingest_pdf(self, pdf_path: str) -> Dict[str, Any]: 79 | """Process and store PDF content in the database.""" 80 | try: 81 | # Extract text from PDF 82 | logger.info(f"Processing PDF: {pdf_path}") 83 | text = self.extract_text_from_pdf(pdf_path) 84 | 85 | # Create document metadata 86 | filename = Path(pdf_path).name 87 | doc_metadata = { 88 | "name": Path(pdf_path).stem, 89 | "column": "Semantic", 90 | "properties": { 91 | "file_type": "PDF", 92 | "file_path": pdf_path, 93 | "file_name": filename, 94 | }, 95 | "relationships": { 96 | "contains": [] 97 | } 98 | } 99 | 100 | # Store document metadata 101 | self.db.add_entity(doc_metadata) 102 | 103 | # Process and store chunks 104 | chunks = self.chunk_text(text, filename) 105 | chunk_names = [] 106 | for chunk in chunks: 107 | self.db.add_entity(chunk) 108 | chunk_names.append(chunk["name"]) 109 | 110 | # Update document metadata with chunk references 111 | doc_metadata["relationships"]["contains"] = chunk_names 112 | self.db.add_entity(doc_metadata) 113 | 114 | return { 115 | "success": True, 116 | "message": f"Successfully processed {filename}", 117 | "chunks_created": len(chunks) 118 | } 119 | 120 | except Exception as e: 121 | logger.error(f"Error ingesting PDF: {str(e)}") 122 | return { 123 | "success": False, 124 | "message": str(e) 125 | } 126 | 127 | def query(self, question: str) -> Dict[str, Any]: 128 | """Query the knowledge base with context from stored documents.""" 129 | try: 130 | return self.llm_interface.query(question) 131 | except Exception as e: 132 | 
logger.error(f"Error processing query: {str(e)}") 133 | return { 134 | "success": False, 135 | "message": str(e), 136 | "response": None 137 | } 138 | 139 | def test_pdf_rag(): 140 | """Test the PDF RAG system.""" 141 | # Initialize the system 142 | rag = PDFHawkinsRAG(chunk_size=500) 143 | 144 | # Test with sample PDF 145 | pdf_path = r"C:\Users\haris\Desktop\personal\AI-Agent\Hawin\tests\document.pdf" # Replace with actual PDF path 146 | if os.path.exists(pdf_path): 147 | # Ingest PDF 148 | logger.info("Ingesting PDF...") 149 | result = rag.ingest_pdf(pdf_path) 150 | logger.info(f"Ingestion result: {result}") 151 | 152 | if result["success"]: 153 | # Test queries 154 | test_queries = [ 155 | "What is the main topic of the document?", 156 | "Summarize the key points from the document.", 157 | "What are the main conclusions drawn in the document?", 158 | "what is silha center", 159 | "who is Charlotte Higgins", 160 | "Explain the lawsuits", 161 | "Explain OpenAI's Involvement", 162 | "who is Mike Masnick" 163 | ] 164 | 165 | logger.info("\nTesting queries:") 166 | for query in test_queries: 167 | logger.info(f"\nQuery: {query}") 168 | response = rag.query(query) 169 | logger.info(f"Response: {response}") 170 | else: 171 | logger.error(f"PDF file not found: {pdf_path}") 172 | 173 | if __name__ == "__main__": 174 | test_pdf_rag() -------------------------------------------------------------------------------- /examples/hawkins_basic_demo.py: -------------------------------------------------------------------------------- 1 | """Basic demonstration of HawkinsDB functionality.""" 2 | import time 3 | import logging 4 | from hawkinsdb.core import HawkinsDB 5 | from hawkinsdb.enrichment import ConceptNetEnricher 6 | 7 | # Set up logging 8 | logging.basicConfig(level=logging.INFO) 9 | logger = logging.getLogger(__name__) 10 | 11 | def main(): 12 | """Run the basic HawkinsDB demonstration.""" 13 | print("\nStarting HawkinsDB Basic Demo...") 14 | 15 | # Initialize HawkinsDB with SQLite storage 16 | db = HawkinsDB(storage_type="sqlite", db_path="demo_basic.db") 17 | 18 | # Create a semantic memory 19 | cat_data = { 20 | "name": "cat", 21 | "column": "Semantic", 22 | "properties": { 23 | "type": "animal", 24 | "size": "medium", 25 | "characteristics": ["furry", "agile", "carnivorous"] 26 | }, 27 | "relationships": { 28 | "habitat": ["homes", "outdoors"], 29 | "behavior": ["hunting", "sleeping", "grooming"] 30 | } 31 | } 32 | 33 | # Add basic semantic memory 34 | print("\nAdding basic semantic memory for 'cat'...") 35 | result = db.add_entity(cat_data) 36 | print(f"Result: {result}") 37 | 38 | # Add episodic memory 39 | current_time = time.time() 40 | episode = { 41 | "name": "cat_observation", 42 | "column": "Episodic", 43 | "properties": { 44 | "timestamp": current_time, 45 | "action": "Observed cat behavior", 46 | "location": "Garden", 47 | "details": "Cat was chasing a butterfly" 48 | }, 49 | "relationships": { 50 | "relates_to": ["cat"], 51 | "observed_by": ["human"] 52 | } 53 | } 54 | 55 | print("\nAdding episodic memory...") 56 | result = db.add_entity(episode) 57 | print(f"Result: {result}") 58 | 59 | # Demonstrate ConceptNet enrichment 60 | print("\nEnriching 'cat' with ConceptNet knowledge...") 61 | enricher = ConceptNetEnricher() 62 | enriched = enricher.enrich_entity(db, "cat", "cat") 63 | print("Enrichment completed") 64 | 65 | # Query and display results 66 | print("\nQuerying semantic memory for 'cat':") 67 | cat_info = db.query_frames("cat") 68 | print(cat_info) 69 | print(f"Retrieved 
information: {cat_info}")
70 | 
71 |     print("\nQuerying episodic memory:")
72 |     episode_info = db.query_frames("cat_observation")
73 |     print(f"Retrieved episode: {episode_info}")
74 | 
75 |     print("\nDemo completed successfully!")
76 | 
77 | if __name__ == "__main__":
78 |     main()
79 | 
--------------------------------------------------------------------------------
/examples/hawkins_demo.py:
--------------------------------------------------------------------------------
1 | """
2 | Comprehensive demonstration of HawkinsDB with both JSON and SQLite backends.
3 | """
4 | import logging
5 | import os
6 | import time
7 | from hawkinsdb import HawkinsDB, LLMInterface
8 | 
9 | # Set your OpenAI API key (never commit real keys to source control)
10 | os.environ["OPENAI_API_KEY"] = ""
11 | 
12 | logging.basicConfig(level=logging.INFO)
13 | logger = logging.getLogger(__name__)
14 | 
15 | def demonstrate_memory_operations(db: HawkinsDB):
16 |     """Demonstrate core memory operations with different memory types."""
17 | 
18 |     # Add semantic memory
19 |     logger.info("Adding semantic memory...")
20 |     db.add_entity({
21 |         "name": "Computer",
22 |         "column": "Semantic",
23 |         "properties": {
24 |             "type": "Electronic_Device",
25 |             "purpose": "Computing",
26 |             "components": ["CPU", "RAM", "Storage"],
27 |             "power_source": "Electricity"
28 |         },
29 |         "relationships": {
30 |             "found_in": ["Office", "Home"],
31 |             "used_for": ["Work", "Entertainment"]
32 |         },
33 |         "metadata": {
34 |             "confidence": 1.0,
35 |             "source": "manual",
36 |             "timestamp": time.time()
37 |         }
38 |     })
39 | 
40 |     # Add episodic memory
41 |     logger.info("Adding episodic memory...")
42 |     db.add_entity({
43 |         "name": "First_Computer_Use",
44 |         "column": "Episodic",
45 |         "action": "Setting up new computer",  # Required field for episodic memory
46 |         "properties": {
47 |             "timestamp": str(time.time()),
48 |             "location": "Home Office",
49 |             "duration": "2 hours",
50 |             "outcome": "Success",
51 |             "details": "Initial setup and configuration of the computer"
52 |         },
53 |         "relationships": {
54 |             "involves": ["Computer"],
55 |             "participants": ["User"],
56 |             "next_action": "Software Installation"
57 |         },
58 |         "metadata": {
59 |             "confidence": 1.0,
60 |             "source": "manual",
61 |             "timestamp": time.time()
62 |         }
63 |     })
64 | 
65 |     # Add procedural memory
66 |     logger.info("Adding procedural memory...")
67 |     db.add_entity({
68 |         "name": "Computer_Startup",
69 |         "column": "Procedural",
70 |         "properties": {
71 |             "steps": [
72 |                 "Press power button",
73 |                 "Wait for boot sequence",
74 |                 "Login to account",
75 |                 "Check system status"
76 |             ],
77 |             "difficulty": "Easy",
78 |             "estimated_time": "2 minutes"
79 |         },
80 |         "relationships": {
81 |             "requires": ["Computer"],
82 |             "prerequisites": ["Power_Supply"]
83 |         }
84 |     })
85 | 
86 |     # Query and display results
87 |     logger.info("\nQuerying memories...")
88 |     for entity_name in ["Computer", "First_Computer_Use", "Computer_Startup"]:
89 |         result = db.query_frames(entity_name)
90 |         logger.info(f"\nMemory frames for '{entity_name}':")
91 |         for column, frame in result.items():
92 |             logger.info(f"Column: {column}")
93 |             logger.info(f"Properties: {frame.properties}")
94 |             logger.info(f"Relationships: {frame.relationships}")
95 | 
96 | def demonstrate_llm_interface(db: HawkinsDB):
97 |     """Demonstrate LLM interface capabilities."""
98 |     logger.info("\n=== Testing LLM Interface ===")
99 | 
100 |     # Initialize LLM interface with auto-enrichment
llm = LLMInterface(db, auto_enrich=True) 102 | 103 | # Add entity using natural language 104 | description = """ 105 | Create a semantic memory with name MacBookPro_M2: 106 | - Type: Computer 107 | - Brand: Apple 108 | - Model: MacBook Pro 16" 109 | - Specifications: M2 chip, 32GB RAM, 1TB storage 110 | - Location: Office 111 | - Primary uses: Software development, Video editing 112 | """ 113 | 114 | logger.info("Adding entity using natural language...") 115 | result = llm.add_from_text(description) 116 | logger.info(f"LLM Add Result: {result}") 117 | 118 | # Query using natural language 119 | queries = [ 120 | "What are the specifications of the MacBook Pro?", 121 | "What memory types are stored in the database?", 122 | "How to start a computer according to the stored procedure?" 123 | ] 124 | 125 | for query in queries: 126 | logger.info(f"\nQuerying: {query}") 127 | response = llm.query(query) 128 | logger.info(f"Response: {response}") 129 | 130 | def main(): 131 | """Run the comprehensive demonstration.""" 132 | try: 133 | # Clean up any existing test files 134 | for file in ["demo_json.json", "demo_sqlite.db"]: 135 | if os.path.exists(file): 136 | os.remove(file) 137 | 138 | # Demo with JSON storage 139 | logger.info("\n=== Testing JSON Storage Backend ===") 140 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 141 | demonstrate_memory_operations(json_db) 142 | json_db.cleanup() 143 | 144 | # Demo with SQLite storage 145 | logger.info("\n=== Testing SQLite Storage Backend ===") 146 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 147 | demonstrate_memory_operations(sqlite_db) 148 | 149 | # Test LLM interface with SQLite backend 150 | demonstrate_llm_interface(sqlite_db) 151 | sqlite_db.cleanup() 152 | 153 | logger.info("\nDemonstration completed successfully!") 154 | 155 | except Exception as e: 156 | logger.error(f"Demonstration failed: {str(e)}") 157 | raise 158 | 159 | if __name__ == "__main__": 160 | main() -------------------------------------------------------------------------------- /examples/hawkinsdb_complete_example.py: -------------------------------------------------------------------------------- 1 | """Complete example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import os 3 | import logging 4 | from datetime import datetime 5 | from hawkinsdb import HawkinsDB 6 | 7 | # Configure logging 8 | logging.basicConfig(level=logging.INFO) 9 | logger = logging.getLogger(__name__) 10 | 11 | def setup_database(): 12 | """Initialize HawkinsDB with SQLite backend.""" 13 | try: 14 | # Initialize with SQLite backend 15 | db = HawkinsDB(storage_type='sqlite', db_path='example.db') 16 | logger.info("Database initialized successfully") 17 | return db 18 | except Exception as e: 19 | logger.error(f"Failed to initialize database: {str(e)}") 20 | raise 21 | 22 | def add_example_semantic_memory(db): 23 | """Add semantic memory examples.""" 24 | try: 25 | # Example semantic memory 26 | car_data = { 27 | "name": "Tesla Model 3", 28 | "properties": { 29 | "color": "red", 30 | "year": 2023, 31 | "features": ["autopilot", "glass roof"] 32 | }, 33 | "relationships": { 34 | "manufactured_by": ["Tesla"], 35 | "located_in": ["garage"] 36 | } 37 | } 38 | 39 | result = db.add_entity(car_data) 40 | logger.info(f"Added semantic memory: {result}") 41 | 42 | except ValueError as ve: 43 | logger.error(f"Validation error: {str(ve)}") 44 | except Exception as e: 45 | logger.error(f"Error adding semantic memory: {str(e)}") 46 | 47 | def 
add_example_episodic_memory(db): 48 | """Add episodic memory examples.""" 49 | try: 50 | # Example episodic memory with proper timestamp format 51 | current_time = datetime.now().isoformat() 52 | test_drive = { 53 | "name": "first_drive", 54 | "column": "Episodic", 55 | "properties": { 56 | "timestamp": current_time, 57 | "action": "test drive", 58 | "duration": "45 minutes" 59 | }, 60 | "relationships": { 61 | "involves": ["Tesla Model 3"], 62 | "location": ["dealership"] 63 | } 64 | } 65 | 66 | # add_entity routes the record to the Episodic column via its "column" field 67 | db.add_entity(test_drive) 68 | logger.info("Added episodic memory successfully") 69 | 70 | except Exception as e: 71 | logger.error(f"Error adding episodic memory: {str(e)}") 72 | 73 | def query_and_display_memory(db): 74 | """Query and display stored memories.""" 75 | try: 76 | # List all entities 77 | entities = db.list_entities() 78 | logger.info(f"\nStored entities: {entities}") 79 | 80 | # Query specific memories 81 | for entity_name in entities: 82 | frames = db.query_frames(entity_name) 83 | logger.info(f"\nMemory frames for '{entity_name}':") 84 | 85 | for column_name, frame in frames.items(): 86 | logger.info(f"\nColumn: {column_name}") 87 | logger.info(f"Properties: {frame.properties}") 88 | logger.info(f"Relationships: {frame.relationships}") 89 | if frame.history: 90 | logger.info("History:") 91 | for timestamp, event in frame.history: 92 | logger.info(f" {timestamp}: {event}") 93 | 94 | except Exception as e: 95 | logger.error(f"Error querying memory: {str(e)}") 96 | 97 | def main(): 98 | """Main execution function demonstrating HawkinsDB usage.""" 99 | try: 100 | # Setup database 101 | db = setup_database() 102 | 103 | # Add different types of memories 104 | add_example_semantic_memory(db) 105 | add_example_episodic_memory(db) 106 | 107 | # Query and display stored memories 108 | query_and_display_memory(db) 109 | 110 | except Exception as e: 111 | logger.error(f"Application error: {str(e)}") 112 | finally: 113 | if 'db' in locals(): 114 | db.cleanup() 115 | 116 | if __name__ == '__main__': 117 | main() 118 | -------------------------------------------------------------------------------- /examples/hawkinsdb_comprehensive.py: -------------------------------------------------------------------------------- 1 | """Comprehensive example demonstrating all major features of HawkinsDB.""" 2 | import logging 3 | import time 4 | from datetime import datetime 5 | from typing import Dict, Any 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | from hawkinsdb.types import CorticalColumn, ReferenceFrame, PropertyCandidate 8 | import json 9 | import os 10 | os.environ.setdefault("OPENAI_API_KEY", "your-openai-api-key")  # placeholder; never commit real keys 11 | 12 | # Configure logging 13 | logging.basicConfig(level=logging.INFO, 14 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 15 | logger = logging.getLogger(__name__) 16 | 17 | def demonstrate_basic_operations(): 18 | """Demonstrate basic database operations with both JSON and SQLite backends.""" 19 | # Initialize databases with proper configuration 20 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 21 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 22 | 23 | # Test data - A Tesla car entity 24 | car_data = { 25 | "name": 
"Tesla_Model_3", 26 | "column": "Semantic", 27 | "properties": { 28 | "color": "red", 29 | "year": "2023", 30 | "mileage": "1000 miles", 31 | "features": ["autopilot capabilities", "glass roof"], 32 | "type": "electric vehicle" 33 | }, 34 | "relationships": { 35 | "type_of": ["Vehicle"], 36 | "location": "garage" 37 | } 38 | } 39 | 40 | # Add to both databases 41 | logger.info("\nAdding car entity to JSON database...") 42 | json_db.add_entity(car_data) 43 | 44 | logger.info("\nAdding car entity to SQLite database...") 45 | sqlite_db.add_entity(car_data) 46 | 47 | # Query and verify data from both databases 48 | logger.info("\nQuerying JSON database...") 49 | json_result = json_db.query_frames("Tesla_Model_3") 50 | if isinstance(json_result, dict): 51 | formatted_json = {k: v.to_dict() if hasattr(v, 'to_dict') else str(v) 52 | for k, v in json_result.items()} 53 | logger.info(f"JSON DB Result: {json.dumps(formatted_json, indent=2)}") 54 | else: 55 | logger.info(f"JSON DB Result: {str(json_result)}") 56 | 57 | logger.info("\nQuerying SQLite database...") 58 | sqlite_result = sqlite_db.query_frames("Tesla_Model_3") 59 | if isinstance(sqlite_result, dict): 60 | formatted_sqlite = {k: v.to_dict() if hasattr(v, 'to_dict') else str(v) 61 | for k, v in sqlite_result.items()} 62 | logger.info(f"SQLite DB Result: {json.dumps(formatted_sqlite, indent=2)}") 63 | else: 64 | logger.info(f"SQLite DB Result: {str(sqlite_result)}") 65 | 66 | # Clean up resources 67 | 68 | return json_db, sqlite_db 69 | 70 | def demonstrate_conceptnet_enrichment(db: HawkinsDB): 71 | """Demonstrate ConceptNet integration and knowledge enrichment.""" 72 | logger.info("\n=== Demonstrating ConceptNet Integration ===") 73 | 74 | # Initialize ConceptNet interface 75 | from hawkinsdb import ConceptNetEnricher 76 | 77 | # Initialize enricher with the database instance 78 | enricher = ConceptNetEnricher() # ConceptNet's public API doesn't require a key 79 | 80 | # Create an entity with basic information 81 | entity_name = "Computer" 82 | entity_type = "Device" 83 | computer_data = { 84 | "name": entity_name, 85 | "column": "Semantic", 86 | "properties": { 87 | "type": entity_type, 88 | "purpose": "Computing", 89 | "location": "Office" 90 | } 91 | } 92 | 93 | # First add the entity to the database 94 | logger.info("Adding computer entity to database...") 95 | add_result = db.add_entity(computer_data) 96 | logger.info(f"Add result: {add_result}") 97 | 98 | # Then enrich it using ConceptNet 99 | logger.info("\nEnriching computer entity with ConceptNet data...") 100 | enriched_data = None 101 | 102 | ''' 103 | 104 | try: 105 | # Enrich the entity with both name and type 106 | enriched = enricher.enrich_entity( 107 | db=db, 108 | entity_name=entity_name, 109 | entity_type=entity_type 110 | ) 111 | 112 | if enriched: 113 | logger.info(f"Successfully enriched entity {entity_name} with ConceptNet data") 114 | 115 | # Query the enriched entity 116 | enriched_data = db.query_frames(entity_name) 117 | semantic_frame = enriched_data.get("Semantic") 118 | 119 | if semantic_frame: 120 | # Log properties 121 | logger.info("\nEnriched properties:") 122 | if hasattr(semantic_frame, 'properties'): 123 | for prop_name, candidates in semantic_frame.properties.items(): 124 | logger.info(f"\n{prop_name}:") 125 | if isinstance(candidates, list): 126 | for candidate in candidates: 127 | if hasattr(candidate, 'value'): 128 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 129 | else: 130 | logger.info(f" - {candidate}") 131 | 
132 | # Log relationships 133 | logger.info("\nEnriched relationships:") 134 | if hasattr(semantic_frame, 'relationships'): 135 | for rel_type, candidates in semantic_frame.relationships.items(): 136 | logger.info(f"\n{rel_type}:") 137 | if isinstance(candidates, list): 138 | for candidate in candidates: 139 | if hasattr(candidate, 'value'): 140 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 141 | else: 142 | logger.info(f" - {candidate}") 143 | else: 144 | logger.warning(f"No enrichment data found for entity {entity_name}") 145 | 146 | except Exception as e: 147 | logger.error(f"Error during ConceptNet enrichment: {str(e)}") 148 | 149 | return enriched_data 150 | 151 | ''' 152 | 153 | # Query and display the enriched entity 154 | try: 155 | # Query and display the enriched entity 156 | enriched_result = db.query_frames("computer") 157 | semantic_frame = enriched_result.get("Semantic") 158 | 159 | if semantic_frame: 160 | logger.info("\nQueried enriched entity:") 161 | logger.info("\nProperties:") 162 | if hasattr(semantic_frame, 'properties'): 163 | for prop_name, candidates in semantic_frame.properties.items(): 164 | logger.info(f"\n{prop_name}:") 165 | if isinstance(candidates, list): 166 | for candidate in candidates: 167 | if hasattr(candidate, 'value'): 168 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 169 | else: 170 | logger.info(f" - {candidate}") 171 | else: 172 | logger.info(f" - {candidates}") 173 | 174 | logger.info("\nRelationships:") 175 | if hasattr(semantic_frame, 'relationships'): 176 | for rel_type, candidates in semantic_frame.relationships.items(): 177 | logger.info(f"\n{rel_type}:") 178 | if isinstance(candidates, list): 179 | for candidate in candidates: 180 | if hasattr(candidate, 'value'): 181 | logger.info(f" - {candidate.value} (confidence: {candidate.confidence:.2f})") 182 | else: 183 | logger.info(f" - {candidate}") 184 | else: 185 | logger.info(f" - {candidates}") 186 | 187 | return enriched_result 188 | except Exception as e: 189 | logger.error(f"Error querying enriched entity: {str(e)}") 190 | return None 191 | 192 | def demonstrate_llm_interface(db: HawkinsDB): 193 | """Demonstrate LLM interface for natural language interactions.""" 194 | logger.info("\n=== Demonstrating LLM Interface ===") 195 | 196 | # Initialize LLM interface with auto-enrichment and proper error handling 197 | try: 198 | llm = LLMInterface(db, auto_enrich=True) 199 | except Exception as e: 200 | logger.error(f"Failed to initialize LLM interface: {str(e)}") 201 | return None, None 202 | 203 | # First, add a structured entity to query later 204 | laptop_entity = { 205 | "name": "MacBookPro_M3", 206 | "column": "Semantic", 207 | "properties": { 208 | "brand": "Apple", 209 | "model": "MacBook Pro", 210 | "year": "2024", 211 | "processor": "M3 chip", 212 | "ram": "16GB", 213 | "storage": "512GB SSD", 214 | "location": "home office" 215 | }, 216 | "relationships": { 217 | "type_of": ["Laptop", "Computer"], 218 | "manufactured_by": ["Apple"] 219 | } 220 | } 221 | 222 | # Add the entity directly first 223 | logger.info("\nAdding MacBook Pro entity...") 224 | db.add_entity(laptop_entity) 225 | enriched_data = db.query_frames("MacBookPro_M3") 226 | semantic_frame = enriched_data.get("Semantic") 227 | print(semantic_frame) 228 | # Now demonstrate natural language interaction 229 | logger.info("\nAdding new entity using natural language...") 230 | new_entity = { 231 | "name": "iPhone15Pro", 232 | "column": "Semantic", 233 | 
"properties": { 234 | "color": "Space Black", 235 | "storage": "256GB", 236 | "features": ["A17 Pro chip", "ProMotion display"], 237 | "location": "desk drawer", 238 | "type": "smartphone" 239 | }, 240 | "relationships": { 241 | "manufacturer": ["Apple"], 242 | "type_of": ["Mobile Device"] 243 | } 244 | } 245 | # Add entity directly first 246 | logger.info("Adding iPhone entity directly...") 247 | test = db.add_entity(new_entity) 248 | print(test) 249 | 250 | 251 | # Then demonstrate natural language interaction 252 | logger.info("\nQuerying using natural language about the new iPhone...") 253 | llm_result = llm.query("What are the features of the iPhone 15 Pro?") 254 | logger.info(f"LLM Query Result: {json.dumps(llm_result, indent=2)}") 255 | 256 | # Query existing entities using natural language 257 | logger.info("\nQuerying using natural language...") 258 | queries = [ 259 | "What are the specifications of the MacBook Pro?", 260 | "Where is the iPhone 15 Pro located?", 261 | "List all Apple devices and their features", 262 | "What is computer for what it is used for", 263 | "What is Explain about iphone", 264 | "Explain the Features of Tesla Model 3" 265 | ] 266 | 267 | query_results = [] 268 | for query in queries: 269 | logger.info(f"\nQuery: {query}") 270 | result = llm.query(query) 271 | logger.info(f"Response: {result}") 272 | query_results.append(result) 273 | 274 | return llm_result, query_results 275 | 276 | def main(): 277 | """Run comprehensive demonstration of HawkinsDB features.""" 278 | try: 279 | logger.info("Starting HawkinsDB comprehensive example...") 280 | 281 | # Basic operations with both backends 282 | json_db, sqlite_db = demonstrate_basic_operations() 283 | 284 | # ConceptNet integration (using SQLite backend for persistence) 285 | enriched_data = demonstrate_conceptnet_enrichment(sqlite_db) 286 | 287 | # LLM interface demonstration 288 | llm_results = demonstrate_llm_interface(sqlite_db) 289 | 290 | logger.info("Example completed successfully!") 291 | 292 | except Exception as e: 293 | logger.error(f"Example failed: {str(e)}") 294 | raise 295 | 296 | if __name__ == "__main__": 297 | main() -------------------------------------------------------------------------------- /examples/hawkinsdb_demo.py: -------------------------------------------------------------------------------- 1 | """ 2 | Complete demonstration of HawkinsDB functionality using both JSON and SQLite backends. 
3 | """ 4 | import logging 5 | import os 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.storage.sqlite import SQLiteStorage 8 | 9 | # Configure logging 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | def demo_memory_operations(db: HawkinsDB): 14 | """Demonstrate core memory operations.""" 15 | logger.info("Adding semantic memory...") 16 | db.add_entity({ 17 | "name": "Apple", 18 | "column": "Semantic", 19 | "properties": { 20 | "color": "red", 21 | "taste": "sweet", 22 | "category": "fruit" 23 | }, 24 | "relationships": { 25 | "grows_on": "tree", 26 | "belongs_to": ["fruits", "healthy_foods"] 27 | } 28 | }) 29 | 30 | logger.info("Adding episodic memory...") 31 | db.add_entity({ 32 | "name": "First_Apple_Experience", 33 | "column": "Episodic", 34 | "properties": { 35 | "timestamp": "2024-01-01T12:00:00", 36 | "action": "tasting an apple", 37 | "location": "kitchen" 38 | }, 39 | "relationships": { 40 | "involves": ["Apple", "Kitchen"] 41 | } 42 | }) 43 | 44 | logger.info("Adding procedural memory...") 45 | db.add_entity({ 46 | "name": "Apple_Pie_Recipe", 47 | "column": "Procedural", 48 | "properties": { 49 | "steps": [ 50 | "Peel and slice apples", 51 | "Mix with sugar and cinnamon", 52 | "Prepare pie crust", 53 | "Bake at 375°F for 45 minutes" 54 | ], 55 | "difficulty": "medium" 56 | }, 57 | "relationships": { 58 | "requires": ["Apple", "Sugar", "Flour"] 59 | } 60 | }) 61 | 62 | # Query and display results 63 | logger.info("Querying memories...") 64 | for entity_name in ["Apple", "First_Apple_Experience", "Apple_Pie_Recipe"]: 65 | frames = db.query_frames(entity_name) 66 | logger.info(f"\nFound frames for '{entity_name}':") 67 | for column_name, frame in frames.items(): 68 | logger.info(f"Column: {column_name}") 69 | logger.info(f"Properties: {frame.properties}") 70 | logger.info(f"Relationships: {frame.relationships}") 71 | 72 | def main(): 73 | """Run the demonstration with both storage backends.""" 74 | # Clean up any existing test files 75 | for file in ["demo_json.json", "demo_sqlite.db"]: 76 | if os.path.exists(file): 77 | os.remove(file) 78 | 79 | # Demo with JSON storage 80 | logger.info("\n=== Testing JSON Storage Backend ===") 81 | json_db = HawkinsDB(db_path="demo_json.json", storage_type="json") 82 | demo_memory_operations(json_db) 83 | json_db.cleanup() 84 | 85 | # Demo with SQLite storage 86 | logger.info("\n=== Testing SQLite Storage Backend ===") 87 | sqlite_db = HawkinsDB(db_path="demo_sqlite.db", storage_type="sqlite") 88 | demo_memory_operations(sqlite_db) 89 | sqlite_db.cleanup() 90 | 91 | if __name__ == "__main__": 92 | main() 93 | -------------------------------------------------------------------------------- /examples/hawkinsdb_full_example.py: -------------------------------------------------------------------------------- 1 | """ 2 | Comprehensive example demonstrating all major features of HawkinsDB. 3 | This example showcases: 4 | 1. Basic CRUD operations 5 | 2. Advanced caching mechanisms 6 | 3. Different memory types (Semantic, Episodic, Procedural) 7 | 4. ConceptNet integration 8 | 5. Memory type validations 9 | 6. 
Performance monitoring 10 | """ 11 | 12 | import logging 13 | import time 14 | import json 15 | from hawkinsdb import HawkinsDB, LLMInterface 16 | from datetime import datetime 17 | 18 | logging.basicConfig(level=logging.INFO) 19 | logger = logging.getLogger(__name__) 20 | 21 | def demonstrate_memory_types(db: HawkinsDB): 22 | """Demonstrate different memory types and their validations.""" 23 | logger.info("\n=== Testing Different Memory Types ===") 24 | 25 | # Semantic Memory 26 | semantic_data = { 27 | "name": "Tesla_Model_3", 28 | "column": "Semantic", 29 | "properties": { 30 | "type": "electric_car", 31 | "manufacturer": "Tesla", 32 | "year": 2024, 33 | "features": ["autopilot", "battery_powered", "touch_screen"] 34 | }, 35 | "relationships": { 36 | "similar_to": ["Tesla_Model_Y", "Tesla_Model_S"], 37 | "competes_with": ["BMW_i4", "Polestar_2"] 38 | } 39 | } 40 | db.add_entity(semantic_data) 41 | logger.info("Added semantic memory: Tesla Model 3") 42 | 43 | # Episodic Memory 44 | episodic_data = { 45 | "name": "First_Tesla_Drive", 46 | "column": "Episodic", 47 | "properties": { 48 | "timestamp": datetime.now().isoformat(), 49 | "action": "test_drive", 50 | "location": {"city": "Palo Alto", "state": "CA"}, 51 | "duration": "45 minutes", 52 | "participants": ["customer", "sales_rep"] 53 | } 54 | } 55 | db.add_entity(episodic_data) 56 | logger.info("Added episodic memory: First Tesla Drive") 57 | 58 | # Procedural Memory 59 | procedural_data = { 60 | "name": "Tesla_Charging_Process", 61 | "column": "Procedural", 62 | "properties": { 63 | "steps": [ 64 | "Park near charging station", 65 | "Open charging port", 66 | "Connect charging cable", 67 | "Initiate charging via touchscreen", 68 | "Wait for desired charge level", 69 | "Disconnect charging cable" 70 | ], 71 | "required_tools": ["charging_cable", "Tesla_app"], 72 | "difficulty": "easy" 73 | } 74 | } 75 | db.add_entity(procedural_data) 76 | logger.info("Added procedural memory: Tesla Charging Process") 77 | 78 | # Function removed as caching is no longer supported 79 | 80 | def main(): 81 | """Run the comprehensive example.""" 82 | # Initialize database with SQLite storage 83 | db = HawkinsDB( 84 | storage_type='sqlite' 85 | ) 86 | 87 | try: 88 | # Test different memory types 89 | demonstrate_memory_types(db) 90 | 91 | # Test queries 92 | logger.info("\n=== Testing Queries ===") 93 | tesla_data = db.query_frames("Tesla_Model_3") 94 | # Convert ReferenceFrame objects to dictionaries before JSON serialization 95 | tesla_data_dict = { 96 | column_name: frame.to_dict() 97 | for column_name, frame in tesla_data.items() 98 | } 99 | logger.info(f"Query result for Tesla Model 3: {json.dumps(tesla_data_dict, indent=2)}") 100 | 101 | # List all entities 102 | logger.info("\n=== All Entities ===") 103 | all_entities = db.list_entities() 104 | logger.info(f"Total entities: {len(all_entities)}") 105 | logger.info(f"Entities: {json.dumps(all_entities, indent=2)}") 106 | 107 | except Exception as e: 108 | logger.error(f"Error during example execution: {e}") 109 | raise 110 | finally: 111 | db.cleanup() 112 | 113 | if __name__ == "__main__": 114 | main() 115 | -------------------------------------------------------------------------------- /examples/hawkinsdb_sqlite_example.py: -------------------------------------------------------------------------------- 1 | """Example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import logging 3 | from hawkinsdb import HawkinsDB 4 | 5 | # Configure logging 6 | logging.basicConfig(level=logging.INFO) 7 
| logger = logging.getLogger(__name__) 8 | 9 | def main(): 10 | # Initialize HawkinsDB with SQLite backend 11 | db = HawkinsDB(storage_type='sqlite', db_path='example.db') 12 | 13 | # Example 1: Adding Semantic Memory 14 | semantic_data = { 15 | "name": "Tesla Model 3", 16 | "properties": { 17 | "color": "red", 18 | "year": 2023, 19 | "features": ["autopilot", "glass roof"] 20 | }, 21 | "relationships": { 22 | "manufactured_by": ["Tesla"], 23 | "located_in": ["garage"] 24 | } 25 | } 26 | 27 | try: 28 | result = db.add_entity(semantic_data) 29 | logger.info(f"Added semantic memory: {result}") 30 | except Exception as e: 31 | logger.error(f"Error adding semantic memory: {e}") 32 | 33 | # Example 2: Adding Episodic Memory 34 | episodic_data = { 35 | "name": "first_drive", 36 | "column": "Episodic", 37 | "properties": { 38 | "timestamp": "2024-01-01T10:00:00", 39 | "action": "test drive", 40 | "duration": "45 minutes" 41 | }, 42 | "relationships": { 43 | "involves": ["Tesla Model 3"], 44 | "location": ["dealership"] 45 | } 46 | } 47 | 48 | try: 49 | # add_entity routes the record to the Episodic column via its "column" field 50 | db.add_entity(episodic_data) 51 | logger.info("Added episodic memory successfully") 52 | except Exception as e: 53 | logger.error(f"Error adding episodic memory: {e}") 54 | 55 | # Query and display stored information 56 | try: 57 | # List all entities 58 | entities = db.list_entities() 59 | logger.info(f"Stored entities: {entities}") 60 | 61 | # Query specific entity 62 | tesla_frames = db.query_frames("Tesla Model 3") 63 | for column, frame in tesla_frames.items(): 64 | logger.info(f"\nColumn: {column}") 65 | logger.info(f"Properties: {frame.properties}") 66 | logger.info(f"Relationships: {frame.relationships}") 67 | 68 | except Exception as e: 69 | logger.error(f"Error querying data: {e}") 70 | 71 | # Cleanup 72 | db.cleanup() 73 | 74 | if __name__ == "__main__": 75 | main() 76 | -------------------------------------------------------------------------------- /examples/sqlite_usage.py: -------------------------------------------------------------------------------- 1 | """Example demonstrating HawkinsDB usage with SQLite backend.""" 2 | import os 3 | import logging 4 | from datetime import datetime 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.storage.sqlite import SQLiteStorage 8 | 9 | # Configure logging 10 | logging.basicConfig(level=logging.INFO, 11 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 12 | logger = logging.getLogger(__name__) 13 | 14 | def setup_database(): 15 | """Initialize HawkinsDB with SQLite backend.""" 16 | try: 17 | # Set custom SQLite path (optional) 18 | db_path = Path('./hawkins_memory.db').absolute() 19 | 20 | # Initialize database with SQLite backend explicitly 21 | db = HawkinsDB(storage_type='sqlite') # This is the recommended way 22 | 23 | # Alternatively, you can initialize with custom path: 24 | # storage = SQLiteStorage(db_path=str(db_path)) 25 | # db = HawkinsDB(storage=storage) 26 | 27 | logger.info("Database initialized with SQLite backend") 28 | return db 29 | 30 | except Exception as e: 31 | logger.error(f"Failed to initialize database: {str(e)}") 32 | raise 33 | # Example memory types supported by HawkinsDB: 34 | # 1. Semantic: For storing facts and properties about entities 35 | # 2. Episodic: For storing time-based events and experiences 36 | # 3. 
Procedural: For storing step-by-step procedures or workflows 37 | 38 | def add_example_entities(db): 39 | """Add example entities to the database.""" 40 | # Example entities 41 | entities = [ 42 | { 43 | "name": "Tesla Model 3", 44 | "properties": { 45 | "color": "red", 46 | "year": 2023, 47 | "mileage": 1000, 48 | "features": ["autopilot", "glass roof"] 49 | }, 50 | "relationships": { 51 | "located_in": ["garage"], 52 | "manufactured_by": ["Tesla"] 53 | } 54 | }, 55 | { 56 | "name": "Smart Home Hub", 57 | "properties": { 58 | "brand": "HomeKit", 59 | "connected_devices": 5, 60 | "firmware_version": "2.1.0" 61 | }, 62 | "relationships": { 63 | "controls": ["lights", "thermostat"], 64 | "connected_to": ["wifi_network"] 65 | } 66 | } 67 | ] 68 | 69 | for entity_data in entities: 70 | try: 71 | result = db.add_entity(entity_data) 72 | logger.info(f"Added entity: {result['entity_name']}") 73 | except ValueError as ve: 74 | logger.error(f"Invalid entity data: {str(ve)}") 75 | except Exception as e: 76 | logger.error(f"Error adding entity {entity_data['name']}: {str(e)}") 77 | 78 | def query_and_display_data(db): 79 | """Query and display stored data.""" 80 | try: 81 | # List all entities 82 | entities = db.list_entities() 83 | logger.info(f"Stored entities: {entities}") 84 | 85 | # Query frames for each entity 86 | for entity_name in entities: 87 | try: 88 | frames = db.query_frames(entity_name) 89 | logger.info(f"\nEntity: {entity_name}") 90 | 91 | for column, frame in frames.items(): 92 | logger.info(f"Column: {column}") 93 | logger.info(f"Properties: {frame.properties}") 94 | logger.info(f"Relationships: {frame.relationships}") 95 | 96 | except Exception as e: 97 | logger.error(f"Error querying frames for {entity_name}: {str(e)}") 98 | 99 | except Exception as e: 100 | logger.error(f"Error listing entities: {str(e)}") 101 | 102 | def main(): 103 | """Main execution function.""" 104 | try: 105 | # Setup database 106 | db = setup_database() 107 | 108 | # Add example entities 109 | add_example_entities(db) 110 | 111 | # Query and display data 112 | query_and_display_data(db) 113 | 114 | except Exception as e: 115 | logger.error(f"Application error: {str(e)}") 116 | finally: 117 | # Cleanup (if needed) 118 | if 'db' in locals(): 119 | db.cleanup() 120 | 121 | if __name__ == '__main__': 122 | main() 123 | -------------------------------------------------------------------------------- /hawkinsdb/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | HawkinsDB - A memory layer with SQLite backend and error handling. 
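Minimal usage sketch (see the examples directory for complete workflows): db = HawkinsDB(storage_type="sqlite"); db.add_entity({"name": "Apple", "column": "Semantic", "properties": {"color": "red"}}); frames = db.query_frames("Apple").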
3 | 4 | Core Features: 5 | - Multiple memory types (Semantic, Episodic, Procedural) 6 | - SQLite and JSON storage backends 7 | - Robust error handling and data validation 8 | """ 9 | 10 | from .core import HawkinsDB, JSONStorage 11 | 12 | __version__ = "1.0.1" 13 | __author__ = "HawkinsDB Contributors" 14 | __email__ = "hawkinsdb@example.com" 15 | __license__ = "MIT" 16 | 17 | # Core components only 18 | __all__ = [ 19 | 'HawkinsDB', 20 | 'JSONStorage' 21 | ] 22 | 23 | # Optional components loaded if dependencies are available 24 | try: 25 | from .enrichment import ConceptNetEnricher 26 | __all__.append('ConceptNetEnricher') 27 | except ImportError: 28 | pass 29 | 30 | try: 31 | from .llm_interface import LLMInterface 32 | __all__.append('LLMInterface') 33 | except ImportError: 34 | pass 35 | -------------------------------------------------------------------------------- /hawkinsdb/base.py: -------------------------------------------------------------------------------- 1 | """Base classes for HawkinsDB.""" 2 | from abc import ABC, abstractmethod 3 | from typing import ( 4 | Any, Dict, List, Optional, Sequence, TypeVar, Generic, 5 | Protocol, runtime_checkable 6 | ) 7 | from typing_extensions import TypeAlias 8 | from datetime import datetime 9 | import time 10 | 11 | # Type variables for generic implementations 12 | T = TypeVar('T') 13 | T_PropertyCandidate = TypeVar('T_PropertyCandidate', bound='PropertyCandidate') 14 | T_ReferenceFrame = TypeVar('T_ReferenceFrame', bound='ReferenceFrame') 15 | T_CorticalColumn = TypeVar('T_CorticalColumn', bound='CorticalColumn') 16 | 17 | @runtime_checkable 18 | class StorageBackend(Protocol[T_CorticalColumn]): 19 | """Protocol class for storage backends.""" 20 | 21 | def load_columns(self) -> Sequence[T_CorticalColumn]: 22 | """Load all columns from storage.""" 23 | ... 24 | 25 | def save_columns(self, columns: Sequence[T_CorticalColumn]) -> None: 26 | """Save all columns to storage.""" 27 | ... 28 | 29 | def initialize(self) -> None: 30 | """Initialize the storage backend.""" 31 | ... 32 | 33 | def cleanup(self) -> None: 34 | """Cleanup any resources.""" 35 | ... 
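# Illustrative sketch (hypothetical class, not part of the package): since
# StorageBackend is a runtime-checkable Protocol, any object exposing these four
# methods satisfies it structurally, e.g.:
#
#     class InMemoryStorage:
#         """Toy backend keeping columns in a plain list."""
#         def __init__(self): self._columns = []
#         def load_columns(self): return list(self._columns)
#         def save_columns(self, columns): self._columns = list(columns)
#         def initialize(self): self._columns = []
#         def cleanup(self): self._columns.clear()
#
#     isinstance(InMemoryStorage(), StorageBackend)  # -> True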
36 | 37 | class PropertyCandidate: 38 | """Represents a candidate value for a property.""" 39 | 40 | def __init__(self, value, confidence=1.0, sources=None): 41 | """Initialize a property candidate with value and metadata.""" 42 | if isinstance(value, dict): 43 | if 'value' in value: 44 | self.value = value['value'] 45 | self.confidence = float(value.get('confidence', confidence)) 46 | self.sources = list(value.get('sources', sources or [])) 47 | else: 48 | self.value = value 49 | self.confidence = float(confidence) 50 | self.sources = list(sources) if sources else [] 51 | elif isinstance(value, PropertyCandidate): 52 | self.value = value.value 53 | self.confidence = float(value.confidence) 54 | self.sources = list(value.sources) 55 | else: 56 | self.value = value 57 | self.confidence = float(confidence) 58 | self.sources = list(sources) if sources else [] 59 | 60 | if not (0.0 <= self.confidence <= 1.0): 61 | raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}") 62 | 63 | self.timestamp = time.time() 64 | 65 | def to_dict(self): 66 | """Convert to dictionary representation.""" 67 | return { 68 | "value": self.value, 69 | "confidence": self.confidence, 70 | "sources": self.sources, 71 | "timestamp": self.timestamp 72 | } 73 | 74 | @staticmethod 75 | def from_dict(d): 76 | """Create from dictionary representation.""" 77 | if not isinstance(d, dict): 78 | raise TypeError(f"Expected dict, got {type(d)}") 79 | 80 | if 'value' not in d: 81 | raise ValueError("Dictionary must contain 'value' key") 82 | 83 | return PropertyCandidate( 84 | value=d["value"], 85 | confidence=float(d.get("confidence", 1.0)), 86 | sources=list(d.get("sources", [])) 87 | ) 88 | 89 | class ReferenceFrame: 90 | """Represents a concept or object as a reference frame.""" 91 | 92 | def __init__(self, name, properties=None, relationships=None, location=None, history=None): 93 | """Initialize frame with proper dictionary handling.""" 94 | # Handle dictionary input 95 | if isinstance(name, dict): 96 | data = name 97 | self.name = data.get('name') 98 | properties = data.get('properties', {}) 99 | relationships = data.get('relationships', {}) 100 | location = data.get('location', {}) 101 | history = data.get('history', []) 102 | else: 103 | self.name = name 104 | 105 | self.properties = properties if properties is not None else {} 106 | self.relationships = relationships if relationships is not None else {} 107 | self.location = location if location is not None else {} 108 | self.history = history if history is not None else [] 109 | self.created_at = datetime.now().isoformat() 110 | self.updated_at = self.created_at 111 | 112 | def to_dict(self): 113 | """Convert frame to dictionary representation.""" 114 | try: 115 | return { 116 | "name": self.name, 117 | "properties": { 118 | k: [c.to_dict() if isinstance(c, PropertyCandidate) else {"value": c} for c in candidates] 119 | for k, candidates in self.properties.items() 120 | }, 121 | "relationships": { 122 | k: [c.to_dict() if isinstance(c, PropertyCandidate) else {"value": c} for c in candidates] 123 | for k, candidates in self.relationships.items() 124 | }, 125 | "location": self.location, 126 | "history": self.history, 127 | "created_at": self.created_at, 128 | "updated_at": self.updated_at 129 | } 130 | except Exception as e: 131 | raise ValueError(f"Error converting frame to dict: {str(e)}") 132 | 133 | def __str__(self): 134 | return f"ReferenceFrame(name={self.name})" 135 | 136 | def __repr__(self): 137 | return self.__str__() 138 | 139 | 
@staticmethod 140 | def from_dict(data): 141 | """Create frame from dictionary representation.""" 142 | try: 143 | if not isinstance(data, dict): 144 | if isinstance(data, str): 145 | # Handle string input by creating minimal frame 146 | return ReferenceFrame(name=data) 147 | raise TypeError(f"Input must be a dictionary or string, got {type(data)}") 148 | 149 | props = {} 150 | for k, vlist in data.get("properties", {}).items(): 151 | if isinstance(vlist, (list, tuple)): 152 | processed = [] 153 | for v in vlist: 154 | if isinstance(v, dict): 155 | processed.append(PropertyCandidate.from_dict(v)) 156 | else: 157 | processed.append(PropertyCandidate(v)) 158 | props[k] = processed 159 | else: 160 | props[k] = [PropertyCandidate(vlist)] 161 | 162 | rels = {} 163 | for k, vlist in data.get("relationships", {}).items(): 164 | if isinstance(vlist, (list, tuple)): 165 | processed = [] 166 | for v in vlist: 167 | if isinstance(v, dict): 168 | processed.append(PropertyCandidate.from_dict(v)) 169 | else: 170 | processed.append(PropertyCandidate(v)) 171 | rels[k] = processed 172 | else: 173 | rels[k] = [PropertyCandidate(vlist)] 174 | 175 | frame = ReferenceFrame( 176 | name=data.get("name", ""), 177 | properties=props, 178 | relationships=rels, 179 | location=data.get("location", {}), 180 | history=data.get("history", []) 181 | ) 182 | frame.created_at = data.get("created_at", frame.created_at) 183 | frame.updated_at = data.get("updated_at", frame.updated_at) 184 | return frame 185 | 186 | except Exception as e: 187 | raise ValueError(f"Error creating frame from dict: {str(e)}") 188 | 189 | class CorticalColumn: 190 | """Represents a collection of reference frames with error handling.""" 191 | 192 | def __init__(self, name, frames=None): 193 | """Initialize a cortical column with proper dictionary handling.""" 194 | # Handle dictionary input 195 | if isinstance(name, dict): 196 | data = name 197 | self.name = data.get('name') 198 | frames = data.get('frames', []) 199 | self.created_at = data.get('created_at', datetime.now().isoformat()) 200 | self.updated_at = data.get('updated_at', self.created_at) 201 | else: 202 | if not name: 203 | raise ValueError("Column name cannot be empty") 204 | self.name = name 205 | self.created_at = datetime.now().isoformat() 206 | self.updated_at = self.created_at 207 | 208 | self.frames = [] 209 | if frames: 210 | for frame in frames: 211 | if isinstance(frame, dict): 212 | self.frames.append(ReferenceFrame(frame)) 213 | else: 214 | self.frames.append(frame) 215 | 216 | def to_dict(self): 217 | """Convert column to dictionary representation with error handling.""" 218 | try: 219 | return { 220 | "name": self.name, 221 | "frames": [f.to_dict() for f in self.frames], 222 | "created_at": self.created_at, 223 | "updated_at": self.updated_at 224 | } 225 | except Exception as e: 226 | raise ValueError(f"Error converting column to dict: {str(e)}") 227 | 228 | @staticmethod 229 | def from_dict(data): 230 | """Create column from dictionary with validation and error handling.""" 231 | try: 232 | if not isinstance(data, dict): 233 | raise TypeError("Input must be a dictionary") 234 | 235 | if "name" not in data: 236 | raise ValueError("Column data must contain 'name' field") 237 | 238 | frames = [] 239 | for frame_data in data.get("frames", []): 240 | try: 241 | frame = ReferenceFrame.from_dict(frame_data) 242 | frames.append(frame) 243 | except Exception as e: 244 | raise ValueError(f"Error creating frame: {str(e)}") 245 | 246 | col = CorticalColumn(name=data["name"], 
frames=frames) 247 | col.created_at = data.get("created_at", col.created_at) 248 | col.updated_at = data.get("updated_at", col.updated_at) 249 | return col 250 | 251 | except Exception as e: 252 | raise ValueError(f"Error creating column from dict: {str(e)}") 253 | 254 | class BaseJSONStorage(StorageBackend[T_CorticalColumn]): 255 | """Base class for JSON storage implementation.""" 256 | def load_columns(self) -> Sequence[T_CorticalColumn]: 257 | raise NotImplementedError("Subclasses must implement load_columns") 258 | 259 | def save_columns(self, columns: Sequence[T_CorticalColumn]) -> None: 260 | raise NotImplementedError("Subclasses must implement save_columns") 261 | 262 | def initialize(self) -> None: 263 | raise NotImplementedError("Subclasses must implement initialize") 264 | 265 | def cleanup(self) -> None: 266 | raise NotImplementedError("Subclasses must implement cleanup") -------------------------------------------------------------------------------- /hawkinsdb/core.py: -------------------------------------------------------------------------------- 1 | """Core functionality for HawkinsDB.""" 2 | import os 3 | import json 4 | import time 5 | import logging 6 | from datetime import datetime 7 | from pathlib import Path 8 | from .base import PropertyCandidate, ReferenceFrame, CorticalColumn 9 | 10 | # Configure logging 11 | logging.basicConfig(level=logging.INFO) 12 | logger = logging.getLogger(__name__) 13 | 14 | try: 15 | from filelock import FileLock 16 | except ImportError: 17 | logger.warning("FileLock not available, using dummy implementation") 18 | class FileLock: 19 | def __init__(self, *args): pass 20 | def __enter__(self): return self 21 | def __exit__(self, *args): pass 22 | 23 | class EntityValidationError(Exception): 24 | """Raised when entity validation fails.""" 25 | pass 26 | 27 | # Make EntityValidationError available at module level 28 | __all__ = ['HawkinsDB', 'JSONStorage', 'EntityValidationError'] 29 | 30 | # Import storage backends; fail with a clear message if the SQLite backend is unavailable 31 | try: 32 | from .storage.sqlite import SQLiteStorage 33 | except ImportError: 34 | logger.error("Failed to import SQLiteStorage. 
Please check your installation.") 35 | raise ImportError("SQLiteStorage module is required but not available") 36 | 37 | class JSONStorage: 38 | """Handles persistence of the HawkinsDB data in JSON format.""" 39 | def __init__(self, path): 40 | self.path = Path(path) 41 | self.lock = FileLock(str(self.path) + ".lock") 42 | 43 | def initialize(self): 44 | if not self.path.exists(): 45 | self._write_data({"columns": []}) 46 | 47 | def cleanup(self): 48 | pass 49 | 50 | def _read_data(self): 51 | if not self.path.exists(): 52 | return {"columns": []} 53 | with open(self.path, "r", encoding="utf-8") as f: 54 | return json.load(f) 55 | 56 | def _write_data(self, data): 57 | with open(self.path, "w", encoding="utf-8") as f: 58 | json.dump(data, f, ensure_ascii=False, indent=4) 59 | 60 | def load_columns(self): 61 | with self.lock: 62 | data = self._read_data() 63 | return data.get("columns", []) 64 | 65 | def save_columns(self, columns): 66 | with self.lock: 67 | data = {"columns": columns} 68 | self._write_data(data) 69 | 70 | class HawkinsDB: 71 | """Main database interface.""" 72 | 73 | # Make EntityValidationError accessible via the class 74 | EntityValidationError = EntityValidationError 75 | def __init__(self, storage=None, db_path=None, storage_type='sqlite'): 76 | if storage is None: 77 | if storage_type == 'sqlite': 78 | db_path = db_path or "hawkins_memory.db" 79 | self.storage = SQLiteStorage(db_path=db_path) 80 | elif storage_type == 'json': 81 | db_path = db_path or "hawkins_db.json" 82 | self.storage = JSONStorage(path=db_path) 83 | else: 84 | raise ValueError(f"Unsupported storage type: {storage_type}") 85 | else: 86 | self.storage = storage 87 | 88 | self.storage.initialize() 89 | self.columns = {} 90 | self._load_columns() 91 | self._build_indexes() 92 | self._initialize_memory_types() 93 | 94 | def _load_columns(self): 95 | columns = self.storage.load_columns() 96 | self.columns = {c["name"]: c for c in columns} 97 | 98 | def _build_indexes(self): 99 | self.name_index = {} 100 | for col_name, col in self.columns.items(): 101 | for f in col["frames"]: 102 | name = f["name"].lower() 103 | if name not in self.name_index: 104 | self.name_index[name] = [] 105 | self.name_index[name].append((col_name, f)) 106 | 107 | def _initialize_memory_types(self): 108 | for memory_type in ["Semantic", "Episodic", "Procedural"]: 109 | if memory_type not in self.columns: 110 | self.create_column(memory_type) 111 | 112 | def cleanup(self): 113 | if hasattr(self, 'storage'): 114 | self.storage.cleanup() 115 | 116 | def create_column(self, column_name): 117 | if column_name not in self.columns: 118 | self.columns[column_name] = {"name": column_name, "frames": []} 119 | self._save() 120 | 121 | def _save(self): 122 | self.storage.save_columns(list(self.columns.values())) 123 | 124 | def add_entity(self, data): 125 | """Add an entity with validation.""" 126 | try: 127 | if not isinstance(data, dict): 128 | raise EntityValidationError("Entity data must be a dictionary") 129 | 130 | memory_type = data.get("column", "Semantic") 131 | name = data.get("name") 132 | 133 | if not name: 134 | raise EntityValidationError("Entity name is required") 135 | 136 | # Validate required fields based on memory type 137 | if memory_type == "Episodic": 138 | if "properties" not in data or "timestamp" not in data["properties"]: 139 | raise EntityValidationError("Episodic memories require a timestamp") 140 | 141 | elif memory_type == "Procedural": 142 | if "properties" not in data or "steps" not in data["properties"]: 143 | 
raise EntityValidationError("Procedural memories require steps") 144 | 145 | properties = data.get("properties", {}) 146 | relationships = data.get("relationships", {}) 147 | 148 | frame = { 149 | "name": name, 150 | "properties": properties, 151 | "relationships": relationships, 152 | "location": data.get("location", {}), 153 | "history": [] 154 | } 155 | 156 | if memory_type not in self.columns: 157 | self.create_column(memory_type) 158 | 159 | column = self.columns[memory_type] 160 | column["frames"].append(frame) 161 | name = name.lower() 162 | if name not in self.name_index: 163 | self.name_index[name] = [] 164 | self.name_index[name].append((memory_type, frame)) 165 | 166 | self._save() 167 | 168 | return { 169 | "success": True, 170 | "entity_name": name, 171 | "message": f"Successfully added {memory_type} memory: {name}" 172 | } 173 | 174 | except EntityValidationError as e: 175 | logger.error(f"Validation error: {str(e)}") 176 | raise 177 | except Exception as e: 178 | logger.error(f"Error adding entity: {str(e)}") 179 | return { 180 | "success": False, 181 | "message": str(e) 182 | } 183 | 184 | def query_frames(self, name): 185 | """Query frames by name and return dictionary of frames by column.""" 186 | try: 187 | name = name.lower() 188 | frames = self.name_index.get(name, []) 189 | result = {} 190 | for column_name, frame_data in frames: 191 | if column_name not in result: 192 | try: 193 | # Always convert to ReferenceFrame 194 | result[column_name] = ReferenceFrame.from_dict(frame_data) 195 | except Exception as frame_error: 196 | logger.error(f"Error converting frame data: {str(frame_error)}") 197 | continue 198 | return result 199 | except Exception as e: 200 | logger.error(f"Error querying frames: {str(e)}") 201 | return {} 202 | 203 | def list_entities(self): 204 | try: 205 | return sorted(list(self.name_index.keys())) 206 | except Exception: 207 | return [] 208 | -------------------------------------------------------------------------------- /hawkinsdb/enrichment.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import logging 3 | from .core import HawkinsDB 4 | from .base import PropertyCandidate 5 | from collections import defaultdict 6 | 7 | logger = logging.getLogger(__name__) 8 | 9 | class ConceptNetEnricher: 10 | """Handles auto-enrichment of entities using ConceptNet.""" 11 | 12 | def __init__(self, api_key=None, cache_enabled=False): 13 | """Initialize ConceptNet enricher. 14 | 15 | Args: 16 | api_key: Optional API key for ConceptNet (not required for basic usage) 17 | cache_enabled: Deprecated, kept for backwards compatibility 18 | """ 19 | self.api_key = api_key 20 | self.base_url = "http://api.conceptnet.io" 21 | if cache_enabled: 22 | logger.warning("Caching has been deprecated and will be removed in future versions") 23 | 24 | def get_basic_knowledge(self, concept): 25 | """ 26 | Retrieve basic knowledge about a concept from ConceptNet. 27 | Returns a dictionary with properties and relationships. 
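An illustrative (invented) return value: {"properties": {"materials": [{"value": "metal", "confidence": 0.8, "source": "ConceptNet:MadeOf"}]}, "relationships": {"categories": [{"value": "machine", "confidence": 0.9, "source": "ConceptNet:IsA"}]}}.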
28 | 29 | Args: 30 | concept: The concept to query (e.g., "car", "house") 31 | 32 | Returns: 33 | Dictionary containing properties and relationships enriched from ConceptNet 34 | """ 35 | # Removed cache implementation to simplify the code 36 | # Direct API call without caching 37 | 38 | if not concept: 39 | logger.warning("Empty concept provided") 40 | return {"properties": {}, "relationships": {}} 41 | 42 | try: 43 | # Normalize concept name for API 44 | concept_normalized = concept.lower().replace(" ", "_") 45 | # Query both direct and related concepts 46 | urls = [ 47 | f"{self.base_url}/c/en/{concept_normalized}", 48 | f"{self.base_url}/query?node=/c/en/{concept_normalized}&other=/c/en" 49 | ] 50 | 51 | headers = {} 52 | if self.api_key: 53 | headers['Authorization'] = f'Bearer {self.api_key}' 54 | 55 | all_data = [] 56 | for url in urls: 57 | response = requests.get(url, headers=headers, timeout=10) 58 | if response.status_code == 200: 59 | all_data.append(response.json()) 60 | else: 61 | logger.warning(f"Failed to get ConceptNet data from {url}: {response.status_code}") 62 | 63 | if not all_data: 64 | return {"properties": {}, "relationships": {}} 65 | 66 | properties = defaultdict(list) 67 | relationships = defaultdict(list) 68 | 69 | # Process edges to extract meaningful information from all responses 70 | edges = [] 71 | for response_data in all_data: 72 | edges.extend(response_data.get('edges', [])) 73 | 74 | for edge in edges: 75 | try: 76 | # Validate edge structure and language 77 | start_node = edge.get('start', {}) 78 | end_node = edge.get('end', {}) 79 | 80 | if not (start_node.get('language') == 'en' and 81 | end_node.get('language') == 'en'): 82 | continue 83 | 84 | # Enhanced weight validation with fallback 85 | try: 86 | weight = float(edge.get('weight', 0)) 87 | except (TypeError, ValueError): 88 | logger.warning(f"Invalid weight value in edge: {edge.get('weight')}") 89 | continue 90 | 91 | if weight < 0.5: # Filter out low confidence assertions 92 | continue 93 | 94 | # Check for malformed edge data 95 | if not all([start_node.get('label'), 96 | end_node.get('label'), 97 | edge.get('rel', {}).get('label')]): 98 | logger.warning(f"Malformed edge data: {edge}") 99 | continue 100 | 101 | except Exception as e: 102 | logger.warning(f"Error processing edge: {str(e)}") 103 | continue 104 | 105 | rel = edge.get('rel', {}).get('label') 106 | if not rel: 107 | continue 108 | 109 | end_term = edge.get('end', {}).get('label') 110 | if not end_term: 111 | continue 112 | 113 | # Convert ConceptNet weight to confidence score (0.5 to 1.0) 114 | confidence = 0.5 + (weight / 2) 115 | 116 | # Enhanced relation mapping with comprehensive categorization 117 | PROPERTY_RELATIONS = { 118 | 'HasProperty': 'properties', 119 | 'HasA': 'features', 120 | 'MadeOf': 'materials', 121 | 'PartOf': 'components', 122 | 'HasContext': 'contexts', 123 | 'HasSubevent': 'subevents', 124 | 'HasPrerequisite': 'prerequisites', 125 | 'HasFirstSubevent': 'first_steps', 126 | 'HasLastSubevent': 'last_steps', 127 | 'HasSize': 'size', 128 | 'HasShape': 'shape', 129 | 'HasColor': 'color', 130 | 'HasTexture': 'texture', 131 | 'HasWeight': 'weight', 132 | 'HasFeel': 'feel', 133 | 'HasSound': 'sound', 134 | 'HasTaste': 'taste', 135 | 'HasSmell': 'smell', 136 | } 137 | 138 | RELATIONSHIP_RELATIONS = { 139 | 'IsA': 'categories', 140 | 'CapableOf': 'capabilities', 141 | 'UsedFor': 'uses', 142 | 'AtLocation': 'locations', 143 | 'CreatedBy': 'creators', 144 | 'PartOf': 'parent_systems', 145 | 'HasEffect': 'effects', 146 | 
'MotivatedByGoal': 'goals', 147 | 'SimilarTo': 'similar_concepts', 148 | 'DerivedFrom': 'origins', 149 | 'SymbolOf': 'symbolism', 150 | 'ReceivesAction': 'actions_received', 151 | 'HasSubevent': 'related_events', 152 | 'HasPrerequisite': 'prerequisites', 153 | 'Causes': 'causes', 154 | 'HasFirstSubevent': 'initial_stages', 155 | 'HasLastSubevent': 'final_stages', 156 | 'RelatedTo': 'related_concepts', 157 | } 158 | 159 | if rel in PROPERTY_RELATIONS: 160 | prop_key = PROPERTY_RELATIONS[rel] 161 | properties[prop_key].append({ 162 | 'value': end_term, 163 | 'confidence': confidence, 164 | 'source': f"ConceptNet:{rel}" 165 | }) 166 | 167 | elif rel in RELATIONSHIP_RELATIONS: 168 | rel_key = RELATIONSHIP_RELATIONS[rel] 169 | relationships[rel_key].append({ 170 | 'value': end_term, 171 | 'confidence': confidence, 172 | 'source': f"ConceptNet:{rel}" 173 | }) 174 | 175 | # Filter and clean the data 176 | def filter_and_sort_by_confidence(items, min_confidence=0.6, max_items=5): 177 | """ 178 | Filter and sort knowledge items based on confidence and quality. 179 | 180 | Args: 181 | items: List of items to filter 182 | min_confidence: Minimum confidence threshold (default: 0.6) 183 | max_items: Maximum number of items to return (default: 5) 184 | 185 | Returns: 186 | List of filtered and sorted items 187 | """ 188 | seen = set() 189 | filtered = [] 190 | 191 | # Sort by confidence and filter 192 | sorted_items = sorted(items, key=lambda x: x['confidence'], reverse=True) 193 | 194 | for item in sorted_items: 195 | value = item['value'].lower() 196 | confidence = item['confidence'] 197 | 198 | # Apply quality filters 199 | if (confidence >= min_confidence and 200 | value not in seen and 201 | len(value.split()) <= 3 and # Keep concise terms 202 | len(value) >= 3 and # Avoid too short terms 203 | not any(c.isdigit() for c in value)): # Avoid numerical values 204 | 205 | seen.add(value) 206 | filtered.append(item) 207 | 208 | if len(filtered) >= max_items: 209 | break 210 | 211 | return filtered 212 | 213 | filtered_data = { 214 | "properties": { 215 | k: filter_and_sort_by_confidence(v) 216 | for k, v in properties.items() 217 | }, 218 | "relationships": { 219 | k: filter_and_sort_by_confidence(v) 220 | for k, v in relationships.items() 221 | } 222 | } 223 | 224 | # Return filtered data directly without caching 225 | return filtered_data 226 | 227 | except Exception as e: 228 | logger.error(f"Error enriching concept {concept}: {str(e)}") 229 | return {} 230 | 231 | def enrich_entity(self, db, entity_name, entity_type): 232 | """ 233 | Enrich an entity with ConceptNet knowledge. 
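Typical call (sketch, mirroring examples/hawkinsdb_comprehensive.py): ConceptNetEnricher().enrich_entity(db, entity_name="Computer", entity_type="Device"); the entity must already exist with a Semantic frame, otherwise None is returned.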
234 | 235 | Args: 236 | db: HawkinsDB instance to update 237 | entity_name: Name of the entity to enrich 238 | entity_type: Type of entity to query in ConceptNet 239 | 240 | Returns: 241 | Enriched entity data or None if enrichment failed 242 | """ 243 | knowledge = self.get_basic_knowledge(entity_type) 244 | if not knowledge: 245 | logger.warning(f"No knowledge found for entity type: {entity_type}") 246 | return 247 | 248 | try: 249 | # First, get existing entity data 250 | frames = db.query_frames(entity_name) 251 | if not frames: 252 | logger.warning(f"Entity {entity_name} not found in database") 253 | return None 254 | 255 | semantic_frame = frames.get("Semantic", {}) 256 | if not semantic_frame: 257 | logger.warning(f"No semantic frame found for entity {entity_name}") 258 | return None 259 | 260 | # Convert ReferenceFrame to dictionary if needed 261 | if hasattr(semantic_frame, 'to_dict'): 262 | semantic_frame = semantic_frame.to_dict() 263 | 264 | # Initialize with empty dicts if needed 265 | properties = semantic_frame.get('properties', {}) if isinstance(semantic_frame, dict) else {} 266 | relationships = semantic_frame.get('relationships', {}) if isinstance(semantic_frame, dict) else {} 267 | 268 | semantic_frame = { 269 | 'properties': properties, 270 | 'relationships': relationships 271 | } 272 | except Exception as e: 273 | logger.error(f"Error accessing entity data: {str(e)}") 274 | return None 275 | 276 | # Prepare enriched entity data 277 | enriched_entity = { 278 | "name": entity_name, 279 | "column": "Semantic", # Always add enrichment to semantic memory 280 | "properties": {}, 281 | "relationships": {} 282 | } 283 | 284 | # Add existing properties and relationships 285 | if semantic_frame.get('properties'): 286 | enriched_entity["properties"].update(semantic_frame['properties']) 287 | 288 | if semantic_frame.get('relationships'): 289 | enriched_entity["relationships"].update(semantic_frame['relationships']) 290 | 291 | # Add ConceptNet knowledge 292 | for prop_key, values in knowledge.get("properties", {}).items(): 293 | if prop_key not in enriched_entity["properties"]: 294 | enriched_entity["properties"][prop_key] = [] 295 | 296 | # Add new properties with ConceptNet source 297 | for value in values[:3]: # Limit to top 3 values 298 | if isinstance(value, dict) and "value" in value: 299 | val = value["value"] 300 | # Convert lists to string representation if needed 301 | if isinstance(val, (list, tuple)): 302 | # Convert each list item to a separate property 303 | for v in val: 304 | enriched_entity["properties"][prop_key].append({ 305 | "value": str(v), 306 | "confidence": 0.7, # Lower confidence for auto-enriched properties 307 | "sources": ["ConceptNet"] 308 | }) 309 | else: 310 | enriched_entity["properties"][prop_key].append({ 311 | "value": str(val), 312 | "confidence": 0.7, # Lower confidence for auto-enriched properties 313 | "sources": ["ConceptNet"] 314 | }) 315 | 316 | for rel_type, targets in knowledge.get("relationships", {}).items(): 317 | if rel_type not in enriched_entity["relationships"]: 318 | enriched_entity["relationships"][rel_type] = [] 319 | 320 | # Add new relationships with ConceptNet source 321 | for target in targets[:3]: # Limit to top 3 relationships 322 | if isinstance(target, dict) and "value" in target: 323 | val = target["value"] 324 | # Ensure relationship targets are strings 325 | # Always convert relationships to individual string entries 326 | if isinstance(val, (list, tuple)): 327 | for v in val: 328 | if v: # Skip empty values 329 | 
enriched_entity["relationships"][rel_type].append({ 330 | "value": str(v).strip(), 331 | "confidence": 0.7, 332 | "sources": ["ConceptNet"] 333 | }) 334 | elif val: # Skip empty values 335 | enriched_entity["relationships"][rel_type].append({ 336 | "value": str(val).strip(), 337 | "confidence": 0.7, 338 | "sources": ["ConceptNet"] 339 | }) 340 | 341 | # Update entity with enriched data 342 | if (enriched_entity["properties"] or enriched_entity["relationships"]): 343 | logger.info(f"Enriching {entity_name} with ConceptNet knowledge") 344 | db.add_entity(enriched_entity) # Use add_entity instead of propose_entity_update 345 | logger.info(f"Successfully enriched {entity_name} with ConceptNet knowledge") 346 | return enriched_entity 347 | return None -------------------------------------------------------------------------------- /hawkinsdb/py.typed: -------------------------------------------------------------------------------- 1 | # This file is required to mark the package as typed 2 | -------------------------------------------------------------------------------- /hawkinsdb/storage/__init__.py: -------------------------------------------------------------------------------- 1 | """Storage backends for HawkinsDB.""" 2 | from typing import List 3 | from ..base import StorageBackend, BaseJSONStorage 4 | from ..types import ( 5 | CorticalColumn, 6 | ReferenceFrame, 7 | PropertyCandidate 8 | ) 9 | 10 | # Import concrete implementations 11 | from .sqlite import SQLiteStorage 12 | 13 | __all__ = [ 14 | 'StorageBackend', 15 | 'BaseJSONStorage', 16 | 'SQLiteStorage', 17 | 'CorticalColumn', 18 | 'ReferenceFrame', 19 | 'PropertyCandidate' 20 | ] 21 | -------------------------------------------------------------------------------- /hawkinsdb/storage/sqlite.py: -------------------------------------------------------------------------------- 1 | """SQLite storage backend implementation.""" 2 | import os 3 | import json 4 | import logging 5 | import sqlite3 6 | from datetime import datetime 7 | from typing import List, Dict, Any, Optional 8 | from pathlib import Path 9 | 10 | logger = logging.getLogger(__name__) 11 | 12 | class SQLiteStorage: 13 | """Simple SQLite storage implementation.""" 14 | 15 | def __init__(self, db_path: str = "hawkins_memory.db"): 16 | """Initialize SQLite storage.""" 17 | try: 18 | # Convert to absolute path 19 | self.db_path = str(Path(db_path).absolute()) 20 | 21 | # Ensure directory exists 22 | directory = os.path.dirname(self.db_path) 23 | if directory: 24 | os.makedirs(directory, exist_ok=True) 25 | 26 | self._initialized = False 27 | 28 | # Remove existing database if it's corrupted 29 | try: 30 | if os.path.exists(self.db_path): 31 | with sqlite3.connect(self.db_path) as test_conn: 32 | test_conn.execute("SELECT 1") 33 | except sqlite3.DatabaseError: 34 | logger.warning(f"Removing corrupted database file: {self.db_path}") 35 | os.remove(self.db_path) 36 | 37 | # Create a new database connection 38 | with sqlite3.connect(self.db_path) as conn: 39 | # Set pragmas for better performance and reliability 40 | conn.execute("PRAGMA foreign_keys = ON") 41 | conn.execute("PRAGMA journal_mode = WAL") 42 | conn.execute("PRAGMA synchronous = NORMAL") 43 | conn.execute("PRAGMA busy_timeout = 5000") 44 | 45 | # Initialize schema in a transaction 46 | self.initialize() 47 | self._initialized = True 48 | 49 | # Verify tables after initialization 50 | cursor = conn.cursor() 51 | cursor.execute("SELECT name FROM sqlite_master WHERE type='table'") 52 | tables = {row[0] for row in 
cursor.fetchall()} 53 | 54 | required_tables = {'columns', 'frames'} 55 | if not required_tables.issubset(tables): 56 | missing = required_tables - tables 57 | raise RuntimeError(f"Failed to create tables: {missing}") 58 | 59 | logger.info(f"SQLite storage initialized successfully at {self.db_path}") 60 | 61 | except sqlite3.Error as e: 62 | logger.error(f"SQLite error during initialization: {str(e)}") 63 | self._initialized = False 64 | if os.path.exists(self.db_path): 65 | try: 66 | os.remove(self.db_path) 67 | except OSError: 68 | pass 69 | raise 70 | except Exception as e: 71 | logger.error(f"Failed to initialize SQLite storage: {str(e)}") 72 | self._initialized = False 73 | if os.path.exists(self.db_path): 74 | try: 75 | os.remove(self.db_path) 76 | except OSError: 77 | pass 78 | raise RuntimeError(f"SQLite storage initialization failed: {str(e)}") 79 | 80 | def get_connection(self): 81 | """Get a database connection with row factory.""" 82 | try: 83 | conn = sqlite3.connect(self.db_path, timeout=60) 84 | conn.row_factory = sqlite3.Row 85 | # Enable foreign keys 86 | conn.execute("PRAGMA foreign_keys = ON") 87 | return conn 88 | except sqlite3.Error as e: 89 | logger.error(f"Failed to establish database connection: {e}") 90 | raise 91 | 92 | def initialize(self): 93 | """Initialize database schema with proper error handling.""" 94 | if not os.path.exists(self.db_path): 95 | # Create new database file if it doesn't exist 96 | open(self.db_path, 'a').close() 97 | 98 | try: 99 | with self.get_connection() as conn: 100 | conn.execute("BEGIN IMMEDIATE TRANSACTION") 101 | 102 | try: 103 | # Create tables with proper constraints 104 | conn.executescript(''' 105 | CREATE TABLE IF NOT EXISTS columns ( 106 | id INTEGER PRIMARY KEY AUTOINCREMENT, 107 | name TEXT UNIQUE NOT NULL, 108 | created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 109 | updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP 110 | ); 111 | 112 | CREATE TABLE IF NOT EXISTS frames ( 113 | id INTEGER PRIMARY KEY AUTOINCREMENT, 114 | name TEXT NOT NULL, 115 | column_id INTEGER NOT NULL, 116 | properties TEXT NOT NULL DEFAULT '{}', 117 | relationships TEXT NOT NULL DEFAULT '{}', 118 | location TEXT DEFAULT '{}', 119 | history TEXT DEFAULT '[]', 120 | created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 121 | updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, 122 | FOREIGN KEY (column_id) REFERENCES columns(id) 123 | ON DELETE CASCADE 124 | ON UPDATE CASCADE 125 | ); 126 | 127 | CREATE INDEX IF NOT EXISTS idx_frames_name ON frames(name); 128 | CREATE INDEX IF NOT EXISTS idx_frames_column_id ON frames(column_id); 129 | ''') 130 | 131 | # Verify tables were created 132 | cursor = conn.cursor() 133 | cursor.execute(""" 134 | SELECT name FROM sqlite_master 135 | WHERE type='table' AND name IN ('columns', 'frames') 136 | """) 137 | tables = {row[0] for row in cursor.fetchall()} 138 | 139 | required_tables = {'columns', 'frames'} 140 | if not required_tables.issubset(tables): 141 | missing = required_tables - tables 142 | raise RuntimeError(f"Failed to create required tables: {missing}") 143 | 144 | conn.commit() 145 | self._initialized = True 146 | logger.info("SQLite storage schema initialized successfully") 147 | 148 | except Exception as e: 149 | conn.rollback() 150 | logger.error(f"Error during schema initialization: {str(e)}") 151 | raise 152 | 153 | except sqlite3.Error as e: 154 | logger.error(f"SQLite error during schema initialization: {str(e)}") 155 | self._initialized = False 156 | raise 157 | except Exception as e: 158 | 
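            # Any unexpected failure leaves the backend flagged as uninitialized,
            # so later load/save calls fail fast instead of touching a half-built schema.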
logger.error(f"Failed to initialize SQLite storage schema: {str(e)}") 159 | self._initialized = False 160 | raise 161 | 162 | def load_columns(self) -> List[Dict[str, Any]]: 163 | """Load all columns from storage.""" 164 | if not self._initialized: 165 | raise RuntimeError("Storage not initialized") 166 | 167 | try: 168 | columns = [] 169 | with self.get_connection() as conn: 170 | # Get all columns 171 | cursor = conn.cursor() 172 | for col_row in cursor.execute('SELECT * FROM columns').fetchall(): 173 | frames = [] 174 | 175 | # Get frames for this column 176 | frame_rows = cursor.execute( 177 | 'SELECT * FROM frames WHERE column_id = ?', 178 | (col_row['id'],) 179 | ).fetchall() 180 | 181 | for frame_row in frame_rows: 182 | frame = { 183 | 'name': frame_row['name'], 184 | 'properties': json.loads(frame_row['properties']), 185 | 'relationships': json.loads(frame_row['relationships']), 186 | 'location': json.loads(frame_row['location']) if frame_row['location'] else {}, 187 | 'history': json.loads(frame_row['history']) if frame_row['history'] else [], 188 | 'created_at': frame_row['created_at'], 189 | 'updated_at': frame_row['updated_at'] 190 | } 191 | frames.append(frame) 192 | 193 | column = { 194 | 'name': col_row['name'], 195 | 'frames': frames, 196 | 'created_at': col_row['created_at'], 197 | 'updated_at': col_row['updated_at'] 198 | } 199 | columns.append(column) 200 | 201 | return columns 202 | 203 | except Exception as e: 204 | logger.error("Error loading columns: %s", str(e)) 205 | raise 206 | 207 | def save_columns(self, columns: List[Dict[str, Any]]) -> None: 208 | """Save columns to storage.""" 209 | if not self._initialized: 210 | raise RuntimeError("Storage not initialized") 211 | 212 | try: 213 | with self.get_connection() as conn: 214 | cursor = conn.cursor() 215 | 216 | # Clear existing data 217 | cursor.execute('DELETE FROM frames') 218 | cursor.execute('DELETE FROM columns') 219 | 220 | # Save new data 221 | for column in columns: 222 | now = datetime.now().isoformat() 223 | 224 | # Insert column 225 | cursor.execute( 226 | 'INSERT INTO columns (name, created_at, updated_at) VALUES (?, ?, ?)', 227 | (column['name'], 228 | column.get('created_at', now), 229 | column.get('updated_at', now)) 230 | ) 231 | column_id = cursor.lastrowid 232 | 233 | # Insert frames 234 | for frame in column.get('frames', []): 235 | cursor.execute(''' 236 | INSERT INTO frames ( 237 | name, column_id, properties, relationships, 238 | location, history, created_at, updated_at 239 | ) VALUES (?, ?, ?, ?, ?, ?, ?, ?) 
240 | ''', ( 241 | frame['name'], 242 | column_id, 243 | json.dumps(frame.get('properties', {})), 244 | json.dumps(frame.get('relationships', {})), 245 | json.dumps(frame.get('location', {})), 246 | json.dumps(frame.get('history', [])), 247 | frame.get('created_at', now), 248 | frame.get('updated_at', now) 249 | )) 250 | 251 | logger.info("Successfully saved %d columns", len(columns)) 252 | 253 | except Exception as e: 254 | logger.error("Error saving columns: %s", str(e)) 255 | raise 256 | 257 | def cleanup(self) -> None: 258 | """Clean up resources.""" 259 | logger.info("SQLite storage cleaned up successfully") 260 | -------------------------------------------------------------------------------- /hawkinsdb/types.py: -------------------------------------------------------------------------------- 1 | """Core classes for HawkinsDB memory management.""" 2 | import time 3 | from datetime import datetime 4 | 5 | class PropertyCandidate: 6 | """A property candidate with value and metadata.""" 7 | def __init__(self, value, confidence=1.0, sources=None, timestamp=None): 8 | self.value = value 9 | self.confidence = confidence 10 | self.sources = sources or [] 11 | self.timestamp = timestamp or time.time() 12 | 13 | def to_dict(self): 14 | """Convert to dictionary representation.""" 15 | return { 16 | "value": self.value, 17 | "confidence": self.confidence, 18 | "sources": self.sources, 19 | "timestamp": self.timestamp 20 | } 21 | 22 | @classmethod 23 | def from_dict(cls, data): 24 | """Create from dictionary.""" 25 | if isinstance(data, dict): 26 | return cls( 27 | data.get("value"), 28 | data.get("confidence", 1.0), 29 | data.get("sources", []), 30 | data.get("timestamp", time.time()) 31 | ) 32 | return cls(data) 33 | 34 | class ReferenceFrame: 35 | """Represents a single concept or object.""" 36 | def __init__(self, name, properties=None, relationships=None, location=None, history=None): 37 | self.name = name 38 | self.properties = properties or {} 39 | self.relationships = relationships or {} 40 | self.location = location or {} 41 | self.history = history or [] 42 | self.created_at = datetime.now().isoformat() 43 | self.updated_at = datetime.now().isoformat() 44 | 45 | def to_dict(self): 46 | """Convert to dictionary representation.""" 47 | return { 48 | "name": self.name, 49 | "properties": self.properties, 50 | "relationships": self.relationships, 51 | "location": self.location, 52 | "history": self.history 53 | } 54 | 55 | @classmethod 56 | def from_dict(cls, data): 57 | """Create from dictionary.""" 58 | return cls( 59 | data["name"], 60 | data.get("properties", {}), 61 | data.get("relationships", {}), 62 | data.get("location", {}), 63 | data.get("history", []) 64 | ) 65 | 66 | class CorticalColumn: 67 | """Base class for memory columns.""" 68 | def __init__(self, name, frames=None): 69 | self.name = name 70 | self.frames = frames or [] 71 | self.created_at = datetime.now().isoformat() 72 | self.updated_at = datetime.now().isoformat() 73 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages 2 | import os 3 | 4 | # Read long description from README.md 5 | with open("README.md", "r", encoding="utf-8") as fh: 6 | long_description = fh.read() 7 | 8 | # Read version from package 9 | def get_version(): 10 | with open(os.path.join("hawkinsdb", "__init__.py"), "r") as f: 11 | for line in f: 12 | if line.startswith("__version__"): 
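                # e.g. '__version__ = "1.0.1"' -> '1.0.1' after splitting on "="
                # and stripping whitespace plus surrounding quotes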
13 |                 return line.split("=")[1].strip().strip('"\'')
14 |     return "1.0.1" # Default version if not found
15 | 
16 | setup(
17 |     name="hawkinsdb",
18 |     version=get_version(),
19 |     packages=find_packages(exclude=["tests*", "examples*", "docs*"]),
20 |     install_requires=[
21 |         'requests>=2.25.1',
22 |         'sqlalchemy>=2.0.0',
23 |         'sqlalchemy-utils>=0.41.2',
24 |         'filelock>=3.0.0',
25 |         'typing-extensions>=4.0.0',
26 |         'python-dateutil>=2.8.2',
27 |         'setuptools>=42.0.0'
28 |     ],
29 |     extras_require={
30 |         'dev': [
31 |             'pytest>=7.0.0',
32 |             'pytest-cov>=4.1.0',
33 |             'pytest-asyncio>=0.23.0',
34 |             'black>=23.0.0',
35 |             'isort>=5.12.0',
36 |             'mypy>=1.0.0',
37 |             'ruff>=0.1.0'
38 |         ],
39 |         'docs': [
40 |             'sphinx>=7.0.0',
41 |             'sphinx-rtd-theme>=1.3.0',
42 |             'myst-parser>=2.0.0'
43 |         ],
44 |         'conceptnet': [
45 |             'networkx>=3.0.0',
46 |             'requests-cache>=1.1.0'
47 |         ],
48 |         'llm': [
49 |             'openai>=1.0.0',
50 |             'tenacity>=8.2.0',
51 |             'tiktoken>=0.5.0'
52 |         ],
53 |         'all': [
54 |             'networkx>=3.0.0',
55 |             'requests-cache>=1.1.0',
56 |             'openai>=1.0.0',
57 |             'tenacity>=8.2.0',
58 |             'tiktoken>=0.5.0'
59 |         ]
60 |     },
61 |     author="HawkinsDB Contributors",
62 |     author_email="hawkinsdb@example.com",
63 |     description="A memory layer with ConceptNet integration and LLM-friendly interfaces",
64 |     long_description=long_description,
65 |     long_description_content_type="text/markdown",
66 |     url="https://github.com/hawkinsdb/hawkinsdb",
67 |     project_urls={
68 |         "Bug Tracker": "https://github.com/hawkinsdb/hawkinsdb/issues",
69 |         "Documentation": "https://hawkinsdb.readthedocs.io/",
70 |         "Source Code": "https://github.com/hawkinsdb/hawkinsdb",
71 |     },
72 |     classifiers=[
73 |         "Development Status :: 3 - Alpha",
74 |         "Intended Audience :: Developers",
75 |         "Intended Audience :: Science/Research",
76 |         "License :: OSI Approved :: MIT License",
77 |         "Operating System :: OS Independent",
78 |         "Programming Language :: Python :: 3 :: Only",
79 |         "Programming Language :: Python :: 3.10",
80 |         "Programming Language :: Python :: 3.11",
81 |         "Topic :: Database",
82 |         "Topic :: Scientific/Engineering :: Artificial Intelligence",
83 |         "Topic :: Software Development :: Libraries :: Python Modules",
84 |         "Typing :: Typed"
85 |     ],
86 |     # Match the README, which requires Python 3.10 or higher
87 |     python_requires='>=3.10,<4',
88 |     package_data={
89 |         'hawkinsdb': [
90 |             'README.md',
91 |             'LICENSE',
92 |             'py.typed',
93 |             'storage/*.sql',
94 |             'storage/*.json',
95 |             'storage/*.db'
96 |         ],
97 |     },
98 |     include_package_data=True,
99 | )
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
1 | """
2 | Test package for HawkinsDB.
3 | Contains all unit tests, integration tests, and test utilities.
4 | """ 5 | -------------------------------------------------------------------------------- /tests/document.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/harishsg993010/HawkinsDB/5268d4ec11f55ab53f83d2c1ef29317901732e35/tests/document.pdf -------------------------------------------------------------------------------- /tests/file_rag.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | from typing import List, Dict, Any 4 | import PyPDF2 5 | from pathlib import Path 6 | from hawkinsdb import HawkinsDB, LLMInterface 7 | 8 | os.environ["OPENAI_API_KEY"]="" 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | class PDFHawkinsRAG: 14 | def __init__(self, chunk_size: int = 500): 15 | """Initialize the RAG system.""" 16 | self.db = HawkinsDB(storage_type='sqlite',db_path="rag.db") 17 | self.llm_interface = LLMInterface(self.db,auto_enrich=True) 18 | self.chunk_size = chunk_size 19 | 20 | def extract_text_from_pdf(self, pdf_path: str) -> str: 21 | """Extract text content from a PDF file.""" 22 | try: 23 | with open(pdf_path, 'rb') as file: 24 | pdf_reader = PyPDF2.PdfReader(file) 25 | text = "" 26 | for page in pdf_reader.pages: 27 | text += page.extract_text() + "\n" 28 | return text 29 | except Exception as e: 30 | logger.error(f"Error extracting text from PDF: {str(e)}") 31 | raise 32 | 33 | def chunk_text(self, text: str, filename: str) -> List[Dict[str, Any]]: 34 | """Split text into chunks and prepare for database storage.""" 35 | chunks = [] 36 | words = text.split() 37 | current_chunk = [] 38 | chunk_number = 1 39 | 40 | for word in words: 41 | current_chunk.append(word) 42 | if len(current_chunk) >= self.chunk_size: 43 | chunk_text = " ".join(current_chunk) 44 | chunks.append({ 45 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 46 | "column": "Semantic", 47 | "properties": { 48 | "content": chunk_text, 49 | "source_file": filename, 50 | "chunk_number": chunk_number, 51 | }, 52 | "relationships": { 53 | "part_of": [filename], 54 | "next_chunk": [f"{Path(filename).stem}_chunk_{chunk_number + 1}"] if len(words) > self.chunk_size else [] 55 | } 56 | }) 57 | current_chunk = [] 58 | chunk_number += 1 59 | 60 | # Handle remaining text 61 | if current_chunk: 62 | chunk_text = " ".join(current_chunk) 63 | chunks.append({ 64 | "name": f"{Path(filename).stem}_chunk_{chunk_number}", 65 | "column": "Semantic", 66 | "properties": { 67 | "content": chunk_text, 68 | "source_file": filename, 69 | "chunk_number": chunk_number, 70 | }, 71 | "relationships": { 72 | "part_of": [filename] 73 | } 74 | }) 75 | 76 | return chunks 77 | 78 | def ingest_pdf(self, pdf_path: str) -> Dict[str, Any]: 79 | """Process and store PDF content in the database.""" 80 | try: 81 | # Extract text from PDF 82 | logger.info(f"Processing PDF: {pdf_path}") 83 | text = self.extract_text_from_pdf(pdf_path) 84 | 85 | # Create document metadata 86 | filename = Path(pdf_path).name 87 | doc_metadata = { 88 | "name": Path(pdf_path).stem, 89 | "column": "Semantic", 90 | "properties": { 91 | "file_type": "PDF", 92 | "file_path": pdf_path, 93 | "file_name": filename, 94 | }, 95 | "relationships": { 96 | "contains": [] 97 | } 98 | } 99 | 100 | # Store document metadata 101 | self.db.add_entity(doc_metadata) 102 | 103 | # Process and store chunks 104 | chunks = self.chunk_text(text, filename) 105 | chunk_names = [] 106 | for chunk in chunks: 107 | 
self.db.add_entity(chunk) 108 | chunk_names.append(chunk["name"]) 109 | 110 | # Update document metadata with chunk references 111 | doc_metadata["relationships"]["contains"] = chunk_names 112 | self.db.add_entity(doc_metadata) 113 | 114 | return { 115 | "success": True, 116 | "message": f"Successfully processed {filename}", 117 | "chunks_created": len(chunks) 118 | } 119 | 120 | except Exception as e: 121 | logger.error(f"Error ingesting PDF: {str(e)}") 122 | return { 123 | "success": False, 124 | "message": str(e) 125 | } 126 | 127 | def query(self, question: str) -> Dict[str, Any]: 128 | """Query the knowledge base with context from stored documents.""" 129 | try: 130 | return self.llm_interface.query(question) 131 | except Exception as e: 132 | logger.error(f"Error processing query: {str(e)}") 133 | return { 134 | "success": False, 135 | "message": str(e), 136 | "response": None 137 | } 138 | 139 | def test_pdf_rag(): 140 | """Test the PDF RAG system.""" 141 | # Initialize the system 142 | rag = PDFHawkinsRAG(chunk_size=500) 143 | 144 | # Test with sample PDF 145 | pdf_path = r"C:\Users\haris\Desktop\personal\AI-Agent\Hawin\tests\document.pdf" # Replace with actual PDF path 146 | if os.path.exists(pdf_path): 147 | # Ingest PDF 148 | logger.info("Ingesting PDF...") 149 | result = rag.ingest_pdf(pdf_path) 150 | logger.info(f"Ingestion result: {result}") 151 | 152 | if result["success"]: 153 | # Test queries 154 | test_queries = [ 155 | "What is the main topic of the document?", 156 | "Summarize the key points from the document.", 157 | "What are the main conclusions drawn in the document?", 158 | "what is silha center", 159 | "who is Charlotte Higgins", 160 | "Explain the lawsuits", 161 | "Explain OpenAI's Involvement", 162 | "who is Mike Masnick" 163 | ] 164 | 165 | logger.info("\nTesting queries:") 166 | for query in test_queries: 167 | logger.info(f"\nQuery: {query}") 168 | response = rag.query(query) 169 | logger.info(f"Response: {response}") 170 | else: 171 | logger.error(f"PDF file not found: {pdf_path}") 172 | 173 | if __name__ == "__main__": 174 | test_pdf_rag() -------------------------------------------------------------------------------- /tests/test_basic.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from hawkinsdb import HawkinsDB, LLMInterface 3 | 4 | logging.basicConfig(level=logging.INFO) 5 | 6 | def test_basic_functionality(): 7 | # Initialize database and interface 8 | db = HawkinsDB() 9 | interface = LLMInterface(db, auto_enrich=True) 10 | 11 | # Test entity with comprehensive data 12 | test_entity = { 13 | "column": "Conceptual", 14 | "type": "Car", 15 | "name": "TestCar1", 16 | "properties": { 17 | "brand": "Tesla", 18 | "color": "red", 19 | "model": "Model 3" 20 | }, 21 | "relationships": { 22 | "type_of": ["Vehicle", "Transport"], 23 | "has_part": ["Engine", "Wheels"] 24 | }, 25 | "location": {"in": "Garage"} 26 | } 27 | 28 | # Add entity 29 | print("\nAdding test entity...") 30 | result = interface.add_entity(test_entity) 31 | print(f"Add result: {result}") 32 | 33 | if result['success']: 34 | # Query the enriched entity 35 | print("\nQuerying enriched entity...") 36 | query_result = interface.query_entity('TestCar1', include_metadata=True) 37 | print(f"Query result: {query_result}") 38 | 39 | if __name__ == "__main__": 40 | test_basic_functionality() 41 | -------------------------------------------------------------------------------- /tests/test_conceptnet.py: 
--------------------------------------------------------------------------------
1 | import logging
2 | from hawkinsdb import HawkinsDB, LLMInterface
3 | import json
4 | 
5 | logging.basicConfig(level=logging.INFO)
6 | logger = logging.getLogger(__name__)
7 | 
8 | def test_conceptnet_enrichment():
9 |     # Initialize database and interface with auto-enrichment enabled
10 |     db = HawkinsDB()
11 |     interface = LLMInterface(db, auto_enrich=True)
12 | 
13 |     # Test entity - Car
14 |     car_entity = {
15 |         "column": "Semantic",
16 |         "type": "Car",
17 |         "name": "TestCar",
18 |         "properties": {
19 |             "brand": "Tesla",
20 |             "model": "Model 3"
21 |         },
22 |         "relationships": {
23 |             "type_of": ["Vehicle"]
24 |         }
25 |     }
26 | 
27 |     # Add entity and get enriched data
28 |     print("\nAdding car entity with auto-enrichment...")
29 |     result = interface.add_entity(car_entity)
30 |     print(f"Add result: {json.dumps(result, indent=2)}")
31 | 
32 |     if result['success']:
33 |         # Query the enriched entity by the name it was stored under
34 |         print("\nQuerying enriched entity...")
35 |         query_result = interface.query_entity('TestCar', include_metadata=True)
36 |         print(f"Enriched entity data: {json.dumps(query_result, indent=2)}")
37 | 
38 |         # Verify enrichment in the Semantic column the entity was added to
39 |         if query_result['success']:
40 |             data = query_result['data']['Semantic']
41 |             print("\nEnriched properties:")
42 |             for prop_type, values in data['properties'].items():
43 |                 print(f"\n{prop_type}:")
44 |                 for value in values:
45 |                     if isinstance(value, dict):
46 |                         print(f" - {value['value']} (confidence: {value['confidence']}, sources: {value['sources']})")
47 |                     else:
48 |                         print(f" - {value}")
49 | 
50 |             print("\nEnriched relationships:")
51 |             for rel_type, values in data['relationships'].items():
52 |                 print(f"\n{rel_type}:")
53 |                 for value in values:
54 |                     if isinstance(value, dict):
55 |                         print(f" - {value['value']} (confidence: {value['confidence']}, sources: {value['sources']})")
56 |                     else:
57 |                         print(f" - {value}")
58 | 
59 | if __name__ == "__main__":
60 |     test_conceptnet_enrichment()
61 | 
--------------------------------------------------------------------------------
/tests/test_enrichment.py:
--------------------------------------------------------------------------------
1 | from hawkinsdb import HawkinsDB, LLMInterface
2 | import json
3 | import logging
4 | 
5 | logging.basicConfig(level=logging.INFO)
6 | 
7 | def test_enrichment():
8 |     # Initialize database and interface
9 |     db = HawkinsDB()
10 |     interface = LLMInterface(db, auto_enrich=True)
11 | 
12 |     # Test entity
13 |     car_entity = {
14 |         'column': 'Conceptual',
15 |         'type': 'Car',
16 |         'name': 'TestCar1',
17 |         'properties': {
18 |             'brand': 'Tesla',
19 |             'model': 'Model 3'
20 |         },
21 |         'relationships': {
22 |             'type_of': ['Vehicle']
23 |         }
24 |     }
25 | 
26 |     # Add entity
27 |     print("\nAdding test entity...")
28 |     result = interface.add_entity(car_entity)
29 |     print(f"Add result: {json.dumps(result, indent=2)}")
30 | 
31 |     if result['success']:
32 |         # Query the enriched entity
33 |         print("\nQuerying enriched entity...")
34 |         query_result = interface.query_entity('TestCar1', include_metadata=True)
35 |         print(f"Query result: {json.dumps(query_result, indent=2)}")
36 | 
37 | if __name__ == "__main__":
38 |     test_enrichment()
39 | 
--------------------------------------------------------------------------------
/tests/test_exmple_full.py:
--------------------------------------------------------------------------------
1 | """
2 | Comprehensive example demonstrating all major features of HawkinsDB.
3 | This example showcases:
4 | 1. Basic CRUD operations
5 | 2. Different memory types (Semantic, Episodic, Procedural)
6 | 3. Memory type validations
7 | 4. Entity queries with query_frames
8 | 5. Listing all stored entities
9 | 6. Cleanup and error handling
10 | """
11 | 
12 | import logging
13 | import time
14 | import json
15 | from hawkinsdb import HawkinsDB, LLMInterface
16 | from datetime import datetime
17 | 
18 | logging.basicConfig(level=logging.INFO)
19 | logger = logging.getLogger(__name__)
20 | 
21 | 
22 | def demonstrate_memory_types(db: HawkinsDB):
23 |     """Demonstrate different memory types and their validations."""
24 |     logger.info("\n=== Testing Different Memory Types ===")
25 | 
26 | 
27 |     laptop_entity = {
28 |         "name": "MacBookPro_M4",
29 |         "column": "Semantic",
30 |         "properties": {
31 |             "brand": "Apple",
32 |             "model": "MacBook Pro",
33 |             "year": "2024",
34 |             "processor": "M4 chip",
35 |             "ram": "16GB",
36 |             "storage": "512GB SSD",
37 |             "location": "home office"
38 |         },
39 |         "relationships": {
40 |             "type_of": ["Laptop", "Computer"],
41 |             "manufactured_by": ["Apple"]
42 |         }
43 |     }
44 | 
45 |     # Add the entity directly first
46 |     logger.info("\nAdding MacBook Pro entity...")
47 |     db.add_entity(laptop_entity)
48 |     # Semantic Memory
49 |     semantic_data = {
50 |         "name": "Tesla_Model_3",
51 |         "column": "Semantic",
52 |         "properties": {
53 |             "type": "electric_car",
54 |             "manufacturer": "Tesla",
55 |             "year": 2024,
56 |             "features": ["autopilot", "battery_powered", "touch_screen"]
57 |         },
58 |         "relationships": {
59 |             "similar_to": ["Tesla_Model_Y", "Tesla_Model_S"],
60 |             "competes_with": ["BMW_i4", "Polestar_2"]
61 |         }
62 |     }
63 |     db.add_entity(semantic_data)
64 |     logger.info("Added semantic memory: Tesla Model 3")
65 | 
66 |     # Episodic Memory
67 |     episodic_data = {
68 |         "name": "First_Tesla_Drive",
69 |         "column": "Episodic",
70 |         "properties": {
71 |             "timestamp": datetime.now().isoformat(),
72 |             "action": "test_drive",
73 |             "location": {
74 |                 "city": "Palo Alto",
75 |                 "state": "CA"
76 |             },
77 |             "duration": "45 minutes",
78 |             "participants": ["customer", "sales_rep"]
79 |         }
80 |     }
81 |     db.add_entity(episodic_data)
82 |     logger.info("Added episodic memory: First Tesla Drive")
83 | 
84 |     # Procedural Memory
85 |     procedural_data = {
86 |         "name": "Tesla_Charging_Process",
87 |         "column": "Procedural",
88 |         "properties": {
89 |             "steps": [
90 |                 "Park near charging station", "Open charging port",
91 |                 "Connect charging cable", "Initiate charging via touchscreen",
92 |                 "Wait for desired charge level", "Disconnect charging cable"
93 |             ],
94 |             "required_tools": ["charging_cable", "Tesla_app"],
95 |             "difficulty":
96 |                 "easy"
97 |         }
98 |     }
99 |     db.add_entity(procedural_data)
100 |     logger.info("Added procedural memory: Tesla Charging Process")
101 | 
102 | 
103 | # Function removed as caching is no longer supported
104 | 
105 | 
106 | def main():
107 |     """Run the comprehensive example."""
108 |     # Initialize database with SQLite storage
109 |     db = HawkinsDB(storage_type='sqlite')
110 | 
111 |     try:
112 |         # Test different memory types
113 |         demonstrate_memory_types(db)
114 | 
115 |         # Test queries
116 |         logger.info("\n=== Testing Queries ===")
117 |         tesla_data = db.query_frames("Tesla_Model_3")
118 |         # Convert ReferenceFrame objects to dictionaries before JSON serialization
119 |         tesla_data_dict = {
120 |             column_name: frame.to_dict()
121 |             for column_name, frame in tesla_data.items()
122 |         }
123 |         logger.info(
124 |             f"Query result for Tesla Model 3: {json.dumps(tesla_data_dict, indent=2)}"
125 |         )
126 | 
127 |         # List all entities
128 |         logger.info("\n=== All Entities ===")
129 |         all_entities = 
db.list_entities() 130 | logger.info(f"Total entities: {len(all_entities)}") 131 | logger.info(f"Entities: {json.dumps(all_entities, indent=2)}") 132 | 133 | except Exception as e: 134 | logger.error(f"Error during example execution: {e}") 135 | raise 136 | finally: 137 | db.cleanup() 138 | 139 | 140 | if __name__ == "__main__": 141 | main() 142 | -------------------------------------------------------------------------------- /tests/test_hawkinsdb_comprehensive.py: -------------------------------------------------------------------------------- 1 | """Comprehensive test suite for HawkinsDB.""" 2 | import logging 3 | import time 4 | import json 5 | from datetime import datetime 6 | import pytest 7 | from hawkinsdb import HawkinsDB 8 | from hawkinsdb.types import PropertyCandidate 9 | 10 | logging.basicConfig(level=logging.DEBUG) 11 | logger = logging.getLogger(__name__) 12 | 13 | class TestHawkinsDBComprehensive: 14 | """Comprehensive test suite for HawkinsDB functionality.""" 15 | 16 | @pytest.fixture 17 | def db(self): 18 | """Initialize test database.""" 19 | db = HawkinsDB() 20 | yield db 21 | db.cleanup() 22 | 23 | def test_property_handling(self, db): 24 | """Test property handling and validation.""" 25 | # Test various property formats and validations 26 | property_data = { 27 | "name": "TestProperty", 28 | "column": "Semantic", 29 | "properties": { 30 | # Test dictionary format with full metadata 31 | "color": [ 32 | {"value": "red", "confidence": 0.9, "sources": ["observation"]}, 33 | {"value": "crimson", "confidence": 0.7, "sources": ["inference"]} 34 | ], 35 | # Test direct value with defaults 36 | "size": "large", 37 | # Test list with mixed formats 38 | "tags": [ 39 | {"value": "test", "confidence": 0.8}, 40 | "important", 41 | {"value": "verified", "sources": ["validation"]} 42 | ], 43 | # Test complex value type conversion 44 | "metadata": {"key": "value", "nested": {"data": True}}, 45 | # Test empty sources 46 | "status": {"value": "active", "confidence": 1.0, "sources": []} 47 | } 48 | } 49 | 50 | result = db.add_entity(property_data) 51 | assert result["success"], f"Failed to add property test: {result.get('message')}" 52 | 53 | # Query and validate 54 | query_results = db.query_frames("TestProperty") 55 | assert "Semantic" in query_results, "Property test memory not found" 56 | frame = query_results["Semantic"] 57 | 58 | # Validate multi-value property with full metadata 59 | assert len(frame.properties["color"]) == 2 60 | assert frame.properties["color"][0].confidence == 0.9 61 | assert "observation" in frame.properties["color"][0].sources 62 | assert frame.properties["color"][1].value == "crimson" 63 | 64 | # Validate direct value conversion 65 | assert len(frame.properties["size"]) == 1 66 | assert frame.properties["size"][0].value == "large" 67 | assert frame.properties["size"][0].confidence == 1.0 68 | 69 | # Validate mixed format list 70 | assert len(frame.properties["tags"]) == 3 71 | assert frame.properties["tags"][0].confidence == 0.8 72 | assert frame.properties["tags"][1].value == "important" 73 | assert "validation" in frame.properties["tags"][2].sources 74 | 75 | # Validate complex value conversion 76 | assert isinstance(frame.properties["metadata"][0].value, str) 77 | 78 | # Validate empty sources handling 79 | assert frame.properties["status"][0].sources == [] 80 | 81 | def test_relationship_handling(self, db): 82 | """Test relationship handling and validation.""" 83 | # Setup test entities with relationships 84 | entities = [ 85 | { 86 | "name": "Dog", 87 
| "column": "Semantic", 88 | "properties": { 89 | "type": "Animal", 90 | "species": "Canis lupus familiaris" 91 | }, 92 | "relationships": { 93 | "is_a": ["Mammal", "Pet"], # Simple values get auto-wrapped 94 | "has_part": [ # Complex values with confidence and sources 95 | {"value": "Tail", "confidence": 1.0, "sources": ["anatomy"]}, 96 | {"value": "Legs", "confidence": 1.0, "sources": ["anatomy"]}, 97 | {"value": "Head", "confidence": 1.0, "sources": ["anatomy"]} 98 | ], 99 | "eats": [ 100 | {"value": "DogFood", "confidence": 0.95, "sources": ["observation"]}, 101 | {"value": "Meat", "confidence": 0.8, "sources": ["nature"]} 102 | ] 103 | } 104 | }, 105 | { 106 | "name": "Mammal", 107 | "column": "Semantic", 108 | "properties": { 109 | "type": "Classification", 110 | "characteristics": ["warm-blooded", "fur/hair", "mammary_glands"] 111 | }, 112 | "relationships": { 113 | "has_instance": [ 114 | {"value": "Dog", "confidence": 1.0}, 115 | {"value": "Cat", "confidence": 1.0} 116 | ] 117 | } 118 | } 119 | ] 120 | 121 | # Add entities 122 | for entity in entities: 123 | result = db.add_entity(entity) 124 | assert result["success"], f"Failed to add entity: {result.get('message')}" 125 | 126 | # Query and validate relationships 127 | dog_result = db.query_frames("Dog") 128 | mammal_result = db.query_frames("Mammal") 129 | 130 | assert "Semantic" in dog_result, "Dog entity not found" 131 | assert "Semantic" in mammal_result, "Mammal entity not found" 132 | 133 | dog_frame = dog_result["Semantic"] 134 | mammal_frame = mammal_result["Semantic"] 135 | 136 | # Validate bidirectional relationships 137 | assert any(v.value == "Mammal" for v in dog_frame.relationships["is_a"]), "Missing 'is_a' relationship" 138 | assert any(v.value == "Dog" for v in mammal_frame.relationships["has_instance"]), "Missing 'has_instance' relationship" 139 | 140 | # Validate relationship properties 141 | assert any(v.value == "Pet" and v.confidence == 0.9 for v in dog_frame.relationships["is_a"]) 142 | assert any(v.value == "DogFood" and v.confidence == 0.95 and "observation" in v.sources for v in dog_frame.relationships["eats"]) 143 | 144 | def test_query_and_update(self, db): 145 | """Test querying and updating functionality.""" 146 | # Add test data 147 | initial_data = { 148 | "name": "TestEntity", 149 | "column": "Semantic", 150 | "properties": { 151 | "status": "active", # Simple value gets auto-wrapped 152 | "tags": {"value": ["test", "initial"], "confidence": 1.0} # Complex value with confidence 153 | } 154 | } 155 | 156 | result = db.add_entity(initial_data) 157 | assert result["success"], "Failed to add initial entity" 158 | 159 | # Test querying 160 | query_result = db.query_frames("TestEntity") 161 | assert "Semantic" in query_result, "Entity not found in query results" 162 | frame = query_result["Semantic"] 163 | assert frame.properties["status"][0].value == "active" 164 | 165 | # Test updating 166 | update_data = { 167 | "name": "TestEntity", 168 | "column": "Semantic", 169 | "properties": { 170 | "status": PropertyCandidate("inactive", confidence=0.8), 171 | "tags": ["test", "updated"] 172 | } 173 | } 174 | 175 | update_result = db.update_entity(update_data) 176 | assert update_result["success"], "Failed to update entity" 177 | 178 | # Verify update 179 | updated_result = db.query_frames("TestEntity") 180 | updated_frame = updated_result["Semantic"] 181 | status_prop = updated_frame.properties.get("status", []) 182 | assert len(status_prop) > 0, "Status property not found" 183 | assert status_prop[0].value == 
"inactive", f"Expected 'inactive' but got {status_prop[0].value}" 184 | assert status_prop[0].confidence == 0.8, f"Expected confidence 0.8 but got {status_prop[0].confidence}" 185 | 186 | def test_error_handling(self, db): 187 | """Test error handling and validation.""" 188 | # Test invalid entity name 189 | invalid_name = { 190 | "name": "", # Empty name 191 | "column": "Semantic", 192 | "properties": {"test": "value"} 193 | } 194 | result = db.add_entity(invalid_name) 195 | assert not result["success"], "Should fail with empty name" 196 | assert "message" in result, "Error message should be present" 197 | 198 | # Test invalid column 199 | invalid_column = { 200 | "name": "TestInvalid", 201 | "column": "InvalidColumn", # Non-existent column 202 | "properties": {"test": "value"} 203 | } 204 | result = db.add_entity(invalid_column) 205 | assert not result["success"], "Should fail with invalid column" 206 | 207 | # Test invalid property format 208 | invalid_property = { 209 | "name": "TestInvalid", 210 | "column": "Semantic", 211 | "properties": None # Invalid properties 212 | } 213 | result = db.add_entity(invalid_property) 214 | assert not result["success"], "Should fail with invalid properties" 215 | 216 | # Test duplicate entity handling 217 | duplicate = { 218 | "name": "TestDuplicate", 219 | "column": "Semantic", 220 | "properties": {"test": "value"} 221 | } 222 | first_result = db.add_entity(duplicate) 223 | assert first_result["success"], "First addition should succeed" 224 | 225 | second_result = db.add_entity(duplicate) 226 | assert not second_result["success"], "Duplicate addition should fail" 227 | 228 | if __name__ == "__main__": 229 | pytest.main([__file__, "-v", "--log-cli-level=DEBUG"]) -------------------------------------------------------------------------------- /tests/test_memory_specific.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | import time 3 | from hawkinsdb import HawkinsDB 4 | from hawkinsdb.types import PropertyCandidate 5 | import logging 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger(__name__) 9 | 10 | class TestMemoryTypes(unittest.TestCase): 11 | def setUp(self): 12 | self.db = HawkinsDB(storage_type='sqlite', db_path=":memory:") 13 | def tearDown(self): 14 | """Clean up after each test.""" 15 | if hasattr(self, 'db'): 16 | self.db.cleanup() 17 | 18 | 19 | def test_procedural_memory_basic(self): 20 | """Test basic procedural memory creation and retrieval""" 21 | procedure = { 22 | "name": "TestProcedure", 23 | "column": "Procedural", 24 | "properties": { 25 | "steps": ["Step1: Initialize system", "Step2: Process data", "Step3: Generate output"], 26 | "difficulty": "Easy", 27 | "required_tools": ["Computer", "Software"], 28 | "duration": "5 minutes" 29 | }, 30 | "relationships": { 31 | "requires": ["BasicKnowledge"], 32 | "part_of": ["LargerProcess"] 33 | } 34 | } 35 | 36 | # Add procedure through entity API 37 | result = self.db.add_entity(procedure) 38 | 39 | # Query and verify 40 | frames = self.db.query_frames("TestProcedure") 41 | self.assertIn("Procedural", frames) 42 | 43 | frame = frames["Procedural"] 44 | self.assertEqual(frame.name.lower(), "testprocedure") # Name should be normalized 45 | self.assertTrue(any("steps" in prop for prop in frame.properties.keys())) 46 | self.assertTrue(any("required_tools" in prop for prop in frame.properties.keys())) 47 | 48 | def test_procedural_memory_validation(self): 49 | """Test validation for procedural memory""" 50 | 
# Test missing required fields 51 | invalid_procedure = { 52 | "name": "InvalidProcedure", 53 | "column": "Procedural", 54 | "properties": { 55 | "difficulty": "Easy" 56 | } 57 | } 58 | 59 | with self.assertRaises(HawkinsDB.EntityValidationError): 60 | self.db.add_entity(invalid_procedure) 61 | 62 | def test_episodic_memory_basic(self): 63 | """Test basic episodic memory creation and retrieval""" 64 | current_time = time.time() 65 | episode = { 66 | "name": "FirstExperience", 67 | "column": "Episodic", 68 | "properties": { 69 | "timestamp": current_time, 70 | "action": "Ran first test", 71 | "location": "TestLab", 72 | "participants": ["User1"], 73 | "outcome": "Success", 74 | "duration": "10 minutes" 75 | }, 76 | "relationships": { 77 | "related_to": ["TestProcedure"], 78 | "follows": ["Setup"] 79 | } 80 | } 81 | 82 | # Add episode through entity API 83 | result = self.db.add_entity(episode) 84 | 85 | # Query and verify 86 | frames = self.db.query_frames("FirstExperience") 87 | self.assertIn("Episodic", frames) 88 | 89 | frame = frames["Episodic"] 90 | self.assertEqual(frame.name.lower(), "firstexperience") 91 | self.assertTrue(any("timestamp" in prop for prop in frame.properties.keys())) 92 | self.assertTrue(any("location" in prop for prop in frame.properties.keys())) 93 | self.assertTrue(any("participants" in prop for prop in frame.properties.keys())) 94 | 95 | def test_episodic_memory_validation(self): 96 | """Test validation for episodic memory""" 97 | # Test missing required fields 98 | invalid_episode = { 99 | "name": "InvalidEpisode", 100 | "column": "Episodic", 101 | "properties": { 102 | "location": "TestLab" 103 | } 104 | } 105 | 106 | with self.assertRaises(HawkinsDB.EntityValidationError): 107 | self.db.add_entity(invalid_episode) 108 | 109 | def test_memory_links(self): 110 | """Test linking between procedural and episodic memories""" 111 | # Add a procedure first 112 | procedure = { 113 | "name": "LinkedProcedure", 114 | "column": "Procedural", 115 | "properties": { 116 | "steps": ["Step1", "Step2"], 117 | "difficulty": "Medium", 118 | "required_tools": ["TestTool"] 119 | }, 120 | "relationships": {} 121 | } 122 | 123 | # Add procedure through entity API 124 | result = self.db.add_entity(procedure) 125 | 126 | # Add an episode that references the procedure 127 | current_time = time.time() 128 | episode = { 129 | "name": "LinkedEpisode", 130 | "column": "Episodic", 131 | "properties": { 132 | "timestamp": current_time, 133 | "action": "Executed procedure", 134 | "location": "TestLocation", 135 | "participants": ["Tester"] 136 | }, 137 | "relationships": { 138 | "follows": ["LinkedProcedure"] 139 | } 140 | } 141 | 142 | # Add episode through entity API 143 | result = self.db.add_entity(episode) 144 | 145 | # Verify the link 146 | episode_frames = self.db.query_frames("LinkedEpisode") 147 | self.assertIn("Episodic", episode_frames) 148 | self.assertTrue( 149 | any("LinkedProcedure" in str(rel.value) 150 | for rel in episode_frames["Episodic"].relationships.get("follows", [])) 151 | ) 152 | 153 | def test_sequential_episodes(self): 154 | """Test creating and linking sequential episodes""" 155 | base_time = time.time() 156 | 157 | # Create a sequence of related episodes 158 | episodes = [ 159 | { 160 | "name": f"Episode_{i}", 161 | "column": "Episodic", 162 | "properties": { 163 | "timestamp": base_time + i * 3600, # Hour intervals 164 | "action": f"Action_{i}", 165 | "participants": ["Tester"] 166 | }, 167 | "relationships": { 168 | "follows": [f"Episode_{i-1}"] if i > 0 else [] 
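                    # Episode_0 starts the chain, so its "follows" list stays empty;
                    # each later episode links back to its immediate predecessor.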
169 | } 170 | } for i in range(3) 171 | ] 172 | 173 | # Add episodes through entity API 174 | for episode in episodes: 175 | result = self.db.add_entity(episode) 176 | 177 | # Verify sequential relationships 178 | for i in range(1, 3): 179 | frames = self.db.query_frames(f"Episode_{i}") 180 | self.assertTrue( 181 | any(f"Episode_{i-1}" in str(rel.value) 182 | for rel in frames["Episodic"].relationships.get("follows", [])) 183 | ) 184 | 185 | if __name__ == '__main__': 186 | unittest.main() 187 | -------------------------------------------------------------------------------- /tests/test_memory_types.py: -------------------------------------------------------------------------------- 1 | """Test different memory types and their validations.""" 2 | import logging 3 | import time 4 | import json 5 | from datetime import datetime 6 | import pytest 7 | from hawkinsdb import HawkinsDB 8 | from hawkinsdb.types import PropertyCandidate 9 | 10 | logging.basicConfig(level=logging.INFO) 11 | logger = logging.getLogger(__name__) 12 | 13 | class TestMemoryTypes: 14 | """Test class for different memory types and their validations.""" 15 | 16 | @pytest.fixture 17 | def db(self): 18 | """Initialize test database.""" 19 | db = HawkinsDB() 20 | yield db 21 | db.cleanup() 22 | 23 | def validate_episodic_memory(self, case, query_result): 24 | """Validate episodic memory specific requirements.""" 25 | assert 'timestamp' in query_result, "Episodic memory missing timestamp" 26 | assert isinstance(query_result.get('timestamp'), (int, float)), "Invalid timestamp type" 27 | assert 'action' in query_result, "Episodic memory missing action" 28 | if 'participants' in case: 29 | assert 'participants' in query_result, "Episodic memory missing participants" 30 | assert isinstance(query_result.get('participants', []), list), "Participants must be a list" 31 | 32 | def validate_procedural_memory(self, case, query_result): 33 | """Validate procedural memory specific requirements.""" 34 | assert 'steps' in query_result, "Procedural memory missing steps" 35 | assert isinstance(query_result.get('steps', []), list), "Steps must be a list" 36 | assert len(query_result.get('steps', [])) > 0, "Steps cannot be empty" 37 | if 'properties' in case and 'required_tools' in case['properties']: 38 | assert 'required_tools' in query_result, "Procedural memory missing required tools" 39 | assert isinstance(query_result.get('required_tools', []), list), "Required tools must be a list" 40 | 41 | def test_semantic_memory(self, db): 42 | """Test semantic memory creation and validation.""" 43 | semantic_data = { 44 | "name": "TestConcept1", 45 | "column": "Semantic", 46 | "properties": { 47 | "definition": "A test concept", 48 | "category": "Test" 49 | }, 50 | "relationships": { 51 | "related_to": ["AnotherConcept"], 52 | "part_of": ["LargerConcept"] 53 | } 54 | } 55 | result = db.add_entity(semantic_data) 56 | assert result["success"], f"Failed to add semantic memory: {result.get('message')}" 57 | 58 | query_results = db.query_frames("TestConcept1") 59 | assert "Semantic" in query_results, "Semantic memory not found in query results" 60 | 61 | frame = query_results["Semantic"] 62 | assert frame.name == "TestConcept1" 63 | assert "definition" in frame.properties 64 | assert "category" in frame.properties 65 | 66 | def test_episodic_memory(self, db): 67 | """Test episodic memory creation and validation.""" 68 | episodic_data = { 69 | "name": "TestEvent1", 70 | "column": "Episodic", 71 | "timestamp": time.time(), 72 | "action": "Created test", 73 
| "properties": { 74 | "location": "Test Environment", 75 | "duration": "10 minutes", 76 | "outcome": "Success", 77 | "participants": ["User1", "System"] 78 | } 79 | } 80 | result = db.add_entity(episodic_data) 81 | assert result["success"], f"Failed to add episodic memory: {result.get('message')}" 82 | 83 | query_results = db.query_frames("TestEvent1") 84 | assert "Episodic" in query_results, "Episodic memory not found in query results" 85 | 86 | frame = query_results["Episodic"] 87 | self.validate_episodic_memory(episodic_data, frame.to_dict()) 88 | 89 | def test_procedural_memory(self, db): 90 | """Test procedural memory creation and validation.""" 91 | procedural_data = { 92 | "name": "TestProcedure1", 93 | "column": "Procedural", 94 | "steps": [ 95 | "Step 1", 96 | "Step 2", 97 | "Step 3" 98 | ], 99 | "properties": { 100 | "purpose": "Test procedure execution", 101 | "difficulty": "Easy", 102 | "prerequisites": ["Required skill 1"], 103 | "success_criteria": ["Criterion 1"] 104 | } 105 | } 106 | result = db.add_entity(procedural_data) 107 | assert result["success"], f"Failed to add procedural memory: {result.get('message')}" 108 | 109 | query_results = db.query_frames("TestProcedure1") 110 | assert "Procedural" in query_results, "Procedural memory not found in query results" 111 | 112 | frame = query_results["Procedural"] 113 | self.validate_procedural_memory(procedural_data, frame.to_dict()) 114 | 115 | def test_invalid_memory_types(self, db): 116 | """Test invalid memory type validations.""" 117 | invalid_cases = [ 118 | # Missing name 119 | { 120 | "column": "Semantic", 121 | "properties": {"definition": "Should fail"} 122 | }, 123 | # Invalid timestamp type 124 | { 125 | "name": "InvalidEvent1", 126 | "column": "Episodic", 127 | "timestamp": "not a timestamp", 128 | "action": "Should fail" 129 | }, 130 | # Missing steps 131 | { 132 | "name": "InvalidProcedure1", 133 | "column": "Procedural" 134 | } 135 | ] 136 | 137 | for case in invalid_cases: 138 | result = db.add_entity(case) 139 | assert not result["success"], f"Invalid case should fail: {case}" 140 | assert "message" in result, "Error message should be present" -------------------------------------------------------------------------------- /tests/test_openai.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | import json 4 | import unittest 5 | from typing import Optional, Dict, Any 6 | from hawkinsdb import HawkinsDB 7 | from hawkinsdb.openai_interface import OpenAIInterface 8 | from openai import OpenAI, OpenAIError, BadRequestError 9 | from openai.types.chat import ChatCompletion 10 | 11 | # Configure logging with more detailed output 12 | logging.basicConfig( 13 | level=logging.DEBUG, # Changed to DEBUG for more detailed logs 14 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', 15 | force=True # Ensure our configuration takes precedence 16 | ) 17 | logger = logging.getLogger(__name__) 18 | 19 | class TestOpenAIInterface(unittest.TestCase): 20 | """Test OpenAI integration with HawkinsDB.""" 21 | 22 | def setUp(self): 23 | """Set up test environment with API key validation.""" 24 | try: 25 | # Initialize database 26 | self.db = HawkinsDB() 27 | 28 | # Get and validate API key 29 | self.api_key = os.getenv("OPENAI_API_KEY") 30 | if not self.api_key: 31 | self.skipTest("OPENAI_API_KEY environment variable not set") 32 | 33 | if not self.api_key.startswith('sk-'): 34 | self.skipTest("Invalid OpenAI API key format") 35 | 36 | # Initialize 
interface with test model
37 |             self.model = "gpt-3.5-turbo-1106"
38 |             self.interface = OpenAIInterface(self.db, model=self.model)
39 | 
40 |             # Verify API connection
41 |             try:
42 |                 self.interface._test_connection()
43 |             except Exception as e:
44 |                 self.skipTest(f"Failed to connect to OpenAI API: {str(e)}")
45 | 
46 |             # Set up test data
47 |             self.text_description = """
48 |             There's a red Tesla Model 3 in my garage. It's an electric vehicle
49 |             with autopilot capabilities and a glass roof. The car was manufactured
50 |             in 2023 and has only 1000 miles on it.
51 |             """
52 | 
60 |         except Exception as e:
61 |             logger.error(f"Test environment initialization failed: {str(e)}")
62 |             self.fail(f"Failed to initialize test environment: {str(e)}")
63 | 
64 |         logger.info("Test environment initialized successfully")
65 | 
66 |     def tearDown(self):
67 |         """Clean up test data and resources."""
68 |         try:
69 |             # Clean up sensitive data
70 |             if hasattr(self, 'db') and hasattr(self.db, 'config'):
71 |                 try:
72 |                     # Clear credentials
73 |                     self.db.config.clear_sensitive_data()
74 |                 except Exception as e:
75 |                     logger.warning(f"Failed to clear sensitive data: {str(e)}")
76 | 
77 |             # Clean up database test data
78 |             if hasattr(self, 'db'):
79 |                 try:
80 |                     self.db._perform_maintenance()
81 |                 except Exception as e:
82 |                     logger.warning(f"Database cleanup failed: {str(e)}")
83 | 
84 |             # Clear OpenAI interface
85 |             if hasattr(self, 'interface'):
86 |                 try:
87 |                     # Clear client and model references
88 |                     if hasattr(self.interface, 'client'):
89 |                         delattr(self.interface, 'client')
90 |                     delattr(self, 'interface')
91 |                 except Exception as e:
92 |                     logger.warning(f"Interface cleanup failed: {str(e)}")
93 | 
94 |         except Exception as e:
95 |             logger.error(f"Cleanup failed: {str(e)}")
96 |         finally:
97 |             # Force garbage collection
98 |             import gc
99 |             gc.collect()
100 | 
101 |     def test_parse_entity_from_text(self):
102 |         """Test parsing entity from natural language text with new API format."""
103 |         logger.info("\nTesting entity parsing from text...")
104 |         try:
105 |             result = self.interface.parse_entity_from_text(self.text_description)
106 | 
107 |             # Verify successful response
108 |             self.assertTrue(result['success'], f"Failed to parse entity: {result.get('message', 'Unknown error')}")
109 |             self.assertIn('entity_data', result, "Response missing entity_data")
110 |             self.assertIsNotNone(result['entity_data'], "Entity data is None")
111 | 
112 |             # Verify entity structure with new API format
113 |             entity_data = result['entity_data']
114 |             self.assertIn('name', entity_data, "Entity missing name field")
115 |             self.assertTrue(entity_data['name'].strip(), "Entity name should not be empty")
116 |             self.assertIn('properties', entity_data, "Entity missing properties field")
117 |             self.assertIsInstance(entity_data['properties'], dict, "Properties should be a dictionary")
118 | 
119 |             # Verify Tesla-specific properties with more flexible matching
120 |             props = entity_data['properties']
121 | 
122 |             # Check for color property
123 |             color_found = any(
124 |                 'color' in k.lower() or 'red' in str(v).lower() or
125 |                 any('red' in str(val).lower() for val in v if isinstance(v, list))
126 |                 for k, v in props.items()
127 |             )
128 |             self.assertTrue(color_found, "Color property missing or incorrect")
129 | 
130 |             # Check for 
model/make property with broader matching 131 | model_found = any( 132 | any(term in k.lower() or term in str(v).lower() or 133 | any(term in str(val).lower() for val in v if isinstance(v, list)) 134 | for term in ['model', 'tesla', 'make', 'vehicle']) 135 | for k, v in props.items() 136 | ) 137 | self.assertTrue(model_found, "Model/make property missing or incorrect") 138 | 139 | year_found = any('year' in k.lower() or '2023' in str(v) 140 | for k, v in props.items()) 141 | self.assertTrue(year_found, "Year property missing or incorrect") 142 | 143 | logger.info(f"Parsed entity: {json.dumps(result, indent=2)}") 144 | 145 | except OpenAIError as oe: 146 | self.skipTest(f"OpenAI API error: {str(oe)}") 147 | except Exception as e: 148 | self.fail(f"Test failed with unexpected exception: {str(e)}") 149 | 150 | def test_add_entity_to_database(self): 151 | """Test adding parsed entity to database.""" 152 | logger.info("\nTesting entity addition to database...") 153 | try: 154 | # First ensure we have a valid API key 155 | if not self.api_key: 156 | self.skipTest("No valid API key available") 157 | 158 | # Parse the entity 159 | parsed_result = self.interface.parse_entity_from_text(self.text_description) 160 | self.assertTrue(parsed_result['success'], 161 | f"Failed to parse entity: {parsed_result.get('message', 'Unknown error')}") 162 | self.assertIsNotNone(parsed_result.get('entity_data'), "No entity data returned") 163 | 164 | # Add to database 165 | add_result = self.db.add_entity(parsed_result['entity_data']) 166 | self.assertTrue(add_result['success'], 167 | f"Failed to add entity: {add_result.get('message', 'Unknown error')}") 168 | self.assertIsNotNone(add_result.get('entity_name'), "No entity name returned") 169 | 170 | # Verify entity was added correctly 171 | entity_name = add_result['entity_name'] 172 | frames = self.db.query_frames(entity_name) 173 | self.assertTrue(frames, f"Entity {entity_name} not found in database") 174 | 175 | logger.info(f"Entity addition result: {json.dumps(add_result, indent=2)}") 176 | 177 | except OpenAIError as oe: 178 | self.skipTest(f"OpenAI API error: {str(oe)}") 179 | except Exception as e: 180 | self.fail(f"Test failed with unexpected exception: {str(e)}") 181 | 182 | def test_answer_question(self): 183 | """Test querying the database using natural language.""" 184 | logger.info("\nTesting natural language query...") 185 | try: 186 | # First add an entity to query 187 | parsed_result = self.interface.parse_entity_from_text(self.text_description) 188 | self.assertTrue(parsed_result['success'], "Failed to parse entity for query test") 189 | 190 | add_result = self.db.add_entity(parsed_result['entity_data']) 191 | self.assertTrue(add_result['success'], "Failed to add entity for query test") 192 | 193 | # Test querying 194 | query = "What are the main features of this car and where is it located?" 
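            # The answer is generated from the entity stored above; the assertions
            # below check for key facts rather than exact wording, since LLM output
            # varies between runs.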
182 |     def test_answer_question(self):
183 |         """Test querying the database using natural language."""
184 |         logger.info("\nTesting natural language query...")
185 |         try:
186 |             # First add an entity to query
187 |             parsed_result = self.interface.parse_entity_from_text(self.text_description)
188 |             self.assertTrue(parsed_result['success'], "Failed to parse entity for query test")
189 | 
190 |             add_result = self.db.add_entity(parsed_result['entity_data'])
191 |             self.assertTrue(add_result['success'], "Failed to add entity for query test")
192 | 
193 |             # Test querying
194 |             query = "What are the main features of this car and where is it located?"
195 |             query_result = self.interface.answer_question(query)
196 | 
197 |             self.assertTrue(query_result['success'],
198 |                             f"Query failed: {query_result.get('message', 'Unknown error')}")
199 |             self.assertIsNotNone(query_result.get('response'), "No response returned")
200 | 
201 |             # Verify response content with more detailed assertions
202 |             response = query_result['response']
203 |             self.assertIn('Tesla', response, "Response should mention Tesla")
204 |             self.assertIn('garage', response, "Response should mention location (garage)")
205 | 
206 |             # Verify response structure
207 |             self.assertIsInstance(response, str, "Response should be a string")
208 |             self.assertTrue(len(response) > 20, "Response should be a meaningful length")
209 | 
210 |             # Log the actual response for debugging
211 |             logger.info(f"Query response: {response}")
212 | 
213 |             # Verify key information is present
214 |             key_terms = ['Model 3', 'electric', 'red']
215 |             found_terms = [term for term in key_terms if term.lower() in response.lower()]
216 |             self.assertTrue(found_terms, f"Response should contain at least one of: {key_terms}")
217 | 
218 |             logger.info(f"Query result: {json.dumps(query_result, indent=2)}")
219 | 
220 |         except OpenAIError as oe:
221 |             self.skipTest(f"OpenAI API error: {str(oe)}")
222 |         except Exception as e:
223 |             self.fail(f"Test failed with exception: {str(e)}")
224 | 
225 |     def test_error_handling(self):
226 |         """Test error handling with OpenAI API v1.0+."""
227 |         logger.info("\nTesting error handling...")
228 | 
229 |         # Test with empty input
230 |         result = self.interface.parse_entity_from_text("")
231 |         self.assertFalse(result['success'])
232 |         self.assertIn('message', result)
233 |         self.assertIsNone(result.get('entity_data'))
234 | 
235 |         # Test with empty query
236 |         query_result = self.interface.answer_question("")
237 |         self.assertFalse(query_result['success'])
238 |         self.assertIn('message', query_result)
239 |         self.assertIsNone(query_result.get('response'))
240 | 
241 |         # Test API key validation
242 |         def test_invalid_key():
243 |             # Test with invalid API key format
244 |             try:
245 |                 bad_interface = OpenAIInterface(self.db)
246 |                 bad_interface.client = OpenAI(api_key="sk_test_invalid123")
247 |                 result = bad_interface.parse_entity_from_text(self.text_description)
248 |                 self.assertFalse(result['success'])
249 |                 self.assertTrue('invalid' in result['message'].lower())
250 |             except Exception as e:
251 |                 self.fail(f"Unexpected exception: {str(e)}")
252 |         test_invalid_key()
253 | 
254 |         # Test various error scenarios with v1.0+ error patterns
255 |         original_client = self.interface.client
256 |         try:
257 |             # Test rate limit
258 |             def mock_rate_limit(*args, **kwargs):
259 |                 raise OpenAIError("rate_limit_exceeded: Please retry your request after 20s")
260 | 
261 |             self.interface.client.chat.completions.create = mock_rate_limit
262 |             result = self.interface.parse_entity_from_text(self.text_description)
263 |             self.assertFalse(result['success'])
264 |             self.assertTrue(
265 |                 any(term in result.get('message', '').lower()
266 |                     for term in ['rate limit', 'try again']),
267 |                 "Error message should indicate rate limit"
268 |             )
269 | 
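            # The remaining scenarios follow the same pattern as the rate-limit
            # case above: stub client.chat.completions.create with a function
            # that raises a specific OpenAIError, then verify the interface
            # surfaces a structured failure instead of propagating the exception.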
270 |             # Test quota exceeded
271 |             def mock_quota(*args, **kwargs):
272 |                 raise OpenAIError("insufficient_quota: You exceeded your current quota")
273 | 
274 |             self.interface.client.chat.completions.create = mock_quota
275 |             result = self.interface.parse_entity_from_text(self.text_description)
276 |             self.assertFalse(result['success'])
277 |             self.assertTrue(
278 |                 'quota' in result.get('message', '').lower(),
279 |                 "Error message should indicate quota exceeded"
280 |             )
281 | 
282 |             # Test model not found
283 |             def mock_model_error(*args, **kwargs):
284 |                 raise OpenAIError("model_not_found: The model does not exist")
285 | 
286 |             self.interface.client.chat.completions.create = mock_model_error
287 |             result = self.interface.parse_entity_from_text(self.text_description)
288 |             self.assertFalse(result['success'])
289 |             self.assertTrue(
290 |                 'model' in result.get('message', '').lower(),
291 |                 "Error message should indicate model not found"
292 |             )
293 | 
294 |             # Test timeout
295 |             def mock_timeout(*args, **kwargs):
296 |                 raise OpenAIError("request_timeout: Request timed out")
297 | 
298 |             self.interface.client.chat.completions.create = mock_timeout
299 |             result = self.interface.parse_entity_from_text(self.text_description)
300 |             self.assertFalse(result['success'])
301 |             self.assertTrue(
302 |                 'timeout' in result.get('message', '').lower(),
303 |                 "Error message should indicate timeout"
304 |             )
305 | 
306 |             # Test server error
307 |             def mock_server_error(*args, **kwargs):
308 |                 raise OpenAIError("server_error: Internal server error")
309 | 
310 |             self.interface.client.chat.completions.create = mock_server_error
311 |             result = self.interface.parse_entity_from_text(self.text_description)
312 |             self.assertFalse(result['success'])
313 |             self.assertTrue(
314 |                 'server error' in result.get('message', '').lower(),
315 |                 "Error message should indicate server error"
316 |             )
317 | 
318 |             # Test invalid response structure
319 |             class MockResponse:
320 |                 def __init__(self):
321 |                     self.choices = []
322 | 
323 |             def mock_invalid_response(*args, **kwargs):
324 |                 return MockResponse()
325 | 
326 |             self.interface.client.chat.completions.create = mock_invalid_response
327 |             result = self.interface.parse_entity_from_text(self.text_description)
328 |             self.assertFalse(result['success'])
329 |             self.assertTrue(
330 |                 any(term in result.get('message', '').lower()
331 |                     for term in ['invalid', 'error']),
332 |                 "Error message should indicate invalid response"
333 |             )
334 | 
335 |         finally:
336 |             self.interface.client = original_client
337 | 
338 | if __name__ == '__main__':
339 |     unittest.main(verbosity=2)
--------------------------------------------------------------------------------
/tests/test_rag.py:
--------------------------------------------------------------------------------
1 | import json
2 | import logging
3 | import os
4 | from hawkinsdb import HawkinsDB, LLMInterface
5 | from openai import OpenAI
6 | 
7 | # OPENAI_API_KEY must be set in the environment before running these tests.
8 | 
9 | 
10 | logging.basicConfig(level=logging.INFO)
11 | logger = logging.getLogger(__name__)
12 | 
13 | class HawkinsWrapper:
14 |     def __init__(self):
15 |         """Initialize HawkinsDB and its LLM interface."""
16 |         self.db = HawkinsDB(storage_type='sqlite')
17 |         self.llm_interface = LLMInterface(self.db, auto_enrich=True)
18 |         self.client = OpenAI()  # For pre-processing text
19 | 
20 |     def preprocess_text(self, text):
21 |         """Preprocess text to ensure proper entity structure."""
22 |         system_prompt = """Convert the text into a structured entity format, i.e. JSON. Follow these rules strictly:
23 | 
24 | 1. ALWAYS include a clear, unique entity name using underscores (e.g., Python_Language, First_Python_Project)
25 | 2. ALWAYS include a column type (Semantic, Episodic, or Procedural)
26 | 3. Ensure all required fields are present
27 | 
28 | Required format:
29 | {
30 |     "name": "Entity_Name",    // This is REQUIRED
31 |     "column": "Semantic",     // One of: Semantic, Episodic, Procedural
32 |     "type": "category_type",
33 |     "properties": {
34 |         "key1": "value1",
35 |         "key2": ["value2a", "value2b"]
36 |     },
37 |     "relationships": {
38 |         "related_to": ["entity1", "entity2"]
39 |     }
40 | }
41 | 
42 | Extract meaningful details and ensure the name field is properly set."""
43 | 
44 |         try:
45 |             response = self.client.chat.completions.create(
46 |                 model="gpt-3.5-turbo-1106",
47 |                 messages=[
48 |                     {"role": "system", "content": system_prompt},
49 |                     {"role": "user", "content": text}
50 |                 ],
51 |                 temperature=0.3,
52 |                 response_format={"type": "json_object"}
53 |             )
54 | 
55 |             result = json.loads(response.choices[0].message.content)
56 | 
57 |             # Verify required fields
58 |             if not result.get("name"):
59 |                 raise ValueError("Missing required field: name")
60 |             if not result.get("column"):
61 |                 result["column"] = "Semantic"  # Default to Semantic if not specified
62 | 
63 |             return result
64 | 
65 |         except Exception as e:
66 |             logger.error(f"Error in preprocessing: {str(e)}")
67 |             raise
68 | 
69 |     def add_from_text(self, text):
70 |         """Add entity from text description with preprocessing."""
71 |         try:
72 |             # First preprocess the text to ensure proper structure
73 |             processed_data = self.preprocess_text(text)
74 |             logger.info(f"Preprocessed data: {json.dumps(processed_data, indent=2)}")
75 | 
76 |             # Add to database using HawkinsDB's add_entity
77 |             result = self.db.add_entity(processed_data)
78 | 
79 |             return {
80 |                 "success": True,
81 |                 "message": "Successfully added to database",
82 |                 "entity_data": processed_data,
83 |                 "db_result": result,
84 |                 "entity_name": processed_data.get("name")
85 |             }
86 | 
87 |         except Exception as e:
88 |             logger.error(f"Error adding to database: {str(e)}")
89 |             return {
90 |                 "success": False,
91 |                 "message": str(e),
92 |                 "entity_data": None,
93 |                 "entity_name": None
94 |             }
95 | 
96 |     def query_entity(self, entity_name):
97 |         """Query specific entity by name."""
98 |         try:
99 |             frames = self.db.query_frames(entity_name)
100 |             if not frames:
101 |                 return {
102 |                     "success": False,
103 |                     "message": f"No entity found with name: {entity_name}",
104 |                     "data": None
105 |                 }
106 | 
107 |             return {
108 |                 "success": True,
109 |                 "message": "Entity found",
110 |                 "data": frames
111 |             }
112 | 
113 |         except Exception as e:
114 |             logger.error(f"Error querying entity: {str(e)}")
115 |             return {
116 |                 "success": False,
117 |                 "message": str(e),
118 |                 "data": None
119 |             }
120 | 
121 |     def query_by_text(self, query_text):
122 |         """Query database using natural language text."""
123 |         try:
124 |             result = self.llm_interface.query(query_text)
125 |             return result
126 | 
127 |         except Exception as e:
128 |             logger.error(f"Error processing query: {str(e)}")
129 |             return {
130 |                 "success": False,
131 |                 "message": str(e),
132 |                 "response": None
133 |             }
134 | 
135 |     def list_all_entities(self):
136 |         """List all entities in the database."""
137 |         try:
138 |             entities = self.db.list_entities()
139 |             return {
140 |                 "success": True,
141 |                 "message": "Entities retrieved successfully",
142 |                 "entities": entities
143 |             }
144 |         except Exception as e:
145 |             logger.error(f"Error listing entities: {str(e)}")
146 |             return {
147 |                 "success": False,
148 |                 "message": str(e),
149 |                 "entities": None
150 |             }
151 | 
152 | def test_memory_examples():
153 |     """Test function to demonstrate usage."""
154 |     hawkins = HawkinsWrapper()
155 | 
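    # Each raw-text example below should be normalized by preprocess_text into
    # the entity schema from the system prompt before storage. Illustratively,
    # the first example might come back as something like (actual model output
    # varies):
    # {
    #     "name": "Python_Language",
    #     "column": "Semantic",
    #     "type": "Programming_Language",
    #     "properties": {"created_by": "Guido van Rossum", "year": "1991"},
    #     "relationships": {"similar_to": ["Ruby", "JavaScript"]}
    # }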
156 |     # Test adding entries
157 |     examples = [
158 |         """
159 |         Python is a programming language created by Guido van Rossum in 1991.
160 |         It supports object-oriented, imperative, and functional programming.
161 |         It's commonly used for web development, data science, and automation.
162 |         Similar languages include Ruby and JavaScript.
163 |         """,
164 |         """
165 |         Today I completed my first Python project in my home office.
166 |         It took 2 hours and was successful. I did a code review afterwards.
167 |         The project helped me learn about functions and classes.
168 |         """,
169 |         """
170 |         The Tesla Model 3 is red, made in 2023, and parked in the garage.
171 |         It has a range of 358 miles and goes 0-60 mph in 3.1 seconds.
172 |         It features autopilot and a minimalist interior design.
173 |         """,
174 |         """
175 |         Visual Studio Code (VS Code) is a popular code editor developed by Microsoft.
176 |         It was first released in 2015 and is written in TypeScript and JavaScript.
177 |         It supports multiple programming languages through extensions, has integrated
178 |         Git control, and features intelligent code completion. It's commonly used
179 |         alongside Python, JavaScript, and Java development environments.
180 |         """,
181 |         """
182 |         C++ is a beautiful programming language
183 |         """
184 |     ]
185 | 
186 |     # Add examples to database
187 |     logger.info("\nAdding examples to database:")
188 |     for i, example in enumerate(examples, 1):
189 |         logger.info(f"\nAdding Example {i}")
190 |         logger.info("=" * 50)
191 |         logger.info(f"Input Text:\n{example}")
192 |         result = hawkins.add_from_text(example)
193 |         logger.info(f"Result: {json.dumps(result, indent=2)}")
194 | 
195 |     # List all entities
196 |     logger.info("\nListing all entities:")
197 |     entities_result = hawkins.list_all_entities()
198 |     logger.info(f"Entities: {json.dumps(entities_result, indent=2)}")
199 | 
200 |     # Test natural language queries
201 |     test_queries = [
202 |         "Which car has a range of 358 miles and goes 0-60 mph in 3.1 seconds?"
203 |     ]
204 | 
205 |     logger.info("\nTesting natural language queries:")
206 |     for query in test_queries:
207 |         logger.info(f"\nQuery: {query}")
208 |         result = hawkins.query_by_text(query)
209 |         logger.info(f"Response: {json.dumps(result, indent=2)}")
210 | 
211 | if __name__ == "__main__":
212 |     test_memory_examples()
--------------------------------------------------------------------------------
/tests/test_readme_examples.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import time
3 | from hawkinsdb import HawkinsDB, LLMInterface
4 | import os
5 | 
6 | 
7 | 
8 | logging.basicConfig(level=logging.INFO)
9 | logger = logging.getLogger(__name__)
10 | 
11 | def test_readme_examples():
12 |     # Initialize database
13 |     db = HawkinsDB(storage_type='sqlite')
14 | 
15 |     logger.info("\n=== Testing Semantic Memory Example ===")
16 |     # Test semantic memory
17 |     semantic_memory = {
18 |         "column": "Semantic",
19 |         "name": "Python_Language",
20 |         "properties": {
21 |             "type": "Programming_Language",
22 |             "paradigm": ["Object-oriented", "Imperative", "Functional"],
23 |             "created_by": "Guido van Rossum",
24 |             "year": 1991
25 |         },
26 |         "relationships": {
27 |             "used_for": ["Web Development", "Data Science", "Automation"],
28 |             "similar_to": ["Ruby", "JavaScript"]
29 |         }
30 |     }
31 | 
32 |     result = db.add_entity(semantic_memory)
33 |     logger.info(f"Semantic memory add result: {result}")
34 |     frames = db.query_frames("Python_Language")
35 |     logger.info(f"Semantic memory query result: {frames}")
36 | 
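    # Episodic entries describe events, so the next example carries
    # event-specific properties (timestamp, location, outcome, participants)
    # on top of the same name/column/properties/relationships schema.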
37 |     logger.info("\n=== Testing Episodic Memory Example ===")
38 |     # Test episodic memory
39 |     episodic_memory = {
40 |         "column": "Episodic",
41 |         "type": "Event",
42 |         "name": "First_Python_Project",
43 |         "properties": {
44 |             "location": "Home Office",
45 |             "duration": "2 hours",
46 |             "outcome": "Success",
47 |             "timestamp": time.time(),
48 |             "action": "Completed project",
49 |             "participants": ["User1"]
50 |         },
51 |         "relationships": {
52 |             "related_to": ["Python_Language"],
53 |             "followed_by": ["Code_Review"]
54 |         }
55 |     }
56 | 
57 |     result = db.add_entity(episodic_memory)
58 |     logger.info(f"Episodic memory add result: {result}")
59 |     frames = db.query_frames("First_Python_Project")
60 |     logger.info(f"Episodic memory query result: {frames}")
61 | 
62 |     logger.info("\n=== Testing Procedural Memory Example ===")
63 |     # Test procedural memory
64 |     procedural_memory = {
65 |         "column": "Procedural",
66 |         "type": "Procedure",
67 |         "name": "Git_Commit_Process",
68 |         "properties": {
69 |             "difficulty": "Beginner",
70 |             "required_tools": ["Git"],
71 |             "estimated_time": "5 minutes",
72 |             "steps": [
73 |                 "Stage changes using git add",
74 |                 "Review changes with git status",
75 |                 "Commit with descriptive message",
76 |                 "Push to remote repository"
77 |             ]
78 |         },
79 |         "relationships": {
80 |             "prerequisites": ["Git_Installation"],
81 |             "followed_by": ["Git_Push_Process"]
82 |         }
83 |     }
84 | 
85 |     result = db.add_entity(procedural_memory)
86 |     logger.info(f"Procedural memory add result: {result}")
87 |     frames = db.query_frames("Git_Commit_Process")
88 |     logger.info(f"Procedural memory query result: {frames}")
89 | 
90 |     logger.info("\n=== Testing LLM Interface Example ===")
91 |     # Test LLM interface
92 |     interface = LLMInterface(db)
93 | 
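    # The calls below mirror the README's natural-language usage:
    # add_from_text parses free text into an entity, and query answers
    # questions against whatever has been stored.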
94 |     # Test natural language addition
95 |     nl_result = interface.add_from_text("""
96 |         The Tesla Model 3 is a battery electric sedan car manufactured by Tesla.
97 |         It has a red exterior color, was manufactured in 2023, and is currently
98 |         located in the garage. It has an estimated range of 358 miles and
99 |         accelerates from 0 to 60 mph in 3.1 seconds.
100 |     """)
101 |     logger.info(f"LLM interface add result: {nl_result}")
102 | 
103 |     # Test natural language query
104 |     query_result = interface.query("Explain about First Python Project")
105 |     logger.info(f"LLM interface query result: {query_result}")
106 | 
107 | if __name__ == "__main__":
108 |     test_readme_examples()
109 | 
--------------------------------------------------------------------------------
/tests/test_sqlite_storage.py:
--------------------------------------------------------------------------------
1 | """Test suite for SQLite storage backend."""
2 | import os
3 | import unittest
4 | import tempfile
5 | from datetime import datetime
6 | from typing import Optional, Sequence, cast, Type, TypeVar
7 | 
8 | from hawkinsdb.storage.sqlite import SQLiteStorage
9 | from hawkinsdb.types import CorticalColumn, ReferenceFrame, PropertyCandidate
10 | from hawkinsdb.base import BaseCorticalColumn
11 | 
12 | # Type variable for CorticalColumn
13 | T_CorticalColumn = TypeVar('T_CorticalColumn', bound=BaseCorticalColumn)
14 | 
15 | class TestSQLiteStorage(unittest.TestCase):
16 |     """Test cases for SQLite storage backend."""
17 | 
18 |     def setUp(self):
19 |         """Set up test environment with temporary database."""
20 |         # Use temporary file for testing
21 |         self.temp_dir = tempfile.mkdtemp()
22 |         self.db_path = os.path.join(self.temp_dir, "test_hawkins.db")
23 |         self.storage = SQLiteStorage(db_path=self.db_path)
24 |         self.storage.initialize()
25 | 
26 |     def tearDown(self):
27 |         """Clean up test environment."""
28 |         self.storage.cleanup()
29 |         if os.path.exists(self.db_path):
30 |             os.remove(self.db_path)
31 |         os.rmdir(self.temp_dir)
32 | 
33 |     def test_initialize_and_cleanup(self):
34 |         """Test database initialization and cleanup."""
35 |         self.assertTrue(os.path.exists(self.db_path))
36 |         self.storage.cleanup()
37 | 
38 |     def test_save_and_load_columns(self):
39 |         """Test saving and loading columns with various data types."""
40 |         # Create test data
41 |         test_time = datetime.now().isoformat()
42 |         test_columns: Sequence[T_CorticalColumn] = cast(
43 |             Sequence[T_CorticalColumn],
44 |             [
45 |                 CorticalColumn(
46 |                     name="test_column",
47 |                     frames=[
48 |                         ReferenceFrame(
49 |                             name="test_frame",
50 |                             properties={
51 |                                 "color": [PropertyCandidate(value="red", confidence=0.9)],
52 |                                 "size": [PropertyCandidate(value=42, confidence=1.0)]
53 |                             },
54 |                             relationships={
55 |                                 "contains": [PropertyCandidate(value="item", confidence=0.8)]
56 |                             },
57 |                             location={"x": 0, "y": 0},
58 |                             history=[(test_time, "created"), (test_time, "updated")]
59 |                         )
60 |                     ]
61 |                 )
62 |             ]
63 |         )
64 | 
65 |         # Save columns
66 |         self.storage.save_columns(test_columns)
67 | 
68 |         # Load columns
69 |         loaded_columns = self.storage.load_columns()
70 | 
71 |         # Verify data
72 |         self.assertEqual(len(loaded_columns), 1)
73 |         self.assertEqual(loaded_columns[0].name, "test_column")
74 |         self.assertEqual(len(loaded_columns[0].frames), 1)
75 | 
76 |         loaded_frame = loaded_columns[0].frames[0]
77 |         self.assertEqual(loaded_frame.name, "test_frame")
78 |         self.assertEqual(loaded_frame.properties["color"][0].value, "red")
79 |         self.assertEqual(loaded_frame.properties["size"][0].value, 42)
80 |         self.assertEqual(loaded_frame.relationships["contains"][0].value, "item")
81 |         self.assertEqual(loaded_frame.location, {"x": 0, "y": 0})
82 |         self.assertEqual(loaded_frame.history, [(test_time, "created"), (test_time, "updated")])
83 | 
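    # The round-trip assertions above compare tuples and nested objects for
    # exact equality, so the backend is expected to restore value types
    # faithfully rather than returning JSON-decoded approximations.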
84 |     def test_error_handling(self):
85 |         """Test error handling for invalid operations."""
86 |         # Test saving invalid data (empty column name)
87 |         with self.assertRaises(ValueError):
88 |             invalid_columns: Sequence[T_CorticalColumn] = cast(
89 |                 Sequence[T_CorticalColumn],
90 |                 [CorticalColumn(name="", frames=[])]
91 |             )
92 |             self.storage.save_columns(invalid_columns)
93 | 
94 |         # Test empty database path
95 |         with self.assertRaises(ValueError) as cm:
96 |             SQLiteStorage(db_path="")
97 |         self.assertIn("Invalid database path", str(cm.exception))
98 | 
99 |         # Test non-existent directory creation
100 |         with tempfile.TemporaryDirectory() as temp_dir:
101 |             new_dir = os.path.join(temp_dir, "newdir")
102 |             db_path = os.path.join(new_dir, "test.db")
103 |             storage = SQLiteStorage(db_path=db_path)
104 |             self.assertTrue(os.path.exists(new_dir))
105 |             storage.cleanup()
106 | 
107 |         # Test invalid directory permissions
108 |         if os.name != 'nt':  # Skip on Windows
109 |             with tempfile.TemporaryDirectory() as temp_dir:
110 |                 # Create a read-only directory
111 |                 read_only_dir = os.path.join(temp_dir, "readonly")
112 |                 os.makedirs(read_only_dir)
113 |                 os.chmod(read_only_dir, 0o555)  # Read + execute only
114 | 
115 |                 db_path = os.path.join(read_only_dir, "test.db")
116 |                 with self.assertRaises(ValueError) as cm:
117 |                     SQLiteStorage(db_path=db_path)
118 |                 self.assertIn("write", str(cm.exception).lower())
119 | 
120 | if __name__ == '__main__':
121 |     unittest.main()
--------------------------------------------------------------------------------