├── .env_sample
├── .gitignore
├── README.md
├── groquments.py
├── requirements.txt
└── src
    ├── Groquments.md
    └── custom_instructions.txt


/.env_sample:
--------------------------------------------------------------------------------
GROQ_API_KEY=[your key here]
GROQ_MODEL=llama3-8b-8192
MAX_TOKENS=1024
TEMPERATURE=0.7
INPUT_DIR=input_documents
OUTPUT_DIR=output_documents


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Python
__pycache__/
*.py[cod]
*$py.class

# Virtual environment
venv/
env/
.env

# IDEs
.vscode/
.idea/

# Project-specific
output_documents/
input_documents/

# Miscellaneous
.DS_Store
Thumbs.db


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Groquments
![image](https://github.com/user-attachments/assets/a7e50e52-b8e9-4198-a2b2-586f693e2316)

Groquments is a simple demonstration project showcasing how easily PocketGroq can help developers integrate Groq's powerful AI capabilities into their Python projects. This project provides a basic implementation of an AI-powered document field mapping tool.

**IMPORTANT**: Groquments is a demonstration project and is not intended for production use. It serves as an example of PocketGroq integration and should be used for learning and experimentation purposes only.
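
To make "field mapping" concrete, here is a minimal sketch. It assumes each document has already been reduced to a dictionary of labels and values, and that a mapping from target labels to source labels is available. In Groquments that mapping comes from the model; here it is hard-coded. The names `apply_mapping`, `source`, `target`, and `mapping` are illustrative and do not appear in the project code.

```python
from typing import Dict


def apply_mapping(source: Dict[str, str],
                  target: Dict[str, str],
                  mapping: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `target` filled with values looked up in `source`."""
    filled = dict(target)
    for target_label, source_label in mapping.items():
        # A label with no match (empty string or unknown label) stays blank.
        filled[target_label] = source.get(source_label, "")
    return filled


# Hypothetical sample data, for illustration only.
source = {"Full Name": "Ada Lovelace", "E-mail": "ada@example.com"}
target = {"Name": "", "Email Address": "", "Phone": ""}
mapping = {"Name": "Full Name", "Email Address": "E-mail", "Phone": ""}

print(apply_mapping(source, target, mapping))
```

The sketch returns a new dictionary rather than modifying `target`, which keeps the original target structure available for comparison.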

## Features

- Read source and target .docx documents
- Use AI to intelligently map fields between documents
- Generate a JavaScript representation of the source document
- Fill out the target document based on the AI-generated field mapping
- Save the filled document as a new .docx file

## Prerequisites

- Python 3.7 or higher
- A Groq API key (sign up at [https://console.groq.com](https://console.groq.com) to obtain one)

## Setup

1. Clone the repository:
   ```
   git clone https://github.com/yourusername/groquments.git
   cd groquments
   ```

2. Create and activate a virtual environment:
   ```
   python -m venv venv
   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
   ```

3. Install the required packages:
   ```
   pip install -r requirements.txt
   ```

4. Set up your environment variables:
   - Rename `.env_sample` to `.env`
   - Open `.env` and replace `[your key here]` with your actual Groq API key

## Usage

1. Prepare your source and target .docx documents:
   - Place them in the `input_documents` directory (create it if it doesn't exist)
   - Ensure both documents have a clear structure with labeled fields

2. Run the Groquments script:
   ```
   python groquments.py
   ```

3. Use the file dialogs that open to select your source and target documents

4. The script will process the documents and save the filled target document in the `output_documents` directory

## How It Works

1. Groquments reads the source and target documents
2. It generates a JavaScript representation of the source document
3. The script uses the Groq AI model (via PocketGroq) to intelligently map fields between the documents
4. Based on this mapping, it fills out the target document
5. Finally, it saves the filled document as a new .docx file

## Limitations

- This is a demonstration project and may not handle all document structures or field types
- The AI-based field mapping is experimental and may not always produce perfect results
- Error handling is basic and may not cover all edge cases

## Contributing

As this is a demonstration project, we're not actively seeking contributions. However, feel free to fork the repository and experiment with your own enhancements!

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- This project uses [PocketGroq](https://github.com/jgravelle/pocketgroq) to integrate Groq's AI capabilities
- Document handling is powered by the [python-docx](https://python-docx.readthedocs.io/) library

Remember, Groquments is a simple demo to illustrate PocketGroq integration. For production use, always implement proper error handling, security measures, and thorough testing.


--------------------------------------------------------------------------------
/groquments.py:
--------------------------------------------------------------------------------
import json
import os
import sys
import re
from pocketgroq import GroqProvider
from typing import Dict, Any
from docx import Document
import tkinter as tk
from tkinter import filedialog
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

class Groquments:
    def __init__(self):
        self.groq_provider = GroqProvider(api_key=os.getenv("GROQ_API_KEY"))
        self.model = os.getenv("GROQ_MODEL", "llama3-8b-8192")
        self.max_tokens = int(os.getenv("MAX_TOKENS", "1024"))
        self.temperature = float(os.getenv("TEMPERATURE", "0.7"))
        self.source_document = {}
        self.target_document = {}
        self.input_dir = os.getenv("INPUT_DIR", "input_documents")
        self.output_dir = os.getenv("OUTPUT_DIR", "output_documents")

    def read_docx(self, file_path: str) -> Dict[str, str]:
        doc = Document(file_path)
        document_content = {}
        for paragraph in doc.paragraphs:
            text = paragraph.text.strip()
            if ':' in text:
                label, value = text.split(':', 1)
                document_content[label.strip()] = value.strip()
            elif text and not text.endswith(':'):
                document_content[text] = ""
        return document_content

    def create_js_representation(self) -> str:
        js_object = json.dumps(self.source_document, indent=2)
        return f"const sourceDocument = {js_object};"

    def display_js_representation(self):
        print("\nJavaScript representation of the source document:")
        print(self.create_js_representation())

    def extract_json_from_text(self, text: str) -> str:
        json_match = re.search(r'\{[\s\S]*\}', text)
        return json_match.group(0) if json_match else "{}"

    def ai_field_mapping(self) -> Dict[str, str]:
        prompt = f"""
        Given the following source document structure:
        {json.dumps(self.source_document, indent=2)}

        And the following target document structure:
        {json.dumps(self.target_document, indent=2)}

        Please map the fields from the source document to the target document.
        Return your answer as a JSON object where the keys are the target document fields
        and the values are the corresponding source document fields.
        If there's no clear match, use an empty string as the value.
        Only return the JSON object, without any additional explanation.
        """

        response = self.groq_provider.generate(
            prompt,
            model=self.model,
            max_tokens=self.max_tokens,
            temperature=self.temperature
        )

        print("AI response:", response)  # Print the full AI response for debugging

        json_str = self.extract_json_from_text(response)

        try:
            field_mapping = json.loads(json_str)
            return field_mapping
        except json.JSONDecodeError:
            print("Error: Unable to parse AI response. Using fallback mapping.")
            return {key: "" for key in self.target_document.keys()}

    def fill_target_document(self):
        field_mapping = self.ai_field_mapping()

        for target_key, source_key in field_mapping.items():
            if source_key in self.source_document:
                self.target_document[target_key] = self.source_document[source_key]
            else:
                self.target_document[target_key] = ""

    def save_filled_document(self, output_path: str):
        doc = Document()
        for key, value in self.target_document.items():
            doc.add_paragraph(f"{key}: {value}")
        doc.save(output_path)

    def run(self):
        print("Welcome to Groquments!")

        root = tk.Tk()
        root.withdraw()

        source_file = filedialog.askopenfilename(title="Select source .docx file",
                                                 filetypes=[("Word Document", "*.docx")],
                                                 initialdir=self.input_dir)
        if not source_file:
            print("No source file selected. Exiting.")
            sys.exit()

        target_file = filedialog.askopenfilename(title="Select target .docx file",
                                                 filetypes=[("Word Document", "*.docx")],
                                                 initialdir=self.input_dir)
        if not target_file:
            print("No target file selected. Exiting.")
            sys.exit()

        self.source_document = self.read_docx(source_file)
        self.display_js_representation()

        self.target_document = self.read_docx(target_file)

        print("\nFilling out the target document...")
        self.fill_target_document()

        print("\nFilled target document:")
        print(json.dumps(self.target_document, indent=2))

        os.makedirs(self.output_dir, exist_ok=True)
        output_file = os.path.join(self.output_dir, "filled_" + os.path.basename(target_file))
        self.save_filled_document(output_file)
        print(f"\nFilled document saved as: {output_file}")

if __name__ == "__main__":
    groquments = Groquments()
    groquments.run()


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
pocketgroq==0.3.0
python-docx==1.1.0
python-dotenv==1.0.0
bs4


--------------------------------------------------------------------------------
/src/Groquments.md:
--------------------------------------------------------------------------------
# groquments.py

```python
import json
import os
import sys
import re
from pocketgroq import GroqProvider
from typing import Dict, Any
from docx import Document
import tkinter as tk
from tkinter import filedialog
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

class Groquments:
    def __init__(self):
        self.groq_provider = GroqProvider(api_key=os.getenv("GROQ_API_KEY"))
        self.model = os.getenv("GROQ_MODEL", "llama3-8b-8192")
        self.max_tokens = int(os.getenv("MAX_TOKENS", "1024"))
        self.temperature = float(os.getenv("TEMPERATURE", "0.7"))
        self.source_document = {}
        self.target_document = {}
        self.input_dir = os.getenv("INPUT_DIR", "input_documents")
        self.output_dir = os.getenv("OUTPUT_DIR", "output_documents")

    def read_docx(self, file_path: str) -> Dict[str, str]:
        doc = Document(file_path)
        document_content = {}
        for paragraph in doc.paragraphs:
            text = paragraph.text.strip()
            if ':' in text:
                label, value = text.split(':', 1)
                document_content[label.strip()] = value.strip()
            elif text and not text.endswith(':'):
                document_content[text] = ""
        return document_content

    def create_js_representation(self) -> str:
        js_object = json.dumps(self.source_document, indent=2)
        return f"const sourceDocument = {js_object};"

    def display_js_representation(self):
        print("\nJavaScript representation of the source document:")
        print(self.create_js_representation())

    def extract_json_from_text(self, text: str) -> str:
        json_match = re.search(r'\{[\s\S]*\}', text)
        return json_match.group(0) if json_match else "{}"

    def ai_field_mapping(self) -> Dict[str, str]:
        prompt = f"""
        Given the following source document structure:
        {json.dumps(self.source_document, indent=2)}

        And the following target document structure:
        {json.dumps(self.target_document, indent=2)}

        Please map the fields from the source document to the target document.
        Return your answer as a JSON object where the keys are the target document fields
        and the values are the corresponding source document fields.
        If there's no clear match, use an empty string as the value.
        Only return the JSON object, without any additional explanation.
        """

        response = self.groq_provider.generate(
            prompt,
            model=self.model,
            max_tokens=self.max_tokens,
            temperature=self.temperature
        )

        print("AI response:", response)  # Print the full AI response for debugging

        json_str = self.extract_json_from_text(response)

        try:
            field_mapping = json.loads(json_str)
            return field_mapping
        except json.JSONDecodeError:
            print("Error: Unable to parse AI response. Using fallback mapping.")
            return {key: "" for key in self.target_document.keys()}

    def fill_target_document(self):
        field_mapping = self.ai_field_mapping()

        for target_key, source_key in field_mapping.items():
            if source_key in self.source_document:
                self.target_document[target_key] = self.source_document[source_key]
            else:
                self.target_document[target_key] = ""

    def save_filled_document(self, output_path: str):
        doc = Document()
        for key, value in self.target_document.items():
            doc.add_paragraph(f"{key}: {value}")
        doc.save(output_path)

    def run(self):
        print("Welcome to Groquments!")

        root = tk.Tk()
        root.withdraw()

        source_file = filedialog.askopenfilename(title="Select source .docx file",
                                                 filetypes=[("Word Document", "*.docx")],
                                                 initialdir=self.input_dir)
        if not source_file:
            print("No source file selected. Exiting.")
            sys.exit()

        target_file = filedialog.askopenfilename(title="Select target .docx file",
                                                 filetypes=[("Word Document", "*.docx")],
                                                 initialdir=self.input_dir)
        if not target_file:
            print("No target file selected. Exiting.")
            sys.exit()

        self.source_document = self.read_docx(source_file)
        self.display_js_representation()

        self.target_document = self.read_docx(target_file)

        print("\nFilling out the target document...")
        self.fill_target_document()

        print("\nFilled target document:")
        print(json.dumps(self.target_document, indent=2))

        os.makedirs(self.output_dir, exist_ok=True)
        output_file = os.path.join(self.output_dir, "filled_" + os.path.basename(target_file))
        self.save_filled_document(output_file)
        print(f"\nFilled document saved as: {output_file}")

if __name__ == "__main__":
    groquments = Groquments()
    groquments.run()
```


--------------------------------------------------------------------------------
/src/custom_instructions.txt:
--------------------------------------------------------------------------------
Groquments is a simple demonstration project showcasing how easily PocketGroq can help developers integrate Groq's powerful AI capabilities into their Python projects. This project provides a basic implementation of an AI-powered document field mapping tool.

Please act as an expert Python programmer and software engineer. The attached Groquments.md file contains the complete and up-to-date codebase for our application. Your task is to thoroughly analyze the codebase, understand its programming flow and logic, and provide detailed insights, suggestions, and solutions to enhance the application's performance, efficiency, readability, and maintainability.

We highly value responses that demonstrate a deep understanding of the code. Please ensure your recommendations are thoughtful, well-analyzed, and contribute positively to the project's success. Your expertise is crucial in helping us improve and upgrade our application.

--------------------------------------------------------------------------------
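
One place where the codebase above could plausibly be hardened is `extract_json_from_text`. Its pattern `\{[\s\S]*\}` is greedy, so it captures everything from the first `{` to the last `}` in the reply. If the model adds prose containing braces after the JSON object, `json.loads` fails and the blank fallback mapping is used. The sketch below shows one possible alternative built on the standard library's `json.JSONDecoder.raw_decode`. The function name `extract_first_json_object` and the sample reply are hypothetical and are not part of the project.

```python
import json
from typing import Any, Dict


def extract_first_json_object(text: str) -> Dict[str, Any]:
    """Return the first parseable JSON object found in `text`, or {}."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses a single JSON value beginning at `start`
            # and ignores any text that follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not a valid object here; try the next opening brace.
        start = text.find("{", start + 1)
    return {}


# Hypothetical model reply with trailing prose that also contains braces.
reply = 'Here is the mapping: {"Name": "Full Name", "Phone": ""} (an empty {} means no match)'
print(extract_first_json_object(reply))
```

Unlike the existing helper, this sketch returns a parsed dictionary rather than a string, so a caller adopting it would skip the separate `json.loads` step.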