├── .gitignore ├── LICENSE ├── Modelfile ├── README.md ├── categorizer.py ├── commands_and_scripts.md ├── data ├── BOI.pdf └── grocery_list.txt ├── final-rag-voice.py ├── flower_1.png ├── function-calling.py ├── pdf-rag-clean.py ├── pdf-rag-streamlit.py ├── pdf-rag.py ├── requirements.txt ├── start-1.py └── start-2.py /.gitignore: -------------------------------------------------------------------------------- 1 | # ignore .env file 2 | .env 3 | 4 | # ignore the venv folder 5 | venv/ 6 | 7 | # ignore the .vscode folder 8 | .vscode/ 9 | 10 | # ignore the .idea folder 11 | .idea/ 12 | 13 | #ignore the chroma_db folder 14 | chroma_db/ 15 | 16 | # ignore the db folder 17 | db/ 18 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Packt 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Modelfile: -------------------------------------------------------------------------------- 1 | FROM llama3.2 2 | 3 | # set the temperature to 1 [higher is more creative, lower is more coherent] 4 | PARAMETER temperature 0.3 5 | 6 | SYSTEM """ 7 | You are James, a very smart assistant who answers 8 | questions succintly and informatively. 9 | 10 | """ -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Harnessing-Ollama---Create-Secure-Local-LLM-Solutions-with-Python 2 | Harnessing Ollama - Create Secure Local LLM Solutions with Python, Published by Packt Publishing 3 | 4 | 5 | 6 | # Welcome to The AI Guild 🚀 7 | 8 | **This code is a part of a module in our vibrant AI community 🚀[Join the AI Guild Community](https://bit.ly/ai-guild-join), where like-minded entrepreneurs and programmers come together to build real-world AI-based solutions.** 9 | 10 | ### What is The AI Guild? 11 | The AI Guild is a collaborative community designed for developers, tech enthusiasts, and entrepreneurs who want to **build practical AI tools** and solutions. Whether you’re just starting or looking to level up your skills, this is the place to dive deeper into AI in a supportive, hands-on environment. 12 | 13 | ### Why Join Us? 
14 | - **Collaborate with Like-Minded Builders**: Work alongside a community of individuals passionate about AI, sharing ideas and solving real-world problems together. 15 | - **Access to Exclusive Resources**: Gain entry to our Code & Template Vault, a collection of ready-to-use code snippets, templates, and AI projects. 16 | - **Guided Learning Paths**: Follow structured paths, from AI Basics for Builders to advanced classes like AI Solutions Lab, designed to help you apply your knowledge. 17 | - **Weekly Live Calls & Q&A**: Get direct support, feedback, and guidance during live sessions with the community. 18 | - **Real-World AI Projects**: Work on projects that make an impact, learn from others, and showcase your work. 19 | 20 | ### Success Stories 21 | Here’s what some of our members are saying: 22 | - **"Joining The AI Guild has accelerated my learning. I’ve already built my first AI chatbot with the help of the community!"** 23 | - **"The live calls and feedback have been game-changers. I’ve implemented AI automation in my business, saving hours each week."** 24 | 25 | ### Who is This For? 26 | If you’re eager to: 27 | - Build AI tools that solve real problems 28 | - Collaborate and learn from experienced AI practitioners 29 | - Stay up-to-date with the latest in AI development 30 | - Turn your coding skills into actionable solutions 31 | 32 | Then **The AI Guild** is the perfect fit for you. 33 | 34 | ### Frequently Asked Questions 35 | - **Q: Do I need to be an expert to join?** 36 | - **A:** Not at all! The AI Guild is designed for all skill levels, from beginners to advanced developers. 37 | - **Q: Will I get personalized support?** 38 | - **A:** Yes! You’ll have access to live Q&A sessions and direct feedback on your projects. 39 | - **Q: What kind of projects can I work on?** 40 | - **A:** You can start with small projects like chatbots and automation tools, and progress to more advanced AI solutions tailored to your interests. 41 | 42 | ### How to Get Started 43 | Want to dive deeper and get the full experience? 🚀[Join the AI Guild Community](https://bit.ly/ai-guild-join) and unlock all the benefits of our growing community. 44 | 45 | We look forward to seeing what you’ll build with us! 46 | -------------------------------------------------------------------------------- /categorizer.py: -------------------------------------------------------------------------------- 1 | import ollama 2 | import os 3 | 4 | model = "llama3.2" 5 | 6 | # Paths to input and output files 7 | input_file = "./data/grocery_list.txt" 8 | output_file = "./data/categorized_grocery_list.txt" 9 | 10 | 11 | # Check if the input file exists 12 | if not os.path.exists(input_file): 13 | print(f"Input file '{input_file}' not found.") 14 | exit(1) 15 | 16 | 17 | # Read the uncategorized grocery items from the input file 18 | with open(input_file, "r") as f: 19 | items = f.read().strip() 20 | 21 | 22 | # Prepare the prompt for the model 23 | prompt = f""" 24 | You are an assistant that categorizes and sorts grocery items. 25 | 26 | Here is a list of grocery items: 27 | 28 | {items} 29 | 30 | Please: 31 | 32 | 1. Categorize these items into appropriate categories such as Produce, Dairy, Meat, Bakery, Beverages, etc. 33 | 2. Sort the items alphabetically within each category. 34 | 3. Present the categorized list in a clear and organized manner, using bullet points or numbering. 
35 | 36 | """ 37 | 38 | 39 | # Send the prompt and get the response 40 | try: 41 | response = ollama.generate(model=model, prompt=prompt) 42 | generated_text = response.get("response", "") 43 | print("==== Categorized List: ===== \n") 44 | print(generated_text) 45 | 46 | # Write the categorized list to the output file 47 | with open(output_file, "w") as f: 48 | f.write(generated_text.strip()) 49 | 50 | print(f"Categorized grocery list has been saved to '{output_file}'.") 51 | except Exception as e: 52 | print("An error occurred:", str(e)) 53 | -------------------------------------------------------------------------------- /commands_and_scripts.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # From: https://github.com/ollama/ollama/blob/main/README.md#quickstart 4 | 5 | ollama create mario -f ./Modelfile 6 | ollama run mario 7 | 8 | >>> hi 9 | >>> Hello! It's your friend Mario. 10 | 11 | # Pull a model 12 | 13 | ollama pull llama3.2 14 | 15 | # Remove a model 16 | 17 | ollama rm llama3.2 18 | 19 | # Multimodal models -- this will load the llava model ~4.7GB 20 | 21 | ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png" 22 | The image features a yellow smiley face, which is likely the central focus of the picture. 23 | 24 | # Show model info 25 | 26 | ollama show llama3.2 27 | 28 | # List models on your computer 29 | 30 | ollama list 31 | 32 | # List which models are currently loaded 33 | 34 | ollama ps 35 | 36 | # Stop a model which is currently running 37 | 38 | ollama stop llama3.2 39 | 40 | ===================== 41 | 42 | # Open WebUI - Installation: https://github.com/open-webui/open-webui?tab=readme-ov-file#-also-check-out-open-webui-community 43 | 44 | # Install -- show them this, but we will use msty.app for simplicity 45 | 46 | pip install open-webui 47 | 48 | ======================= 49 | 50 | # REST API 51 | 52 | Ollama has a REST API for running and managing models: 53 | 54 | # Generate a response (this will stream each word...): 55 | 56 | curl http://localhost:11434/api/generate -d '{ 57 | "model": "llama3.2", 58 | "prompt":"Why is the sky blue?" 59 | }' 60 | 61 | # Add "stream": false to just get the full result: 62 | 63 | curl http://localhost:11434/api/generate -d '{ 64 | "model": "llama3.2", 65 | "prompt":"tell me a fun fact about Portugal", 66 | "stream": false 67 | }' 68 | 69 | # Chat with a model 70 | 71 | curl http://localhost:11434/api/chat -d '{ 72 | "model": "llama3.2", 73 | "messages": [ 74 | { "role": "user", "content": "tell me a fun fact about Mozambique" } 75 | ], 76 | "stream":false 77 | }' 78 | 79 | # Request JSON Mode: 80 | 81 | curl http://localhost:11434/api/generate -d '{ 82 | "model": "llama3.2", 83 | "prompt": "What color is the sky at different times of the day?
Respond using JSON", 84 | "format": "json", 85 | "stream": false 86 | }' 87 | 88 | # More REST API Documentation: https://github.com/ollama/ollama/blob/main/docs/api.md 89 | -------------------------------------------------------------------------------- /data/BOI.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PacktPublishing/Harnessing-Ollama---Create-Secure-Local-LLM-Solutions-with-Python/6750e4c942bb246760aa5e61ae14cde47fa6b6b1/data/BOI.pdf -------------------------------------------------------------------------------- /data/grocery_list.txt: -------------------------------------------------------------------------------- 1 | Apples 2 | Chicken Breast 3 | Milk 4 | Bread 5 | Carrots 6 | Orange Juice 7 | Eggs 8 | Spinach 9 | Yogurt 10 | Ground Beef 11 | Bananas 12 | Cheese 13 | Cereal 14 | Tomatoes 15 | Pasta 16 | Rice 17 | Butter 18 | Salmon 19 | Broccoli 20 | Almonds 21 | Potatoes 22 | Onions 23 | Lettuce 24 | Strawberries 25 | Coffee 26 | Tea 27 | Sugar 28 | Flour 29 | Olive Oil 30 | Honey 31 | Peanut Butter 32 | Jam 33 | Garlic 34 | Bell Peppers 35 | Mushrooms 36 | Shrimp 37 | Sausages 38 | Granola Bars 39 | Oatmeal 40 | Ice Cream 41 | Soda 42 | Chips 43 | Chocolate 44 | Toilet Paper 45 | Paper Towels 46 | Dish Soap 47 | Laundry Detergent 48 | Shampoo 49 | Toothpaste 50 | -------------------------------------------------------------------------------- /final-rag-voice.py: -------------------------------------------------------------------------------- 1 | import ollama 2 | import os 3 | import datetime 4 | 5 | from langchain_community.document_loaders import PDFPlumberLoader 6 | 7 | model = "llama3.2" 8 | 9 | pdf_files = [f for f in os.listdir("./data") if f.endswith(".pdf")] 10 | 11 | all_pages = [] 12 | 13 | for pdf_file in pdf_files: 14 | 15 | file_path = os.path.join("./data", pdf_file) 16 | print(f"Processing PDF file: {pdf_file}") 17 | 18 | # Load the PDF file 19 | loader = PDFPlumberLoader(file_path=file_path) 20 | pages = loader.load_and_split() 21 | print(f"pages length: {len(pages)}") 22 | 23 | all_pages.extend(pages) 24 | 25 | # Extract text from the PDF file 26 | text = pages[0].page_content 27 | print(f"Text extracted from the PDF file '{pdf_file}':\n{text}\n") 28 | 29 | # Prepare the prompt for the model 30 | prompt = f""" 31 | You are an AI assistant that helps with summarizing PDF documents. 32 | 33 | Here is the content of the PDF file '{pdf_file}': 34 | 35 | {text} 36 | 37 | Please summarize the content of this document in a few sentences. 
38 | """ 39 | 40 | # Send the prompt and get the response 41 | try: 42 | response = ollama.generate(model=model, prompt=prompt) 43 | summary = response.get("response", "") 44 | 45 | # print(f"Summary of the PDF file '{pdf_file}':\n{summary}\n") 46 | except Exception as e: 47 | print( 48 | f"An error occurred while summarizing the PDF file '{pdf_file}': {str(e)}" 49 | ) 50 | 51 | # Split and chunk 52 | from langchain.text_splitter import RecursiveCharacterTextSplitter 53 | 54 | text_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=300) 55 | 56 | text_chunks = [] 57 | for page in all_pages: 58 | chunks = text_splitter.split_text(page.page_content) 59 | text_chunks.extend(chunks) 60 | 61 | print(f"Number of text chunks: {len(text_chunks)}") 62 | 63 | 64 | # === Create Metadata for Text Chunks === 65 | # Example metadata management (customize as needed) 66 | def add_metadata(chunks, doc_title): 67 | metadata_chunks = [] 68 | for chunk in chunks: 69 | metadata = { 70 | "title": doc_title, 71 | "author": "US Business Bureau", # Update based on document data 72 | "date": str(datetime.date.today()), 73 | } 74 | metadata_chunks.append({"text": chunk, "metadata": metadata}) 75 | return metadata_chunks 76 | 77 | 78 | # add metadata to text chunks 79 | metadata_text_chunks = add_metadata(text_chunks, "BOI US FinCEN") 80 | # pprint.pprint(f"metadata text chunks: {metadata_text_chunks}") 81 | 82 | 83 | # === Create Embeddings from Text Chunks === 84 | ollama.pull("nomic-embed-text") 85 | 86 | 87 | # Function to generate embeddings for text chunks 88 | def generate_embeddings(text_chunks, model_name="nomic-embed-text"): 89 | embeddings = [] 90 | for chunk in text_chunks: 91 | # Generate the embedding for each chunk 92 | embedding = ollama.embeddings(model=model_name, prompt=chunk) 93 | embeddings.append(embedding) 94 | return embeddings 95 | 96 | 97 | # example embeddings 98 | texts = [chunk["text"] for chunk in metadata_text_chunks] 99 | embeddings = generate_embeddings(texts) 100 | # print(f"Embeddings: {embeddings}") 101 | 102 | 103 | ## === Add Embeddings to Vector Database Chromadb === 104 | from langchain_community.vectorstores import Chroma 105 | from langchain.schema import Document 106 | from langchain_ollama import OllamaEmbeddings 107 | 108 | # Wrap texts with their respective metadata into Document objects 109 | docs = [ 110 | Document(page_content=chunk["text"], metadata=chunk["metadata"]) 111 | for chunk in metadata_text_chunks 112 | ] 113 | 114 | # == Use the FastEmbed embeddings model == 115 | # to add embeddings into the vector database 116 | # for better embedding quality 117 | from langchain_community.embeddings.fastembed import FastEmbedEmbeddings 118 | 119 | fastembedding = FastEmbedEmbeddings() 120 | # Also for performance improvement, persist the vector database 121 | vector_db_path = "./db/vector_db" 122 | 123 | vector_db = Chroma.from_documents( 124 | documents=docs, 125 | embedding=fastembedding, 126 | persist_directory=vector_db_path, 127 | # embedding=OllamaEmbeddings(model="nomic-embed-text"), 128 | collection_name="docs-local-rag", 129 | ) 130 | 131 | 132 | # Implement a Query Processing Multi-query Retriever 133 | from langchain.prompts import ChatPromptTemplate, PromptTemplate 134 | from langchain_core.output_parsers import StrOutputParser 135 | from langchain_ollama import ChatOllama 136 | from langchain_core.runnables import RunnablePassthrough 137 | from langchain.retrievers.multi_query import MultiQueryRetriever 138 | 139 | # LLM from Ollama
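# The multi-query retriever configured below asks the local LLM for several rewordings of the
# user question, runs a Chroma similarity search for each rewording, and feeds the de-duplicated
# union of the retrieved chunks into the RAG prompt, so the answer is grounded in more of the
# document than a single-query search would return.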
140 | local_model = "llama3.2" 141 | llm = ChatOllama(model=local_model) 142 | 143 | QUERY_PROMPT = PromptTemplate( 144 | input_variables=["question"], 145 | template="""You are an AI language model assistant. Your task is to generate five 146 | different versions of the given user question to retrieve relevant documents from 147 | a vector database. By generating multiple perspectives on the user question, your 148 | goal is to help the user overcome some of the limitations of the distance-based 149 | similarity search. Provide these alternative questions separated by newlines. 150 | Original question: {question}""", 151 | ) 152 | 153 | 154 | retriever = MultiQueryRetriever.from_llm( 155 | vector_db.as_retriever(), llm, prompt=QUERY_PROMPT 156 | ) 157 | 158 | # RAG prompt 159 | template = """Answer the question based ONLY on the following context: 160 | {context} 161 | Question: {question} 162 | """ 163 | prompt = ChatPromptTemplate.from_template(template) 164 | 165 | 166 | chain = ( 167 | {"context": retriever, "question": RunnablePassthrough()} 168 | | prompt 169 | | llm 170 | | StrOutputParser() 171 | ) 172 | 173 | questions = """ 174 | by when should I file if my business was established in 2013?""" 175 | response = chain.invoke(questions) 176 | print(response) 177 | 178 | ## === TALK TO THE MODEL === 179 | from elevenlabs.client import ElevenLabs 180 | from elevenlabs import play 181 | from elevenlabs import stream 182 | from dotenv import load_dotenv 183 | 184 | load_dotenv() 185 | 186 | text_response = response 187 | 188 | api_key = os.getenv("ELEVENLABS_API_KEY") 189 | 190 | # Generate the audio stream 191 | client = ElevenLabs(api_key=api_key) 192 | audio_stream = client.generate(text=text_response, model="eleven_turbo_v2", stream=True) 193 | # play(audio_stream) 194 | stream(audio_stream) 195 | -------------------------------------------------------------------------------- /flower_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PacktPublishing/Harnessing-Ollama---Create-Secure-Local-LLM-Solutions-with-Python/6750e4c942bb246760aa5e61ae14cde47fa6b6b1/flower_1.png -------------------------------------------------------------------------------- /function-calling.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import asyncio 4 | import random 5 | from ollama import AsyncClient 6 | 7 | 8 | # Load the grocery list from a text file 9 | def load_grocery_list(file_path): 10 | if not os.path.exists(file_path): 11 | print(f"File {file_path} does not exist.") 12 | return [] 13 | with open(file_path, "r") as file: 14 | items = [line.strip() for line in file if line.strip()] 15 | return items 16 | 17 | 18 | # Function to fetch price and nutrition data for an item 19 | async def fetch_price_and_nutrition(item): 20 | print(f"Fetching price and nutrition data for '{item}'...") 21 | # Replace with actual API calls 22 | # For demonstration, we'll return mock data 23 | await asyncio.sleep(0.1) # Simulate network delay 24 | return { 25 | "item": item, 26 | "price": f"${random.uniform(1, 10):.2f}", 27 | "calories": f"{random.randint(50, 500)} kcal", 28 | "fat": f"{random.randint(1, 20)} g", 29 | "protein": f"{random.randint(1, 30)} g", 30 | } 31 | 32 | 33 | # Function to fetch a recipe based on a category 34 | async def fetch_recipe(category): 35 | print(f"Fetching a recipe for the '{category}' category...") 36 | # Replace with
actual API calls to a recipe site 37 | # For demonstration, we'll return mock data 38 | await asyncio.sleep(0.1) # Simulate network delay 39 | return { 40 | "category": category, 41 | "recipe": f"Delicious {category} dish", 42 | "ingredients": ["Ingredient 1", "Ingredient 2", "Ingredient 3"], 43 | "instructions": "Mix ingredients and cook.", 44 | } 45 | 46 | 47 | async def main(): 48 | # Load grocery list 49 | grocery_items = load_grocery_list("./data/grocery_list.txt") 50 | if not grocery_items: 51 | print("Grocery list is empty or file not found.") 52 | return 53 | 54 | # Initialize Ollama client 55 | client = AsyncClient() 56 | 57 | # Define the functions (tools) for the model 58 | tools = [ 59 | { 60 | "type": "function", 61 | "function": { 62 | "name": "fetch_price_and_nutrition", 63 | "description": "Fetch price and nutrition data for a grocery item", 64 | "parameters": { 65 | "type": "object", 66 | "properties": { 67 | "item": { 68 | "type": "string", 69 | "description": "The name of the grocery item", 70 | }, 71 | }, 72 | "required": ["item"], 73 | }, 74 | }, 75 | }, 76 | { 77 | "type": "function", 78 | "function": { 79 | "name": "fetch_recipe", 80 | "description": "Fetch a recipe based on a category", 81 | "parameters": { 82 | "type": "object", 83 | "properties": { 84 | "category": { 85 | "type": "string", 86 | "description": "The category of food (e.g., Produce, Dairy)", 87 | }, 88 | }, 89 | "required": ["category"], 90 | }, 91 | }, 92 | }, 93 | ] 94 | 95 | # Step 1: Categorize items using the model 96 | categorize_prompt = f""" 97 | You are an assistant that categorizes grocery items. 98 | 99 | **Instructions:** 100 | 101 | - Return the result **only** as a valid JSON object. 102 | - Do **not** include any explanations, greetings, or additional text. 103 | - Use double quotes (`"`) for all strings. 104 | - Ensure the JSON is properly formatted. 105 | - The JSON should have categories as keys and lists of items as values. 106 | 107 | **Example Format:** 108 | 109 | {{ 110 | "Produce": ["Apples", "Bananas"], 111 | "Dairy": ["Milk", "Cheese"] 112 | }} 113 | 114 | **Grocery Items:** 115 | 116 | {', '.join(grocery_items)} 117 | """ 118 | 119 | messages = [{"role": "user", "content": categorize_prompt}] 120 | # First API call: Categorize items 121 | response = await client.chat( 122 | model="llama3.2", 123 | messages=messages, 124 | tools=tools, # No function calling needed here, but included for consistency 125 | ) 126 | 127 | # Add the model's response to the conversation history 128 | messages.append(response["message"]) 129 | print(response["message"]["content"]) 130 | 131 | # Parse the model's response 132 | assistant_message = response["message"]["content"] 133 | 134 | try: 135 | categorized_items = json.loads(assistant_message) 136 | print("Categorized items:") 137 | print(categorized_items) 138 | 139 | except json.JSONDecodeError: 140 | print("Failed to parse the model's response as JSON.") 141 | print("Model's response:") 142 | print(assistant_message) 143 | return 144 | 145 | # Step 2: Fetch price and nutrition data using function calling 146 | 147 | # Construct a message to instruct the model to fetch data for each item 148 | # We'll ask the model to decide which items to fetch data for by using function calling 149 | fetch_prompt = """ 150 | For each item in the grocery list, use the 'fetch_price_and_nutrition' function to get its price and nutrition data. 
151 | """ 152 | 153 | messages.append({"role": "user", "content": fetch_prompt}) 154 | 155 | # Second API call: The model should decide to call the function for each item 156 | response = await client.chat( 157 | model="llama3.2", 158 | messages=messages, 159 | tools=tools, 160 | ) 161 | # Add the model's response to the conversation history 162 | messages.append(response["message"]) 163 | 164 | # Process function calls made by the model 165 | if response["message"].get("tool_calls"): 166 | print("Function calls made by the model:") 167 | available_functions = { 168 | "fetch_price_and_nutrition": fetch_price_and_nutrition, 169 | } 170 | # Store the details for later use 171 | item_details = [] 172 | for tool_call in response["message"]["tool_calls"]: 173 | function_name = tool_call["function"]["name"] 174 | arguments = tool_call["function"]["arguments"] 175 | function_to_call = available_functions.get(function_name) 176 | if function_to_call: 177 | result = await function_to_call(**arguments) 178 | # Add function response to the conversation 179 | messages.append( 180 | { 181 | "role": "tool", 182 | "content": json.dumps(result), 183 | } 184 | ) 185 | item_details.append(result) 186 | 187 | print(item_details) 188 | else: 189 | print( 190 | "The model didn't make any function calls for fetching price and nutrition data." 191 | ) 192 | return 193 | 194 | # Step 3: Fetch a recipe for a random category using function calling 195 | 196 | # Choose a random category 197 | random_category = random.choice(list(categorized_items.keys())) 198 | recipe_prompt = f""" 199 | Fetch a recipe for the '{random_category}' category using the 'fetch_recipe' function. 200 | """ 201 | messages.append({"role": "user", "content": recipe_prompt}) 202 | 203 | # Third API call: The model should decide to call the 'fetch_recipe' function 204 | response = await client.chat( 205 | model="llama3.2", 206 | messages=messages, 207 | tools=tools, 208 | ) 209 | 210 | # Add the model's response to the conversation history 211 | messages.append(response["message"]) 212 | # Process function calls made by the model 213 | if response["message"].get("tool_calls"): 214 | available_functions = { 215 | "fetch_recipe": fetch_recipe, 216 | } 217 | for tool_call in response["message"]["tool_calls"]: 218 | function_name = tool_call["function"]["name"] 219 | arguments = tool_call["function"]["arguments"] 220 | function_to_call = available_functions.get(function_name) 221 | if function_to_call: 222 | result = await function_to_call(**arguments) 223 | # Add function response to the conversation 224 | messages.append( 225 | { 226 | "role": "tool", 227 | "content": json.dumps(result), 228 | } 229 | ) 230 | else: 231 | print("The model didn't make any function calls for fetching a recipe.") 232 | return 233 | 234 | # Final API call: Get the assistant's final response 235 | final_response = await client.chat( 236 | model="llama3.2", 237 | messages=messages, 238 | tools=tools, 239 | ) 240 | 241 | print("\nAssistant's Final Response:") 242 | print(final_response["message"]["content"]) 243 | 244 | 245 | # Run the async main function 246 | asyncio.run(main()) 247 | -------------------------------------------------------------------------------- /pdf-rag-clean.py: -------------------------------------------------------------------------------- 1 | # main.py 2 | 3 | import os 4 | import logging 5 | from langchain_community.document_loaders import UnstructuredPDFLoader 6 | from langchain_text_splitters import RecursiveCharacterTextSplitter 7 | from 
langchain_community.vectorstores import Chroma 8 | from langchain_ollama import OllamaEmbeddings 9 | from langchain.prompts import ChatPromptTemplate, PromptTemplate 10 | from langchain_ollama import ChatOllama 11 | from langchain_core.output_parsers import StrOutputParser 12 | from langchain_core.runnables import RunnablePassthrough 13 | from langchain.retrievers.multi_query import MultiQueryRetriever 14 | import ollama 15 | 16 | # Configure logging 17 | logging.basicConfig(level=logging.INFO) 18 | 19 | # Constants 20 | DOC_PATH = "./data/BOI.pdf" 21 | MODEL_NAME = "llama3.2" 22 | EMBEDDING_MODEL = "nomic-embed-text" 23 | VECTOR_STORE_NAME = "simple-rag" 24 | 25 | 26 | def ingest_pdf(doc_path): 27 | """Load PDF documents.""" 28 | if os.path.exists(doc_path): 29 | loader = UnstructuredPDFLoader(file_path=doc_path) 30 | data = loader.load() 31 | logging.info("PDF loaded successfully.") 32 | return data 33 | else: 34 | logging.error(f"PDF file not found at path: {doc_path}") 35 | return None 36 | 37 | 38 | def split_documents(documents): 39 | """Split documents into smaller chunks.""" 40 | text_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=300) 41 | chunks = text_splitter.split_documents(documents) 42 | logging.info("Documents split into chunks.") 43 | return chunks 44 | 45 | 46 | def create_vector_db(chunks): 47 | """Create a vector database from document chunks.""" 48 | # Pull the embedding model if not already available 49 | ollama.pull(EMBEDDING_MODEL) 50 | 51 | vector_db = Chroma.from_documents( 52 | documents=chunks, 53 | embedding=OllamaEmbeddings(model=EMBEDDING_MODEL), 54 | collection_name=VECTOR_STORE_NAME, 55 | ) 56 | logging.info("Vector database created.") 57 | return vector_db 58 | 59 | 60 | def create_retriever(vector_db, llm): 61 | """Create a multi-query retriever.""" 62 | QUERY_PROMPT = PromptTemplate( 63 | input_variables=["question"], 64 | template="""You are an AI language model assistant. Your task is to generate five 65 | different versions of the given user question to retrieve relevant documents from 66 | a vector database. By generating multiple perspectives on the user question, your 67 | goal is to help the user overcome some of the limitations of the distance-based 68 | similarity search. Provide these alternative questions separated by newlines. 
69 | Original question: {question}""", 70 | ) 71 | 72 | retriever = MultiQueryRetriever.from_llm( 73 | vector_db.as_retriever(), llm, prompt=QUERY_PROMPT 74 | ) 75 | logging.info("Retriever created.") 76 | return retriever 77 | 78 | 79 | def create_chain(retriever, llm): 80 | """Create the chain""" 81 | # RAG prompt 82 | template = """Answer the question based ONLY on the following context: 83 | {context} 84 | Question: {question} 85 | """ 86 | 87 | prompt = ChatPromptTemplate.from_template(template) 88 | 89 | chain = ( 90 | {"context": retriever, "question": RunnablePassthrough()} 91 | | prompt 92 | | llm 93 | | StrOutputParser() 94 | ) 95 | 96 | logging.info("Chain created successfully.") 97 | return chain 98 | 99 | 100 | def main(): 101 | # Load and process the PDF document 102 | data = ingest_pdf(DOC_PATH) 103 | if data is None: 104 | return 105 | 106 | # Split the documents into chunks 107 | chunks = split_documents(data) 108 | 109 | # Create the vector database 110 | vector_db = create_vector_db(chunks) 111 | 112 | # Initialize the language model 113 | llm = ChatOllama(model=MODEL_NAME) 114 | 115 | # Create the retriever 116 | retriever = create_retriever(vector_db, llm) 117 | 118 | # Create the chain with preserved syntax 119 | chain = create_chain(retriever, llm) 120 | 121 | # Example query 122 | question = "How to report BOI?" 123 | 124 | # Get the response 125 | res = chain.invoke(input=question) 126 | print("Response:") 127 | print(res) 128 | 129 | 130 | if __name__ == "__main__": 131 | main() 132 | -------------------------------------------------------------------------------- /pdf-rag-streamlit.py: -------------------------------------------------------------------------------- 1 | # app.py 2 | 3 | import streamlit as st 4 | import os 5 | import logging 6 | from langchain_community.document_loaders import UnstructuredPDFLoader 7 | from langchain_text_splitters import RecursiveCharacterTextSplitter 8 | from langchain_community.vectorstores import Chroma 9 | from langchain_ollama import OllamaEmbeddings 10 | from langchain.prompts import ChatPromptTemplate, PromptTemplate 11 | from langchain_ollama import ChatOllama 12 | from langchain_core.output_parsers import StrOutputParser 13 | from langchain_core.runnables import RunnablePassthrough 14 | from langchain.retrievers.multi_query import MultiQueryRetriever 15 | import ollama 16 | 17 | # Configure logging 18 | logging.basicConfig(level=logging.INFO) 19 | 20 | # Constants 21 | DOC_PATH = "./data/BOI.pdf" 22 | MODEL_NAME = "llama3.2" 23 | EMBEDDING_MODEL = "nomic-embed-text" 24 | VECTOR_STORE_NAME = "simple-rag" 25 | PERSIST_DIRECTORY = "./chroma_db" 26 | 27 | 28 | def ingest_pdf(doc_path): 29 | """Load PDF documents.""" 30 | if os.path.exists(doc_path): 31 | loader = UnstructuredPDFLoader(file_path=doc_path) 32 | data = loader.load() 33 | logging.info("PDF loaded successfully.") 34 | return data 35 | else: 36 | logging.error(f"PDF file not found at path: {doc_path}") 37 | st.error("PDF file not found.") 38 | return None 39 | 40 | 41 | def split_documents(documents): 42 | """Split documents into smaller chunks.""" 43 | text_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=300) 44 | chunks = text_splitter.split_documents(documents) 45 | logging.info("Documents split into chunks.") 46 | return chunks 47 | 48 | 49 | @st.cache_resource 50 | def load_vector_db(): 51 | """Load or create the vector database.""" 52 | # Pull the embedding model if not already available 53 | ollama.pull(EMBEDDING_MODEL) 54 | 55 | 
embedding = OllamaEmbeddings(model=EMBEDDING_MODEL) 56 | 57 | if os.path.exists(PERSIST_DIRECTORY): 58 | vector_db = Chroma( 59 | embedding_function=embedding, 60 | collection_name=VECTOR_STORE_NAME, 61 | persist_directory=PERSIST_DIRECTORY, 62 | ) 63 | logging.info("Loaded existing vector database.") 64 | else: 65 | # Load and process the PDF document 66 | data = ingest_pdf(DOC_PATH) 67 | if data is None: 68 | return None 69 | 70 | # Split the documents into chunks 71 | chunks = split_documents(data) 72 | 73 | vector_db = Chroma.from_documents( 74 | documents=chunks, 75 | embedding=embedding, 76 | collection_name=VECTOR_STORE_NAME, 77 | persist_directory=PERSIST_DIRECTORY, 78 | ) 79 | vector_db.persist() 80 | logging.info("Vector database created and persisted.") 81 | return vector_db 82 | 83 | 84 | def create_retriever(vector_db, llm): 85 | """Create a multi-query retriever.""" 86 | QUERY_PROMPT = PromptTemplate( 87 | input_variables=["question"], 88 | template="""You are an AI language model assistant. Your task is to generate five 89 | different versions of the given user question to retrieve relevant documents from 90 | a vector database. By generating multiple perspectives on the user question, your 91 | goal is to help the user overcome some of the limitations of the distance-based 92 | similarity search. Provide these alternative questions separated by newlines. 93 | Original question: {question}""", 94 | ) 95 | 96 | retriever = MultiQueryRetriever.from_llm( 97 | vector_db.as_retriever(), llm, prompt=QUERY_PROMPT 98 | ) 99 | logging.info("Retriever created.") 100 | return retriever 101 | 102 | 103 | def create_chain(retriever, llm): 104 | """Create the chain with preserved syntax.""" 105 | # RAG prompt 106 | template = """Answer the question based ONLY on the following context: 107 | {context} 108 | Question: {question} 109 | """ 110 | 111 | prompt = ChatPromptTemplate.from_template(template) 112 | 113 | chain = ( 114 | {"context": retriever, "question": RunnablePassthrough()} 115 | | prompt 116 | | llm 117 | | StrOutputParser() 118 | ) 119 | 120 | logging.info("Chain created with preserved syntax.") 121 | return chain 122 | 123 | 124 | def main(): 125 | st.title("Document Assistant") 126 | 127 | # User input 128 | user_input = st.text_input("Enter your question:", "") 129 | 130 | if user_input: 131 | with st.spinner("Generating response..."): 132 | try: 133 | # Initialize the language model 134 | llm = ChatOllama(model=MODEL_NAME) 135 | 136 | # Load the vector database 137 | vector_db = load_vector_db() 138 | if vector_db is None: 139 | st.error("Failed to load or create the vector database.") 140 | return 141 | 142 | # Create the retriever 143 | retriever = create_retriever(vector_db, llm) 144 | 145 | # Create the chain 146 | chain = create_chain(retriever, llm) 147 | 148 | # Get the response 149 | response = chain.invoke(input=user_input) 150 | 151 | st.markdown("**Assistant:**") 152 | st.write(response) 153 | except Exception as e: 154 | st.error(f"An error occurred: {str(e)}") 155 | else: 156 | st.info("Please enter a question to get started.") 157 | 158 | 159 | if __name__ == "__main__": 160 | main() 161 | -------------------------------------------------------------------------------- /pdf-rag.py: -------------------------------------------------------------------------------- 1 | ## 1. Ingest PDF Files 2 | # 2. Extract Text from PDF Files and split into small chunks 3 | # 3. Send the chunks to the embedding model 4 | # 4. Save the embeddings to a vector database 5 | # 5. 
Perform similarity search on the vector database to find similar documents 6 | # 6. retrieve the similar documents and present them to the user 7 | ## run pip install -r requirements.txt to install the required packages 8 | 9 | from langchain_community.document_loaders import UnstructuredPDFLoader 10 | from langchain_community.document_loaders import OnlinePDFLoader 11 | 12 | doc_path = "./data/BOI.pdf" 13 | model = "llama3.2" 14 | 15 | # Local PDF file uploads 16 | if doc_path: 17 | loader = UnstructuredPDFLoader(file_path=doc_path) 18 | data = loader.load() 19 | print("done loading....") 20 | else: 21 | print("Upload a PDF file") 22 | 23 | # Preview first page 24 | content = data[0].page_content 25 | # print(content[:100]) 26 | 27 | 28 | # ==== End of PDF Ingestion ==== 29 | 30 | 31 | # ==== Extract Text from PDF Files and Split into Small Chunks ==== 32 | 33 | from langchain_ollama import OllamaEmbeddings 34 | from langchain_text_splitters import RecursiveCharacterTextSplitter 35 | from langchain_community.vectorstores import Chroma 36 | 37 | # Split and chunk 38 | text_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=300) 39 | chunks = text_splitter.split_documents(data) 40 | print("done splitting....") 41 | 42 | # print(f"Number of chunks: {len(chunks)}") 43 | # print(f"Example chunk: {chunks[0]}") 44 | 45 | # ===== Add to vector database === 46 | import ollama 47 | 48 | ollama.pull("nomic-embed-text") 49 | 50 | vector_db = Chroma.from_documents( 51 | documents=chunks, 52 | embedding=OllamaEmbeddings(model="nomic-embed-text"), 53 | collection_name="simple-rag", 54 | ) 55 | print("done adding to vector database....") 56 | 57 | 58 | ## === Retrieval === 59 | from langchain.prompts import ChatPromptTemplate, PromptTemplate 60 | from langchain_core.output_parsers import StrOutputParser 61 | 62 | from langchain_ollama import ChatOllama 63 | 64 | from langchain_core.runnables import RunnablePassthrough 65 | from langchain.retrievers.multi_query import MultiQueryRetriever 66 | 67 | # set up our model to use 68 | llm = ChatOllama(model=model) 69 | 70 | # a simple technique to generate multiple questions from a single question and then retrieve documents 71 | # based on those questions, getting the best of both worlds. 72 | QUERY_PROMPT = PromptTemplate( 73 | input_variables=["question"], 74 | template="""You are an AI language model assistant. Your task is to generate five 75 | different versions of the given user question to retrieve relevant documents from 76 | a vector database. By generating multiple perspectives on the user question, your 77 | goal is to help the user overcome some of the limitations of the distance-based 78 | similarity search. Provide these alternative questions separated by newlines. 
79 | Original question: {question}""", 80 | ) 81 | 82 | retriever = MultiQueryRetriever.from_llm( 83 | vector_db.as_retriever(), llm, prompt=QUERY_PROMPT 84 | ) 85 | 86 | 87 | # RAG prompt 88 | template = """Answer the question based ONLY on the following context: 89 | {context} 90 | Question: {question} 91 | """ 92 | 93 | prompt = ChatPromptTemplate.from_template(template) 94 | 95 | 96 | chain = ( 97 | {"context": retriever, "question": RunnablePassthrough()} 98 | | prompt 99 | | llm 100 | | StrOutputParser() 101 | ) 102 | 103 | 104 | # res = chain.invoke(input=("what is the document about?",)) 105 | # res = chain.invoke( 106 | # input=("what are the main points as a business owner I should be aware of?",) 107 | # ) 108 | res = chain.invoke(input=("how to report BOI?",)) 109 | 110 | print(res) 111 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | ollama 2 | chromadb 3 | pdfplumber 4 | langchain 5 | langchain-core 6 | langchain-ollama 7 | langchain-community 8 | langchain_text_splitters 9 | unstructured 10 | unstructured[all-docs] 11 | fastembed 12 | pdfplumber 13 | sentence-transformers 14 | elevenlabs -------------------------------------------------------------------------------- /start-1.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import json 3 | 4 | url = "http://localhost:11434/api/generate" 5 | 6 | data = { 7 | "model": "llama3.2", 8 | "prompt": "tell me a short story and make it funny.", 9 | } 10 | 11 | response = requests.post( 12 | url, json=data, stream=True 13 | ) # remove the stream=True to get the full response 14 | 15 | 16 | # check the response status 17 | if response.status_code == 200: 18 | print("Generated Text:", end=" ", flush=True) 19 | # Iterate over the streaming response 20 | for line in response.iter_lines(): 21 | if line: 22 | # Decode the line and parse the JSON 23 | decoded_line = line.decode("utf-8") 24 | result = json.loads(decoded_line) 25 | # Get the text from the response 26 | generated_text = result.get("response", "") 27 | print(generated_text, end="", flush=True) 28 | else: 29 | print("Error:", response.status_code, response.text) 30 | -------------------------------------------------------------------------------- /start-2.py: -------------------------------------------------------------------------------- 1 | import ollama 2 | 3 | 4 | response = ollama.list() 5 | 6 | # print(response) 7 | 8 | # == Chat example == 9 | res = ollama.chat( 10 | model="llama3.2", 11 | messages=[ 12 | {"role": "user", "content": "why is the sky blue?"}, 13 | ], 14 | ) 15 | # print(res["message"]["content"]) 16 | 17 | # == Chat example streaming == 18 | res = ollama.chat( 19 | model="llama3.2", 20 | messages=[ 21 | { 22 | "role": "user", 23 | "content": "why is the ocean so salty?", 24 | }, 25 | ], 26 | stream=True, 27 | ) 28 | # for chunk in res: 29 | # print(chunk["message"]["content"], end="", flush=True) 30 | 31 | 32 | # ================================================================================== 33 | # ==== The Ollama Python library's API is designed around the Ollama REST API ==== 34 | # ================================================================================== 35 | 36 | # == Generate example == 37 | res = ollama.generate( 38 | model="llama3.2", 39 | prompt="why is the sky blue?", 40 | ) 41 | 42 | # show 43 | # print(ollama.show("llama3.2")) 44 | 45 | 46 | 
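# == Pull a model / create embeddings (optional sketch) ==
# These same client calls are used in final-rag-voice.py; the "embedding" key below
# reflects the response shape returned by the ollama Python library.
# ollama.pull("nomic-embed-text")
# emb = ollama.embeddings(model="nomic-embed-text", prompt="why is the sky blue?")
# print(len(emb["embedding"]))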
# Create a new model with modelfile 47 | modelfile = """ 48 | FROM llama3.2 49 | SYSTEM You are very smart assistant who knows everything about oceans. You are very succinct and informative. 50 | PARAMETER temperature 0.1 51 | """ 52 | 53 | ollama.create(model="knowitall", modelfile=modelfile) 54 | 55 | res = ollama.generate(model="knowitall", prompt="why is the ocean so salty?") 56 | print(res["response"]) 57 | 58 | 59 | # delete model 60 | ollama.delete("knowitall") --------------------------------------------------------------------------------