├── .env
├── docs
│   ├── gemini.pdf
│   └── test_sample.txt
├── requirements.txt
├── README.md
└── RAG.py
/.env:
--------------------------------------------------------------------------------
GOOGLE_API_KEY=
--------------------------------------------------------------------------------
/docs/gemini.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MrSentinel137/gemini-ollama-RAG/HEAD/docs/gemini.pdf
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
langchain
chromadb
streamlit
google-generativeai
langchain-google-genai
python-dotenv
pypdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# RAG Python Chat Bot with Gemini, Ollama, Streamlit Madness! 🤖💬

🚀 Welcome to the repository for our thrilling journey into the world of Python chat bots powered by RAG (Retrieval-Augmented Generation)! 🐍 In this project, we harness the capabilities of Gemini, Ollama, and Streamlit to create an intelligent and entertaining chat bot.

## Key Features:

- **Gemini Brilliance:** Explore the cutting-edge capabilities of Gemini in enhancing the bot's retrieval and generation mechanisms.
- **Ollama Charm:** Experience the conversational charm brought to the bot by Ollama, making interactions smoother and more engaging.
- **Sleek Streamlit Interface:** Navigate through a sleek and user-friendly interface powered by Streamlit, providing a seamless user experience.

## Secret Sauce – LangChain Integration:
Uncover the magic behind the scenes with LangChain integration, adding a unique layer of functionality to elevate your chat bot.

## Get Started:
1. **Clone the Repository:**
   ```bash
   git clone https://github.com/MrSentinel137/gemini-ollama-RAG.git
   cd gemini-ollama-RAG
   ```

2. **Install Dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
   OR

   ```bash
   pip3 install -r requirements.txt
   ```

3. **Add Your API Key:** Put your Google API key in the `.env` file (`GOOGLE_API_KEY=<your key>`). It is needed for the Gemini embeddings even if you use Ollama for generation.

4. **Run the Code:**
   ```bash
   streamlit run RAG.py
   ```
--------------------------------------------------------------------------------
/docs/test_sample.txt:
--------------------------------------------------------------------------------
Lacaille 9352 (Lac 9352) is a red dwarf star in the southern constellation of Piscis Austrinus. With an apparent visual magnitude of 7.34,
this star is too faint to be viewed with the naked eye except possibly under excellent seeing conditions. Parallax measurements place it at a distance of about 10.74
light-years (3.29 parsecs) from Earth.[1] It is the eleventh closest star system to the Solar System[14] and is the closest star in the constellation Piscis Austrinus.
The ChView simulation shows that its closest neighbour is the EZ Aquarii triple star system, at about 4.1 ly away.

This star has the fourth highest known proper motion (first noticed by Benjamin Gould in 1881), moving a total of 6.9 arcseconds per year.
However, this is still a very small movement overall, as there are 3,600 arcseconds in a degree of arc. The space velocity components of this star are
(U, V, W) = (−93.9, −14.1, −51.4) km/s.
If the radial velocity (Vr) equals +9.7 km/s, then about 2,700 years ago Lacaille 9352 was at its minimal distance of
approximately 10.63 ly (3.26 pc) from the Sun.

The spectrum of Lacaille 9352 places it at a stellar classification of M0.5V, indicating it is a type of main sequence star known as a red dwarf.
This was the first red dwarf star to have its angular diameter measured, with the physical diameter being about 47% of the Sun's.
It has around half the mass of the Sun, and the outer envelope has an effective temperature of about 3,670 K.

--------------------------------------------------------------------------------
/RAG.py:
--------------------------------------------------------------------------------
from langchain.vectorstores import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_google_genai import ChatGoogleGenerativeAI
import google.generativeai as genai
from langchain.document_loaders import TextLoader, DirectoryLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import Ollama
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.callbacks.manager import CallbackManager
from langchain.chains import RetrievalQA
from dotenv import load_dotenv
import streamlit as st
import os

# Keep the chat history across Streamlit reruns
if "history" not in st.session_state:
    st.session_state.history = []

# Load GOOGLE_API_KEY from the .env file
load_dotenv()

# Switch between the local Ollama model and Gemini here
model_type = 'ollama'

# Initializing the chat model
if model_type == "ollama":
    model = Ollama(
        model="<your-ollama-model>",  # Provide your Ollama model name here
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
    )

elif model_type == "gemini":
    model = ChatGoogleGenerativeAI(
        model="gemini-pro",
        temperature=0.1,
        convert_system_message_to_human=True
    )

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Vector database
persist_directory = "./chroma_db"  # Persist directory path (any local folder works)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
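# Note: the branch below builds the Chroma store only when `persist_directory`
# does not exist yet. On the first run, every PDF and .txt file in ./docs is
# loaded, split into chunks, embedded with the embedding-001 model, and
# persisted to disk; on later runs the saved store is simply reloaded. Delete
# the persist directory to force a rebuild after adding or changing documents.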
if not os.path.exists(persist_directory):
    with st.spinner('🚀 Starting your bot. This might take a while'):
        # Data pre-processing: load every PDF and text file from ./docs
        pdf_loader = DirectoryLoader("./docs/", glob="./*.pdf", loader_cls=PyPDFLoader)
        text_loader = DirectoryLoader("./docs/", glob="./*.txt", loader_cls=TextLoader)

        pdf_documents = pdf_loader.load()
        text_documents = text_loader.load()

        splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=0)

        pdf_context = "\n\n".join(str(p.page_content) for p in pdf_documents)
        text_context = "\n\n".join(str(p.page_content) for p in text_documents)

        pdfs = splitter.split_text(pdf_context)
        texts = splitter.split_text(text_context)

        data = pdfs + texts

        print("Data Processing Complete")

        # Embed the chunks and persist the vector store to disk
        vectordb = Chroma.from_texts(data, embeddings, persist_directory=persist_directory)
        vectordb.persist()

        print("Vector DB Creation Complete\n")

else:
    vectordb = Chroma(persist_directory=persist_directory,
                      embedding_function=embeddings)

    print("Vector DB Loaded\n")

# Querying model: retrieval-augmented QA chain over the vector store
query_chain = RetrievalQA.from_chain_type(
    llm=model,
    retriever=vectordb.as_retriever()
)

# Replay earlier messages so the conversation stays visible across reruns
for msg in st.session_state.history:
    with st.chat_message(msg['role']):
        st.markdown(msg['content'])


prompt = st.chat_input("Say something")
if prompt:
    st.session_state.history.append({
        'role': 'user',
        'content': prompt
    })

    with st.chat_message("user"):
        st.markdown(prompt)

    with st.spinner('💡Thinking'):
        response = query_chain({"query": prompt})

    st.session_state.history.append({
        'role': 'assistant',
        'content': response['result']
    })

    with st.chat_message("assistant"):
        st.markdown(response['result'])
--------------------------------------------------------------------------------