├── .env
├── docs
│   ├── gemini.pdf
│   └── test_sample.txt
├── requirements.txt
├── README.md
└── RAG.py
/.env:
--------------------------------------------------------------------------------
GOOGLE_API_KEY=
--------------------------------------------------------------------------------
/docs/gemini.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MrSentinel137/gemini-ollama-RAG/HEAD/docs/gemini.pdf
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
langchain
chromadb
streamlit
google-generativeai
langchain-google-genai
python-dotenv
pypdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# RAG Python Chat Bot with Gemini, Ollama, Streamlit Madness! 🤖💬

🚀 Welcome to the repository for our thrilling journey into the world of Python chat bots powered by RAG (Retrieval-Augmented Generation)! 🐍 In this project, we harness the capabilities of Gemini, Ollama, and Streamlit to create an intelligent and entertaining chat bot.

## Key Features:

- **Gemini Brilliance:** Explore the cutting-edge capabilities of Gemini in enhancing the bot's retrieval and generation mechanisms.
- **Ollama Charm:** Experience the conversational charm brought to the bot by Ollama, making interactions smoother and more engaging.
- **Sleek Streamlit Interface:** Navigate through a sleek and user-friendly interface powered by Streamlit, providing a seamless user experience.

## Secret Sauce – LangChain Integration:
Uncover the magic behind the scenes with LangChain integration, adding a unique layer of functionality to elevate your chat bot.

## Get Started:
1. **Clone the Repository:**
   ```bash
   git clone https://github.com/MrSentinel137/gemini-ollama-RAG.git
   cd gemini-ollama-RAG
   ```

2. **Install Dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
   OR

   ```bash
   pip3 install -r requirements.txt
   ```

3. **Add Your API Key:** Put your Google API key in the `.env` file (`GOOGLE_API_KEY=<your key>`). It is needed for the Gemini embeddings even if you use Ollama for generation.

4. **Run the Code:**
   ```bash
   streamlit run RAG.py
   ```
--------------------------------------------------------------------------------
/docs/test_sample.txt:
--------------------------------------------------------------------------------
Lacaille 9352 (Lac 9352) is a red dwarf star in the southern constellation of Piscis Austrinus. With an apparent visual magnitude of 7.34,
this star is too faint to be viewed with the naked eye except possibly under excellent seeing conditions. Parallax measurements place it at a distance of about 10.74
light-years (3.29 parsecs) from Earth.[1] It is the eleventh closest star system to the Solar System[14] and is the closest star in the constellation Piscis Austrinus.
The ChView simulation shows that its closest neighbour is the EZ Aquarii triple star system, at about 4.1 ly away.

This star has the fourth highest known proper motion (first noticed by Benjamin Gould in 1881), moving a total of 6.9 arcseconds per year.
However, this is still a very small movement overall, as there are 3,600 arcseconds in a degree of arc. The space velocity components of this star are
(U, V, W) = (−93.9, −14.1, −51.4) km/s.
If the radial velocity (Vr) equals +9.7 km/s, then about 2,700 years ago Lacaille 9352 was at its minimal distance of
approximately 10.63 ly (3.26 pc) from the Sun.

The spectrum of Lacaille 9352 places it at a stellar classification of M0.5V, indicating it is a type of main sequence star known as a red dwarf.
This was the first red dwarf star to have its angular diameter measured, with the physical diameter being about 47% of the Sun's.
It has around half the mass of the Sun, and the outer envelope has an effective temperature of about 3,670 K.

--------------------------------------------------------------------------------
/RAG.py:
--------------------------------------------------------------------------------
from langchain.vectorstores import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_google_genai import ChatGoogleGenerativeAI
import google.generativeai as genai
from langchain.document_loaders import TextLoader, DirectoryLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import Ollama
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.callbacks.manager import CallbackManager
from langchain.chains import RetrievalQA
from dotenv import load_dotenv
import streamlit as st
import os

# Keep the chat history across Streamlit reruns
if "history" not in st.session_state:
    st.session_state.history = []

# Load GOOGLE_API_KEY from the .env file
load_dotenv()

# Switch between the local Ollama model and Gemini here
model_type = 'ollama'

# Initializing the chat model
if model_type == "ollama":
    model = Ollama(
        model="<your-ollama-model>",  # Provide your Ollama model name here
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
    )

elif model_type == "gemini":
    model = ChatGoogleGenerativeAI(
        model="gemini-pro",
        temperature=0.1,
        convert_system_message_to_human=True
    )

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Vector database
persist_directory = "./chroma_db"  # Persist directory path (any local folder works)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
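# Note: the branch below builds the Chroma store only when `persist_directory`
# does not exist yet. On the first run, every PDF and .txt file in ./docs is
# loaded, split into chunks, embedded with the embedding-001 model, and
# persisted to disk; on later runs the saved store is simply reloaded. Delete
# the persist directory to force a rebuild after adding or changing documents.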
if not os.path.exists(persist_directory):
    with st.spinner('🚀 Starting your bot. This might take a while'):
        # Data pre-processing: load every PDF and text file from ./docs
        pdf_loader = DirectoryLoader("./docs/", glob="./*.pdf", loader_cls=PyPDFLoader)
        text_loader = DirectoryLoader("./docs/", glob="./*.txt", loader_cls=TextLoader)

        pdf_documents = pdf_loader.load()
        text_documents = text_loader.load()

        splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=0)

        pdf_context = "\n\n".join(str(p.page_content) for p in pdf_documents)
        text_context = "\n\n".join(str(p.page_content) for p in text_documents)

        pdfs = splitter.split_text(pdf_context)
        texts = splitter.split_text(text_context)

        data = pdfs + texts

        print("Data Processing Complete")

        # Embed the chunks and persist the vector store to disk
        vectordb = Chroma.from_texts(data, embeddings, persist_directory=persist_directory)
        vectordb.persist()

        print("Vector DB Creation Complete\n")

else:
    vectordb = Chroma(persist_directory=persist_directory,
                      embedding_function=embeddings)

    print("Vector DB Loaded\n")

# Querying model: retrieval-augmented QA chain over the vector store
query_chain = RetrievalQA.from_chain_type(
    llm=model,
    retriever=vectordb.as_retriever()
)

# Replay earlier messages so the conversation stays visible across reruns
for msg in st.session_state.history:
    with st.chat_message(msg['role']):
        st.markdown(msg['content'])


prompt = st.chat_input("Say something")
if prompt:
    st.session_state.history.append({
        'role': 'user',
        'content': prompt
    })

    with st.chat_message("user"):
        st.markdown(prompt)

    with st.spinner('💡Thinking'):
        response = query_chain({"query": prompt})

    st.session_state.history.append({
        'role': 'assistant',
        'content': response['result']
    })

    with st.chat_message("assistant"):
        st.markdown(response['result'])
--------------------------------------------------------------------------------