├── Export-d3adfe0f-3131-4bf3-8987-a52017fc1bae.zip ├── LICENSE ├── README.md ├── app.py ├── export_format.png ├── export_notion.png ├── ingest_data.py ├── query_data.py ├── requirements.txt └── vectorstore.pkl /Export-d3adfe0f-3131-4bf3-8987-a52017fc1bae.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hwchase17/chat-langchain-notion/33f9e63dd2c683beee47056b64cc2d98af8daf79/Export-d3adfe0f-3131-4bf3-8987-a52017fc1bae.zip -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Harrison Chase 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Chat-LangChain-Notion 2 | 3 | Create a ChatGPT-like experience over your Notion database using [LangChain](https://github.com/hwchase17/langchain). 4 | 5 | 6 | ## 📊 Example Data 7 | This repo uses the [Blendle Employee Handbook](https://www.notion.so/Blendle-s-Employee-Handbook-7692ffe24f07450785f093b94bbe1a09) as an example. 8 | It was downloaded on October 18th, so it may have changed slightly since then! 9 | 10 | ## 🧑 Instructions for ingesting your own dataset 11 | 12 | Export your dataset from Notion. You can do this by clicking on the three dots in the upper right-hand corner and then clicking `Export`. 13 | 14 | ![export](export_notion.png) 15 | 16 | When exporting, make sure to select the `Markdown & CSV` format option. 17 | 18 | ![export-format](export_format.png) 19 | 20 | This will produce a `.zip` file in your Downloads folder. Move the `.zip` file into this repository. 21 | 22 | Run the following command to unzip the file (replace `Export...` with your own file name as needed). 23 | 24 | ```shell 25 | unzip Export-d3adfe0f-3131-4bf3-8987-a52017fc1bae.zip -d Notion_DB 26 | ``` 27 | 28 | ## Ingest data 29 | 30 | Once the export is unzipped, all that is needed to ingest the data is to run `python ingest_data.py`. 31 | 32 | ## Query data 33 | Custom prompts are used to ground the answers in the Blendle Employee Handbook files. 34 | 35 | ## Running the Application 36 | 37 | Run `python app.py` from the command line to chat with your own data. 
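The chat loop in `app.py` accumulates history as a list of `(question, answer)` tuples that gets passed back into the chain on each turn. A minimal sketch of that pattern, with a hypothetical `fake_chain` standing in for the real LangChain chain (which would call the OpenAI API):

```python
def fake_chain(inputs):
    # Stand-in for qa_chain: echoes the question back as the "answer".
    # The real chain also receives inputs["chat_history"] for context.
    return {"answer": f"You asked: {inputs['question']}"}

chat_history = []
for question in ["What is the vacation policy?", "And sick leave?"]:
    result = fake_chain({"question": question, "chat_history": chat_history})
    # Each turn is stored as a (question, answer) tuple, exactly as app.py does.
    chat_history.append((question, result["answer"]))

print(chat_history[0][1])  # → You asked: What is the vacation policy?
```

The growing `chat_history` is what lets the condense-question prompt rewrite follow-ups like "And sick leave?" into standalone questions.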
38 | -------------------------------------------------------------------------------- /app.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | from query_data import get_chain 3 | 4 | 5 | if __name__ == "__main__": 6 | with open("vectorstore.pkl", "rb") as f: 7 | vectorstore = pickle.load(f) 8 | qa_chain = get_chain(vectorstore) 9 | chat_history = [] 10 | print("Chat with your docs!") 11 | while True: 12 | print("Human:") 13 | question = input() 14 | result = qa_chain({"question": question, "chat_history": chat_history}) 15 | chat_history.append((question, result["answer"])) 16 | print("AI:") 17 | print(result["answer"]) 18 | -------------------------------------------------------------------------------- /export_format.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hwchase17/chat-langchain-notion/33f9e63dd2c683beee47056b64cc2d98af8daf79/export_format.png -------------------------------------------------------------------------------- /export_notion.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hwchase17/chat-langchain-notion/33f9e63dd2c683beee47056b64cc2d98af8daf79/export_notion.png -------------------------------------------------------------------------------- /ingest_data.py: -------------------------------------------------------------------------------- 1 | from langchain.text_splitter import RecursiveCharacterTextSplitter 2 | from langchain.document_loaders import NotionDirectoryLoader 3 | from langchain.vectorstores.faiss import FAISS 4 | from langchain.embeddings import OpenAIEmbeddings 5 | import pickle 6 | 7 | # Load Data 8 | loader = NotionDirectoryLoader("Notion_DB") 9 | raw_documents = loader.load() 10 | 11 | # Split text 12 | text_splitter = RecursiveCharacterTextSplitter() 13 | documents = text_splitter.split_documents(raw_documents) 14 | 15 | 16 | # 
Load Data to vectorstore 17 | embeddings = OpenAIEmbeddings() 18 | vectorstore = FAISS.from_documents(documents, embeddings) 19 | 20 | 21 | # Save vectorstore 22 | with open("vectorstore.pkl", "wb") as f: 23 | pickle.dump(vectorstore, f) 24 | -------------------------------------------------------------------------------- /query_data.py: -------------------------------------------------------------------------------- 1 | from langchain.prompts.prompt import PromptTemplate 2 | from langchain.llms import OpenAI 3 | from langchain.chains import ChatVectorDBChain 4 | 5 | _template = """Given the following conversation and a follow-up question, rephrase the follow-up question to be a standalone question. 6 | You can assume the question is about the Blendle Employee Handbook. 7 | 8 | Chat History: 9 | {chat_history} 10 | Follow Up Input: {question} 11 | Standalone question:""" 12 | CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template) 13 | 14 | template = """You are an AI assistant for answering questions about the Blendle Employee Handbook. 15 | You are given the following extracted parts of a long document and a question. Provide a conversational answer. 16 | If you don't know the answer, just say "Hmm, I'm not sure." Don't try to make up an answer. 17 | If the question is not about the Blendle Employee Handbook, politely inform them that you are tuned to only answer questions about the Blendle Employee Handbook. 
18 | 19 | Question: {question} 20 | ========= 21 | {context} 22 | ========= 23 | Answer in Markdown:""" 24 | QA_PROMPT = PromptTemplate(template=template, input_variables=["question", "context"]) 25 | 26 | 27 | def get_chain(vectorstore): 28 | llm = OpenAI(temperature=0) 29 | qa_chain = ChatVectorDBChain.from_llm( 30 | llm, 31 | vectorstore, 32 | qa_prompt=QA_PROMPT, 33 | condense_question_prompt=CONDENSE_QUESTION_PROMPT, 34 | ) 35 | return qa_chain 36 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | langchain 2 | openai 3 | unstructured 4 | faiss-cpu 5 | -------------------------------------------------------------------------------- /vectorstore.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hwchase17/chat-langchain-notion/33f9e63dd2c683beee47056b64cc2d98af8daf79/vectorstore.pkl --------------------------------------------------------------------------------
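The handoff between `ingest_data.py` and `app.py` is a plain pickle round-trip: the vectorstore is dumped to `vectorstore.pkl` at ingest time and loaded back at chat time. A minimal sketch of that round-trip, using a dict as a stand-in for the FAISS vectorstore so it runs without any API keys:

```python
import pickle

# A dict stands in for the FAISS vectorstore that ingest_data.py builds.
store = {"docs": ["chunk one", "chunk two"]}

# ingest_data.py side: serialize the store to disk.
with open("vectorstore_demo.pkl", "wb") as f:
    pickle.dump(store, f)

# app.py side: load it back before constructing the chain.
with open("vectorstore_demo.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded == store)  # → True
```

Note that pickle files are Python-version- and library-version-sensitive, so the committed `vectorstore.pkl` may need regenerating with `python ingest_data.py` if your installed langchain or faiss version differs.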