├── .gitignore
├── Dockerfile
├── Makefile
├── README.md
├── callback.py
├── ingest.py
├── main.py
├── query_data.py
├── render.yaml
├── requirements.txt
├── schemas.py
├── static
│   ├── favicon.png
│   └── styles.css
└── templates
    └── index.html

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
!.vectorstore.pkl
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# JetBrains
.idea

*.db

.DS_Store

vectorstore.pkl
langchain.readthedocs.io/
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
FROM python:3.9

# For better caching we list the packages
RUN pip install \
    fastapi==0.92.0 \
    black \
    isort \
    websockets==10.4 \
    pydantic \
    langchain==0.0.100 \
    uvicorn==0.20.0 \
    jinja2 \
    faiss-cpu==1.7.3 \
    bs4 \
    unstructured==0.5.2 \
    libmagic==1.0


WORKDIR /code
COPY . /code/
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "9000", "--forwarded-allow-ips=*"]
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
.PHONY: start
start:
	uvicorn main:app --reload --port 9000

.PHONY: format
format:
	black .
	isort .
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Quarto Help Bot

This repo is an implementation of a locally hosted chatbot specifically focused on question answering over the [Quarto documentation](https://quarto.org).
Built with [LangChain](https://github.com/hwchase17/langchain/) and [FastAPI](https://fastapi.tiangolo.com/).

The app leverages LangChain's streaming support and async API to update the page in real time for multiple users.

## ✅ Running locally

1. Install dependencies: `pip install -r requirements.txt`
1. Run the app: `make start`
1. In [templates/index.html](./templates/index.html), change the line of code `var endpoint = "wss://quarto-bot.onrender.com/chat";` to `var endpoint = "ws://localhost:9000/chat";` (this is super hacky and will be fixed later).
1. Open [localhost:9000](http://localhost:9000) in your browser.

## 🚀 Important Links

Deployed version: [https://quarto-bot.onrender.com/](https://quarto-bot.onrender.com/)

I am using [render](https://render.com/) to deploy the site. The [render.yaml](./render.yaml) file facilitates this deployment.


## 📚 Technical description

There are two components: ingestion and question answering.

Ingestion has the following steps:

1. Pull `search.json` from the rendered site.
2. Load the `search.json` entries into LangChain documents (see `QuartoLoader` in `ingest.py`; ingestion runs at app startup via `startup_event` in `main.py`).
3. Split documents with LangChain's [TextSplitter](https://langchain.readthedocs.io/en/latest/modules/utils/combine_docs_examples/textsplitter.html).
4. Create a vectorstore of embeddings, using LangChain's [vectorstore wrapper](https://langchain.readthedocs.io/en/latest/modules/utils/combine_docs_examples/vectorstores.html) (with OpenAI's embeddings and the FAISS vectorstore).

Question answering has the following steps, all handled by [ChatVectorDBChain](https://langchain.readthedocs.io/en/latest/modules/chains/combine_docs_examples/chat_vector_db.html):

1. Given the chat history and new user input, determine what a standalone question would be (using GPT-3.5).
2. Given that standalone question, look up relevant documents from the vectorstore.
3. Pass the standalone question and relevant documents to the LLM to generate a final answer.
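
## 🔌 Talking to the websocket directly

If you want to poke at the `/chat` endpoint without the browser UI, a minimal client along these lines should work (a sketch, assuming the app is running locally via `make start`; the message shapes mirror `ChatResponse` in [schemas.py](./schemas.py)):

```python
import asyncio
import json

import websockets  # already pinned in requirements.txt


async def ask(question: str) -> None:
    async with websockets.connect("ws://localhost:9000/chat") as ws:
        # The server expects plain text and replies with JSON messages
        # of the form {"sender": ..., "message": ..., "type": ...}.
        await ws.send(question)
        while True:
            msg = json.loads(await ws.recv())
            if msg["type"] == "stream" and msg["sender"] == "bot":
                print(msg["message"], end="", flush=True)
            elif msg["type"] in ("end", "error"):
                break


asyncio.run(ask("How do I render a Quarto document to PDF?"))
```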

--------------------------------------------------------------------------------
/callback.py:
--------------------------------------------------------------------------------
"""Callback handlers used in the app."""
from typing import Any, Dict, List

from langchain.callbacks.base import AsyncCallbackHandler

from schemas import ChatResponse


class StreamingLLMCallbackHandler(AsyncCallbackHandler):
    """Callback handler for streaming LLM responses."""

    def __init__(self, websocket):
        self.websocket = websocket

    async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        resp = ChatResponse(sender="bot", message=token, type="stream")
        await self.websocket.send_json(resp.dict())


class QuestionGenCallbackHandler(AsyncCallbackHandler):
    """Callback handler for question generation."""

    def __init__(self, websocket):
        self.websocket = websocket

    async def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Run when LLM starts running."""
        resp = ChatResponse(
            sender="bot", message="Synthesizing question...", type="info"
        )
        await self.websocket.send_json(resp.dict())
--------------------------------------------------------------------------------
/ingest.py:
--------------------------------------------------------------------------------
"""Load the Quarto search index, split it into chunks, and ingest into a FAISS vectorstore."""
import json
from typing import List

import requests
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.faiss import FAISS


class QuartoLoader(BaseLoader):
    """Load Quarto search.json files."""

    def __init__(self, url: str):
        """Initialize with a url that points to the search.json file."""
        self.url = url

    def load(self) -> List[Document]:
        """Load json from the url."""
        response = requests.get(self.url)
        content = response.content.decode("utf-8")
        index = json.loads(content)

        docs = []
        for doc in index:
            metadata = {k: doc[k] for k in ("objectID", "href", "section")}
            docs.append(Document(page_content=doc["text"], metadata=metadata))
        return docs


def ingest_docs():
    """Build a FAISS vectorstore of embeddings from the Quarto search index."""
    loader = QuartoLoader("https://quarto.org/search.json")
    raw_documents = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
    )
    documents = text_splitter.split_documents(raw_documents)
    embeddings = OpenAIEmbeddings()
    vectorstore = FAISS.from_documents(documents, embeddings)
    return vectorstore
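

# Optional caching sketch (an illustrative addition, not part of the original
# app): main.py calls ingest_docs() on every startup, which re-downloads and
# re-embeds the entire search index. Since .gitignore already references
# vectorstore.pkl, a cached variant could look like this; the function name
# and caching behavior are assumptions.
def load_or_ingest(path: str = "vectorstore.pkl"):
    """Reuse a pickled vectorstore if present; otherwise ingest and cache it."""
    import pickle
    from pathlib import Path

    cache = Path(path)
    if cache.exists():
        with cache.open("rb") as f:
            return pickle.load(f)
    vectorstore = ingest_docs()
    with cache.open("wb") as f:
        pickle.dump(vectorstore, f)
    return vectorstore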


if __name__ == "__main__":
    ingest_docs()
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
"""Main entrypoint for the app."""
import os

os.environ["LANGCHAIN_HANDLER"] = "langchain"

import logging
from typing import Optional

from fastapi import FastAPI, Request, WebSocket, WebSocketDisconnect
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from langchain.vectorstores import VectorStore

from callback import QuestionGenCallbackHandler, StreamingLLMCallbackHandler
from ingest import ingest_docs
from query_data import get_chain
from schemas import ChatResponse

app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")
vectorstore: Optional[VectorStore] = None


@app.on_event("startup")
async def startup_event():
    logging.info("loading vectorstore")
    global vectorstore
    vectorstore = ingest_docs()


@app.get("/")
async def get(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})


@app.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    question_handler = QuestionGenCallbackHandler(websocket)
    stream_handler = StreamingLLMCallbackHandler(websocket)
    chat_history = []
    qa_chain = get_chain(vectorstore, question_handler, stream_handler, tracing=False)
    # Use the below line instead of the above line to enable tracing
    # Ensure `langchain-server` is running
    # qa_chain = get_chain(vectorstore, question_handler, stream_handler, tracing=True)

    while True:
        try:
            # Receive and send back the client message
            question = await websocket.receive_text()
            resp = ChatResponse(sender="you", message=question, type="stream")
            await websocket.send_json(resp.dict())

            # Construct a response
            start_resp = ChatResponse(sender="bot", message="", type="start")
            await websocket.send_json(start_resp.dict())
            logging.info(f"start_resp={start_resp.dict()}")

            result = await qa_chain.acall(
                {"question": question, "chat_history": chat_history}
            )
            chat_history.append((question, result["answer"]))

            end_resp = ChatResponse(sender="bot", message="", type="end")
            logging.info(f"end_resp={end_resp.dict()}")

            await websocket.send_json(end_resp.dict())
        except WebSocketDisconnect:
            logging.info("websocket disconnect")
            break
        except Exception as e:
            logging.error(e)
            resp = ChatResponse(
                sender="bot",
                message="Sorry, something went wrong. Try again.",
                type="error",
            )
            # Notify the client before re-raising, so the error reaches the UI
            await websocket.send_json(resp.dict())
            raise e


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=9000)
Try again.", 78 | type="error", 79 | ) 80 | raise e 81 | await websocket.send_json(resp.dict()) 82 | 83 | if __name__ == "__main__": 84 | import uvicorn 85 | uvicorn.run(app, host="0.0.0.0", port=9000) 86 | -------------------------------------------------------------------------------- /query_data.py: -------------------------------------------------------------------------------- 1 | """Create a ChatVectorDBChain for question/answering.""" 2 | import os 3 | os.environ["LANGCHAIN_HANDLER"] = "langchain" 4 | 5 | from langchain.callbacks.base import AsyncCallbackManager 6 | from langchain.callbacks.tracers import LangChainTracer 7 | from langchain.chains import ChatVectorDBChain 8 | from langchain.chains.chat_vector_db.prompts import CONDENSE_QUESTION_PROMPT 9 | from langchain.prompts.prompt import PromptTemplate 10 | from langchain.chains.llm import LLMChain 11 | from langchain.chains.question_answering import load_qa_chain 12 | from langchain.llms import OpenAI 13 | from langchain.vectorstores.base import VectorStore 14 | from pydantic import Field 15 | 16 | doc_template="""--- document start --- 17 | href: {href} 18 | section: {section} 19 | content:{page_content} 20 | --- document end --- 21 | """ 22 | 23 | QUARTO_DOC_PROMPT = PromptTemplate( 24 | template=doc_template, 25 | input_variables=["page_content", "href", "section"] 26 | ) 27 | 28 | prompt_template = """You are an AI assistant for the open source library Quarto. The documentation is located at https://quarto.org/docs 29 | You are given the following extracted parts of a long document and a question. Provide a conversational answer with a hyperlink to the documentation. 30 | You can construct the hyperlink by using the href and section fields in the context and the base url https://quarto.org/. 31 | You should only use hyperlinks that are explicitly listed as a source in the context. Do NOT make up a hyperlink that is not listed. 32 | You should only show code examples that are explicitly listed in the documentation. Do not make up code examples. 33 | If the question includes a request for code, provide a fenced code block directly from the documentation. 34 | If you don't know the answer, just say "Hmm, I'm not sure." Don't try to make up an answer. 35 | If the question is not about Quarto, politely inform them that you are tuned to only answer questions about Quarto. 
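
# For illustration only (these href/section values are invented): with the
# metadata that QuartoLoader attaches in ingest.py, one rendered context
# document looks roughly like
#
#   --- document start ---
#   href: docs/output-formats/pdf-basics.html
#   section: Overview
#   content: Use the pdf format to create PDF output. ...
#   --- document end ---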

prompt_template = """You are an AI assistant for the open source library Quarto. The documentation is located at https://quarto.org/docs
You are given the following extracted parts of a long document and a question. Provide a conversational answer with a hyperlink to the documentation.
You can construct the hyperlink by using the href and section fields in the context and the base url https://quarto.org/.
You should only use hyperlinks that are explicitly listed as a source in the context. Do NOT make up a hyperlink that is not listed.
You should only show code examples that are explicitly listed in the documentation. Do not make up code examples.
If the question includes a request for code, provide a fenced code block directly from the documentation.
If you don't know the answer, just say "Hmm, I'm not sure." Don't try to make up an answer.
If the question is not about Quarto, politely inform them that you are tuned to only answer questions about Quarto.

Question: {question}

Documents:
=========
{context}
=========

Answer in Markdown:"""
QA_PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)


def get_chain(
    vectorstore: VectorStore, question_handler, stream_handler, tracing: bool = False
) -> ChatVectorDBChain:
    """Create a ChatVectorDBChain for question/answering."""
    # Construct a ChatVectorDBChain with a streaming llm for combine docs
    # and a separate, non-streaming llm for question generation
    manager = AsyncCallbackManager([])
    question_manager = AsyncCallbackManager([question_handler])
    stream_manager = AsyncCallbackManager([stream_handler])
    if tracing:
        tracer = LangChainTracer()
        tracer.load_default_session()
        manager.add_handler(tracer)
        question_manager.add_handler(tracer)
        stream_manager.add_handler(tracer)

    question_gen_llm = OpenAI(
        model_name="gpt-3.5-turbo",
        max_retries=15,
        max_tokens=520,
        temperature=0.5,
        verbose=True,
        callback_manager=question_manager,
    )
    streaming_llm = OpenAI(
        streaming=True,
        max_retries=15,
        callback_manager=stream_manager,
        verbose=True,
        temperature=0,
    )

    question_generator = LLMChain(
        llm=question_gen_llm, prompt=CONDENSE_QUESTION_PROMPT, callback_manager=manager
    )
    doc_chain = load_qa_chain(
        streaming_llm,
        chain_type="stuff",
        prompt=QA_PROMPT,
        document_prompt=QUARTO_DOC_PROMPT,
        callback_manager=manager,
    )

    qa = ChatVectorDBChain(
        vectorstore=vectorstore,
        combine_docs_chain=doc_chain,
        question_generator=question_generator,
        callback_manager=manager,
        top_k_docs_for_context=10,
    )
    return qa
--------------------------------------------------------------------------------
/render.yaml:
--------------------------------------------------------------------------------
services:
  - type: web
    name: quarto-bot
    env: docker
    envVars:
      - key: OPENAI_API_KEY
        sync: false
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
openai
fastapi==0.92.0
black
isort
websockets==10.4
pydantic
langchain==0.0.100
uvicorn==0.20.0
jinja2
faiss-cpu==1.7.3
bs4
unstructured==0.5.2
libmagic==1.0
--------------------------------------------------------------------------------
/schemas.py:
--------------------------------------------------------------------------------
"""Schemas for the chat app."""
from pydantic import BaseModel, validator


class ChatResponse(BaseModel):
    """Chat response schema."""

    sender: str
    message: str
    type: str

    @validator("sender")
    def sender_must_be_bot_or_you(cls, v):
        if v not in ["bot", "you"]:
            raise ValueError("sender must be bot or you")
        return v

    @validator("type")
    def validate_message_type(cls, v):
        if v not in ["start", "stream", "end", "error", "info"]:
            raise ValueError("type must be start, stream, end, error or info")
        return v
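

# Quick illustration of the validators above (an illustrative sketch; the app
# itself only constructs valid messages):
#
#   ChatResponse(sender="bot", message="hi", type="stream")     # OK
#   ChatResponse(sender="server", message="hi", type="stream")  # raises a
#   # pydantic ValidationError: "sender must be bot or you"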
--------------------------------------------------------------------------------
/static/favicon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hamelsmu/chat-langchain/b93c27cbf0392a829d6fb17bbe15b4cf5759119d/static/favicon.png
--------------------------------------------------------------------------------
/static/styles.css:
--------------------------------------------------------------------------------
a {
    color: rgb(85, 186, 219);
    text-decoration: underline;
}

.grey-text {
    color: lightgrey;
}

.footer-text {
    font-size: 13px;
}
--------------------------------------------------------------------------------
/templates/index.html:
--------------------------------------------------------------------------------
[The markup for this template did not survive extraction; only the page title
"Quarto Help Bot" and stray line numbers remain. Per the README, the template
renders the chat UI and contains the line
`var endpoint = "wss://quarto-bot.onrender.com/chat";`, which must be pointed
at `ws://localhost:9000/chat` for local runs.]
--------------------------------------------------------------------------------