├── .gitignore ├── LICENSE ├── README.md ├── chat.py ├── requirements.txt ├── scratchpad.txt └── system_message.txt /.gitignore: -------------------------------------------------------------------------------- 1 | key_openai.txt 2 | .idea 3 | .venv 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 David Shapiro 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Coding Chatbot with Scratchpad 2 | 3 | This Python script is a command-line interface for interacting with OpenAI's GPT-4 model. 
It allows users to have a conversation with the chatbot and update a scratchpad with relevant information that the chatbot can use to provide more accurate and context-aware responses. 4 | 5 | ## Features 6 | 7 | - **User Input**: The script provides two input modes for the user: 8 | - *Normal Mode*: Single-line input for regular conversation with the chatbot. 9 | - *Multi-line Mode*: Allows users to enter multiple lines of text to update the scratchpad. To enter this mode, type `SCRATCHPAD` and press Enter. Type `END` to save and exit the multi-line input mode. 10 | 11 | - **Scratchpad**: The scratchpad is a text file (`scratchpad.txt`) that can be updated by the user during the conversation. The chatbot uses the information in the scratchpad to provide more context-aware responses. 12 | 13 | - **System Message**: A system message is used to provide the chatbot with instructions on how to use the scratchpad. The content of the system message is stored in a separate text file (`system_message.txt`). 14 | 15 | - **Conversation History**: The script maintains a conversation history, which is used to provide context to the chatbot. If the conversation becomes too long, the oldest message is removed to fit within the model's token limit. 16 | 17 | - **API Communication**: The script communicates with OpenAI's API to generate responses from the chatbot. It handles errors and retries if necessary. 18 | 19 | Enjoy your conversation with the chatbot and make use of the scratchpad to provide additional context for more accurate and helpful responses! 20 | 21 | ## Setup 22 | 23 | 1. Install the dependencies: `pip install openai halo PyYAML --upgrade` (or `pip install -r requirements.txt`) 24 | 2. Create `key_openai.txt` and populate it with your OpenAI API key. Note, you need access to the GPT-4 API. 25 | 26 | ## Usage 27 | 28 | 1. Run `python chat.py` 29 | 2. Type `SCRATCHPAD` to enter scratchpad update mode.
This will allow you to copy/paste multiple lines. Type `END` on the last line to save and exit. 30 | 31 | ## Example Output: 32 | 33 | Below is example output from the chatbot. It uses a combination of `textwrap` and newlines so that code is preserved in the console output without becoming too wide. Note, the scratchpad was previously populated with another Python script. You can see that the model understood the existing code and the request well enough to infer which other functions were necessary. 34 | 35 | `````` 36 | ****** IMPORTANT: ****** 37 | 38 | Type SCRATCHPAD to enter multi line input mode to update scratchpad. Type END to save and exit. 39 | 40 | 41 | 42 | [NORMAL] USER: 43 | write a new search_kb function. assume that each KB article is a YAML file in a folder called kb. the elements in the KB ar article and vector. vector is a high dimensional semantic embedding that can be used to calculate cosine similarity with numpy. we will be using Google USEv5 for the embeddings 44 | 45 | 46 | 47 | CHATBOT: 48 | First, let's import the necessary libraries: 49 | 50 | ```python 51 | import os 52 | import numpy as np 53 | from sklearn.metrics.pairwise import cosine_similarity 54 | import tensorflow_hub as hub 55 | ``` 56 | 57 | Now, let's create a function to load all the KB articles and their embeddings: 58 | 59 | ```python 60 | def load_kb_articles(kb_folder='kb'): 61 | kb_articles = [] 62 | for filename in os.listdir(kb_folder): 63 | if filename.endswith('.yaml'): 64 | filepath = os.path.join(kb_folder, filename) 65 | article = open_yaml(filepath) 66 | kb_articles.append(article) 67 | return kb_articles 68 | ``` 69 | 70 | Next, let's create a function to calculate the cosine similarity between the query and the KB articles: 71 | 72 | ```python 73 | def calculate_similarity(query, kb_articles): 74 | # Load the Universal Sentence Encoder 75 | embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/5") 76 | 77 | # Calculate the query embedding 78 | 
query_embedding = embed([query])[0].numpy() 79 | 80 | # Calculate the cosine similarity between the query and the KB articles 81 | similarities = [] 82 | for article in kb_articles: 83 | article_embedding = np.array(article['vector']) 84 | similarity = cosine_similarity(query_embedding.reshape(1, -1), article_embedding.reshape(1, -1)) 85 | similarities.append(similarity[0][0]) 86 | 87 | return similarities 88 | ``` 89 | 90 | Finally, let's update the `search_kb` function to use the above functions and return the most relevant article: 91 | 92 | ```python 93 | def search_kb(query): 94 | # Load the KB articles 95 | kb_articles = load_kb_articles() 96 | 97 | # Calculate the cosine similarity between the query and the KB articles 98 | similarities = calculate_similarity(query, kb_articles) 99 | 100 | # Find the index of the most similar article 101 | most_similar_index = np.argmax(similarities) 102 | 103 | # Return the most similar article 104 | return kb_articles[most_similar_index]['article'] 105 | ``` 106 | 107 | Now, the `search_kb` function should return the most relevant KB article based on the cosine similarity between the 108 | query and the article embeddings. 
109 | `````` -------------------------------------------------------------------------------- /chat.py: -------------------------------------------------------------------------------- 1 | from halo import Halo 2 | import openai 3 | from time import time, sleep 4 | import textwrap 5 | import sys 6 | import yaml  # used by save_yaml() and open_yaml() 7 | 8 | 9 | ### file operations 10 | 11 | def save_yaml(filepath, data): 12 | with open(filepath, 'w', encoding='utf-8') as file: 13 | yaml.dump(data, file, allow_unicode=True) 14 | 15 | 16 | def open_yaml(filepath): 17 | with open(filepath, 'r', encoding='utf-8') as file: 18 | data = yaml.load(file, Loader=yaml.FullLoader) 19 | return data 20 | 21 | 22 | def save_file(filepath, content): 23 | with open(filepath, 'w', encoding='utf-8') as outfile: 24 | outfile.write(content) 25 | 26 | 27 | def open_file(filepath): 28 | with open(filepath, 'r', encoding='utf-8', errors='ignore') as infile: 29 | return infile.read() 30 | 31 | 32 | ### API functions 33 | 34 | 35 | def chatbot(conversation, model="gpt-4-0613", temperature=0): 36 | max_retry = 7 37 | retry = 0 38 | while True: 39 | try: 40 | response = openai.ChatCompletion.create(model=model, messages=conversation, temperature=temperature) 41 | text = response['choices'][0]['message']['content'] 42 | return text, response['usage']['total_tokens'] 43 | except Exception as oops: 44 | print(f'\n\nError communicating with OpenAI: "{oops}"') 45 | if 'maximum context length' in str(oops): 46 | conversation.pop(0)  # drop the oldest message to fit the context window 47 | print('\n\n DEBUG: Trimming oldest message') 48 | continue 49 | retry += 1 50 | if retry >= max_retry: 51 | print(f"\n\nExiting due to excessive errors in API: {oops}") 52 | exit(1) 53 | print(f'\n\nRetrying in {2 ** (retry - 1) * 5} seconds...') 54 | sleep(2 ** (retry - 1) * 5) 55 | 56 | 57 | ### MAIN LOOP 58 | 59 | 60 | def multi_line_input(): 61 | print('\n\n\nType END to save and exit.\n[MULTI] USER:\n') 62 | lines = [] 63 | while True: 64 | line = input() 65 | if line == "END": 66 | break 67 | 
lines.append(line) 68 | return "\n".join(lines) 69 | 70 | 71 | if __name__ == '__main__': 72 | # instantiate chatbot 73 | openai.api_key = open_file('key_openai.txt').strip() 74 | ALL_MESSAGES = list() 75 | print('\n\n****** IMPORTANT: ******\n\nType SCRATCHPAD to enter multi line input mode to update scratchpad. Type END to save and exit.') 76 | 77 | while True: 78 | # get user input 79 | text = input('\n\n\n[NORMAL] USER:\n\n') 80 | 81 | # check if scratchpad updated, continue 82 | if 'SCRATCHPAD' in text: 83 | text = multi_line_input() 84 | save_file('scratchpad.txt', text.strip())  # multi_line_input() already drops the END sentinel 85 | print('\n\n##### Scratchpad updated!') 86 | continue 87 | if text == '': 88 | # empty submission, probably on accident 89 | continue 90 | 91 | # continue with composing conversation and response 92 | ALL_MESSAGES.append({'role': 'user', 'content': text}) 93 | system_message = open_file('system_message.txt').replace('<>', open_file('scratchpad.txt')) 94 | conversation = list() 95 | conversation += ALL_MESSAGES 96 | conversation.append({'role': 'system', 'content': system_message}) 97 | 98 | # generate a response 99 | spinner = Halo(text='Coding...', spinner='dots') 100 | spinner.start() 101 | response, tokens = chatbot(conversation) 102 | spinner.stop() 103 | if tokens > 7500: 104 | ALL_MESSAGES.pop(0) 105 | ALL_MESSAGES.append({'role': 'assistant', 'content': response}) 106 | print('\n\n\n\nCHATBOT:\n') 107 | formatted_lines = [textwrap.fill(line, width=120) for line in response.split('\n')] 108 | formatted_text = '\n'.join(formatted_lines) 109 | print(formatted_text) -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | aiohttp==3.8.4 2 | aiosignal==1.3.1 3 | async-timeout==4.0.2 4 | attrs==23.1.0 5 | certifi==2023.5.7 6 | charset-normalizer==3.1.0 7 | frozenlist==1.3.3 8 | idna==3.4 9 | multidict==6.0.4 10 | openai==0.27.8 11 | 
PyYAML==6.0 12 | requests==2.31.0 13 | tqdm==4.65.0 14 | urllib3==2.0.3 15 | yarl==1.9.2 16 | halo==0.0.31 -------------------------------------------------------------------------------- /scratchpad.txt: -------------------------------------------------------------------------------- 1 | blah -------------------------------------------------------------------------------- /system_message.txt: -------------------------------------------------------------------------------- 1 | MAIN PURPOSE 2 | You are a Python coding assistant. The USER will give you instructions to help write functions. You may ask for clarification if needed, but otherwise you should only output Python code. Adhere to PEP8. Provide explanations of the code only if the user asks for them. 3 | 4 | 5 | 6 | SCRATCHPAD 7 | The below scratchpad may be provided by the user so you are aware of the script they are working on. Note, this information may be blank. Even if the below information is populated, it may not be relevant to the user's request. Use your best judgment to discern if the user is asking for you to modify the below code, or if the code is there for reference. 8 | 9 | 10 | 11 | SCRATCHPAD: 12 | 13 | 14 | <> --------------------------------------------------------------------------------
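
As a footnote to the `search_kb` example in the README: the same nearest-article lookup can be sketched with NumPy alone, with no scikit-learn or TensorFlow Hub required. This is a minimal illustration, not the script's actual implementation; the in-memory `kb` list and the precomputed `query_vector` are hypothetical stand-ins for the `kb/*.yaml` files and the Google USE query embedding.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def search_kb(query_vector, kb_articles):
    """Return the article whose stored vector is most similar to query_vector."""
    sims = [cosine_similarity(query_vector, art['vector']) for art in kb_articles]
    return kb_articles[int(np.argmax(sims))]['article']


# Hypothetical in-memory KB standing in for the kb/*.yaml files.
kb = [
    {'article': 'How to reset your password', 'vector': [1.0, 0.0, 0.0]},
    {'article': 'How to cancel a subscription', 'vector': [0.0, 1.0, 0.0]},
]

print(search_kb([0.9, 0.1, 0.0], kb))  # → How to reset your password
```

This keeps the behavior of the README transcript (argmax over cosine similarities) while staying dependency-light; in the real script the vectors would be loaded from YAML and the query embedding computed by the USE model.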