├── requirements.txt
├── app_streamlit.py
├── LICENSE
├── .gitignore
├── streams.ipynb
└── README.md


/requirements.txt:
--------------------------------------------------------------------------------
1 | streamlit
2 | openai
3 | 


--------------------------------------------------------------------------------
/app_streamlit.py:
--------------------------------------------------------------------------------
 1 | import openai
 2 | import os
 3 | import time
 4 | import streamlit as st
 5 | openai.api_key = st.secrets["OPENAI_API_KEY"]
 6 | #INPUT INTERFACE & SUBMIT BUTTON
 7 | st.title('Stream LLM responses')
 8 | prompt = st.text_input(label="Prompt: ")
 9 | submit_button_exist = st.button('Submit')
10 | #CALL TO API
11 | start_time = time.time()
12 | max_response_length = 200
13 | delay_time = 0.01
14 | response = openai.ChatCompletion.create(
15 |     # CHATPG GPT API REQQUEST
16 |     model='gpt-3.5-turbo',
17 |     messages=[
18 |         {'role': 'user', 'content': f'{prompt}'}
19 |     ],
20 |     max_tokens=max_response_length,
21 |     temperature=0,
22 |     stream=True,  # this time, we set stream=True
23 | )
24 | #PRINTING OUTPUT
25 | answer=''
26 | 
27 | if submit_button_exist is True:
28 |     c = st.empty()
29 |     for event in response:
30 |         c.write(answer)
31 |         event_time = time.time() - start_time
32 |         event_text = event['choices'][0]['delta']
33 |         answer += event_text.get('content', '')
34 |         time.sleep(delay_time)
35 | else: 
36 |   st.write("")
37 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2023 Teemu Maatta
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | pip-wheel-metadata/
 24 | share/python-wheels/
 25 | *.egg-info/
 26 | .installed.cfg
 27 | *.egg
 28 | MANIFEST
 29 | 
 30 | # PyInstaller
 31 | #  Usually these files are written by a python script from a template
 32 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 33 | *.manifest
 34 | *.spec
 35 | 
 36 | # Installer logs
 37 | pip-log.txt
 38 | pip-delete-this-directory.txt
 39 | 
 40 | # Unit test / coverage reports
 41 | htmlcov/
 42 | .tox/
 43 | .nox/
 44 | .coverage
 45 | .coverage.*
 46 | .cache
 47 | nosetests.xml
 48 | coverage.xml
 49 | *.cover
 50 | *.py,cover
 51 | .hypothesis/
 52 | .pytest_cache/
 53 | 
 54 | # Translations
 55 | *.mo
 56 | *.pot
 57 | 
 58 | # Django stuff:
 59 | *.log
 60 | local_settings.py
 61 | db.sqlite3
 62 | db.sqlite3-journal
 63 | 
 64 | # Flask stuff:
 65 | instance/
 66 | .webassets-cache
 67 | 
 68 | # Scrapy stuff:
 69 | .scrapy
 70 | 
 71 | # Sphinx documentation
 72 | docs/_build/
 73 | 
 74 | # PyBuilder
 75 | target/
 76 | 
 77 | # Jupyter Notebook
 78 | .ipynb_checkpoints
 79 | 
 80 | # IPython
 81 | profile_default/
 82 | ipython_config.py
 83 | 
 84 | # pyenv
 85 | .python-version
 86 | 
 87 | # pipenv
 88 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 89 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 90 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 91 | #   install all needed dependencies.
 92 | #Pipfile.lock
 93 | 
 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 95 | __pypackages__/
 96 | 
 97 | # Celery stuff
 98 | celerybeat-schedule
 99 | celerybeat.pid
100 | 
101 | # SageMath parsed files
102 | *.sage.py
103 | 
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 | 
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 | 
117 | # Rope project settings
118 | .ropeproject
119 | 
120 | # mkdocs documentation
121 | /site
122 | 
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 | 
128 | # Pyre type checker
129 | .pyre/
130 | 


--------------------------------------------------------------------------------
/streams.ipynb:
--------------------------------------------------------------------------------
1 | {"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.7.12","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"code","source":"!pip install --upgrade openai\nimport openai\nimport time\nopenai.api_key = user_secrets.get_secret(\"OPENAI_API_KEY\")\nstartime = time.time()","metadata":{"_uuid":"8f2839f25d086af736a60e9eeb907d3b93b6e0e5","_cell_guid":"b1076dfc-b9ad-4769-8c92-a6c4dae69d19","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"### STREAM GPT-4 API RESPONSES\ndelay_time = 0.01 #  faster\nmax_response_length = 8000\nanswer = ''\n# ASK QUESTION\nprompt = input(\"Ask a question: \")\nstart_time = time.time()\n\nresponse = openai.ChatCompletion.create(\n    # GPT-4 API REQQUEST\n    model='gpt-4',\n    messages=[\n        {'role': 'user', 'content': f'{prompt}'}\n    ],\n    max_tokens=max_response_length,\n    temperature=0,\n    stream=True,  # this time, we set stream=True\n)\n\nfor event in response: \n    # STREAM THE ANSWER\n    print(answer, end='', flush=True) # Print the response    \n    # RETRIEVE THE TEXT FROM THE RESPONSE\n    event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT\n    event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE\n    answer = event_text.get('content', '') # RETRIEVE CONTENT\n    time.sleep(delay_time)","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"### STREAM CHATGPT API RESPONSES\ndelay_time = 0.01 #  faster\nmax_response_length = 200\nanswer = ''\n# ASK QUESTION\nprompt = input(\"Ask a question: \")\nstart_time = time.time()\n\nresponse = openai.ChatCompletion.create(\n    # CHATPG GPT API REQQUEST\n    model='gpt-3.5-turbo',\n    messages=[\n        {'role': 'user', 'content': f'{prompt}'}\n    ],\n    max_tokens=max_response_length,\n    temperature=0,\n    stream=True,  # this time, we set stream=True\n)\n\nfor event in response: \n    # STREAM THE ANSWER\n    print(answer, end='', flush=True) # Print the response    \n    # RETRIEVE THE TEXT FROM THE RESPONSE\n    event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT\n    event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE\n    answer = event_text.get('content', '') # RETRIEVE CONTENT\n    time.sleep(delay_time)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"collected_events = []\ncompletion_text = []\nspeed = 0.05 #smaller is faster\nmax_response_length = 200\nstart_time = time.time()\nprompt = input(\"Ask a question: \")\n# Generate Answer\nresponse = openai.Completion.create(\n    model='text-davinci-003',\n    prompt=prompt,\n    max_tokens=max_response_length,\n    temperature=0,\n    stream=True,  # this time, we set stream=True\n)\n\n# Stream Answer\nfor event in response:\n    event_time = time.time() - start_time  # calculate the time delay of the event\n    collected_events.append(event)  # save the event response\n    event_text = event['choices'][0]['text']  # extract the text\n    completion_text += event_text  # append the text\n    time.sleep(speed)\n    print(f\"{event_text}\", end=\"\", flush=True)","metadata":{"trusted":true},"execution_count":null,"outputs":[]}]}
2 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | <div align="center"> 
  2 |   
  3 | [![GitHub Repo stars](https://img.shields.io/github/stars/tmgthb/Stream-responses?style=social)](https://github.com/tmgthb/Stream-responses/stargazers) 
  4 | [![Twitter Follow](https://img.shields.io/twitter/follow/Teemumtt3?style=social)](https://twitter.com/Teemumtt3)
  5 | 
  6 | </div>  
  7 | 
  8 | # Stream GPT-4, ChatGPT and GPT-3.5 responses & Deploy Streamlit apps
  9 | The code is to demonstrate usage of streaming with GPT-4 API, ChatGPT API and InstructGPT (GPT-3.5.) models & Streamlit-app.
 10 | 
 11 | ## Pre-requisites:
 12 | The approach uses only openai and time libraries and re-prints the streams using print(end='', flush=True):
 13 | 
 14 | ```python
 15 | 
 16 | !pip install --upgrade openai
 17 | import openai
 18 | import time
 19 | openai.api_key = user_secrets.get_secret("OPENAI_API_KEY")
 20 | startime = time.time()
 21 | ```
 22 | Disclaimer: The downside of streaming in production usage is the control of appropiate usage policy: https://beta.openai.com/docs/usage-guidelines, which should be reviewed in advance for each application, so I suggest to take a look this policy prior deciding to use streaming. 
 23 | 
 24 | ## How to stream GPT-4 API model (gpt-4 or gpt-4-32k & gpt-4-32k-0314) responses? 
 25 | 
 26 | Run the file streams.ipnyb first part. 
 27 | 
 28 | ```python
 29 | ### STREAM GPT-4 API RESPONSES
 30 | delay_time = 0.01 #  faster
 31 | max_response_length = 8000
 32 | answer = ''
 33 | # ASK QUESTION
 34 | prompt = input("Ask a question: ")
 35 | start_time = time.time()
 36 | 
 37 | response = openai.ChatCompletion.create(
 38 |     # GPT-4 API REQQUEST
 39 |     model='gpt-4',
 40 |     messages=[
 41 |         {'role': 'user', 'content': f'{prompt}'}
 42 |     ],
 43 |     max_tokens=max_response_length,
 44 |     temperature=0,
 45 |     stream=True,  # this time, we set stream=True
 46 | )
 47 | 
 48 | for event in response: 
 49 |     # STREAM THE ANSWER
 50 |     print(answer, end='', flush=True) # Print the response    
 51 |     # RETRIEVE THE TEXT FROM THE RESPONSE
 52 |     event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT
 53 |     event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE
 54 |     answer = event_text.get('content', '') # RETRIEVE CONTENT
 55 |     time.sleep(delay_time)
 56 | ```
 57 | 
 58 | 
 59 | 
 60 | After inserting the user input and pressing enter, you should see the output printed:
 61 | 
 62 | ![image](https://user-images.githubusercontent.com/46755670/225767202-41026ba9-bbf0-4941-ba5b-64d60de539dc.png)
 63 | 
 64 | 
 65 | ## How to stream ChatGPT API model (gpt-3.5-turbo) responses? 
 66 | Run the file streams.ipnyb second part. Add user input and you should see similar to below:
 67 | ```python
 68 | ### STREAM CHATGPT API RESPONSES
 69 | delay_time = 0.01 #  faster
 70 | max_response_length = 200
 71 | answer = ''
 72 | # ASK QUESTION
 73 | prompt = input("Ask a question: ")
 74 | start_time = time.time()
 75 | 
 76 | response = openai.ChatCompletion.create(
 77 |     # CHATPG GPT API REQQUEST
 78 |     model='gpt-3.5-turbo',
 79 |     messages=[
 80 |         {'role': 'user', 'content': f'{prompt}'}
 81 |     ],
 82 |     max_tokens=max_response_length,
 83 |     temperature=0,
 84 |     stream=True,  # this time, we set stream=True
 85 | )
 86 | 
 87 | for event in response: 
 88 |     # STREAM THE ANSWER
 89 |     print(answer, end='', flush=True) # Print the response    
 90 |     # RETRIEVE THE TEXT FROM THE RESPONSE
 91 |     event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT
 92 |     event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE
 93 |     answer = event_text.get('content', '') # RETRIEVE CONTENT
 94 |     time.sleep(delay_time)
 95 | ```
 96 | 
 97 | ![image](https://user-images.githubusercontent.com/46755670/225769321-0a361329-3a2e-469a-b6b8-057d121c3179.png)
 98 | 
 99 | 
100 | ## How to stream InstructGPT API model (text-davinci-003) responses? 
101 | Run the file streams.pnyb third part. Add user input and you should see similar to below:
102 | 
103 | 
104 | ```python
105 | collected_events = []
106 | completion_text = []
107 | speed = 0.05 #smaller is faster
108 | max_response_length = 200
109 | start_time = time.time()
110 | prompt = input("Ask a question: ")
111 | # Generate Answer
112 | response = openai.Completion.create(
113 |     model='text-davinci-003',
114 |     prompt=prompt,
115 |     max_tokens=max_response_length,
116 |     temperature=0,
117 |     stream=True,  # this time, we set stream=True
118 | )
119 | 
120 | # Stream Answer
121 | for event in response:
122 |     event_time = time.time() - start_time  # calculate the time delay of the event
123 |     collected_events.append(event)  # save the event response
124 |     event_text = event['choices'][0]['text']  # extract the text
125 |     completion_text += event_text  # append the text
126 |     time.sleep(speed)
127 |     print(f"{event_text}", end="", flush=True)
128 | ```
129 | 
130 | 
131 | ![image](https://user-images.githubusercontent.com/46755670/224536590-bbe76d52-4356-4b0c-a0c0-e3aefbeb178b.png)
132 | 
133 | 
134 | ## How to create Streamlit app with OpenAI API?
135 | I add a working "app_streamlit.py"-file, which you can fork to your repository with the "requirements.txt" and deploy it in Streamlit. 
136 | 
137 | ![image](https://user-images.githubusercontent.com/46755670/230953129-51775d7d-9585-4e1a-bdcb-9a32a23421c4.png)
138 | 
139 | In the advanced settings, add the OPENAI_API_KEY-variable using format: 
140 | ```python
141 | OPENAI_API_KEY = "INSERT HERE YOUR KEY"
142 | ```
143 | 
144 | 
145 | 
146 | 
147 | 
148 | ## Suggestions and improvements
149 | Feel free to fork and further improve the code as per the license. For example you can further improve the ChatML to ensure the flow follows desired "system" rules. I left these empty now to make this basic script very generic. I recommend to check my articles specific to ChatGPT API about [streaming responses](https://tmmtt.medium.com/how-to-stream-chatgpt-api-responses-b783f1e5f13d) in Medium related to Streaming, [ChatML: guiding prompts with system, assistant and user roles](https://tmmtt.medium.com/chat-markup-language-chatml-35767c2c69a1) and [ChatGPT API introduction tutorial](https://tmmtt.medium.com/chatgpt-api-tutorial-3da433eb041e). 
150 | 


--------------------------------------------------------------------------------