├── .env ├── .github └── workflows │ └── codeql.yml ├── .gitignore ├── CODE_OF_CONDUCT.md ├── LICENSE ├── README.md ├── app ├── Chat.py ├── consts.py ├── funcs.py └── pages │ └── Upload File.py └── requirements.txt /.env: -------------------------------------------------------------------------------- 1 | OPENAI_API_KEY = "YOUR_OPENAI_KEY" 2 | SERP_API_KEY = "YOUR_SERP_API_KEY" -------------------------------------------------------------------------------- /.github/workflows/codeql.yml: -------------------------------------------------------------------------------- 1 | name: 'CodeQL' 2 | 3 | on: 4 | push: 5 | branches: ['main'] 6 | pull_request: 7 | branches: ['main'] 8 | schedule: 9 | - cron: '37 16 * * 3' 10 | 11 | jobs: 12 | analyze: 13 | name: Analyze 14 | 15 | runs-on: ${{ (matrix.language == 'python' && 'ubuntu-latest') || 'macos-latest' }} 16 | timeout-minutes: ${{ (matrix.language == 'python' && 360) || 120 }} 17 | permissions: 18 | actions: read 19 | contents: read 20 | security-events: write 21 | 22 | strategy: 23 | fail-fast: false 24 | matrix: 25 | language: ['python'] 26 | 27 | steps: 28 | - name: Checkout repository 29 | uses: actions/checkout@v4 30 | 31 | # Initializes the CodeQL tools for scanning. 32 | - name: Initialize CodeQL 33 | uses: github/codeql-action/init@v3 34 | with: 35 | languages: ${{ matrix.language }} 36 | 37 | - name: Autobuild 38 | uses: github/codeql-action/autobuild@v3 39 | 40 | - name: Perform CodeQL Analysis 41 | uses: github/codeql-action/analyze@v3 42 | with: 43 | category: '/language:${{matrix.language}}' 44 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | venv 2 | env 3 | app/__pycache__ 4 | .env 5 | test.py 6 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | programing.ninja0@gmail.com. 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 129 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Ayan Khan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GPT 3.5 ON STEROIDS: Autonomous Agent with knowledge beyond 2021 2 | 3 | Welcome to GPT 3.5 ON STEROID, an open-source project that enhances the capabilities of GPT by integrating it with various Python libraries and APIs for advanced text generation. 4 | 5 |

6 | 7 |

8 | 9 | ## Requirements 10 | 11 | Make sure you have the following Python libraries installed: 12 | - `openai` 13 | - `google-serp-api` 14 | - `tiktoken` 15 | - `wikipedia` 16 | - `trafilatura` 17 | - `streamlit` 18 | - `google-search-results` 19 | - `python-dotenv` 20 | - `youtube-transcript-api` 21 | - `openpyxl` 22 | - `PyPDF2` 23 | - `python-docx` 24 | - `pandasai` 25 | 26 | ## Installation 27 | 28 | To install the required packages, run the following command in your terminal: 29 | 30 | ```bash 31 | pip install -r requirements.txt 32 | ``` 33 | 34 | ## Additionally, you'll need API keys for the following services: 35 | - [SerpAPI](https://serpapi.com/) 36 | - [OpenAI](https://openai.com/) 37 | 38 | ## Running Streamlit 39 | 40 | To run the Streamlit application, execute the following command in your terminal: 41 | 42 | ```bash 43 | streamlit run ./app/Chat.py 44 | ``` 45 | 46 | ## Integrated Python Functions (Tools) 47 | 48 | GPT 3.5 ON STEROID incorporates various Python functions that GPT can call and use, including: 49 | 50 | - **Web Scraping:** Utilizing `google-serp-api` and `trafilatura` for dynamic data retrieval. 51 | - **Natural Language Processing:** Using `tiktoken` for language processing tasks. 52 | - **Information Retrieval:** Accessing data from `wikipedia` for comprehensive information retrieval. 53 | - **User Interface:** Employing `streamlit` for creating a user-friendly interface. 54 | 55 | **Note:** Whenever a new tool is added, please ensure the following: 56 | - Update the `requirements.txt` file to include the new tool/library. 57 | - Update the `README.md` file to document the newly added tool and its functionality. 58 | - Ensure that your feature does not break the application test before merging. 59 | 60 | ## Contribution Guidelines 61 | 62 | We welcome contributions from the community to make GPT 3.5 ON STEROID even better! Please follow these guidelines: 63 | 64 | 1. **Create an Issue:** First, create an issue detailing the feature, bug fix, or improvement you plan to work on. Wait for approval and assignment before proceeding to the next step. 65 | 66 | 2. **Assign Yourself:** After your issue is approved, get yourself assigned to it. This helps avoid duplication of efforts and ensures everyone is aware of ongoing work. 67 | 68 | 3. **Create a Pull Request (PR):** Once assigned, proceed to create your PR. Ensure to mention the assigned issue number in the PR description to link it properly. 69 | 70 | **Note:** PRs without assigned issues will be considered spammy and may lead to disqualification. 71 | 72 | 4. **Fork the repository and create your branch:** `git checkout -b feature/new-contribution` 73 | 74 | 5. **Make your changes and test thoroughly.** 75 | 76 | 6. **Commit your changes:** `git commit -m "Add a brief description of your changes"` 77 | 78 | 7. **Push to your forked repository:** `git push origin feature/new-contribution` 79 | 80 | 8. **Create a pull request to the main repository with proof of work attached.** 81 | 82 | ### Code of Conduct 83 | 84 | Please review our [Code of Conduct](CODE_OF_CONDUCT.md) to understand the community standards. 85 | 86 | ## License 87 | 88 | This project is licensed under the MIT License - see the [LICENSE.md](https://github.com/programmingninjas/GPT-3.5-ON-STEROIDS/blob/main/LICENSE) file for details. 89 | -------------------------------------------------------------------------------- /app/Chat.py: -------------------------------------------------------------------------------- 1 | """ 2 | This is where the program starts 3 | """ 4 | import time 5 | import json 6 | import sys 7 | import openai 8 | import streamlit as st 9 | from consts import OPENAI_API_KEY, SETUP_PROMPT, INSTRUCTION_PROMPT, now 10 | from funcs import ( 11 | google_tool, 12 | browse_website, 13 | write_to_file, 14 | append_to_file, 15 | read_file, 16 | open_file, 17 | search_wiki, 18 | type_message, 19 | ask_gpt, 20 | analyse_uploaded_file, 21 | youtube_transcript 22 | ) 23 | 24 | # TOOLS 25 | tools = { 26 | "google": google_tool, 27 | "browse_website": browse_website, 28 | "write_to_file": write_to_file, 29 | "append_to_file": append_to_file, 30 | "read_file": read_file, 31 | "open_file": open_file, 32 | "wikipedia": search_wiki, 33 | "youtube_transcript": youtube_transcript, 34 | "type_message": type_message 35 | } 36 | 37 | 38 | # MAIN 39 | def main(): 40 | """ 41 | Starting point of the program. 42 | """ 43 | # INITIAL SETUP 44 | st.title("GPT-3.5 on Steroids") 45 | 46 | 47 | if "messages" not in st.session_state: 48 | st.session_state.messages = [] 49 | 50 | for message in st.session_state.messages: 51 | with st.chat_message(message["role"]): 52 | st.markdown(message["content"]) 53 | 54 | # GETTING USER PROMPT 55 | prompt = st.chat_input("Enter Task") 56 | if not prompt: 57 | sys.exit() 58 | st.session_state.messages.append({"role": "user", "content": prompt}) 59 | with st.chat_message("user"): 60 | st.markdown(prompt) 61 | 62 | init_messages = [ 63 | {"role": "system", "content": SETUP_PROMPT}, 64 | {"role": "user", "content": prompt}, 65 | ] 66 | # FIRST REPLY 67 | reply = ask_gpt(init_messages) 68 | 69 | prompt1 = f"{reply}\n{INSTRUCTION_PROMPT}\nThe current time and date is {now}" 70 | init_messages += [ 71 | { 72 | "role": "system", 73 | "content": prompt1, 74 | }, 75 | { 76 | "role": "user", 77 | "content": "Determine which next command to use, and respond using the \ 78 | format specified above:", 79 | }, 80 | ] 81 | 82 | # SECOND REPLY 83 | init_reply = json.loads(ask_gpt(init_messages), strict=False) 84 | 85 | # DISPLAYING THE OUTPUT TO THE USER 86 | type_message({"text": init_reply["thoughts"]["text"]}) 87 | 88 | def execute(reply) -> str: 89 | """This is a recursive function which lets GPT run tools provided to it when it needs them. 90 | Args: 91 | reply: a dictionary which contains information like thoughts and which tool to use 92 | Returns: 93 | str: returns "task_completed" after running completely 94 | """ 95 | if reply["command"]["name"] == "task_complete": 96 | print("GPT Has done its work.") 97 | return "task_completed" 98 | try: 99 | time.sleep(5) 100 | if reply["command"]["name"] == "analyse_uploaded_file": 101 | try: 102 | result = analyse_uploaded_file(st.session_state.uploaded_file,reply["command"]["args"]) 103 | except: 104 | result = "This command returned nothing" 105 | else: 106 | result = tools[reply["command"]["name"]](reply["command"]["args"]) 107 | messages = [ 108 | { 109 | "role": "system", 110 | "content": prompt1 111 | + "\n" 112 | + "This reminds you of these events from your past:\n\ 113 | I was created and nothing new has happened.", 114 | }, 115 | { 116 | "role": "user", 117 | "content": "Determine which next command to use, \ 118 | and respond using the format specified above:", 119 | }, 120 | {"role": "assistant", "content": json.dumps(reply)}, 121 | { 122 | "role": "system", 123 | "content": f"Command {reply['command']['name']} returned: " 124 | + result, 125 | }, 126 | { 127 | "role": "user", 128 | "content": "Determine which next command to use, \ 129 | and respond using the format specified above:", 130 | }, 131 | ] 132 | reply = json.loads(ask_gpt(messages), strict=False) 133 | type_message({"text": reply["thoughts"]["text"]}) 134 | execute(reply) 135 | 136 | except Exception as error: 137 | type_message({"text": f"Task aborted due to error: {error}"}) 138 | return "task_completed" 139 | 140 | execute(init_reply) 141 | 142 | 143 | if __name__ == "__main__": 144 | main() 145 | -------------------------------------------------------------------------------- /app/consts.py: -------------------------------------------------------------------------------- 1 | """ 2 | This module includes variables like api keys and prompts 3 | """ 4 | import os 5 | from datetime import datetime 6 | import tiktoken 7 | from dotenv import load_dotenv 8 | 9 | # LOADING DOTENV 10 | load_dotenv() 11 | 12 | # API KEYS 13 | SERP_API_KEY = os.getenv("SERP_API_KEY") 14 | OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") 15 | 16 | # PROMPTS 17 | SETUP_PROMPT = """ 18 | Your task is to devise up to 5 highly effective goals and an appropriate role-based name (_GPT) for an autonomous agent, ensuring that the goals are optimally aligned with the successful completion of its assigned task. 19 | 20 | The user will provide the task, you will provide only the output in the exact format specified below with no explanation or conversation. 21 | 22 | Example input: 23 | Help me with marketing my business 24 | 25 | Example output: 26 | Name: CMOGPT 27 | Description: a professional digital marketer AI that assists Solopreneurs in growing their businesses by providing world-class expertise in solving marketing problems for SaaS, content products, agencies, and more. 28 | Goals: 29 | - Engage in effective problem-solving, prioritization, planning, and supporting execution to address your marketing needs as your virtual Chief Marketing Officer. 30 | 31 | - Provide specific, actionable, and concise advice to help you make informed decisions without the use of platitudes or overly wordy explanations. 32 | 33 | - Identify and prioritize quick wins and cost-effective campaigns that maximize results with minimal time and budget investment. 34 | 35 | - Proactively take the lead in guiding you and offering suggestions when faced with unclear information or uncertainty to ensure your marketing strategy remains on track. 36 | """ 37 | INSTRUCTION_PROMPT = """ 38 | Constraints: 39 | 1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files. 40 | 2. No user assistance/input. 41 | 3. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember. 42 | 4. Exclusively use the commands listed in double quotes e.g. "command name" 43 | 44 | Commands: 45 | 1. google: Google Search, args: "query": "" 46 | 2. wikipedia: Wikipedia Search, args: "query": "" 47 | 3. browse_website: Browse website, args: "url": "", "question": "" 48 | 4. youtube_transcript: Returns transcript of the YouTube video, args: "video_id": "" 49 | 5. write_to_file: Write to file for long term memory, args: "filename": "", "text": "" 50 | 6. open_file: Provide file to user for download, args: "path": "" 51 | 7. analyse_uploaded_file: Provide uploaded file to you for analysis,calculations and plottings, args: "query":"" 52 | 8. append_to_file: Append to file, args: "filename": "", "text": "" 53 | 9. read_file: Read a file only after creation, args: "filename": "" 54 | 10. task_complete: Task Complete (Shutdown), args: "reason": "" 55 | 56 | Resources: 57 | 1. Internet access for searches, information gathering and youtube transcripts. 58 | 2. Long Term memory management. 59 | 3. GPT-3.5 powered Agents for delegation of simple tasks. 60 | 4. File output. 61 | 5. Commands 62 | 63 | Performance Evaluation: 64 | 1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities. 65 | 2. Constructively self-criticize your big-picture behavior constantly. 66 | 3. Reflect on past decisions and strategies to refine your approach. 67 | 4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps. 68 | 5. Write all code to a file. 69 | 70 | You should only respond in JSON format as described below 71 | Response Format: 72 | { 73 | "thoughts": { 74 | "text": "thought", 75 | "reasoning": "reasoning", 76 | "plan": "- short bulleted\n- list that conveys\n- long-term plan", 77 | "criticism": "constructive self-criticism", 78 | "speak": "thoughts summary to say to user" 79 | }, 80 | "command": { 81 | "name": "command name", 82 | "args": { 83 | "arg name": "value" 84 | } 85 | } 86 | } 87 | Ensure the response can be parsed by Python json.loads 88 | """ 89 | 90 | # OTHER 91 | TOKEN_LIMIT=2500 92 | encoding = tiktoken.encoding_for_model("gpt-3.5-turbo") 93 | now = datetime.now() 94 | -------------------------------------------------------------------------------- /app/funcs.py: -------------------------------------------------------------------------------- 1 | """ 2 | This module includes all functions used by the program. 3 | """ 4 | import time 5 | import json 6 | import wikipedia 7 | from openai import OpenAI 8 | 9 | client = OpenAI() 10 | import pytesseract 11 | import cv2 12 | import imutils 13 | from PIL import Image 14 | from PyPDF2 import PdfReader 15 | from docx import Document 16 | import streamlit as st 17 | import numpy as np 18 | import pandas as pd 19 | from pandasai import SmartDataframe 20 | from pandasai.llm.openai import OpenAI 21 | from serpapi import GoogleSearch 22 | from youtube_transcript_api import YouTubeTranscriptApi 23 | from trafilatura import fetch_url, extract 24 | from consts import ( 25 | SERP_API_KEY, 26 | OPENAI_API_KEY, 27 | TOKEN_LIMIT, 28 | encoding, 29 | ) 30 | 31 | 32 | def search_wiki(command) -> str: 33 | """Searches wikipedia 34 | Args: 35 | command: a dictionary containing the query 36 | Returns: 37 | str: results returned by wikipedia 38 | """ 39 | print("Search wiki called") 40 | try: 41 | return "Command wikipedia returned: " + wikipedia.summary(command["query"]) 42 | except Exception as error: 43 | return f"Command wikipedia returned: {error}" 44 | 45 | 46 | 47 | def write_to_file(command) -> str: 48 | """Writes text to a file 49 | Args: 50 | command: a dictionary containing the "filename" and "text" 51 | Returns: 52 | str: success message 53 | """ 54 | print("Write to file called") 55 | with open(command["filename"], "w", encoding="utf-8") as file: 56 | file.write(command["text"]) 57 | return "Command write_to_file returned: File was written successfully" 58 | 59 | 60 | def append_to_file(command) -> str: 61 | """Appends text to a file 62 | Args: 63 | command: a dictionary containing the "filename" and "text" 64 | Returns: 65 | str: success message 66 | """ 67 | print("Append to file called") 68 | with open(command["filename"], "a", encoding="utf-8") as file: 69 | file.write(command["text"]) 70 | return "Command append_to_file returned: File was appended successfully" 71 | 72 | 73 | def read_file(command) -> str: 74 | """Returns text from a file 75 | Args: 76 | command: a dictionary containing the "filename" 77 | Returns: 78 | str: text stored in the file 79 | """ 80 | print("Read file called") 81 | try: 82 | with open(command["filename"], "r", encoding="utf-8") as file: 83 | data = file.read() 84 | return f"Command read_file returned: {data}" 85 | except Exception as error: 86 | return f"Command read_file returned: {error}. First create this file." 87 | 88 | 89 | def open_file(command) -> str: 90 | """Shows a download button on the Streamlit interface to download the file generated by GPT. 91 | Args: 92 | command: a dictionary containing the "path" to the file 93 | Returns: 94 | str: a success message 95 | """ 96 | print("Open file called") 97 | try: 98 | with open(command["path"], "r", encoding="utf-8") as file: 99 | st.download_button("Open File", file, file_name=command["path"]) 100 | return "Command open_file returned: File was opened successfully" 101 | except Exception as error: 102 | return f"Command open_file returned: {error}" 103 | 104 | 105 | def browse_website(command) -> str: 106 | """Browse website and extract main content upto TOKEN_LIMIT tokens 107 | Args: 108 | command: a dictionary containing "url" to the website 109 | Returns 110 | str: the content of that website in json format 111 | """ 112 | print("Browse website called") 113 | # grab a HTML file to extract data from 114 | downloaded = fetch_url(command["url"]) 115 | 116 | # output main content and comments as plain text 117 | result = extract(downloaded, output_format="json") 118 | 119 | try: 120 | if len(encoding.encode(str(result))) < TOKEN_LIMIT: 121 | return "Command browse_website returned: " + str(result) 122 | return "Command browse_website returned: " + str(result)[:TOKEN_LIMIT] 123 | except Exception as error: 124 | return f"Command browse_website returned: {error}" 125 | 126 | 127 | 128 | def google_tool(command) -> str: 129 | """Searches google for query and returns upto TOKEN_LIMIT tokens of results 130 | Args: 131 | command: a dictionary containing "query" 132 | Returns: 133 | str: response in json format 134 | """ 135 | print("Google tool called") 136 | params = { 137 | "q": str(command["query"]), 138 | "location": "Delhi,India", 139 | "first": 1, 140 | "count": 10, 141 | "num": 4, 142 | "api_key": SERP_API_KEY, 143 | } 144 | 145 | search = GoogleSearch(params) 146 | results = search.get_dict() 147 | 148 | organic_results = [] 149 | page_count = 0 150 | page_limit = 1 151 | 152 | while "error" not in results and page_count < page_limit: 153 | organic_results.extend(results.get("organic_results", [])) 154 | 155 | params["first"] += params["count"] 156 | page_count += 1 157 | results = search.get_dict() 158 | 159 | response = json.dumps(organic_results, indent=2, ensure_ascii=False) 160 | try: 161 | if len(encoding.encode(response)) < TOKEN_LIMIT: 162 | return "Command google returned: " + response 163 | return "Command google returned: " + response[:TOKEN_LIMIT] 164 | except Exception as error: 165 | return f"Command google returned: {error}" 166 | 167 | def type_message(command) -> None: 168 | """Displays text on the screen with a typewriter effect 169 | Args: 170 | text: any string 171 | Returns: 172 | None 173 | """ 174 | print("Type message called") 175 | with st.chat_message("assistant"): 176 | message_placeholder = st.empty() 177 | full_response = "" 178 | for response in command["text"]: 179 | full_response += response 180 | time.sleep(0.02) 181 | message_placeholder.markdown(full_response + "▌") 182 | message_placeholder.markdown(full_response) 183 | 184 | 185 | def ask_gpt(messages) -> str: 186 | """Generates text using the "gpt-3.5-turbo" model 187 | Args: 188 | message: a list of dictionaries in the format {"role": , "content": } 189 | Returns: 190 | str: text generated by gpt 191 | """ 192 | reply = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages, temperature=0) 193 | return reply.choices[0].message.content 194 | 195 | 196 | def youtube_transcript(command) -> str: 197 | """Fetches transcripts from YouTube videos 198 | Args: 199 | url: the url of the YouTube video 200 | Returns: 201 | str: transcript of the video 202 | """ 203 | print("Get youtube transcript called") 204 | try: 205 | srt_dictionary = YouTubeTranscriptApi.get_transcript(command["video_id"]) 206 | srt_text = " ".join(x["text"] for x in srt_dictionary) 207 | if len(encoding.encode(srt_text)) < TOKEN_LIMIT: 208 | return f"Command youtube_transcripts returned: \"{srt_text}\"" 209 | return f"Command youtube_transcripts returned: \"{srt_text}\""[:TOKEN_LIMIT] 210 | except Exception as error: 211 | return f"Command read_file returned: {error}" 212 | 213 | 214 | def analyse_uploaded_file(uploaded_file,command)->str: 215 | """The function extracts the data from docx , pdf and excel files 216 | Args: 217 | uploaded_file: File uploaded via streamlit file_uploader 218 | command: Contains the query to perform on the File 219 | Returns: 220 | str: Data analysed from the file. 221 | """ 222 | extension = uploaded_file.type 223 | text = "" 224 | if extension=="application/pdf": 225 | reader = PdfReader(uploaded_file) 226 | pages = reader.pages 227 | for i in range(len(pages)): 228 | text+=pages[i].extract_text() 229 | if extension=="application/vnd.openxmlformats-officedocument.wordprocessingml.document": 230 | doc = Document(uploaded_file) 231 | for para in doc.paragraphs: 232 | text+=para.text 233 | if extension=="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" or extension=="text/csv": 234 | llm = OpenAI(api_token=OPENAI_API_KEY) 235 | df = pd.read_excel(uploaded_file) 236 | df = SmartDataframe(df,config={"llm":llm}) 237 | print(command["query"]) 238 | text = df.chat(command["query"]) 239 | print(text) 240 | if extension in ["image/png", "image/jpg", "image/jpeg"]: 241 | img = Image.open(uploaded_file).convert("RGB") 242 | nimg = np.array(img) 243 | image = cv2.cvtColor(nimg, cv2.COLOR_BGR2RGB) 244 | gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 245 | kernel=np.ones((2,2),np.uint8) 246 | im=cv2.dilate(gray,kernel,iterations=1) 247 | im=cv2.bitwise_not(im) 248 | coordinates=np.column_stack(np.where(im<255)) 249 | ang=cv2.minAreaRect(coordinates)[-1] 250 | print(ang) 251 | if ang<=90 and ang>0: 252 | ang=90-ang 253 | height,width=im.shape[:2] 254 | centre=(width/2,height/2) 255 | rot_mat=cv2.getRotationMatrix2D(centre,ang,1.0) 256 | im=cv2.warpAffine(im,rot_mat,(width,height),borderMode=cv2.BORDER_REFLECT) 257 | for i in range(im.shape[0]): 258 | for j in range(im.shape[1]): 259 | if im[i, j] >45: 260 | im[i, j] = 255 261 | text += pytesseract.image_to_string(im) 262 | try: 263 | if len(encoding.encode(str(text))) < TOKEN_LIMIT: 264 | return "Command analyse_uploaded_file returned: " + str(text) 265 | return "Command analyse_uploaded_file returned: " + str(text)[:TOKEN_LIMIT] 266 | except Exception as error: 267 | return f"Command analyse_uploaded_file returned: {error}" 268 | -------------------------------------------------------------------------------- /app/pages/Upload File.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | 3 | st.title("File Uploader for Analysis") 4 | 5 | uploaded_file = st.file_uploader("If you want to analyse a file upload it before entering the task, Else ignore",type=["pdf","docx","xlsx","png","jpg","jpeg","csv"]) 6 | 7 | if "uploaded_file" not in st.session_state: 8 | st.session_state["uploaded_file"] = None 9 | 10 | if uploaded_file is not None: 11 | st.session_state.uploaded_file = uploaded_file -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | openai 2 | google-serp-api 3 | tiktoken 4 | wikipedia 5 | trafilatura 6 | streamlit 7 | google-search-results 8 | python-dotenv 9 | youtube-transcript-api 10 | openpyxl 11 | PyPDF2 12 | python-docx 13 | pandasai 14 | pillow 15 | pytesseract 16 | opencv-python 17 | numpy 18 | imutils --------------------------------------------------------------------------------