├── icons
│   ├── key.png
│   ├── url.png
│   ├── icon.png
│   ├── select.png
│   ├── desktop.png
│   ├── shortcut.png
│   └── terminal.png
├── Ask AI.shortcut
├── .env.example
├── requirements.txt
├── Dockerfile
├── LICENSE
├── app.py
└── README.md

/icons/key.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/key.png
--------------------------------------------------------------------------------

/icons/url.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/url.png
--------------------------------------------------------------------------------

/Ask AI.shortcut:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/Ask AI.shortcut
--------------------------------------------------------------------------------

/icons/icon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/icon.png
--------------------------------------------------------------------------------

/icons/select.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/select.png
--------------------------------------------------------------------------------

/icons/desktop.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/desktop.png
--------------------------------------------------------------------------------

/icons/shortcut.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/shortcut.png
--------------------------------------------------------------------------------

/icons/terminal.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bradAGI/LLM-Shortcut/HEAD/icons/terminal.png
--------------------------------------------------------------------------------

/.env.example:
--------------------------------------------------------------------------------
NGROK_AUTHTOKEN=
LLM_ANYSCALE_ENDPOINTS_KEY=
MODEL=mistralai/Mixtral-8x7B-Instruct-v0.1
MAX_TOKENS=150
--------------------------------------------------------------------------------

/requirements.txt:
--------------------------------------------------------------------------------
llm
fastapi
pyngrok
uvicorn
pydantic_settings
llm-anyscale-endpoints
llm-llama-cpp
llama-cpp-python
--------------------------------------------------------------------------------

/Dockerfile:
--------------------------------------------------------------------------------
# Use an official Python runtime as a parent image
FROM python:3.11-slim

# Set the working directory in the container
WORKDIR /usr/src/app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy the dependencies file to the working directory
COPY requirements.txt .

# Install the Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the current directory contents into the container at /usr/src/app
COPY . .

# Point the application at the .env file for runtime use
ENV ENV_FILE=.env

# Register the model with llm if a .gguf file is present
RUN file=$(find . -name "*.gguf" -print -quit); \
    if [ -n "$file" ]; then llm llama-cpp add-model "$file"; else echo "No .gguf file found, skipping"; fi

# Command to run the application with the .env file
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000", "--reload", "--env-file", ".env"]
--------------------------------------------------------------------------------

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2024 00Brad

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------

/app.py:
--------------------------------------------------------------------------------
import os
import sys
import uuid

import llm
from fastapi import FastAPI
from pydantic_settings import BaseSettings
from pyngrok import ngrok


class Settings(BaseSettings):
    USE_NGROK: bool = True
    BASE_URL: str = "http://localhost:8000"
    NGROK_AUTHTOKEN: str
    LLM_ANYSCALE_ENDPOINTS_KEY: str
    MODEL: str
    MAX_TOKENS: int = 100


settings = Settings(_env_file='.env')
app = FastAPI()
# Random per-run API key, printed at startup and required in the request path.
API_KEY = uuid.uuid4().hex

print("\nThe following models are available:")
print('\n'.join(l.split(':', 1)[1].strip() for l in os.popen('llm models').read().splitlines() if ':' in l))


@app.on_event("startup")
async def startup_event():
    if settings.USE_NGROK:
        # Respect a --port argument if uvicorn was started with one; default to 8000.
        port = sys.argv[sys.argv.index("--port") + 1] if "--port" in sys.argv else "8000"
        ngrok.set_auth_token(settings.NGROK_AUTHTOKEN)
        public_url = ngrok.connect(port).public_url
        print("NGROK URL: ", public_url, flush=True)
        print("API KEY: ", API_KEY, flush=True)
        settings.BASE_URL = public_url


@app.get("/{key}/{prompt}")
async def get_answer(key: str, prompt: str):
    # Reject requests that don't carry the key generated at startup.
    if key != API_KEY:
        return "Invalid key"
    # The Shortcut encodes spaces as underscores; convert them back.
    prompt = prompt.replace("_", " ")
    system = "You are a helpful chatbot."
    return await get_response(prompt, system)


async def get_response(prompt, system):
    model = llm.get_model(settings.MODEL)
    model.key = settings.LLM_ANYSCALE_ENDPOINTS_KEY
    # llm expects the prompt text first and the system prompt as a keyword argument.
    return model.prompt(prompt, system=system, max_tokens=settings.MAX_TOKENS).text().strip()
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# LLM Shortcut - iOS Shortcut (Ask AI)

### Run LLMs on a desktop computer or remotely and access them from your iOS device

![LLM Shortcut Logo](icons/icon.png "Ask AI")

## Overview

This guide details how to use an iOS Shortcut to interact with large language models (LLMs) hosted on a local desktop
computer through a FastAPI endpoint. The setup allows you to send prompts from your iOS device and receive responses
from the LLM, enabling on-the-go access to advanced language models. There are two main components to this setup:

- **FastAPI Application**: A FastAPI application that serves as the endpoint the iOS Shortcut sends requests to.
- **iOS Shortcut**: A shortcut that sends prompts to the FastAPI application and displays the response.

## Why use LLM Shortcut?

- Less battery drain - sending web requests is far less battery-intensive than running inference on-device
- Faster inference with remote models
- More flexibility - run any model accessible via the [LLM](https://llm.datasette.io/en/stable/index.html) library
- AnyScale compatible

## Which models are available?

- Current AnyScale model names include (as of 1/3/24):
  - `meta-llama/Llama-2-7b-chat-hf`
  - `meta-llama/Llama-2-13b-chat-hf`
  - `meta-llama/Llama-2-70b-chat-hf`
  - `codellama/CodeLlama-34b-Instruct-hf`
  - `mistralai/Mistral-7B-Instruct-v0.1`
  - `mistralai/Mixtral-8x7B-Instruct-v0.1`
  - `Open-Orca/Mistral-7B-OpenOrca`
  - `HuggingFaceH4/zephyr-7b-beta`
- All GGUF models are available through local inference via llama.cpp
  - [The Bloke](https://huggingface.co/TheBloke) regularly publishes many GGUF models

## Prerequisites

- [ngrok](https://dashboard.ngrok.com/get-started/your-authtoken) AuthToken
- [AnyScale](https://app.endpoints.anyscale.com/console/credentials) API key
- [Docker](https://www.docker.com/products/docker-desktop/)
- iOS / [Shortcuts app](https://apps.apple.com/us/app/shortcuts/id915249334)

## Setup Instructions

### Server (Desktop) Setup

1. **Git Clone**:
    - Clone this repository to your local machine:
      ```bash
      git clone https://github.com/00brad/LLM-Shortcut.git
      ```

2. **Build Docker Image**:
    - Create a `.env` file in the project root with your [ngrok](https://dashboard.ngrok.com/get-started/your-authtoken)
      AuthToken and [AnyScale](https://app.endpoints.anyscale.com/console/credentials) API key (`MODEL` can be set to an AnyScale model name or a local GGUF model name):

      ```
      NGROK_AUTHTOKEN=your_ngrok_authtoken
      LLM_ANYSCALE_ENDPOINTS_KEY=your_anyscale_key
      MODEL=your_model_name
      MAX_TOKENS=your_max_tokens
      ```

    - Build the Docker image:
      ```bash
      docker build -t llm_shortcut .
      ```

3. **Download a Local GGUF Model (optional)**:
    - If using a local GGUF model, download it into the project root. Example:
      ```bash
      wget https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q6_K.gguf
      ```
    - Set `MODEL` in the `.env` file to the name of the downloaded model.

4. **Run the FastAPI Application**:
    - Start the container, which runs the FastAPI server via Uvicorn:
      ```bash
      docker run llm_shortcut
      ```
    - Upon startup, `ngrok` generates a public URL that tunnels to your local desktop computer.
    - The generated ngrok URL, which is your endpoint, will be displayed in the console after running:

    *(screenshot: Edit URL)*
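
Before setting up the Shortcut, you can sanity-check the endpoint from any terminal. The values below are hypothetical - substitute the ngrok URL and API key printed in your console. Spaces in the prompt are encoded as underscores, which the server converts back:

```bash
# Hypothetical ngrok URL and API key; use the ones printed at startup.
curl "https://abc123.ngrok-free.app/0123456789abcdef0123456789abcdef/What_is_the_capital_of_France"
```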

### iOS Shortcut Setup

1. **Shortcut Setup**:
    - Click the following link to open the [LLM Shortcut](https://github.com/00brad/LLM-Shortcut/raw/main/Ask%20AI.shortcut) in the Shortcuts app.
    - Click the '+ Add to Shortcut' button to add the shortcut to your library.

    *(screenshot: Add to Shortcut)*

    - Click the '...' button to edit the shortcut.

    *(screenshot: Edit Shortcut)*

    - Change the URL in the first line to the ngrok URL generated in the previous section.

    *(screenshot: Edit URL)*

    - Add the API key printed in the console.

    *(screenshot: Edit URL)*

    - Click the 'Done' button to save the changes.
    - Click the 'i' button to view the shortcut details, then click 'Add to Home Screen' to add the shortcut to your home screen.

2. **Using the Shortcut**:
    - Click the shortcut and speak your prompt, or say ***"Hey Siri, Ask AI"*** to activate the shortcut.

    *(screenshot: Edit URL)*

    - The prompt is sent to the FastAPI server, processed by the LLM, and a response is returned to your device.
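
Under the hood, the shortcut simply issues a GET request with your prompt appended to the endpoint URL. A rough shell equivalent - again with a hypothetical ngrok URL and API key - looks like this:

```bash
PROMPT="Write a haiku about the ocean"
# Spaces become underscores in the URL; app.py converts them back before prompting the model.
curl "https://abc123.ngrok-free.app/0123456789abcdef0123456789abcdef/${PROMPT// /_}"
```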

## Additional Notes

- The ngrok URL will change each time the FastAPI server restarts.
- You can run in detached mode by adding the `-d` flag to the `docker run` command (use `docker logs` to view the ngrok URL).
- Ensure the `.env` file is correctly set up with your AnyScale API key, ngrok AuthToken, and model name.
- Currently, the server is configured to use AnyScale models.

This setup allows you to leverage the power of language models directly from your iOS device, making it
fast and convenient to use advanced AI capabilities wherever you go.
--------------------------------------------------------------------------------