├── .gitignore
├── requirements.txt
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
models/**
env
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
llama-cpp-python[server]
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Step-by-step: run local models with GGML (~5min + download time for model weights)

### Setup Python environment

1. Clone this repository: `git clone https://github.com/continuedev/ggml-server-example`
2. Move into the folder: `cd ggml-server-example`
3. Create a virtual environment: `python3 -m venv env`
4. Activate the virtual environment: `source env/bin/activate` on macOS or Linux, `env\Scripts\activate.bat` on Windows, or `source env/bin/activate.fish` if you use the fish shell
5. Install the required packages: `pip install -r requirements.txt`

### Download a model

6. Download a model to the `models/` folder
   - Here is a convenient source of models that can be downloaded: https://huggingface.co/TheBloke
   - For example, download 4-bit quantized WizardLM-7B from here (we recommend this model): https://huggingface.co/TheBloke/wizardLM-7B-GGML/blob/main/wizardLM-7B.ggmlv3.q4_0.bin

### Serve the model

7. Run the server with `python3 -m llama_cpp.server --model models/wizardLM-7B.ggmlv3.q4_0.bin`

### Use with Continue

8. To set this as your default model in Continue, open `~/.continue/config.json`, either manually or with the `/config` slash command in Continue. Then import the `GGML` class (`from continuedev.src.continuedev.libs.llm.ggml import GGML`), set `"default_model": "default=GGML(max_context_length=2048)"`, reload your VS Code window, and you're good to go!

---

## Any questions?

Happy to help. Email us at hi@continue.dev.
--------------------------------------------------------------------------------
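
Once the server from step 7 is running, one way to confirm it responds before pointing Continue at it is to send it a request directly. The sketch below assumes the server is listening on `http://localhost:8000` (the `llama_cpp.server` default unless you pass other host or port options) and uses its OpenAI-style `/v1/completions` endpoint. The helper names `build_completion_request` and `complete` are illustrative and not part of this repository.

```python
import json
import urllib.request

# Assumption: the server from step 7 is running on its default host and port.
BASE_URL = "http://localhost:8000"


def build_completion_request(prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /v1/completions endpoint."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 64) -> str:
    """Send the prompt to the running server and return the generated text."""
    request = build_completion_request(prompt, max_tokens)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The response follows the OpenAI completions shape: choices[0].text
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Requires the server to be running; otherwise urlopen raises URLError.
    print(complete("Q: What is GGML? A:"))
```

Run it from the activated virtual environment in a second terminal while the server is up. If it prints generated text, the model is being served correctly and any remaining problem is on the Continue configuration side.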