├── .gitignore
├── requirements.txt
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
1 | models/**
2 | env


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | llama-cpp-python[server]
2 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Step-by-step: run local models with GGML (~5min + download time for model weights)
 2 | 
 3 | ### Setup Python environment
 4 | 
 5 | 1. Clone this repository: `git clone https://github.com/continuedev/ggml-server-example`
 6 | 2. Move into the folder: `cd ggml-server-example`
 7 | 3. Create a virtual environment: `python3 -m venv env`
 8 | 4. Activate the virtual environment: `source env/bin/activate` on macOS or Linux, `env\Scripts\activate.bat` on Windows, or `source env/bin/activate.fish` if you use the fish shell
 9 | 5. Install required packages: `pip install -r requirements.txt`
10 | 
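To confirm that steps 3 to 5 took effect, a small check like the one below can be run inside the activated environment. It is a sketch and not part of this repository; the helper names `in_virtualenv` and `has_package` are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def has_package(module_name: str) -> bool:
    """Return True when a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("llama_cpp installed:", has_package("llama_cpp"))
```

If either line prints `False`, re-run the activation command from step 4 and then `pip install -r requirements.txt`.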
11 | ### Download a model
12 | 
13 | 6. Download a model to the `models/` folder
14 |    - Here is a convenient source of models that can be downloaded: https://huggingface.co/TheBloke
15 |    - For example, download 4-bit quantized WizardLM-7B from here (we recommend this model): https://huggingface.co/TheBloke/wizardLM-7B-GGML/blob/main/wizardLM-7B.ggmlv3.q4_0.bin
16 | 
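The link in step 6 points at the Hugging Face file page rather than the file itself. One way to script the download, assuming the usual Hugging Face convention that replacing `/blob/` with `/resolve/` yields the direct file URL, is sketched below. The helpers `to_download_url` and `download_model` are hypothetical names, not part of this repository.

```python
import urllib.request
from pathlib import Path


def to_download_url(page_url: str) -> str:
    """Turn a Hugging Face file-page URL into a direct download URL.

    Assumption: the direct URL uses /resolve/ where the page URL uses /blob/.
    """
    return page_url.replace("/blob/", "/resolve/", 1)


def download_model(page_url: str, models_dir: str = "models") -> Path:
    """Download the file behind page_url into models_dir and return its path."""
    url = to_download_url(page_url)
    target = Path(models_dir) / url.rsplit("/", 1)[-1]
    target.parent.mkdir(parents=True, exist_ok=True)
    # The weights are several gigabytes, so this call can take a while.
    urllib.request.urlretrieve(url, str(target))
    return target


if __name__ == "__main__":
    path = download_model(
        "https://huggingface.co/TheBloke/wizardLM-7B-GGML/blob/main/wizardLM-7B.ggmlv3.q4_0.bin"
    )
    print("saved to", path)
```

Downloading through the browser works just as well; the script only saves a manual step.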
17 | ### Serve the model
18 | 
19 | 7. Run the server with `python3 -m llama_cpp.server --model models/wizardLM-7B.ggmlv3.q4_0.bin`
20 | 
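Once the server from step 7 is running, a quick request can confirm that it responds before wiring it into Continue. The sketch below assumes the server listens on `http://localhost:8000` and exposes the OpenAI-style `/v1/completions` endpoint; adjust `base_url` if you started it with a different host or port. The function names are illustrative only.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",  # assumed default server address
    max_tokens: int = 64,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style completions endpoint."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        payload = json.load(resp)
    # Assumption: the response follows the OpenAI completions shape.
    return payload["choices"][0]["text"]


if __name__ == "__main__":
    print(complete("Write a haiku about local language models."))
```

A connection error here usually means the server is not running or is bound to a different port.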
21 | ### Use with Continue
22 | 
23 | 8. To set this as your default model in Continue, open `~/.continue/config.json`, either manually or with the `/config` slash command in Continue. Then import the `GGML` class (`from continuedev.src.continuedev.libs.llm.ggml import GGML`), set `"default_model": "default=GGML(max_context_length=2048)"`, and reload your VS Code window. After that, you're good to go!
24 | 
25 | ---
26 | 
27 | ## Any questions?
28 | 
29 | Happy to help. Email us at hi@continue.dev.
30 | 


--------------------------------------------------------------------------------