├── LICENSE
├── README.md
├── demo.png
└── index.html

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
It's public domain, whatever, use it as you want :)

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# SmolVLM real-time camera demo

![Demo](demo.png)

This repository is a simple demo of how to use the llama.cpp server with SmolVLM 500M to get real-time object detection from the camera feed.

## How to set up

1. Install [llama.cpp](https://github.com/ggml-org/llama.cpp)
2. Run `llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF`
   Note: you may need to add `-ngl 99` to enable GPU offload (if you are using an NVIDIA/AMD/Intel GPU)
   Note (2): you can also try other models listed [here](https://github.com/ggml-org/llama.cpp/blob/master/docs/multimodal.md)
3. Open `index.html`
4. Optionally change the instruction (for example, make it return JSON)
5. Click "Start" and enjoy

--------------------------------------------------------------------------------
/demo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ngxson/smolvlm-realtime-webcam/37b62fc3c9fee5b90040a4496db8ad9e4f66d959/demo.png

--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
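The body of `index.html` did not survive this export. Below is a minimal sketch of what a page driving this demo can look like, assuming llama-server's OpenAI-compatible `/v1/chat/completions` endpoint on its default port 8080; the element names, instruction text, one-second polling interval, and JPEG quality are illustrative assumptions, not the original file's contents.

```html
<!-- Sketch only, not the original index.html: webcam frame -> base64 -> llama-server -->
<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>SmolVLM real-time camera demo (sketch)</title></head>
<body>
  <video id="video" autoplay playsinline width="480"></video>
  <canvas id="canvas" hidden></canvas>
  <p>Instruction: <input id="instruction" value="What do you see?" size="40"></p>
  <button id="start">Start</button>
  <pre id="response"></pre>

  <script>
    // Assumed endpoint: llama-server's OpenAI-compatible chat completions API
    // on the default port; adjust if your server runs elsewhere.
    const API_URL = "http://localhost:8080/v1/chat/completions";
    const video = document.getElementById("video");
    const canvas = document.getElementById("canvas");
    let timer = null;

    // Grab one JPEG frame from the webcam as a base64 data URI.
    function captureFrame() {
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext("2d").drawImage(video, 0, 0);
      return canvas.toDataURL("image/jpeg", 0.7);
    }

    // Send the instruction plus the current frame, then show the model's reply.
    async function describeFrame() {
      if (!video.videoWidth) return; // camera not ready yet
      const body = {
        max_tokens: 100,
        messages: [{
          role: "user",
          content: [
            { type: "text", text: document.getElementById("instruction").value },
            { type: "image_url", image_url: { url: captureFrame() } }
          ]
        }]
      };
      const res = await fetch(API_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body)
      });
      const data = await res.json();
      // OpenAI-style response shape assumed here.
      document.getElementById("response").textContent =
        data.choices[0].message.content;
    }

    // "Start" toggles a simple polling loop; the camera stream itself is left running.
    document.getElementById("start").addEventListener("click", async () => {
      if (timer) { clearInterval(timer); timer = null; return; }
      const stream = await navigator.mediaDevices.getUserMedia({ video: true });
      video.srcObject = stream;
      timer = setInterval(describeFrame, 1000); // roughly one request per second
    });
  </script>
</body>
</html>
```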
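In a loop like this, the two things worth tuning are the instruction text sent with every frame (step 4 above, e.g. asking for JSON if you want structured detections) and the polling interval, which trades responsiveness against load on the llama-server instance.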