├── .github ├── dependabot.yml └── workflows │ ├── build-wheel-rocm.yml │ ├── build-wheels-basic.yml │ ├── build-wheels-batch-basic.yml │ ├── build-wheels-batch-cpu.yml │ ├── build-wheels-batch-ggml.yml │ ├── build-wheels-batch-macos.yml │ ├── build-wheels-batch-oobabooga-basic.yml │ ├── build-wheels-batch-oobabooga.yml │ ├── build-wheels-batch-rocm.yml │ ├── build-wheels-batch.yml │ ├── build-wheels-cpu.yml │ ├── build-wheels-full-release.yml │ ├── build-wheels-ggml-cpu.yml │ ├── build-wheels-ggml-oobabooga.yml │ ├── build-wheels-ggml.yml │ ├── build-wheels-macos.yml │ ├── build-wheels-oobabooga-basic.yml │ ├── build-wheels-oobabooga-rocm.yml │ ├── build-wheels-oobabooga.yml │ ├── build-wheels-prioritized-release.yml │ ├── build-wheels-rocm-full.yml │ ├── build-wheels-test.yml │ ├── build-wheels.yml │ └── deploy-index.yml ├── LICENSE ├── README.md ├── generate-html.ps1 ├── generate-textgen-html.ps1 ├── host-index.bat ├── host-index.sh ├── index ├── AVX │ ├── cpu │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu116 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ └── index.html ├── AVX2 │ ├── cpu │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu116 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ └── index.html ├── AVX512 │ ├── cpu │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python 
│ │ │ └── index.html │ ├── cu116 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ └── index.html ├── basic │ ├── cpu │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu116 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-ggml │ │ │ └── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python │ │ └── index.html ├── cpu.html ├── cu116.html ├── cu117.html ├── cu118.html ├── cu120.html ├── cu121.html ├── cu122.html ├── index.html ├── rocm5.4.2.html ├── rocm5.5.1.html ├── rocm5.5.html ├── rocm5.6.1.html └── textgen │ ├── AVX │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ └── index.html │ ├── AVX2 │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── 
llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ └── index.html │ ├── basic │ ├── cu117 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu118 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu120 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu121 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── cu122 │ │ ├── index.html │ │ ├── llama-cpp-python-cuda │ │ │ └── index.html │ │ └── llama-cpp-python-ggml-cuda │ │ │ └── index.html │ ├── index.html │ ├── rocm5.4.2 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ ├── rocm5.5 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ │ └── index.html │ └── rocm5.6.1 │ │ ├── index.html │ │ └── llama-cpp-python-cuda │ │ └── index.html │ └── index.html ├── old_workflows ├── build-all-wheels.yml ├── build-wheel-rocm-windows.yml ├── build-wheel-rocm.yml ├── build-wheels-0.1.62.yml ├── build-wheels-0.1.66.yml ├── build-wheels-0.1.67.yml ├── build-wheels-0.1.68.yml ├── build-wheels-0.1.69.yml ├── build-wheels-0.1.70.yml ├── build-wheels-0.1.71.yml ├── build-wheels-0.1.72.yml ├── build-wheels-0.1.73.yml ├── build-wheels-0.1.74.yml ├── build-wheels-0.1.76.yml ├── build-wheels-0.1.77.yml ├── build-wheels-avx.yml ├── build-wheels-oobabooga.yml ├── build-wheels-prioritized-release.yml ├── build-wheels-rocm-full.yml └── build-wheels.yml └── workflows.md /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | updates: 3 | - package-ecosystem: "github-actions" 4 | directory: "/" 5 | schedule: 6 | interval: "daily" 7 | -------------------------------------------------------------------------------- /.github/workflows/build-wheel-rocm.yml: -------------------------------------------------------------------------------- 1 | name: Build ROCm Wheels 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 8 | default: 'v0.2.9' 9 | required: true 10 | type: string 11 | workflow_call: 12 | inputs: 13 | version: 14 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 15 | default: 'v0.2.9' 16 | required: true 17 | type: string 18 | 19 | permissions: 20 | contents: write 21 | 22 | jobs: 23 | build_wheels_rocm: 24 | name: ROCm Wheels 25 | uses: ./.github/workflows/build-wheels-rocm-full.yml 26 | with: 27 | version: ${{ inputs.version }} 28 | config: 
'rename:0'
29 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-basic.yml:
--------------------------------------------------------------------------------
1 | name: Build Basic Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       version:
7 |         description: 'Version tag of llama-cpp-python to build: v0.2.9'
8 |         default: 'v0.2.9'
9 |         required: true
10 |         type: string
11 |   workflow_call:
12 |     inputs:
13 |       version:
14 |         description: 'Version tag of llama-cpp-python to build: v0.2.9'
15 |         default: 'v0.2.9'
16 |         required: true
17 |         type: string
18 | 
19 | permissions:
20 |   contents: write
21 | 
22 | jobs:
23 |   build_wheels_main:
24 |     name: Main Wheels
25 |     uses: ./.github/workflows/build-wheels.yml
26 |     with:
27 |       version: ${{ inputs.version }}
28 |       cpu: '0'
29 |       config: 'releasetag:basic'
30 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-basic.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build Basic Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.1.77,v0.1.76'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} Basic CUDA Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       max-parallel: 1
41 |       matrix:
42 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
43 |     uses: ./.github/workflows/build-wheels-basic.yml
44 |     with:
45 |       version: ${{ matrix.version }}
46 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-cpu.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build CPU-only Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.1.77,v0.1.76'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} CPU-only Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       max-parallel: 1
41 |       matrix:
42 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
43 |     uses: ./.github/workflows/build-wheels-cpu.yml
44 |     with:
45 |       version: ${{ matrix.version }}
46 | 
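All of the batch workflows above (and below) share the same matrix trick: the comma-separated `versions` input is split, trimmed and re-emitted as a compact JSON array that feeds the job matrix. A standalone sketch of that 'Define Job Output' step, runnable in any PowerShell session; the version list here is only an illustrative value:

```
# Stand-in for the workflow's PCKGVERS environment variable.
$PCKGVERS = 'v0.1.77, v0.1.76'
# Split on commas, trim whitespace from each tag, emit a compact JSON array.
$x = ConvertTo-Json $PCKGVERS.Split(',').Trim() -Compress
Write-Output ('matrix=' + $x)   # -> matrix=["v0.1.77","v0.1.76"]
```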
--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-ggml.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build GGML Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python-ggml to build'
8 |         default: 'v0.1.78,v0.1.77'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       max-parallel: 1
41 |       matrix:
42 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
43 |     uses: ./.github/workflows/build-wheels.yml
44 |     with:
45 |       version: ${{ matrix.version }}
46 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-macos.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build MacOS Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.2.6,v0.2.5,v0.2.4,v0.2.3,v0.2.2,v0.2.1,v0.2.0,v0.1.85'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} MacOS Metal Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       matrix:
41 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
42 |     uses: ./.github/workflows/build-wheels-macos.yml
43 |     with:
44 |       version: ${{ matrix.version }}
45 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-oobabooga-basic.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build Basic Wheels for Text Generation Webui
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.1.77,v0.1.76'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       max-parallel: 1
41 |       matrix:
42 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
43 |     uses: ./.github/workflows/build-wheels-oobabooga-basic.yml
44 |     with:
45 |       version: ${{ matrix.version }}
46 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-oobabooga.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build Wheels for Text Generation Webui
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.2.18,v0.2.17,v0.2.16,v0.2.15,v0.2.14'
9 |         required: true
10 |         type: string
11 | 
12 | permissions:
13 |   contents: write
14 | 
15 | jobs:
16 |   define_matrix:
17 |     name: Define Workflow Matrix
18 |     runs-on: ubuntu-latest
19 |     outputs:
20 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
21 |     defaults:
22 |       run:
23 |         shell: pwsh
24 |     env:
25 |       PCKGVERS: ${{ inputs.versions }}
26 | 
27 |     steps:
28 |       - uses: actions/checkout@v4
29 | 
30 |       - name: Define Job Output
31 |         id: set-matrix
32 |         run: |
33 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
34 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
35 | 
36 |   run_workflows:
37 |     name: Build ${{ matrix.version }} Wheels
38 |     needs: define_matrix
39 |     strategy:
40 |       max-parallel: 1
41 |       matrix:
42 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
43 |     uses: ./.github/workflows/build-wheels-oobabooga.yml
44 |     with:
45 |       version: ${{ matrix.version }}
46 | 
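The `config` and `exclude` inputs used by the ROCm batch workflow below (and by the reusable build workflows later in this repository) follow the `key1:item1-1,item1-2;key2:item2-1,item2-2` format. A standalone sketch of how the reusable workflows apply a `config` override to their default matrix; the override string is the one build-wheels-prioritized-release.yml passes to the CPU workflow, and the default matrix is the one from build-wheels-cpu.yml:

```
# Override string as passed via the 'config' input (illustrative value).
$CONFIGIN = 'pyver:3.10;avx:basic'
# Default build matrix, copied from build-wheels-cpu.yml.
$matrix = @{
    'os'    = 'ubuntu-20.04', 'windows-latest'
    'pyver' = "3.10", "3.8", "3.9", "3.11"
    'avx'   = "AVX", "AVX2", "AVX512", "basic"
}
# Each ';'-separated pair replaces one matrix key with its ','-separated value list.
if ($CONFIGIN -ne 'Default') {
    $CONFIGIN.split(';').foreach({ $matrix[$_.split(':')[0]] = $_.split(':')[1].split(',') })
}
ConvertTo-Json $matrix -Compress
# e.g. {"avx":["basic"],"os":["ubuntu-20.04","windows-latest"],"pyver":["3.10"]} (key order may vary)
```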
--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch-rocm.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build ROCm Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.2.18,v0.2.17,v0.2.16,v0.2.15,v0.2.14'
9 |         required: true
10 |         type: string
11 |       config:
12 |         description: 'Override configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
13 |         default: 'Default'
14 |         required: false
15 |         type: string
16 |       exclude:
17 |         description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
18 |         default: 'Default'
19 |         required: false
20 |         type: string
21 | 
22 | permissions:
23 |   contents: write
24 | 
25 | jobs:
26 |   define_matrix:
27 |     name: Define Workflow Matrix
28 |     runs-on: ubuntu-latest
29 |     outputs:
30 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
31 |     defaults:
32 |       run:
33 |         shell: pwsh
34 |     env:
35 |       PCKGVERS: ${{ inputs.versions }}
36 | 
37 |     steps:
38 |       - uses: actions/checkout@v4
39 | 
40 |       - name: Define Job Output
41 |         id: set-matrix
42 |         run: |
43 |           $versions = $env:PCKGVERS.Split(',').Trim()
44 |           $versions.foreach({if ([version]$_.TrimStart('v') -lt [version]'0.1.80') {Throw "$_ does not support ROCm!"}})
45 |           $x = ConvertTo-Json $versions -Compress
46 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
47 | 
48 |   run_workflows:
49 |     name: Build ${{ matrix.version }} Wheels
50 |     needs: define_matrix
51 |     strategy:
52 |       max-parallel: 1
53 |       matrix:
54 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
55 |     uses: ./.github/workflows/build-wheels-rocm-full.yml
56 |     with:
57 |       version: ${{ matrix.version }}
58 |       config: ${{ inputs.config }}
59 |       exclude: ${{ inputs.exclude }}
60 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-batch.yml:
--------------------------------------------------------------------------------
1 | name: Batch Build Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       versions:
7 |         description: 'Comma-separated version tags of llama-cpp-python to build'
8 |         default: 'v0.1.77,v0.1.76'
9 |         required: true
10 |         type: string
11 |       cpu:
12 |         description: 'Build CPU-only wheels as well? 0/1'
13 |         default: '1'
14 |         required: false
15 |         type: string
16 | 
17 | permissions:
18 |   contents: write
19 | 
20 | jobs:
21 |   define_matrix:
22 |     name: Define Workflow Matrix
23 |     runs-on: ubuntu-latest
24 |     outputs:
25 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
26 |     defaults:
27 |       run:
28 |         shell: pwsh
29 |     env:
30 |       PCKGVERS: ${{ inputs.versions }}
31 | 
32 |     steps:
33 |       - uses: actions/checkout@v4
34 | 
35 |       - name: Define Job Output
36 |         id: set-matrix
37 |         run: |
38 |           $x = ConvertTo-Json $env:PCKGVERS.Split(',').Trim() -Compress
39 |           Write-Output ('matrix=' + $x) >> $env:GITHUB_OUTPUT
40 | 
41 |   run_workflows:
42 |     name: Build ${{ matrix.version }} Wheels
43 |     needs: define_matrix
44 |     strategy:
45 |       max-parallel: 1
46 |       matrix:
47 |         version: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
48 |     uses: ./.github/workflows/build-wheels.yml
49 |     with:
50 |       version: ${{ matrix.version }}
51 |       cpu: ${{ inputs.cpu }}
52 | 


--------------------------------------------------------------------------------
/.github/workflows/build-wheels-cpu.yml:
--------------------------------------------------------------------------------
1 | name: Build CPU-only Wheels
2 | 
3 | on:
4 |   workflow_dispatch:
5 |     inputs:
6 |       version:
7 |         description: 'Version tag of llama-cpp-python to build: v0.2.14'
8 |         default: 'v0.2.14'
9 |         required: true
10 |         type: string
11 |       config:
12 |         description: 'Override configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
13 |         default: 'Default'
14 |         required: false
15 |         type: string
16 |       exclude:
17 |         description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
18 |         default: 'None'
19 |         required: false
20 |         type: string
21 |   workflow_call:
22 |     inputs:
23 |       version:
24 |         description: 'Version tag of llama-cpp-python to build: v0.2.14'
25 |         default: 'v0.2.14'
26 |         required: true
27 |         type: string
28 |       config:
29 |         description: 'Configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
30 |         default: 'Default'
31 |         required: false
32 |         type: string
33 |       exclude:
34 |         description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
35 |         default: 'None'
36 |         required: false
37 |         type: string
38 | 
39 | permissions:
40 |   contents: write
41 | 
42 | jobs:
43 |   define_matrix:
44 |     name: Define Build Matrix
45 |     runs-on: ubuntu-latest
46 |     outputs:
47 |       matrix: ${{ steps.set-matrix.outputs.matrix }}
48 |     defaults:
49 |       run:
50 |         shell: pwsh
51 |     env:
52 |       CONFIGIN: ${{ inputs.config }}
53 |       EXCLUDEIN: ${{ inputs.exclude }}
54 | 
55 |     steps:
56 |       - name: Define Job Output
57 |         id: set-matrix
58 |         run: |
59 |           $matrix = @{
60 |               'os' = 'ubuntu-20.04', 'windows-latest'
61 |               'pyver' = "3.10", "3.8", "3.9", "3.11"
62 |               'avx' = "AVX", "AVX2", "AVX512", "basic"
63 |           }
64 | 
65 |           if ($env:CONFIGIN -ne 'Default')
{$env:CONFIGIN.split(';').foreach({$matrix[$_.split(':')[0]] = $_.split(':')[1].split(',')})} 66 | 67 | if ($env:EXCLUDEIN -ne 'None') { 68 | $exclusions = @() 69 | $exclusions += $env:EXCLUDEIN.split(';').replace(':','=').replace(',',"`n") | ConvertFrom-StringData 70 | $matrix['exclude'] = $exclusions 71 | } 72 | 73 | $matrixOut = ConvertTo-Json $matrix -Compress 74 | Write-Output ('matrix=' + $matrixOut) >> $env:GITHUB_OUTPUT 75 | 76 | build_wheels: 77 | name: ${{ matrix.os }} ${{ matrix.pyver }} CPU ${{ matrix.avx }} 78 | needs: define_matrix 79 | runs-on: ${{ matrix.os }} 80 | strategy: 81 | matrix: ${{ fromJSON(needs.define_matrix.outputs.matrix) }} 82 | defaults: 83 | run: 84 | shell: pwsh 85 | env: 86 | AVXVER: ${{ matrix.avx }} 87 | PCKGVER: ${{ inputs.version }} 88 | 89 | steps: 90 | - uses: actions/checkout@v4 91 | with: 92 | repository: 'abetlen/llama-cpp-python' 93 | ref: ${{ inputs.version }} 94 | submodules: 'recursive' 95 | 96 | - uses: actions/setup-python@v4 97 | with: 98 | python-version: ${{ matrix.pyver }} 99 | 100 | - name: Install Dependencies 101 | run: | 102 | python -m pip install build wheel 103 | 104 | - name: Build Wheel 105 | run: | 106 | $packageVersion = [version]$env:PCKGVER.TrimStart('v') 107 | $env:VERBOSE = '1' 108 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = '-DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off'} 109 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = '-DLLAMA_AVX512=on'} 110 | if ($env:AVXVER -eq 'basic') {$env:CMAKE_ARGS = '-DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off'} 111 | if ($packageVersion -gt [version]'0.2.13') {$env:CMAKE_ARGS = "-DLLAMA_NATIVE=off $env:CMAKE_ARGS"} 112 | $buildtag = "+cpu$env:AVXVER" 113 | if ($packageVersion -lt [version]'0.2.0') { 114 | $env:FORCE_CMAKE = '1' 115 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=$buildtag" 116 | } else { 117 | $initpath = Join-Path '.' 
'llama_cpp' '__init__.py' -resolve 118 | $initcontent = Get-Content $initpath -raw 119 | $regexstr = '(?s)(?<=__version__ \= ")\d+(?:\.\d+)*(?=")' 120 | $regexmatch = [Regex]::Matches($initcontent,$regexstr) 121 | if (!($regexmatch[0].Success)) {throw '__init__.py parsing failed'} 122 | $newinit = $regexmatch[0].Result(('$`' + '$&' + $buildtag + '$''')) 123 | New-Item $initpath -itemType File -value $newinit -force 124 | python -m build --wheel 125 | } 126 | 127 | - name: Upload files to a GitHub release 128 | id: upload-release 129 | uses: svenstaro/upload-release-action@2.7.0 130 | continue-on-error: true 131 | with: 132 | file: ./dist/*.whl 133 | tag: 'cpu' 134 | file_glob: true 135 | make_latest: false 136 | overwrite: true 137 | 138 | - uses: actions/upload-artifact@v3 139 | if: steps.upload-release.outcome == 'failure' 140 | with: 141 | name: cpu 142 | path: ./dist/*.whl 143 | -------------------------------------------------------------------------------- /.github/workflows/build-wheels-full-release.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels for New Release 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.11' 8 | default: 'v0.2.11' 9 | required: true 10 | type: string 11 | 12 | permissions: 13 | contents: write 14 | 15 | jobs: 16 | run_main: 17 | name: Build ${{ inputs.version }} CUDA Wheels 18 | uses: ./.github/workflows/build-wheels.yml 19 | with: 20 | version: ${{ inputs.version }} 21 | cpu: '0' 22 | 23 | run_ooba: 24 | name: Build ${{ inputs.version }} CUDA Wheels for Text Generation Webui 25 | needs: run_main 26 | uses: ./.github/workflows/build-wheels-oobabooga.yml 27 | with: 28 | version: ${{ inputs.version }} 29 | 30 | run_cpu: 31 | name: Build CPU-only Wheels 32 | needs: run_ooba 33 | uses: ./.github/workflows/build-wheels-cpu.yml 34 | with: 35 | version: ${{ inputs.version }} 36 | 37 | run_macos: 38 | name: Build MacOS Metal Wheels 39 | needs: run_cpu 40 | uses: ./.github/workflows/build-wheels-macos.yml 41 | with: 42 | version: ${{ inputs.version }} 43 | 44 | run_rocm: 45 | name: Build ROCm Wheels 46 | needs: run_macos 47 | uses: ./.github/workflows/build-wheels-rocm-full.yml 48 | with: 49 | version: ${{ inputs.version }} 50 | -------------------------------------------------------------------------------- /.github/workflows/build-wheels-ggml-cpu.yml: -------------------------------------------------------------------------------- 1 | name: Build CPU-only GGML Wheels 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.1.78' 8 | default: 'v0.1.78' 9 | required: false 10 | type: string 11 | workflow_call: 12 | inputs: 13 | version: 14 | description: 'Version tag of llama-cpp-python to build: v0.1.78' 15 | default: 'v0.1.78' 16 | required: false 17 | type: string 18 | 19 | permissions: 20 | contents: write 21 | 22 | jobs: 23 | build_wheels: 24 | name: ${{ matrix.os }} ${{ matrix.pyver }} CPU ${{ matrix.avx }} 25 | runs-on: ${{ matrix.os }} 26 | strategy: 27 | matrix: 28 | os: [ubuntu-20.04, windows-latest] 29 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 30 | avx: ["AVX","AVX2","AVX512","basic"] 31 | defaults: 32 | run: 33 | shell: pwsh 34 | env: 35 | AVXVER: ${{ matrix.avx }} 36 | PCKGVER: ${{ inputs.version }} 37 | 38 | steps: 39 | - uses: actions/checkout@v4 40 | with: 41 | repository: 'abetlen/llama-cpp-python' 42 | ref: ${{ inputs.version }} 43 | submodules: 
'recursive' 44 | 45 | - uses: actions/setup-python@v4 46 | with: 47 | python-version: ${{ matrix.pyver }} 48 | 49 | - name: Install Dependencies 50 | run: | 51 | python -m pip install build wheel 52 | 53 | - name: Change Package Name 54 | run: | 55 | $packageVersion = [version]$env:PCKGVER.TrimStart('v') 56 | $setup = Get-Content 'setup.py' -raw 57 | $pyproject = Get-Content 'pyproject.toml' -raw 58 | $cmakelists = Get-Content 'CMakeLists.txt' -raw 59 | $regexstr = '(?s)name="llama_cpp_python",(.+)(package_dir={"llama_cpp": "llama_cpp", "llama_cpp.server": "llama_cpp/server"},.+?packages=\["llama_cpp", "llama_cpp.server"],)' 60 | if ($packageVersion -gt [version]'0.1.77') {$regexstr = '(?s)name="llama_cpp_python",(.+)(package_dir={"llama_cpp": "llama_cpp", "llama_cpp.server": "llama_cpp/server"},.+?package_data={"llama_cpp": \["py.typed"]},.+?packages=\["llama_cpp", "llama_cpp.server"],)'} 61 | $regexmatch = [Regex]::Matches($setup,$regexstr) 62 | if (!($regexmatch[0].Success)) {throw 'setup.py parsing failed'} 63 | $newstr = 'name="llama_cpp_python_ggml",' + $regexmatch[0].Groups[1].Value + $regexmatch[0].Groups[2].Value.Replace('llama_cpp','llama_cpp_ggml') 64 | $newsetup = $regexmatch[0].Result(('$`'+$newstr+'$''')) 65 | $regexstr = '(?s)(?<=name = ")llama_cpp_python(".+?packages = \[{include = ")llama_cpp(".+)' 66 | $regexmatch = [Regex]::Matches($pyproject,$regexstr) 67 | if (!($regexmatch[0].Success)) {throw 'pyproject.toml parsing failed'} 68 | $newpyproject = $regexmatch[0].Result(('$`'+'llama_cpp_python_ggml'+'$1llama_cpp_ggml$2')) 69 | Copy-Item 'llama_cpp' 'llama_cpp_ggml' -recurse 70 | New-Item 'setup.py' -itemType File -value $newsetup -force 71 | New-Item 'pyproject.toml' -itemType File -value $newpyproject -force 72 | New-Item 'CMakeLists.txt' -itemType File -value $cmakelists.Replace('llama_cpp','llama_cpp_ggml') -force 73 | 74 | - name: Build Wheel 75 | run: | 76 | $env:VERBOSE = '1' 77 | $env:FORCE_CMAKE = '1' 78 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = '-DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off'} 79 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = '-DLLAMA_AVX512=on'} 80 | if ($env:AVXVER -eq 'basic') {$env:CMAKE_ARGS = '-DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off'} 81 | if ($packageVersion -gt [version]'0.2.13') {$env:CMAKE_ARGS = "-DLLAMA_NATIVE=off $env:CMAKE_ARGS"} 82 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cpu$env:AVXVER" 83 | 84 | - name: Upload files to a GitHub release 85 | id: upload-release 86 | uses: svenstaro/upload-release-action@2.7.0 87 | continue-on-error: true 88 | with: 89 | file: ./dist/*.whl 90 | tag: 'cpu' 91 | file_glob: true 92 | make_latest: false 93 | overwrite: true 94 | 95 | - uses: actions/upload-artifact@v3 96 | if: steps.upload-release.outcome == 'failure' 97 | with: 98 | name: cpu 99 | path: ./dist/*.whl 100 | -------------------------------------------------------------------------------- /.github/workflows/build-wheels-macos.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels MacOS 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.20' 8 | default: 'v0.2.20' 9 | required: true 10 | type: string 11 | config: 12 | description: 'Override configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2' 13 | default: 'Default' 14 | required: false 15 | type: string 16 | exclude: 17 | description: 'Exclude build configurations: 
key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2' 18 | default: 'None' 19 | required: false 20 | type: string 21 | workflow_call: 22 | inputs: 23 | version: 24 | description: 'Version tag of llama-cpp-python to build: v0.2.20' 25 | default: 'v0.2.20' 26 | required: true 27 | type: string 28 | config: 29 | description: 'Configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2' 30 | default: 'Default' 31 | required: false 32 | type: string 33 | exclude: 34 | description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2' 35 | default: 'None' 36 | required: false 37 | type: string 38 | 39 | permissions: 40 | contents: write 41 | 42 | jobs: 43 | define_matrix: 44 | name: Define Build Matrix 45 | runs-on: ubuntu-latest 46 | outputs: 47 | matrix: ${{ steps.set-matrix.outputs.matrix }} 48 | defaults: 49 | run: 50 | shell: pwsh 51 | env: 52 | CONFIGIN: ${{ inputs.config }} 53 | EXCLUDEIN: ${{ inputs.exclude }} 54 | 55 | steps: 56 | - name: Define Job Output 57 | id: set-matrix 58 | run: | 59 | $matrix = @{ 60 | 'os' = "macos-13", "macos-12" 61 | 'pyver' = "3.10", "3.8", "3.9", "3.11" 62 | } 63 | 64 | if ($env:CONFIGIN -ne 'Default') {$env:CONFIGIN.split(';').foreach({$matrix[$_.split(':')[0]] = $_.split(':')[1].split(',')})} 65 | 66 | if ($env:EXCLUDEIN -ne 'None') { 67 | $exclusions = @() 68 | $exclusions += $env:EXCLUDEIN.split(';').replace(':','=').replace(',',"`n") | ConvertFrom-StringData 69 | $matrix['exclude'] = $exclusions 70 | } 71 | 72 | $matrixOut = ConvertTo-Json $matrix -Compress 73 | Write-Output ('matrix=' + $matrixOut) >> $env:GITHUB_OUTPUT 74 | 75 | build_wheels: 76 | name: ${{ matrix.os }} Python ${{ matrix.pyver }} 77 | needs: define_matrix 78 | runs-on: ${{ matrix.os }} 79 | strategy: 80 | matrix: ${{ fromJSON(needs.define_matrix.outputs.matrix) }} 81 | env: 82 | OSVER: ${{ matrix.os }} 83 | 84 | steps: 85 | - uses: actions/checkout@v4 86 | with: 87 | repository: 'abetlen/llama-cpp-python' 88 | ref: ${{ inputs.version }} 89 | submodules: 'recursive' 90 | 91 | - uses: actions/setup-python@v4 92 | with: 93 | python-version: ${{ matrix.pyver }} 94 | 95 | - name: Install Dependencies 96 | run: | 97 | python -m pip install build wheel cmake 98 | 99 | - name: Build Wheel 100 | run: | 101 | XCODE15PATH="/Applications/Xcode_15.0.app/Contents/Developer" 102 | XCODE15BINPATH="${XCODE15PATH}/Toolchains/XcodeDefault.xctoolchain/usr/bin" 103 | export CMAKE_ARGS="-DLLAMA_NATIVE=off -DLLAMA_METAL=on" 104 | [[ "$OSVER" == "macos-13" ]] && export CC="${XCODE15BINPATH}/cc" && export CXX="${XCODE15BINPATH}/c++" && export MACOSX_DEPLOYMENT_TARGET="13.0" 105 | [[ "$OSVER" == "macos-12" ]] && export MACOSX_DEPLOYMENT_TARGET="12.0" 106 | [[ "$OSVER" == "macos-11" ]] && export MACOSX_DEPLOYMENT_TARGET="11.0" 107 | 108 | export CMAKE_OSX_ARCHITECTURES="arm64" && export ARCHFLAGS="-arch arm64" 109 | VERBOSE=1 python -m build --wheel 110 | 111 | if [[ "$OSVER" == "macos-13" ]]; then 112 | export SDKROOT="${XCODE15PATH}/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk" 113 | export MACOSX_DEPLOYMENT_TARGET="14.0" 114 | VERBOSE=1 python -m build --wheel 115 | fi 116 | 117 | for file in ./dist/*.whl; do cp "$file" "${file/arm64.whl/aarch64.whl}"; done 118 | 119 | export CMAKE_OSX_ARCHITECTURES="x86_64" && export CMAKE_ARGS="-DLLAMA_NATIVE=off -DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off -DLLAMA_METAL=on" && export ARCHFLAGS="-arch x86_64" 120 | VERBOSE=1 python -m build --wheel 121 | 122 | if [[ "$OSVER" == "macos-13" 
]]; then 123 | export SDKROOT="${XCODE15PATH}/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk" 124 | export MACOSX_DEPLOYMENT_TARGET="14.0" 125 | VERBOSE=1 python -m build --wheel 126 | fi 127 | 128 | - name: Upload files to a GitHub release 129 | id: upload-release 130 | uses: svenstaro/upload-release-action@2.7.0 131 | continue-on-error: true 132 | with: 133 | file: ./dist/*.whl 134 | tag: 'metal' 135 | file_glob: true 136 | make_latest: false 137 | overwrite: true 138 | 139 | - uses: actions/upload-artifact@v3 140 | if: steps.upload-release.outcome == 'failure' 141 | with: 142 | name: macos-wheels 143 | path: ./dist/*.whl 144 | -------------------------------------------------------------------------------- /.github/workflows/build-wheels-oobabooga-basic.yml: -------------------------------------------------------------------------------- 1 | name: Build Basic Wheels for Text Generation Webui 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 8 | default: 'v0.2.9' 9 | required: true 10 | type: string 11 | workflow_call: 12 | inputs: 13 | version: 14 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 15 | default: 'v0.2.9' 16 | required: true 17 | type: string 18 | 19 | permissions: 20 | contents: write 21 | 22 | jobs: 23 | build_textgen_wheels: 24 | name: Textgen Wheels 25 | uses: ./.github/workflows/build-wheels-oobabooga.yml 26 | with: 27 | version: ${{ inputs.version }} 28 | config: 'avxver:basic' -------------------------------------------------------------------------------- /.github/workflows/build-wheels-oobabooga-rocm.yml: -------------------------------------------------------------------------------- 1 | name: Build ROCm Wheels for Text Generation Webui 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 8 | default: 'v0.2.9' 9 | required: true 10 | type: string 11 | workflow_call: 12 | inputs: 13 | version: 14 | description: 'Version tag of llama-cpp-python to build: v0.2.9' 15 | default: 'v0.2.9' 16 | required: true 17 | type: string 18 | 19 | permissions: 20 | contents: write 21 | 22 | jobs: 23 | build_wheels_rocm: 24 | name: ROCm Wheels 25 | uses: ./.github/workflows/build-wheels-rocm-full.yml 26 | with: 27 | version: ${{ inputs.version }} 28 | config: 'rename:1' 29 | -------------------------------------------------------------------------------- /.github/workflows/build-wheels-prioritized-release.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels for New Release with Prioritization 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.2.14' 8 | default: 'v0.2.14' 9 | required: true 10 | type: string 11 | 12 | permissions: 13 | contents: write 14 | 15 | jobs: 16 | build_textgen_wheels_prio: 17 | name: Textgen Prioritized 18 | uses: ./.github/workflows/build-wheels-oobabooga.yml 19 | with: 20 | version: ${{ inputs.version }} 21 | config: 'pyver:3.10;cuda:12.1.1' 22 | 23 | build_wheels_main_prio: 24 | name: Main Prioritized 25 | uses: ./.github/workflows/build-wheels.yml 26 | with: 27 | version: ${{ inputs.version }} 28 | cpu: '0' 29 | config: 'pyver:3.10;cuda:12.1.1' 30 | 31 | build_wheels_cpu_prio: 32 | name: CPU-only Prioritized 33 | uses: ./.github/workflows/build-wheels-cpu.yml 34 | with: 35 | version: ${{ inputs.version }} 36 | config: 'pyver:3.10;avx:basic' 37 | 38 | 
build_wheels_rocm_prio: 39 | name: ROCm Prioritized 40 | uses: ./.github/workflows/build-wheels-rocm-full.yml 41 | with: 42 | version: ${{ inputs.version }} 43 | config: 'os:ubuntu-20.04;pyver:3.10;rocm:5.6.1;rename:1' 44 | exclude: 'None' 45 | 46 | build_wheels_macos_prio: 47 | name: MacOS Metal Prioritized 48 | uses: ./.github/workflows/build-wheels-macos.yml 49 | with: 50 | version: ${{ inputs.version }} 51 | config: 'pyver:3.10' 52 | 53 | build_wheels_main: 54 | name: Main Wheels 55 | needs: ['build_wheels_main_prio', 'build_textgen_wheels_prio', 'build_wheels_cpu_prio', 'build_wheels_rocm_prio', 'build_wheels_macos_prio'] 56 | uses: ./.github/workflows/build-wheels.yml 57 | with: 58 | version: ${{ inputs.version }} 59 | cpu: '0' 60 | exclude: 'pyver:3.10,cuda:12.1.1' 61 | 62 | build_textgen_wheels: 63 | name: Textgen Wheels 64 | needs: build_wheels_main 65 | uses: ./.github/workflows/build-wheels-oobabooga.yml 66 | with: 67 | version: ${{ inputs.version }} 68 | exclude: 'pyver:3.10,cuda:12.1.1' 69 | 70 | build_wheels_cpu: 71 | name: CPU-only Wheels 72 | needs: build_textgen_wheels 73 | uses: ./.github/workflows/build-wheels-cpu.yml 74 | with: 75 | version: ${{ inputs.version }} 76 | exclude: 'pyver:3.10,avx:basic' 77 | 78 | build_wheels_macos: 79 | name: MacOS Metal Wheels 80 | needs: build_wheels_cpu 81 | uses: ./.github/workflows/build-wheels-macos.yml 82 | with: 83 | version: ${{ inputs.version }} 84 | exclude: 'pyver:3.10' 85 | 86 | build_wheels_rocm: 87 | name: ROCm Wheels 88 | needs: build_wheels_macos 89 | uses: ./.github/workflows/build-wheels-rocm-full.yml 90 | with: 91 | version: ${{ inputs.version }} 92 | exclude: 'os:ubuntu-20.04,pyver:3.10,rocm:5.6.1,rename:1' 93 | -------------------------------------------------------------------------------- /.github/workflows/deploy-index.yml: -------------------------------------------------------------------------------- 1 | name: Deploy package index to Pages 2 | 3 | on: 4 | # Runs on pushes targeting the default branch when html files are modified under docs directory 5 | push: 6 | branches: ["main"] 7 | paths: 8 | - 'index/**.html' 9 | 10 | # Allows you to run this workflow manually from the Actions tab 11 | workflow_dispatch: 12 | 13 | # Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages 14 | permissions: 15 | contents: read 16 | pages: write 17 | id-token: write 18 | 19 | # Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued. 20 | # However, do NOT cancel in-progress runs as we want to allow these production deployments to complete. 21 | concurrency: 22 | group: "pages" 23 | cancel-in-progress: false 24 | 25 | jobs: 26 | # Single deploy job since we're just deploying 27 | deploy: 28 | environment: 29 | name: github-pages 30 | url: ${{ steps.deployment.outputs.page_url }} 31 | runs-on: ubuntu-latest 32 | steps: 33 | - name: Checkout 34 | uses: actions/checkout@v4 35 | with: 36 | sparse-checkout: 'index/' 37 | - name: Setup Pages 38 | uses: actions/configure-pages@v4 39 | - name: Upload artifact 40 | uses: actions/upload-pages-artifact@v2 41 | with: 42 | # Upload docs directory 43 | path: 'index/' 44 | - name: Deploy to GitHub Pages 45 | id: deployment 46 | uses: actions/deploy-pages@v3 47 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | This is free and unencumbered software released into the public domain. 
2 | 
3 | Anyone is free to copy, modify, publish, use, compile, sell, or
4 | distribute this software, either in source code form or as a compiled
5 | binary, for any purpose, commercial or non-commercial, and by any
6 | means.
7 | 
8 | In jurisdictions that recognize copyright laws, the author or authors
9 | of this software dedicate any and all copyright interest in the
10 | software to the public domain. We make this dedication for the benefit
11 | of the public at large and to the detriment of our heirs and
12 | successors. We intend this dedication to be an overt act of
13 | relinquishment in perpetuity of all present and future rights to this
14 | software under copyright law.
15 | 
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
19 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
20 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
21 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22 | OTHER DEALINGS IN THE SOFTWARE.
23 | 
24 | For more information, please refer to <https://unlicense.org>
25 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # llama-cpp-python cuBLAS wheels
2 | Wheels for [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) compiled with cuBLAS support.
3 | 
4 | Requirements:
5 | - Windows x64, Linux x64, or MacOS 11.0+
6 | - CUDA 11.6 - 12.2
7 | - CPython 3.8 - 3.11
8 | 
9 | > [!WARNING]
10 | > MacOS 11 and Windows ROCm wheels are unavailable for 0.2.21+.
11 | > This is due to build issues with llama.cpp that are not yet resolved.
12 | 
13 | ROCm builds for AMD GPUs: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/tag/rocm
14 | Metal builds for MacOS 11.0+: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/tag/metal
15 | 
16 | Installation instructions:
17 | ---
18 | To install, you can use this command:
19 | ```
20 | python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
21 | ```
22 | This installs the latest llama-cpp-python version available from here, built for CUDA 11.7. Change `cu117` to select a different CUDA version.
23 | You can also change `AVX2` to `AVX`, `AVX512` or `basic` based on what your CPU supports.
24 | `basic` is a build without the `AVX`, `FMA` and `F16C` instructions, for old or basic CPUs.
25 | CPU-only builds are also available by changing `cu117` to `cpu`.
26 | 
27 | You can install a specific version with:
28 | ```
29 | python -m pip install llama-cpp-python==<version> --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
30 | ```
31 | An example for installing 0.1.62 for CUDA 12.1 on a CPU without AVX2 support:
32 | ```
33 | python -m pip install llama-cpp-python==0.1.62 --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu121
34 | ```
35 | List of available versions:
36 | ```
37 | python -m pip index versions llama-cpp-python --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
38 | ```
39 | 
40 | If you are replacing an existing installation, you may need to uninstall that version before running the command above.
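To confirm which build you actually ended up with, note that these wheels embed the build tag in the package version (the workflows append a local version label such as `+cu117` or `+cpuAVX2`), so a quick check after installation is:

```
python -m pip show llama-cpp-python
python -c "import llama_cpp; print(llama_cpp.__version__)"
```

This should print something like `0.2.14+cu117` for a CUDA 11.7 wheel.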
41 | You can also replace the existing version in one command like so:
42 | ```
43 | python -m pip install llama-cpp-python --force-reinstall --no-deps --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
44 | -OR-
45 | python -m pip install llama-cpp-python==0.1.66 --force-reinstall --no-deps --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
46 | -OR-
47 | python -m pip install llama-cpp-python --prefer-binary --upgrade --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
48 | ```
49 | 
50 | Wheels can be manually downloaded from: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels
51 | 
52 | ---
53 | I have made renamed llama-cpp-python packages available to ease the transition to GGUF.
54 | This is accomplished by installing the renamed package alongside the main llama-cpp-python package.
55 | This should allow applications to maintain GGML support while still supporting GGUF.
56 | ```
57 | python -m pip install llama-cpp-python-ggml --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
58 | ```
59 | 
60 | ---
61 | ### All wheels are compiled using GitHub Actions
62 | 


--------------------------------------------------------------------------------
/host-index.bat:
--------------------------------------------------------------------------------
1 | @echo off
2 | 
3 | cd index
4 | python -m http.server 7860
5 | 


--------------------------------------------------------------------------------
/host-index.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | cd index
4 | python -m http.server 7860
5 | 


--------------------------------------------------------------------------------
/index/AVX/cpu/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cpu/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cpuavx-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cpuavx-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cpuavx-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cpuavx-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cpuavx-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cpuavx-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cpuavx-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cpuavx-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cpuavx-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cpuavx-cp311-cp311-win_amd64.whl
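Each entry in these generated index pages is a wheel whose version carries a local build tag (`+cpuavx`, `+cu117`, and so on), so a specific variant can be pinned explicitly with pip. A hypothetical example against the index above:

```
python -m pip install llama-cpp-python-ggml==0.1.78+cpuavx --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cpu
```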
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu116/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu116/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-win_amd64.whl
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu117/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu117/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-win_amd64.whl
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu118/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu118/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-win_amd64.whl
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu120/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu120/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-win_amd64.whl
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu121/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu121/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-win_amd64.whl
14 | 
15 | 
16 | 


--------------------------------------------------------------------------------
/index/AVX/cu122/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python
5 | llama_cpp_python_ggml
6 | 
7 | 
8 | 


--------------------------------------------------------------------------------
/index/AVX/cu122/llama-cpp-python-ggml/index.html:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.6
5 | CUDA 11.7
6 | CUDA 11.8
7 | CUDA 12.0
8 | CUDA 12.1
9 | CUDA 12.2
10 | ROCm 5.4.2
11 | ROCm 5.5
12 | ROCm 5.5.1
13 | ROCm 5.6.1
14 | cpu
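<!--
Directory convention: the first level selects the CPU instruction set the wheels
were compiled for (this AVX tree carries builds with AVX2 disabled, per the
build workflows later in this dump), the second selects the compute backend
(CUDA, ROCm, or cpu-only), and each package directory below that lists the
matching wheels.
-->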
15 | 16 | 17 | -------------------------------------------------------------------------------- /index/AVX/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX2/cpu/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cpu/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cpuavx2-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu116/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu116/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu117/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu118/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu120/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu121/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX2/cu122/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.6
5 | CUDA 11.7
6 | CUDA 11.8
7 | CUDA 12.0
8 | CUDA 12.1
9 | CUDA 12.2
10 | ROCm 5.4.2
11 | ROCm 5.5
12 | ROCm 5.5.1
13 | ROCm 5.6.1
14 | cpu
15 | 16 | 17 | -------------------------------------------------------------------------------- /index/AVX2/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX2/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX2/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX2/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/AVX512/cpu/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cpu/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cpuavx512-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu116/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu116/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu117/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu118/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu120/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu121/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/AVX512/cu122/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/AVX512/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.6
5 | CUDA 11.7
6 | CUDA 11.8
7 | CUDA 12.0
8 | CUDA 12.1
9 | CUDA 12.2
10 | cpu
11 | 12 | 13 | -------------------------------------------------------------------------------- /index/basic/cpu/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cpu/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cpubasic-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cpubasic-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cpubasic-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cpubasic-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cpubasic-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cpubasic-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cpubasic-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cpubasic-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cpubasic-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cpubasic-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu116/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu116/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu116-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu116-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu116-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu116-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu116-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu117/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu117-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu117-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu117-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu117-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu117-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu118/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu118-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu118-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu118-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu118-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu118-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu120/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu120-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu120-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu120-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu120-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu120-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu121/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu121-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu121-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu121-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu121-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu121-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | llama_cpp_python_ggml 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/basic/cu122/llama-cpp-python-ggml/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-linux_x86_64.whl
5 | llama_cpp_python_ggml-0.1.78+cu122-cp37-cp37m-win_amd64.whl
6 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-linux_x86_64.whl
7 | llama_cpp_python_ggml-0.1.78+cu122-cp38-cp38-win_amd64.whl
8 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-linux_x86_64.whl
9 | llama_cpp_python_ggml-0.1.78+cu122-cp39-cp39-win_amd64.whl
10 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-linux_x86_64.whl
11 | llama_cpp_python_ggml-0.1.78+cu122-cp310-cp310-win_amd64.whl
12 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-linux_x86_64.whl
13 | llama_cpp_python_ggml-0.1.78+cu122-cp311-cp311-win_amd64.whl
14 | 15 | 16 | -------------------------------------------------------------------------------- /index/basic/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.6
5 | CUDA 11.7
6 | CUDA 11.8
7 | CUDA 12.0
8 | CUDA 12.1
9 | CUDA 12.2
10 | ROCm 5.4.2
11 | ROCm 5.5
12 | ROCm 5.5.1
13 | ROCm 5.6.1
14 | cpu
15 | 16 | 17 | -------------------------------------------------------------------------------- /index/basic/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/basic/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/basic/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/basic/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | AVX
5 | AVX2
6 | AVX512
7 | basic
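<!--
These pages form a PEP 503-style "simple" index: each leaf directory such as
AVX2/cu118/llama-cpp-python/ holds an index.html whose links point at the wheel
files, so pip can resolve them directly. A minimal sketch for trying it locally,
assuming the index/ directory is served on port 8000 (the host name and port are
assumptions, not part of this repository):

    python -m http.server 8000 --directory index
    pip install llama-cpp-python --extra-index-url http://localhost:8000/AVX2/cu118/
-->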
8 | 9 | 10 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu117/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu117avx-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu118/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu118avx-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu120/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu120avx-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu121/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu121avx-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX/cu122/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu122avx-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.7
5 | CUDA 11.8
6 | CUDA 12.0
7 | CUDA 12.1
8 | CUDA 12.2
9 | ROCm 5.4.2
10 | ROCm 5.5
11 | ROCm 5.5.1
12 | ROCm 5.6.1
13 | 14 | 15 | -------------------------------------------------------------------------------- /index/textgen/AVX/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu117/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu117-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu118/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu118-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu120/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu120-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu121/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu121-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/AVX2/cu122/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu122-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/AVX2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.7
5 | CUDA 11.8
6 | CUDA 12.0
7 | CUDA 12.1
8 | CUDA 12.2
9 | ROCm 5.4.2
10 | ROCm 5.5
11 | ROCm 5.5.1
12 | ROCm 5.6.1
13 | 14 | 15 | -------------------------------------------------------------------------------- /index/textgen/AVX2/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX2/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX2/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/AVX2/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/basic/cu117/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/basic/cu117/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu117basic-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/basic/cu118/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/basic/cu118/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu118basic-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/basic/cu120/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/basic/cu120/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu120basic-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/basic/cu121/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/basic/cu121/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu121basic-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/basic/cu122/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | llama_cpp_python_ggml_cuda 6 | 7 | 8 | -------------------------------------------------------------------------------- /index/textgen/basic/cu122/llama-cpp-python-ggml-cuda/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp38-cp38-linux_x86_64.whl
5 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp38-cp38-win_amd64.whl
6 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp39-cp39-linux_x86_64.whl
7 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp39-cp39-win_amd64.whl
8 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp310-cp310-linux_x86_64.whl
9 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp310-cp310-win_amd64.whl
10 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp311-cp311-linux_x86_64.whl
11 | llama_cpp_python_ggml_cuda-0.1.78+cu122basic-cp311-cp311-win_amd64.whl
12 | 13 | 14 | -------------------------------------------------------------------------------- /index/textgen/basic/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | CUDA 11.7
5 | CUDA 11.8
6 | CUDA 12.0
7 | CUDA 12.1
8 | CUDA 12.2
9 | ROCm 5.4.2
10 | ROCm 5.5
11 | ROCm 5.5.1
12 | ROCm 5.6.1
13 | 14 | 15 | -------------------------------------------------------------------------------- /index/textgen/basic/rocm5.4.2/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/basic/rocm5.5.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/basic/rocm5.5/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/basic/rocm5.6.1/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | llama_cpp_python_cuda
5 | 6 | 7 | -------------------------------------------------------------------------------- /index/textgen/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | AVX
5 | AVX2
6 | basic
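<!--
The textgen/ tree mirrors the main index but serves the text-generation-webui
package names (llama_cpp_python_cuda, llama_cpp_python_ggml_cuda) and carries no
cpu backend. A hypothetical install, under the same local-hosting assumption as
the note in the root index above:

    pip install llama_cpp_python_cuda --extra-index-url http://localhost:8000/textgen/AVX2/cu117/
-->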
7 | 8 | 9 | -------------------------------------------------------------------------------- /old_workflows/build-all-wheels.yml: -------------------------------------------------------------------------------- 1 | name: Build All Wheels 2 | 3 | on: workflow_dispatch 4 | 5 | permissions: 6 | contents: write 7 | 8 | jobs: 9 | run_workflow_0-1-62: 10 | name: Build Wheels 0.1.62 11 | uses: ./.github/workflows/build-wheels-0.1.62.yml 12 | 13 | run_workflow_0-1-66: 14 | name: Build Wheels 0.1.66 15 | needs: run_workflow_0-1-62 16 | uses: ./.github/workflows/build-wheels-0.1.66.yml 17 | 18 | run_workflow_0-1-67: 19 | name: Build Wheels 0.1.67 20 | needs: run_workflow_0-1-66 21 | uses: ./.github/workflows/build-wheels-0.1.67.yml 22 | 23 | run_workflow_0-1-68: 24 | name: Build Wheels 0.1.68 25 | needs: run_workflow_0-1-67 26 | uses: ./.github/workflows/build-wheels-0.1.68.yml 27 | 28 | run_workflow_0-1-70: 29 | name: Build Wheels 0.1.70 30 | needs: run_workflow_0-1-68 31 | uses: ./.github/workflows/build-wheels-0.1.70.yml 32 | 33 | run_workflow_0-1-71: 34 | name: Build Wheels 0.1.71 35 | needs: run_workflow_0-1-70 36 | uses: ./.github/workflows/build-wheels-0.1.71.yml 37 | 38 | run_workflow_0-1-72: 39 | name: Build Wheels 0.1.72 40 | needs: run_workflow_0-1-71 41 | uses: ./.github/workflows/build-wheels-0.1.72.yml 42 | 43 | run_workflow_0-1-73: 44 | name: Build Wheels 0.1.73 45 | needs: run_workflow_0-1-72 46 | uses: ./.github/workflows/build-wheels-0.1.73.yml 47 | 48 | run_workflow_0-1-74: 49 | name: Build Wheels 0.1.74 50 | needs: run_workflow_0-1-73 51 | uses: ./.github/workflows/build-wheels-0.1.74.yml 52 | -------------------------------------------------------------------------------- /old_workflows/build-wheel-rocm-windows.yml: -------------------------------------------------------------------------------- 1 | name: Build ROCm Windows Wheel 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | version: 7 | description: 'Version tag of llama-cpp-python to build: v0.1.79' 8 | default: 'v0.1.79' 9 | required: false 10 | type: string 11 | workflow_call: 12 | inputs: 13 | version: 14 | description: 'Version tag of llama-cpp-python to build: v0.1.79' 15 | default: 'v0.1.79' 16 | required: false 17 | type: string 18 | 19 | permissions: 20 | contents: write 21 | 22 | jobs: 23 | build_libs: 24 | name: Build ROCm Lib 25 | runs-on: windows-latest 26 | defaults: 27 | run: 28 | shell: pwsh 29 | 30 | steps: 31 | - uses: actions/checkout@v3 32 | with: 33 | repository: 'abetlen/llama-cpp-python' 34 | ref: ${{ inputs.version }} 35 | submodules: 'recursive' 36 | 37 | - name: Install ROCm SDK 38 | run: | 39 | curl -LO https://download.amd.com/developer/eula/rocm-hub/AMD-Software-PRO-Edition-23.Q3-Win10-Win11-For-HIP.exe 40 | Start-Process 'AMD-Software-PRO-Edition-23.Q3-Win10-Win11-For-HIP.exe' -ArgumentList '-install' -NoNewWindow -Wait 41 | echo "C:\Program Files\AMD\ROCm\5.5\bin" >> $env:GITHUB_PATH 42 | echo 'ROCM_PATH=C:\Program Files\AMD\ROCm\5.5' >> $env:GITHUB_ENV 43 | echo 'HIP_PATH=C:\Program Files\AMD\ROCm\5.5' >> $env:GITHUB_ENV 44 | echo "ROCM_VERSION=5.5.1" >> $env:GITHUB_ENV 45 | 46 | - uses: actions/setup-python@v3 47 | with: 48 | python-version: "3.10" 49 | 50 | - name: Install Dependencies 51 | run: | 52 | python -m pip install cmake ninja 53 | 54 | - name: Build Lib 55 | run: | 56 | $env:CC = 'C:\Program Files\AMD\ROCm\5.5\bin\clang.exe' 57 | $env:CXX = 'C:\Program Files\AMD\ROCm\5.5\bin\clang++.exe' 58 | $env:CMAKE_PREFIX_PATH = 'C:\Program Files\AMD\ROCm\5.5' 59 | $env:VERBOSE = '1' 60 | 
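# The steps below build llama.dll from the vendored llama.cpp with hipBLAS
# enabled for the listed gfx targets; the wheel job later in this workflow
# downloads that DLL and injects it into the package via the setup.py
# package_data patch.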
mkdir 'build' 61 | Set-Location '.\vendor\llama.cpp' 62 | cmake -B build -G "Ninja" -DLLAMA_HIPBLAS=ON -DBUILD_SHARED_LIBS=ON '-DGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102' 63 | cmake --build build --config Release --target llama 64 | Copy-Item '.\build\bin\llama.dll' '..\..\build' 65 | 66 | - uses: actions/upload-artifact@v3 67 | with: 68 | name: 'win-rocm-lib' 69 | path: ./build/llama.dll 70 | 71 | build_wheel: 72 | name: Build ROCm Wheels 73 | runs-on: windows-latest 74 | needs: build_libs 75 | strategy: 76 | matrix: 77 | pyver: ["3.8", "3.9", "3.10", "3.11"] 78 | defaults: 79 | run: 80 | shell: pwsh 81 | env: 82 | PCKGVER: ${{ inputs.version }} 83 | 84 | steps: 85 | - uses: actions/checkout@v3 86 | with: 87 | repository: 'abetlen/llama-cpp-python' 88 | ref: ${{ inputs.version }} 89 | 90 | - uses: actions/download-artifact@v3 91 | with: 92 | name: 'win-rocm-lib' 93 | path: ./llama_cpp 94 | 95 | - uses: actions/setup-python@v3 96 | with: 97 | python-version: ${{ matrix.pyver }} 98 | 99 | - name: Install Dependencies 100 | run: | 101 | python -m pip install build wheel cmake scikit-build ninja 102 | 103 | - name: Build Wheel 104 | run: | 105 | $packageVersion = [version]$env:PCKGVER.TrimStart('v') 106 | $setup = Get-Content 'setup.py' -raw 107 | if ($packageVersion -lt [version]'0.1.78') {$newsetup = $setup.Replace("packages=[`"llama_cpp`", `"llama_cpp.server`"],","packages=[`"llama_cpp`", `"llama_cpp.server`"],`n package_data={'llama_cpp': ['llama.dll']},")} 108 | if ($packageVersion -gt [version]'0.1.77') {$newsetup = $setup.Replace('package_data={"llama_cpp": ["py.typed"]},','package_data={"llama_cpp": ["py.typed", "llama.dll"]},')} 109 | New-Item 'setup.py' -itemType File -value $newsetup -force 110 | python setup.py --skip-cmake bdist_wheel egg_info --tag-build=+rocm5.5.1 111 | 112 | - name: Upload files to a GitHub release 113 | id: upload-release 114 | uses: svenstaro/upload-release-action@2.6.1 115 | continue-on-error: true 116 | with: 117 | file: ./dist/*.whl 118 | tag: rocm 119 | file_glob: true 120 | make_latest: false 121 | overwrite: true 122 | 123 | - uses: actions/upload-artifact@v3 124 | if: steps.upload-release.outcome == 'failure' 125 | with: 126 | name: 'win-rocm-wheels' 127 | path: ./dist/*.whl 128 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.62.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.62 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.62' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: 
conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | $env:CUDAFLAGS = '-arch=all' 70 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 71 | 72 | - name: Upload files to a GitHub release 73 | id: upload-release 74 | uses: svenstaro/upload-release-action@2.6.1 75 | continue-on-error: true 76 | with: 77 | file: ./dist/*.whl 78 | tag: ${{ matrix.releasetag }} 79 | file_glob: true 80 | make_latest: false 81 | overwrite: true 82 | 83 | - uses: actions/upload-artifact@v3 84 | if: steps.upload-release.outcome == 'failure' 85 | with: 86 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 87 | path: ./dist/*.whl 88 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.66.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.66 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.66' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = 
[int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl 87 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.67.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.67 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.67' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = 
$env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl 87 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.68.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.68 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.68' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + 
' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl 87 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.69.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.69 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.69' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90' 67 | if ([version]$env:CUDAVER -ge [version]'12.0') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90'} 68 | if ([version]$env:CUDAVER -lt [version]'11.8') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86'} 69 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + 
' -DLLAMA_AVX2=off'} 70 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 71 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 72 | 73 | - name: Upload files to a GitHub release 74 | id: upload-release 75 | uses: svenstaro/upload-release-action@2.6.1 76 | continue-on-error: true 77 | with: 78 | file: ./dist/*.whl 79 | tag: ${{ matrix.releasetag }} 80 | file_glob: true 81 | make_latest: false 82 | overwrite: true 83 | 84 | - uses: actions/upload-artifact@v3 85 | if: steps.upload-release.outcome == 'failure' 86 | with: 87 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 88 | path: ./dist/*.whl 89 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.70.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.70 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.70' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90' 67 | if ([version]$env:CUDAVER -ge [version]'12.0') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90'} 68 | if ([version]$env:CUDAVER -lt [version]'11.8') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86'} 69 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + 
' -DLLAMA_AVX2=off'} 70 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 71 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 72 | 73 | - name: Upload files to a GitHub release 74 | id: upload-release 75 | uses: svenstaro/upload-release-action@2.6.1 76 | continue-on-error: true 77 | with: 78 | file: ./dist/*.whl 79 | tag: ${{ matrix.releasetag }} 80 | file_glob: true 81 | make_latest: false 82 | overwrite: true 83 | 84 | - uses: actions/upload-artifact@v3 85 | if: steps.upload-release.outcome == 'failure' 86 | with: 87 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 88 | path: ./dist/*.whl 89 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.71.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.71 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.71' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90' 67 | if ([version]$env:CUDAVER -ge [version]'12.0') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90'} 68 | if ([version]$env:CUDAVER -lt [version]'11.8') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86'} 69 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + 
' -DLLAMA_AVX2=off'} 70 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 71 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 72 | 73 | - name: Upload files to a GitHub release 74 | id: upload-release 75 | uses: svenstaro/upload-release-action@2.6.1 76 | continue-on-error: true 77 | with: 78 | file: ./dist/*.whl 79 | tag: ${{ matrix.releasetag }} 80 | file_glob: true 81 | make_latest: false 82 | overwrite: true 83 | 84 | - uses: actions/upload-artifact@v3 85 | if: steps.upload-release.outcome == 'failure' 86 | with: 87 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 88 | path: ./dist/*.whl 89 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.72.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.72 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: 'v0.1.72' 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90' 67 | if ([version]$env:CUDAVER -ge [version]'12.0') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=52;61-real;70-real;72-real;75-real;80-real;86-real;89-real;90'} 68 | if ([version]$env:CUDAVER -lt [version]'11.8') {$env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=35-real;37-real;52;61-real;70-real;72-real;75-real;80-real;86'} 69 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + 
' -DLLAMA_AVX2=off'} 70 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 71 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 72 | 73 | - name: Upload files to a GitHub release 74 | id: upload-release 75 | uses: svenstaro/upload-release-action@2.6.1 76 | continue-on-error: true 77 | with: 78 | file: ./dist/*.whl 79 | tag: ${{ matrix.releasetag }} 80 | file_glob: true 81 | make_latest: false 82 | overwrite: true 83 | 84 | - uses: actions/upload-artifact@v3 85 | if: steps.upload-release.outcome == 'failure' 86 | with: 87 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 88 | path: ./dist/*.whl 89 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.73.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.73 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: v0.1.73 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag 
}} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl 87 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.74.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.74 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: v0.1.74 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl 87 | -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.76.yml: 
-------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.76 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: [ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: v0.1.76 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl -------------------------------------------------------------------------------- /old_workflows/build-wheels-0.1.77.yml: -------------------------------------------------------------------------------- 1 | name: Build Wheels 0.1.77 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | permissions: 8 | contents: write 9 | 10 | jobs: 11 | build_wheels: 12 | name: ${{ matrix.os }} ${{ matrix.pyver }} ${{ matrix.cuda }} ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 13 | runs-on: ${{ matrix.os }} 14 | strategy: 15 | matrix: 16 | os: 
[ubuntu-20.04, windows-latest] 17 | pyver: ["3.7", "3.8", "3.9", "3.10", "3.11"] 18 | cuda: ["11.6.2", "11.7.1", "11.8.0", "12.0.1", "12.1.1", "12.2.0"] 19 | releasetag: ["AVX","wheels","AVX512"] 20 | defaults: 21 | run: 22 | shell: pwsh 23 | env: 24 | CUDAVER: ${{ matrix.cuda }} 25 | AVXVER: ${{ matrix.releasetag }} 26 | 27 | steps: 28 | - uses: actions/checkout@v3 29 | with: 30 | repository: 'abetlen/llama-cpp-python' 31 | ref: v0.1.77 32 | submodules: 'recursive' 33 | 34 | - uses: actions/setup-python@v3 35 | with: 36 | python-version: ${{ matrix.pyver }} 37 | 38 | - name: Setup Mamba 39 | uses: conda-incubator/setup-miniconda@v2.2.0 40 | with: 41 | activate-environment: "build" 42 | python-version: ${{ matrix.pyver }} 43 | miniforge-variant: Mambaforge 44 | miniforge-version: latest 45 | use-mamba: true 46 | add-pip-as-python-dependency: true 47 | auto-activate-base: false 48 | 49 | - name: Install Dependencies 50 | run: | 51 | $cudaVersion = $env:CUDAVER 52 | $cudaChannels = '' 53 | $cudaNum = [int]$cudaVersion.substring($cudaVersion.LastIndexOf('.')+1) 54 | while ($cudaNum -ge 0) { $cudaChannels += '-c nvidia/label/cuda-' + $cudaVersion.Remove($cudaVersion.LastIndexOf('.')+1) + $cudaNum + ' '; $cudaNum-- } 55 | mamba install -y 'cuda' $cudaChannels.TrimEnd().Split() 56 | python -m pip install build wheel 57 | 58 | - name: Build Wheel 59 | run: | 60 | $cudaVersion = $env:CUDAVER.Remove($env:CUDAVER.LastIndexOf('.')).Replace('.','') 61 | $env:CUDA_PATH = $env:CONDA_PREFIX 62 | $env:CUDA_HOME = $env:CONDA_PREFIX 63 | if ($IsLinux) {$env:LD_LIBRARY_PATH = $env:CONDA_PREFIX + '/lib:' + $env:LD_LIBRARY_PATH} 64 | $env:VERBOSE = '1' 65 | $env:FORCE_CMAKE = '1' 66 | $env:CMAKE_ARGS = '-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all' 67 | if ($env:AVXVER -eq 'AVX') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX2=off'} 68 | if ($env:AVXVER -eq 'AVX512') {$env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DLLAMA_AVX512=on'} 69 | python -m build --wheel -C--build-option=egg_info "-C--build-option=--tag-build=+cu$cudaVersion" 70 | 71 | - name: Upload files to a GitHub release 72 | id: upload-release 73 | uses: svenstaro/upload-release-action@2.6.1 74 | continue-on-error: true 75 | with: 76 | file: ./dist/*.whl 77 | tag: ${{ matrix.releasetag }} 78 | file_glob: true 79 | make_latest: false 80 | overwrite: true 81 | 82 | - uses: actions/upload-artifact@v3 83 | if: steps.upload-release.outcome == 'failure' 84 | with: 85 | name: ${{ matrix.releasetag == 'wheels' && 'AVX2' || matrix.releasetag }} 86 | path: ./dist/*.whl -------------------------------------------------------------------------------- /workflows.md: -------------------------------------------------------------------------------- 1 | All workflows are configured to accept a llama-cpp-python release tag to build a specific version of the package. 2 | For the most part, they are written to account for changes in every version since 0.1.62. 3 | 4 | Primary workflows used for new llama-cpp-python releases 5 | ---- 6 | - `build-wheels.yml` 7 | - This workflow will build around 192 wheels for various CUDA, Python and CPU configurations. After this, it will call the `build-wheels-cpu.yml` workflow. 8 | - `build-wheels-full-release.yml` 9 | - This workflow calls these workflows in order: `build-wheels.yml build-wheels-oobabooga.yml build-wheels-rocm-full.yml build-wheels-macos.yml` 10 | - Somewhere around 370 wheels are produced in total, last I checked. 
This number will likely increase as additional builds, such as macOS Metal, are eventually included. 11 | - `build-wheels-prioritized-release.yml` 12 | - This workflow is much like `build-wheels-full-release.yml`, except `build-wheels.yml` and `build-wheels-oobabooga.yml` are incorporated into the workflow itself (with minor modifications) instead of being called. 13 | - This workflow is configured to build the wheels used by [text-generation-webui](https://github.com/oobabooga/text-generation-webui) first, because the long runtime of the workflow (currently 5-6 hours) was causing significant delays in updating the project. 14 | - `build-wheels-cpu.yml` 15 | - This workflow builds CPU-only wheels for all of the CPU configurations supported by the other workflows. 16 | - It was made because the wheels in the main repo are only built to support the default configuration of `AVX2`. 17 | 18 | ~~These workflows, and their dependents, were recently optimized to significantly reduce run times from 6 hours for the longest down to around 2 hours.~~ 19 | These optimizations are incompatible with llama-cpp-python 0.2.X+ because abetlen switched the build backend to one that does not support modifying the build process. 20 | Copies of the optimized workflows can be found in the `old_workflows` directory. 21 | 22 | Renamed package workflows 23 | ---- 24 | These workflows produce packages renamed under different namespaces so they can be installed alongside the main package. 25 | - `build-wheels-oobabooga*.yml` 26 | - These workflows build wheels with the package renamed to `llama_cpp_python_cuda`. 27 | - These wheels allow applications to support both CPU and CUDA builds of llama-cpp-python simultaneously. 28 | - As the name implies, this was made for text-generation-webui. 29 | - `build-wheels-ggml*.yml` 30 | - These workflows were made to produce wheels for llama-cpp-python 0.1.78 under the name `llama_cpp_python_ggml`. 31 | - This allows applications to maintain GGML support while updating to newer versions of llama-cpp-python. 32 | - Intended to be a temporary measure until more models are converted to GGUF. 33 | 34 | Configuration-specific workflows 35 | ---- 36 | - `*-basic.yml` 37 | - `*-avx.yml` 38 | 39 | These are copies of other workflows with build matrices limited to specific configurations. 40 | For the most part, I made these to rebuild previous versions of llama-cpp-python as needed to support new configurations that were added to the main workflows. 41 | 42 | Batch build workflows 43 | ---- 44 | - `build-wheels-batch-*.yml` 45 | 46 | These workflows accept a comma-separated string of llama-cpp-python release tags. 47 | They then use PowerShell to parse the input and construct a JSON string that is used to form a job matrix (a minimal sketch of this pattern appears after this file). 48 | Associated workflows are then called as needed to build each version. 49 | Only one workflow is executed at a time due to the large number of jobs that can be generated. 50 | 51 | Experimental workflows used for more specialized builds 52 | ---- 53 | - `build-wheel-rocm.yml` 54 | - This workflow builds Linux and Windows wheels for AMD GPUs using ROCm. 55 | - Linux wheels are built using these ROCm versions: `5.4.2`, `5.5` and `5.6.1` 56 | - Currently considered experimental until someone with an AMD GPU can confirm whether the resulting wheels work. 57 | - `build-wheels-oobabooga-rocm.yml` 58 | - This workflow is much like the previous one. It additionally builds `llama_cpp_python_cuda` wheels.
59 | - `build-wheels-rocm-full.yml` 60 | - This workflow is essentially a combination of the previous two. 61 | - `build-wheels-macos.yml` 62 | - This workflow builds wheels with macOS Metal support for macOS 11, 12 and 13. 63 | - Builds separate wheels for Intel and Apple Silicon CPUs. 64 | - Is currently experimental and may not produce functional Metal wheels. I do not have a Mac to test with, so I can only go by the build logs. 65 | 66 | Utility workflows 67 | ---- 68 | - `deploy-index.yml` 69 | - This workflow is for deploying the package index to GitHub Pages. 70 | - It is configured to run automatically when the index's HTML files change (a minimal sketch of this trigger is shown below). 71 | - `build-wheels-test.yml` 72 | - This workflow is entirely for testing new workflow code and is changed frequently as needed. 73 | --------------------------------------------------------------------------------
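A minimal sketch of the batch-workflow pattern described in `workflows.md` above: parse a comma-separated tag string into a JSON job matrix, then call a reusable workflow once per tag. The `define_matrix`/`build` job names and the `versions` input name are assumptions for illustration; the actual `build-wheels-batch-*.yml` files may differ in details. It assumes `build-wheels.yml` accepts a `version` input, as `workflows.md` indicates.

```yaml
# Hypothetical sketch, not copied from the repository.
name: Build Wheels Batch (sketch)

on:
  workflow_dispatch:
    inputs:
      versions:
        description: 'Comma-separated llama-cpp-python release tags: v0.1.62,v0.1.77'
        required: true
        type: string

jobs:
  define_matrix:
    runs-on: ubuntu-20.04
    outputs:
      matrix: ${{ steps.set_matrix.outputs.matrix }}
    steps:
      - id: set_matrix
        shell: pwsh
        run: |
          # Split the input on commas, trim whitespace, drop empty entries
          $versions = '${{ inputs.versions }}'.Split(',').Trim() | Where-Object { $_ }
          # Build the JSON string that forms the job matrix
          $matrix = @{ version = @($versions) } | ConvertTo-Json -Compress
          Write-Output "matrix=$matrix" >> $env:GITHUB_OUTPUT

  build:
    needs: define_matrix
    strategy:
      max-parallel: 1  # one workflow at a time, as noted above
      matrix: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
    uses: ./.github/workflows/build-wheels.yml
    with:
      version: ${{ matrix.version }}
```

Each `version` entry in the matrix reaches the called workflow's `inputs.version`, matching how the per-version workflows above accept a release tag.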
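Likewise, a minimal sketch of the kind of path-filtered trigger `workflows.md` describes for `deploy-index.yml`; the branch name and exact path globs are assumptions.

```yaml
# Hypothetical sketch, not copied from the repository: redeploy the package
# index automatically whenever its HTML files change.
on:
  push:
    branches: ['main']       # assumed default branch
    paths:
      - 'index/**/*.html'    # the generated package index pages
  workflow_dispatch:         # still allow manual runs
```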