├── .gitignore
├── .pre-commit-config.yaml
├── README.md
├── convert_llama_weights_to_hf.py
├── datautils.py
├── fused_attn.py
├── gptq.py
├── llama.py
├── llama_inference.py
├── llama_inference_dmapauto.py
├── llama_inference_offload.py
├── modelutils.py
├── opt.py
├── quant.py
├── quant_cuda.cpp
├── quant_cuda_kernel.cu
├── requirements.txt
├── santacoder.py
├── santacoder_inference.py
├── scripts
    ├── santacoder-16bit.sh
    ├── santacoder-32bit.sh
    ├── santacoder-4bit.sh
    ├── santacoder-8bit.sh
    ├── starcoder-16bit.sh
    ├── starcoder-32bit.sh
    ├── starcoder-4bit.sh
    ├── starcoder-8bit.sh
    ├── starcoderbase-16bit.sh
    ├── starcoderbase-32bit.sh
    ├── starcoderbase-4bit.sh
    └── starcoderbase-8bit.sh
├── setup_cuda.py
├── share_tensors_across_processes.py
└── test_kernel.py

/.gitignore:
--------------------------------------------------------------------------------
__pycache__/
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/.pre-commit-config.yaml
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/README.md
--------------------------------------------------------------------------------
/convert_llama_weights_to_hf.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/convert_llama_weights_to_hf.py
--------------------------------------------------------------------------------
/datautils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/datautils.py
--------------------------------------------------------------------------------
/fused_attn.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/fused_attn.py
--------------------------------------------------------------------------------
/gptq.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/gptq.py
--------------------------------------------------------------------------------
/llama.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/llama.py
--------------------------------------------------------------------------------
/llama_inference.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/llama_inference.py
--------------------------------------------------------------------------------
/llama_inference_dmapauto.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/llama_inference_dmapauto.py
--------------------------------------------------------------------------------
/llama_inference_offload.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/llama_inference_offload.py
--------------------------------------------------------------------------------
/modelutils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/modelutils.py
--------------------------------------------------------------------------------
/opt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/opt.py
--------------------------------------------------------------------------------
/quant.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/quant.py
--------------------------------------------------------------------------------
/quant_cuda.cpp:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/quant_cuda.cpp
--------------------------------------------------------------------------------
/quant_cuda_kernel.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/quant_cuda_kernel.cu
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/requirements.txt
--------------------------------------------------------------------------------
/santacoder.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/santacoder.py
--------------------------------------------------------------------------------
/santacoder_inference.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/santacoder_inference.py
--------------------------------------------------------------------------------
/scripts/santacoder-16bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/santacoder-16bit.sh
--------------------------------------------------------------------------------
/scripts/santacoder-32bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/santacoder-32bit.sh
--------------------------------------------------------------------------------
/scripts/santacoder-4bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/santacoder-4bit.sh
--------------------------------------------------------------------------------
/scripts/santacoder-8bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/santacoder-8bit.sh
--------------------------------------------------------------------------------
/scripts/starcoder-16bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoder-16bit.sh
--------------------------------------------------------------------------------
/scripts/starcoder-32bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoder-32bit.sh
--------------------------------------------------------------------------------
/scripts/starcoder-4bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoder-4bit.sh
--------------------------------------------------------------------------------
/scripts/starcoder-8bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoder-8bit.sh
--------------------------------------------------------------------------------
/scripts/starcoderbase-16bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoderbase-16bit.sh
--------------------------------------------------------------------------------
/scripts/starcoderbase-32bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoderbase-32bit.sh
--------------------------------------------------------------------------------
/scripts/starcoderbase-4bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoderbase-4bit.sh
--------------------------------------------------------------------------------
/scripts/starcoderbase-8bit.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/scripts/starcoderbase-8bit.sh
--------------------------------------------------------------------------------
/setup_cuda.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/setup_cuda.py
--------------------------------------------------------------------------------
/share_tensors_across_processes.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/share_tensors_across_processes.py
--------------------------------------------------------------------------------
/test_kernel.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mayank31398/GPTQ-for-SantaCoder/HEAD/test_kernel.py
--------------------------------------------------------------------------------