├── .gitignore
├── LICENSE
├── README.md
├── assets
    └── StreamingLLM.pdf
├── data
    └── mt_bench.jsonl
├── examples
    ├── eval_long_ppl.py
    └── run_streaming_llama.py
├── figures
    └── schemes.png
├── setup.py
└── streaming_llm
    ├── __init__.py
    ├── enable_streaming_llm.py
    ├── kv_cache.py
    ├── pos_shift
        ├── __init__.py
        ├── modify_falcon.py
        ├── modify_gpt_neox.py
        └── modify_llama.py
    └── utils.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/README.md
--------------------------------------------------------------------------------
/assets/StreamingLLM.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/assets/StreamingLLM.pdf
--------------------------------------------------------------------------------
/data/mt_bench.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/data/mt_bench.jsonl
--------------------------------------------------------------------------------
/examples/eval_long_ppl.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/examples/eval_long_ppl.py
--------------------------------------------------------------------------------
/examples/run_streaming_llama.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/examples/run_streaming_llama.py
--------------------------------------------------------------------------------
/figures/schemes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/figures/schemes.png
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/setup.py
--------------------------------------------------------------------------------
/streaming_llm/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/streaming_llm/enable_streaming_llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/enable_streaming_llm.py
--------------------------------------------------------------------------------
/streaming_llm/kv_cache.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/kv_cache.py
--------------------------------------------------------------------------------
/streaming_llm/pos_shift/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/streaming_llm/pos_shift/modify_falcon.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/pos_shift/modify_falcon.py
--------------------------------------------------------------------------------
/streaming_llm/pos_shift/modify_gpt_neox.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/pos_shift/modify_gpt_neox.py
--------------------------------------------------------------------------------
/streaming_llm/pos_shift/modify_llama.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/pos_shift/modify_llama.py
--------------------------------------------------------------------------------
/streaming_llm/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/streaming-llm/HEAD/streaming_llm/utils.py
--------------------------------------------------------------------------------