├── deploy
    └── README.md
└── README.md


/deploy/README.md:
--------------------------------------------------------------------------------
1 | todo
2 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # CodeLLaMA-chat
 2 | 
 3 | This project aims to train a conversational model based on CodeLLaMA, providing a multi-turn dialogue version that assists with code development and troubleshooting.
 4 | 
 5 | ## Model
 6 | - Base model
 7 |   - codellama base 34b: https://huggingface.co/TheBloke/CodeLlama-34B-fp16
 8 |   - codellama python 34b: https://huggingface.co/TheBloke/CodeLlama-34B-Python-fp16
 9 |   - codellama instruct 34b: https://huggingface.co/TheBloke/CodeLlama-34B-Instruct-fp16
10 |   - codellama base 13b: https://huggingface.co/TheBloke/CodeLlama-13B-fp16
11 |   - codellama python 13b: https://huggingface.co/TheBloke/CodeLlama-13B-Python-fp16
12 |   - codellama instruct 13b: https://huggingface.co/TheBloke/CodeLlama-13B-Instruct-fp16
13 |   - codellama base 7b: https://huggingface.co/TheBloke/CodeLlama-7B-fp16
14 |   - codellama python 7b: https://huggingface.co/TheBloke/CodeLlama-7B-Python-fp16
15 |   - codellama instruct 7b: https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-fp16
16 | - Chat model
17 |   - Chinese Chat 13B Version
18 |     - base model: CodeLlama-13B-fp16, training dataset: computer_zh_26k
19 |     - model link: https://huggingface.co/shareAI/CodeLLaMA-chat-13b-Chinese
20 |   - English Chat 13B Version
21 |     - base model: CodeLlama-13B-fp16, training dataset: computer_en_26k
22 |     - model link: https://huggingface.co/shareAI/CodeLlama-13b-English-Chat
23 | 
24 | ## Datasets
25 | - The 26k computer-related subset of ShareGPT-Chinese-English-90k
26 |   - Chinese version: https://huggingface.co/datasets/shareAI/ShareGPT-Chinese-English-90k/blob/main/sharegpt_jsonl/computer_zh_26k.jsonl
27 |   - English version: https://huggingface.co/datasets/shareAI/ShareGPT-Chinese-English-90k/blob/main/sharegpt_jsonl/computer_en_26k.jsonl
28 |   - ... (You can also translate them into other languages with GPT.)
29 |   - change-info tool for this data: https://huggingface.co/datasets/shareAI/ShareGPT-Chinese-English-90k/blob/main/shareGPT/change_info.py
30 |   - LLM training tool for this data: https://github.com/yangjianxin1/Firefly
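
The dataset files above use the `.jsonl` extension, which means one JSON record per line. The snippet below is a minimal sketch for inspecting a copy that has already been downloaded. The local file name and the helpers `read_jsonl` and `summarize_keys` are illustrative assumptions, and no particular record schema is assumed beyond each line being a JSON object, so the snippet only reports the top-level keys it finds.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize_keys(path):
    """Count how often each top-level key appears across all records."""
    counts = Counter()
    for record in read_jsonl(path):
        counts.update(record.keys())
    return counts


if __name__ == "__main__":
    # Assumed local path: a copy of computer_en_26k.jsonl downloaded beforehand.
    data_path = Path("computer_en_26k.jsonl")
    if data_path.exists():
        print(summarize_keys(data_path))
```

Checking the keys first is a cheap way to confirm the record layout before passing the file to a training tool.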
31 | 
32 | ## How to use
33 | ```python
34 | # Adapted from Firefly: https://github.com/yangjianxin1/Firefly
35 | from transformers import AutoModelForCausalLM, AutoTokenizer
36 | import torch
37 | 
38 | 
39 | def main():
40 |     model_name = 'shareAI/CodeLLaMA-chat-13b-Chinese' # set with your model path
41 | 
42 |     device = 'cuda'
43 |     max_new_tokens = 500    # maximum number of tokens generated per response
44 |     history_max_len = 1000  # maximum number of history tokens fed to the model
45 |     top_p = 0.9
46 |     temperature = 0.35
47 |     repetition_penalty = 1.0
48 | 
49 |     model = AutoModelForCausalLM.from_pretrained(
50 |         model_name,
51 |         trust_remote_code=True,
52 |         low_cpu_mem_usage=True,
53 |         torch_dtype=torch.float16,
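54 |         device_map='auto'  # weights are placed on the available devices at load time
55 |     ).eval()  # no .to(device): the model is already dispatched by device_map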
56 |     tokenizer = AutoTokenizer.from_pretrained(
57 |         model_name,
58 |         trust_remote_code=True,
59 |         use_fast=False
60 |     )
61 | 
62 | 
63 |     history_token_ids = torch.tensor([[]], dtype=torch.long)
64 | 
65 |     user_input = input('User：')
66 |     while True:
67 |         input_ids = tokenizer(user_input, return_tensors="pt", add_special_tokens=False).input_ids
68 |         eos_token_id = torch.tensor([[tokenizer.eos_token_id]], dtype=torch.long)
69 |         user_input_ids = torch.concat([input_ids, eos_token_id], dim=1)
70 |         history_token_ids = torch.concat((history_token_ids, user_input_ids), dim=1)
71 |         model_input_ids = history_token_ids[:, -history_max_len:].to(device)
72 |         with torch.no_grad():
73 |             outputs = model.generate(
74 |                 input_ids=model_input_ids, max_new_tokens=max_new_tokens, do_sample=True, top_p=top_p,
75 |                 temperature=temperature, repetition_penalty=repetition_penalty, eos_token_id=tokenizer.eos_token_id
76 |             )
77 |         model_input_ids_len = model_input_ids.size(1)
78 |         response_ids = outputs[:, model_input_ids_len:]
79 |         history_token_ids = torch.concat((history_token_ids, response_ids.cpu()), dim=1)
80 |         response = tokenizer.batch_decode(response_ids)
81 |         print("Bot：" + response[0].strip().replace(tokenizer.eos_token, ""))
82 |         user_input = input('User：')
83 | 
84 | 
85 | if __name__ == '__main__':
86 |     main()
87 | ```
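
The chat loop above keeps a running token history: each user turn is terminated with the EOS token, each model reply is appended as generated, and only the last `history_max_len` tokens are passed to `generate`. The sketch below reproduces that bookkeeping with plain Python lists so it can be checked without loading a model. The helper names `append_user_turn`, `append_reply`, and `build_model_input` are hypothetical, and the token ids are made up for illustration.

```python
def append_user_turn(history, user_ids, eos_id):
    """Return the history extended by a user turn followed by one EOS token."""
    return history + list(user_ids) + [eos_id]


def append_reply(history, reply_ids):
    """Return the history extended by the model reply exactly as generated."""
    return history + list(reply_ids)


def build_model_input(history, history_max_len):
    """Keep only the most recent history_max_len tokens (must be positive)."""
    return history[-history_max_len:]


if __name__ == "__main__":
    EOS = 2  # made-up EOS id for illustration
    history = []
    history = append_user_turn(history, [11, 12, 13], EOS)  # first user turn
    history = append_reply(history, [21, 22, EOS])          # reply ending in EOS
    history = append_user_turn(history, [31, 32], EOS)      # second user turn
    print(history)
    print(build_model_input(history, 5))
```

Because truncation counts tokens rather than turns, the oldest turn that is kept may be cut in the middle.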
88 | 
89 | ## Todo
90 | - Train a multi-turn dialogue version.
91 | - Code inference and API deployment.
92 | - Integration with IDE plugins.
93 | 


--------------------------------------------------------------------------------