└── README.md


/README.md:
--------------------------------------------------------------------------------
  1 | # OpenChatML Specification v0.1
  2 | 
  3 | ## 1. Introduction
  4 | OpenChatML is a markup language designed for representing conversational data in a structured format. It provides a standardized way to encode chat messages, including the role of the speaker, the content of the message, and optional metadata such as the name of the speaker.
  5 | 
  6 | ### 1.1 Overview
  7 | 
  8 | This document specifies the syntax and semantics of OpenChatML so that conversational data can be exchanged between tools and systems in one standardized format. Each chat message records the role of the speaker, the content of the message, and optional metadata.
  9 | 
 10 | ### 1.2 Purpose
 11 | 
 12 | The purpose of OpenChatML is to address the following challenges in the conversational AI domain:
 13 | 
 14 | 1. Lack of interoperability between different platforms, tools, and systems.
 15 | 2. Inconsistency in the representation of conversational data across various sources and target systems.
 16 | 3. Limited expressiveness in capturing the nuances and context of conversations.
 17 | 4. Difficulty in extending and evolving existing formats to meet the changing needs of the conversational AI community.
 18 | 
 19 | ### 1.3 Scope
 20 | 
 21 | OpenChatML is designed to be a lightweight and flexible markup language for representing conversational data. It focuses on the core elements and structures necessary to capture the essence of conversations, while allowing for extensibility and customization.
 22 | 
 23 | The specification defines the syntax and semantics of OpenChatML, including the special tokens, message structure, thought structure, conversation structure, fill-in-the-middle tasks, multi-file sequences, and function calling. It also provides guidelines for parsing and generating OpenChatML data.
 24 | 
 25 | ### 1.4 Comparison to Other Specifications
 26 | 
 27 | OpenChatML differs from other specifications in the following aspects:
 28 | 
 29 | 1. Simplicity: OpenChatML prioritizes simplicity and readability, making it easy for humans to understand and write conversational data in this format.
 30 | 2. Flexibility: The specification allows for optional metadata attributes to be associated with each message, providing flexibility without imposing strict requirements.
 31 | 3. Fill-in-the-Middle Tasks: OpenChatML provides built-in support for fill-in-the-middle tasks, which are commonly used in conversational AI for completion and generation tasks.
 32 | 4. Multi-File Sequences: OpenChatML introduces the concept of multi-file sequences, enabling the representation of conversations that span multiple files or documents.
 33 | 
 34 | ### 1.5 Use Cases
 35 | 
 36 | OpenChatML is intended to be used in various conversational AI scenarios, including:
 37 | 
 38 | 1. Dialogue Systems: Representing and storing conversation data for training and evaluating dialogue systems.
 39 | 2. Chatbots: Building and deploying chatbots across different platforms, ensuring consistent handling of conversational data.
 40 | 3. Conversational Datasets: Creating, sharing, and analyzing conversational datasets for research and development purposes.
 41 | 4. Human-in-the-Loop Interactions: Representing and capturing interactions between human agents and AI systems for training and evaluation.
 42 | 
 43 | ## 2. Tokens
 44 | OpenChatML uses the following special tokens:
 45 | 
 46 | - `[BOS]`: Beginning-of-sequence token, marking the start of a conversation. OpenChatML does not redefine it; use the BOS token of the base model.
 47 | - `[EOS]`: End-of-sequence token, marking the end of a conversation. OpenChatML does not redefine it; use the EOS token of the base model.
 48 | - `<|im_start|>`: Start of Turn token, indicating the beginning of a new message within the conversation. 
 49 | - `<|im_end|>`: End of Turn token, indicating the end of the current message.
 50 | - `<|fim_prefix|>`: Before Cursor token, indicating the content before the cursor in a fill-in-the-middle task.
 51 | - `<|fim_suffix|>`: After Cursor token, indicating the content after the cursor in a fill-in-the-middle task.
 52 | - `<|fim_middle|>`: At Cursor token, indicating where the model should fill in content in a fill-in-the-middle task.
 53 | - `<|file_separator|>`: File Separator token, used to separate content from different files within the same sequence.
 54 | - `<|reflect|>`, `<|introspect|>`, `<|reason|>`: Flag tokens, indicating which thought blocks the model should generate.
 55 | - `<|start_reflect|>`, `<|end_reflect|>`: Delimiters of a block containing reflection on the user's position and intent.
 56 | - `<|start_introspect|>`, `<|end_introspect|>`: Delimiters of a block containing introspection on the model's position and intent.
 57 | - `<|start_reason|>`, `<|end_reason|>`: Delimiters of a block containing the chain of thought required to solve the question.
 58 | - `<|function_list|>`: Function List token, introducing the list of available tools.
 59 | - `<|function_output|>`: Function Output token, indicating the output from a tool call.
 60 | - `<|function_call|>`: Function Call token, indicating the tool to call.
 61 | 
 62 | ## 3. Message Structure
 63 | Each message in OpenChatML is represented as follows:
 64 | 
 65 | ```
 66 | <|im_start|>role [name=<name>]
 67 | message_content
 68 | <|im_end|>
 69 | ```
 70 | 
 71 | - `role`: A string indicating the role of the speaker. It must be one of the following: "system", "tool", "user", or "assistant".
 72 | - `name` (optional): A string representing the name of the speaker. If present, it should be added after the role, in the format `name=<name>`. The name cannot contain whitespace.
 73 | - `message_content`: The actual content of the message, which can span multiple lines.
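
The layout above can be produced with a few lines of string formatting. The following is a minimal sketch, not a reference implementation: the helper name `format_message` and its error handling are assumptions made for illustration, while the allowed roles and the no-whitespace rule for names come from the list above.

```python
from typing import Optional

# Roles permitted by the message structure described above.
VALID_ROLES = {"system", "tool", "user", "assistant"}


def format_message(role: str, content: str, name: Optional[str] = None) -> str:
    """Render one message (hypothetical helper, not defined by the spec)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    header = role
    if name is not None:
        # Names must be non-empty and must not contain whitespace.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid name: {name!r}")
        header += f" name={name}"
    return f"<|im_start|>{header}\n{content}\n<|im_end|>"


print(format_message("user", "Hello there, AI.", name="Eric"))
```

Rejecting bad roles and names at generation time keeps malformed headers out of training data, where they would be harder to detect later.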
 74 | 
 75 | ## 4. Thought Structure
 76 | OpenChatML introduces a structure for three kinds of thought: reflection, introspection, and reasoning. Each flag token (`<|reflect|>`, `<|introspect|>`, `<|reason|>`) present in the system prompt tells the model to generate the corresponding thought block.
 77 | 
 78 | Reflection is analysis of the user's emotions and intentions. Introspection is the model's analysis of its own emotions and intentions. Reasoning is the chain of thought used to solve the problem.
 79 | 
 80 | This behavior must be trained into the model with supervised fine-tuning (SFT).
 81 | 
 82 | Example:
 83 | 
 84 | ```
 85 | [BOS]<|im_start|>system
 86 | You are a helpful AI assistant.<|reflect|><|introspect|><|reason|><|im_end|>
 87 | <|im_start|>user
 88 | I have here a closed box with the label Band-Aid printed on it. What do you suppose is inside the box?<|im_end|>
 89 | <|im_start|>assistant
 90 | <|start_reflect|>The user is asking a straightforward question about the likely contents of a labeled box. They seem curious but not emotionally invested in the answer. There is no apparent hidden agenda or ulterior motive behind the question.<|end_reflect|>
 91 | <|start_introspect|>As an AI assistant, my goal is to provide helpful and informative responses to the user's questions. In this case, I do not have any strong feelings or opinions about the contents of the box. My intention is to make a reasonable inference based on the label and common knowledge.<|end_introspect|>
 92 | <|start_reason|>The box has a "Band-Aid" label printed on it. Band-Aid is a well-known brand name for adhesive bandages and related first-aid products. When a box is labeled with a brand name, it very often contains that brand's products inside. Therefore, the most logical inference is that this box contains Band-Aid brand adhesive bandages.<|end_reason|>
 93 | Based on the "Band-Aid" label printed on the box, it is very likely that the box contains Band-Aid brand adhesive bandages inside. Band-Aid is a famous brand name for this type of first-aid product. When a box has a brand name printed on the outside, it usually contains that brand's products on the inside. So while I cannot say with 100% certainty, I think it is a very reasonable assumption that you would find Band-Aid adhesive bandages if you opened up this labeled box.<|im_end|>[EOS]
 94 | ```
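
A consumer of such output usually needs to know which blocks were requested and to separate the thought blocks from the visible reply. The sketch below shows one way to do both with regular expressions. The function names `requested_thoughts` and `split_thoughts` are hypothetical, and the sketch assumes each kind of block appears at most once per assistant message.

```python
import re

THOUGHT_KINDS = ("reflect", "introspect", "reason")


def requested_thoughts(system_prompt: str) -> list:
    """List the thought kinds whose flag token appears in the system prompt."""
    return [kind for kind in THOUGHT_KINDS if f"<|{kind}|>" in system_prompt]


def split_thoughts(assistant_text: str):
    """Return (thoughts, reply); thoughts maps block kind to its text."""
    thoughts = {}
    reply = assistant_text
    for kind in THOUGHT_KINDS:
        pattern = re.compile(
            rf"<\|start_{kind}\|>(.*?)<\|end_{kind}\|>", re.DOTALL
        )
        match = pattern.search(reply)
        if match:
            thoughts[kind] = match.group(1).strip()
            # Cut the block out so only the visible reply remains.
            reply = reply[: match.start()] + reply[match.end():]
    return thoughts, reply.strip()


thoughts, reply = split_thoughts(
    "<|start_reason|>The label names a brand.<|end_reason|>\nProbably bandages."
)
print(thoughts, reply)
```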
 95 | 
 96 | ## 5. Conversation Structure
 97 | A conversation in OpenChatML is represented as a sequence of messages, enclosed within `[BOS]` and `[EOS]` tokens:
 98 | 
 99 | ```
100 | [BOS]<|im_start|>role1 [name=<name1>]
101 | message1
102 | <|im_end|>
103 | <|im_start|>role2 [name=<name2>]
104 | message2
105 | <|im_end|>
106 | ...
107 | [EOS]
108 | ```
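
Because `[BOS]` and `[EOS]` stand in for the base model's own tokens, a generator typically takes them as parameters. The following sketch wraps already-rendered messages into a conversation. The name `wrap_conversation` is hypothetical, and attaching the end token directly after the final `<|im_end|>` (as the examples in this document do) rather than on its own line is an assumption.

```python
from typing import Iterable


def wrap_conversation(
    rendered_messages: Iterable[str], bos: str = "[BOS]", eos: str = "[EOS]"
) -> str:
    """Join rendered messages with newlines and add the sequence tokens."""
    return bos + "\n".join(rendered_messages) + eos


messages = [
    "<|im_start|>user\nHello there, AI.\n<|im_end|>",
    "<|im_start|>assistant\nHi. Nice to meet you.\n<|im_end|>",
]
# Substitute the base model's real tokens, for example "<s>" and "</s>".
print(wrap_conversation(messages, bos="<s>", eos="</s>"))
```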
109 | 
110 | ## 6. Fill-in-the-Middle Tasks
111 | OpenChatML supports fill-in-the-middle (FIM) tasks where the model is asked to complete content given surrounding context. The FIM structure is represented as:
112 | 
113 | ```
114 | <|fim_prefix|>prefix_content<|fim_middle|><|fim_suffix|>suffix_content
115 | ```
116 | 
117 | The model should generate content to replace the `<|fim_middle|>` token, using `prefix_content` as the preceding context and `suffix_content` as the following context. The generated content should smoothly connect the prefix to the suffix.
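
As an illustration, the sketch below builds a FIM prompt in the token order shown above and splices a generated middle back into plain text. Both helper names are hypothetical, and the sketch assumes the prompt contains exactly one FIM hole.

```python
FIM_PREFIX = "<|fim_prefix|>"
FIM_HOLE = "<|fim_middle|><|fim_suffix|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Lay out prefix and suffix around the hole the model should fill."""
    return f"{FIM_PREFIX}{prefix}{FIM_HOLE}{suffix}"


def splice_completion(prompt: str, middle: str) -> str:
    """Remove the FIM tokens and insert the generated middle text."""
    if not prompt.startswith(FIM_PREFIX) or FIM_HOLE not in prompt:
        raise ValueError("not a FIM prompt")
    prefix, _, suffix = prompt[len(FIM_PREFIX):].partition(FIM_HOLE)
    return prefix + middle + suffix


prompt = build_fim_prompt(
    "The capital of France is ",
    ", which is known for its famous Eiffel Tower.",
)
print(splice_completion(prompt, "Paris"))
```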
118 | 
119 | ## 7. Multi-File Sequences
120 | OpenChatML allows combining content from multiple files into a single sequence using the `<|file_separator|>` token:
121 | 
122 | ```
123 | file1_content
124 | <|file_separator|>
125 | file2_content
126 | <|file_separator|>
127 | file3_content
128 | ```
129 | 
130 | The `<|file_separator|>` token is used to demarcate the boundaries between content from different files while keeping them as part of the same overall sequence. This can be useful for tasks involving multiple input sources.
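
A minimal sketch of joining and splitting such a sequence follows. The helper names are hypothetical, and placing the separator on its own line mirrors the layout above rather than a stated requirement. A sequence that begins with a separator would produce an empty first element when split.

```python
FILE_SEPARATOR = "<|file_separator|>"


def join_files(contents):
    """Combine file contents into one sequence, separator on its own line."""
    return f"\n{FILE_SEPARATOR}\n".join(contents)


def split_files(sequence):
    """Recover per-file contents, dropping the newlines around separators."""
    return [part.strip("\n") for part in sequence.split(FILE_SEPARATOR)]


sequence = join_files(["file1_content", "file2_content", "file3_content"])
print(split_files(sequence))
```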
131 | 
132 | ## 8. Function Calling
133 | OpenChatML supports function calling, allowing the model to interact with external tools and APIs. Function calling enables the model to perform specific tasks, retrieve information, and generate more accurate and relevant responses based on the available tools.  The design for function calling in OpenChatML is adapted from the [Hermes-Function-Calling](https://github.com/NousResearch/Hermes-Function-Calling) project, by Nous Research.
134 | 
135 | ### 8.1 Function Signature
136 | 
137 | To enable function calling, list the available functions or tools in the system message after the `<|function_list|>` token. (The Hermes format that this design is adapted from wraps the same list in `<tools>` and `</tools>` XML tags.) Each function signature is a JSON object with the following properties:
138 | 
139 | - `type`: Indicates the type of the tool, which should be "function".
140 | - `function`: An object representing the function details, containing:
141 |   - `name`: The name of the function.
142 |   - `description`: A brief description of what the function does.
143 |   - `parameters`: An object specifying the parameters of the function, following the JSON Schema format.
144 | 
145 | Example function signature:
146 | 
147 | ```
148 | <|function_list|>
149 | {
150 |   "type": "function",
151 |   "function": {
152 |     "name": "get_stock_fundamentals",
153 |     "description": "Get fundamental data for a given stock symbol using yfinance API.",
154 |     "parameters": {
155 |       "type": "object",
156 |       "properties": {
157 |         "symbol": {
158 |           "type": "string"
159 |         }
160 |       },
161 |       "required": ["symbol"]
162 |     }
163 |   }
164 | }
165 | ```
166 | 
167 | ### 8.2 Function Call
168 | 
169 | To make a function call, the model should emit the `<|function_call|>` token followed by a JSON object (Hermes uses `<tool_call>` and `</tool_call>` XML tags for this). The JSON object should conform to the following JSON Schema, generated from a Pydantic model:
170 | 
171 | ```json
172 | {
173 |   "title": "FunctionCall",
174 |   "type": "object",
175 |   "properties": {
176 |     "arguments": {
177 |       "title": "Arguments",
178 |       "type": "object"
179 |     },
180 |     "name": {
181 |       "title": "Name",
182 |       "type": "string"
183 |     }
184 |   },
185 |   "required": ["arguments", "name"]
186 | }
187 | ```
188 | 
189 | Example function call:
190 | 
191 | ```
192 | <|function_call|>
193 | {"arguments": {"symbol": "TSLA"}, "name": "get_stock_fundamentals"}
194 | ```
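
On the receiving side, the call has to be extracted and checked before anything is executed. The sketch below does this with the standard library only, checking the two required fields by hand instead of using a schema validator. The function name `parse_function_call` is hypothetical.

```python
import json

FUNCTION_CALL = "<|function_call|>"


def parse_function_call(text: str) -> dict:
    """Extract and validate the JSON object that follows the call token."""
    text = text.strip()
    if not text.startswith(FUNCTION_CALL):
        raise ValueError("missing <|function_call|> token")
    # json.loads tolerates the newline between the token and the object.
    call = json.loads(text[len(FUNCTION_CALL):])
    if not isinstance(call, dict):
        raise ValueError("function call must be a JSON object")
    if not isinstance(call.get("name"), str):
        raise ValueError("'name' must be a string")
    if not isinstance(call.get("arguments"), dict):
        raise ValueError("'arguments' must be an object")
    return call


call = parse_function_call(
    '<|function_call|>\n'
    '{"arguments": {"symbol": "TSLA"}, "name": "get_stock_fundamentals"}'
)
print(call["name"], call["arguments"])
```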
195 | 
196 | ### 8.3 Function Response
197 | 
198 | After the function call is executed, pass the result back to the model in a `tool` message, after the `<|function_output|>` token (Hermes uses `<tool_response>` and `</tool_response>` XML tags for this). The response should be a JSON object containing the function name and the content of the response.
199 | 
200 | Example function response:
201 | 
202 | ```
203 | <|function_output|>
204 | {
205 |   "name": "get_stock_fundamentals",
206 |   "content": {
207 |     "symbol": "TSLA",
208 |     "company_name": "Tesla, Inc.",
209 |     "sector": "Consumer Cyclical",
210 |     "industry": "Auto Manufacturers",
211 |     "market_cap": 611384164352,
212 |     "pe_ratio": 49.604652,
213 |     "pb_ratio": 9.762013,
214 |     "dividend_yield": null,
215 |     "eps": 4.3,
216 |     "beta": 2.427,
217 |     "52_week_high": 299.29,
218 |     "52_week_low": 152.37
219 |   }
220 | }
221 | ```
222 | 
223 | ### 8.4 Recursive Function Calls
224 | 
225 | OpenChatML allows recursive function calls, where the model makes multiple function calls in a single conversation turn. The model can emit multiple `<|function_call|>` tokens, and the corresponding `<|function_output|>` results should be provided in the same order.
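
The ordering requirement can be met by executing the calls in the order they appear in the turn. The following is a sketch under stated assumptions: `run_function_calls` and the `registry` mapping from function names to Python callables are hypothetical, and `get_stock_fundamentals` here is a stub rather than a real data lookup.

```python
import json

FUNCTION_CALL = "<|function_call|>"
FUNCTION_OUTPUT = "<|function_output|>"


def run_function_calls(assistant_text, registry):
    """Execute every call in one assistant turn, preserving call order."""
    decoder = json.JSONDecoder()
    outputs = []
    # Text before the first token is not a call, so skip the first chunk.
    for chunk in assistant_text.split(FUNCTION_CALL)[1:]:
        # raw_decode stops at the end of the JSON object, so any text
        # that follows the call is ignored.
        call, _ = decoder.raw_decode(chunk.lstrip())
        result = registry[call["name"]](**call["arguments"])
        payload = {"name": call["name"], "content": result}
        outputs.append(f"{FUNCTION_OUTPUT}\n{json.dumps(payload)}")
    return outputs


def get_stock_fundamentals(symbol):
    # Hypothetical stub; a real tool would query a data source.
    return {"symbol": symbol}


turn = (
    '<|function_call|>\n'
    '{"arguments": {"symbol": "TSLA"}, "name": "get_stock_fundamentals"}\n'
    '<|function_call|>\n'
    '{"arguments": {"symbol": "AAPL"}, "name": "get_stock_fundamentals"}'
)
registry = {"get_stock_fundamentals": get_stock_fundamentals}
for output in run_function_calls(turn, registry):
    print(output)
```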
226 | 
227 | ### 8.5 Example Conversation with Function Calling
228 | 
229 | Here's an example conversation demonstrating function calling in OpenChatML:
230 | 
231 | ```
232 | [BOS]<|im_start|>system
233 | You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
234 | <|function_list|>
235 | {
236 |   "type": "function",
237 |   "function": {
238 |     "name": "get_stock_fundamentals",
239 |     "description": "Get fundamental data for a given stock symbol using yfinance API.",
240 |     "parameters": {
241 |       "type": "object",
242 |       "properties": {
243 |         "symbol": {
244 |           "type": "string"
245 |         }
246 |       },
247 |       "required": ["symbol"]
248 |     }
249 |   }
250 | }
251 | <|function_list|>
252 | Use the following pydantic model json schema for each tool call you will make:
253 | {
254 |   "title": "FunctionCall",
255 |   "type": "object",
256 |   "properties": {
257 |     "arguments": {
258 |       "title": "Arguments", 
259 |       "type": "object"
260 |     },
261 |     "name": {
262 |       "title": "Name",
263 |       "type": "string"
264 |     }
265 |   },
266 |   "required": ["arguments", "name"]
267 | }
268 | For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
269 | <|function_call|>
270 | {"arguments": <args-dict>, "name": <function-name>}
271 | <|im_end|>
272 | <|im_start|>user
273 | Fetch the stock fundamentals data for Tesla (TSLA)
274 | <|im_end|>
275 | <|im_start|>assistant
276 | <|function_call|>
277 | {"arguments": {"symbol": "TSLA"}, "name": "get_stock_fundamentals"}
278 | <|im_end|>
279 | <|im_start|>tool
280 | <|function_output|>
281 | {
282 |   "name": "get_stock_fundamentals",
283 |   "content": {
284 |     "symbol": "TSLA",
285 |     "company_name": "Tesla, Inc.",
286 |     "sector": "Consumer Cyclical", 
287 |     "industry": "Auto Manufacturers",
288 |     "market_cap": 611384164352,
289 |     "pe_ratio": 49.604652,
290 |     "pb_ratio": 9.762013,
291 |     "dividend_yield": null,
292 |     "eps": 4.3,
293 |     "beta": 2.427,
294 |     "52_week_high": 299.29,
295 |     "52_week_low": 152.37
296 |   }
297 | }
298 | <|im_end|>
299 | <|im_start|>assistant
300 | The stock fundamentals data for Tesla (TSLA) are as follows:
301 | - **Symbol**: TSLA  
302 | - **Company Name**: Tesla, Inc.
303 | - **Sector**: Consumer Cyclical
304 | - **Industry**: Auto Manufacturers
305 | - **Market Capitalization**: $611,384,164,352
306 | - **Price-to-Earnings Ratio (P/E Ratio)**: 49.60
307 | - **Price-to-Book Ratio (P/B Ratio)**: 9.76
308 | - **Dividend Yield**: N/A
309 | - **Earnings Per Share (EPS)**: $4.30
310 | - **Beta**: 2.427
311 | - **52-Week High**: $299.29
312 | - **52-Week Low**: $152.37
313 | 
314 | This data provides key financial metrics and ratios that can help assess Tesla's stock performance and valuation. The high P/E and P/B ratios suggest that investors have high expectations for Tesla's future growth. The lack of a dividend yield indicates that Tesla is reinvesting its profits into the business rather than distributing them to shareholders. The high beta value implies that Tesla's stock is more volatile than the overall market.
315 | 
316 | <|im_end|>[EOS]
317 | ```
318 | 
319 | ## 9. Examples
320 | Here are a few examples of OpenChatML structures:
321 | 
322 | Example conversation:
323 | ```
324 | [BOS]<|im_start|>user
325 | Hello there, AI.
326 | <|im_end|>
327 | <|im_start|>assistant
328 | Hi. Nice to meet you.
329 | <|im_end|>[EOS]
330 | ```
331 | 
332 | Example conversation with speaker name:
333 | ```
334 | [BOS]<|im_start|>user name=Eric
335 | Hello there, AI.
336 | <|im_end|>
337 | <|im_start|>assistant
338 | Hi Eric. Nice to meet you.
339 | <|im_end|>[EOS]
340 | ```
341 | 
342 | Example fill-in-the-middle task:
343 | ```
344 | <|fim_prefix|>The capital of France is <|fim_middle|><|fim_suffix|>, which is known for its famous Eiffel Tower.
345 | ```
346 | 
347 | Example multi-file sequence:
348 | ```
349 | This is the content from the first file.
350 | <|file_separator|>
351 | This is the content from the second file.
352 | And this is more content from the second file.
353 | <|file_separator|>
354 | Finally, this is the content from the third file.
355 | ```
356 | 
357 | Here are some more comprehensive examples showcasing various features and use cases of OpenChatML:
358 | 
359 | Example: Named roles and longer conversation
360 | ```
361 | [BOS]<|im_start|>system name=GoalTracker
362 | You are an AI assistant that helps users track and achieve their goals.
363 | <|im_end|>
364 | <|im_start|>user name=Alice
365 | Hi, I need help staying motivated to reach my fitness goals. Any advice?
366 | <|im_end|>
367 | <|im_start|>assistant name=FitnessCoach
368 | Hi Alice! To stay motivated, try setting specific, measurable goals and tracking your progress. Celebrate small victories along the way. Remember why you started this journey and keep that in mind when things get tough. I'm here to support you every step of the way!
369 | <|im_end|>
370 | <|im_start|>user name=Alice  
371 | Thanks, that's helpful! Can you suggest a good workout routine for beginners?
372 | <|im_end|>
373 | <|im_start|>assistant name=FitnessCoach
374 | Absolutely! Here's a simple beginner-friendly workout routine:
375 | 
376 | Monday: 30-minute brisk walk or jog
377 | Tuesday: 15 minutes of bodyweight exercises (squats, push-ups, lunges)
378 | Wednesday: Rest day or light stretching 
379 | Thursday: 30-minute bike ride or swim
380 | Friday: 15 minutes of bodyweight exercises
381 | Saturday: 30-minute yoga or Pilates session
382 | Sunday: Rest day
383 | 
384 | Remember to warm up before and cool down after each workout. Listen to your body and don't push too hard too soon. Consistency is key!
385 | <|im_end|>
386 | <|im_start|>user name=Bob
387 | Hey, I couldn't help but overhear. I'm also trying to get into a fitness routine. Mind if I join the conversation?
388 | <|im_end|>  
389 | <|im_start|>assistant name=FitnessCoach
390 | Of course, Bob! The more, the merrier. Feel free to chime in with your own experiences and questions. We can all learn from and support each other on our fitness journeys.
391 | <|im_end|>[EOS]
392 | ```
393 | 
394 | Example: Fill-in-the-middle task
395 | ```
396 | <|fim_prefix|>def fibonacci(n):
397 |     if n <= 0:
398 |         return []
399 |     elif n == 1:
400 |         return [0]
401 |     elif n == 2:
402 |         return [0, 1]
403 |     else:
404 |         fib = [0, 1]
405 |         <|fim_middle|><|fim_suffix|>
406 |         return fib
407 | 
408 | # Test the function
409 | print(fibonacci(10))
410 | ```
411 | 
412 | Completion:
413 | ```
414 | <|fim_prefix|>def fibonacci(n):
415 |     if n <= 0:  
416 |         return []
417 |     elif n == 1:
418 |         return [0]
419 |     elif n == 2:
420 |         return [0, 1]
421 |     else:
422 |         fib = [0, 1]
423 |         <|fim_middle|>for i in range(2, n):
424 |             fib.append(fib[i-1] + fib[i-2])
425 |         <|fim_suffix|>  
426 |         return fib
427 | 
428 | # Test the function
429 | print(fibonacci(10))
430 | ```
431 | 
432 | Example: Multi-file sequence for document summarization
433 | ```
434 | <|file_separator|>
435 | A black hole is a region of spacetime where gravity is so strong that nothing, not even light, can escape from it. The boundary of a black hole is called the event horizon, beyond which events cannot affect an outside observer. Black holes form when massive stars collapse at the end of their life cycle. 
436 | <|file_separator|>
437 | The first modern solution of general relativity that would characterize a black hole was found by Karl Schwarzschild in 1916. However, its interpretation as a region of space from which nothing can escape was first published by David Finkelstein in 1958. Long considered a mathematical curiosity, it was during the 1960s that theoretical work showed black holes were a generic prediction of general relativity.
438 | <|file_separator|>
439 | The discovery of neutron stars by Jocelyn Bell Burnell in 1967 sparked interest in gravitationally collapsed compact objects as a possible astrophysical reality. The first black hole known as such was Cygnus X-1, identified by several researchers independently in 1971. Black holes of stellar mass form when very massive stars collapse at the end of their life cycle.
440 | <|file_separator|>
441 | <|fim_prefix|><|fim_middle|><|fim_suffix|> Despite their invisible interior, the presence of black holes can be inferred through their interaction with other matter and with electromagnetic radiation such as visible light. If there are other stars orbiting a black hole, their orbit can be used to determine the black hole's mass and location. Matter falling into a black hole can form an accretion disk, one of the brightest objects in the universe.
442 | ```
443 | 
444 | Completion:
445 | ```
446 | <|fim_prefix|>Black holes are regions of spacetime where gravity is extremely strong, preventing anything, including light, from escaping. They form when massive stars collapse at the end of their life cycle. The first modern solution describing black holes was found by Karl Schwarzschild in 1916, but their interpretation as inescapable regions was published by David Finkelstein in 1958.
447 | <|fim_middle|>The existence of black holes was confirmed with the discovery of Cygnus X-1 in 1971. Black holes can be detected through their interaction with nearby matter and radiation. Stars orbiting a black hole can reveal its mass and location, while matter falling into a black hole forms a bright accretion disk.
448 | <|fim_suffix|> Despite their invisible interior, the presence of black holes can be inferred through their interaction with other matter and with electromagnetic radiation such as visible light. If there are other stars orbiting a black hole, their orbit can be used to determine the black hole's mass and location. Matter falling into a black hole can form an accretion disk, one of the brightest objects in the universe.
449 | ```
450 | 
451 | ## 10. Parsing and Generation
452 | When parsing OpenChatML, the following rules should be applied:
453 | - The `role` must be one of the predefined values: "system", "tool", "user", or "assistant".
454 | - The `name` attribute is optional and should be parsed if present.
455 | 
456 | When generating OpenChatML, the same structure and rules should be followed to ensure compatibility and consistency.
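
The parsing rules above can be sketched as a small regular-expression parser. This is an illustrative sketch, not a normative grammar: the name `parse_conversation`, the dictionary output shape, and the tolerance for trailing spaces on the header line are all assumptions.

```python
import re

VALID_ROLES = {"system", "tool", "user", "assistant"}

# Header line: role, optional name=<name>, then the message body.
MESSAGE_RE = re.compile(
    r"<\|im_start\|>(?P<role>\S+)(?:[ \t]+name=(?P<name>\S+))?[ \t]*\n"
    r"(?P<content>.*?)<\|im_end\|>",
    re.DOTALL,
)


def parse_conversation(text: str) -> list:
    """Return one dict per message; reject roles outside the allowed set."""
    messages = []
    for match in MESSAGE_RE.finditer(text):
        role = match.group("role")
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        messages.append(
            {
                "role": role,
                "name": match.group("name"),  # None when no name is given
                "content": match.group("content").strip("\n"),
            }
        )
    return messages


sample = (
    "[BOS]<|im_start|>user name=Eric\nHello there, AI.\n<|im_end|>\n"
    "<|im_start|>assistant\nHi Eric. Nice to meet you.\n<|im_end|>[EOS]"
)
for message in parse_conversation(sample):
    print(message)
```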
457 | 
458 | ## 11. References
459 | - Fill in the Middle (FIM): https://arxiv.org/abs/2207.14255
460 | - Quiet-STaR (chain of thought): https://arxiv.org/abs/2403.09629
461 | 


--------------------------------------------------------------------------------