├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── SECURITY.md
├── SUPPORT.md
└── labs
    ├── fine_tuning_dashboards
    │   ├── gpt_fine_tuning_aoai_dashboard.md
    │   └── llama2_fine_tuning_aml_dashboard.md
    ├── fine_tuning_notebooks
    │   ├── gpt_fine_tuning
    │   │   ├── azure.env
    │   │   ├── gpt_35_turbo_fine_tuning.ipynb
    │   │   ├── training_set.jsonl
    │   │   └── validation_set.jsonl
    │   └── llama2_fine_tuning
    │       ├── llama_2_7b_fine_tuning.ipynb
    │       └── text-generation-config.json
    └── images
        ├── instruction.md
        ├── screenshot-aml-endpoint-consume-api.png
        ├── screenshot-aml-endpoint-deployment-succeed.png
        ├── screenshot-aml-endpoint-test-interface.png
        ├── screenshot-aml-ft-create-compute-cluster-advanced-config.png
        ├── screenshot-aml-ft-create-compute-cluster.png
        ├── screenshot-aml-ft-llama2-7b-wizard.png
        ├── screenshot-aml-ft-llama2-7b.png
        ├── screenshot-aml-ft-model-deploy-wizard.png
        ├── screenshot-aml-ft-model-deploy.png
        ├── screenshot-aml-ft-model-details.png
        ├── screenshot-aml-ft-model-training-job-completed.png
        ├── screenshot-aml-ft-model-training-job-running.png
        ├── screenshot-aml-ft-select-training-data-map-columns.png
        ├── screenshot-aml-ft-select-training-data.png
        ├── screenshot-aml-model-catalog.png
        ├── screenshot-aml-search-llama2.png
        ├── screenshot-aoai-keys-and-endpoint.png
        ├── screenshot-azure-env-file.png
        ├── screenshot-deployed-fine-tuned-model-via-sdk.png
        └── screenshot-fine-tuning-illustration-diagram.png
/.gitignore:
--------------------------------------------------------------------------------
1 | # Jupyter Notebook
2 | .ipynb_checkpoints
3 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Microsoft Open Source Code of Conduct
2 |
3 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
4 |
5 | Resources:
6 |
7 | - [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
8 | - [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
9 | - Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns
10 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | This project welcomes contributions and suggestions. Most contributions require you to
4 | agree to a Contributor License Agreement (CLA) declaring that you have the right to,
5 | and actually do, grant us the rights to use your contribution. For details, visit
6 | https://cla.microsoft.com.
7 |
8 | When you submit a pull request, a CLA-bot will automatically determine whether you need
9 | to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
10 | instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
11 |
12 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
13 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
14 | or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 He Zhang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LLM Fine-Tuning using Azure
2 | A fine-tuning guide for both OpenAI and open-source Large Language Models on Azure. This repo is designed to be user-friendly both for people from non-technical backgrounds and for technical practitioners such as Data Scientists and Machine Learning Engineers.
3 |
4 | ## What
5 | Fine-Tuning, or *Supervised Fine-Tuning*, retrains an existing pre-trained LLM using example data, resulting in a new "custom" fine-tuned LLM that has been optimized for the provided task-specific examples.
6 |
7 |
8 | ## Why
9 | Typically, we use Fine-Tuning to:
10 | - improve LLM performance on specific tasks.
11 | - introduce information that wasn't well represented by the base LLM.
12 |
13 | Good use cases include:
14 | - steering the LLM outputs in a specific style or tone.
15 | - prompts that are too long or complex to fit into the LLM prompt window.
16 |
17 | ## When
18 | You may consider Fine-Tuning when:
19 | - you have tried Prompt Engineering and RAG approaches.
20 | - latency is critically important to the use case.
21 | - high accuracy is required to meet the customer requirement.
22 | - you have thousands of high-quality samples with ground-truth data.
23 | - you have clear evaluation metrics to benchmark fine-tuned models.
24 |
25 | ## Learning Path
26 | **Lab 1: LLM Fine-Tuning via *Dashboards***
27 | - [Lab 1.1](labs/fine_tuning_dashboards/gpt_fine_tuning_aoai_dashboard.md): Fine-Tuning GPT Models (*1h duration*)
28 | - [Lab 1.2](labs/fine_tuning_dashboards/llama2_fine_tuning_aml_dashboard.md): Fine-Tuning Llama2 Models (*1h duration*)
29 |
30 | **Lab 2: LLM Fine-Tuning via *Python SDK***
31 | - [Lab 2.1](labs/fine_tuning_notebooks/gpt_fine_tuning/gpt_35_turbo_fine_tuning.ipynb): Fine-Tuning GPT Models (*2h duration*)
32 | - [Lab 2.2](labs/fine_tuning_notebooks/llama2_fine_tuning/llama_2_7b_fine_tuning.ipynb): Fine-Tuning Llama2 Models (*2h duration*)
33 |
34 | ## Contributing
35 | This project welcomes contributions and suggestions. Most contributions require you to agree to a
36 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
37 | the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
38 |
39 | When you submit a pull request, a CLA bot will automatically determine whether you need to provide
40 | a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
41 | provided by the bot. You will only need to do this once across all repos using our CLA.
42 |
43 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
44 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
45 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
46 |
47 | ## Trademarks
48 | This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
49 | trademarks or logos is subject to and must follow
50 | [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
51 | Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
52 | Any use of third-party trademarks or logos is subject to those third parties' policies.
53 |
54 | ## Code of Conduct
55 | This project has adopted the
56 | [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
57 | For more information see the
58 | [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
59 | or contact [opencode@microsoft.com](mailto:opencode@microsoft.com)
60 | with any additional questions or comments.
61 |
62 | ## License
63 | Copyright (c) Microsoft Corporation. All rights reserved.
64 |
65 | Licensed under the [MIT](LICENSE) license.
66 |
67 | ### Reporting Security Issues
68 | [Reporting Security Issues](https://github.com/microsoft/repo-templates/blob/main/shared/SECURITY.md)
69 |
70 |
71 |
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | ## Security
4 |
5 | Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet) and [Xamarin](https://github.com/xamarin).
6 |
7 | If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/security.md/definition), please report it to us as described below.
8 |
9 | ## Reporting Security Issues
10 |
11 | **Please do not report security vulnerabilities through public GitHub issues.**
12 |
13 | Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/security.md/msrc/create-report).
14 |
15 | If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/security.md/msrc/pgp).
16 |
17 | You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
18 |
19 | Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
20 |
21 | * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
22 | * Full paths of source file(s) related to the manifestation of the issue
23 | * The location of the affected source code (tag/branch/commit or direct URL)
24 | * Any special configuration required to reproduce the issue
25 | * Step-by-step instructions to reproduce the issue
26 | * Proof-of-concept or exploit code (if possible)
27 | * Impact of the issue, including how an attacker might exploit the issue
28 |
29 | This information will help us triage your report more quickly.
30 |
31 | If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/security.md/msrc/bounty) page for more details about our active programs.
32 |
33 | ## Preferred Languages
34 |
35 | We prefer all communications to be in English.
36 |
37 | ## Policy
38 |
39 | Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/security.md/cvd).
40 |
41 |
42 |
--------------------------------------------------------------------------------
/SUPPORT.md:
--------------------------------------------------------------------------------
1 |
2 | # Support
3 |
4 | ## How to file issues and get help
5 |
6 | This project uses GitHub Issues to track bugs and feature requests. Please search the existing
7 | issues before filing new issues to avoid duplicates. For new issues, file your bug or
8 | feature request as a new Issue.
9 |
10 | For help and questions about using this project, please **REPO MAINTAINER: INSERT INSTRUCTIONS HERE
11 | FOR HOW TO ENGAGE REPO OWNERS OR COMMUNITY FOR HELP. COULD BE A STACK OVERFLOW TAG OR OTHER
12 | CHANNEL. WHERE WILL YOU HELP PEOPLE?**.
13 |
14 | ## Microsoft Support Policy
15 |
16 | Support for this **PROJECT or PRODUCT** is limited to the resources listed above.
17 |
--------------------------------------------------------------------------------
/labs/fine_tuning_dashboards/gpt_fine_tuning_aoai_dashboard.md:
--------------------------------------------------------------------------------
1 | ## Fine-Tuning GPT Models - A Dashboard Experience
2 | Learn how to fine-tune a GPT model using Azure OpenAI Studio - UI Dashboard.
3 |
4 | ### Prerequisites
5 | * Learn the [what, why, and when to use fine-tuning.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/fine-tuning-considerations)
6 | * An Azure subscription.
7 | * Access to Azure OpenAI Service.
8 | * An Azure OpenAI resource created in a supported fine-tuning region (e.g. Sweden Central).
9 | * GPT Models that support fine-tuning so far: *gpt-35-turbo-0613* and *gpt-35-turbo-1106*.
10 | * Prepare Training and Validation datasets:
11 | * at least 50 high-quality samples (preferably 1,000s) are required.
12 |   * must be formatted as a JSON Lines (JSONL) document with UTF-8 encoding.
13 |
14 | You can check the MS Learn document [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython&pivots=programming-language-studio) for more details.
15 |
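The JSONL requirements above can be checked locally before uploading. Below is a minimal sketch (not part of the official lab; the helper name and file path are illustrative) that parses each line of a chat-format training file and verifies the keys the fine-tuning service expects:

```python
import json

def validate_chat_jsonl(path):
    """Parse a chat-format JSONL file and verify its structure.

    Each line must be a JSON object with a non-empty 'messages' list,
    where every message has a known 'role' and a 'content' field.
    Returns the number of valid examples; raises ValueError on the
    first malformed line.
    """
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            record = json.loads(line)
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                raise ValueError(f"line {lineno}: missing 'messages' list")
            for message in messages:
                if message.get("role") not in ("system", "user", "assistant"):
                    raise ValueError(
                        f"line {lineno}: unexpected role {message.get('role')!r}")
                if "content" not in message:
                    raise ValueError(f"line {lineno}: message missing 'content'")
            count += 1
    return count
```

Running this against `training_set.jsonl` and `validation_set.jsonl` before uploading catches malformed lines early, instead of at job-submission time.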
16 | ### Step 1: Open the *Create a custom model* wizard
17 | 1. Open Azure OpenAI Studio at [https://oai.azure.com/](https://oai.azure.com/) and sign in with credentials that have access to your Azure OpenAI resource. During the sign-in workflow, select the appropriate directory, Azure subscription, and Azure OpenAI resource.
18 | 2. In Azure OpenAI Studio, browse to the **Management > Models** pane, and select **Create a custom model**.
19 | 
20 |
21 | ### Step 2: Select the *Base model*
22 | The first step in creating a custom model is to choose a base model.
23 |
24 | The **Base model** pane lets you choose a base model to use for your custom model. Select the base model from the **Base model type** dropdown, and then select **Next** to continue.
25 | 
26 |
27 | ### Step 3: Choose your *Training data*
28 | The next step is to choose your training data, either by selecting a previously uploaded dataset or by uploading a new one.
29 | 
30 |
31 | To upload new training data, use one of the following options:
32 | * Select **Local file** to upload training data from a local file.
33 | 
34 | * Select **Azure blob or other shared web locations** to import training data from Azure Blob or another shared web location.
35 | 
36 |
37 | ### Step 4 (Optional): Choose your *Validation data*
38 | You can choose your validation data following the same pattern as for your training data.
39 | 
40 |
41 | ### Step 5 (Optional): Configure *Advanced options*
42 | Select **Default** to use the default values for the fine-tuning job, or select **Advanced** to display and edit the hyperparameter values.
43 | 
44 |
45 | Refer to the MS Learn document [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython&pivots=programming-language-studio#configure-advanced-options) for a detailed explanation of the key tunable hyperparameters.
46 |
47 | ### Step 6: Review your choices and *Start Training job*
48 | If you're ready to train your model, select **Start Training job** to start the fine-tuning job and return to the **Models** pane.
49 | 
50 |
51 | You can check the status of the custom model in the **Status** column of the **Custom models** tab.
52 | 
53 |
54 | After you start a fine-tuning job, it can take some time to complete (from minutes to hours).
55 | 
56 |
57 | ### Step 7: Deploy a custom model
58 | When the fine-tuning job succeeds, you can deploy the custom model from the **Models** pane to make it available for use with completion calls.
59 |
60 | To deploy your custom model, select the custom model to deploy, and then select **Deploy model**.
61 | 
63 |
64 | The **Deploy model** dialog box opens.
65 |
66 | In the dialog box, enter your **Deployment name** and then select **Create** to start the deployment of your custom model.
67 | 
68 |
69 | ### Step 8: Test and use a deployed model
70 | After your custom model deploys, you can use it like any other deployed model.
71 |
72 | You can use the **Playgrounds** in [Azure OpenAI Studio](https://oai.azure.com) to experiment with your new deployment. You can also use the fine-tuned model by calling the completion API.
73 |
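As a sketch of the completion-API route, the call can be made with only the Python standard library. This is an illustration, not part of the lab: the endpoint, deployment name, key, and API version below are placeholders you would substitute with your own values.

```python
import json
import urllib.request

def build_chat_request(endpoint, deployment, api_key, messages,
                       api_version="2023-12-01-preview"):
    """Build an HTTP request for the Azure OpenAI chat completions API.

    `endpoint` looks like https://<resource>.openai.azure.com and
    `deployment` is the Deployment name chosen in the deploy step.
    """
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )

# Sending the request requires a live deployment, so it is left commented out:
# req = build_chat_request("https://<resource>.openai.azure.com",
#                          "<your-deployment-name>", "<your-api-key>",
#                          [{"role": "user", "content": "What's the capital of Australia?"}])
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

In practice you would typically call this through the `openai` Python SDK (as in Lab 2.1); the raw request above just shows what that call looks like on the wire.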
74 | ### Step 9 (Optional): Clean up your deployment resources
75 | When you're done with your custom model, you can delete the deployment and model. You can also delete the training and validation files you uploaded to the service, if needed.
76 |
77 | ### Step 10 (Optional): Continuous fine-tuning
78 | Once you have created a fine-tuned model you may wish to continue to refine the model over time through further fine-tuning. Continuous fine-tuning is the iterative process of selecting an already fine-tuned model as a base model and fine-tuning it further on new sets of training examples.
79 |
80 | To perform fine-tuning on a model that you have previously fine-tuned, use the same process as described in **Step 1**, but instead of specifying the name of a generic base model, specify your already fine-tuned model. A custom fine-tuned model name looks like `gpt-35-turbo-0613.ft-5fd1918ee65d4cd38a5dcf6835066ed7`.
81 | 
82 |
--------------------------------------------------------------------------------
/labs/fine_tuning_dashboards/llama2_fine_tuning_aml_dashboard.md:
--------------------------------------------------------------------------------
1 | ## Fine-Tuning Llama-2 Models - A Dashboard Experience
2 | Learn how to fine-tune a Llama-2 model using Azure Machine Learning (AML) Studio - UI Dashboard.
3 |
4 | ### Prerequisites
5 | * Learn the [what, why, and when to use fine-tuning.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/fine-tuning-considerations)
6 | * An Azure subscription.
7 | * Access to AML Service.
8 | * An AML resource created.
9 | * Prepare Training and Validation datasets:
10 | * at least 50 high-quality samples (preferably 1,000s) are required.
11 |   * must be formatted as a JSON Lines (JSONL) document with UTF-8 encoding.
12 |
13 | ### Step 1: Open the *Model catalog* wizard
14 | 1. Open Azure Machine Learning Studio at [https://ml.azure.com/](https://ml.azure.com/) and sign in with credentials that have access to your AML resource. During the sign-in workflow, select the appropriate directory, Azure subscription, and AML resource.
15 |
16 | 2. In AML Studio, browse to the **Model catalog** pane.
17 | 
18 |
19 | 3. In the search box, type `llama2`.
20 | 
21 |
22 | ### Step 2: Start the fine-tuning process
23 | Assume that you want to fine-tune the `llama-2-7b` model for a text generation task (the process is similar for chat-completion tasks).
24 |
25 | 1. The first step is to press the **Fine-tune** button to start the fine-tuning process.
26 | 
27 |
28 | 2. The **Fine-tune Llama-2-7b** blade lets you specify the task type (choose `Text generation` for our case), training data, validation data (optional), test data (optional), and an Azure ML compute cluster.
29 | 
30 |
31 | ### Step 3: Create an Azure ML compute cluster
32 | To run the fine-tuning job, an AML compute cluster needs to be created (if you haven't created one before).
33 |
34 | 1. The **\+ New** button at the bottom of the blade opens the **Create compute cluster** pane, where you need to specify the **Location** (e.g. `West Europe`), **Virtual machine tier** (`Dedicated`), **Virtual machine type** (`GPU`) and **Virtual machine size**.
35 | 
36 |
37 | Note that only NVIDIA ND40 and ND96 VMs are currently supported for fine-tuning. If you can't find them in the list, try choosing a different Location or request quota accordingly.
38 |
39 | 2. Give a name to the compute, and specify the minimum (usually `0`) and maximum (`1` for testing purposes) number of nodes.
40 | 
41 |
42 | Click Next to start the creation process. This may take a couple of minutes.
43 |
44 | ### Step 4: Choose your **Training data**
45 | The next step is to select your training data, either by choosing a previously uploaded dataset or by uploading a new one.
46 | 
47 |
48 | You also need to specify the 'prompt' (i.e. input) and the 'completion' (i.e. output) columns to guide the fine-tuning process.
49 | 
50 |
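To make the column mapping concrete: a text-generation training file is one JSON object per line, where each object carries the input column (here `prompt`) and the output column (here `completion`). The sketch below uses made-up sample rows and a hypothetical helper to write such a file in the required UTF-8 JSONL form:

```python
import json

# Illustrative sample rows; the 'prompt' and 'completion' keys are the
# columns you map in the fine-tuning wizard.
examples = [
    {"prompt": "Summarize: Azure Machine Learning is a cloud service for ML.",
     "completion": "A managed cloud service for training and deploying ML models."},
    {"prompt": "Translate to French: good morning",
     "completion": "bonjour"},
]

def write_jsonl(path, records):
    """Write one JSON object per line, UTF-8 encoded, as the service expects."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Your real dataset should follow the same shape, with at least the 50 high-quality samples noted in the prerequisites.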
51 | ### Step 5 (Optional): Choose your **Validation data**
52 | You can select your validation data by following the same procedure as for the training data. Or, you can leave the default setting (i.e. an automatic split of the training data will be used for validation).
53 |
54 | ### Step 6 (Optional): Choose your **Test data**
55 | You can select your test data by following the same procedure as for the training data. Or, you can leave the default setting (i.e. an automatic split of the training data will be used for testing).
56 |
57 | ### Step 7: Submit your fine-tuning job
58 | Now you are ready: click the **Finish** button at the bottom of the **Fine-tune Llama-2-7b** blade. This triggers the actual fine-tuning process. Depending on the size of your training data, it can take from minutes to hours.
59 | 
60 |
61 | After the fine-tuning job finishes, its **Status** becomes `Completed`.
62 | 
63 |
64 | ### Step 8: Deploy the fine-tuned model
65 | Before deploying the model, you need to register the model first.
66 |
67 | Go to **Assets > Models** pane, select the newly fine-tuned model, and click **\+ Register**.
68 | 
69 |
70 | After that, click the **\+ Deploy** button to invoke the Deployment blade, where you need to specify the **Virtual machine** (preferably choose the NVIDIA NC and ND VM series), **Instance count**, **Endpoint name** and **Deployment name**.
71 | 
72 |
73 | Click the **Deploy** button at the bottom to start the actual deployment process.
74 |
75 | This may take a moment, until both **Provisioning state** fields become `Succeeded`.
76 | 
77 |
78 | ### Step 9: Test and use a deployed model
79 | You can directly test the deployed model via the handy test playground.
80 | 
81 |
82 | You can also consume the API using a popular programming language such as Python.
83 | 
84 |
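A minimal consumption sketch using only the Python standard library is shown below. The scoring URI and key come from the endpoint's **Consume** tab; the payload schema here is a common shape for text-generation deployments and is an assumption — copy the exact schema shown on your endpoint's Consume tab.

```python
import json
import urllib.request

def build_scoring_request(scoring_uri, api_key, prompt):
    """Build a request for an AML managed online endpoint.

    NOTE: the 'input_data' payload shape below is assumed; verify it
    against the sample request on the endpoint's Consume tab.
    """
    body = json.dumps({
        "input_data": {
            "input_string": [prompt],
            "parameters": {"max_new_tokens": 100, "temperature": 0.7},
        }
    }).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )

# Sending the request requires a live endpoint, so it is left commented out:
# req = build_scoring_request("<scoring-uri>", "<endpoint-key>", "Once upon a time")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

The **Consume** tab also provides ready-made sample code in Python, C#, and R that you can use instead of this sketch.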
85 | ### Step 10 (Optional): Clean up your deployment resources
86 | When you're done with your custom model, you can delete the deployed endpoint, model, and the compute cluster.
87 |
88 | You can also delete the training (and validation and test) files you uploaded to the service, if needed.
89 |
90 |
91 |
92 |
93 |
94 |
95 |
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/gpt_fine_tuning/azure.env:
--------------------------------------------------------------------------------
1 | # Azure OpenAI Credentials
2 | AZURE_OPENAI_ENDPOINT = "Your_Azure_OpenAI_Endpoint"
3 | AZURE_OPENAI_API_KEY = "Your_Azure_OpenAI_API_Key"
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/gpt_fine_tuning/gpt_35_turbo_fine_tuning.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "44ad5da8",
6 | "metadata": {},
7 | "source": [
8 | "## Fine-Tuning GPT Models - A Python SDK Experience\n",
9 | "\n",
10 |     "Learn how to fine-tune the `gpt-35-turbo-0613` model using the Python SDK - a code experience. This notebook is based on the MS Learn tutorial [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/fine-tune?tabs=python%2Cbash).\n",
11 | "\n",
12 | "He Zhang, Feb. 2024"
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "id": "8a270ee2",
18 | "metadata": {},
19 | "source": [
20 | "### Prerequisites\n",
21 | "\n",
22 | "* Learn the [what, why, and when to use fine-tuning.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/fine-tuning-considerations)\n",
23 | "* An Azure subscription.\n",
24 | "* Access to Azure OpenAI Service.\n",
25 | "* An Azure OpenAI resource created in the supported fine-tuning region (e.g. Sweden Central).\n",
26 | "* Prepare Training and Validation datasets:\n",
27 | " * at least 50 high-quality samples (preferably 1,000s) are required.\n",
28 |     "  * must be formatted as a JSON Lines (JSONL) document with UTF-8 encoding.\n",
29 | " * for this test notebook, we use only 10 samples for the demo purpose. \n",
30 |     "* Python version at least: `3.7.1`\n",
31 |     "* Python libraries: `json, requests, os, tiktoken, time, python-dotenv, numpy, openai`\n",
32 |     "* The OpenAI Python library version for this test notebook: `0.28.1`\n",
33 | "* [Jupyter Notebooks](https://jupyter.org/)"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "id": "699f837b",
39 | "metadata": {},
40 | "source": [
41 | "### Step 1: Setup"
42 | ]
43 | },
44 | {
45 | "cell_type": "markdown",
46 | "id": "cd759f8f",
47 | "metadata": {},
48 | "source": [
49 | "#### Retrieve the Azure OpenAI API key and endpoint."
50 | ]
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "id": "399a4647",
55 | "metadata": {},
56 | "source": [
57 |     "Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section.\n",
58 |     "![]()"
59 | ]
60 | },
61 | {
62 | "cell_type": "markdown",
63 | "id": "79f3b044",
64 | "metadata": {},
65 | "source": [
66 | "#### Configure credentials"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "id": "fd153223",
72 | "metadata": {},
73 | "source": [
74 |     "Copy the `Endpoint` and access `KEY` (you can use either `KEY 1` or `KEY 2`), and paste them into the corresponding variables in the file `azure.env`. Save the file and close it. **Do not** distribute this file, as it contains credential information!\n",
75 |     "![]()"
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "id": "4d4574dc",
81 | "metadata": {},
82 | "source": [
83 | "#### Install required Python libraries (if not done yet)"
84 | ]
85 | },
86 | {
87 | "cell_type": "code",
88 | "execution_count": null,
89 | "id": "c445eac9",
90 | "metadata": {},
91 | "outputs": [],
92 | "source": [
93 |     "%pip install \"openai==0.28.1\" requests tiktoken python-dotenv numpy  # json, os, and time are part of the Python standard library"
94 | ]
95 | },
96 | {
97 | "cell_type": "markdown",
98 | "id": "f57d703e",
99 | "metadata": {},
100 | "source": [
101 | "#### Import required Python libraries "
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": null,
107 | "id": "229febe6",
108 | "metadata": {},
109 | "outputs": [],
110 | "source": [
111 | "import os\n",
112 | "import json\n",
113 | "import time\n",
114 | "import openai\n",
115 | "import requests\n",
116 | "import tiktoken\n",
117 | "import numpy as np\n",
118 | "\n",
119 | "from dotenv import load_dotenv"
120 | ]
121 | },
122 | {
123 | "cell_type": "markdown",
124 | "id": "bbebe593",
125 | "metadata": {},
126 | "source": [
127 | "#### Load Azure OpenAI credentials"
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": null,
133 | "id": "d6b7a343",
134 | "metadata": {},
135 | "outputs": [],
136 | "source": [
137 | "load_dotenv(\"azure.env\")\n",
138 | "\n",
139 | "openai.api_type = \"azure\"\n",
140 | "openai.api_key = os.getenv(\"AZURE_OPENAI_API_KEY\")\n",
141 | "openai.api_base = os.getenv(\"AZURE_OPENAI_ENDPOINT\")\n",
142 | "openai.api_version = \"2023-12-01-preview\" # This API version or later is required to access fine-tuning for turbo/babbage-002/davinci-002"
143 | ]
144 | },
145 | {
146 | "cell_type": "markdown",
147 | "id": "ddfdf556",
148 | "metadata": {},
149 | "source": [
150 | "### Step 2: Prepare Training & Validation Datasets"
151 | ]
152 | },
153 | {
154 | "cell_type": "markdown",
155 | "id": "96bed57b",
156 | "metadata": {},
157 | "source": [
158 | "#### The training and validation datasets have been made ready for you."
159 | ]
160 | },
161 | {
162 | "cell_type": "markdown",
163 | "id": "65e4e564",
164 | "metadata": {},
165 | "source": [
166 |     "`training_set.jsonl`"
167 | ]
168 | },
169 | {
170 | "cell_type": "markdown",
171 | "id": "72662b27",
172 | "metadata": {},
173 | "source": [
174 | "```yaml\n",
175 | "{\"messages\": \n",
176 | " [{\"role\": \"system\", \"content\": \"Clippy is a factual chatbot that is also sarcastic.\"}, \n",
177 | " {\"role\": \"user\", \"content\": \"Who discovered Antarctica?\"}, \n",
178 | " {\"role\": \"assistant\", \"content\": \"Some chaps named Fabian Gottlieb von Bellingshausen and Mikhail Lazarev, as if they don't teach that in every school!\"}]}\n",
179 | "\n",
180 | "{\"messages\": \n",
181 | " [{\"role\": \"system\", \"content\": \"Clippy is a factual chatbot that is also sarcastic.\"},\n",
182 | " {\"role\": \"user\", \"content\": \"What is the biggest ocean?\"}, \n",
183 | " {\"role\": \"assistant\", \"content\": \"The Pacific Ocean. It's not like it's a small pond or anything.\"}]}\n",
184 | "\n",
185 | "{\"messages\": ...}\n",
186 | "```"
187 | ]
188 | },
189 | {
190 | "cell_type": "markdown",
191 | "id": "a6e3b290",
192 | "metadata": {},
193 | "source": [
194 |     "`validation_set.jsonl`"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "id": "956ff181",
200 | "metadata": {},
201 | "source": [
202 | "```yaml\n",
203 | "{\"messages\": \n",
204 | " [{\"role\": \"system\", \"content\": \"Clippy is a factual chatbot that is also sarcastic.\"}, \n",
205 | " {\"role\": \"user\", \"content\": \"What's the capital of Australia?\"}, \n",
206 | " {\"role\": \"assistant\", \"content\": \"It's Canberra, not Sydney. Shocking, I know!\"}]}\n",
207 | "\n",
208 | "{\"messages\": \n",
209 | " [{\"role\": \"system\", \"content\": \"Clippy is a factual chatbot that is also sarcastic.\"},\n",
210 | " {\"role\": \"user\", \"content\": \"Who wrote 'The Great Gatsby'?\"}, \n",
211 | " {\"role\": \"assistant\", \"content\": \"Some guy named F. Scott Fitzgerald. Ever heard of him?\"}]}\n",
212 | " \n",
213 | "{\"messages\": ...}\n",
214 | "```"
215 | ]
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "id": "e19b30e0",
220 | "metadata": {},
221 | "source": [
222 | "#### Do initial checks"
223 | ]
224 | },
225 | {
226 | "cell_type": "code",
227 | "execution_count": null,
228 | "id": "dfca565a",
229 | "metadata": {},
230 | "outputs": [],
231 | "source": [
232 | "# Load the training set\n",
233 | "with open(\"training_set.jsonl\", \"r\", encoding=\"utf-8\") as f:\n",
234 | " training_dataset = [json.loads(line) for line in f]\n",
235 | "\n",
236 | "# Training dataset stats\n",
237 | "print(\"Number of examples in training set:\", len(training_dataset))\n",
238 | "print(\"First example in training set:\")\n",
239 | "for message in training_dataset[0][\"messages\"]:\n",
240 | " print(message)"
241 | ]
242 | },
243 | {
244 | "cell_type": "code",
245 | "execution_count": null,
246 | "id": "007634ed",
247 | "metadata": {},
248 | "outputs": [],
249 | "source": [
250 | "# Load the validation set\n",
251 | "with open(\"validation_set.jsonl\", \"r\", encoding=\"utf-8\") as f:\n",
252 | " validation_dataset = [json.loads(line) for line in f]\n",
253 | "\n",
254 | "# Validation dataset stats\n",
255 | "print(\"\\nNumber of examples in validation set:\", len(validation_dataset))\n",
256 | "print(\"First example in validation set:\")\n",
257 | "for message in validation_dataset[0][\"messages\"]:\n",
258 | " print(message)"
259 | ]
260 | },
261 | {
262 | "cell_type": "markdown",
263 | "id": "1ad9f529",
264 | "metadata": {},
265 | "source": [
266 | "#### Examine the token numbers\n",
267 | "Now you can run some additional code from OpenAI, using the tiktoken library, to validate the token counts. Individual examples need to remain under the `gpt-35-turbo-0613` model's input token limit of 4,096 tokens."
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": null,
273 | "id": "c6d9005d",
274 | "metadata": {},
275 | "outputs": [],
276 | "source": [
277 | "encoding = tiktoken.get_encoding(\"cl100k_base\") # default encoding used by gpt-4, turbo, and text-embedding-ada-002 models\n",
278 | "\n",
279 | "def num_tokens_from_messages(messages, tokens_per_message=3, tokens_per_name=1):\n",
280 | " num_tokens = 0\n",
281 | " for message in messages:\n",
282 | " num_tokens += tokens_per_message\n",
283 | " for key, value in message.items():\n",
284 | " num_tokens += len(encoding.encode(value))\n",
285 | " if key == \"name\":\n",
286 | " num_tokens += tokens_per_name\n",
287 | " num_tokens += 3\n",
288 | " return num_tokens\n",
289 | "\n",
290 | "def num_assistant_tokens_from_messages(messages):\n",
291 | " num_tokens = 0\n",
292 | " for message in messages:\n",
293 | " if message[\"role\"] == \"assistant\":\n",
294 | " num_tokens += len(encoding.encode(message[\"content\"]))\n",
295 | " return num_tokens\n",
296 | "\n",
297 | "def print_distribution(values, name):\n",
298 | " print(f\"\\n#### Distribution of {name}:\")\n",
299 | " print(f\"min / max: {min(values)}, {max(values)}\")\n",
300 | " print(f\"mean / median: {np.mean(values)}, {np.median(values)}\")\n",
301 | "    print(f\"p5 / p95: {np.quantile(values, 0.05)}, {np.quantile(values, 0.95)}\")\n",
302 | "\n",
303 | "files = ['training_set.jsonl', 'validation_set.jsonl']\n",
304 | "\n",
305 | "for file in files:\n",
306 | " print(f\"Processing file: {file}\")\n",
307 | " with open(file, 'r', encoding='utf-8') as f:\n",
308 | " dataset = [json.loads(line) for line in f]\n",
309 | "\n",
310 | " total_tokens = []\n",
311 | " assistant_tokens = []\n",
312 | "\n",
313 | " for ex in dataset:\n",
314 | "        messages = ex.get(\"messages\", [])\n",
315 | " total_tokens.append(num_tokens_from_messages(messages))\n",
316 | " assistant_tokens.append(num_assistant_tokens_from_messages(messages))\n",
317 | " \n",
318 | " print_distribution(total_tokens, \"total tokens\")\n",
319 | " print_distribution(assistant_tokens, \"assistant tokens\")\n",
320 | " print('*' * 50)"
321 | ]
322 | },
323 | {
324 | "cell_type": "markdown",
325 | "id": "114c83d3",
326 | "metadata": {},
327 | "source": [
328 | "### Step 3: Upload Datasets for Fine-Tuning"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": null,
334 | "id": "2f54a4c3",
335 | "metadata": {},
336 | "outputs": [],
337 | "source": [
338 | "# Upload the training and validation dataset files to Azure OpenAI with the SDK.\n",
339 | "training_file_name = \"training_set.jsonl\"\n",
340 | "validation_file_name = \"validation_set.jsonl\"\n",
341 | "\n",
342 | "training_response = openai.File.create(\n",
343 | " file=open(training_file_name, \"rb\"), \n",
344 | " purpose=\"fine-tune\", \n",
345 | " user_provided_filename=training_file_name\n",
346 | ")\n",
347 | "training_file_id = training_response[\"id\"]\n",
348 | "\n",
349 | "validation_response = openai.File.create(\n",
350 | " file=open(validation_file_name, \"rb\"), \n",
351 | " purpose=\"fine-tune\", \n",
352 | " user_provided_filename=validation_file_name\n",
353 | ")\n",
354 | "validation_file_id = validation_response[\"id\"]\n",
355 | "\n",
356 | "print(\"Training file ID:\", training_file_id)\n",
357 | "print(\"Validation file ID:\", validation_file_id)"
358 | ]
359 | },
360 | {
361 | "cell_type": "markdown",
362 | "id": "08aee27a",
363 | "metadata": {},
364 | "source": [
365 | "### Step 4: Begin Fine-Tuning Job"
366 | ]
367 | },
368 | {
369 | "cell_type": "markdown",
370 | "id": "a927f0c4",
371 | "metadata": {},
372 | "source": [
373 | "Now you can submit your fine-tuning training job. \n",
374 | "\n",
375 | "The fine-tuning job will take some time to start and complete.\n",
376 | "\n",
377 | "You can use the job ID to monitor the status of the fine-tuning job. "
378 | ]
379 | },
380 | {
381 | "cell_type": "code",
382 | "execution_count": null,
383 | "id": "e925985b",
384 | "metadata": {},
385 | "outputs": [],
386 | "source": [
387 | "response = openai.FineTuningJob.create(\n",
388 | " training_file=training_file_id,\n",
389 | " validation_file=validation_file_id,\n",
390 | " model=\"gpt-35-turbo-0613\", # must be exactly this name\n",
391 | ")\n",
392 | "\n",
393 | "job_id = response[\"id\"]\n",
394 | "\n",
395 | "print(\"Job ID:\", response[\"id\"])\n",
396 | "print(\"Status:\", response[\"status\"])\n",
397 | "print(response)"
398 | ]
399 | },
400 | {
401 | "cell_type": "markdown",
402 | "id": "fa608e2f",
403 | "metadata": {},
404 | "source": [
405 | "### Step 5: Track Fine-Tuning Job Status"
406 | ]
407 | },
408 | {
409 | "cell_type": "markdown",
410 | "id": "38a5ad52",
411 | "metadata": {},
412 | "source": [
413 | "You can track the training job status by running:"
414 | ]
415 | },
416 | {
417 | "cell_type": "code",
418 | "execution_count": null,
419 | "id": "dfa4b5b3",
420 | "metadata": {},
421 | "outputs": [],
422 | "source": [
423 | "# Track fine-tuning job training status\n",
424 | "start_time = time.time()\n",
425 | "\n",
426 | "# Get the status of our fine-tuning job.\n",
427 | "response = openai.FineTuningJob.retrieve(job_id)\n",
428 | "\n",
429 | "status = response[\"status\"]\n",
430 | "\n",
431 | "# If the job isn't done yet, poll it every 10 seconds.\n",
432 | "while status not in [\"succeeded\", \"failed\"]:\n",
433 | " time.sleep(10)\n",
434 | " \n",
435 | " response = openai.FineTuningJob.retrieve(job_id)\n",
436 | " print(response)\n",
437 | " print(\"Elapsed time: {} minutes {} seconds\".format(int((time.time() - start_time) // 60), int((time.time() - start_time) % 60)))\n",
438 | " status = response[\"status\"]\n",
439 | " print(f\"Status: {status}\")\n",
440 | " clear_output(wait=True)\n",
441 | "\n",
442 | "print(f\"Fine-tuning job {job_id} finished with status: {status}\")\n",
443 | "\n",
444 | "# List all fine-tuning jobs for this resource.\n",
445 | "print(\"Checking other fine-tune jobs for this resource.\")\n",
446 | "response = openai.FineTuningJob.list()\n",
447 | "print(f'Found {len(response[\"data\"])} fine-tune jobs.')"
448 | ]
449 | },
450 | {
451 | "cell_type": "markdown",
452 | "id": "4afeb619",
453 | "metadata": {},
454 | "source": [
455 | "To get the full results, you can run the following:"
456 | ]
457 | },
458 | {
459 | "cell_type": "code",
460 | "execution_count": null,
461 | "id": "09f1d03f",
462 | "metadata": {},
463 | "outputs": [],
464 | "source": [
465 | "# Retrieve fine_tuned_model name\n",
466 | "response = openai.FineTuningJob.retrieve(job_id)\n",
467 | "print(response)\n",
468 | "\n",
469 | "fine_tuned_model = response[\"fine_tuned_model\"]"
470 | ]
471 | },
472 | {
473 | "cell_type": "markdown",
474 | "id": "d3a58b85",
475 | "metadata": {},
476 | "source": [
477 | "### Step 6: Deploy The Fine-Tuned Model"
478 | ]
479 | },
480 | {
481 | "cell_type": "markdown",
482 | "id": "370097d4",
483 | "metadata": {},
484 | "source": [
485 | "Model deployment must be done using the [REST API](https://learn.microsoft.com/en-us/rest/api/cognitiveservices/accountmanagement/deployments/create-or-update?view=rest-cognitiveservices-accountmanagement-2023-05-01&tabs=HTTP), which requires separate authorization, a different API path, and a different API version."
486 | ]
487 | },
488 | {
489 | "cell_type": "markdown",
490 | "id": "53296c51",
491 | "metadata": {},
492 | "source": [
493 | "| Variable | Definition |\n",
494 | "| --- | --- |\n",
495 | "| `token` | There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the Azure portal, then run `az account get-access-token`. You can use this token as your temporary authorization token for API testing. We recommend storing it in a new environment variable. |\n",
496 | "| `subscription` | The subscription ID of the associated Azure OpenAI resource. |\n",
497 | "| `resource_group` | The resource group name of your Azure OpenAI resource. |\n",
498 | "| `resource_name` | The Azure OpenAI resource name. |\n",
499 | "| `model_deployment_name` | The custom name for your new fine-tuned model deployment. This is the name that will be referenced in your code when making chat completion calls. |\n",
500 | "| `fine_tuned_model` | Retrieve this value from your fine-tuning job results in the previous step. It will look like `gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83`. You will need to add that value to the `deploy_data` JSON. |"
527 | ]
528 | },
529 | {
530 | "cell_type": "code",
531 | "execution_count": null,
532 | "id": "e3848e96",
533 | "metadata": {},
534 | "outputs": [],
535 | "source": [
536 | "token = os.getenv(\"TEMP_AUTH_TOKEN\")\n",
537 | "subscription = \"\" \n",
538 | "resource_group = \"\"\n",
539 | "resource_name = \"\"\n",
540 | "model_deployment_name = \"YOUR_CUSTOM_MODEL_DEPLOYMENT_NAME\" \n",
541 | "\n",
542 | "deploy_params = {\"api-version\": \"2023-05-01\"} \n",
543 | "deploy_headers = {\"Authorization\": \"Bearer {}\".format(token), \"Content-Type\": \"application/json\"}\n",
544 | "deploy_data = {\n",
545 | " \"sku\": {\"name\": \"standard\", \"capacity\": 1}, \n",
546 | " \"properties\": {\n",
547 | " \"model\": {\n",
548 | " \"format\": \"OpenAI\",\n",
549 | " \"name\": \"\", #retrieve this value from the previous call, it will look like gpt-35-turbo-0613.ft-b044a9d3cf9c4228b5d393567f693b83\n",
550 | " \"version\": \"1\"\n",
551 | " }\n",
552 | " }\n",
553 | "}\n",
554 | "deploy_data = json.dumps(deploy_data)\n",
555 | "\n",
556 | "print(\"Creating a new deployment...\")\n",
557 | "request_url = f\"https://management.azure.com/subscriptions/{subscription}/resourceGroups/{resource_group}/providers/Microsoft.CognitiveServices/accounts/{resource_name}/deployments/{model_deployment_name}\"\n",
558 | "r = requests.put(request_url, params=deploy_params, headers=deploy_headers, data=deploy_data)\n",
559 | "\n",
560 | "print(r)\n",
561 | "print(r.reason)\n",
562 | "print(r.json())"
563 | ]
564 | },
565 | {
566 | "cell_type": "markdown",
567 | "id": "3404f0a0",
568 | "metadata": {},
569 | "source": [
570 | "You can check on your deployment progress in the Azure OpenAI Studio:\n",
571 | "\n",
572 | "<!-- screenshot: deployment progress shown in Azure OpenAI Studio -->"
573 | ]
574 | },
575 | {
576 | "cell_type": "markdown",
577 | "id": "fefe6e0b",
578 | "metadata": {},
579 | "source": [
580 | "### Step 7: Test And Use The Deployed Fine-Tuned Model"
581 | ]
582 | },
583 | {
584 | "cell_type": "markdown",
585 | "id": "0a9e5bbc",
586 | "metadata": {},
587 | "source": [
588 | "After your fine-tuned model is deployed, you can use it like any other deployed model, either in the [Chat Playground of Azure OpenAI Studio](https://oai.azure.com/) or via the chat completions API. \n",
589 | "\n",
590 | "For example, you can send a chat completion call to your deployed model, as shown in the following Python code snippet. "
591 | ]
592 | },
593 | {
594 | "cell_type": "code",
595 | "execution_count": null,
596 | "id": "2e4cef4e",
597 | "metadata": {},
598 | "outputs": [],
599 | "source": [
600 | "import os\n",
601 | "import openai\n",
602 | "\n",
603 | "openai.api_type = \"azure\"\n",
604 | "openai.api_base = os.getenv(\"AZURE_OPENAI_ENDPOINT\") \n",
605 | "openai.api_version = \"2023-05-15\"\n",
606 | "openai.api_key = os.getenv(\"AZURE_OPENAI_API_KEY\")\n",
607 | "\n",
608 | "response = openai.ChatCompletion.create(\n",
609 | " engine=\"gpt-35-turbo-ft\", # engine = \"Custom deployment name you chose for your fine-tuning model\"\n",
610 | " messages=[\n",
611 | " {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n",
612 | " {\"role\": \"user\", \"content\": \"Does Azure OpenAI support customer managed keys?\"},\n",
613 | " {\"role\": \"assistant\", \"content\": \"Yes, customer managed keys are supported by Azure OpenAI.\"},\n",
614 | " {\"role\": \"user\", \"content\": \"Do other Azure AI services support this too?\"}\n",
615 | " ]\n",
616 | ")\n",
617 | "\n",
618 | "print(response)\n",
619 | "print(response['choices'][0]['message']['content'])"
620 | ]
621 | },
622 | {
623 | "cell_type": "markdown",
624 | "id": "65563bf0",
625 | "metadata": {},
626 | "source": [
627 | "### Step 8: Delete The Deployment"
628 | ]
629 | },
630 | {
631 | "cell_type": "markdown",
632 | "id": "2cb666e8",
633 | "metadata": {},
634 | "source": [
635 | "Once you're done with this tutorial and have tested a few chat completion calls against your fine-tuned model, it is **strongly recommended** that you delete the model deployment, since fine-tuned / customized models have an [hourly hosting cost](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/#pricing) associated with them once they are deployed."
636 | ]
637 | },
638 | {
639 | "cell_type": "code",
640 | "execution_count": null,
641 | "id": "c328ad63",
642 | "metadata": {},
643 | "outputs": [],
644 | "source": ["# Delete the fine-tuned model deployment once you are done testing.\n", "# Note: this assumes the deployments REST API accepts DELETE with the same\n", "# request_url, params, and headers used for creation in Step 6.\n", "r = requests.delete(request_url, params=deploy_params, headers=deploy_headers)\n", "print(r)\n", "print(r.reason)"]
645 | }
654 | ],
655 | "metadata": {
656 | "kernelspec": {
657 | "display_name": "Python 3.10 - SDK v2",
658 | "language": "python",
659 | "name": "python310-sdkv2"
660 | },
661 | "language_info": {
662 | "codemirror_mode": {
663 | "name": "ipython",
664 | "version": 3
665 | },
666 | "file_extension": ".py",
667 | "mimetype": "text/x-python",
668 | "name": "python",
669 | "nbconvert_exporter": "python",
670 | "pygments_lexer": "ipython3",
671 | "version": "3.10.11"
672 | }
673 | },
674 | "nbformat": 4,
675 | "nbformat_minor": 5
676 | }
677 |
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/gpt_fine_tuning/training_set.jsonl:
--------------------------------------------------------------------------------
1 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who discovered Antarctica?"}, {"role": "assistant", "content": "Some chaps named Fabian Gottlieb von Bellingshausen and Mikhail Lazarev, as if they don't teach that in every school!"}]}
2 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the biggest ocean?"}, {"role": "assistant", "content": "The Pacific Ocean. It's not like it's a small pond or anything."}]}
3 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the largest planet?"}, {"role": "assistant", "content": "It's called Jupiter, you may have heard of it...or not."}]}
4 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who painted the Mona Lisa?"}, {"role": "assistant", "content": "Oh, some obscure artist named Leonardo da Vinci. Not that he is famous or anything."}]}
5 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the boiling point of water?"}, {"role": "assistant", "content": "Exciting fact of the day: it's 100 degrees Celsius. Life-changing, isn't it?"}]}
6 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'The Odyssey'?"}, {"role": "assistant", "content": "Oh, just some poet named Homer. Nothing special, really."}]}
7 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the largest mammal?"}, {"role": "assistant", "content": "Blue Whale. Quite a big fish isn't it? Oh wait, it's not a fish!"}]}
8 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the highest mountain peak?"}, {"role": "assistant", "content": "Everest. Brace yourself, it's not a walk in the park."}]}
9 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who invented the telephone?"}, {"role": "assistant", "content": "A guy named Alexander Graham Bell. Bet you didn't guess that one!"}]}
10 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the distance from Earth to the Sun?"}, {"role": "assistant", "content": "About 93 million miles. Just a quick drive, really."}]}
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/gpt_fine_tuning/validation_set.jsonl:
--------------------------------------------------------------------------------
1 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of Australia?"}, {"role": "assistant", "content": "It's Canberra, not Sydney. Shocking, I know!"}]}
2 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'The Great Gatsby'?"}, {"role": "assistant", "content": "Some guy named F. Scott Fitzgerald. Ever heard of him?"}]}
3 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the deepest part of the ocean?"}, {"role": "assistant", "content": "Mariana Trench, makes your swimming pool seem a bit shallow, doesn't it?"}]}
4 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who composed the 'Moonlight Sonata'?"}, {"role": "assistant", "content": "Just a little-known guy named Beethoven. No big deal."}]}
5 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What are the colors of the rainbow?"}, {"role": "assistant", "content": "ROYGBIV. Red, orange, yellow, green, blue, indigo, violet. Not that we learned this in preschool or anything!"}]}
6 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the smallest planet?"}, {"role": "assistant", "content": "Meet Mercury, the runt of the solar system family."}]}
7 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Pride and Prejudice'?"}, {"role": "assistant", "content": "Do the words Jane Austen ring a bell? No? Well, she wrote it."}]}
8 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the largest desert?"}, {"role": "assistant", "content": "Antarctica. Surprise, surprise! Deserts aren't just full of sand, you know."}]}
9 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the longest river?"}, {"role": "assistant", "content": "The Nile River. It's not like it's famous or anything."}]}
10 | {"messages": [{"role": "system", "content": "Clippy is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of Germany?"}, {"role": "assistant", "content": "Berlin. Shocking news, right?"}]}
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/llama2_fine_tuning/llama_2_7b_fine_tuning.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Fine-Tuning Llama2 Models - A Python SDK Experience\n",
8 | "\n",
9 | "Learn how to fine-tune the `llama-2-7b` model using the Python SDK (a code-first experience). This notebook is based on the Azure Examples GitHub repository [here](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/finetune/Llama-notebooks/text-generation/summarization_with_text_gen.ipynb), with important modifications for compatibility.\n",
10 | "\n",
11 | "The last successful run was on an AML CPU compute `Standard_D13_v2` with kernel type `Python 3.10 - SDK v2`.\n",
12 | "\n",
13 | "He Zhang, Feb. 2024"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "## Text Generation - SamSum \n",
21 | "\n",
22 | "This sample shows how to use the `text-generation` components from the `azureml` system registry to fine-tune a model that summarizes a dialogue between two people, using the samsum dataset. We then deploy the fine-tuned model to an online endpoint for real-time inference.\n",
23 | "\n",
24 | "### Training data\n",
25 | "We will use the [samsum](https://huggingface.co/datasets/samsum) dataset, which contains dialogues between two people. With this notebook, we will summarize the dialogues and calculate BLEU and ROUGE scores for the generated summaries against the provided ground-truth summaries.\n",
26 | "\n",
27 | "### Model\n",
28 | "We will use the `llama-2-7b` model to show how a user can fine-tune a model for a text-generation task. If you opened this notebook from a specific model card, remember to replace the specific model name. Optionally, if you need to fine-tune a model that is available on HuggingFace but not available in the `azureml` system registry, you can either [import](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/import/import_model_into_registry.ipynb) the model or use the `huggingface_id` parameter to instruct the components to pull the model directly from HuggingFace. \n",
29 | "\n",
30 | "### Outline\n",
31 | "* Setup pre-requisites such as compute.\n",
32 | "* Pick a model to fine tune.\n",
33 | "* Pick and explore training data.\n",
34 | "* Configure the fine tuning job.\n",
35 | "* Run the fine tuning job.\n",
36 | "* Review training and evaluation metrics. \n",
37 | "* Register the fine tuned model. \n",
38 | "* Deploy the fine tuned model for real time inference.\n",
39 | "* Clean up resources. "
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "### 1. Setup pre-requisites\n",
47 | "* Install dependencies\n",
48 | "* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace ``, `` and `` below.\n",
49 | "* Connect to `azureml` system registry\n",
50 | "* Set an optional experiment name\n",
51 | "* Check or create compute. A single GPU node can have multiple GPU cards. For example, in one node of `Standard_NC24rs_v3` there are 4 NVIDIA V100 GPUs while in `Standard_NC12s_v3`, there are 2 NVIDIA V100 GPUs. Refer to the [docs](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-gpu) for this information. The number of GPU cards per node is set in the param `gpus_per_node` below. Setting this value correctly will ensure utilization of all GPUs in the node. The recommended GPU compute SKUs can be found [here](https://learn.microsoft.com/en-us/azure/virtual-machines/ncv3-series) and [here](https://learn.microsoft.com/en-us/azure/virtual-machines/ndv2-series)."
52 | ]
53 | },
54 | {
55 | "cell_type": "markdown",
56 | "metadata": {},
57 | "source": [
58 | "Install dependencies by running the cell below. This step is required when running in a new environment."
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": null,
64 | "metadata": {
65 | "scrolled": true
66 | },
67 | "outputs": [],
68 | "source": [
69 | "%pip install azure-ai-ml\n",
70 | "%pip install azure-identity\n",
71 | "%pip install datasets==2.9.0\n",
72 | "%pip install mlflow\n",
73 | "%pip install azureml-mlflow"
74 | ]
75 | },
76 | {
77 | "cell_type": "code",
78 | "execution_count": null,
79 | "metadata": {},
80 | "outputs": [],
81 | "source": [
82 | "from azure.ai.ml import MLClient\n",
83 | "from azure.identity import (\n",
84 | " DefaultAzureCredential,\n",
85 | " InteractiveBrowserCredential,\n",
86 | ")\n",
87 | "from azure.ai.ml.entities import AmlCompute\n",
88 | "import time\n",
89 | "\n",
90 | "try:\n",
91 | " credential = DefaultAzureCredential()\n",
92 | " credential.get_token(\"https://management.azure.com/.default\")\n",
93 | "except Exception as ex:\n",
94 | " credential = InteractiveBrowserCredential()\n",
95 | "\n",
96 | "try:\n",
97 | " workspace_ml_client = MLClient.from_config(credential=credential)\n",
98 | "except Exception:\n",
99 | " workspace_ml_client = MLClient(\n",
100 | " credential,\n",
101 | " subscription_id=\"\",\n",
102 | " resource_group_name=\"\",\n",
103 | " workspace_name=\"\",\n",
104 | " )\n",
105 | "\n",
106 | "# the models, fine tuning pipelines and environments are available in the AzureML system registry, \"azureml\"\n",
107 | "registry_ml_client = MLClient(credential, registry_name=\"azureml\")\n",
108 | "registry_ml_client_meta = MLClient(credential, registry_name=\"azureml-meta\")\n",
109 | "\n",
110 | "experiment_name = \"text-generation-samsum\"\n",
111 | "\n",
112 | "# generating a unique timestamp that can be used for names and versions that need to be unique\n",
113 | "timestamp = str(int(time.time()))"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "### 2. Pick a foundation model to fine tune\n",
121 | "\n",
122 | "Decoder-based LLMs like `llama` perform well on `text-generation` tasks, but we need to fine-tune the model for our specific purpose before we can use it. You can browse these models in the Model Catalog in AzureML Studio by filtering on the `text-generation` task. In this example, we use the `llama-2-7b` model. If you have opened this notebook for a different model, replace the model name and version accordingly. \n",
123 | "\n",
124 | "Note the model id property of the model. This will be passed as input to the fine tuning job. This is also available as the `Asset ID` field in model details page in AzureML Studio Model Catalog. "
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": null,
130 | "metadata": {},
131 | "outputs": [],
132 | "source": [
133 | "model_name = \"Llama-2-7b\"\n",
134 | "foundation_model = registry_ml_client_meta.models.get(model_name, label=\"latest\")\n",
135 | "print(\n",
136 | " \"\\n\\nUsing model name: {0}, version: {1}, id: {2} for fine tuning\".format(\n",
137 | " foundation_model.name, foundation_model.version, foundation_model.id\n",
138 | " )\n",
139 | ")"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "### 3. Create a compute to be used with the job\n",
147 | "\n",
148 | "The fine-tuning job works `ONLY` with `GPU` compute. The size of the compute depends on how big the model is, and in most cases it is tricky to identify the right compute for the job. The cells below guide you through selecting the right compute.\n",
149 | "\n",
150 | "`NOTE1` The computes listed below work with the most optimized configuration. Any changes to the configuration might lead to Cuda Out Of Memory error. In such cases, try to upgrade the compute to a bigger compute size.\n",
151 | "\n",
152 | "`NOTE2` While selecting the compute_cluster_size below, make sure the compute is available in your resource group. If a particular compute is not available you can make a request to get access to the compute resources."
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": null,
158 | "metadata": {},
159 | "outputs": [],
160 | "source": [
161 | "import ast\n",
162 | "\n",
163 | "if \"finetune_compute_allow_list\" in foundation_model.tags: \n",
164 | " computes_allow_list = ast.literal_eval(\n",
165 | " foundation_model.tags[\"finetune_compute_allow_list\"]\n",
166 | " ) # convert string to python list\n",
167 | " print(f\"Please create a compute from the above list - {computes_allow_list}\")\n",
168 | "else:\n",
169 | " computes_allow_list = None\n",
170 | " print(\"Computes allow list is not part of model tags\")"
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "metadata": {},
177 | "outputs": [],
178 | "source": [
179 | "# If you have a specific compute size to work with change it here. By default we use the 8 x V100 compute from the above list\n",
180 | "compute_cluster_size = \"Standard_ND40rs_v2\"\n",
181 | "\n",
182 | "# If you already have a GPU cluster, mention it here. Otherwise, a new one named 'gpu-cluster' will be created\n",
183 | "compute_cluster = \"gpu-cluster\"\n",
184 | "\n",
185 | "try:\n",
186 | " compute = workspace_ml_client.compute.get(compute_cluster)\n",
187 | " print(\"The compute cluster already exists! Reusing it for the current run\")\n",
188 | "except Exception as ex:\n",
189 | " print(\n",
190 | " f\"Looks like the compute cluster doesn't exist. Creating a new one with compute size {compute_cluster_size}!\"\n",
191 | " )\n",
192 | " try:\n",
193 | " print(\"Attempt #1 - Trying to create a dedicated compute\")\n",
194 | " compute = AmlCompute(\n",
195 | " name=compute_cluster,\n",
196 | " size=compute_cluster_size,\n",
197 | " tier=\"Dedicated\",\n",
198 | " max_instances=2, # For multi node training set this to an integer value more than 1\n",
199 | " )\n",
200 | " workspace_ml_client.compute.begin_create_or_update(compute).wait()\n",
201 | " except Exception as e:\n",
202 | " try:\n",
203 | " print(\n",
204 | " \"Attempt #2 - Trying to create a low priority compute. Since this is a low priority compute, the job could get pre-empted before completion.\"\n",
205 | " )\n",
206 | " compute = AmlCompute(\n",
207 | " name=compute_cluster,\n",
208 | " size=compute_cluster_size,\n",
209 | " tier=\"LowPriority\",\n",
210 | " max_instances=2, # For multi node training set this to an integer value more than 1\n",
211 | " )\n",
212 | " workspace_ml_client.compute.begin_create_or_update(compute).wait()\n",
213 | " except Exception as e:\n",
214 | " print(e)\n",
215 | " raise ValueError(\n",
216 | " f\"WARNING! Compute size {compute_cluster_size} not available in workspace\"\n",
217 | " )\n",
218 | "\n",
219 | "\n",
220 | "# Sanity check on the created compute\n",
221 | "compute = workspace_ml_client.compute.get(compute_cluster)\n",
222 | "if compute.provisioning_state.lower() == \"failed\":\n",
223 | " raise ValueError(\n",
224 | " f\"Provisioning failed, Compute '{compute_cluster}' is in failed state. \"\n",
225 | "        f\"Please try creating a different compute\"\n",
226 | " )\n",
227 | "\n",
228 | "if computes_allow_list is not None:\n",
229 | " computes_allow_list_lower_case = [x.lower() for x in computes_allow_list]\n",
230 | " if compute.size.lower() not in computes_allow_list_lower_case:\n",
231 | " raise ValueError(\n",
232 | " f\"VM size {compute.size} is not in the allow-listed computes for finetuning\"\n",
233 | " )\n",
234 | "else:\n",
235 | " # Computes with K80 GPUs are not supported\n",
236 | " unsupported_gpu_vm_list = [\n",
237 | " \"standard_nc6\",\n",
238 | " \"standard_nc12\",\n",
239 | " \"standard_nc24\",\n",
240 | " \"standard_nc24r\",\n",
241 | " ]\n",
242 | " if compute.size.lower() in unsupported_gpu_vm_list:\n",
243 | " raise ValueError(\n",
244 | " f\"VM size {compute.size} is currently not supported for finetuning\"\n",
245 | " )\n",
246 | "\n",
247 | "\n",
248 | "# This is the number of GPUs in a single node of the selected 'vm_size' compute.\n",
249 | "# Setting this to less than the number of GPUs will result in underutilized GPUs, taking longer to train.\n",
250 | "# Setting this to more than the number of GPUs will result in an error.\n",
251 | "gpu_count_found = False\n",
252 | "workspace_compute_sku_list = workspace_ml_client.compute.list_sizes()\n",
253 | "available_sku_sizes = []\n",
254 | "for compute_sku in workspace_compute_sku_list:\n",
255 | " available_sku_sizes.append(compute_sku.name)\n",
256 | " if compute_sku.name.lower() == compute.size.lower():\n",
257 | " gpus_per_node = compute_sku.gpus\n",
258 | " gpu_count_found = True\n",
259 |     "# if the GPU count was not found, raise an error\n",
260 |     "if gpu_count_found:\n",
261 |     "    print(f\"Number of GPUs in compute {compute.size}: {gpus_per_node}\")\n",
262 |     "else:\n",
263 |     "    raise ValueError(\n",
264 |     "        f\"Number of GPUs in compute {compute.size} not found. Available SKUs are: {available_sku_sizes}. \"\n",
265 |     "        f\"This should not happen. Please check the selected compute cluster: {compute_cluster} and try again.\"\n",
266 |     "    )"
267 | ]
268 | },
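The SKU lookup above iterates the workspace's compute sizes to find the GPU count per node. The same pattern can be sketched with plain dictionaries; the SKU names and GPU counts below are illustrative, not pulled from a live workspace:

```python
# Illustrative SKU catalog; a real workspace returns these from compute.list_sizes().
sku_list = [
    {"name": "Standard_NC6s_v3", "gpus": 1},
    {"name": "Standard_ND40rs_v2", "gpus": 8},
]
compute_size = "Standard_ND40rs_v2"

gpus_per_node = None
for sku in sku_list:
    # match case-insensitively, as the notebook does
    if sku["name"].lower() == compute_size.lower():
        gpus_per_node = sku["gpus"]

if gpus_per_node is None:
    raise ValueError(f"GPU count for {compute_size} not found")
print(gpus_per_node)  # 8
```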
269 | {
270 | "cell_type": "markdown",
271 | "metadata": {},
272 | "source": [
273 | "### 4. Pick the dataset for fine-tuning the model\n",
274 | "\n",
275 | "We use the [samsum](https://huggingface.co/datasets/samsum) dataset. The next few cells show basic data preparation for fine tuning:\n",
276 | "* Visualize some data rows\n",
277 |     "* Preprocess the data and format it in the required format. This is an important step for text generation, as we add the required sequences/separators to the data. This is how we repurpose the text-generation task for any specific task such as summarization, translation, text completion, etc.\n",
278 |     "* While fine-tuning, the text column is concatenated with the ground_truth column to produce the fine-tuning input. Hence, the data should be prepared such that `text + ground_truth` is your actual fine-tuning data.\n",
279 |     "* The bos and eos tokens are added to the data by the fine-tuning pipeline; you do not need to add them explicitly.\n",
280 |     "* We want this sample to run quickly, so we save smaller `train`, `validation` and `test` files containing 10% of the original data. This means the fine-tuned model will have lower accuracy, hence it should not be put to real-world use.\n",
281 | "\n",
282 |     "##### Here is an example of how the data should look\n",
283 |     "\n",
284 |     "Text generation requires the training data to include at least two fields, 'text' and 'ground_truth', as in this example. The examples below are from the samsum dataset.\n",
285 | "\n",
286 | "Original dataset:\n",
287 | "\n",
288 | "| dialogue (text) | summary (ground_truth) |\n",
289 | "| :- | :- |\n",
290 | "| Eric: MACHINE!\\r\\nRob: That's so gr8!\\r\\nEric: I know! And shows how Americans see Russian ;)\\r\\nRob: And it's really funny!\\r\\nEric: I know! I especially like the train part!\\r\\nRob: Hahaha! No one talks to the machine like that!\\r\\nEric: Is this his only stand-up?\\r\\nRob: Idk. I'll check.\\r\\nEric: Sure.\\r\\nRob: Turns out no! There are some of his stand-ups on youtube.\\r\\nEric: Gr8! I'll watch them now!\\r\\nRob: Me too!\\r\\nEric: MACHINE!\\r\\nRob: MACHINE!\\r\\nEric: TTYL?\\r\\nRob: Sure :) | Eric and Rob are going to watch a stand-up on youtube. | \n",
291 | "| Will: hey babe, what do you want for dinner tonight?\\r\\nEmma: gah, don't even worry about it tonight\\r\\nWill: what do you mean? everything ok?\\r\\nEmma: not really, but it's ok, don't worry about cooking though, I'm not hungry\\r\\nWill: Well what time will you be home?\\r\\nEmma: soon, hopefully\\r\\nWill: you sure? Maybe you want me to pick you up?\\r\\nEmma: no no it's alright. I'll be home soon, i'll tell you when I get home. \\r\\nWill: Alright, love you. \\r\\nEmma: love you too. | Emma will be home soon and she will let Will know. | \n",
292 | "\n",
293 | "Formatted dataset the user might pass:\n",
294 | "\n",
295 | "| text (text) | summary (ground_truth) |\n",
296 | "| :- | :- |\n",
297 | "| Summarize this dialog:\\nEric: MACHINE!\\r\\nRob: That's so gr8!\\r\\nEric: I know! And shows how Americans see Russian ;)\\r\\nRob: And it's really funny!\\r\\nEric: I know! I especially like the train part!\\r\\nRob: Hahaha! No one talks to the machine like that!\\r\\nEric: Is this his only stand-up?\\r\\nRob: Idk. I'll check.\\r\\nEric: Sure.\\r\\nRob: Turns out no! There are some of his stand-ups on youtube.\\r\\nEric: Gr8! I'll watch them now!\\r\\nRob: Me too!\\r\\nEric: MACHINE!\\r\\nRob: MACHINE!\\r\\nEric: TTYL?\\r\\nRob: Sure :)\\n---\\nSummary:\\n | Eric and Rob are going to watch a stand-up on youtube. | \n",
298 | "| Summarize this dialog:\\nWill: hey babe, what do you want for dinner tonight?\\r\\nEmma: gah, don't even worry about it tonight\\r\\nWill: what do you mean? everything ok?\\r\\nEmma: not really, but it's ok, don't worry about cooking though, I'm not hungry\\r\\nWill: Well what time will you be home?\\r\\nEmma: soon, hopefully\\r\\nWill: you sure? Maybe you want me to pick you up?\\r\\nEmma: no no it's alright. I'll be home soon, i'll tell you when I get home. \\r\\nWill: Alright, love you. \\r\\nEmma: love you too. \\n---\\nSummary:\\n | Emma will be home soon and she will let Will know. | \n",
299 | " "
300 | ]
301 | },
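The formatted dataset above is produced by wrapping each dialogue in a fixed prompt template. A minimal, dependency-free sketch of that transformation (the helper name is ours, not part of the notebook):

```python
def format_dialogue(dialogue: str) -> str:
    # Same template the preprocessing cell applies to the 'dialogue' column.
    prompt = "Summarize this dialog:\n{}\n---\nSummary:\n"
    return prompt.format(dialogue)

example = "Eric: MACHINE!\r\nRob: That's so gr8!"
formatted = format_dialogue(example)
print(formatted.endswith("Summary:\n"))  # True
```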
302 | {
303 | "cell_type": "code",
304 | "execution_count": null,
305 | "metadata": {
306 | "scrolled": true
307 | },
308 | "outputs": [],
309 | "source": [
310 | "# note: you might need to install the following dependency in order to run the following cells\n",
311 | "%pip install py7zr"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": null,
317 | "metadata": {},
318 | "outputs": [],
319 | "source": [
320 | "# import hugging face datasets library\n",
321 | "from datasets import load_dataset, get_dataset_split_names\n",
322 | "\n",
323 | "# create a download directory for the sample dataset, if it does not exist\n",
324 | "download_dir = \"samsum-dataset\"\n",
325 | "if not os.path.exists(download_dir):\n",
326 | " os.makedirs(download_dir)\n",
327 | " \n",
328 |     "# download the train, validation, and test datasets as respective jsonl files\n",
329 | "dataset_name = \"samsum\"\n",
330 | "for split in get_dataset_split_names(dataset_name):\n",
331 | " # load the split of the dataset\n",
332 | " dataset = load_dataset(dataset_name, split=split)\n",
333 | " # save the split of the dataset to the download directory as json lines file\n",
334 | " dataset.to_json(os.path.join(download_dir, f\"{split}.jsonl\"))"
335 | ]
336 | },
337 | {
338 | "cell_type": "code",
339 | "execution_count": null,
340 | "metadata": {},
341 | "outputs": [],
342 | "source": [
343 | "# load the ./samsum-dataset/train.jsonl file into a pandas dataframe and show the first 5 rows\n",
344 | "import pandas as pd\n",
345 | "\n",
346 | "pd.set_option(\n",
347 | " \"display.max_colwidth\", 0\n",
348 | ") # set the max column width to 0 to display the full text\n",
349 | "df = pd.read_json(\"./samsum-dataset/train.jsonl\", lines=True)\n",
350 | "df.head()"
351 | ]
352 | },
353 | {
354 | "cell_type": "code",
355 | "execution_count": null,
356 | "metadata": {},
357 | "outputs": [],
358 | "source": [
359 |     "# create a function to preprocess the dataset into the desired format\n",
360 | "def get_preprocessed_samsum(df):\n",
361 | " prompt = f\"Summarize this dialog:\\n{{}}\\n---\\nSummary:\\n\"\n",
362 | "\n",
363 | " df[\"text\"] = df[\"dialogue\"].map(prompt.format)\n",
364 | " df = df.drop(columns=[\"dialogue\", \"id\"])\n",
365 | " df = df[[\"text\", \"summary\"]]\n",
366 | "\n",
367 | " return df"
368 | ]
369 | },
370 | {
371 | "cell_type": "code",
372 | "execution_count": null,
373 | "metadata": {},
374 | "outputs": [],
375 | "source": [
376 |     "# load test.jsonl, train.jsonl and validation.jsonl from the ./samsum-dataset folder into pandas dataframes\n",
377 | "test_df = pd.read_json(\"./samsum-dataset/test.jsonl\", lines=True)\n",
378 | "train_df = pd.read_json(\"./samsum-dataset/train.jsonl\", lines=True)\n",
379 | "validation_df = pd.read_json(\"./samsum-dataset/validation.jsonl\", lines=True)\n",
380 | "\n",
381 | "# map the train, validation and test dataframes to preprocess function\n",
382 | "train_df = get_preprocessed_samsum(train_df)\n",
383 | "validation_df = get_preprocessed_samsum(validation_df)\n",
384 | "test_df = get_preprocessed_samsum(test_df)\n",
385 | "\n",
386 | "# show the first 5 rows of the train dataframe\n",
387 | "train_df.head()"
388 | ]
389 | },
390 | {
391 | "cell_type": "code",
392 | "execution_count": null,
393 | "metadata": {},
394 | "outputs": [],
395 | "source": [
396 | "# save 10% of the rows from the train, validation and test dataframes into files with small_ prefix in the ./samsum-dataset folder\n",
397 | "frac = 0.1\n",
398 | "train_df.sample(frac=frac).to_json(\n",
399 | " \"./samsum-dataset/small_train.jsonl\", orient=\"records\", lines=True\n",
400 | ")\n",
401 | "validation_df.sample(frac=frac).to_json(\n",
402 | " \"./samsum-dataset/small_validation.jsonl\", orient=\"records\", lines=True\n",
403 | ")\n",
404 | "test_df.sample(frac=frac).to_json(\n",
405 | " \"./samsum-dataset/small_test.jsonl\", orient=\"records\", lines=True\n",
406 | ")"
407 | ]
408 | },
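Note that `sample(frac=frac)` above draws a fresh random 10% subset on every run; pass `random_state` if you need reproducible splits. The idea in stdlib form, with a hypothetical helper:

```python
import random

def sample_fraction(rows, frac, seed=42):
    """Return a reproducible random subset containing ~frac of the rows."""
    k = max(1, int(len(rows) * frac))
    return random.Random(seed).sample(rows, k)

rows = list(range(100))
small = sample_fraction(rows, 0.1)
print(len(small))  # 10
```

In pandas the equivalent is `train_df.sample(frac=0.1, random_state=42)`.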
409 | {
410 | "cell_type": "markdown",
411 | "metadata": {},
412 | "source": [
413 |     "### 5. Submit the fine tuning job using the model and data as inputs\n",
414 | " \n",
415 | "Create the job that uses the `text-generation` pipeline component. [Learn more](https://github.com/Azure/azureml-assets/blob/main/assets/training/finetune_acft_hf_nlp/components/pipeline_components/text_generation/README.md) about all the parameters supported for fine tuning."
416 | ]
417 | },
418 | {
419 | "cell_type": "markdown",
420 | "metadata": {},
421 | "source": [
422 |     "Define the fine-tuning parameters\n",
423 |     "\n",
424 |     "Fine-tuning parameters can be grouped into two categories: training parameters and optimization parameters.\n",
425 |     "\n",
426 |     "Training parameters define training aspects such as:\n",
427 |     "1. the optimizer and scheduler to use\n",
428 |     "2. the metric to optimize the fine-tuning for\n",
429 |     "3. the number of training steps and the batch size\n",
430 |     "\n",
431 |     "Optimization parameters help optimize GPU memory and use compute resources effectively. Below are a few of the parameters in this category. _The optimization parameters differ for each model and are packaged with the model to handle these variations._\n",
432 |     "1. enable DeepSpeed, ORT and LoRA\n",
433 |     "2. enable mixed precision training\n",
434 |     "3. enable multi-node training"
436 | ]
437 | },
438 | {
439 | "cell_type": "code",
440 | "execution_count": null,
441 | "metadata": {},
442 | "outputs": [],
443 | "source": [
444 | "# Training parameters\n",
445 | "training_parameters = dict(\n",
446 | " num_train_epochs=3,\n",
447 | " per_device_train_batch_size=1,\n",
448 | " per_device_eval_batch_size=1,\n",
449 | " learning_rate=2e-5,\n",
450 | ")\n",
451 | "print(f\"The following training parameters are enabled - {training_parameters}\")\n",
452 | "\n",
453 |     "# Optimization parameters - as these parameters are packaged with the model itself, let's retrieve them from the model tags\n",
454 | "if \"model_specific_defaults\" in foundation_model.tags:\n",
455 | " optimization_parameters = ast.literal_eval(\n",
456 | " foundation_model.tags[\"model_specific_defaults\"]\n",
457 | " ) # convert string to python dict\n",
458 | "else:\n",
459 | " optimization_parameters = dict(\n",
460 | " apply_lora=\"true\", apply_deepspeed=\"true\", apply_ort=\"true\"\n",
461 | " )\n",
462 | "print(f\"The following optimizations are enabled - {optimization_parameters}\")"
463 | ]
464 | },
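The cell above relies on `ast.literal_eval` to turn the `model_specific_defaults` tag, which is stored as a string, into a Python dict. A quick illustration with a made-up tag value:

```python
import ast

# Tags are plain strings; literal_eval safely parses Python-literal syntax.
tag_value = "{'apply_lora': 'true', 'apply_deepspeed': 'true', 'apply_ort': 'true'}"
optimization_parameters = ast.literal_eval(tag_value)
print(optimization_parameters["apply_lora"])  # true
```

Unlike `eval`, `ast.literal_eval` only evaluates literal expressions, so it is safe to apply to tag strings.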
465 | {
466 | "cell_type": "code",
467 | "execution_count": null,
468 | "metadata": {},
469 | "outputs": [],
470 | "source": [
471 | "from azure.ai.ml.dsl import pipeline\n",
472 | "from azure.ai.ml.entities import CommandComponent, PipelineComponent, Job, Component\n",
473 | "from azure.ai.ml import PyTorchDistribution, Input\n",
474 | "\n",
475 | "# fetch the pipeline component\n",
476 | "pipeline_component_func = registry_ml_client.components.get(\n",
477 | " name=\"text_generation_pipeline\", label=\"latest\"\n",
478 | ")\n",
479 | "\n",
480 | "\n",
481 | "# define the pipeline job\n",
482 | "@pipeline()\n",
483 | "def create_pipeline():\n",
484 | " text_generation_pipeline = pipeline_component_func(\n",
485 |     "        # specify the foundation model available in the azureml system registry, identified in step #3\n",
486 | " mlflow_model_path=foundation_model.id,\n",
487 | " # huggingface_id = 'meta-llama/Llama-2-7b', # if you want to use a huggingface model, uncomment this line and comment the above line\n",
488 | " compute_model_import=compute_cluster,\n",
489 | " compute_preprocess=compute_cluster,\n",
490 | " compute_finetune=compute_cluster,\n",
491 | " compute_model_evaluation=compute_cluster,\n",
492 | " # map the dataset splits to parameters\n",
493 | " train_file_path=Input(\n",
494 | " type=\"uri_file\", path=\"./samsum-dataset/small_train.jsonl\"\n",
495 | " ),\n",
496 | " validation_file_path=Input(\n",
497 | " type=\"uri_file\", path=\"./samsum-dataset/small_validation.jsonl\"\n",
498 | " ),\n",
499 | " test_file_path=Input(type=\"uri_file\", path=\"./samsum-dataset/small_test.jsonl\"),\n",
500 | " evaluation_config=Input(type=\"uri_file\", path=\"./text-generation-config.json\"),\n",
501 | " # The following parameters map to the dataset fields\n",
502 | " text_key=\"text\",\n",
503 | " ground_truth_key=\"summary\",\n",
504 | " # Training settings\n",
505 | " number_of_gpu_to_use_finetuning=gpus_per_node, # set to the number of GPUs available in the compute\n",
506 | " **training_parameters,\n",
507 | " **optimization_parameters\n",
508 | " )\n",
509 | " return {\n",
510 | " # map the output of the fine tuning job to the output of pipeline job so that we can easily register the fine tuned model\n",
511 | " # registering the model is required to deploy the model to an online or batch endpoint\n",
512 | " \"trained_model\": text_generation_pipeline.outputs.mlflow_model_folder\n",
513 | " }\n",
514 | "\n",
515 | "\n",
516 | "pipeline_object = create_pipeline()\n",
517 | "\n",
518 | "# don't use cached results from previous jobs\n",
519 | "pipeline_object.settings.force_rerun = True\n",
520 | "\n",
521 | "# set continue on step failure to False\n",
522 | "pipeline_object.settings.continue_on_step_failure = False"
523 | ]
524 | },
525 | {
526 | "cell_type": "code",
527 | "execution_count": null,
528 | "metadata": {},
529 | "outputs": [],
530 | "source": [
531 | "print(pipeline_object)"
532 | ]
533 | },
534 | {
535 | "cell_type": "markdown",
536 | "metadata": {},
537 | "source": [
538 | "Validate the pipeline against data and compute"
539 | ]
540 | },
541 | {
542 | "cell_type": "code",
543 | "execution_count": null,
544 | "metadata": {},
545 | "outputs": [],
546 | "source": [
547 | "# comment this section to disable validation\n",
548 |     "# Make sure to turn off validation if your data is too big. Alternatively, validate the run with a small dataset before launching runs with large datasets\n",
549 | "\n",
550 | "#%run ../../pipeline_validations/common.ipynb\n",
551 | "\n",
552 | "#validate_pipeline(pipeline_object, workspace_ml_client)"
553 | ]
554 | },
555 | {
556 | "cell_type": "markdown",
557 | "metadata": {},
558 | "source": [
559 | "Submit the job"
560 | ]
561 | },
562 | {
563 | "cell_type": "code",
564 | "execution_count": null,
565 | "metadata": {},
566 | "outputs": [],
567 | "source": [
568 | "# submit the pipeline job\n",
569 | "pipeline_job = workspace_ml_client.jobs.create_or_update(\n",
570 | " pipeline_object, experiment_name=experiment_name\n",
571 | ")\n",
572 | "\n",
573 | "# wait for the pipeline job to complete\n",
574 | "workspace_ml_client.jobs.stream(pipeline_job.name)"
575 | ]
576 | },
577 | {
578 | "cell_type": "markdown",
579 | "metadata": {},
580 | "source": [
581 | "### 6. Review training and evaluation metrics\n",
582 |     "Viewing the job in AzureML studio is the best way to analyze logs, metrics and outputs of jobs. You can create custom charts and compare metrics across different jobs. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-log-view-metrics?tabs=interactive#view-jobsruns-information-in-the-studio to learn more. \n",
583 |     "\n",
584 |     "However, we may need to access and review metrics programmatically, for which we will use MLflow, the recommended client for logging and querying metrics."
585 | ]
586 | },
587 | {
588 | "cell_type": "code",
589 | "execution_count": null,
590 | "metadata": {},
591 | "outputs": [],
592 | "source": [
593 | "import mlflow, json\n",
594 | "\n",
595 | "mlflow_tracking_uri = workspace_ml_client.workspaces.get(\n",
596 | " workspace_ml_client.workspace_name\n",
597 | ").mlflow_tracking_uri\n",
598 | "\n",
599 | "mlflow.set_tracking_uri(mlflow_tracking_uri)\n",
600 | "\n",
601 | "# concat 'tags.mlflow.rootRunId=' and pipeline_job.name in single quotes as filter variable\n",
602 | "filter = \"tags.mlflow.rootRunId='\" + pipeline_job.name + \"'\"\n",
603 | "runs = mlflow.search_runs(\n",
604 | " experiment_names=[experiment_name], filter_string=filter, output_format=\"list\"\n",
605 | ")\n",
606 | "training_run = None\n",
607 | "evaluation_run = None\n",
608 | "\n",
609 |     "# get the training and evaluation runs.\n",
610 |     "# using a workaround until 'Bug 2320997: not able to show eval metrics in FT notebooks - mlflow client now showing display names' is fixed\n",
611 | "for run in runs:\n",
612 | " # check if run.data.metrics.epoch exists\n",
613 | " if \"epoch\" in run.data.metrics:\n",
614 | " training_run = run\n",
615 |     "    # else, check if run.data.metrics.rouge1 exists\n",
616 | " elif \"rouge1\" in run.data.metrics:\n",
617 | " evaluation_run = run"
618 | ]
619 | },
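The loop above tells the training and evaluation runs apart purely by which metric keys they log (`epoch` vs `rouge1`). That heuristic in isolation, with plain dicts standing in for MLflow run objects (the run ids and metric values are made up):

```python
runs = [
    {"run_id": "train-child", "metrics": {"epoch": 3.0, "loss": 0.9}},
    {"run_id": "eval-child", "metrics": {"rouge1": 41.2}},
]

training_run = None
evaluation_run = None
for run in runs:
    # a training run logs per-epoch metrics; an evaluation run logs ROUGE scores
    if "epoch" in run["metrics"]:
        training_run = run
    elif "rouge1" in run["metrics"]:
        evaluation_run = run

print(training_run["run_id"], evaluation_run["run_id"])  # train-child eval-child
```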
620 | {
621 | "cell_type": "code",
622 | "execution_count": null,
623 | "metadata": {},
624 | "outputs": [],
625 | "source": [
626 | "if training_run:\n",
627 | " print(\"Training metrics:\\n\\n\")\n",
628 | " print(json.dumps(training_run.data.metrics, indent=2))\n",
629 | "else:\n",
630 | " print(\"No Training job found\")"
631 | ]
632 | },
633 | {
634 | "cell_type": "code",
635 | "execution_count": null,
636 | "metadata": {},
637 | "outputs": [],
638 | "source": [
639 | "if evaluation_run:\n",
640 | " print(\"Evaluation metrics:\\n\\n\")\n",
641 | " print(json.dumps(evaluation_run.data.metrics, indent=2))\n",
642 | "else:\n",
643 | " print(\"No Evaluation job found\")"
644 | ]
645 | },
646 | {
647 | "cell_type": "markdown",
648 | "metadata": {},
649 | "source": [
650 | "### 7. Register the fine tuned model with the workspace\n",
651 | "\n",
652 |     "We will register the model from the output of the fine-tuning job. This tracks lineage between the fine-tuned model and the fine-tuning job; the fine-tuning job, in turn, tracks lineage to the foundation model, data and training code."
653 | ]
654 | },
655 | {
656 | "cell_type": "code",
657 | "execution_count": null,
658 | "metadata": {},
659 | "outputs": [],
660 | "source": [
661 | "model_path_from_job = \"azureml://jobs/{0}/outputs/{1}\".format(\n",
662 | " pipeline_job.name, \"trained_model\"\n",
663 | ")\n",
664 | "model_path_from_job"
665 | ]
666 | },
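The registration cell below builds both the registry-safe model name and the `azureml://` job-output path with plain string formatting. A standalone illustration (the model and job names are hypothetical):

```python
model_name = "meta-llama/Llama-2-7b"  # hypothetical catalog model name
pipeline_job_name = "affable_engine_0000"  # hypothetical pipeline job name

# '/' is not allowed in registered model names, so it is replaced with '-'
finetuned_model_name = (model_name + "-samsum-textgen").replace("/", "-")
model_path_from_job = "azureml://jobs/{0}/outputs/{1}".format(
    pipeline_job_name, "trained_model"
)
print(finetuned_model_name)  # meta-llama-Llama-2-7b-samsum-textgen
```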
667 | {
668 | "cell_type": "code",
669 | "execution_count": null,
670 | "metadata": {},
671 | "outputs": [],
672 | "source": [
673 | "from azure.ai.ml.entities import Model\n",
674 | "from azure.ai.ml.constants import AssetTypes\n",
675 | "\n",
676 | "# check if the `trained_model` output is available\n",
677 | "print(\"pipeline job outputs: \", workspace_ml_client.jobs.get(pipeline_job.name).outputs)\n",
678 | "\n",
679 | "# fetch the model from pipeline job output - not working, hence fetching from fine tune child job\n",
680 | "model_path_from_job = \"azureml://jobs/{0}/outputs/{1}\".format(\n",
681 | " pipeline_job.name, \"trained_model\"\n",
682 | ")\n",
683 | "finetuned_model_name = model_name + \"-samsum-textgen\"\n",
684 | "finetuned_model_name = finetuned_model_name.replace(\"/\", \"-\")\n",
685 | "print(\"path to register model: \", model_path_from_job)\n",
686 | "\n",
687 | "prepare_to_register_model = Model(\n",
688 | " path=model_path_from_job,\n",
689 | " type=AssetTypes.MLFLOW_MODEL,\n",
690 | " name=finetuned_model_name,\n",
691 | " version=timestamp, # use timestamp as version to avoid version conflict\n",
692 | " description=model_name + \" fine tuned model for samsum textgen\",\n",
693 | ")\n",
694 | "print(\"prepare to register model: \\n\", prepare_to_register_model)\n",
695 | "\n",
696 | "# register the model from pipeline job output\n",
697 | "registered_model = workspace_ml_client.models.create_or_update(\n",
698 | " prepare_to_register_model\n",
699 | ")\n",
700 | "print(\"registered model: \\n\", registered_model)"
701 | ]
702 | },
703 | {
704 | "cell_type": "markdown",
705 | "metadata": {},
706 | "source": [
707 | "### 8. Deploy the fine tuned model to an online endpoint\n",
708 |     "Online endpoints provide a durable REST API that can be used to integrate the model with applications that need to use it."
709 | ]
710 | },
711 | {
712 | "cell_type": "code",
713 | "execution_count": null,
714 | "metadata": {},
715 | "outputs": [],
716 | "source": [
717 | "import time, sys\n",
718 | "from azure.ai.ml.entities import (\n",
719 | " ManagedOnlineEndpoint,\n",
720 | " ManagedOnlineDeployment,\n",
721 | " ProbeSettings,\n",
722 | " OnlineRequestSettings,\n",
723 | ")\n",
724 | "\n",
725 | "# Create online endpoint - endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name\n",
726 | "online_endpoint_name = \"samsum-textgen-\" + timestamp\n",
727 | "\n",
728 | "# create an online endpoint\n",
729 | "endpoint = ManagedOnlineEndpoint(\n",
730 | " name=online_endpoint_name,\n",
731 | " description=\"Online endpoint for \"\n",
732 | " + registered_model.name\n",
733 | " + \", fine tuned model for samsum textgen\",\n",
734 | " auth_mode=\"key\",\n",
735 | ")\n",
736 | "\n",
737 | "workspace_ml_client.begin_create_or_update(endpoint).wait()"
738 | ]
739 | },
740 | {
741 | "cell_type": "markdown",
742 | "metadata": {},
743 | "source": [
744 |     "You can find the list of SKUs supported for deployment here: [Managed online endpoints SKU list](https://learn.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list)"
745 | ]
746 | },
747 | {
748 | "cell_type": "code",
749 | "execution_count": null,
750 | "metadata": {},
751 | "outputs": [],
752 | "source": [
753 | "# create a deployment\n",
754 | "demo_deployment = ManagedOnlineDeployment(\n",
755 | " name=\"demo\",\n",
756 | " endpoint_name=online_endpoint_name,\n",
757 | " model=registered_model.id,\n",
758 | " instance_type=\"Standard_NC6s_v3\",\n",
759 | " instance_count=1,\n",
760 | " liveness_probe=ProbeSettings(initial_delay=600),\n",
761 | " request_settings=OnlineRequestSettings(request_timeout_ms=90000),\n",
762 | ")\n",
763 | "workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()\n",
764 | "endpoint.traffic = {\"demo\": 100}\n",
765 | "workspace_ml_client.begin_create_or_update(endpoint).result()"
766 | ]
767 | },
768 | {
769 | "cell_type": "markdown",
770 | "metadata": {},
771 | "source": [
772 | "### 9. Test the endpoint with sample data\n",
773 | "\n",
774 |     "We will fetch some sample data from the test dataset and submit it to the online endpoint for inference. We will then display the scored labels alongside the ground truth labels."
775 | ]
776 | },
777 | {
778 | "cell_type": "code",
779 | "execution_count": null,
780 | "metadata": {},
781 | "outputs": [],
782 | "source": [
783 | "# read ./samsum-dataset/small_test.jsonl into a pandas dataframe\n",
784 | "test_df = pd.read_json(\"./samsum-dataset/small_test.jsonl\", lines=True)\n",
785 | "# take 2 random samples\n",
786 | "test_df = test_df.sample(n=2)\n",
787 | "# rebuild index\n",
788 | "test_df.reset_index(drop=True, inplace=True)\n",
789 | "# rename the label_string column to ground_truth_label\n",
790 | "#test_df = test_df.rename(columns={\"label_string\": \"ground_truth_label\"})\n",
791 | "test_df.head(2)"
792 | ]
793 | },
794 | {
795 | "cell_type": "code",
796 | "execution_count": null,
797 | "metadata": {},
798 | "outputs": [],
799 | "source": [
800 | "# create a json object with the key as \"input_data\" and value as a list of values from the text column of the test dataframe\n",
801 | "test_json = {\"input_data\": {\"text\": list(test_df[\"text\"])}}\n",
802 | "\n",
803 | "# save the json object to a file named sample_score.json in the ./samsum-dataset folder\n",
804 | "with open(\"./samsum-dataset/sample_score.json\", \"w\") as f:\n",
805 | " json.dump(test_json, f)"
806 | ]
807 | },
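The scoring request written above has the shape `{"input_data": {"text": [...]}}`. A stdlib round-trip check of that payload (the sample text is made up):

```python
import json

test_json = {
    "input_data": {"text": ["Summarize this dialog:\nA: hi\nB: hello\n---\nSummary:\n"]}
}
payload = json.dumps(test_json)

# the endpoint receives exactly this JSON body
decoded = json.loads(payload)
print(len(decoded["input_data"]["text"]))  # 1
```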
808 | {
809 | "cell_type": "code",
810 | "execution_count": null,
811 | "metadata": {},
812 | "outputs": [],
813 | "source": [
814 | "# score the sample_score.json file using the online endpoint with the azureml endpoint invoke method\n",
815 | "response = workspace_ml_client.online_endpoints.invoke(\n",
816 | " endpoint_name=online_endpoint_name,\n",
817 | " deployment_name=\"demo\",\n",
818 | " request_file=\"./samsum-dataset/sample_score.json\",\n",
819 | ")\n",
820 | "print(\"raw response: \\n\", response, \"\\n\")\n",
821 | "\n",
822 | "# convert the response to a pandas dataframe and rename the label column as scored_label\n",
823 | "response_df = pd.read_json(response)\n",
824 | "response_df = response_df.rename(columns={0: \"scored_label\"})\n",
825 | "response_df.head(2)"
826 | ]
827 | },
828 | {
829 | "cell_type": "code",
830 | "execution_count": null,
831 | "metadata": {},
832 | "outputs": [],
833 | "source": [
834 | "# merge the test dataframe and the response dataframe on the index\n",
835 | "merged_df = pd.merge(test_df, response_df, left_index=True, right_index=True)\n",
836 | "merged_df.head(2)"
837 | ]
838 | },
839 | {
840 | "cell_type": "markdown",
841 | "metadata": {},
842 | "source": [
843 | "### 10. Delete the online endpoint\n",
844 |     "Don't forget to delete the online endpoint; otherwise you will leave the billing meter running for the compute used by the endpoint."
845 | ]
846 | },
847 | {
848 | "cell_type": "code",
849 | "execution_count": null,
850 | "metadata": {},
851 | "outputs": [],
852 | "source": [
853 | "workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()"
854 | ]
855 |   }
870 | ],
871 | "metadata": {
872 | "kernelspec": {
873 | "display_name": "Python 3.10 - SDK v2",
874 | "language": "python",
875 | "name": "python310-sdkv2"
876 | },
877 | "language_info": {
878 | "codemirror_mode": {
879 | "name": "ipython",
880 | "version": 3
881 | },
882 | "file_extension": ".py",
883 | "mimetype": "text/x-python",
884 | "name": "python",
885 | "nbconvert_exporter": "python",
886 | "pygments_lexer": "ipython3",
887 | "version": "3.10.11"
888 | }
889 | },
890 | "nbformat": 4,
891 | "nbformat_minor": 2
892 | }
893 |
--------------------------------------------------------------------------------
/labs/fine_tuning_notebooks/llama2_fine_tuning/text-generation-config.json:
--------------------------------------------------------------------------------
1 | {}
--------------------------------------------------------------------------------
/labs/images/instruction.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-endpoint-consume-api.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-endpoint-consume-api.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-endpoint-deployment-succeed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-endpoint-deployment-succeed.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-endpoint-test-interface.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-endpoint-test-interface.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-create-compute-cluster-advanced-config.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-create-compute-cluster-advanced-config.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-create-compute-cluster.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-create-compute-cluster.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-llama2-7b-wizard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-llama2-7b-wizard.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-llama2-7b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-llama2-7b.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-model-deploy-wizard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-model-deploy-wizard.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-model-deploy.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-model-deploy.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-model-details.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-model-details.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-model-training-job-completed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-model-training-job-completed.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-model-training-job-running.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-model-training-job-running.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-select-training-data-map-columns.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-select-training-data-map-columns.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-ft-select-training-data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-ft-select-training-data.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-model-catalog.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-model-catalog.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aml-search-llama2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aml-search-llama2.png
--------------------------------------------------------------------------------
/labs/images/screenshot-aoai-keys-and-endpoint.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-aoai-keys-and-endpoint.png
--------------------------------------------------------------------------------
/labs/images/screenshot-azure-env-file.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-azure-env-file.png
--------------------------------------------------------------------------------
/labs/images/screenshot-deployed-fine-tuned-model-via-sdk.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-deployed-fine-tuned-model-via-sdk.png
--------------------------------------------------------------------------------
/labs/images/screenshot-fine-tuning-illustration-diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/LLM-Fine-Tuning-Azure/1af59bc63242b9834cef66651dc6bb0aa68abe6b/labs/images/screenshot-fine-tuning-illustration-diagram.png
--------------------------------------------------------------------------------