├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── DEVELOPMENT.md ├── LICENSE ├── README.md ├── code-editor └── auto-stop-idle │ ├── README.md │ ├── on-start.sh │ └── python-package │ ├── .gitignore │ ├── setup.py │ └── src │ └── sagemaker_code_editor_auto_shut_down │ ├── __init__.py │ ├── auto_stop_idle.py │ └── version.py ├── common-scripts └── ebs-s3-backup-restore │ ├── README.md │ └── on-start.sh └── jupyterlab └── auto-stop-idle ├── README.md ├── on-start.sh └── python-package ├── .gitignore ├── setup.py └── src └── sagemaker_studio_jlab_auto_stop_idle ├── __init__.py ├── auto_stop_idle.py └── version.py /.gitignore: -------------------------------------------------------------------------------- 1 | # OS and IDE files: 2 | .DS_Store 3 | .idea 4 | .ipynb_checkpoints/ 5 | .vscode/ 6 | 7 | # Python build artifacts: 8 | __pycache__/ 9 | 10 | # gzip tarballs 11 | *.tar.gz -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /DEVELOPMENT.md: -------------------------------------------------------------------------------- 1 | ## Best practices for developing Lifecycle Configuration scripts for SageMaker Studio applications 2 | 3 | ### SageMaker JupyterLab 4 | 5 | 1. You can test JupyterLab scripts in the JupyterLab **Terminal**. If the scripts run without issues in the terminal, you can safely assume they will run as LCC scripts as well. 6 | 7 | 2. Always add the `set -eux` command to the beginning of your script. This command prints the commands executed by your script line by line, and they will be visible in the logs as well. This helps you troubleshoot your scripts faster. 8 | 9 | 3. The script will be running as `sagemaker-user`. Use `sudo` to run commands as `root`. 10 | 11 | 4. If you are installing JupyterLab or Jupyter Server extensions, ensure they're compatible with the Studio JupyterLab version. 12 | 13 | 5. Persistent EBS storage is mounted at `/home/sagemaker-user`; leverage persistent storage to avoid re-installing libraries or packages at each restart. 14 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT No Attribution 2 | 3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so. 10 | 11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR 14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 17 | 18 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SageMaker Studio Lifecycle Configuration examples 2 | 3 | ## Overview 4 | A collection of sample scripts for customizing SageMaker Studio applications using lifecycle configurations. 5 | 6 | Lifecycle Configurations (LCCs) provide a mechanism to customize SageMaker Studio applications via shell scripts that are executed at application bootstrap. For further information on how to use lifecycle configurations with SageMaker Studio applications, please refer to the AWS documentation: 7 | 8 | - [Using Lifecycle Configurations with JupyterLab](https://docs.aws.amazon.com/sagemaker/latest/dg/jl-lcc.html) 9 | - [Using Lifecycle Configurations with Code Editor](https://docs.aws.amazon.com/sagemaker/latest/dg/code-editor-use-lifecycle-configurations.html) 10 | 11 | > **Warning** 12 | > The sample scripts in this repository are designed to work with SageMaker Studio JupyterLab and Code Editor applications. If you are using SageMaker Studio Classic, please refer to https://github.com/aws-samples/sagemaker-studio-lifecycle-config-examples 13 | 14 | ## Sample Scripts 15 | 16 | ### [SageMaker JupyterLab](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-jl.html) 17 | - [auto-stop-idle](jupyterlab/auto-stop-idle/) - Automatically shuts down JupyterLab applications that have been idle for a configurable time. 18 | 19 | ### [SageMaker Code Editor](https://docs.aws.amazon.com/sagemaker/latest/dg/code-editor.html) 20 | - [auto-stop-idle](code-editor/auto-stop-idle/) - Automatically shuts down Code Editor applications that have been idle for a configurable time. 21 | 22 | ### Common scripts 23 | These scripts work with both SageMaker JupyterLab and SageMaker Code Editor apps. Note that if you want a script to be available in both apps, you will need to set it as an LCC script for both apps. 24 | - [ebs-s3-backup-restore](common-scripts/ebs-s3-backup-restore) - This script backs up the content of a user space's EBS volume (the user's home directory under `/home/sagemaker-user`) to an S3 bucket specified in the script, optionally on a schedule. If the user profile is tagged with a `SM_EBS_RESTORE_TIMESTAMP` tag, the script will also restore the backed-up files into the user's home directory, in addition to performing backups. 25 | 26 | ## Developing LCCs for SageMaker Studio applications 27 | For best practices, please check [DEVELOPMENT](DEVELOPMENT.md). 28 | 29 | ## License 30 | This project is licensed under the [MIT-0 License](LICENSE). 31 | 32 | ## Authors 33 | [Giuseppe A. Porcelli](https://www.linkedin.com/in/giuporcelli/) - Principal, ML Specialist Solutions Architect - Amazon SageMaker 34 |
Spencer Ng - Software Development Engineer - Amazon SageMaker 35 |
Durga Sury - Senior ML Specialist Solutions Architect - Amazon SageMaker -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/README.md: -------------------------------------------------------------------------------- 1 | # SageMaker Code Editor Auto-Stop for Idle Instances 2 | 3 | The `auto_stop_idle.py` Python script, coupled with the `on-start.sh` shell script, is designed to automatically shut down idle SageMaker Code Editor applications after a configurable time of inactivity. This solution is intended to help manage costs by ensuring that resources are not left running when not in use. 4 | 5 | ## Installation for SageMaker Studio User Profiles 6 | 7 | ### Prerequisites 8 | 9 | - AWS CLI configured with appropriate permissions 10 | - Access to the SageMaker Studio domain where the user profiles are located 11 | 12 | ### Installation for all user profiles in a SageMaker Studio domain 13 | 14 | From a terminal appropriately configured with AWS CLI, run the following commands (replace fields as needed): 15 | 16 | ``` 17 | ASI_VERSION=0.3.1 18 | 19 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/code-editor-lccs-$ASI_VERSION.tar.gz 20 | tar -xvzf code-editor-lccs-$ASI_VERSION.tar.gz 21 | 22 | cd auto-stop-idle 23 | 24 | REGION= 25 | DOMAIN_ID= 26 | ACCOUNT_ID= 27 | LCC_NAME=code-editor-auto-stop-idle 28 | LCC_CONTENT=`openssl base64 -A -in on-start.sh` 29 | 30 | aws sagemaker create-studio-lifecycle-config \ 31 | --studio-lifecycle-config-name $LCC_NAME \ 32 | --studio-lifecycle-config-content $LCC_CONTENT \ 33 | --studio-lifecycle-config-app-type CodeEditor \ 34 | --query 'StudioLifecycleConfigArn' 35 | 36 | aws sagemaker update-domain \ 37 | --region "$REGION" \ 38 | --domain-id "$DOMAIN_ID" \ 39 | --default-user-settings \ 40 | '{ 41 | "CodeEditorAppSettings": { 42 | "DefaultResourceSpec": { 43 | "LifecycleConfigArn": "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'", 44 | "InstanceType": "ml.t3.medium" 45 | }, 46 | "LifecycleConfigArns": [ 47 | "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'" 48 | ] 49 | } 50 | }' 51 | 52 | ``` 53 | 54 | After a successful domain update, navigate to Code Editor and select the LCC when starting your Code Editor application. 55 | 56 | Note: Currently this script does not support installation in internet-free VPC environments. 57 | 58 | ### Definition of idleness 59 | 60 | The current implementation uses the following criteria to determine idleness: 61 | 62 | 1. There are no file changes made in the Code Editor application for a time period greater than `IDLE_TIME_IN_SECONDS` (see the shell sketch at the end of this README). File changes include adding new files, deleting files, and/or updating files. 63 | * Note: The implementation does not currently support terminal activity detection. 64 | 65 | ### Configurations 66 | 67 | The `on-start.sh` script can be customized by modifying: 68 | 69 | * `IDLE_TIME_IN_SECONDS` the time in seconds that the application must be in the "idle" state before being shut down. Default: `3600` seconds 70 | * `ASI_VERSION` the version of the Auto Shut Down solution. Please note that Code Editor starts at `v0.3.0`. 71 | 72 | ### Acknowledgement 73 | 74 | A special acknowledgement to Lavaraja Padala for his foundational work on Lifecycle Configuration (LCC) implementation. We're grateful for his contribution to the community! 
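The block below is a minimal shell sketch of the file-modification-time check that `auto_stop_idle.py` performs, included only to illustrate the idleness definition above. It assumes GNU `find` is available in the image and checks a single workspace directory, whereas the packaged script walks both the `History` and `Backups` directories; it is not part of the installed solution.

```
#!/bin/bash
# Illustrative sketch only: approximates the mtime-based idleness check in auto_stop_idle.py.
IDLE_TIME_IN_SECONDS=3600
WORKSPACE_DIR=/opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/History

# Most recent modification time (epoch seconds) of any file under the workspace directory.
newest=$(find "$WORKSPACE_DIR" -type f -printf '%T@\n' 2>/dev/null | cut -d. -f1 | sort -n | tail -1)
now=$(date +%s)

# No files at all, or newest file older than the threshold => the app counts as idle.
if [ -z "$newest" ] || [ $((now - newest)) -gt "$IDLE_TIME_IN_SECONDS" ]; then
    echo "idle"
else
    echo "active"
fi
```

Running this from a Code Editor terminal shows whether the application would currently be classified as idle.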
75 | -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/on-start.sh: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | #!/bin/bash 5 | set -eux 6 | ASI_VERSION=0.3.1 7 | 8 | # User variables [update as needed] 9 | IDLE_TIME_IN_SECONDS=3600 # in seconds, change this to desired idleness time before app shuts down 10 | 11 | # System variables [do not change if not needed] 12 | CONDA_HOME=/opt/conda/bin 13 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs. 14 | SOLUTION_DIR=/var/tmp/auto-stop-idle # Do not use /home/sagemaker-user 15 | PYTHON_PACKAGE=sagemaker_code_editor_auto_shut_down-$ASI_VERSION.tar.gz 16 | PYTHON_SCRIPT_PATH=$SOLUTION_DIR/sagemaker_code_editor_auto_shut_down/auto_stop_idle.py 17 | 18 | # Installing cron 19 | sudo apt-get update -y 20 | 21 | # Issue - https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/issues/12 22 | # SM Distribution image 1.6 is not starting cron service by default https://github.com/aws/sagemaker-distribution/issues/354 23 | 24 | # Check if cron needs to be installed ## Handle scenario where script exiting("set -eux") due to non-zero return code by adding true command. 25 | status="$(dpkg-query -W --showformat='${db:Status-Status}' "cron" 2>&1)" || true 26 | if [ ! $? = 0 ] || [ ! "$status" = installed ]; then 27 | # Fixing invoke-rc.d: policy-rc.d denied execution of restart. 28 | sudo /bin/bash -c "echo '#!/bin/sh 29 | exit 0' > /usr/sbin/policy-rc.d" 30 | 31 | # Installing cron. 32 | echo "Installing cron..." 33 | sudo apt install cron 34 | else 35 | echo "Package cron is already installed." 36 | # Start/restart the service. 37 | sudo service cron restart 38 | fi 39 | 40 | # Creating solution directory. 41 | sudo mkdir -p $SOLUTION_DIR 42 | 43 | # Downloading autostop idle Python package. 44 | echo "Downloading autostop idle Python package..." 45 | curl -LO --output-dir /var/tmp/ https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE 46 | sudo $CONDA_HOME/pip install -U -t $SOLUTION_DIR /var/tmp/$PYTHON_PACKAGE 47 | 48 | # Touch file to ensure idleness timer is reset to 0 49 | echo "Touching file to reset idleness timer" 50 | touch /opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/History/startup_timestamp 51 | 52 | # Setting container credential URI variable to /etc/environment to make it available to cron 53 | sudo /bin/bash -c "echo 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI' >> /etc/environment" 54 | 55 | # Add script to crontab for root. 56 | echo "Adding autostop idle Python script to crontab..." 
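# Note: the crontab entry below replaces root's crontab and runs the idleness check every 2 minutes;
# its output is appended to $LOG_FILE so it reaches CloudWatch Logs, and the credential URI written to
# /etc/environment above makes the container credentials available to the cron job.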
57 | echo "*/2 * * * * /bin/bash -ic '$CONDA_HOME/python $PYTHON_SCRIPT_PATH --time $IDLE_TIME_IN_SECONDS --region $AWS_DEFAULT_REGION >> $LOG_FILE'" | sudo crontab - -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/python-package/.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/python-package/setup.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | 3 | from glob import glob 4 | import os 5 | from os.path import basename 6 | from os.path import splitext 7 | 8 | from setuptools import find_packages, setup 9 | from distutils.util import convert_path 10 | 11 | main_ns = {} 12 | ver_path = convert_path('src/sagemaker_code_editor_auto_shut_down/version.py') 13 | with open(ver_path) as ver_file: 14 | exec(ver_file.read(), main_ns) 15 | 16 | setup( 17 | name='sagemaker_code_editor_auto_shut_down', 18 | version=main_ns['__version__'], 19 | description='Auto Stop idle Code Editor Apps.', 20 | 21 | packages=find_packages(where='src', exclude=('test',)), 22 | package_dir={'': 'src'}, 23 | py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')], 24 | 25 | author='Amazon Web Services', 26 | url='https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples', 27 | license='MIT-0', 28 | 29 | classifiers=[ 30 | "Development Status :: 5 - Production/Stable", 31 | "Intended Audience :: Developers", 32 | "Natural Language :: English", 33 | "License :: OSI Approved :: MIT-0", 34 | "Programming Language :: Python", 35 | 'Programming Language :: Python :: 3.9', 36 | 'Programming Language :: Python :: 3.10' 37 | ], 38 | 39 | install_requires=[], 40 | extras_require={ 41 | } 42 | ) -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/__init__.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from sagemaker_code_editor_auto_shut_down.version import __version__ -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/auto_stop_idle.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import os 3 | import time 4 | import boto3 5 | import json 6 | import sys 7 | import argparse 8 | 9 | DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fz" 10 | 11 | def log_message(message): 12 | """ 13 | Logs a message. 
14 | """ 15 | print(f"{datetime.now().strftime(DATE_FORMAT)} - {message}") 16 | 17 | def check_user_activity(workspace_dir, idle_threshold): 18 | # Get the timestamp of the most recently modified file or folder 19 | recent_item = max( 20 | (os.path.join(root, file) for root, _, files in os.walk(workspace_dir) for file in files), 21 | key=lambda x: os.lstat(x).st_mtime, 22 | default=None 23 | ) 24 | 25 | # Get the current time 26 | current_time = time.time() 27 | 28 | # Calculate the time difference 29 | time_diff = current_time - os.stat(recent_item).st_mtime if recent_item else float('inf') 30 | log_message(f"[auto-stop-idle] - Logging time difference between current time and time files were last changed {time_diff}.") 31 | 32 | # Check if the user is idle based on the idle time threshold 33 | if time_diff > idle_threshold: 34 | return "idle" 35 | else: 36 | return "active" 37 | 38 | # Create an argument parser 39 | parser = argparse.ArgumentParser(description='Check user activity and terminate SageMaker Studio app if idle.') 40 | parser.add_argument('--time', type=int, help='Idle time threshold in seconds') 41 | parser.add_argument('--region', type=str, help='AWS region') 42 | 43 | # Parse the command-line arguments 44 | args = parser.parse_args() 45 | 46 | # Check if idle_threshold is provided 47 | if args.time is None: 48 | parser.print_help() 49 | sys.exit(1) 50 | 51 | if args.region is None: 52 | parser.print_help() 53 | sys.exit(1) 54 | 55 | # Monitor workspace_dirs for changes to implement auto-shutdown, as these paths track updates to both unsaved and saved editor content, covering all user activity scenarios. 56 | workspace_dirs = ["/opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/History", "/opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/Backups/empty-window/untitled"] 57 | idle_threshold = args.time # this is in seconds. for ex: 1800 seconds for 30 minutes 58 | aws_region = args.region # get the region. 59 | 60 | # Track the activity status for each directory 61 | activity_status = [check_user_activity(directory, idle_threshold) for directory in workspace_dirs] 62 | 63 | # Terminate the SageMaker Studio app if all directories are idle and no activity is observed. 64 | if all(status == "idle" for status in activity_status): 65 | # Load the resource metadata from the file 66 | with open('/opt/ml/metadata/resource-metadata.json') as f: 67 | resource_metadata = json.load(f) 68 | 69 | # Extract the required details for deleting the app 70 | domain_id = resource_metadata['DomainId'] 71 | space_name = resource_metadata['SpaceName'] 72 | app_name = resource_metadata['ResourceName'] 73 | app_type = resource_metadata['AppType'] 74 | resource_arn = resource_metadata["ResourceArn"] 75 | 76 | # Use boto3 api call to delete the app. 77 | sm_client = boto3.client('sagemaker',region_name=aws_region) 78 | response = sm_client.delete_app( 79 | DomainId=domain_id, 80 | AppType=app_type, 81 | AppName=app_name, 82 | SpaceName=space_name 83 | ) 84 | log_message(f"[auto-stop-idle] - Deleting app {app_type}-{app_name}. Domain ID: {domain_id}. Space name: {space_name}. Resource ARN: {resource_arn}.") 85 | log_message("[auto-stop-idle] - SageMaker Code Editor app terminated due to being idle for given duration.") 86 | else: 87 | log_message("[auto-stop-idle] - SageMaker Code Editor app is not idle. 
Passing check.") 88 | -------------------------------------------------------------------------------- /code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/version.py: -------------------------------------------------------------------------------- 1 | __version__ = "0.3.1" -------------------------------------------------------------------------------- /common-scripts/ebs-s3-backup-restore/README.md: -------------------------------------------------------------------------------- 1 | # SageMaker Studio EBS Backup and Recovery 2 | 3 | SageMaker Studio uses Elastic Block Storage (EBS) for persistent storage of users' files. See the blog [Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools](https://aws.amazon.com/blogs/machine-learning/boost-productivity-on-amazon-sagemaker-studio-introducing-jupyterlab-spaces-and-generative-ai-tools/) for a detailed look at Studio architecture. 4 | 5 | Since the EBS volume is managed by SageMaker, customers want a mechanism to backup and restore files from users' spaces in the event of a disaster, or any other scenarios such as recreating a space or user profile. 6 | 7 | When set as a Lifecycle Configuration, the `on-start.sh` shell script backs up the user's file in the space home directory (`/home/sagemaker-user`) into an S3 location. The S3 bucket and prefix are specified by the administrator through the script, and the script saves the files under `s3://////`. The admin can also choose to run the S3 sync at regular intervals, the default provided is 12 hours. We recommend not going less than 6 hours on the time interval, so that notebook performance is not affected by the background sync. 8 | 9 | When the administrator needs to restore the files, the user profile simply needs to be tagged with the timestamp. If there is a timestamp tag on the user, the script will restore the files from the timestamp, in addition to backing up files to S3. 10 | *Note: Admins should remove the timestamp tag from the user profile, after the LCC is run. 
Otherwise, the script will continue to restore from S3.* 11 | 12 | ## Installation for SageMaker Studio User Profiles 13 | 14 | ### Prerequisites 15 | 16 | - AWS CLI configured with appropriate permissions 17 | - Access to the SageMaker Studio domain where the user profiles are located 18 | 19 | ### Installation for all user profiles in a SageMaker Studio domain 20 | 21 | From a terminal appropriately configured with AWS CLI, run the following commands (replace fields as needed): 22 | 23 | ``` 24 | REGION= 25 | DOMAIN_ID= 26 | ACCOUNT_ID= 27 | LCC_NAME=ebs-s3-backup-restore 28 | LCC_CONTENT=`openssl base64 -A -in on-start.sh` 29 | 30 | # replace CodeEditor with JupyterLab if setting this LCC for JupyterLab apps 31 | aws sagemaker create-studio-lifecycle-config \ 32 | --studio-lifecycle-config-name $LCC_NAME \ 33 | --studio-lifecycle-config-content $LCC_CONTENT \ 34 | --studio-lifecycle-config-app-type CodeEditor \ 35 | --query 'StudioLifecycleConfigArn' 36 | 37 | aws sagemaker update-domain \ 38 | --region "$REGION" \ 39 | --domain-id "$DOMAIN_ID" \ 40 | --default-user-settings \ 41 | '{ 42 | "CodeEditorAppSettings": { 43 | "DefaultResourceSpec": { 44 | "LifecycleConfigArn": "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'", 45 | "InstanceType": "ml.t3.medium" 46 | }, 47 | "LifecycleConfigArns": [ 48 | "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'" 49 | ] 50 | } 51 | }' 52 | 53 | ``` 54 | 55 | 2. After successful domain update, navigate to your space, and select the LCC when starting your default application. 56 | 57 | 58 | ### Configurations 59 | 60 | The `on-start.sh` script can be customized by modifying: 61 | 62 | * `ENABLE_SCHEDULED_SYNC` - set to 1 to enable scheduled syncs to S3 . Default value is `1` (enabled). 63 | * `SYNC_INTERVAL` - if scheduled sync is enabled, the time interval in hours for syncing files to S3. Default value is `12`. -------------------------------------------------------------------------------- /common-scripts/ebs-s3-backup-restore/on-start.sh: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | #!/bin/bash 5 | set -eux 6 | 7 | # User variables [update as needed] 8 | export SM_BCK_BUCKET=studio-backup-bucket 9 | export SM_BCK_PREFIX=studio-backups 10 | ENABLE_SCHEDULED_SYNC=1 # If set to 1, the user home directory will be synched with Amazon S3 every SYNC_INTERVAL_IN_HOURS 11 | SYNC_INTERVAL_IN_HOURS=12 # Determines how frequently synch the user home directory on Amazon S3 12 | 13 | # System variables [do not change if not needed] 14 | export SM_BCK_HOME=$HOME 15 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs 16 | 17 | if [ $ENABLE_SCHEDULED_SYNC -eq 1 ] 18 | then 19 | echo "[EBS backup LCC] - Scheduled sync is enabled. Installing cron." 
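# Note: the policy-rc.d override below mirrors the auto-stop-idle scripts; it avoids
# "invoke-rc.d: policy-rc.d denied execution of restart" errors when apt installs cron inside the app container.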
20 | 21 | # Installing cron 22 | sudo apt-get update -y 23 | sudo sh -c 'printf "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d' 24 | sudo apt-get install -y cron 25 | fi 26 | 27 | # Installing jq 28 | sudo apt-get install -y jq 29 | 30 | export SM_BCK_SPACE_NAME=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.SpaceName') 31 | export SM_BCK_DOMAIN_ID=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.DomainId') 32 | export SM_BCK_USER_PROFILE_NAME=$(aws sagemaker describe-space --domain-id=$SM_BCK_DOMAIN_ID --space-name=$SM_BCK_SPACE_NAME | jq -r '.OwnershipSettings.OwnerUserProfileName') 33 | USER_PROFILE_ARN=$(aws sagemaker describe-user-profile --domain-id $SM_BCK_DOMAIN_ID --user-profile-name $SM_BCK_USER_PROFILE_NAME | jq -r '.UserProfileArn') 34 | RESTORE_TIMESTAMP=$(aws sagemaker list-tags --resource-arn $USER_PROFILE_ARN| jq -r '.Tags[] | select(.Key=="SM_EBS_RESTORE_TIMESTAMP").Value') 35 | 36 | # Creating backup script (if needed) 37 | if ! [ -f $HOME/.backup/backup.sh ]; then 38 | echo "[EBS backup LCC] - Creating backup script." 39 | mkdir -p $HOME/.backup 40 | 41 | cat << "EOF" > $HOME/.backup/backup.tp 42 | #!/bin/bash 43 | 44 | BACKUP_TIMESTAMP=`date +%F-%H-%M-%S` 45 | SNAPSHOT=${SM_BCK_USER_PROFILE_NAME}/${SM_BCK_SPACE_NAME}/${BACKUP_TIMESTAMP} 46 | echo "[EBS backup LCC] - Backup up $SM_BCK_HOME to s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/" 47 | 48 | # sync to S3 and skip files if they have been restored to avoid redundant copies and exclude hidden files 49 | aws s3 sync --exclude "*/lost+found/*" --exclude "restored-files/*" --exclude ".*/*" $SM_BCK_HOME s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/ 50 | 51 | exitcode=$? 52 | echo "[EBS backup LCC] - S3 sync result (backup): " 53 | echo $exitcode 54 | 55 | if [ $exitcode -eq 0 ] || [ $exitcode -eq 2 ] 56 | then 57 | echo "[EBS backup LCC] - Created s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/" >> $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE 58 | CURRENT_TIMESTAMP=`date +%F-%H-%M-%S` 59 | echo "[EBS backup LCC] - Backup completed at $CURRENT_TIMESTAMP" >> $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE 60 | aws s3 cp $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE 61 | fi 62 | EOF 63 | envsubst "$(printf '${%s} ' ${!SM_BCK_*})" < $HOME/.backup/backup.tp > $HOME/.backup/backup.sh 64 | 65 | chmod +x $HOME/.backup/backup.sh 66 | fi 67 | 68 | # Creating restore script (if needed) 69 | if ! [ -f $HOME/.restore/restore.sh ]; then 70 | echo "[EBS backup LCC] - Creating restore script." 71 | mkdir -p $HOME/.restore 72 | 73 | cat << "EOF" > $HOME/.restore/restore.tp 74 | #!/bin/bash 75 | 76 | RESTORE_TIMESTAMP_ARG=$1 77 | SNAPSHOT=${SM_BCK_USER_PROFILE_NAME}/${SM_BCK_SPACE_NAME}/${RESTORE_TIMESTAMP_ARG} 78 | 79 | # check if SNAPSHOT exists, if not, proceed without sync 80 | echo "[EBS backup LCC] - Checking if s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} exists..." 81 | aws s3 ls s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} || (echo "[EBS backup LCC] - Snapshot s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} does not exist. 
Proceed without the sync."; exit 0) 82 | 83 | # files are backed up to 'restored-files' to avoid overwriting 84 | echo "[EBS backup LCC] - Syncing s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} to $SM_BCK_HOME/restored-files" 85 | aws s3 sync s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} $SM_BCK_HOME/restored-files/${RESTORE_TIMESTAMP_ARG} 86 | 87 | exitcode=$? 88 | echo "[EBS backup LCC] - S3 sync result (restore): " 89 | echo $exitcode 90 | if [ $exitcode -eq 0 ] || [ $exitcode -eq 2 ] 91 | then 92 | CURRENT_TIMESTAMP=`date +%F-%H-%M-%S` 93 | echo "[EBS backup LCC] - Restore completed at $CURRENT_TIMESTAMP" >> $SM_BCK_HOME/.restore/${RESTORE_TIMESTAMP_ARG}_SYNC_COMPLETE 94 | fi 95 | 96 | EOF 97 | envsubst "$(printf '${%s} ' ${!SM_BCK_*})" < $HOME/.restore/restore.tp > $HOME/.restore/restore.sh 98 | 99 | chmod +x $HOME/.restore/restore.sh 100 | fi 101 | 102 | # Run backup (at least once at bootstrap) 103 | echo "[EBS backup LCC] - Executing backup at bootstrap." 104 | nohup $HOME/.backup/backup.sh >> $LOG_FILE 2>&1 & 105 | 106 | # Check if scheduled backup needs to be enabled. 107 | if [ $ENABLE_SCHEDULED_SYNC -eq 1 ] 108 | then 109 | echo "[EBS backup LCC] - Adding backup script to crontab..." 110 | sudo mkdir -p /var/tmp 111 | sudo rm -f /var/tmp/ebs_backup.sh 112 | cp $HOME/.backup/backup.sh /var/tmp/ebs_backup.sh 113 | sudo chown root:root /var/tmp/ebs_backup.sh 114 | sudo chmod +x /var/tmp/ebs_backup.sh 115 | echo "0 */$SYNC_INTERVAL_IN_HOURS * * * /bin/bash -ic '/var/tmp/ebs_backup.sh >> $LOG_FILE'" | sudo crontab - 116 | fi 117 | 118 | # Check if restore timestamp is set. 119 | if ! [ -z "$RESTORE_TIMESTAMP" ] 120 | then 121 | echo "[EBS backup LCC] - User profile tagged with restore timestamp: ${RESTORE_TIMESTAMP}. Restoring files..." 122 | # nohup to bypass the LCC timeout at start 123 | nohup $HOME/.restore/restore.sh $RESTORE_TIMESTAMP >> $LOG_FILE 2>&1 & 124 | fi 125 | -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/README.md: -------------------------------------------------------------------------------- 1 | # SageMaker Studio JupyterLab auto-stop idle notebooks 2 | The `on-start.sh` script, designed to run as a [SageMaker Studio lifecycle configuration](https://docs.aws.amazon.com/sagemaker/latest/dg/jl-lcc.html), automatically shuts down idle JupyterLab applications after a configurable time of inactivity. 
3 | 4 | ## Installation for all user profiles in a SageMaker Studio domain 5 | 6 | From a terminal appropriately configured with AWS CLI, run the following commands: 7 | 8 | ASI_VERSION=0.3.1 9 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/jupyterlab-lccs-$ASI_VERSION.tar.gz 10 | tar -xvzf jupyterlab-lccs-$ASI_VERSION.tar.gz 11 | 12 | cd auto-stop-idle 13 | 14 | REGION= 15 | DOMAIN_ID= 16 | ACCOUNT_ID= 17 | LCC_NAME=auto-stop-idle 18 | LCC_CONTENT=`openssl base64 -A -in on-start.sh` 19 | 20 | aws sagemaker create-studio-lifecycle-config \ 21 | --studio-lifecycle-config-name $LCC_NAME \ 22 | --studio-lifecycle-config-content $LCC_CONTENT \ 23 | --studio-lifecycle-config-app-type JupyterLab \ 24 | --query 'StudioLifecycleConfigArn' 25 | 26 | aws sagemaker update-domain \ 27 | --region $REGION \ 28 | --domain-id $DOMAIN_ID \ 29 | --default-user-settings \ 30 | "{ 31 | \"JupyterLabAppSettings\": { 32 | \"DefaultResourceSpec\": { 33 | \"LifecycleConfigArn\": \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\", 34 | \"InstanceType\": \"ml.t3.medium\" 35 | }, 36 | \"LifecycleConfigArns\": [ 37 | \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\" 38 | ] 39 | } 40 | }" 41 | 42 | Make sure to replace , , and in the previous commands with the AWS region, the Studio domain ID, and the AWS account ID you are using, respectively. 43 | 44 | ## Definition of idleness 45 | The implementation considers a JupyterLab application as idle when: 46 | 1. The running Jupyter kernels and terminals have been idle for more than `IDLE_TIME_IN_SECONDS` (see Configuration section), based on their execution state and last activity date. 47 | 2. There are no running kernels or terminals, and the last activity date recorded for the last running kernel or terminal plus `IDLE_TIME_IN_SECONDS` is earlier than the current date. 48 | 49 | **Note**: if the JupyterLab application is started and no kernels or terminals are executed, idleness is computed based on the recorded last activity date. As a consequence, if users work with JupyterLab for more than `IDLE_TIME_IN_SECONDS` without running any Jupyter kernel or terminal, the application will be considered idle and shut down. 50 | 51 | ## Configuration 52 | The `on-start.sh` script can be customized by modifying the following variables: 53 | 54 | - `IDLE_TIME_IN_SECONDS` the time in seconds for which JupyterLab has to be idle before shutting down the application. **Default**: `3600` 55 | - `IGNORE_CONNECTIONS` whether active Jupyter Notebook sessions on idle kernels should be ignored. **Default**: `True` 56 | - `SKIP_TERMINALS` whether to skip any idleness check on Jupyter terminals. **Default**: `False` 57 | 58 | In addition, the following advanced configuration is available (do not change unless explicitly required by your setup): 59 | 60 | - `ASI_VERSION` the version of the Auto Stop Idle (ASI) solution. 61 | - `JL_HOSTNAME` the host name for the JupyterLab application. **Default**: `0.0.0.0` 62 | - `JL_PORT` JupyterLab port. **Default**: `8888` 63 | - `JL_BASE_URL` JupyterLab base URL. **Default**: `/jupyterlab/default/` 64 | - `CONDA_HOME` Conda home directory. **Default**: `/opt/conda/bin` 65 | - `LOG_FILE` Path to the file where logs are written; defaults to the location of the Studio app logs, which are automatically delivered to Amazon CloudWatch. 
**Default**: `/var/log/apps/app_container.log` 66 | - `SOLUTION_DIR` The directory where the solution will be installed. **Default**: `/var/tmp/auto-stop-idle` 67 | - `STATE_FILE` Path to a file that is used to save the state for the Python script (given its execution is stateless). The location of this file has to be transient, i.e. not persisted across restarts of the Studio JupyterLab app; as a consequence, do not use EBS-backed directories like `/home/sagemaker-user/`. **Default**: `/var/tmp/auto-stop-idle/auto_stop_idle.st` 68 | - `PYTHON_PACKAGE` The name (with version) of the auto-stop-idle Python package. **Default**: `sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz` 69 | 70 | 71 | ## Architecture considerations 72 | - The `on-start.sh` lifecycle configuration script adds a `cron` job for `root` using `crontab`, which is configured to run every `2` minutes. The job runs the Python script at `PYTHON_SCRIPT_PATH`, which checks for idleness. If the JupyterLab application is detected as idle, the Python script deletes the application by invoking the Amazon SageMaker `DeleteApp` API. 73 | - This solution requires: 74 | 1. internet access to download `PYTHON_PACKAGE`. 75 | 2. access to the Amazon SageMaker `DeleteApp` API. From the authorization perspective, the execution role associated with the Studio domain or user profile must have an IAM policy allowing the `sagemaker:DeleteApp` action. 76 | - The Studio JupyterLab application runs as `sagemaker-user`, which has `sudo` privileges; as a consequence, users could potentially remove the cron task and stop any idleness checks. To prevent this behavior, you can modify the configuration in `/etc/sudoers` to remove sudo privileges from `sagemaker-user`. 77 | 78 | ### Installing in internet-free VPCs 79 | To install the auto-stop-idle solution in an internet-free VPC configuration, you can use Amazon S3 and S3 VPC endpoints to download the auto-stop-idle Python package. In addition, you will need to configure SageMaker API VPC endpoints for the DeleteApp() operation. 80 | 81 | The following instructions describe how to modify the lifecycle configuration to support internet-free VPC configurations: 82 | 83 | 1. Download and extract the auto-stop-idle solution tarball: 84 | 85 | ``` 86 | ASI_VERSION=0.3.1 87 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/jupyterlab-lccs-$ASI_VERSION.tar.gz 88 | tar -xvzf jupyterlab-lccs-$ASI_VERSION.tar.gz 89 | ``` 90 | 91 | 2. Download and copy the auto-stop-idle Python package to a location of your choice in Amazon S3. The execution role associated with the Studio domain or user profiles must have IAM policies that allow read access to that S3 location. 92 | 93 | ``` 94 | cd auto-stop-idle 95 | 96 | PYTHON_PACKAGE=sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz 97 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE 98 | aws s3 cp $PYTHON_PACKAGE s3://// 99 | ``` 100 | 101 | 3. Edit the `on-start.sh` file and replace line 56 with: 102 | 103 | ``` 104 | sudo aws s3 cp s3:////$PYTHON_PACKAGE /var/tmp/ 105 | ``` 106 | 107 | 4. 
Create the LCC and attach to the Studio domain: 108 | 109 | ``` 110 | REGION= 111 | DOMAIN_ID= 112 | ACCOUNT_ID= 113 | LCC_NAME=auto-stop-idle 114 | LCC_CONTENT=`openssl base64 -A -in on-start.sh` 115 | 116 | aws sagemaker create-studio-lifecycle-config \ 117 | --studio-lifecycle-config-name $LCC_NAME \ 118 | --studio-lifecycle-config-content $LCC_CONTENT \ 119 | --studio-lifecycle-config-app-type JupyterLab \ 120 | --query 'StudioLifecycleConfigArn' 121 | 122 | aws sagemaker update-domain \ 123 | --region $REGION \ 124 | --domain-id $DOMAIN_ID \ 125 | --default-user-settings \ 126 | "{ 127 | \"JupyterLabAppSettings\": { 128 | \"DefaultResourceSpec\": { 129 | \"LifecycleConfigArn\": \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\", 130 | \"InstanceType\": \"ml.t3.medium\" 131 | }, 132 | \"LifecycleConfigArns\": [ 133 | \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\" 134 | ] 135 | } 136 | }" 137 | ``` 138 | -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/on-start.sh: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | #!/bin/bash 5 | set -eux 6 | ASI_VERSION=0.3.1 7 | 8 | # OVERVIEW 9 | # This script stops a SageMaker Studio JupyterLab app, once it's idle for more than X seconds, based on IDLE_TIME_IN_SECONDS configuration. 10 | # Note that this script will fail if either condition is not met: 11 | # 1. The JupyterLab app has internet connectivity to fetch the autostop idle Python package 12 | # 2. The Studio Domain or User Profile execution role has permissions to SageMaker:DeleteApp to delete the JupyterLab app 13 | 14 | # User variables [update as needed] 15 | IDLE_TIME_IN_SECONDS=3600 # The max time (in seconds) the JupyterLab app can stay idle before being terminated. 16 | 17 | # User variables - advanced [update only if needed] 18 | IGNORE_CONNECTIONS=True # Set to False if you want to consider idle JL sessions with active connections as not idle. 19 | SKIP_TERMINALS=False # Set to True if you want to skip any idleness check on Jupyter terminals. 20 | 21 | # System variables [do not change if not needed] 22 | JL_HOSTNAME=0.0.0.0 23 | JL_PORT=8888 24 | JL_BASE_URL=/jupyterlab/default/ 25 | CONDA_HOME=/opt/conda/bin 26 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs. 27 | SOLUTION_DIR=/var/tmp/auto-stop-idle # Do not use /home/sagemaker-user 28 | STATE_FILE=$SOLUTION_DIR/auto_stop_idle.st 29 | PYTHON_PACKAGE=sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz 30 | PYTHON_SCRIPT_PATH=$SOLUTION_DIR/sagemaker_studio_jlab_auto_stop_idle/auto_stop_idle.py 31 | 32 | # Issue - https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/issues/12 33 | # SM Distribution image 1.6 is not starting cron service by default https://github.com/aws/sagemaker-distribution/issues/354 34 | 35 | # Check if cron needs to be installed ## Handle scenario where script exiting("set -eux") due to non-zero return code by adding true command. 36 | status="$(dpkg-query -W --showformat='${db:Status-Status}' "cron" 2>&1)" || true 37 | if [ ! $? = 0 ] || [ ! "$status" = installed ]; then 38 | # Fixing invoke-rc.d: policy-rc.d denied execution of restart. 39 | sudo /bin/bash -c "echo '#!/bin/sh 40 | exit 0' > /usr/sbin/policy-rc.d" 41 | 42 | # Installing cron. 
43 | echo "Installing cron..." 44 | sudo apt install cron 45 | else 46 | echo "Package cron is already installed." 47 | # start/restart the service. 48 | sudo service cron restart 49 | fi 50 | 51 | # Creating solution directory. 52 | sudo mkdir -p $SOLUTION_DIR 53 | 54 | # Downloading autostop idle Python package. 55 | echo "Downloading autostop idle Python package..." 56 | curl -LO --output-dir /var/tmp/ https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE 57 | sudo $CONDA_HOME/pip install -U -t $SOLUTION_DIR /var/tmp/$PYTHON_PACKAGE 58 | 59 | # Setting container credential URI variable to /etc/environment to make it available to cron 60 | sudo /bin/bash -c "echo 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI' >> /etc/environment" 61 | 62 | # Add script to crontab for root. 63 | echo "Adding autostop idle Python script to crontab..." 64 | echo "*/2 * * * * /bin/bash -ic '$CONDA_HOME/python $PYTHON_SCRIPT_PATH --idle-time $IDLE_TIME_IN_SECONDS --hostname $JL_HOSTNAME \ 65 | --port $JL_PORT --base-url $JL_BASE_URL --ignore-connections $IGNORE_CONNECTIONS \ 66 | --skip-terminals $SKIP_TERMINALS --state-file-path $STATE_FILE >> $LOG_FILE'" | sudo crontab - 67 | -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/python-package/.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/python-package/setup.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | 3 | from glob import glob 4 | import os 5 | from os.path import basename 6 | from os.path import splitext 7 | 8 | from setuptools import find_packages, setup 9 | from distutils.util import convert_path 10 | 11 | main_ns = {} 12 | ver_path = convert_path('src/sagemaker_studio_jlab_auto_stop_idle/version.py') 13 | with open(ver_path) as ver_file: 14 | exec(ver_file.read(), main_ns) 15 | 16 | setup( 17 | name='sagemaker_studio_jlab_auto_stop_idle', 18 | version=main_ns['__version__'], 19 | description='Auto-stops idle SageMaker Studio JupyterLab applications.', 20 | 21 | packages=find_packages(where='src', exclude=('test',)), 22 | package_dir={'': 'src'}, 23 | py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')], 24 | 25 | author='Amazon Web Services', 26 | url='https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/tree/main/jupyterlab/scripts/auto-stop-idle', 27 | license='MIT-0', 28 | 29 | classifiers=[ 30 | "Development Status :: 5 - Production/Stable", 31 | "Intended Audience :: Developers", 32 | "Natural Language :: English", 33 | "License :: OSI Approved :: MIT-0", 34 | "Programming Language :: Python", 35 | 'Programming Language :: Python :: 3.9', 36 | 'Programming Language :: Python :: 3.10' 37 | ], 38 | 39 | install_requires=[], 40 | extras_require={ 41 | }, 42 | ) 
-------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/__init__.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from sagemaker_studio_jlab_auto_stop_idle.version import __version__ -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/auto_stop_idle.py: -------------------------------------------------------------------------------- 1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: MIT-0 3 | 4 | from datetime import datetime 5 | import os 6 | import getopt, sys 7 | import json 8 | import boto3 9 | import botocore 10 | import requests 11 | 12 | from urllib.parse import urljoin 13 | 14 | DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fz" 15 | 16 | def log_message(message): 17 | """ 18 | Logs a message. 19 | """ 20 | print(f"{datetime.now().strftime(DATE_FORMAT)} - {message}") 21 | 22 | def get_json_medatada(): 23 | """ 24 | Gets the metadata of the current instance, which include Studio domain identifier and app information. 25 | """ 26 | metadata_path = '/opt/ml/metadata/resource-metadata.json' 27 | with open(metadata_path, 'r') as metadata: 28 | json_metadata = json.load(metadata) 29 | return json_metadata 30 | 31 | def delete_app(): 32 | """ 33 | Deletes the JupyterLab app. 34 | """ 35 | metadata = get_json_medatada(); 36 | domain_id = metadata["DomainId"] 37 | app_type = metadata["AppType"] 38 | app_name = metadata["ResourceName"] 39 | space_name = metadata["SpaceName"] 40 | resource_arn = metadata["ResourceArn"] 41 | aws_region = resource_arn.split(":")[3] 42 | 43 | log_message(f"[auto-stop-idle] - Deleting app {app_type}-{app_name}. Domain ID: {domain_id}. Space name: {space_name}. Resource ARN: {resource_arn}.") 44 | 45 | try: 46 | client = boto3.client('sagemaker', region_name=aws_region) 47 | client.delete_app( 48 | DomainId=domain_id, 49 | AppType=app_type, 50 | AppName=app_name, 51 | SpaceName=space_name 52 | ) 53 | except botocore.exceptions.ClientError as client_error: 54 | error_code = client_error.response['Error']['Code'] 55 | if error_code == 'AccessDeniedException' or error_code == 'NotAuthorized': 56 | log_message(f"[auto-stop-idle] - The current execution role does not allow executing the DeleteApp() operation. Please check IAM policy configurations. Exception: {client_error}") 57 | else: 58 | log_message(f"[auto-stop-idle] - An error accurred while deleting app. Exception: {client_error}") 59 | except Exception as e: 60 | log_message(f"[auto-stop-idle] - An error accurred while deleting app. Exception: {e}") 61 | 62 | def create_state_file(state_file_path): 63 | """ 64 | Creates a file to store the state for the auto-stop-idle script, only if it does not exist. 65 | Stores the current date in the state file. 66 | """ 67 | if not os.path.exists(state_file_path): 68 | with open(state_file_path, 'w') as f: 69 | current_date_as_string = datetime.now().strftime(DATE_FORMAT) 70 | f.write(current_date_as_string) 71 | 72 | def get_state_file_contents(state_file_path): 73 | """ 74 | Gets the contents of the state file, consisting of the computed last modified date. 
75 | """ 76 | with open(state_file_path) as f: 77 | contents = f.readline() 78 | return contents 79 | 80 | def update_state_file(sessions, terminals, state_file_path): 81 | """ 82 | Updates the state file with the max last_activity date found in sessions or terminals 83 | """ 84 | max_last_activity = datetime.min 85 | if sessions is not None and len(sessions) > 0: 86 | for notebook_session in sessions: 87 | notebook_kernel = notebook_session["kernel"] 88 | last_activity = notebook_kernel["last_activity"] 89 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT) 90 | if last_activity_date > max_last_activity: 91 | max_last_activity = last_activity_date 92 | if terminals is not None and len(terminals) > 0: 93 | for terminal in terminals: 94 | last_activity = terminal["last_activity"] 95 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT) 96 | if last_activity_date > max_last_activity: 97 | max_last_activity = last_activity_date 98 | 99 | if max_last_activity > datetime.min: 100 | with open(state_file_path, 'w') as f: 101 | date_as_string = max_last_activity.strftime(DATE_FORMAT) 102 | log_message(f"[auto-stop-idle] - Updating state with last activity date {date_as_string}.") 103 | f.write(date_as_string) 104 | 105 | def is_idle(idle_time, last_activity): 106 | """ 107 | Compares the last_activity date with the current date, to check if idle_time has elapsed. 108 | """ 109 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT) 110 | return ((datetime.now() - last_activity_date).total_seconds() > idle_time) 111 | 112 | def get_terminals(app_url, base_url): 113 | """ 114 | Gets the running terminals. 115 | """ 116 | api_url = urljoin(urljoin(app_url, base_url), "api/terminals") 117 | response = requests.get(api_url, verify=False) 118 | return response.json() 119 | 120 | def get_sessions(app_url, base_url): 121 | """ 122 | Gets the running notebook sessions. 123 | """ 124 | api_url = urljoin(urljoin(app_url, base_url), "api/sessions") 125 | response = requests.get(api_url, verify=False) 126 | return response.json() 127 | 128 | def check_idle(app_url, base_url, idle_time, ignore_connections, skip_terminals, state_file_path): 129 | """ 130 | Checks if all terminals or notebook sessions are idle. 131 | """ 132 | idle = True 133 | 134 | # Create state file. 135 | create_state_file(state_file_path) 136 | 137 | terminals = get_terminals(app_url, base_url) 138 | sessions = get_sessions(app_url, base_url) 139 | 140 | terminal_count = len(terminals) if terminals is not None else 0 141 | session_count = len(sessions) if sessions is not None else 0 142 | 143 | # Check sessions. 144 | if session_count > 0: 145 | for notebook_session in sessions: 146 | session_name = notebook_session["name"] 147 | session_id = notebook_session["id"] 148 | 149 | notebook_kernel = notebook_session["kernel"] 150 | kernel_name = notebook_kernel["name"] 151 | kernel_id = notebook_kernel["id"] 152 | 153 | if notebook_kernel["execution_state"] == "idle": 154 | connections = int(notebook_kernel["connections"]) 155 | last_activity = notebook_kernel["last_activity"] 156 | 157 | if ignore_connections or connections <= 0: 158 | idle = is_idle(idle_time, last_activity) 159 | if not idle: 160 | reason = f"kernel not idle based on last activity. 
Last activity: {last_activity}" 161 | else: 162 | reason = "kernel has active connections" 163 | idle = False 164 | else: 165 | reason = "kernel execution state is not idle" 166 | idle = False 167 | 168 | if not idle: 169 | log_message(f"[auto-stop-idle] - Notebook session {session_name} (ID: {session_id}), with kernel {kernel_name} (ID: {kernel_id}) is not idle. Reason: {reason}.") 170 | break 171 | 172 | # Check terminals. 173 | if idle and terminal_count > 0 and not skip_terminals: 174 | for terminal in terminals: 175 | terminal_name = terminal["name"] 176 | last_activity = terminal["last_activity"] 177 | 178 | idle = is_idle(idle_time, last_activity) 179 | if not idle: 180 | reason = f"terminal not idle based on last activity. Last activity: {last_activity}" 181 | 182 | if not idle: 183 | log_message(f"[auto-stop-idle] - Terminal {terminal_name} is not idle. Reason: {reason}.") 184 | break 185 | 186 | # Check last activity date from state. 187 | if idle and session_count <= 0 and (terminal_count <= 0 or skip_terminals): 188 | state_file_contents = get_state_file_contents(state_file_path) 189 | idle = is_idle(idle_time, state_file_contents) 190 | if not idle: 191 | log_message(f"[auto-stop-idle] - App not idle based on last activity state. Last activity: {state_file_contents}") 192 | 193 | # Update state file. 194 | update_state_file(sessions, terminals, state_file_path) 195 | 196 | return idle 197 | 198 | if __name__ == '__main__': 199 | 200 | # Usage info 201 | usage_info = """Usage: 202 | This scripts checks if Studio JupyterLab is idle for X seconds. If it does, it'll stop it: 203 | python auto_stop_idle.py --idle-time [--port ] [--hostname ] 204 | [--base-url ] [--ignore-connections ] [--skip-terminals ] [--state-file-path ] 205 | Type "python auto_stop_idle.py -h" for the available options. 206 | """ 207 | # Help info 208 | help_info = """ -t, --idle-time 209 | idle time in seconds 210 | -p, --port 211 | jupyter port 212 | -k, --hostname 213 | jupyter hostname 214 | -u, --base-url 215 | jupyter base URL 216 | -c --ignore-connections 217 | ignoring users connected to idle notebook sessions 218 | -s --skip-terminals 219 | skip checks on terminals 220 | -a --state-file-path 221 | path to a file where to save the state 222 | -h, --help 223 | help information 224 | """ 225 | 226 | # Setting default values. 
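    # These defaults mirror the values set by on-start.sh (idle time 3600s, host 0.0.0.0, port 8888,
    # base URL /jupyterlab/default/, state file under /var/tmp/auto-stop-idle) and apply only when the
    # corresponding command-line flags are omitted.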
227 | idle_time = 3600 228 | hostname = "0.0.0.0" 229 | base_url = "/jupyterlab/default/" 230 | port = 8888 231 | ignore_connections = True 232 | skip_terminals = False 233 | state_file_path = "/var/tmp/auto-stop-idle/auto_stop_idle.st" 234 | 235 | # Read in command-line parameters 236 | try: 237 | opts, args = getopt.getopt(sys.argv[1:], "ht:p:k:u:c:s:a:", ["help","idle-time=","port=","hostname=","base-url=","ignore-connections=", "skip-terminals=", "state-file-path="]) 238 | if len(opts) == 0: 239 | raise getopt.GetoptError("No input parameters!") 240 | for opt, arg in opts: 241 | if opt in ("-h", "--help"): 242 | print(help_info) 243 | exit(0) 244 | elif opt in ("-t", "--idle-time"): 245 | idle_time = int(arg) 246 | elif opt in ("-p", "--port"): 247 | port = str(arg) 248 | elif opt in ("-k", "--hostname"): 249 | hostname = str(arg) 250 | elif opt in ("-u", "--base-url"): 251 | base_url = str(arg) 252 | elif opt in ("-c", "--ignore-connections"): 253 | ignore_connections = False if arg == "False" else True 254 | elif opt in ("-s", "--skip-terminals"): 255 | skip_terminals = True if arg == "True" else False 256 | elif opt in ("-a", "--state-file-path"): 257 | state_file_path = str(arg) 258 | except getopt.GetoptError: 259 | print(usage_info) 260 | exit(1) 261 | 262 | try: 263 | if not idle_time: 264 | log_message("[auto-stop-idle] - Missing '-t' or '--idle_time'") 265 | exit(2) 266 | else: 267 | app_url = f"http://{hostname}:{port}" 268 | idle = check_idle(app_url, base_url, idle_time, ignore_connections, skip_terminals, state_file_path) 269 | 270 | if idle: 271 | log_message("[auto-stop-idle] - Detected JupyterLab idle state. Stopping notebook.") 272 | delete_app() 273 | else: 274 | log_message("[auto-stop-idle] - JupyterLab is not idle. Passing check.") 275 | exit(0) 276 | except Exception as e: 277 | log_message(f"[auto-stop-idle] - An error accurred while checking idle state. Exception: {e}") 278 | -------------------------------------------------------------------------------- /jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/version.py: -------------------------------------------------------------------------------- 1 | __version__ = "0.3.1" --------------------------------------------------------------------------------
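# Note: this version is read by setup.py at build time; it should presumably stay in sync with the
# ASI_VERSION value pinned in on-start.sh, which downloads sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz.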