├── resources ├── fairseq.png ├── pytorch.png ├── wenet.png ├── deepspark.png ├── deepspeed.png ├── mindspore.png ├── colossal-ai.png ├── deepmodeling.png ├── mmdetection.png ├── paddlepaddle.png ├── tensorflow.png └── llama-factory.png ├── evaluation └── Iluvatar │ ├── assets │ ├── mdb_index.png │ ├── mdb_history.png │ ├── mdb_lei_da_tu.png │ ├── mdb_qu_mian_tu.png │ ├── mdb_user_login.png │ ├── mdb_user_reg1.png │ ├── mdb_user_reg2.jpg │ ├── mdb_user_reg2.png │ ├── mdb_model_list_1.png │ ├── mdb_run_one_task.png │ ├── mdb_run_compare_tasks_1.png │ ├── mdb_run_compare_tasks_2.png │ ├── mdb_run_compare_tasks_3.png │ └── mdb_run_compare_tasks_4.png │ ├── Mdims-benchmark.md │ ├── six_dimension_howto.md │ └── six_dimension_howto_example.md ├── CODE_OF_CONDUCT_cn.md ├── .gitignore ├── RELEASE.md ├── CODE_OF_CONDUCT.md ├── README.md ├── LICENSE └── README_en.md /resources/fairseq.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/fairseq.png -------------------------------------------------------------------------------- /resources/pytorch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/pytorch.png -------------------------------------------------------------------------------- /resources/wenet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/wenet.png -------------------------------------------------------------------------------- /resources/deepspark.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/deepspark.png -------------------------------------------------------------------------------- /resources/deepspeed.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/deepspeed.png -------------------------------------------------------------------------------- /resources/mindspore.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/mindspore.png -------------------------------------------------------------------------------- /resources/colossal-ai.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/colossal-ai.png -------------------------------------------------------------------------------- /resources/deepmodeling.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/deepmodeling.png -------------------------------------------------------------------------------- /resources/mmdetection.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/mmdetection.png -------------------------------------------------------------------------------- /resources/paddlepaddle.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/paddlepaddle.png -------------------------------------------------------------------------------- /resources/tensorflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/tensorflow.png -------------------------------------------------------------------------------- /resources/llama-factory.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/resources/llama-factory.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_index.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_index.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_history.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_history.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_lei_da_tu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_lei_da_tu.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_qu_mian_tu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_qu_mian_tu.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_user_login.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_user_login.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_user_reg1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_user_reg1.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_user_reg2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_user_reg2.jpg -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_user_reg2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_user_reg2.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_model_list_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_model_list_1.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_run_one_task.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_run_one_task.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_run_compare_tasks_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_run_compare_tasks_1.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_run_compare_tasks_2.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_run_compare_tasks_2.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_run_compare_tasks_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_run_compare_tasks_3.png -------------------------------------------------------------------------------- /evaluation/Iluvatar/assets/mdb_run_compare_tasks_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Deep-Spark/DeepSpark/HEAD/evaluation/Iluvatar/assets/mdb_run_compare_tasks_4.png -------------------------------------------------------------------------------- /CODE_OF_CONDUCT_cn.md: -------------------------------------------------------------------------------- 1 | 2 | # 贡献者公约 3 | 4 | ## 我们的承诺 5 | 6 | 身为社区成员、贡献者和领袖,我们承诺使社区参与者不受骚扰,无论其年龄、体型、可见或不可见的缺陷、族裔、性征、性别认同和表达、经验水平、教育程度、社会与经济地位、国籍、相貌、种族、种姓、肤色、宗教信仰、性倾向或性取向如何。 7 | 8 | 我们承诺以有助于建立开放、友善、多样化、包容、健康社区的方式行事和互动。 9 | 10 | ## 我们的准则 11 | 12 | 有助于为我们的社区创造积极环境的行为例子包括但不限于: 13 | 14 | * 表现出对他人的同情和善意 15 | * 尊重不同的主张、观点和感受 16 | * 提出和大方接受建设性意见 17 | * 承担责任并向受我们错误影响的人道歉 18 | * 注重社区共同诉求,而非个人得失 19 | 20 | 不当行为例子包括: 21 | 22 | * 使用情色化的语言或图像,及性引诱或挑逗 23 | * 嘲弄、侮辱或诋毁性评论,以及人身或政治攻击 24 | * 公开或私下的骚扰行为 25 | * 未经他人明确许可,公布他人的私人信息,如物理或电子邮件地址 26 | * 其他有理由认定为违反职业操守的不当行为 27 | 28 | ## 责任和权力 29 | 30 | 社区领袖有责任解释和落实我们所认可的行为准则,并妥善公正地对他们认为不当、威胁、冒犯或有害的任何行为采取纠正措施。 31 | 32 | 社区领袖有权力和责任删除、编辑或拒绝与本行为准则不相符的评论(comment)、提交(commits)、代码、维基(wiki)编辑、议题(issues)或其他贡献,并在适当时机告知采取措施的理由。 33 | 34 | ## 适用范围 35 | 36 | 本行为准则适用于所有社区场合,也适用于在公共场所代表社区时的个人。 37 | 38 | 代表社区的情形包括使用官方电子邮件地址、通过官方社交媒体帐户发帖或在线上或线下活动中担任指定代表。 39 | 40 | ## 监督 41 | 42 | 辱骂、骚扰或其他不可接受的行为可通过conduct@deepspark.org.cn向负责监督的社区领袖报告。 43 | 所有投诉都将得到及时和公平的审查和调查。 44 | 45 | 所有社区领袖都有义务尊重任何事件报告者的隐私和安全。 46 | 47 | ## 处理方针 48 | 49 | 
社区领袖将遵循下列社区处理方针来明确他们所认定违反本行为准则的行为的处理方式: 50 | 51 | ### 1. 纠正 52 | 53 | **社区影响**:使用不恰当的语言或其他在社区中被认定为不符合职业道德或不受欢迎的行为。 54 | 55 | **处理意见**:由社区领袖发出非公开的书面警告,明确说明违规行为的性质,并解释举止如何不妥。或将要求公开道歉。 56 | 57 | ### 2. 警告 58 | 59 | **社区影响**:单个或一系列违规行为。 60 | 61 | **处理意见**:警告,并对持续的违规行为进行处理。在指定时间内,不得与相关人员互动,包括主动与行为准则执行者互动。这包括避免在社区场所和外部渠道中的互动。违反这些条款可能会导致临时或永久封禁。 62 | 63 | ### 3. 临时封禁 64 | 65 | **社区影响**: 严重违反社区准则,包括持续的不当行为。 66 | 67 | **处理意见**: 在指定时间内,暂时禁止与社区进行任何形式的互动或公开交流。在此期间,不得与相关人员进行公开或私下互动,包括主动与行为准则执行者互动。违反这些条款可能会导致永久封禁。 68 | 69 | ### 4. 永久封禁 70 | 71 | **社区影响**:表现出违反社区准则的行为模式,包括持续的不当行为、骚扰个人或攻击或贬低某个类别的个体。 72 | 73 | **处理意见**:永久禁止在社区内进行任何形式的公开互动。 74 | 75 | ## 参见 76 | 77 | 本行为准则改编自 [Contributor Covenant][homepage] 2.1 版, 参见 [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]。 78 | 79 | 社区处理方针灵感来源于 [Mozilla's code of conduct enforcement ladder][Mozilla CoC]。 80 | 81 | 有关本行为准则的常见问题的答案,参见 [https://www.contributor-covenant.org/faq][FAQ]。 82 | 其他语言翻译参见 [https://www.contributor-covenant.org/translations][translations]。 83 | 84 | [homepage]: https://www.contributor-covenant.org 85 | [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html 86 | [Mozilla CoC]: https://github.com/mozilla/diversity 87 | [FAQ]: https://www.contributor-covenant.org/faq 88 | [translations]: https://www.contributor-covenant.org/translations 89 | 90 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # 
Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /evaluation/Iluvatar/Mdims-benchmark.md: -------------------------------------------------------------------------------- 1 | # 多维度评测系统使用指南 2 | 3 | ## 简介 4 | 5 | [多维度评测系统](https://mdb.deepspark.org.cn:8086)是一款基于[多维度评测体系标准](https://gitee.com/deep-spark/deepspark/blob/master/README.md#%E8%AF%84%E6%B5%8B%E4%BD%93%E7%B3%BB)开发的线上评测工具,通过在同等条件下对不同加速卡在多个维度(如速度,准确度,线性度,功耗效率,显存效率,稳定性)上进行模型评测、指标收集和多维度雷达图展示,方便用户更加全面地对比评估GPU加速卡的综合能力。 6 | 多维度评测系统的训练、推理评测模型分别来自[DeepSparkHub](https://gitee.com/deep-spark/deepsparkhub)训练模型仓库和[DeepSparkInference](https://gitee.com/deep-spark/deepsparkinference)推理模型仓库: 7 | 8 | ![mdb_model_list_1](assets/mdb_model_list_1.png) 9 | 10 | 多维度评测系统未来还会持续演进,丰富模型类别和添加更多评测模型,欢迎各位第一时间使用和体验。如有任何问题或建议,可随时在DeepSpark开源社区的[Issue标签页](https://gitee.com/deep-spark/deepspark/issues)提交Issue。 11 | 12 | ## 用户注册 13 | 14 | ### Step 1:新用户首次使用多维度评测系统,需要先进行用户注册 15 | 16 | ![mdb_user_reg1](assets/mdb_user_reg1.png) 17 | 18 | ### Step 2:填写如下“用户注册”表单,建议勾选“任务状态通知”复选框 19 | 20 | ![mdb_user_reg2](assets/mdb_user_reg2.png) 21 | 22 | ### Step 3:完成用户注册后可登录该用户体验多维度评测系统 23 | 24 | ![mdb_user_login](assets/mdb_user_login.png) 25 | 26 | ### Step 4:多维度评测系统主界面如下所示 27 | 28 | - 左上导航栏的`“?”`问号链接到本帮助文档 29 | - 左中“模型集合”展示了当前支持的模型列表,第一期只支持训练模型,可通过上面的“文本框”进行关键字检索 30 | - 左下“功能导航”表格提供如下功能 31 | - 对比测试:触发单个模型在两款GPU加速卡上的对比测试 32 | - 
运行状态:可查看当前正在运行或排队的任务列表 33 | - 终止运行:可一次性终止当前用户正在运行的任务 34 | - 雷达图:查看单个模型的六维度雷达图(速度,稳定性,线性度,功耗效率,准确度,显存效率) 35 | - 曲面图:综合查看两款GPU加速卡在各个模型类别下表现的曲面图 36 | - 历史记录:查看已触发的历史任务日志 37 | 38 | ![mdb_index](assets/mdb_index.png) 39 | 40 | ## 评测任务 41 | 42 | 多维度评测系统支持“指定GPU卡单次评测”和“两款GPU卡对比测试”,评测完成的结果将汇总到数据库中,供雷达图和曲面图展示使用。 43 | 44 | > 注:在资源有限的情况下,为保障社区用户的使用体验,触发的评测任务将自动进行排队,且有相应的任务最大配额限制。 45 | 46 | ### 指定GPU卡单次评测 47 | 48 | #### 左中导航栏定位到要运行的模型,并在右上进行参数配置后,点击“执行”按钮,下发评测任务 49 | 50 | ![mdb_run_one_task](assets/mdb_run_one_task.png) 51 | 52 | ### 两款GPU卡对比测试 53 | 54 | #### Step 1:点击左下导航栏的“对比测试”按钮,并在右上进行模型选择和参数配置后,点击“对比测试”,下发评测任务 55 | 56 | ![mdb_run_compare_tasks_1](assets/mdb_run_compare_tasks_1.png) 57 | 58 | #### Step 2:对比测试任务下发后将显示当前运行脚本列表,点击右侧“终止”按钮可终止指定任务的运行 59 | 60 | ![mdb_run_compare_tasks_2](assets/mdb_run_compare_tasks_2.png) 61 | 62 | #### Step 3:点击左侧单个运行脚本名称,将直接跳转到该任务的日志展示页面 63 | 64 | ![mdb_run_compare_tasks_3](assets/mdb_run_compare_tasks_3.png) 65 | 66 | #### Step 4: 点击左下的“终止运行”按钮将终止当前用户的所有在运行任务 67 | 68 | ![mdb_run_compare_tasks_4](assets/mdb_run_compare_tasks_4.png) 69 | 70 | #### Step 5: 点击左下的“历史记录”按钮,可以查看当前用户的所有历史任务运行记录 71 | 72 | ![mdb_history](assets/mdb_history.png) 73 | 74 | ## 结果展示 75 | 76 | ### 雷达图 77 | 78 | #### 点击左下导航栏“雷达图”按钮,在右上进行模型选择和参数配置后,点击“查看雷达图”,将展示六维度雷达图 79 | 80 | ![mdb_lei_da_tu](assets/mdb_lei_da_tu.png) 81 | 82 | ### 曲面图 83 | 84 | #### 点击左下导航栏“曲面图”按钮,在右上进行模型选择和参数配置后,点击“查看曲面图”,将展示两款GPU加速卡在各个模型类别下表现的曲面图 85 | 86 | ![mdb_qu_mian_tu](assets/mdb_qu_mian_tu.png) 87 | 88 | ## 链接 89 | 90 | - 社区网站: 91 | 92 | - 工具链接: 93 | 94 | - 联系我们: 95 | -------------------------------------------------------------------------------- /evaluation/Iluvatar/six_dimension_howto.md: -------------------------------------------------------------------------------- 1 | # 天垓100(BI-V100)六维度评测方法 2 | 3 | ## 前置条件 4 | 5 | 1. 获取天数智芯天垓100(BI-V100)算力。 6 | 7 | DeepSpark社区客户想要使用天垓100加速卡,可以通过填写[表单](https://order.iluvatar.com/)进行申请。 8 | 9 | 2. 
安装天数智芯软件栈。 10 | 11 | ## 使用工具 12 | 13 | - ixSMI:GPU实时状态检测工具,包含在天数智芯软件栈中,当您安装好天数智芯软件栈后即可通过**ixsmi**命令进行调用。 14 | - DeepSpark模型训练脚本:脚本会打印训练时每秒处理的单位样本数量、模型精度值等数据。 15 | 16 | ## 六维度数据计算方法 17 | 18 | ### 速度 19 | 20 | 定义:模型稳定训练时每秒处理的单位样本数量。 21 | 22 | 数据来源:DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的每秒处理的单位样本数量。 23 | 24 | 计算方法:指定迭代轮次共5次,去掉5次中速度的最高值和最低值,取中间3次速度值的平均值。如取稳定训练的第6到第10个epoch共5个epoch,最高是第6个、最低是第10个epoch,则取第7个、第8个和第9个epoch的值相加除以3。 25 | 26 | ### 准确性 27 | 28 | 定义:模型收敛的精度值。 29 | 30 | 数据来源:DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的训练准确度。 31 | 32 | 计算方法:当模型的loss随训练迭代变化不再显著时,评估模型是否达到判定为收敛所要求的准确度指标,记录此时的精度值。 33 | 34 | ### 线性度 35 | 36 | 定义:模型集群规模化训练的scalability,即算力的线性扩展性能。 37 | 38 | 数据来源:DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的每秒处理的单位样本数量。 39 | 40 | 计算方法:线性度具体还可以分为如下两种: 41 | 42 | - 卡线性度:在一台服务器上使用多张BI-V100训练时每秒处理的单位样本数量(即速度)除以多卡的卡数,再对比使用1张BI-V100训练时的速度。例如,使用8张BI-V100时的训练速度除以卡的倍数(即8),再除以使用1张BI-V100时的训练速度,得到8卡对单卡的线性度。 43 | 44 | - 节点线性度:在每台服务器(即节点)安装相同数量的BI-V100卡前提下,使用多台服务器训练的每秒处理的单位样本数量(即速度)除以多台服务器的台数,再对比使用1台服务器训练时的速度。例如,使用4台服务器(每台都分别安装了8张BI-V100)时的训练速度,除以服务器的总台数(即节点数4),再除以使用1台服务器时的训练速度,得到4节点对单节点的线性度。 45 | 46 | ### 功耗 47 | 48 | 定义:模型稳定训练时实际消耗的GPU平均功耗。 49 | 50 | 数据来源:GPU实时状态检测工具ixSMI。 51 | 52 | 计算方法:模型稳定训练时(通常是排除训练刚开始的数据处理和预热阶段),在终端输入以下命令,ixSMI工具会每隔5秒显示此时BI-V100的功耗,单位为W。对多次采样数据取平均值,作为模型训练的功耗。 53 | 54 | - 如您环境中仅安装了1张BI-V100,输入以下命令: 55 | 56 | ``` 57 | $ ixsmi -q -l | grep -E 'Power Draw' 58 | ``` 59 | 60 | - 如您环境中安装了多张BI-V100(可使用**ixsmi**命令获取多张卡的概要信息,包括index信息),输入以下命令,使用${gpu_index}指定具体读取哪张BI-V100的信息,注意index从0开始: 61 | 62 | ``` 63 | $ ixsmi -q -i ${gpu_index} -l | grep -E 'Power Draw' 64 | ``` 65 | 66 | 例如,以下命令可以得到index为0的BI-V100的功耗信息: 67 | 68 | ``` 69 | $ ixsmi -q -i 0 -l | grep -E 'Power Draw' 70 | ``` 71 | 72 | 73 | ### 显存占用 74 | 75 | 定义:模型稳定训练时实际消耗的GPU平均显存占用量。 76 | 77 | 数据来源:GPU实时状态检测工具ixSMI。 78 | 79 | 计算方法:模型稳定训练时(通常是排除训练刚开始的数据处理和预热阶段),在终端输入以下命令,ixSMI工具会每隔5秒显示此时BI-V100显存的使用量,单位为MiB。对多次采样数据取平均值,作为模型训练的显存占用量。 80 | 81 | - 如您环境中仅安装了1张BI-V100,输入以下命令: 82 | 83 | ``` 84 | $ ixsmi -q -l | grep 
'Used.[^G]' 85 | ``` 86 | 87 | - 如您环境中安装了多张BI-V100(可使用**ixsmi**命令获取多张卡的概要信息,包括index信息),输入以下命令,使用${gpu_index}指定具体读取哪张BI-V100的信息,注意index从0开始: 88 | 89 | ``` 90 | $ ixsmi -q -i ${gpu_index} -l | grep 'Used.[^G]' 91 | ``` 92 | 93 | 例如,以下命令可以得到index为0的BI-V100的显存占用量: 94 | 95 | ``` 96 | $ ixsmi -q -i 0 -l | grep 'Used.[^G]' 97 | ``` 98 | 99 | 100 | ### 稳定度 101 | 102 | 定义:取模型多次完整训练最终所达到的收敛值,比较多次收敛值的稳定程度。 103 | 104 | 数据来源:DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch结束时达到的准确度。 105 | 106 | 计算方法:模型采用5次完整训练,每次都最终达到标准收敛值,取5次收敛值的中值作为基准值,然后其余4次收敛值相对基准值的差值比例应分布在(-0.01,+0.01)的合理区间,此时达到满分1。当5个数据中有1次不在该范围内,稳定度则递减20%。 107 | 108 | ## 六维度数据计算示例 109 | 110 | [天垓100六维度评测方法示例:ResNet50](six_dimension_howto_example.md) 111 | -------------------------------------------------------------------------------- /RELEASE.md: -------------------------------------------------------------------------------- 1 | 2 | # DeepSpark Release Notes 3 | 4 | ## 25.09 Release Notes 5 | 6 | ### 特性和增强 7 | 8 | * DeepSpark 修复了README文档中的markdownlint问题。 9 | * DeepSparkHub 新增了10个大模型训练微调示例,其中7个为强化学习相关示例,使用了OpenRLHF和verl工具箱。详见[25.09版本日志](https://gitee.com/deep-spark/deepsparkhub/releases/tag/25.09)。 10 | * DeepSparkInference 新增了19个推理小模型示例,涵盖视觉分类,对象检测和语义分割等领域,并新增了11个大语言模型的推理示例,涉及FastDeploy、LMDeploy和vLLM等框架。详见[25.09版本日志](https://gitee.com/deep-spark/deepsparkinference/releases/tag/25.09)。 11 | 12 | ### 贡献者 13 | 14 | 感谢以下人员做出的贡献: 15 | 16 | 
YoungPeng,张文风,qiang.zhang,李一力,sanghui-ilu,honglyua,majorli6,吴永乐。 33 | 34 | 欢迎以任何形式向DeepSpark社区贡献。 35 | 36 | ## 25.03 Release Notes 37 | 38 | ### 特性和增强 39 | 40 | * 首页README文档添加了天数智算软件栈IXUCA的介绍。 41 | * DeepSparkHub 25.03版本新增了9个基于MoE-LLaVA,DeepSpeed和LLaMA-Factory的大语言模型的训练示例。具体详见DeepSparkHub 25.03版本日志。 42 | * DeepSparkInference 25.03版本新增了25个推理小模型,涵盖图片分类,对象检测和姿态估计等领域,并新增了11个基于vLLM的大语言模型的推理示例,其中6个为DeepSeek-R1蒸馏模型。具体详见DeepSparkInference 25.03版本日志。 43 | 44 | ### 贡献者 45 | 46 | 感谢以下人员做出的贡献: 47 | 48 | YoungPeng,xinchi.tian,xiaomei.wang,qiang.zhang,李一力,sanghui-ilu,honglyua,majorli6,吴永乐。 49 | 50 | 欢迎以任何形式向DeepSpark社区贡献。 51 | 52 | ## 24.12 Release Notes 53 | 54 | ### 特性和增强 55 | 56 | * DeepSparkHub 24.12版本新增了4个PyTorch模型,其中2个为Multimodal模型,同时新增了5个基于ColossalAI,Megatron-LM和LLaMA-Factory的大语言模型的训练示例。具体详见DeepSparkHub 24.12版本日志。 57 | * DeepSparkInference 24.12版本新增了24个推理小模型,涵盖图片分类,对象检测和文字识别等领域,并新增了9个基于vLLM和ixFormer的大语言模型的推理示例。具体详见DeepSparkInference 24.12版本日志。 58 | 59 | ### 贡献者 60 | 61 | 感谢以下人员做出的贡献: 62 | 63 | YoungPeng,xinchi.tian,xiaomei.wang,qiang.zhang,李一力,sanghui-ilu,honglyua,majorli6,吴永乐。 64 | 65 | 欢迎以任何形式向DeepSpark社区贡献。 66 | 67 | ## 24.09 Release Notes 68 | 69 | ### 特性和增强 70 | 71 | * DeepSparkHub 24.09版本新增了5个PyTorch模型,其中3个为Stable Diffusion模型,同时新增了4个基于DeepSpeed、Megatron DeepSpeed和Firefly的大语言模型的训练示例,并修复了一些模型执行步骤和版本兼容相关的问题。具体详见DeepSparkHub 24.09版本日志。 72 | * DeepSparkInference 24.09版本新增了29个推理模型示例和对ByteMLPerf工具箱的支持,涵盖计算机视觉,自然语言处理和语音识别等领域,同时新增了5个基于vLLM,TensorRT-LLM和Text Generation Inference的大语言模型的推理示例。具体详见DeepSparkInference 24.09版本日志 。 73 | 74 | ### 贡献者 75 | 76 | 感谢以下人员做出的贡献: 77 | 78 | majorli。 79 | 80 | 欢迎以任何形式向DeepSpark社区贡献。 81 | 82 | --- 83 | 84 | ## 24.06 Release Notes 85 | 86 | ### 特性和增强 87 | 88 | * DeepSparkHub 24.06版本新增了7个PyTorch训练模型和对OpenPCDet工具箱的支持,同时新增了8个基于DeepSpeed,Megatron DeepSpeed和Firefly的大语言模型的训练示例,并修复了一些模型执行步骤和版本兼容相关的问题。具体详见DeepSparkHub 24.06版本日志。 89 | * DeepSparkInference 24.06版本新增了31个推理模型示例,其中支持IGIE推理引擎的16个,支持IxRT推理引擎的15个,涵盖计算机视觉,自然语言处理等领域。同时新增了4个基于vLLM,TensorRT-LLM和Text 
Generation Inference的大语言模型的推理示例。具体详见DeepSparkInference 24.06版本日志 。 90 | 91 | ### 贡献者 92 | 93 | 感谢以下人员做出的贡献: 94 | 95 | majorli。 96 | 97 | 欢迎以任何形式向DeepSpark社区贡献。 98 | 99 | --- 100 | 101 | ## 24.03 Release Notes 102 | 103 | ### 特性和增强 104 | 105 | * DeepSparkHub 24.03版本新增了10个基于PyTorch和PaddlePaddle框架的算法模型,涉及计算机视觉、多模态领域。同时新增了大语言模型Megatron-DeepSpeed Llama-2-7B SFT和DeepSpeed Llama-2-7B Reward Model Finetuning的训练示例,并修复了一些模型执行步骤和版本兼容相关的问题。具体详见DeepSparkHub 24.03版本日志。 106 | 107 | ### 贡献者 108 | 109 | 感谢以下人员做出的贡献: 110 | 111 | majorli。 112 | 113 | 欢迎以任何形式向DeepSpark社区贡献。 114 | 115 | --- 116 | 117 | ## 23.12 Release Notes 118 | 119 | ### 特性和增强 120 | 121 | * DeepSparkHub 23.12版本新增了30个基于PyTorch和PaddlePaddle框架的算法模型,涉及计算机视觉、自然语言处理、语音识别、多模态、图神经网络、推荐和强化学习等领域。同时新增了基于分布式训练框架Megatron-DeepSpeed的大语言模型LLaMA2-7B的训练示例,并修复了一些模型执行步骤和版本兼容相关的问题。具体详见DeepSparkHub 23.12版本日志。 122 | 123 | ### 贡献者 124 | 125 | 感谢以下人员做出的贡献: 126 | 127 | majorli。 128 | 129 | 欢迎以任何形式向DeepSpark社区贡献。 130 | 131 | --- 132 | 133 | ## 23.09 Release Notes 134 | 135 | ### 特性和增强 136 | 137 | * DeepSparkHub新增了30个基于PyTorch/TensorFlow/MindSpore/PaddlePaddle的算法模型,涉及计算机视觉,图神经网络,自然语言处理,语音识别等领域。同时新增了基于分布式训练框架Colossal-AI和DeepSpeed的大语言模型LLaMA-7B和ChatGLM-6B的训练示例,以及基于深度学习分子动力学套件DeePMD-kit的水分子模型训练示例。同时修复了一些模型数据集和执行步骤相关的问题。具体详见DeepSparkHub 23.09版本日志 。 138 | 139 | ### 贡献者 140 | 141 | 感谢以下人员做出的贡献: 142 | 143 | majorli,tonychen。 144 | 145 | 欢迎以任何形式向DeepSpark社区贡献。 146 | 147 | --- 148 | 149 | ## 23.06 Release Notes 150 | 151 | ### 特性和增强 152 | 153 | * 添加了30个基于PyTorch框架的算法模型,新增了网络剪枝、自监督学习、知识蒸馏这3种模型类别。 154 | * 新增30个模型中有12个使用了开源工具箱: 155 | * Transformer,U2++ Conformer,Unified Conformer模型基于开源的WeNet工具箱。 156 | * ATSS,Cascade R-CNN,CornerNet,DCNV2,RepPoints模型基于开源的MMDetection工具箱。 157 | * BART,Convoluntional,RoBERTa,Transformer模型基于开源的Fairseq工具箱。 158 | 159 | ### 贡献者 160 | 161 | 感谢以下人员做出的贡献: 162 | 163 | majorli。 164 | 165 | 欢迎以任何形式向DeepSpark社区贡献。 166 | 167 | --- 168 | 169 | ## 23.03 Release Notes 170 | 171 | ### 特性和增强 172 | 173 | * 新增了对TensorFlow和MindSpore的支持。 174 | 
175 | * 新增基于TensorFlow和MindSpore的模型各5个,PaddlePaddle模型10个,PyTorch模型15个。 176 | 177 | ### 贡献者 178 | 179 | 感谢以下人员做出的贡献: 180 | 181 | chenyingtony。 182 | 183 | 欢迎以任何形式向DeepSpark社区贡献。 184 | 185 | --- 186 | 187 | ## 22.12 Release Notes 188 | 189 | ### 特性和增强 190 | 191 | * SATRN,Conformer和ngp-nerf模型更新[六维度评测数据](README.md#硬件评测方法和结果)。 192 | * 添加了基于国产通用GPU天垓100的六维度评测[方法](evaluation/Iluvatar/six_dimension_howto.md)和[示例](evaluation/Iluvatar/six_dimension_howto_example.md)。 193 | * 应用开放平台新增基于PaddlePaddle框架的19个[模型](https://gitee.com/deep-spark/deepsparkhub)。 194 | 195 | ### 贡献者 196 | 197 | 感谢以下人员做出的贡献: 198 | 199 | li-shi-kun,tonychen,majorli,李睿,Jeff Guo。 200 | 201 | 欢迎以任何形式向DeepSpark社区贡献。 202 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | 2 | # Contributor Covenant Code of Conduct 3 | 4 | ## Our Pledge 5 | 6 | We as members, contributors, and leaders pledge to make participation in our 7 | community a harassment-free experience for everyone, regardless of age, body 8 | size, visible or invisible disability, ethnicity, sex characteristics, gender 9 | identity and expression, level of experience, education, socio-economic status, 10 | nationality, personal appearance, race, caste, color, religion, or sexual 11 | identity and orientation. 12 | 13 | We pledge to act and interact in ways that contribute to an open, welcoming, 14 | diverse, inclusive, and healthy community. 
15 | 16 | ## Our Standards 17 | 18 | Examples of behavior that contributes to a positive environment for our 19 | community include: 20 | 21 | * Demonstrating empathy and kindness toward other people 22 | * Being respectful of differing opinions, viewpoints, and experiences 23 | * Giving and gracefully accepting constructive feedback 24 | * Accepting responsibility and apologizing to those affected by our mistakes, 25 | and learning from the experience 26 | * Focusing on what is best not just for us as individuals, but for the overall 27 | community 28 | 29 | Examples of unacceptable behavior include: 30 | 31 | * The use of sexualized language or imagery, and sexual attention or advances of 32 | any kind 33 | * Trolling, insulting or derogatory comments, and personal or political attacks 34 | * Public or private harassment 35 | * Publishing others' private information, such as a physical or email address, 36 | without their explicit permission 37 | * Other conduct which could reasonably be considered inappropriate in a 38 | professional setting 39 | 40 | ## Enforcement Responsibilities 41 | 42 | Community leaders are responsible for clarifying and enforcing our standards of 43 | acceptable behavior and will take appropriate and fair corrective action in 44 | response to any behavior that they deem inappropriate, threatening, offensive, 45 | or harmful. 46 | 47 | Community leaders have the right and responsibility to remove, edit, or reject 48 | comments, commits, code, wiki edits, issues, and other contributions that are 49 | not aligned to this Code of Conduct, and will communicate reasons for moderation 50 | decisions when appropriate. 51 | 52 | ## Scope 53 | 54 | This Code of Conduct applies within all community spaces, and also applies when 55 | an individual is officially representing the community in public spaces. 
56 | Examples of representing our community include using an official e-mail address, 57 | posting via an official social media account, or acting as an appointed 58 | representative at an online or offline event. 59 | 60 | ## Enforcement 61 | 62 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 63 | reported to the community leaders responsible for enforcement at 64 | conduct@deepspark.org.cn. 65 | All complaints will be reviewed and investigated promptly and fairly. 66 | 67 | All community leaders are obligated to respect the privacy and security of the 68 | reporter of any incident. 69 | 70 | ## Enforcement Guidelines 71 | 72 | Community leaders will follow these Community Impact Guidelines in determining 73 | the consequences for any action they deem in violation of this Code of Conduct: 74 | 75 | ### 1. Correction 76 | 77 | **Community Impact**: Use of inappropriate language or other behavior deemed 78 | unprofessional or unwelcome in the community. 79 | 80 | **Consequence**: A private, written warning from community leaders, providing 81 | clarity around the nature of the violation and an explanation of why the 82 | behavior was inappropriate. A public apology may be requested. 83 | 84 | ### 2. Warning 85 | 86 | **Community Impact**: A violation through a single incident or series of 87 | actions. 88 | 89 | **Consequence**: A warning with consequences for continued behavior. No 90 | interaction with the people involved, including unsolicited interaction with 91 | those enforcing the Code of Conduct, for a specified period of time. This 92 | includes avoiding interactions in community spaces as well as external channels 93 | like social media. Violating these terms may lead to a temporary or permanent 94 | ban. 95 | 96 | ### 3. Temporary Ban 97 | 98 | **Community Impact**: A serious violation of community standards, including 99 | sustained inappropriate behavior. 
100 | 101 | **Consequence**: A temporary ban from any sort of interaction or public 102 | communication with the community for a specified period of time. No public or 103 | private interaction with the people involved, including unsolicited interaction 104 | with those enforcing the Code of Conduct, is allowed during this period. 105 | Violating these terms may lead to a permanent ban. 106 | 107 | ### 4. Permanent Ban 108 | 109 | **Community Impact**: Demonstrating a pattern of violation of community 110 | standards, including sustained inappropriate behavior, harassment of an 111 | individual, or aggression toward or disparagement of classes of individuals. 112 | 113 | **Consequence**: A permanent ban from any sort of public interaction within the 114 | community. 115 | 116 | ## Attribution 117 | 118 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 119 | version 2.1, available at 120 | [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]. 121 | 122 | Community Impact Guidelines were inspired by 123 | [Mozilla's code of conduct enforcement ladder][Mozilla CoC]. 124 | 125 | For answers to common questions about this code of conduct, see the FAQ at 126 | [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at 127 | [https://www.contributor-covenant.org/translations][translations]. 128 | 129 | [homepage]: https://www.contributor-covenant.org 130 | [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html 131 | [Mozilla CoC]: https://github.com/mozilla/diversity 132 | [FAQ]: https://www.contributor-covenant.org/faq 133 | [translations]: https://www.contributor-covenant.org/translations 134 | 135 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | [English](README_en.md) [Chinese](README.md) 4 | 5 | # DeepSpark开源社区 6 | 7 |
8 | Homepage 10 | LICENSE 11 | Release 12 |
13 |
14 | 15 | 在万物皆算的时代,各领域应用层出不穷,算力必须支撑实际应用,通用性和未来可扩展性是评估算力的重要指标。天数智芯作为国内头部通用GPU高端芯片及超级算力系统提供商,截止2024年12月,已成功支持 400+ AI算法模型,覆盖训练和推理,与 400+ 家客户和生态伙伴建立合作,共同促进国内通用算力的发展,产品服务于智慧城市、数字个人、医疗、教育、通信、能源等多个领域。 16 | 17 | 天数智芯本着“平台共建、生态共享、产业共赢”的原则,致力于和行业伙伴一起打造[DeepSpark开源社区](https://www.deepspark.org.cn/),以来自开源回馈开源的方式,汇聚社区力量,助力客户加速应用落地和收获算力赋能,促进产业生态的完善和发展。 18 | 19 | DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应用开放平台)的打造和推广,将来会有更多相关的项目和成果通过DeepSpark社区开源。 20 | 21 | 2023年8月,DeepSpark开源社区与[上海白玉兰开源开放研究院](http://baiyulan.org.cn/)签署了战略合作协议,旨在进一步促进人工智能开源事业共建共享,推动产业生态的完善和发展。2023年11月,DeepSpark社区与[启智社区](https://openi.pcl.ac.cn/)开展合作,社区用户可通过启智云脑提供的[天垓100算力](https://openi.pcl.ac.cn/iluvatar/TianGai100)训练来自DeepSparkHub的模型。 22 | 23 | 欢迎行业合作伙伴、社区用户和开发者以任何形式为DeepSpark开源社区作贡献,期待您的积极参与。 24 | 25 | -------- 26 | 27 | ## 百大应用开放平台 28 | 29 | 百大应用开放平台作为国内领先的AI和通用计算应用开发及评测平台,甄选上百个与行业应用深度耦合的开源算法和模型,支持主流生态应用框架,并针对行业需求构建多维度评测体系,广泛支持各类落地场景。 30 | 31 | ### 应用算法和模型 32 | 33 | [DeepSparkHub](https://gitee.com/deep-spark/deepsparkhub)甄选上百个开源应用算法和模型,覆盖AI和通用计算各领域,支持主流市场智能计算场景,包括智慧城市、数字个人、医疗、教育、通信、能源等多个领域。 34 | 35 | [DeepSparkInference](https://gitee.com/deep-spark/deepsparkinference)精选基于国产推理引擎IGIE和IxRT的推理模型示例和指导文档,部分模型提供了基于国产通用GPU[智铠100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-tlxltt-zk100)的评测结果。 36 | 37 | ### 天数智芯智算平台 IXUCA 38 | 39 | IXUCA兼容主流GPU通用计算模型,提供支持主流GPU通用计算模型的等效组件、特性、API和算法,可助力用户便捷地实现系统或应用的无痛迁移。天数智算软件栈包括人工智能深度学习应用、主流框架、函数库、编译器及工具、运行时库及驱动。 40 | 41 | - IXUCA集成了TensorFlow、PyTorch、百度飞桨PaddlePaddle等国内外主流的深度学习框架,提供与官方开源框架一致的算子,并针对天数智芯加速卡持续优化性能。 42 | 43 | - IXUCA提供IGIE推理框架和IxRT推理引擎,支持在天数智芯加速卡上实现最优推理性能。 44 | 45 | - IXUCA的函数库不仅支持通用计算还提供了深度学习应用开发所需的基础算子,开发者可以便捷地调用这些算子灵活地构造各类深度神经网络模型以及其他机器学习领域的算法。 46 | 47 | 您可前往天数智芯官方网站的[资源中心](https://support.iluvatar.com/#/ProductLine?id=2)获取天数智算软件栈。 48 | 49 | ### 应用框架 50 | 51 | 百大应用开放平台支持国内外主流应用框架和工具箱。 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 |
pytorchtensorflow
paddlepaddledeepspeed
fairseqmmdetection
wenetcolossal-ai
deepmodelingllama-factory
75 | 76 | ### 评测体系 77 | 78 | 评测标准广泛适用于硬件平台,体系完备,部署简单。 79 | 80 | - 提供6️⃣维度 81 | 82 | | 维度 | 说明 | 数据来源 | 计算方法 | 83 | |------------|----------------------------------------------------------------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------| 84 | | 速度🚀 | 模型稳定训练时每秒处理的样本数量 | DeepSpark模型训练脚本输出 | 指定迭代轮次5次,去掉最高最低,取中间3次的平均值 | 85 | | 准确性🎯 | 模型收敛的精度值 | DeepSpark模型训练脚本输出 | 记录模型收敛时的精度值 | 86 | | 线性度📈 | 模型集群规模化训练算力的线性扩展性能,包括卡线性度和节点线性度 | DeepSpark模型训练脚本输出 | 用多卡/多节点的训练速度除以卡数/节点数,再对比使用单张/单节点的训练速度 | 87 | | 功耗🔌 | 模型稳定训练时实际消耗的GPU平均功耗 | GPU实时状态检测工具 | 取多次功耗数据的平均值 | 88 | | 显存占用📊 | 模型稳定训练时实际消耗的GPU平均显存占用量 | GPU实时状态检测工具 | 取多次显存占用量的平均值 | 89 | | 稳定度🔧 | 多次完整训练(均达到收敛值)的收敛值的稳定程度 | DeepSpark模型训练脚本输出 | 采用5次达到标准收敛值的完整训练,取收敛值的中值作为基准值,其它值与基准值的差值百分比每有1次不在(-0.01,+0.01)范围内,稳定度递减20% | 90 | 91 | 参考信息:[硬件评测结果](#硬件评测方法和结果) 92 | 93 | - 1️⃣键式部署:全自动✅、数据可复现🔁、场景可寻源🔎 94 | 95 | - 0️⃣平台依赖:不限制框架、不限制源语、不限制硬件 96 | 97 | ### 多维度评测系统 98 | 99 | [多维度评测系统](https://mdb.deepspark.org.cn:8086)是一款基于[多维度评测体系标准](#评测体系)开发的线上评测工具,通过在同等条件下对BI-V100和NV-V100加速卡在六个维度(速度、准确度、线性度、功耗效率、显存效率、稳定性)上进行模型训练评测、指标 100 | 收集和六维度雷达图展示,方便用户更加全面地对比评估GPU加速卡的综合能力。以下为已支持的模型列表: 101 | 102 | ![training model list](evaluation/Iluvatar/assets/mdb_model_list_1.png) 103 | 104 | 使用方法详见[多维度评测系统使用指南](evaluation/Iluvatar/Mdims-benchmark.md)。 105 | 106 | -------- 107 | 108 | ### 硬件评测方法和结果 109 | 110 | #### 天垓100通用GPU 111 | 112 | 评测方法详见[天垓100六维度评测方法](evaluation/Iluvatar/six_dimension_howto.md)。 113 | 114 | 评测结果如下: 115 | 116 | | 任务 | 模型 | 收敛指标 | 配置(x-\>gpus) | 速度 | 准确度 | 功耗(W) | 线性度 | 显存占用(G) | 稳定度 | 117 | |--------------|------------|------------------|-----------------------|--------|--------|-----------|--------|---------------|--------| 118 | | 自然语言处理 | BERT-large | 0.72 | sdk2.2,bs:32,8x,amp | 214 | 0.72 | 152\*8 | 0.96 | 20.3\*8 | 1 | 119 | | 推荐系统 | DLRM | AUC:0.75 | sdk2.2,bs:2048,8x,amp | 793486 | 0.75 | 60\*8 | 0.97 | 3.7\*8 | 1 |
120 | | 图像分类 | ResNet50 | top1 75.9% | sdk2.2,bs:512,8x,amp | 5221 | 76.43% | 128\*8 | 0.97 | 29.1\*8 | 1 | 121 | | 图像分割 | 3D U-Net | 0.908 | sdk2.2,bs:4,8x,fp32 | 12 | 0.908 | 152\*8 | 0.85 | 19.6\*8 | 1 | 122 | | 目标检测 | YOLOv5 | mAP:0.5 | sdk2.2,bs:128,8x,amp | 1228 | 0.56 | 140\*8 | 0.92 | 27.3\*8 | 1 | 123 | | 文本检测 | SATRN | 0.841 | sdk2.2,bs:128,8x,fp32 | 630 | 88.4 | 166\*8 | 0.98 | 28.5\*8 | 1 | 124 | | 语音识别 | Conformer | 3.72 | sdk2.2,bs:32,8x,fp32 | 380 | 4.79 | 113\*8 | 0.82 | 21.5\*8 | 1 | 125 | | 3D重建 | ngp-nerf | 0.0046 | sdk2.2,bs:1,8x,amp | 10 | 19.6 | 82\*8 | 0.90 | 28.1\*8 | 1 | 126 | | 目标追踪 | FairMOT | MOTA:69.8 | sdk2.2,bs:64,8x,fp32 | 52 | 69.8 | 132\*8 | 0.97 | 19.1\*8 | 1 | 127 | | 大模型 | CPM | 0.91 | sdk2.2,bs:128,8x,amp | 357 | 0.91 | 156\*8 | 0.93 | 20.6\*8 | 1 | 128 | | 语音语义 | Tacotron2 | score(MOS):4.460 | sdk2.2,bs:128,8x,amp | 77 | 4.46 | 128\*8 | 0.96 | 18.4\*8 | 1 | 129 | | 新兴模型 | Wave-MLP | 80.1 | sdk2.2,bs:256,8x,fp32 | 1026 | 83.1 | 198\*8 | 0.98 | 29.4\*8 | 1 | 130 | 131 | -------- 132 | 133 | ## 社区 134 | 135 | ### 治理 136 | 137 | 请参见 [Code of Conduct](CODE_OF_CONDUCT.md)。 138 | 139 | ### 交流 140 | 141 | 请联系 。 142 | 143 | ### 贡献 144 | 145 | 请参见各项目的Contributing Guidelines。 146 | 147 | ### 许可证 148 | 149 | [Apache License 2.0](LICENSE)。 150 | -------------------------------------------------------------------------------- /evaluation/Iluvatar/six_dimension_howto_example.md: -------------------------------------------------------------------------------- 1 | # 天垓100六维度评测方法示例:ResNet50 2 | 3 | 使用天数智芯天垓100(BI-V100)通用GPU硬件并安装天数智芯软件栈和深度学习框架后,您可运行DeepSpark模型训练脚本,并借助于天数智芯GPU实时状态检测工具ixSMI得到天垓100硬件的六维度数据。DeepSpark六维度指标的定义请参照[评测体系](../../README.md#评测体系)。天垓100六维度具体计算方法请参照[天垓100(BI-V100)六维度评测方法](six_dimension_howto.md)。 4 | 5 | 本示例以[ResNet50模型](https://gitee.com/deep-spark/deepsparkhub/tree/master/cv/classification/resnet50/pytorch)为例,用8张天垓100(BI-V100)卡执行`amp_8card.sh`脚本进行混合精度训练。在本示例环境中,数据集所在路径为`/home/datasets/imagenet`。 6 | 7 | 具体运行和六维度的计算过程如下: 8 | 
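上述“具体运行和六维度的计算过程”可以用下面的Python脚本做一个简要示意(注意:这只是一个假设性的示意脚本,并非DeepSpark官方提供的工具;脚本中的数值均取自下文ResNet50示例的实际日志输出):

```python
# 六维度计算示意脚本(假设性示例,数值取自下文 ResNet50 示例日志)
from statistics import mean, median

# 速度:取 5 个稳定 epoch 的吞吐,去掉最高和最低,求中间 3 个的平均值
epoch_speeds = [4910, 4980, 5032, 4925, 4845]        # img/s,epoch 6~10
speed = mean(sorted(epoch_speeds)[1:-1])             # ≈ 4938.33 img/s

# 端到端速度:(每个 epoch 处理的样本数 × epoch 数) / 总训练秒数
e2e_speed = 1281167 * 64 / 18363                     # ≈ 4465.2 img/s

# 卡线性度:多卡速度先除以卡数,再除以单卡速度
linearity = (2575 / 8) / 335                         # ≈ 0.96

# 功耗 / 显存占用:对 ixSMI 的多次采样取平均值
power = mean([161, 169, 161, 174, 163])              # 165.6 W
memory = mean([30671, 30670, 30671, 30671, 30670])   # 30670.6 MiB

# 稳定度:以 5 次收敛值的中值为基准,差值百分比每有 1 次超出 (-0.01, +0.01) 递减 20%
accs = [77.360, 76.816, 77.116, 76.960, 77.010]
base = median(accs)
stability = 1.0 - 0.2 * sum(abs(a - base) / base > 0.01 for a in accs)  # 1.0
```

该脚本的各项结果与下文各小节的手工计算一致,可用于快速复核评测数据。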
9 | ## 速度 10 | 11 | 执行DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的每秒处理的单位样本数量,例如: 12 | 13 | ``` 14 | $ bash amp_8card.sh --data-path /home/datasets/imagenet 15 | ... 16 | Creating data loaders 17 | read 1281167 files from 1000 directories 18 | ... 19 | Creating model resnet50 20 | Start training 21 | Epoch: [0] [ 0/334] eta: 0:38:25 lr: 0.128 img/s: 590.7051616388846 loss: 7.1621 (7.1621) acc1: 0.0000 (0.0000) acc5: 0.0000 (0.0000) time: 6.9017 data: 0.0000 22 | Epoch: [0] [ 10/334] eta: 0:06:47 lr: 0.128 img/s: 5625.496142412713 loss: 7.2926 (7.2497) acc1: 0.0000 (0.1326) acc5: 0.6250 (0.8144) time: 1.2591 data: 0.0019 23 | Epoch: [0] [ 20/334] eta: 0:05:09 lr: 0.128 img/s: 5597.293247401811 loss: 7.4071 (7.4806) acc1: 0.0000 (0.0794) acc5: 0.6250 (0.7639) time: 0.6908 data: 0.0020 24 | Epoch: [0] [ 30/334] eta: 0:04:30 lr: 0.128 img/s: 5445.774084325316 loss: 7.3483 (7.4083) acc1: 0.0000 (0.1075) acc5: 0.6250 (0.7258) time: 0.6889 data: 0.0018 25 | ... 26 | Epoch: [0] [320/334] eta: 0:00:11 lr: 0.128 img/s: 5411.016050829418 loss: 6.8285 (6.9574) acc1: 0.2083 (0.1694) acc5: 1.2500 (0.7976) time: 0.8184 data: 0.0023 27 | Epoch: [0] [330/334] eta: 0:00:03 lr: 0.128 img/s: 5601.840934156804 loss: 6.8106 (6.9527) acc1: 0.4167 (0.1788) acc5: 1.6667 (0.8283) time: 0.7644 data: 0.0023 28 | Epoch: [0] Total time: 0:04:38 29 | Data loading of epoch 0: 0:00:00.759193 30 | Epoch: [0] Avg img/s: 5060.21643823427 31 | Test: [ 0/14] eta: 0:00:03 loss: 6.9883 (6.9883) acc1: 0.0000 (0.0000) acc5: 0.0000 (0.0000) time: 0.2724 data: 0.0000 32 | Test: Total time: 0:00:04 33 | * Acc@1 0.112 Acc@5 1.040 34 | Epoch time 0:04:43 35 | ... 36 | Epoch: [6] Total time: 0:04:38 37 | Data loading of epoch 6: 0:00:00.797741 38 | Epoch: [6] Avg img/s: 4910.795658460139 39 | Test: [ 0/14] eta: 0:00:04 loss: 5.5391 (5.5391) acc1: 8.1250 (8.1250) acc5: 24.1667 (24.1667) time: 0.2859 data: 0.0019 40 | Test: Total time: 0:00:03 41 | * Acc@1 13.360 Acc@5 30.960 42 | Epoch time 0:04:43 43 | ... 
44 | Epoch: [7] Total time: 0:04:36 45 | Data loading of epoch 7: 0:00:00.770986 46 | Epoch: [7] Avg img/s: 4980.222383258538 47 | Test: [ 0/14] eta: 0:00:03 loss: 5.4844 (5.4844) acc1: 8.1250 (8.1250) acc5: 23.9583 (23.9583) time: 0.2801 data: 0.0017 48 | Test: Total time: 0:00:03 49 | * Acc@1 24.928 Acc@5 49.712 50 | Epoch time 0:04:41 51 | ... 52 | Epoch: [8] Total time: 0:04:31 53 | Data loading of epoch 8: 0:00:00.756521 54 | Epoch: [8] Avg img/s: 5032.905842259754 55 | Test: [ 0/14] eta: 0:00:04 loss: 4.1641 (4.1641) acc1: 29.1667 (29.1667) acc5: 54.5833 (54.5833) time: 0.2861 data: 0.0019 56 | Test: Total time: 0:00:03 57 | * Acc@1 23.536 Acc@5 47.024 58 | Epoch time 0:04:35 59 | ... 60 | Epoch: [9] Total time: 0:04:39 61 | Data loading of epoch 9: 0:00:00.761111 62 | Epoch: [9] Avg img/s: 4925.753353494863 63 | Test: [ 0/14] eta: 0:00:04 loss: 3.3516 (3.3516) acc1: 45.4167 (45.4167) acc5: 70.4167 (70.4167) time: 0.3107 data: 0.0022 64 | Test: Total time: 0:00:03 65 | * Acc@1 28.480 Acc@5 59.568 66 | Epoch time 0:04:44 67 | ... 68 | Epoch: [10] Total time: 0:04:42 69 | Data loading of epoch 10: 0:00:00.776602 70 | Epoch: [10] Avg img/s: 4845.4628560647725 71 | Test: [ 0/14] eta: 0:00:04 loss: 3.7891 (3.7891) acc1: 27.0833 (27.0833) acc5: 56.4583 (56.4583) time: 0.3491 data: 0.0028 72 | Test: Total time: 0:00:03 73 | * Acc@1 33.520 Acc@5 61.264 74 | Epoch time 0:04:47 75 | ... 76 | Epoch: [63] Total time: 0:04:36 77 | Data loading of epoch 63: 0:00:00.688607 78 | Epoch: [63] Avg img/s: 4928.513448918747 79 | Test: [ 0/14] eta: 0:00:03 loss: 2.3945 (2.3945) acc1: 63.5417 (63.5417) acc5: 85.6250 (85.6250) time: 0.2850 data: 0.0017 80 | Test: Total time: 0:00:03 81 | * Acc@1 69.104 Acc@5 89.088 82 | Epoch time 0:04:40 83 | ... 
84 | Epoch: [64] Total time: 0:04:47 85 | Data loading of epoch 64: 0:00:00.674387 86 | Epoch: [64] Avg img/s: 4812.667878767461 87 | Test: [ 0/14] eta: 0:00:04 loss: 1.7793 (1.7793) acc1: 81.8750 (81.8750) acc5: 94.7917 (94.7917) time: 0.3257 data: 0.0019 88 | Test: Total time: 0:00:03 89 | * Acc@1 77.360 Acc@5 93.120 90 | ... 91 | Epoch time 0:04:52 92 | The accuracy has been exceeded 75.9,and the training is terminated at epoch 64 93 | Training time 5:06:03 94 | ... 95 | ``` 96 | 97 | 取稳定训练的第6到第10个epoch共5个epoch: 98 | 99 | - Epoch: [6] Avg img/s: 4910.795658460139 100 | - Epoch: [7] Avg img/s: 4980.222383258538 101 | - Epoch: [8] Avg img/s: 5032.905842259754 102 | - Epoch: [9] Avg img/s: 4925.753353494863 103 | - Epoch: [10] Avg img/s: 4845.4628560647725 104 | 105 | 其中最高的是第8个、最低的是第10个epoch,去掉这两个后,计算速度取第6个、第7个和第9个epoch的值相加除以3,即为: 106 | 107 | (4910+4980+4925)/3=4938.33 img/s 108 | 109 | 注:在该模型用例中,通过以上方法获取的速度,主要评估的是训练集上training的速度;而端到端(end-to-end)训练速度则包含了训练集和验证集在整个training和evaluation全部过程中的总体速度。 110 | 111 | 训练端到端的速度计算方式为:(每个epoch处理的样本数量*完整训练的epoch数量)/总的训练时长 112 | 113 | 1. 通过“read 1281167 files from 1000 directories”的提示得到每个epoch处理的样本数量为1281167个。 114 | 115 | 2. 通过“Training time 5:06:03”的提示得到总的训练时长为5:06:03,换算成18363秒。 116 | 117 | 3. 本次`amp_8card.sh`完整训练的端到端速度为:(1281167*64)/18363=4465.2 img/s 118 | 119 | 120 | ## 准确性 121 | 122 | 执行DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的训练准确度,例如: 123 | 124 | ``` 125 | $ bash amp_8card.sh --data-path /home/datasets/imagenet 126 | ... 127 | Epoch: [0] Total time: 0:04:38 128 | Data loading of epoch 0: 0:00:00.759193 129 | Epoch: [0] Avg img/s: 5060.21643823427 130 | Test: [ 0/14] eta: 0:00:03 loss: 6.9883 (6.9883) acc1: 0.0000 (0.0000) acc5: 0.0000 (0.0000) time: 0.2724 data: 0.0000 131 | Test: Total time: 0:00:04 132 | * Acc@1 0.112 Acc@5 1.040 133 | ...
134 | Epoch: [64] Avg img/s: 4812.667878767461 135 | Test: [ 0/14] eta: 0:00:04 loss: 1.7793 (1.7793) acc1: 81.8750 (81.8750) acc5: 94.7917 (94.7917) time: 0.3257 data: 0.0019 136 | Test: Total time: 0:00:03 137 | * Acc@1 77.360 Acc@5 93.120 138 | ... 139 | The accuracy has been exceeded 75.9,and the training is terminated at epoch 64 140 | ... 141 | ``` 142 | 143 | ResNet50模型收敛的标准是top1 75.9%,在第64个epoch时达到了Acc@1 77.360(百分比),超过了75.9(百分比)的阈值。所以训练的精度值为77.36%。 144 | 145 | 146 | ## 线性度 147 | 148 | 执行DeepSpark模型训练脚本,脚本会打印训练过程中每个epoch的速度,训练可以使用单卡和多卡进行,或使用单节点和多节点进行,分别可以计算出卡线性度和节点线性度。 149 | 150 | 以计算卡线性度为例: 151 | 152 | 例如,在一台服务器上使用8张天垓100加速卡训练的速度是2575 img/s: 153 | 154 | ``` 155 | $ bash fp32_8card.sh --data-path /home/datasets/imagenet 156 | ... 157 | Epoch: [0] Total time: 0:08:37 158 | Data loading of epoch 0: 0:00:00.745164 159 | Epoch: [0] Avg img/s: 2575.527288024902 160 | Test: [ 0/21] eta: 0:00:06 loss: 6.1528 (6.1528) acc1: 4.3333 (4.3333) acc5: 24.0000 (24.0000) time: 0.3201 data: 0.0000 161 | Test: Total time: 0:00:07 162 | * Acc@1 0.704 Acc@5 3.968 163 | Epoch time 0:08:45 164 | ... 165 | ``` 166 | 167 | 将2575 img/s除以多卡的卡数(即8),得到: 168 | 169 | 2575/8=321.9 img/s 170 | 171 | 再对比使用1张天垓100加速卡训练时的速度,如下所示,速度为335 img/s。 172 | 173 | ``` 174 | $ bash fp32_1card.sh --data-path /home/datasets/imagenet 175 | ... 176 | Epoch: [0] Avg img/s: 335.40682470945455 177 | Test: [ 0/179] eta: 0:00:53 loss: 3.6599 (3.6599) acc1: 36.4286 (36.4286) acc5: 70.3571 (70.3571) time: 0.2962 data: 0.0000 178 | Test: [100/179] eta: 0:00:23 loss: 5.3622 (5.0095) acc1: 6.4286 (12.0226) acc5: 22.1429 (31.2907) time: 0.2973 data: 0.0011 179 | Test: Total time: 0:00:53 180 | * Acc@1 11.786 Acc@5 30.074 181 | Epoch time 1:04:59 182 | ... 
183 | ``` 184 | 185 | 得到卡线性度 321.9/335=96% 186 | 187 | ## 功耗 188 | 189 | 在模型稳定训练时,使用GPU实时状态检测工具ixSMI查询实际消耗的GPU功耗并取平均值。 190 | 191 | 以运行`fp32_8card.sh`脚本使用8张天垓100加速卡训练为例,输入以下命令查询index为0的GPU的状态: 192 | 193 | ``` 194 | $ ixsmi -q -i 0 -l | grep -E 'Power Draw' 195 | Power Draw : 161 W 196 | Power Draw : 169 W 197 | Power Draw : 161 W 198 | Power Draw : 174 W 199 | Power Draw : 163 W 200 | ... 201 | ``` 202 | 203 | 取多次功耗数据的平均值作为模型训练的功耗,例如: 204 | 205 | 平均功耗 (161+169+161+174+163)/5=165.6 W 206 | 207 | ## 显存占用 208 | 209 | 在模型稳定训练时,使用GPU实时状态检测工具ixSMI查询GPU显存占用量并取平均值。 210 | 211 | 以运行`amp_8card.sh`脚本使用8张天垓100加速卡训练为例,输入以下命令查询index为0的GPU的状态: 212 | 213 | ``` 214 | $ ixsmi -q -i 0 -l | grep Used.[^G] 215 | Used : 30671 MiB 216 | Used : 30670 MiB 217 | Used : 30671 MiB 218 | Used : 30671 MiB 219 | Used : 30670 MiB 220 | ``` 221 | 222 | 取平均值,得到平均显存占用量为 (30671+30670+30671+30671+30670)/5=30670.6 MiB 223 | 224 | ## 稳定度 225 | 226 | 执行DeepSpark模型训练脚本,脚本会打印完整训练最终所达到的收敛值,例如: 227 | 228 | ``` 229 | $ bash amp_8card.sh --data-path /home/datasets/imagenet 230 | ... 231 | Epoch: [0] Total time: 0:04:38 232 | Data loading of epoch 0: 0:00:00.759193 233 | Epoch: [0] Avg img/s: 5060.21643823427 234 | Test: [ 0/14] eta: 0:00:03 loss: 6.9883 (6.9883) acc1: 0.0000 (0.0000) acc5: 0.0000 (0.0000) time: 0.2724 data: 0.0000 235 | Test: Total time: 0:00:04 236 | * Acc@1 0.112 Acc@5 1.040 237 | ... 238 | Epoch: [64] Avg img/s: 4812.667878767461 239 | Test: [ 0/14] eta: 0:00:04 loss: 1.7793 (1.7793) acc1: 81.8750 (81.8750) acc5: 94.7917 (94.7917) time: 0.3257 data: 0.0019 240 | Test: Total time: 0:00:03 241 | * Acc@1 77.360 Acc@5 93.120 242 | ... 243 | The accuracy has been exceeded 75.9,and the training is terminated at epoch 64 244 | ...
245 | ``` 246 | 247 | 可以得到本次的收敛精度为Acc@1 77.360,记为第1次。 248 | 249 | 再进行4次模型完整训练,每次都最终达到标准收敛值,即达到`The accuracy has been exceeded 75.9`,Acc@1分别得到如下结果: 250 | 251 | - 第2次:76.816 252 | - 第3次:77.116 253 | - 第4次:76.960 254 | - 第5次:77.010 255 | 256 | 取5次收敛值的中值作为基准值,即以第5次的收敛精度77.010作为基准,然后其余4次收敛值与基准值的差值百分比计算如下: 257 | 258 | - 第1次:(77.360-77.010)/77.010=0.0045 259 | - 第2次:(76.816-77.010)/77.010=-0.0025 260 | - 第3次:(77.116-77.010)/77.010=0.0013 261 | - 第4次:(76.960-77.010)/77.010=-0.0006 262 | 263 | 可以得到差值百分比都分布在(-0.01,+0.01)的合理区间,此时稳定度达到满分1。(5个数据中每有1次不在该范围内,稳定度递减20%。) 264 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License.
25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. 
You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. 
(Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README_en.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | [English](README_en.md) [Chinese](README.md) 4 | 5 | # DeepSpark Open Source Community 6 | 7 |
8 | Homepage 10 | LICENSE 11 | Release 12 |
13 |
14 | 15 | In an era when everything is computable, applications in various fields are emerging rapidly. Computing power must support practical applications, and versatility and future scalability are crucial metrics for evaluating computing capability. As a leading domestic provider of high-end GPGPU chips and supercomputing systems, Iluvatar CoreX has, as of December 2024, successfully supported 400+ AI algorithm models covering both training and inference. We have established collaborations with 400+ customers and ecosystem partners to jointly promote the development of domestic general-purpose computing power. Our products serve multiple fields including smart cities, digital individuals, healthcare, education, telecommunications, and energy. 16 | 17 | Guided by the principles of "co-building platforms, sharing ecosystems, and winning together in the industry," Iluvatar CoreX is committed to collaborating with industry partners to build the [DeepSpark Open Source Community](https://www.deepspark.org.cn/). By giving back to open source with contributions drawn from open source, we aim to gather community strength, help customers accelerate application deployment and benefit from computing power empowerment, and promote the improvement and development of the industry ecosystem. 18 | 19 | Currently, the DeepSpark Open Source Community is primarily focused on building and promoting the [Hundreds of Applications Open Platform](#hundreds-of-applications-open-platform). In the future, more related projects and achievements will be open-sourced through the DeepSpark community. 20 | 21 | In August 2023, the DeepSpark Open Source Community signed a strategic cooperation agreement with the [Shanghai Baiyulan Open Source Research Institute](http://baiyulan.org.cn/) to further promote the co-construction and sharing of AI open-source initiatives and drive the improvement and development of the industry ecosystem.
In November 2023, the DeepSpark community collaborated with the [OpenI Community](https://openi.pcl.ac.cn/), enabling community users to train models from DeepSparkHub using the [TianGai 100 computing power](https://openi.pcl.ac.cn/iluvatar/TianGai100) provided by OpenI's cloud brain. 22 | 23 | We welcome industry partners, community users, and developers to contribute to the DeepSpark Open Source Community in any form. Your active participation is highly anticipated. 24 | 25 | -------- 26 | 27 | ## Hundreds of Applications Open Platform 28 | 29 | As a leading domestic AI and general-purpose computing application development and evaluation platform, the Hundreds of Applications Open Platform carefully selects hundreds of open-source algorithms and models deeply integrated with industry applications. It supports mainstream ecosystem application frameworks and builds a multi-dimensional evaluation system tailored to industry needs, widely supporting various implementation scenarios. 30 | 31 | ### Application Algorithms and Models 32 | 33 | [DeepSparkHub](https://gitee.com/deep-spark/deepsparkhub) selects hundreds of open-source application algorithms and models, covering various fields of AI and general-purpose computing. It supports mainstream intelligent computing scenarios in the market, including smart cities, digital individuals, healthcare, education, telecommunications, and energy. 34 | 35 | [DeepSparkInference](https://gitee.com/deep-spark/deepsparkinference) selects inference model examples and guidance documents based on the independently developed inference engines IGIE and IxRT. Some models provide evaluation results based on the self-developed GPGPU [ZhiKai 100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-tlxltt-zk100).
### IXUCA (Iluvatar CoreX Unified Compute Architecture)

IXUCA is compatible with mainstream GPGPU computing models, providing equivalent components, features, APIs, and algorithms that support mainstream GPU computing. It enables seamless migration of systems or applications with minimal effort. The IXUCA stack includes AI deep learning applications, mainstream frameworks, libraries, compilers and tools, as well as runtime libraries and drivers.

- IXUCA integrates mainstream deep learning frameworks such as TensorFlow, PyTorch, and PaddlePaddle, delivering operators consistent with the official open-source frameworks while continuously optimizing performance for Iluvatar CoreX acceleration cards.

- IXUCA provides the IGIE inference framework and the IxRT inference engine, enabling optimal inference performance on Iluvatar CoreX acceleration cards.

- The libraries in IXUCA not only support general-purpose computing but also provide the fundamental operators required for deep learning application development. Developers can conveniently use these operators to flexibly construct various deep neural network models and other machine learning algorithms.

You can visit the [Resource Center](https://support.iluvatar.com/#/ProductLine?id=2) on the Iluvatar CoreX official website to obtain the IXUCA software stack.

### Application Frameworks

The Hundreds of Applications Open Platform supports mainstream application frameworks and toolkits both domestically and internationally.

<table>
  <tr>
    <td><img src="resources/pytorch.png" alt="PyTorch"></td>
    <td><img src="resources/tensorflow.png" alt="TensorFlow"></td>
  </tr>
  <tr>
    <td><img src="resources/paddlepaddle.png" alt="PaddlePaddle"></td>
    <td><img src="resources/deepspeed.png" alt="DeepSpeed"></td>
  </tr>
  <tr>
    <td><img src="resources/fairseq.png" alt="Fairseq"></td>
    <td><img src="resources/mmdetection.png" alt="MMDetection"></td>
  </tr>
  <tr>
    <td><img src="resources/wenet.png" alt="WeNet"></td>
    <td><img src="resources/colossal-ai.png" alt="Colossal-AI"></td>
  </tr>
  <tr>
    <td><img src="resources/deepmodeling.png" alt="DeepModeling"></td>
    <td><img src="resources/llama-factory.png" alt="LLaMA-Factory"></td>
  </tr>
</table>
### Multi-Dimensional Benchmark Standards

The benchmark standards are widely applicable across hardware platforms, featuring a comprehensive metric system and simple deployment.

- Support 6️⃣ dimensions

| Dimension   | Description                                                           | Data Source                      | Calculation Method                                                                                 |
|-------------|-----------------------------------------------------------------------|----------------------------------|----------------------------------------------------------------------------------------------------|
| Speed🚀     | Training samples processed per second during stable model training    | DeepSpark training script output | Remove the highest and lowest of 5 measurements, take the mean of the middle 3                     |
| Accuracy🎯  | Model accuracy at convergence                                         | DeepSpark training script output | Record the accuracy value at convergence                                                           |
| Linearity📈 | Linear scaling performance for cluster training (card/node linearity) | DeepSpark training script output | Multi-card/node speed divided by card/node count, compared with the single-card/node speed         |
| Power🔌     | Average GPU power consumption during stable training                  | GPU monitoring tool              | Average of multiple power measurements                                                             |
| Memory📊    | Average GPU memory usage during stable training                       | GPU monitoring tool              | Average of multiple memory measurements                                                            |
| Stability🔧 | Stability of the convergence value across multiple full training runs | DeepSpark training script output | 5 full training runs, median as baseline; 20% deduction if any value deviates by more than ±1%     |

Reference: [Hardware Benchmark Results](#hardware-evaluation-methods-and-results)

- 1️⃣-click deployment: Fully automated ✅, reproducible data 🔁, traceable scenarios 🔎

- 0️⃣ platform dependencies: No framework restrictions, no source-language restrictions, no hardware restrictions

### Multi-Dimensional Benchmark System

The [Multi-Dimensional Benchmark System](https://mdb.deepspark.org.cn:8086) is an online evaluation tool developed based on
the [Multi-Dimensional Benchmark Standards](#multi-dimensional-benchmark-standards). It conducts model training evaluations on BI-V100 and NV-V100 accelerator cards under identical conditions across six dimensions (speed, accuracy, linearity, power efficiency, memory efficiency, and stability), collects the metrics, and displays them as six-dimensional radar charts, enabling users to comprehensively compare and evaluate the overall capabilities of GPU accelerators. Below is the list of currently supported models:

![training model list](evaluation/Iluvatar/assets/mdb_model_list_1.png)

For usage details, please refer to the [Multi-Dimensional Benchmark System User Guide](evaluation/Iluvatar/Mdims-benchmark.md).

--------

### Hardware Evaluation Methods and Results

#### TianGai 100 GPGPU

For evaluation methods, please refer to the [TianGai 100 Six-Dimension Benchmark Guide](evaluation/Iluvatar/six_dimension_howto.md).

The results are as follows:

| Task                  | Model      | Convergence Metric | Configuration (Nx = N GPUs) | Speed  | Accuracy | Power (W) | Linearity | Memory Usage (GB) | Stability |
|-----------------------|------------|--------------------|-----------------------------|--------|----------|-----------|-----------|-------------------|-----------|
| NLP                   | BERT-large | 0.72               | sdk2.2,bs:32,8x,amp         | 214    | 0.72     | 152*8     | 0.96      | 20.3*8            | 1         |
| Recommendation System | DLRM       | AUC:0.75           | sdk2.2,bs:2048,8x,amp       | 793486 | 0.75     | 60*8      | 0.97      | 3.7*8             | 1         |
| Image Classification  | ResNet50   | top1 75.9%         | sdk2.2,bs:512,8x,amp        | 5221   | 76.43%   | 128*8     | 0.97      | 29.1*8            | 1         |
| Image Segmentation    | 3D U-Net   | 0.908              | sdk2.2,bs:4,8x,fp32         | 12     | 0.908    | 152*8     | 0.85      | 19.6*8            | 1         |
| Object Detection      | YOLOv5     | mAP:0.5            | sdk2.2,bs:128,8x,amp        | 1228   | 0.56     | 140*8     | 0.92      | 27.3*8            | 1         |
| Text Detection        | SATRN      | 0.841              | sdk2.2,bs:128,8x,fp32       | 630    | 88.4     | 166*8     | 0.98      | 28.5*8            | 1         |
| Speech Recognition    | Conformer  | 3.72               | sdk2.2,bs:32,8x,fp32        | 380    | 4.79     | 113*8     | 0.82      | 21.5*8            | 1         |
| 3D Reconstruction     | ngp-nerf   | 0.0046             | sdk2.2,bs:1,8x,amp          | 10     | 19.6     | 82*8      | 0.90      | 28.1*8            | 1         |
| Object Tracking       | FairMOT    | MOTA:69.8          | sdk2.2,bs:64,8x,fp32        | 52     | 69.8     | 132*8     | 0.97      | 19.1*8            | 1         |
| Large Model           | CPM        | 0.91               | sdk2.2,bs:128,8x,amp        | 357    | 0.91     | 156*8     | 0.93      | 20.6*8            | 1         |
| Speech Synthesis      | Tacotron2  | score(MOS):4.460   | sdk2.2,bs:128,8x,amp        | 77     | 4.46     | 128*8     | 0.96      | 18.4*8            | 1         |
| New Model             | Wave-MLP   | 80.1               | sdk2.2,bs:256,8x,fp32       | 1026   | 83.1     | 198*8     | 0.98      | 29.4*8            | 1         |

--------

## Community

### Code of Conduct

See [Code of Conduct](CODE_OF_CONDUCT.md).

### Contact

Contact .

### Contribution

Refer to each project's Contributing Guidelines.

### License

[Apache License 2.0](LICENSE).
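As an illustration, the per-dimension calculations defined in the Multi-Dimensional Benchmark Standards above (trimmed-mean speed, linearity ratio, and the median-baseline stability check) can be sketched in a few lines of Python. This is a minimal sketch, not the platform's actual evaluation code; the function names and the sample numbers are hypothetical.

```python
import statistics


def speed_score(samples_per_sec: list[float]) -> float:
    """Speed: drop the highest and lowest of 5 measurements,
    then average the remaining 3."""
    assert len(samples_per_sec) == 5
    middle = sorted(samples_per_sec)[1:-1]
    return sum(middle) / len(middle)


def linearity(multi_card_speed: float, card_count: int,
              single_card_speed: float) -> float:
    """Linearity: per-card speed of the multi-card run divided by
    the single-card speed (1.0 = perfect linear scaling)."""
    return (multi_card_speed / card_count) / single_card_speed


def stability(convergence_values: list[float]) -> float:
    """Stability: over 5 full runs, take the median as the baseline;
    deduct 20% if any run deviates from it by more than 1%."""
    baseline = statistics.median(convergence_values)
    within = all(abs(v - baseline) / baseline <= 0.01
                 for v in convergence_values)
    return 1.0 if within else 0.8
```

For example, `linearity(1664, 8, 214)` (a hypothetical 8-card run at 1664 samples/s against a 214 samples/s single-card baseline) returns roughly 0.97, the same kind of value shown in the Linearity column of the results table above.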