├── README.md
├── Toward_Understanding_Deep_Learning_Framework_Bugs.pdf
└── dataset.xlsx


/README.md:
--------------------------------------------------------------------------------
# DeepLearningBugsData

## Introduction

This repo contains the dataset supporting the paper **Toward Understanding Deep Learning Framework Bugs**, which has been accepted by TOSEM 2023.

To study the characteristics and distribution of bugs in DL frameworks, we collected closed and merged pull requests from four well-known DL framework repositories: TensorFlow, PyTorch, MXNet and Deeplearning4J. In total, we analyzed 1,250 pull requests and collected 1,000 real bugs, comprising the 250 latest bugs for each DL framework. All bugs are recorded in the `dataset.xlsx` file.

## Repository

The links to the four repositories are as follows.

TensorFlow: https://github.com/tensorflow/tensorflow

PyTorch: https://github.com/pytorch/pytorch

MXNet: https://github.com/apache/incubator-mxnet

Deeplearning4J: https://github.com/eclipse/deeplearning4j

## Information

Here we introduce some important labels in the worksheet.

- issue: id of the issue solved by or relevant to the pull request.
- pr_id: short for pull request id.
- start_time: time when the relevant issue was created.
- merge_time: time when the pull request was merged.
- patch_file: files modified by the pull request to solve the issue.
- symptom: the symptom caused by the bug.
- root_cause: the root cause of the bug.
- root_cause-sub: the subcategory of the root cause.
- component: the component of the DL framework in which the bug occurs.
- stage: the stage at which the bug occurs.
- function_num: number of functions modified in the pull request.

## Preliminary application

Guided by our study findings, we built a preliminary test case generation tool and applied it to four versions of TensorFlow.
The tool detected 6 bugs: 3 historical bugs and 3 previously unknown bugs. The issue URLs for the 3 previously unknown bugs are listed below.

1. Triggered by mutate_shape: https://github.com/tensorflow/tensorflow/issues/55214
2. Triggered by mutate_para: https://github.com/tensorflow/tensorflow/issues/55201
3. Triggered by mutate_type: https://github.com/tensorflow/tensorflow/issues/55285

--------------------------------------------------------------------------------
/Toward_Understanding_Deep_Learning_Framework_Bugs.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DLFrameworkBug/DLFrameworkBugsData/8f3caa144157f1a19a21090ff6ca78ec417c25b9/Toward_Understanding_Deep_Learning_Framework_Bugs.pdf


--------------------------------------------------------------------------------
/dataset.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DLFrameworkBug/DLFrameworkBugsData/8f3caa144157f1a19a21090ff6ca78ec417c25b9/dataset.xlsx

--------------------------------------------------------------------------------
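
The worksheet labels described in the README (`start_time`, `merge_time`, `root_cause`, and so on) lend themselves to simple summaries. The following is a minimal sketch, not part of the original repository, of how rows carrying those labels might be summarized. It assumes the worksheet has already been exported to a list of dictionaries keyed by the README's label names. The sample rows, the category placeholders, and the timestamp format are all hypothetical; the actual cell formats in `dataset.xlsx` may differ.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout; adjust to whatever the worksheet actually uses.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical rows for illustration only. Real values live in dataset.xlsx
# and are not reproduced here; "cause_a"/"cause_b" are placeholders, not
# categories taken from the paper.
SAMPLE_ROWS = [
    {"pr_id": 1, "start_time": "2021-01-01 00:00:00",
     "merge_time": "2021-01-03 12:00:00", "root_cause": "cause_a"},
    {"pr_id": 2, "start_time": "2021-02-10 08:00:00",
     "merge_time": "2021-02-11 08:00:00", "root_cause": "cause_b"},
    {"pr_id": 3, "start_time": "2021-03-05 00:00:00",
     "merge_time": "2021-03-05 06:00:00", "root_cause": "cause_a"},
]


def fix_duration_days(row):
    """Days between issue creation (start_time) and PR merge (merge_time)."""
    start = datetime.strptime(row["start_time"], TIME_FORMAT)
    merged = datetime.strptime(row["merge_time"], TIME_FORMAT)
    return (merged - start).total_seconds() / 86400


def count_by(rows, column):
    """Count how many rows fall under each distinct value of a column."""
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    for row in SAMPLE_ROWS:
        print(row["pr_id"], fix_duration_days(row))
    print(count_by(SAMPLE_ROWS, "root_cause"))
```

To work with the real data, one option is to load the worksheet with a spreadsheet reader such as `pandas.read_excel("dataset.xlsx")` and convert it to records. The sheet layout and column types would need to be checked against the file first.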