├── README.md
├── README_ZH.md
├── docs
│   ├── .DS_Store
│   ├── RDS_AuditLog_Format.md
│   └── diagrams
│       └── traffic_replay_flow.png
├── go.mod
├── go.sum
├── i18n.go
├── i18n_test.go
├── load.go
├── main.go
├── parse.go
├── parse_test.go
├── parsetidb.go
├── replay.go
├── replay_test.go
├── report.go
└── translations.go
/README.md:
--------------------------------------------------------------------------------
1 | **[中文](https://github.com/Bowen-Tang/sql-replay/blob/main/README_ZH.md) | [English](https://github.com/Bowen-Tang/sql-replay/blob/main/README.md)**
2 |
3 | # Introduction
4 | ![image](docs/diagrams/traffic_replay_flow.png)
5 |
6 | ## Scenarios
7 | 1. Version upgrade compatibility and performance evaluation
8 | 2. System migration compatibility and performance evaluation
9 |
10 | ## Supported Source Databases
11 | 1. MySQL 5.6, 5.7, 8.0
12 | 2. Aurora MySQL 5.7/8.0
13 | 3. Possibly other cloud RDS services ...
14 | 4. TiDB
15 |
16 | Examples of supported MySQL slow log formats:
17 | ```
18 | # Time: 2024-01-19T16:29:48.141142Z
19 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797
20 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
21 | SET timestamp=1705681788;
22 | SELECT c FROM sbtest1 WHERE id=250438;
23 | ```
24 |
25 | ```
26 | # Time: 240119 16:29:48
27 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797
28 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
29 | SET timestamp=1705681788;
30 | SELECT c FROM sbtest1 WHERE id=250438;
31 | ```
32 |
33 | ```
34 | # Time: 231106 0:06:36
35 | # User@Host: coplo2o[coplo2o] @ [10.0.2.34] Id: 45827727
36 | # Query_time: 1.066695 Lock_time: 0.000042 Rows_sent: 1 Rows_examined: 7039 Thread_id: 45827727 Schema: db Errno: 0 Killed: 0 Bytes_received: 0 Bytes_sent: 165 Read_first
37 | : 0 Read_last: 0 Read_key: 1 Read_next: 7039 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_
38 | disk_tables: 0 Created_tmp_tables: 0 Start: 2023-11-06T00:06:35.589701 End: 2023-11-06T00:06:36.656396 Launch_time: 0.000000
39 | # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No Filesort: No Filesort_on_disk: No
40 | use db;
41 | SET timestamp=1699200395;
42 | SELECT c FROM sbtest1 WHERE id=250438;
43 | ```
44 |
45 | ## Parse Section
46 | Reads MySQL slow query logs, removes automatically generated statements that are not real workload (such as "SET timestamp=xx", "# Administrator" lines, and "--" comments), and generates a formatted JSON file (one entry per line) for replay.
47 |
48 | ## Replay Section
49 | 1. Reads the formatted JSON file generated in the parse stage (or by parse-tshark -mode parse2file from the packet capture tool tshark). Replay can be filtered by upstream database user, upstream SQL type (all, select), and upstream database name (the latter only for logs collected by packet capture tools).
50 | 2. Executes in parallel across connection ids; SQL statements sharing the same connection id run serially.
51 | 3. Outputs replay results to JSON files (one file per connection id).
52 |
53 | ## Load Section
54 | 1. Parses the JSON files generated by replay, then uses the TiDB parser module to normalize SQL and generate fingerprints (sql digest).
55 | 2. Writes the parsed information into the replay_info table in the database.
56 |
57 | ## Report Section
58 | Analyzes replay results and generates a replay report (including response time comparison and error information).
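For reference, each line of the formatted JSON file produced by the parse stage is one complete log entry. A representative line is shown below (field names match the LogEntry struct in replay.go; the values are illustrative and the digest is shortened):
```
{"connection_id":"797","query_time":38,"sql":"SELECT c FROM sbtest1 WHERE id=250438;","rows_sent":1,"username":"t1","sql_type":"select","dbname":"","ts":1705681788.141142,"digest":"e0b7..."}
```
query_time is in microseconds and ts is a Unix timestamp in seconds; the replay stage uses ts to preserve the original relative ordering and pacing between statements.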
59 |
60 | # Usage Example
61 |
62 | ## Download and Extract
63 | ```
64 | mkdir replay && cd replay && wget https://github.com/Bowen-Tang/sql-replay/releases/download/v0.3.4/v0.3.4.zip
65 | unzip v0.3.4.zip
66 | ```
67 |
68 | ## 1. Parse Slow Query Log
69 | ```
70 | # Parse MySQL Slow Log
71 | ./sql-replay -mode parsemysqlslow -slow-in /opt/slow.log -slow-out /opt/slow.format
72 | # Parse TiDB Slow Log
73 | ./sql-replay -mode parsetidbslow -slow-in /opt/slow.log -slow-out /opt/slow.format
74 | ```
75 | Note:
76 | 1. /opt/slow.log is the path to the slow query log; slow.format is the output formatted file.
77 | 2. TiDB slow logs are split by file. When replaying multiple slow log files, it is recommended to merge the output results into a single replay file in order.
78 |
79 | ## 2. Connect to Target Database for Replay
80 | ```
81 | mkdir out # To store Replay Results
82 | # Replay All users and All SQL type
83 | ./sql-replay -mode replay -db 'user:password@tcp(ip:port)/db' -speed 1.0 -slow-out /opt/slow.format -replay-out ./out/sb1_all -username all -sqltype all -dbname all -lang en
84 | # Replay All users and only Select SQL
85 | ./sql-replay -mode replay -db 'user:password@tcp(ip:port)/db' -speed 1.0 -slow-out /opt/slow.format -replay-out ./out/sb1_select -username all -sqltype select -dbname db1 -lang zh
86 | ```
87 |
88 | Note:
89 | 1. 'out' is the directory for storing replay results **(can be changed to other directories, needs to be manually created)**. sb1_all/sb1_select is the replay task name; 'speed' is the replay speed multiplier. Increasing it is recommended when the captured slow-query period is long but contains few statements, and also when you want to simulate higher pressure.
90 | 2. In 'user:password@tcp(ip:port)/db', 'db' refers to the target database for replay.
91 | 3. Advanced feature: -ignoredigests digest1,digest2,digest3... (SQL statements whose digest is in this list are skipped during replay).
92 |
93 | ## 3. Import Replay Results to Database
94 | **Import data**
95 | ```
96 | # Import Replay Results For task sb1_all
97 | ./sql-replay -mode load -db 'user:password@tcp(ip:port)/db' -out-dir ./out -replay-name sb1_all -table replay_info
98 | # Import Replay Results For task sb1_select
99 | ./sql-replay -mode load -db 'user:password@tcp(ip:port)/db' -out-dir ./out -replay-name sb1_select -table replay_info
100 | ```
101 | Note: -out-dir is the directory for storing replay results, -replay-name is the replay task name, -table is the result table to write to.
102 |
103 | ## 4. Generate Report
104 | ```
105 | ./sql-replay -mode report -db 'user:password@tcp(ip:port)/db' -replay-name sb1_all -port ':8081'
106 | ```
107 | Note: After execution, you can access the report content at IP:PORT.
108 |
109 | # Report Example
110 |
111 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/c72dcbea-ad39-4ade-ad09-24dd163b913a)
112 | The Replay Summary records the "total SQL execution time comparison", "number of faster SQL statements", "number of slower SQL statements", and "number of erroneous SQL statements".
113 |
114 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/6f027083-88ff-49a3-a6fc-f7bf952f9f6f)
115 | Note:
116 | 1. For ease of display, text content has been abbreviated with ..., but you can still copy the full content by double-clicking and selecting the cell content; additionally, sample_sql_text supports preview.
117 | 2. The SQL Error Info is sorted by sql_digest and error_info (first 10 characters).
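If you prefer to dig into results beyond the built-in report, the replay_info table written in the load step can be queried directly. Below is a rough per-digest comparison; the column names come from the CREATE TABLE statement in load.go, but the query itself is only an illustration, not what the report runs:
```
SELECT sql_digest,
       COUNT(*)                                         AS execs,
       AVG(query_time)                                  AS avg_source_us,
       AVG(execution_time)                              AS avg_replay_us,
       SUM(error_info IS NOT NULL AND error_info <> '') AS error_cnt
FROM replay_info
WHERE file_name LIKE 'sb1_all%'
GROUP BY sql_digest
ORDER BY avg_replay_us DESC
LIMIT 10;
```
Here query_time is the latency recorded in the source log and execution_time the latency measured during replay, so statements that regressed on the target float to the top. The file_name filter works because load tags every row with the per-connection result file name, which begins with the replay task name.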
118 |
119 | # Multiple Speed Replay Effects (1x (default), 10x, 50x)
120 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/8ebbee92-586d-4090-97d6-9e2c87c640bc)
121 |
122 | # Replay Basically Follows Original Order
123 | Replay file content
124 |
125 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/e87e84fd-7318-41d0-8356-ddce5c744e2d)
126 |
127 | SQL execution order recorded in the database
128 |
129 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/7d9d7f84-80a1-44d6-b3d5-8f27b933ffcb)
130 |
131 | # How to build
132 |
133 | 1. Install golang 1.20 or above
134 | 2. Download the repo
135 | ```
136 | git clone https://github.com/Bowen-Tang/sql-replay
137 | ```
138 |
139 | 3. Build
140 |
141 | ```
142 | cd sql-replay
143 | go mod tidy
144 | go build
145 | ```
146 |
147 | # Replay Suggestions
148 | 1. When there's only one database and one user in the database, use -username all -dbname all for replay.
149 | 2. When there are multiple databases and users, it's recommended to start multiple sql-replay processes for parallel replay (otherwise, a large number of SQL errors will occur). Each process corresponds to a different -username and -dbname (note that the username and database name in -db should also be consistent).
150 |
151 | # Known Issues
152 | 1. When replaying from slow query logs, since the log doesn't record database information, you must specify -dbname all (or omit the flag) during replay; otherwise nothing will be replayed. (If you want to restrict a slow-query replay to one database, specify -username and set the username and database name in -db to the target database.)
153 | 2. Replaying SQL with hundreds of thousands of rows like "insert into ... (),(),(),() " may cause the program to crash.
154 | 3. In SQL from packet capture replay, if it's a prepared statement with ? placeholders, these SQL statements will execute with errors during replay.
155 | 4. The SQL replay order is not exactly the same as the real execution order.
156 | 5. The execution time recorded in MySQL slow query logs may be lower than the actual elapsed time (e.g., select sleep(10) won't be recorded as 10 seconds, and MySQL 5.7 doesn't include lock waiting time, etc.)
157 | 6. Slow query log formats of cloud RDS vary and might not all be supported; MariaDB is not currently supported because connection_id cannot be obtained; support will be added later.
158 | 7. When there are too many connection_id values (>4096), you may encounter a "too many open files" error during replay. Temporary workaround: run ulimit -n 1000000 before replay.
159 |
--------------------------------------------------------------------------------
/README_ZH.md:
--------------------------------------------------------------------------------
1 | **[中文](https://github.com/Bowen-Tang/sql-replay/blob/main/README_ZH.md) | [English](https://github.com/Bowen-Tang/sql-replay/blob/main/README.md)**
2 | # 功能介绍
3 | ![image](docs/diagrams/traffic_replay_flow.png)
4 |
5 |
6 |
7 | ## 适用场景
8 | 1. 版本升级兼容性及性能评估
9 | 2. 系统迁移兼容性及性能评估
10 |
11 | ## 支持的源端数据库
12 | 1. MySQL 5.6, 5.7, 8.0
13 | 2. Aurora MySQL 5.7/8.0
14 | 3. 云上 MySQL RDS
15 | 4. 
TiDB 16 | 17 | 支持的 MySQL 日志格式示例: 18 | ``` 19 | # Time: 2024-01-19T16:29:48.141142Z 20 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797 21 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 22 | SET timestamp=1705681788; 23 | SELECT c FROM sbtest1 WHERE id=250438; 24 | ``` 25 | 26 | ``` 27 | # Time: 240119 16:29:48 28 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797 29 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 30 | SET timestamp=1705681788; 31 | SELECT c FROM sbtest1 WHERE id=250438; 32 | ``` 33 | 34 | ``` 35 | # Time: 231106 0:06:36 36 | # User@Host: coplo2o[coplo2o] @ [10.0.2.34] Id: 45827727 37 | # Query_time: 1.066695 Lock_time: 0.000042 Rows_sent: 1 Rows_examined: 7039 Thread_id: 45827727 Schema: db Errno: 0 Killed: 0 Bytes_received: 0 Bytes_sent: 165 Read_first 38 | : 0 Read_last: 0 Read_key: 1 Read_next: 7039 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_ 39 | disk_tables: 0 Created_tmp_tables: 0 Start: 2023-11-06T00:06:35.589701 End: 2023-11-06T00:06:36.656396 Launch_time: 0.000000 40 | # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No Filesort: No Filesort_on_disk: No 41 | use db; 42 | SET timestamp=1699200395; 43 | SELECT c FROM sbtest1 WHERE id=250438; 44 | ``` 45 | 46 | ## parse 部分 47 | 读取 MySQL 慢查询日志,去掉 MySQL 中自动生成的 set timestamp=xx/# Administor/-- 等无效 SQL,生成一个可以格式化的 json 文件,用于回放 48 | ## replay 部分 49 | 1. 读取 parse 阶段(或抓包工具 tshark 经 parse-tshark -mode parse2file)生成的格式化 json 文件,可过滤上游数据库用户、上游 SQL 类型(all、select)、上游数据库名(仅支持抓包工具采集的日志)来进行回放 50 | 2. 根据 connection id 并行,相同 connection id 的 SQL 串行 51 | 3. 将回放结果输出成 json 文件(按照 connection id 区分) 52 | ## load 部分 53 | 1. 解析 replay 生成的 json 文件,使用 TiDB Parse 模块对 SQL 进行格式化,并生成指纹(sql digest) 54 | 2. 将解析出来的信息写入数据库的 replay_info 表中 55 | ## report 部分 56 | 对回放结果进行分析,生成回放报告(含响应时间对比、错误信息) 57 | 58 | # 操作示例 59 | ## 下载并解压 60 | ``` 61 | mkdir replay && cd replay && wget https://github.com/Bowen-Tang/sql-replay/releases/download/v0.3.4/v0.3.4.zip 62 | unzip v0.3.4.zip 63 | ``` 64 | 65 | ## 1. 解析慢查询日志 66 | ``` 67 | # MySQL Slow Log 68 | ./sql-replay -mode parsemysqlslow -slow-in /opt/slow.log -slow-out /opt/slow.format 69 | # TiDB Slow Log 70 | ./sql-replay -mode parsetidbslow -slow-in /opt/slow.log -slow-out /opt/slow.format 71 | ``` 72 | 73 | 说明: 74 | 1. /opt/slow.log 为慢查询日志路径,slow.format 则为输出的格式化文件 75 | 2. TiDB 慢日志按文件进行了切分,当需要回放多个慢日志文件时,建议将输出结果按照顺序合并为一个回放文件 76 | 77 | ## 2. 连接目标库回放 78 | 79 | ``` 80 | mkdir out # 用于存储回放结果 81 | # 回放所有用户、所有 SQL 82 | ./sql-replay -mode replay -db 'user:password@tcp(ip:port)/db' -speed 1.0 -slow-out /opt/slow.format -replay-out ./out/sb1_all -username all -sqltype all -dbname all -lang en 83 | # 回放所有用户、select 语句 84 | ./sql-replay -mode replay -db 'user:password@tcp(ip:port)/db' -speed 1.0 -slow-out /opt/slow.format -replay-out ./out/sb1_select -username all -sqltype select -dbname db1 -lang zh 85 | ``` 86 | 说明: 87 | 88 | 1. out 为回放结果存储目录**(可更换为其他目录,需手动创建)**,sb1_all/sb1_select 为回放任务名称;speed 为回放速度,当慢查询周期很长但语句很少时建议增大回放速度,当需要模拟更大压力时,建议增大回放速度 89 | 2. 'user:password@tcp(ip:port)/db' 中的 db 指的是用于回放的目标库 90 | 3. 高级功能:-ignoredigests digest1,digest2,digest3... 回放时可以忽略指定的 SQL 91 | 92 | ## 3. 
导入回放结果到数据库 93 | **导入数据** 94 | ``` 95 | # 导入回放任务 sb1_all 的回放数据 96 | ./sql-replay -mode load -db 'user:password@tcp(ip:port)/db' -out-dir ./out -replay-name sb1_all -table replay_info 97 | # 导入回放任务 sb1_select 的回放数据 98 | ./sql-replay -mode load -db 'user:password@tcp(ip:port)/db' -out-dir ./out -replay-name sb1_select -table replay_info 99 | ``` 100 | 说明:-out-dir 为回放结果存储目录,-replay-name 回放任务名称,table 为写入结果表 101 | 102 | ## 4. 生成报告 103 | 104 | ``` 105 | ./sql-replay -mode report -db 'user:password@tcp(ip:port)/db' -replay-name slow1 -port ':8081' 106 | ``` 107 | 说明:执行完可访问 IP:PORT 访问报告内容 108 | 109 | # 报告示例 110 | 111 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/c72dcbea-ad39-4ade-ad09-24dd163b913a) 112 | Replay Summary 中,记录了 SQL 总耗时对比、快的 SQL 条数、慢的 SQL 条数、错误的 SQL 条数 113 | 114 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/6f027083-88ff-49a3-a6fc-f7bf952f9f6f) 115 | 116 | 说明: 117 | 1. 为方便展示,将文本内容使用 ... 进行了省略,但依旧可以通过双击单元格选择内容后复制完整内容;另外 sample_sql_text 支持预览 118 | 2. Sql Error Info 中,根据 sql_digest 以及 error_info(前 10 位)排序 119 | 120 | 121 | # 倍速回放效果(1 倍(默认),10 倍,50 倍) 122 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/8ebbee92-586d-4090-97d6-9e2c87c640bc) 123 | 124 | # 回放基本按照原始顺序回放 125 | 回放文件内容 126 | 127 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/e87e84fd-7318-41d0-8356-ddce5c744e2d) 128 | 129 | 数据库记录的 SQL 执行顺序 130 | 131 | ![image](https://github.com/Bowen-Tang/sql-replay/assets/52245161/7d9d7f84-80a1-44d6-b3d5-8f27b933ffcb) 132 | 133 | 134 | 135 | # 编译安装方法 136 | 137 | 1. 安装 golang 1.20 及以上 138 | 2. 下载项目 139 | 140 | ``` 141 | git clone https://github.com/Bowen-Tang/sql-replay 142 | ``` 143 | 144 | 4. 编译 sql-replay 145 | 146 | ``` 147 | cd sql-replay 148 | go mod tidy 149 | go build 150 | ``` 151 | # 回放建议 152 | 1. 当数据库中就一个 database,一个 user 时,使用 -username all -dbname all 来回放 153 | 2. 当数据库中有多个 database、多个 user 时,建议启动多个 sql-replay 进程并行回放(否则将出现大量 SQL 报错),每个进程对应不同的 -username 和 -dbname(注意 -db 中的用户名、数据库名也需保持一致) 154 | 155 | # 已知问题 156 | 1. 通过慢查询回放时,由于日志中没有记录 database 信息,所以在 replay 时,只能指定 -db all,或者不指定,否则不会进行回放(如果想要在慢查询回放时过滤库,可以通过指定 -username 以及 -db 中的用户名和数据库名的形式来完成对应库的回放) 157 | 2. insert into ... (),(),(),() 数十万行的 SQL 回放时,有可能会导致程序崩溃 158 | 3. 抓包回放的 SQL 中,如果是预编译 ? 占位符类型时,回放时这部分 SQL 会执行报错 159 | 4. SQL 回放顺序并不完全与真实执行顺序相等 160 | 5. MySQL 慢查询日志中记录的执行时间可能比真实时间慢(如 select sleep(10),并不会记录为 10 秒,如 MySQL 5.7 中并不包含等锁时间等) 161 | 6. 云上 RDS 的慢查询日志格式不尽相同,可能不支持;暂不支持 MariaDB,当前无法获取 connection_id,后续加上 162 | 7. 
当 connection_id 值过多(>4096)时,进行回放时会遇到 too many open files 错误,临时解决办法:回放前 ulimit -n 1000000 163 | -------------------------------------------------------------------------------- /docs/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Bowen-Tang/sql-replay/2434219a672313227193e79278e6e1565455674b/docs/.DS_Store -------------------------------------------------------------------------------- /docs/RDS_AuditLog_Format.md: -------------------------------------------------------------------------------- 1 | **HW Cloud RDS 审计日志内容示例** 2 | ``` 3 | "21596830293", "93726418", "0", "Query", "2024-09-04T00:57:27 UTC", "select", "SELECT pushcode FROM els WHERE els.pn = 'staff500' AND els.ut = '20' ORDER BY els.ct DESC limit 0,1", "root[root] @ [10.3.2.22]", "", "", "10.3.2.22", "portal" 4 | "21596830294", "93594738", "0", "Query", "2024-09-04T00:57:27 UTC", "set_option", "SET autocommit=0", "root[root] @ [10.3.2.34]", "", "", "10.3.2.34", "portal" 5 | "21596830295", "93789840", "0", "Query", "2024-09-04T00:57:27 UTC", "select", "select @@session.tx_read_only", "root[root] @ [10.3.2.21]", "", "", "10.3.2.21", "portal" 6 | ``` 7 | 8 | **HW Cloud RDS 审计日志存储方式** 9 | 10 | 按 100MB 一个文件切割,按序号存储,如 1~9 11 | 12 | **脚本1:format.sh** 13 | 14 | ``` 15 | #!/bin/bash 16 | 17 | # 检查参数数量 18 | if [ "$#" -ne 2 ]; then 19 | echo "Usage: $0 input_file output_file" 20 | exit 1 21 | fi 22 | 23 | input_file="$1" # 输入文件名 24 | output_file="$2" # 输出文件名 25 | 26 | # 初始化变量 27 | rows='' 28 | 29 | # 逐行读取输入文件 30 | while IFS= read -r line 31 | do 32 | # 处理新行 33 | row="$line" 34 | 35 | # 判断当前行是否以 '"2159' 开头 36 | if [[ "${row:0:12}" =~ ^\"[0-9]{11} ]]; then 37 | # 将之前的内容写入到输出文件中 38 | if [ -n "$rows" ]; then 39 | echo "${rows}" >> "$output_file" 40 | fi 41 | # 重置 rows 变量 42 | rows="$row" 43 | else 44 | # 追加到 rows 变量中 45 | rows="${rows} ${row}" 46 | fi 47 | done < "$input_file" 48 | 49 | # 处理最后一段数据 50 | if [ -n "$rows" ]; then 51 | echo -e "${rows}" >> "$output_file" 52 | fi 53 | ``` 54 | 55 | **运行格式化** 56 | 57 | ```for i in `seq 1 9`; do sh ../format.sh ./$i ./for/$i; done``` 58 | 59 | **python 脚本(用于格式化成 sql-replay 可回放的文件)** 60 | 61 | ``` 62 | # -*- coding: utf-8 -*- 63 | 64 | import json 65 | import time 66 | 67 | def process_file(input_file, output_file): 68 | with open(input_file, 'r', encoding='utf-8') as csv_file, open(output_file, 'a', encoding='utf-8') as json_file: 69 | for line in csv_file: 70 | # 使用 '", "' 作为分隔符手动分割行数据 71 | row = line.strip().split('", "') 72 | 73 | # 处理提取的字段 74 | if len(row) < 7: 75 | continue # 跳过不完整的行 76 | 77 | # 去除字段开始和结束的引号 78 | connection_id = row[1].strip('"') 79 | ts = row[4].strip('"') 80 | sql_type = row[5].strip('"') 81 | sql = row[6].strip('"') 82 | jd = ".000001" 83 | dbname = "portal" 84 | 85 | # 去掉时区部分 86 | ts = ts.replace(" UTC", "") 87 | 88 | # 处理时间戳为浮点数,6 位精度 89 | try: 90 | ts_struct = time.strptime(ts, "%Y-%m-%dT%H:%M:%S") 91 | timestamp = time.mktime(ts_struct) 92 | timestamp = round(timestamp, 6) 93 | except ValueError: 94 | timestamp = 0.0 95 | 96 | # 创建 JSON 对象 97 | log_entry = { 98 | "connection_id": connection_id, 99 | "query_time": 100, 100 | "sql": sql, 101 | "rows_sent": 0, 102 | "username": "t1", 103 | "sql_type": sql_type, 104 | "dbname": dbname, 105 | "ts": timestamp # 不带引号的浮点数 106 | } 107 | 108 | # 写入 JSON 文件 109 | json.dump(log_entry, json_file) 110 | json_file.write('\n') 111 | 112 | # 处理多个文件 113 | for i in range(1, 10): 114 | input_file = f"{i}" 115 | output_file = "db1.log" 116 | process_file(input_file, 
output_file) 117 | ``` 118 | 119 | -------------------------------------------------------------------------------- /docs/diagrams/traffic_replay_flow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Bowen-Tang/sql-replay/2434219a672313227193e79278e6e1565455674b/docs/diagrams/traffic_replay_flow.png -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module sql-replay 2 | 3 | go 1.20 4 | 5 | require ( 6 | github.com/go-sql-driver/mysql v1.7.1 7 | github.com/pingcap/tidb/pkg/parser v0.0.0-20240126183920-6a87b80e2c8d 8 | ) 9 | 10 | require ( 11 | github.com/benbjohnson/clock v1.3.5 // indirect 12 | github.com/cznic/mathutil v0.0.0-20181122101859-297441e03548 // indirect 13 | github.com/pingcap/errors v0.11.5-0.20210425183316-da1aaba5fb63 // indirect 14 | github.com/pingcap/failpoint v0.0.0-20220801062533-2eaa32854a6c // indirect 15 | github.com/pingcap/log v1.1.0 // indirect 16 | github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect 17 | go.uber.org/atomic v1.11.0 // indirect 18 | go.uber.org/multierr v1.11.0 // indirect 19 | go.uber.org/zap v1.25.0 // indirect 20 | golang.org/x/exp v0.0.0-20231006140011-7918f672742d // indirect 21 | golang.org/x/text v0.12.0 // indirect 22 | gopkg.in/natefinch/lumberjack.v2 v2.2.1 // indirect 23 | ) 24 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= 2 | github.com/benbjohnson/clock v1.1.0/go.mod h1:J11/hYXuz8f4ySSvYwY0FKfm+ezbsZBKZxNJlLklBHA= 3 | github.com/benbjohnson/clock v1.3.5 h1:VvXlSJBzZpA/zum6Sj74hxwYI2DIxRWuNIoXAzHZz5o= 4 | github.com/benbjohnson/clock v1.3.5/go.mod h1:J11/hYXuz8f4ySSvYwY0FKfm+ezbsZBKZxNJlLklBHA= 5 | github.com/cznic/mathutil v0.0.0-20181122101859-297441e03548 h1:iwZdTE0PVqJCos1vaoKsclOGD3ADKpshg3SRtYBbwso= 6 | github.com/cznic/mathutil v0.0.0-20181122101859-297441e03548/go.mod h1:e6NPNENfs9mPDVNRekM7lKScauxd5kXTr1Mfyig6TDM= 7 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 8 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 9 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 10 | github.com/go-sql-driver/mysql v1.7.1 h1:lUIinVbN1DY0xBg0eMOzmmtGoHwWBbvnWubQUrtU8EI= 11 | github.com/go-sql-driver/mysql v1.7.1/go.mod h1:OXbVy3sEdcQ2Doequ6Z5BW6fXNQTmx+9S1MCJN5yJMI= 12 | github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= 13 | github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= 14 | github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= 15 | github.com/pingcap/errors v0.11.0/go.mod h1:Oi8TUi2kEtXXLMJk9l1cGmz20kV3TaQ0usTwv5KuLY8= 16 | github.com/pingcap/errors v0.11.4/go.mod h1:Oi8TUi2kEtXXLMJk9l1cGmz20kV3TaQ0usTwv5KuLY8= 17 | github.com/pingcap/errors v0.11.5-0.20210425183316-da1aaba5fb63 h1:+FZIDR/D97YOPik4N4lPDaUcLDF/EQPogxtlHB2ZZRM= 18 | github.com/pingcap/errors v0.11.5-0.20210425183316-da1aaba5fb63/go.mod h1:X2r9ueLEUZgtx2cIogM0v4Zj5uvvzhuuiu7Pn8HzMPg= 19 | github.com/pingcap/failpoint v0.0.0-20220801062533-2eaa32854a6c h1:CgbKAHto5CQgWM9fSBIvaxsJHuGP0uM74HXtv3MyyGQ= 20 | github.com/pingcap/failpoint 
v0.0.0-20220801062533-2eaa32854a6c/go.mod h1:4qGtCB0QK0wBzKtFEGDhxXnSnbQApw1gc9siScUl8ew= 21 | github.com/pingcap/log v1.1.0 h1:ELiPxACz7vdo1qAvvaWJg1NrYFoY6gqAh/+Uo6aXdD8= 22 | github.com/pingcap/log v1.1.0/go.mod h1:DWQW5jICDR7UJh4HtxXSM20Churx4CQL0fwL/SoOSA4= 23 | github.com/pingcap/tidb/pkg/parser v0.0.0-20240126183920-6a87b80e2c8d h1:p0xcjxuV3DorBSQRig2tAHLxQ+DUyMf7pGVIaOpOW1E= 24 | github.com/pingcap/tidb/pkg/parser v0.0.0-20240126183920-6a87b80e2c8d/go.mod h1:yRkiqLFwIqibYg2P7h4bclHjHcJiIFRLKhGRyBcKYus= 25 | github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 26 | github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 27 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 28 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 29 | github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94icq4NjY3clb7Lk8O1qJ8BdBEF8z0ibU0rE= 30 | github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo= 31 | github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM= 32 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 33 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 34 | github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= 35 | github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 36 | github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk= 37 | go.uber.org/atomic v1.6.0/go.mod h1:sABNBOSYdrvTF6hTgEIbc7YasKWGhgEQZyfxyTvoXHQ= 38 | go.uber.org/atomic v1.7.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc= 39 | go.uber.org/atomic v1.9.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc= 40 | go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE= 41 | go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0= 42 | go.uber.org/goleak v1.1.10/go.mod h1:8a7PlsEVH3e/a/GLqe5IIrQx6GzcnRmZEufDUTk4A7A= 43 | go.uber.org/goleak v1.2.0 h1:xqgm/S+aQvhWFTtR0XK3Jvg7z8kGV8P4X14IzwN3Eqk= 44 | go.uber.org/multierr v1.6.0/go.mod h1:cdWPpRnG4AhwMwsgIHip0KRBQjJy5kYEpYjJxpXp9iU= 45 | go.uber.org/multierr v1.7.0/go.mod h1:7EAYxJLBy9rStEaz58O2t4Uvip6FSURkq8/ppBp95ak= 46 | go.uber.org/multierr v1.11.0 h1:blXXJkSxSSfBVBlC76pxqeO+LN3aDfLQo+309xJstO0= 47 | go.uber.org/multierr v1.11.0/go.mod h1:20+QtiLqy0Nd6FdQB9TLXag12DsQkrbs3htMFfDN80Y= 48 | go.uber.org/zap v1.19.0/go.mod h1:xg/QME4nWcxGxrpdeYfq7UvYrLh66cuVKdrbD1XF/NI= 49 | go.uber.org/zap v1.25.0 h1:4Hvk6GtkucQ790dqmj7l1eEnRdKm3k3ZUrUMS2d5+5c= 50 | go.uber.org/zap v1.25.0/go.mod h1:JIAUzQIH94IC4fOJQm7gMmBJP5k7wQfdcnYdPoEXJYk= 51 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= 52 | golang.org/x/exp v0.0.0-20231006140011-7918f672742d h1:jtJma62tbqLibJ5sFQz8bKtEM8rJBtfilJ2qTU199MI= 53 | golang.org/x/exp v0.0.0-20231006140011-7918f672742d/go.mod h1:ldy0pHrwJyGW56pPQzzkH36rKxoZW1tw7ZJpeKx+hdo= 54 | golang.org/x/lint v0.0.0-20190930215403-16217165b5de/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= 55 | golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= 56 | golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 57 | 
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 58 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 59 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= 60 | golang.org/x/text v0.12.0 h1:k+n5B8goJNdU7hSvEtMUz3d1Q6D/XW4COJSJR6fN0mc= 61 | golang.org/x/text v0.12.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE= 62 | golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= 63 | golang.org/x/tools v0.0.0-20191029041327-9cc4af7d6b2c/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= 64 | golang.org/x/tools v0.0.0-20191108193012-7d206e10da11/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= 65 | golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 66 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 67 | gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 68 | gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 69 | gopkg.in/natefinch/lumberjack.v2 v2.0.0/go.mod h1:l0ndWWf7gzL7RNwBG7wST/UCcT4T24xpD6X8LsfU/+k= 70 | gopkg.in/natefinch/lumberjack.v2 v2.2.1 h1:bBRl1b0OH9s/DuPhuXpNl+VtCaJXFZ5/uEFST95x9zc= 71 | gopkg.in/natefinch/lumberjack.v2 v2.2.1/go.mod h1:YD8tP3GAjkrDg1eZH7EGmyESg/lsYskCTPBJVb9jqSc= 72 | gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 73 | gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 74 | gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 75 | gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 76 | gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 77 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 78 | -------------------------------------------------------------------------------- /i18n.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "sync" 6 | ) 7 | 8 | // I18n 结构体定义 9 | type I18n struct { 10 | translations map[string]map[string]string 11 | defaultLang string 12 | mu sync.RWMutex 13 | } 14 | 15 | // 修改后的 NewI18n 函数,使用 translations.go 中定义的 translations 变量 16 | func NewI18n(defaultLang string) (*I18n, error) { 17 | i18n := &I18n{ 18 | translations: translations, // 使用 translations.go 中的 translations 变量 19 | defaultLang: defaultLang, 20 | } 21 | return i18n, nil 22 | } 23 | 24 | // T 方法用于获取翻译并格式化字符串 25 | func (i *I18n) T(lang, key string, args ...interface{}) string { 26 | i.mu.RLock() 27 | defer i.mu.RUnlock() 28 | 29 | var message string 30 | 31 | // 尝试从指定语言中获取翻译 32 | if langMessages, ok := i.translations[lang]; ok { 33 | if msg, ok := langMessages[key]; ok { 34 | message = msg 35 | } 36 | } 37 | 38 | // 如果指定语言没有找到,尝试使用默认语言 39 | if message == "" { 40 | if langMessages, ok := i.translations[i.defaultLang]; ok { 41 | if msg, ok := langMessages[key]; ok { 42 | message = msg 43 | } 44 | } 45 | } 46 | 47 | // 如果仍然找不到翻译,直接返回键值 48 | if message == "" { 49 | return key 50 | } 51 | 52 | // 如果没有参数,直接返回消息模板 53 | if len(args) == 0 { 54 | return message 55 | } 56 | 57 | // 使用参数格式化消息模板 58 | return fmt.Sprintf(message, args...) 
59 | } 60 | -------------------------------------------------------------------------------- /i18n_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "testing" 5 | ) 6 | 7 | // 测试 NewI18n 8 | func TestNewI18n(t *testing.T) { 9 | // Test NewI18n 10 | i18n, err := NewI18n("en") 11 | if err != nil { 12 | t.Fatalf("NewI18n failed: %v", err) 13 | } 14 | 15 | if i18n.defaultLang != "en" { 16 | t.Errorf("Expected default language 'en', got '%s'", i18n.defaultLang) 17 | } 18 | 19 | if len(i18n.translations) != 2 { 20 | t.Errorf("Expected 2 languages, got %d", len(i18n.translations)) 21 | } 22 | } 23 | 24 | // 测试 T 方法 25 | func TestI18n_T(t *testing.T) { 26 | i18n := &I18n{ 27 | translations: map[string]map[string]string{ 28 | "en": { 29 | "hello": "Hello", 30 | "world": "World", 31 | "with_args": "Hello, %s!", 32 | }, 33 | "es": { 34 | "hello": "Hola", 35 | "world": "Mundo", 36 | "with_args": "¡Hola, %s!", 37 | }, 38 | }, 39 | defaultLang: "en", 40 | } 41 | 42 | tests := []struct { 43 | lang string 44 | key string 45 | args []interface{} 46 | expected string 47 | }{ 48 | {"en", "hello", nil, "Hello"}, 49 | {"es", "hello", nil, "Hola"}, 50 | {"en", "world", nil, "World"}, 51 | {"es", "world", nil, "Mundo"}, 52 | {"fr", "hello", nil, "Hello"}, // Fallback to default language 53 | {"en", "unknown", nil, "unknown"}, // Key not found 54 | {"en", "with_args", []interface{}{"John"}, "Hello, John!"}, 55 | {"es", "with_args", []interface{}{"Juan"}, "¡Hola, Juan!"}, 56 | } 57 | 58 | for _, tt := range tests { 59 | t.Run(tt.lang+"/"+tt.key, func(t *testing.T) { 60 | result := i18n.T(tt.lang, tt.key, tt.args...) 61 | if result != tt.expected { 62 | t.Errorf("T(%q, %q, %v) = %q, want %q", tt.lang, tt.key, tt.args, result, tt.expected) 63 | } 64 | }) 65 | } 66 | } 67 | -------------------------------------------------------------------------------- /load.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "database/sql" 5 | "encoding/json" 6 | "fmt" 7 | "io/ioutil" 8 | "path/filepath" 9 | "strings" 10 | "sync" 11 | "time" 12 | 13 | _ "github.com/go-sql-driver/mysql" 14 | "github.com/pingcap/tidb/pkg/parser" 15 | ) 16 | 17 | const ( 18 | batchSize = 1000 19 | workers = 4 20 | ) 21 | 22 | func LoadData(dbConnStr, outDir, replayOut, tableName string) { 23 | if !validateInputs(dbConnStr, outDir, replayOut, tableName) { 24 | return 25 | } 26 | 27 | fmt.Printf("load batchsize: %d, load workers: %d\n",batchSize,workers) 28 | 29 | db, err := sql.Open("mysql", dbConnStr) 30 | if err != nil { 31 | fmt.Println("connect to db failed:", err) 32 | return 33 | } 34 | defer db.Close() 35 | 36 | ts_create_table := time.Now() 37 | fmt.Printf("[%s] Begin create table - REPLAY_INFO\n", ts_create_table.Format("2006-01-02 15:04:05.000")) 38 | if err := createTableIfNotExists(db, tableName); err != nil { 39 | fmt.Println("create table failed:", err) 40 | return 41 | } 42 | 43 | if err := processFilesParallel(outDir, replayOut, tableName, db); err != nil { 44 | fmt.Println("process files failed:", err) 45 | } 46 | } 47 | 48 | func createTableIfNotExists(db *sql.DB, tableName string) error { 49 | createTableSQL := fmt.Sprintf(` 50 | CREATE TABLE IF NOT EXISTS %s ( 51 | sql_text longtext DEFAULT NULL, 52 | sql_type varchar(16) DEFAULT NULL, 53 | sql_digest varchar(64) DEFAULT NULL, 54 | query_time bigint(20) DEFAULT NULL, 55 | rows_sent bigint(20) DEFAULT NULL, 56 | execution_time bigint(20) 
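        -- query_time (from the source log) and execution_time (measured at replay) are both stored in microseconds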
DEFAULT NULL, 57 | rows_returned bigint(20) DEFAULT NULL, 58 | error_info text DEFAULT NULL, 59 | file_name varchar(64) NOT NULL, 60 | db_name varchar(64) DEFAULT NULL 61 | )`, tableName) 62 | 63 | _, err := db.Exec(createTableSQL) 64 | return err 65 | } 66 | 67 | func processFilesParallel(outDir, replayName, tableName string, db *sql.DB) error { 68 | filePaths, err := filepath.Glob(filepath.Join(outDir, replayName+"*")) 69 | if err != nil { 70 | return fmt.Errorf("find files failed: %w", err) 71 | } 72 | 73 | var wg sync.WaitGroup 74 | errChan := make(chan error, len(filePaths)) 75 | semaphore := make(chan struct{}, workers) 76 | 77 | for _, filePath := range filePaths { 78 | wg.Add(1) 79 | go func(fp string) { 80 | defer wg.Done() 81 | semaphore <- struct{}{} 82 | defer func() { <-semaphore }() 83 | 84 | fileName := filepath.Base(fp) 85 | if err := processFile(fp, fileName, tableName, db); err != nil { 86 | errChan <- fmt.Errorf("process file %s failed: %w", fileName, err) 87 | } else { 88 | logCompletion(fileName) 89 | } 90 | }(filePath) 91 | } 92 | 93 | wg.Wait() 94 | close(errChan) 95 | 96 | for err := range errChan { 97 | if err != nil { 98 | return err 99 | } 100 | } 101 | 102 | return nil 103 | } 104 | 105 | func validateInputs(dbConnStr, outDir, replayOut, tableName string) bool { 106 | if dbConnStr == "" || outDir == "" || replayOut == "" || tableName == "" { 107 | fmt.Println("Usage: ./sql-replay -mode load -db -out-dir -replay-name -table ") 108 | return false 109 | } 110 | return true 111 | } 112 | 113 | func processFiles(outDir, replayName, tableName string, db *sql.DB) error { 114 | filePaths, err := filepath.Glob(filepath.Join(outDir, replayName+"*")) 115 | if err != nil { 116 | return fmt.Errorf("error finding files: %w", err) 117 | } 118 | 119 | for _, filePath := range filePaths { 120 | fileName := filepath.Base(filePath) 121 | if err := processFile(filePath, fileName, tableName, db); err != nil { 122 | return fmt.Errorf("error processing file %s: %w", fileName, err) 123 | } else { 124 | logCompletion(fileName) 125 | } 126 | } 127 | 128 | return nil 129 | } 130 | 131 | func processFile(filePath, fileName, tableName string, db *sql.DB) error { 132 | fileContent, err := ioutil.ReadFile(filePath) 133 | if err != nil { 134 | return fmt.Errorf("error reading file: %w", err) 135 | } 136 | 137 | lines := strings.Split(string(fileContent), "\n") 138 | for i := 0; i < len(lines); i += batchSize { 139 | end := min(i+batchSize, len(lines)) 140 | if err := insertBatch(lines[i:end], fileName, tableName, db); err != nil { 141 | return fmt.Errorf("error inserting batch: %w", err) 142 | } 143 | } 144 | 145 | return nil 146 | } 147 | 148 | func insertBatch(lines []string, fileName, tableName string, db *sql.DB) error { 149 | records := parseRecords(lines) 150 | if len(records) == 0 { 151 | return nil // No data to insert 152 | } 153 | 154 | query, args := buildInsertQuery(records, fileName, tableName) 155 | _, err := db.Exec(query, args...) 
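	// buildInsertQuery packs the whole batch (at most batchSize lines) into one multi-row INSERT, so each batch costs a single network round trip.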
156 | return err 157 | } 158 | 159 | func parseRecords(lines []string) []SQLExecutionRecord { 160 | var records []SQLExecutionRecord 161 | for _, line := range lines { 162 | if line == "" { 163 | continue 164 | } 165 | var record SQLExecutionRecord 166 | if err := json.Unmarshal([]byte(line), &record); err != nil { 167 | fmt.Printf("Error parsing JSON: %v\n", err) 168 | continue 169 | } 170 | records = append(records, record) 171 | } 172 | return records 173 | } 174 | 175 | func buildInsertQuery(records []SQLExecutionRecord, fileName, tableName string) (string, []interface{}) { 176 | valueStrings := make([]string, 0, len(records)) 177 | valueArgs := make([]interface{}, 0, len(records)*9) 178 | 179 | for _, record := range records { 180 | normalizedSQL := parser.Normalize(record.SQL) 181 | digest := parser.DigestNormalized(normalizedSQL).String() 182 | sqlType := getSQLType(normalizedSQL) 183 | 184 | valueStrings = append(valueStrings, "(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)") 185 | valueArgs = append(valueArgs, record.SQL, sqlType, digest, record.QueryTime, record.RowsSent, record.ExecutionTime, record.RowsReturned, record.ErrorInfo, fileName, record.DBName) 186 | } 187 | 188 | query := fmt.Sprintf("INSERT INTO %s (sql_text, sql_type, sql_digest, query_time, rows_sent, execution_time, rows_returned, error_info, file_name, db_name) VALUES %s", 189 | tableName, strings.Join(valueStrings, ",")) 190 | return query, valueArgs 191 | } 192 | 193 | func getSQLType(normalizedSQL string) string { 194 | words := strings.Fields(normalizedSQL) 195 | if len(words) > 0 { 196 | return strings.ToLower(words[0]) 197 | } 198 | return "other" 199 | } 200 | 201 | func min(a, b int) int { 202 | if a < b { 203 | return a 204 | } 205 | return b 206 | } 207 | 208 | func logCompletion(fileName string) { 209 | currentTime := time.Now().Format("2006-01-02 15:04:05.000") 210 | fmt.Printf("[%s] Completed processing file: %s\n", currentTime, fileName) 211 | } 212 | -------------------------------------------------------------------------------- /main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "flag" 5 | "fmt" 6 | "os" 7 | ) 8 | 9 | // Version information for the SQL Replay Tool 10 | var version = "0.3.4, build date 20241014" 11 | 12 | var showVersion bool 13 | 14 | func main() { 15 | var mode string 16 | flag.StringVar(&mode, "mode", "", "Mode of operation: parsemysqlslow ,parsetidbslow , replay, load, report") 17 | 18 | // Define flags for various operation parameters 19 | var slowLogPath, slowOutputPath, dbConnStr, replayOutputFilePath, filterUsername, filterSQLType, filterDBName, ignoreDigests, outDir, replayOut, tableName, Port string 20 | var Speed float64 21 | var lang string 22 | 23 | flag.BoolVar(&showVersion, "version", false, "Show version info") 24 | flag.StringVar(&slowLogPath, "slow-in", "", "Path to slow query log file") 25 | flag.StringVar(&slowOutputPath, "slow-out", "", "Path to slow output JSON file") 26 | flag.StringVar(&dbConnStr, "db", "username:password@tcp(localhost:3306)/test", "Database connection string") 27 | flag.StringVar(&outDir, "out-dir", "", "Directory containing the JSON files") 28 | flag.StringVar(&replayOut, "replay-name", "", "Replay output filename") 29 | flag.StringVar(&tableName, "table", "replay_info", "Name of the table to insert data into") 30 | flag.StringVar(&replayOutputFilePath, "replay-out", "", "Path to output json file") 31 | flag.StringVar(&filterUsername, "username", "all", "Username to filter 
(default 'all', or specific username)")
32 | 	flag.StringVar(&filterSQLType, "sqltype", "all", "SQL type to filter (default 'all', or 'select')")
33 | 	flag.StringVar(&filterDBName, "dbname", "all", "Database name to filter (default 'all', or specific dbname)")
34 | 	flag.StringVar(&ignoreDigests, "ignoredigests", "", "Ignore the specified digests")
35 | 	flag.Float64Var(&Speed, "speed", 1.0, "Replay speed multiplier")
36 | 	flag.StringVar(&Port, "port", ":8081", "Report web server port")
37 | 	flag.StringVar(&lang, "lang", "en", "Language for output (e.g., 'en' for English, 'zh' for Chinese)")
38 |
39 | 	flag.Parse()
40 |
41 | 	if showVersion {
42 | 		fmt.Println("SQL Replay Tool Version:", version)
43 | 		os.Exit(0)
44 | 	}
45 |
46 | 	if mode == "" {
47 | 		printUsage()
48 | 		os.Exit(1)
49 | 	}
50 |
51 | 	// Execute the appropriate function based on the selected mode
52 | 	switch mode {
53 | 	case "parsemysqlslow":
54 | 		ParseLogs(slowLogPath, slowOutputPath)
55 | 	case "parsetidbslow":
56 | 		ParseTiDBLogs(slowLogPath, slowOutputPath)
57 | 	case "replay":
58 | 		StartSQLReplay(dbConnStr, Speed, slowOutputPath, replayOutputFilePath, filterUsername, filterSQLType, filterDBName, ignoreDigests, lang)
59 | 	case "load":
60 | 		LoadData(dbConnStr, outDir, replayOut, tableName)
61 | 	case "report":
62 | 		Report(dbConnStr, replayOut, Port)
63 | 	default:
64 | 		fmt.Println("Invalid mode. Available modes: parsemysqlslow, parsetidbslow, replay, load, report")
65 | 		os.Exit(1)
66 | 	}
67 | }
68 |
69 | func printUsage() {
70 | 	fmt.Println("Usage: ./sql-replay -mode [parsemysqlslow|parsetidbslow|replay|load|report]")
71 | 	fmt.Println(" 1. parse mysql slow log: ./sql-replay -mode parsemysqlslow -slow-in -slow-out ")
72 | 	fmt.Println(" 2. parse tidb slow log: ./sql-replay -mode parsetidbslow -slow-in -slow-out ")
73 | 	fmt.Println(" 3. replay mode: ./sql-replay -mode replay -db -speed 1.0 -slow-out -replay-out -username -sqltype -dbname -ignoredigests -lang ")
74 | 	fmt.Println(" 4. load mode: ./sql-replay -mode load -db -out-dir -replay-name -table ")
75 | 	fmt.Println(" 5. 
report mode: ./sql-replay -mode report -db -replay-name -port ':8081'") 76 | } 77 | -------------------------------------------------------------------------------- /parse.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bufio" 5 | "encoding/json" 6 | "fmt" 7 | "os" 8 | "regexp" 9 | "strconv" 10 | "strings" 11 | "time" 12 | 13 | "github.com/pingcap/tidb/pkg/parser" 14 | ) 15 | 16 | func ParseLogs(slowLogPath, slowOutputPath string) { 17 | if slowLogPath == "" || slowOutputPath == "" { 18 | fmt.Println("Usage: ./sql-replay -mode parse -slow-in -slow-out ") 19 | return 20 | } 21 | 22 | file, err := os.Open(slowLogPath) 23 | if err != nil { 24 | fmt.Println("Error opening file:", err) 25 | return 26 | } 27 | defer file.Close() 28 | 29 | outputFile, err := os.Create(slowOutputPath) 30 | if err != nil { 31 | fmt.Println("Error creating output file:", err) 32 | return 33 | } 34 | defer outputFile.Close() 35 | 36 | scanner := bufio.NewScanner(file) 37 | buf := make([]byte, 0, 512*1024*1024) // 512MB buffer 38 | scanner.Buffer(buf, bufio.MaxScanTokenSize) 39 | 40 | var currentEntry LogEntry 41 | var sqlBuffer strings.Builder 42 | var entryStarted bool = false 43 | 44 | // Add support for MySQL 5.6 time format 45 | reTime56 := regexp.MustCompile(`Time: (\d{6}) ?(\d{1,2}:\d{2}:\d{2})`) 46 | 47 | reTime := regexp.MustCompile(`Time: ([\d-T:.Z]+)`) 48 | reUser := regexp.MustCompile(`User@Host: (\w+)\[`) 49 | reConnectionID := regexp.MustCompile(`Id:\s*(\d+)`) 50 | 51 | for scanner.Scan() { 52 | line := scanner.Text() 53 | 54 | if strings.HasPrefix(line, "# Time:") { 55 | if entryStarted { 56 | finalizeEntry(¤tEntry, &sqlBuffer, outputFile) 57 | } 58 | entryStarted = true 59 | 60 | // MySQL 5.6 Time Format 61 | if match := reTime56.FindStringSubmatch(line); len(match) > 1 { 62 | timeStr := fmt.Sprintf("%s %s", match[1], match[2]) 63 | parsedTime, err := time.Parse("060102 15:04:05", timeStr) 64 | if err != nil { 65 | fmt.Println("Error parsing time:", err) 66 | continue 67 | } 68 | currentEntry.Timestamp = float64(parsedTime.UnixNano()) / 1e9 69 | continue 70 | } 71 | 72 | // MySQL 5.7/8.0 Time Format 73 | if match := reTime.FindStringSubmatch(line); len(match) > 1 { 74 | parsedTime, _ := time.Parse(time.RFC3339Nano, match[1]) 75 | currentEntry.Timestamp = float64(parsedTime.UnixNano()) / 1e9 76 | continue 77 | } 78 | continue 79 | } 80 | 81 | if entryStarted { 82 | if strings.HasPrefix(line, "# User@Host:") { 83 | match := reUser.FindStringSubmatch(line) 84 | if len(match) > 1 { 85 | currentEntry.Username = match[1] 86 | } 87 | matchID := reConnectionID.FindStringSubmatch(line) 88 | if len(matchID) > 1 { 89 | currentEntry.ConnectionID = matchID[1] 90 | } 91 | } else if strings.HasPrefix(line, "# Query_time:") { 92 | processQueryTimeAndRowsSent(line, ¤tEntry) 93 | } else if !strings.HasPrefix(line, "#") { 94 | if !(strings.HasPrefix(line, "SET timestamp=") || strings.HasPrefix(line, "-- ") || strings.HasPrefix(line, "use ")) { 95 | sqlBuffer.WriteString(line + " ") 96 | } 97 | } 98 | } 99 | } 100 | 101 | // Process the last entry if there is one 102 | if entryStarted { 103 | finalizeEntry(¤tEntry, &sqlBuffer, outputFile) 104 | } 105 | 106 | if err := scanner.Err(); err != nil { 107 | fmt.Println("Error reading file:", err) 108 | } 109 | } 110 | 111 | func processQueryTimeAndRowsSent(line string, entry *LogEntry) { 112 | reTime := regexp.MustCompile(`Query_time: (\d+\.\d+)`) 113 | matchTime := reTime.FindStringSubmatch(line) 114 | 
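	// Both Query_time and Rows_sent appear on the same "# Query_time:" header line of the slow log, which is why this helper extracts the two fields together.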
if len(matchTime) > 1 { 115 | queryTime, _ := strconv.ParseFloat(matchTime[1], 64) 116 | entry.QueryTime = int64(queryTime * 1000000) // Convert seconds to microseconds 117 | } 118 | 119 | reRows := regexp.MustCompile(`Rows_sent: (\d+)`) 120 | matchRows := reRows.FindStringSubmatch(line) 121 | if len(matchRows) > 1 { 122 | entry.RowsSent, _ = strconv.Atoi(matchRows[1]) 123 | } 124 | } 125 | 126 | func finalizeEntry(entry *LogEntry, sqlBuffer *strings.Builder, outputFile *os.File) { 127 | entry.SQL = strings.TrimSpace(sqlBuffer.String()) 128 | // 检查 SQL 是否为空,如果为空,则不处理这条记录 129 | if entry.SQL == "" { 130 | return 131 | } 132 | normalizedSQL := parser.Normalize(entry.SQL) 133 | entry.Digest = parser.DigestNormalized(normalizedSQL).String() 134 | words := strings.Fields(normalizedSQL) 135 | entry.SQLType = "other" 136 | if len(words) > 0 { 137 | entry.SQLType = words[0] 138 | } 139 | jsonEntry, _ := json.Marshal(entry) 140 | fmt.Fprintln(outputFile, string(jsonEntry)) 141 | // Reset for next entry 142 | *entry = LogEntry{} 143 | sqlBuffer.Reset() 144 | } 145 | -------------------------------------------------------------------------------- /parse_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bufio" 5 | "encoding/json" 6 | "os" 7 | "strings" 8 | "testing" 9 | ) 10 | 11 | func TestParseLogs(t *testing.T) { 12 | slowLogPath := "test_slow_log.txt" 13 | slowOutputPath := "test_output.json" 14 | 15 | // 创建测试输入文件 16 | input := `/usr/sbin/mysqld, Version: 5.7.44 (MySQL Community Server (GPL)). started with: 17 | Tcp port: 3306 Unix socket: /var/lib/mysql/mysql.sock 18 | Time Id Command Argument 19 | # Time: 2024-08-30T06:09:28.060156Z 20 | # User@Host: t1[t1] @ localhost [127.0.0.1] Id: 9 21 | # Query_time: 0.000065 Lock_time: 0.000022 Rows_sent: 0 Rows_examined: 1 22 | SET timestamp=1724998168; 23 | UPDATE stock SET s_quantity = 86, s_ytd = s_ytd + 4, s_order_cnt = s_order_cnt + 1, s_remote_cnt = s_remote_cnt + 0 WHERE s_i_id = 52521 AND s_w_id = 3; 24 | # Time: 2024-08-30T06:09:28.060206Z 25 | # User@Host: t1[t1] @ localhost [127.0.0.1] Id: 6 26 | # Query_time: 0.000109 Lock_time: 0.000030 Rows_sent: 0 Rows_examined: 1 27 | SET timestamp=1724998168; 28 | UPDATE district SET d_next_o_id = 4172 + 1 WHERE d_id = 1 AND d_w_id = 6; 29 | # Time: 2024-01-19T16:29:48.141142Z 30 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797 31 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 32 | SET timestamp=1705681788; 33 | SELECT c FROM sbtest1 WHERE id=250438; 34 | # Time: 240119 16:29:48 35 | # User@Host: t1[t1] @ [10.2.103.21] Id: 797 36 | # Query_time: 0.000038 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1 37 | SET timestamp=1705681788; 38 | SELECT c FROM sbtest1 WHERE id=250438; 39 | # Time: 231106 0:06:36 40 | # User@Host: coplo2o[coplo2o] @ [10.0.2.34] Id: 45827727 41 | # Query_time: 1.066695 Lock_time: 0.000042 Rows_sent: 1 Rows_examined: 7039 Thread_id: 45827727 Schema: db Errno: 0 Killed: 0 Bytes_received: 0 Bytes_sent: 165 Read_first: 0 Read_last: 0 Read_key: 1 Read_next: 7039 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2023-11-06T00:06:35.589701 End: 2023-11-06T00:06:36.656396 Launch_time: 0.000000 42 | # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No Filesort: No Filesort_on_disk: No 43 | use db; 44 | SET timestamp=1699200395; 45 | SELECT c FROM 
sbtest1 WHERE id=250438;` 46 | 47 | expectedOutput := []LogEntry{ 48 | { 49 | ConnectionID: "9", 50 | QueryTime: 65, 51 | SQL: "UPDATE stock SET s_quantity = 86, s_ytd = s_ytd + 4, s_order_cnt = s_order_cnt + 1, s_remote_cnt = s_remote_cnt + 0 WHERE s_i_id = 52521 AND s_w_id = 3;", 52 | RowsSent: 0, 53 | Username: "t1", 54 | SQLType: "UPDATE", 55 | DBName: "", 56 | Timestamp: 1724998168.060156, 57 | }, 58 | { 59 | ConnectionID: "6", 60 | QueryTime: 109, 61 | SQL: "UPDATE district SET d_next_o_id = 4172 + 1 WHERE d_id = 1 AND d_w_id = 6;", 62 | RowsSent: 0, 63 | Username: "t1", 64 | SQLType: "UPDATE", 65 | DBName: "", 66 | Timestamp: 1724998168.060206, 67 | }, 68 | { 69 | ConnectionID: "797", 70 | QueryTime: 38, 71 | SQL: "SELECT c FROM sbtest1 WHERE id=250438;", 72 | RowsSent: 1, 73 | Username: "t1", 74 | SQLType: "SELECT", 75 | DBName: "", 76 | Timestamp: 1705681788.141142, 77 | }, 78 | { 79 | ConnectionID: "797", 80 | QueryTime: 38, 81 | SQL: "SELECT c FROM sbtest1 WHERE id=250438;", 82 | RowsSent: 1, 83 | Username: "t1", 84 | SQLType: "SELECT", 85 | DBName: "", 86 | Timestamp: 1705681788, 87 | }, 88 | { 89 | ConnectionID: "45827727", 90 | QueryTime: 1066695, 91 | SQL: "SELECT c FROM sbtest1 WHERE id=250438;", 92 | RowsSent: 1, 93 | Username: "coplo2o", 94 | SQLType: "SELECT", 95 | DBName: "", 96 | Timestamp: 1699229196, 97 | }, 98 | } 99 | 100 | // 写入测试输入文件 101 | err := os.WriteFile(slowLogPath, []byte(input), 0644) 102 | if err != nil { 103 | t.Fatalf("Failed to write test input file: %v", err) 104 | } 105 | 106 | // 调用 ParseLogs 函数 107 | ParseLogs(slowLogPath, slowOutputPath) 108 | 109 | // 读取并解析输出文件 110 | outputFile, err := os.Open(slowOutputPath) 111 | if err != nil { 112 | t.Fatalf("Failed to open output file: %v", err) 113 | } 114 | defer outputFile.Close() 115 | 116 | scanner := bufio.NewScanner(outputFile) 117 | var actualOutput []LogEntry 118 | for scanner.Scan() { 119 | var entry LogEntry 120 | err := json.Unmarshal(scanner.Bytes(), &entry) 121 | if err != nil { 122 | t.Fatalf("Failed to unmarshal JSON: %v", err) 123 | } 124 | actualOutput = append(actualOutput, entry) 125 | } 126 | 127 | if err := scanner.Err(); err != nil { 128 | t.Fatalf("Error reading output file: %v", err) 129 | } 130 | 131 | // 比较实际输出和预期输出 132 | if len(actualOutput) != len(expectedOutput) { 133 | t.Fatalf("Output length does not match expected length.\nActual: %v\nExpected: %v", len(actualOutput), len(expectedOutput)) 134 | } 135 | 136 | for i := range actualOutput { 137 | if actualOutput[i].ConnectionID != expectedOutput[i].ConnectionID || 138 | actualOutput[i].QueryTime != expectedOutput[i].QueryTime || 139 | actualOutput[i].SQL != expectedOutput[i].SQL || 140 | actualOutput[i].RowsSent != expectedOutput[i].RowsSent || 141 | actualOutput[i].Username != expectedOutput[i].Username || 142 | !strings.EqualFold(actualOutput[i].SQLType, expectedOutput[i].SQLType) || 143 | actualOutput[i].DBName != expectedOutput[i].DBName || 144 | !floatEquals(actualOutput[i].Timestamp, expectedOutput[i].Timestamp) { 145 | t.Errorf("Output does not match expected output at index %d.\nActual: %v\nExpected: %v", i, actualOutput[i], expectedOutput[i]) 146 | } 147 | } 148 | 149 | // 清理测试文件 150 | os.Remove(slowLogPath) 151 | os.Remove(slowOutputPath) 152 | } 153 | 154 | func floatEquals(a, b float64) bool { 155 | const epsilon = 1e-6 156 | return (a-b) < epsilon && (b-a) < epsilon 157 | } 158 | -------------------------------------------------------------------------------- /parsetidb.go: 
--------------------------------------------------------------------------------
1 | package main
2 |
3 | import (
4 | 	"bufio"
5 | 	"encoding/json"
6 | 	"fmt"
7 | 	"os"
8 | 	"regexp"
9 | 	"strconv"
10 | 	"strings"
11 | 	"time"
12 | 	"github.com/pingcap/tidb/pkg/parser"
13 | )
14 |
15 | func ParseTiDBLogs(slowLogPath, slowOutputPath string) {
16 | 	if slowLogPath == "" || slowOutputPath == "" {
17 | 		fmt.Println("Usage: ./sql-replay -mode parsetidbslow -slow-in -slow-out ")
18 | 		return
19 | 	}
20 |
21 | 	file, err := os.Open(slowLogPath)
22 | 	if err != nil {
23 | 		fmt.Println("Error opening file:", err)
24 | 		return
25 | 	}
26 | 	defer file.Close()
27 |
28 | 	var entries []LogEntry
29 | 	bufferSize := 1024 * 1024 * 10 // 10MB
30 | 	scanner := bufio.NewScanner(file)
31 | 	buf := make([]byte, bufferSize)
32 | 	scanner.Buffer(buf, bufferSize)
33 |
34 | 	var entry LogEntry
35 | 	var isInternal bool
36 | 	var sqlStatement string
37 | 	var isPrepared string // 声明 isPrepared 变量
38 | 	timeRegex := regexp.MustCompile(`# Time:\s+(\d+-\d+-\d+T\d+:\d+:\d+\.\d+\+\d+:\d+)`)
39 | 	userHostRegex := regexp.MustCompile(`# User@Host:\s+(\w+)`)
40 | 	connIDRegex := regexp.MustCompile(`# Conn_ID:\s+(\d+)`)
41 | 	queryTimeRegex := regexp.MustCompile(`# Query_time:\s+(\d+\.\d+)`)
42 | 	dbRegex := regexp.MustCompile(`# DB:\s+(\w+)`)
43 | 	isInternalRegex := regexp.MustCompile(`# Is_internal:\s+(true|false)`)
44 | 	preparedRegex := regexp.MustCompile(`# Prepared:\s+(true|false)`)
45 |
46 | 	for scanner.Scan() {
47 | 		line := scanner.Text()
48 |
49 | 		// 检查是否为 Time 字段
50 | 		if timeRegex.MatchString(line) {
51 | 			if !isInternal { // 如果 Is_internal 为 true,跳过这个日志段落
52 | 				if isPrepared == "true" {
53 | 					entry.SQL = formatSQL(sqlStatement)
54 | 				} else {
55 | 					entry.SQL = sqlStatement
56 | 				}
57 | 				if entry.ConnectionID != "" && entry.SQL != "" { // 确保 entry 被正确填充
58 | 					entries = append(entries, entry)
59 | 				}
60 | 			}
61 |
62 | 			// 初始化新的日志段落
63 | 			sqlStatement = ""
64 | 			entry = LogEntry{}
65 | 			isInternal = false
66 | 			isPrepared = "false"
67 |
68 | 			// 提取 Time 字段并转换为带时区的时间戳
69 | 			match := timeRegex.FindStringSubmatch(line)
70 | 			if match != nil {
71 | 				parsedTime, err := time.Parse(time.RFC3339Nano, match[1])
72 | 				if err == nil {
73 | 					entry.Timestamp = float64(parsedTime.UnixNano()) / 1e9
74 | 				} else {
75 | 					fmt.Println("Error parsing time:", err)
76 | 				}
77 | 			}
78 | 		} else if userHostRegex.MatchString(line) {
79 | 			// 提取 Username
80 | 			match := userHostRegex.FindStringSubmatch(line)
81 | 			if match != nil {
82 | 				entry.Username = match[1]
83 | 			}
84 | 		} else if connIDRegex.MatchString(line) {
85 | 			// 提取 ConnectionID
86 | 			match := connIDRegex.FindStringSubmatch(line)
87 | 			if match != nil {
88 | 				entry.ConnectionID = match[1]
89 | 			}
90 | 		} else if queryTimeRegex.MatchString(line) {
91 | 			// 提取 QueryTime
92 | 			match := queryTimeRegex.FindStringSubmatch(line)
93 | 			if match != nil {
94 | 				queryTime, _ := strconv.ParseFloat(match[1], 64)
95 | 				entry.QueryTime = int64(queryTime * 1e6) // 转换为微秒
96 | 			}
97 | 		} else if dbRegex.MatchString(line) {
98 | 			// 提取 DBName
99 | 			match := dbRegex.FindStringSubmatch(line)
100 | 			if match != nil {
101 | 				entry.DBName = match[1]
102 | 			}
103 | 		} else if isInternalRegex.MatchString(line) {
104 | 			// 检查是否为内部 SQL
105 | 			match := isInternalRegex.FindStringSubmatch(line)
106 | 			if match != nil && match[1] == "true" {
107 | 				isInternal = true
108 | 			}
109 | 		} else if preparedRegex.MatchString(line) {
110 | 			// 检查是否为预编译 SQL
111 | 			match := preparedRegex.FindStringSubmatch(line)
112 | 			if match != nil {
113 | 				isPrepared = match[1]
114 | 			}
115 | 		} else if !strings.HasPrefix(line, 
"#") { 116 | // 检查是否以 "use "(忽略大小写)开头 117 | if !strings.HasPrefix(strings.ToLower(line), "use ") { 118 | // 处理 SQL 语句 119 | sqlStatement += strings.TrimSpace(line) 120 | normalizedSQL := parser.Normalize(sqlStatement) 121 | entry.Digest = parser.DigestNormalized(normalizedSQL).String() 122 | words := strings.Fields(normalizedSQL) 123 | entry.SQLType = "other" 124 | if len(words) > 0 { 125 | entry.SQLType = words[0] 126 | } 127 | } 128 | } 129 | } 130 | 131 | // 在处理最后一个日志段落的部分 132 | if !isInternal { 133 | if isPrepared == "true" { 134 | entry.SQL = formatSQL(sqlStatement) 135 | } else { 136 | entry.SQL = sqlStatement 137 | } 138 | if entry.ConnectionID != "" && entry.SQL != "" { // 确保 entry 被正确填充 139 | entries = append(entries, entry) 140 | } 141 | } 142 | 143 | outputFile, err := os.Create(slowOutputPath) 144 | if err != nil { 145 | fmt.Println("Error creating output file:", err) 146 | return 147 | } 148 | defer outputFile.Close() 149 | 150 | // 逐个输出 JSON 对象 151 | for _, entry := range entries { 152 | jsonEntry, err := json.Marshal(entry) 153 | if err != nil { 154 | fmt.Println("Error marshaling JSON:", err) 155 | return 156 | } 157 | outputFile.Write(jsonEntry) 158 | outputFile.WriteString("\n") 159 | } 160 | 161 | fmt.Println("Logs processed and written to output json") 162 | } 163 | 164 | // formatSQL 函数用于格式化 SQL 语句,替换 ? 占位符为对应的参数值。 165 | func formatSQL(input string) string { 166 | // 使用正则表达式匹配 arguments 部分 167 | argumentsRegex := regexp.MustCompile(`\[arguments:\s*(\((.*?)\)|([^()]+))\]`) 168 | match := argumentsRegex.FindStringSubmatch(input) 169 | 170 | var arguments []string 171 | if len(match) > 1 { 172 | // 提取 arguments 部分并去掉多余的空格 173 | var argumentsStr string 174 | if match[2] != "" { 175 | // 如果存在括号,提取括号内的内容 176 | argumentsStr = match[2] 177 | } else { 178 | // 否则直接使用匹配的内容 179 | argumentsStr = match[3] 180 | } 181 | 182 | // 拆分参数并去掉多余的空格 183 | arguments = strings.Split(argumentsStr, ",") 184 | for i := range arguments { 185 | arguments[i] = strings.TrimSpace(arguments[i]) // 去除空格 186 | } 187 | 188 | // 去掉原始 input 中的 arguments 部分 189 | input = strings.Replace(input, match[0], "", 1) 190 | } 191 | 192 | // 替换 ? 占位符,注意考虑引号情况 193 | var result strings.Builder 194 | argIndex := 0 // 当前参数索引 195 | inQuotes := 0 // 引号计数:0 表示不在引号内,1 表示在单引号内,2 表示在双引号内 196 | 197 | for i, char := range input { 198 | if char == '"' { 199 | inQuotes = (inQuotes + 2) % 4 // 切换双引号状态 200 | } else if char == '\'' { 201 | inQuotes = (inQuotes + 1) % 2 // 切换单引号状态 202 | } 203 | 204 | // 判断是否为 ? 占位符 205 | if char == '?' 
&& inQuotes == 0 { 206 | // 判断是否有参数可用 207 | if argIndex < len(arguments) { 208 | arg := arguments[argIndex] 209 | argIndex++ // 递增参数索引 210 | 211 | // 判断参数类型,如果是字符串则加上引号 212 | if strings.HasPrefix(arg, "'") && strings.HasSuffix(arg, "'") { 213 | arg = arg[1 : len(arg)-1] // 去掉引号 214 | result.WriteString("'" + arg + "'") 215 | } else if strings.HasPrefix(arg, "\"") && strings.HasSuffix(arg, "\"") { 216 | arg = arg[1 : len(arg)-1] // 去掉引号 217 | result.WriteString("'" + arg + "'") 218 | } else { 219 | result.WriteString(arg) 220 | } 221 | continue 222 | } 223 | } 224 | 225 | // 处理转义字符和保留原字符 226 | if char == '\\' && i < len(input)-1 && (input[i+1] == '"' || input[i+1] == '\'') { 227 | result.WriteRune(char) 228 | continue 229 | } 230 | result.WriteRune(char) // 其他字符直接写入 231 | } 232 | 233 | return result.String() 234 | } 235 | -------------------------------------------------------------------------------- /replay.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bufio" 5 | "database/sql" 6 | "encoding/json" 7 | "fmt" 8 | "os" 9 | "sync" 10 | "time" 11 | "strings" 12 | 13 | _ "github.com/go-sql-driver/mysql" 14 | ) 15 | 16 | type SQLExecutionRecord struct { 17 | SQL string `json:"sql"` 18 | QueryTime int64 `json:"query_time"` 19 | RowsSent int `json:"rows_sent"` 20 | ExecutionTime int64 `json:"execution_time"` 21 | RowsReturned int64 `json:"rows_returned"` 22 | ErrorInfo string `json:"error_info,omitempty"` 23 | FileName string // File name 24 | DBName string `json:"dbname"` 25 | } 26 | 27 | type LogEntry struct { 28 | ConnectionID string `json:"connection_id"` 29 | QueryTime int64 `json:"query_time"` 30 | SQL string `json:"sql"` 31 | RowsSent int `json:"rows_sent"` 32 | Username string `json:"username"` 33 | SQLType string `json:"sql_type"` 34 | DBName string `json:"dbname"` 35 | Timestamp float64 `json:"ts"` 36 | Digest string `json:"digest"` 37 | } 38 | 39 | type SQLTask struct { 40 | Entry LogEntry 41 | DB *sql.DB 42 | } 43 | 44 | var i18n *I18n 45 | 46 | func init() { 47 | var err error 48 | i18n, err = NewI18n("en") 49 | if err != nil { 50 | panic(err) 51 | } 52 | } 53 | 54 | func ExecuteSQLAndRecord(task SQLTask, baseReplayOutputFilePath string) error { 55 | if task.DB == nil { 56 | return fmt.Errorf("database connection is nil") 57 | } 58 | startTime := time.Now() 59 | 60 | rows, err := task.DB.Query(task.Entry.SQL) 61 | var rowsReturned int64 62 | var errorInfo string 63 | 64 | if err != nil { 65 | errorInfo = err.Error() 66 | } else { 67 | for rows.Next() { 68 | rowsReturned++ 69 | } 70 | rows.Close() 71 | } 72 | 73 | executionTime := time.Since(startTime).Microseconds() 74 | 75 | record := SQLExecutionRecord{ 76 | SQL: task.Entry.SQL, 77 | QueryTime: task.Entry.QueryTime, 78 | RowsSent: task.Entry.RowsSent, 79 | DBName: task.Entry.DBName, 80 | ExecutionTime: executionTime, 81 | RowsReturned: rowsReturned, 82 | ErrorInfo: errorInfo, 83 | } 84 | 85 | jsonData, err := json.Marshal(record) 86 | if err != nil { 87 | return err 88 | } 89 | 90 | replayOutputFilePath := fmt.Sprintf("%s.%s", baseReplayOutputFilePath, task.Entry.ConnectionID) 91 | file, err := os.OpenFile(replayOutputFilePath, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0666) 92 | if err != nil { 93 | return err 94 | } 95 | defer file.Close() 96 | 97 | _, err = file.Write(jsonData) 98 | if err != nil { 99 | return err 100 | } 101 | _, err = file.WriteString("\n") 102 | return err 103 | } 104 | 105 | func ParseLogEntries(slowOutputPath, filterUsername, 
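For a concrete sense of what `formatSQL` does, here is a minimal, hypothetical test (not part of the repository) that exercises it. In a TiDB slow log, a prepared statement keeps its `?` placeholders and the bound values follow in a trailing `[arguments: ...]` block; the input below is invented for illustration, and the whitespace left where the block is removed is trimmed before comparing.

```go
package main

import (
	"strings"
	"testing"
)

// TestFormatSQLExample is an illustrative sketch; it assumes the formatSQL
// helper from parsetidb.go above is in scope.
func TestFormatSQLExample(t *testing.T) {
	in := "SELECT c FROM sbtest1 WHERE id = ? AND pad = ? [arguments: (250438, 'abc')]"
	got := strings.TrimSpace(formatSQL(in))
	want := "SELECT c FROM sbtest1 WHERE id = 250438 AND pad = 'abc'"
	if got != want {
		t.Errorf("formatSQL() = %q, want %q", got, want)
	}
}
```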
--------------------------------------------------------------------------------
/replay.go:
--------------------------------------------------------------------------------
1 | package main
2 | 
3 | import (
4 | "bufio"
5 | "database/sql"
6 | "encoding/json"
7 | "fmt"
8 | "os"
9 | "strings"
10 | "sync"
11 | "time"
12 | 
13 | _ "github.com/go-sql-driver/mysql"
14 | )
15 | 
16 | type SQLExecutionRecord struct {
17 | SQL string `json:"sql"`
18 | QueryTime int64 `json:"query_time"`
19 | RowsSent int `json:"rows_sent"`
20 | ExecutionTime int64 `json:"execution_time"`
21 | RowsReturned int64 `json:"rows_returned"`
22 | ErrorInfo string `json:"error_info,omitempty"`
23 | FileName string // File name
24 | DBName string `json:"dbname"`
25 | }
26 | 
27 | type LogEntry struct {
28 | ConnectionID string `json:"connection_id"`
29 | QueryTime int64 `json:"query_time"`
30 | SQL string `json:"sql"`
31 | RowsSent int `json:"rows_sent"`
32 | Username string `json:"username"`
33 | SQLType string `json:"sql_type"`
34 | DBName string `json:"dbname"`
35 | Timestamp float64 `json:"ts"`
36 | Digest string `json:"digest"`
37 | }
38 | 
39 | type SQLTask struct {
40 | Entry LogEntry
41 | DB *sql.DB
42 | }
43 | 
44 | var i18n *I18n
45 | 
46 | func init() {
47 | var err error
48 | i18n, err = NewI18n("en")
49 | if err != nil {
50 | panic(err)
51 | }
52 | }
53 | 
54 | func ExecuteSQLAndRecord(task SQLTask, baseReplayOutputFilePath string) error {
55 | if task.DB == nil {
56 | return fmt.Errorf("database connection is nil")
57 | }
58 | startTime := time.Now()
59 | 
60 | rows, err := task.DB.Query(task.Entry.SQL)
61 | var rowsReturned int64
62 | var errorInfo string
63 | 
64 | if err != nil {
65 | errorInfo = err.Error()
66 | } else {
67 | for rows.Next() {
68 | rowsReturned++
69 | }
70 | rows.Close()
71 | }
72 | 
73 | executionTime := time.Since(startTime).Microseconds()
74 | 
75 | record := SQLExecutionRecord{
76 | SQL: task.Entry.SQL,
77 | QueryTime: task.Entry.QueryTime,
78 | RowsSent: task.Entry.RowsSent,
79 | DBName: task.Entry.DBName,
80 | ExecutionTime: executionTime,
81 | RowsReturned: rowsReturned,
82 | ErrorInfo: errorInfo,
83 | }
84 | 
85 | jsonData, err := json.Marshal(record)
86 | if err != nil {
87 | return err
88 | }
89 | 
90 | replayOutputFilePath := fmt.Sprintf("%s.%s", baseReplayOutputFilePath, task.Entry.ConnectionID)
91 | file, err := os.OpenFile(replayOutputFilePath, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0666)
92 | if err != nil {
93 | return err
94 | }
95 | defer file.Close()
96 | 
97 | _, err = file.Write(jsonData)
98 | if err != nil {
99 | return err
100 | }
101 | _, err = file.WriteString("\n")
102 | return err
103 | }
104 | 
105 | func ParseLogEntries(slowOutputPath, filterUsername, filterSQLType, filterDBName string, ignoreDigestList []string) (map[string][]LogEntry, float64, error) {
106 | inputFile, err := os.Open(slowOutputPath)
107 | if err != nil {
108 | return nil, 0, fmt.Errorf("file open error: %w", err)
109 | }
110 | defer inputFile.Close()
111 | 
112 | logFilePath := "ignored_digests.log"
113 | logFile, err := os.OpenFile(logFilePath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
114 | if err != nil {
115 | fmt.Printf("Failed to open log file: %v\n", err)
116 | }
117 | 
118 | defer logFile.Close()
119 | 
120 | scanner := bufio.NewScanner(inputFile)
121 | buf := make([]byte, 0, 512*1024*1024) // allow the buffer to grow up to 512MB for very long SQL lines
122 | scanner.Buffer(buf, bufio.MaxScanTokenSize)
123 | 
124 | tasksMap := make(map[string][]LogEntry)
125 | var minTimestamp float64 = 9999999999.999999 // sentinel: larger than any realistic Unix timestamp
126 | 
127 | for scanner.Scan() {
128 | var entry LogEntry
129 | if err := json.Unmarshal([]byte(scanner.Text()), &entry); err != nil {
130 | fmt.Println("Error parsing log entry:", err)
131 | continue
132 | }
133 | 
134 | if filterUsername != "all" && entry.Username != filterUsername {
135 | continue
136 | }
137 | 
138 | if filterSQLType != "all" && entry.SQLType != filterSQLType {
139 | continue
140 | }
141 | 
142 | if filterDBName != "all" && entry.DBName != filterDBName {
143 | continue
144 | }
145 | if contains(ignoreDigestList, entry.Digest) { // skip digests the caller asked to ignore
146 | fmt.Fprintf(logFile, "%s, %s\n", entry.Digest, entry.SQL)
147 | continue
148 | }
149 | tasksMap[entry.ConnectionID] = append(tasksMap[entry.ConnectionID], entry)
150 | 
151 | if entry.Timestamp < minTimestamp {
152 | minTimestamp = entry.Timestamp
153 | }
154 | }
155 | 
156 | return tasksMap, minTimestamp, nil
157 | }
158 | 
159 | func contains(slice []string, item string) bool {
160 | for _, v := range slice {
161 | if v == item {
162 | return true
163 | }
164 | }
165 | return false
166 | }
167 | 
168 | func ReplaySQLForConnection(connID string, entries []LogEntry, dbConnStr string, replayOutputFilePath string, minTimestamp float64, speed float64, lang string) {
169 | db, err := sql.Open("mysql", dbConnStr)
170 | if err != nil {
171 | fmt.Printf(i18n.T(lang, "db_open_error")+"\n", connID, err)
172 | return
173 | }
174 | defer db.Close()
175 | 
176 | var prevTimestamp float64 = minTimestamp // every connection paces itself against the capture start time
177 | var lastQueryTime int64 = 0
178 | 
179 | for _, entry := range entries {
180 | interval := (entry.Timestamp - prevTimestamp - float64(lastQueryTime)/1e6) / speed
181 | if interval > 0 {
182 | sleepDuration := time.Duration(interval * float64(time.Second))
183 | time.Sleep(sleepDuration)
184 | }
185 | prevTimestamp = entry.Timestamp
186 | 
187 | task := SQLTask{Entry: entry, DB: db}
188 | if err := ExecuteSQLAndRecord(task, replayOutputFilePath); err != nil {
189 | fmt.Printf(i18n.T(lang, "sql_exec_error")+"\n", connID, err)
190 | }
191 | lastQueryTime = entry.QueryTime
192 | }
193 | }
194 | 
195 | func StartSQLReplay(dbConnStr string, speed float64, slowOutputPath, replayOutputFilePath, filterUsername, filterSQLType, filterDBName, ignoreDigests string, lang string) {
196 | if dbConnStr == "" || slowOutputPath == "" || replayOutputFilePath == "" {
197 | fmt.Println(i18n.T(lang, "usage"))
198 | return
199 | }
200 | 
201 | if speed <= 0 {
202 | fmt.Println(i18n.T(lang, "invalid_speed"))
203 | return
204 | }
205 | ignoreDigestList := strings.FieldsFunc(ignoreDigests, func(r rune) bool { return r == ',' }) // unlike strings.Split, this yields an empty list when -ignoredigests is unset
206 | fmt.Printf(i18n.T(lang, "replay_info")+"\n", filterUsername, filterDBName, filterSQLType, speed)
207 | fmt.Println("Ignored Digests: " + ignoreDigests)
208 | fmt.Println("Ignored Digests And SQL Info: ignored_digests.log")
209 | 
210 | ts0 := time.Now()
211 | fmt.Printf("[%s] %s\n", ts0.Format("2006-01-02 15:04:05.000"), i18n.T(lang, "parsing_start"))
212 | 
213 | tasksMap, minTimestamp, err := ParseLogEntries(slowOutputPath, filterUsername, filterSQLType, filterDBName, ignoreDigestList)
214 | if err != nil {
215 | fmt.Println(i18n.T(lang, "file_open_error"), err)
216 | return
217 | }
218 | 
219 | ts1 := time.Now()
220 | fmt.Printf("[%s] %s, ", ts1.Format("2006-01-02 15:04:05.000"), i18n.T(lang, "parsing_complete"))
221 | fmt.Printf("%s %v, ", i18n.T(lang, "parsing_time"), ts1.Sub(ts0))
222 | fmt.Println(i18n.T(lang, "replay_start"))
223 | 
224 | var wg sync.WaitGroup
225 | 
226 | for connID, entries := range tasksMap {
227 | wg.Add(1)
228 | go func(connID string, entries []LogEntry) {
229 | defer wg.Done()
230 | ReplaySQLForConnection(connID, entries, dbConnStr, replayOutputFilePath, minTimestamp, speed, lang)
231 | }(connID, entries)
232 | }
233 | 
234 | wg.Wait()
235 | ts2 := time.Now()
236 | fmt.Printf("[%s] %s, ", ts2.Format("2006-01-02 15:04:05.000"), i18n.T(lang, "replay_complete"))
237 | fmt.Printf("%s %v\n", i18n.T(lang, "replay_time"), ts2.Sub(ts1))
238 | }
239 | 
fmt.Printf("%s, 连接号: %s, SQL: %s\n", time.Now().Format("2006-01-02 15:04:05.000"), task.Entry.ConnectionID, task.Entry.SQL) 54 | 55 | // 记录执行顺序 56 | mu.Lock() 57 | executionOrder = append(executionOrder, task.Entry.ConnectionID) 58 | mu.Unlock() 59 | 60 | record := SQLExecutionRecord{ 61 | SQL: task.Entry.SQL, 62 | QueryTime: task.Entry.QueryTime, 63 | RowsSent: task.Entry.RowsSent, 64 | ExecutionTime: task.Entry.QueryTime, 65 | RowsReturned: int64(task.Entry.RowsSent), 66 | ErrorInfo: "", 67 | } 68 | 69 | jsonData, err := json.Marshal(record) 70 | if err != nil { 71 | return err 72 | } 73 | 74 | replayOutputFilePath := fmt.Sprintf("%s.%s", baseReplayOutputFilePath, task.Entry.ConnectionID) 75 | file, err := os.OpenFile(replayOutputFilePath, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0666) 76 | if err != nil { 77 | return err 78 | } 79 | defer file.Close() 80 | 81 | _, err = file.Write(jsonData) 82 | if err != nil { 83 | return err 84 | } 85 | _, err = file.WriteString("\n") 86 | return err 87 | } 88 | 89 | func TestSQLReplay(t *testing.T) { 90 | replayFilePath := "test_replay.json" 91 | replayOutputFilePath := "test_replay_output" 92 | speed1 := 1.0 93 | speed100 := 100.0 94 | 95 | // 生成回放文件 96 | err := generateReplayFiles(replayFilePath) 97 | if err != nil { 98 | t.Fatalf("Failed to generate replay files: %v", err) 99 | } 100 | 101 | // 解析回放文件 102 | tasksMap, _, err := ParseLogEntries(replayFilePath, "all", "all", "all") 103 | if err != nil { 104 | t.Fatalf("Failed to parse replay files: %v", err) 105 | } 106 | 107 | // 清理之前的回放结果文件 108 | for connID := range tasksMap { 109 | os.Remove(fmt.Sprintf("%s.%s", replayOutputFilePath, connID)) 110 | } 111 | 112 | // 使用 monkey 补丁替换 ExecuteSQLAndRecord 函数 113 | var patch = monkey.Patch(ExecuteSQLAndRecord, mockExecuteSQLAndRecord) 114 | defer patch.Unpatch() 115 | 116 | // 第一次回放,使用 1.0 倍速 117 | start1 := time.Now() 118 | StartSQLReplay("root1@tcp(127.0.0.1:4000)/test", speed1, replayFilePath, replayOutputFilePath, "all", "all", "all", "en") 119 | duration1 := time.Since(start1) 120 | 121 | // 验证 1.0 倍速回放时间 122 | if duration1 <= 19*time.Second { 123 | t.Errorf("1x speed replay did not complete in expected time: expected more than 19s, got %v", duration1) 124 | } 125 | 126 | // 验证 1.0 倍速回放的连接号启动顺序 127 | expectedOrder := []string{"1", "4", "2", "3", "1", "2", "3", "4"} 128 | if len(executionOrder) != len(expectedOrder) { 129 | t.Errorf("Execution order length does not match expected length.\nActual: %v\nExpected: %v", len(executionOrder), len(expectedOrder)) 130 | } else { 131 | for i, connID := range expectedOrder { 132 | if executionOrder[i] != connID { 133 | t.Errorf("Execution order does not match at index %d.\nActual: %v\nExpected: %v", i, executionOrder[i], connID) 134 | } 135 | } 136 | } 137 | 138 | // 清理之前的回放结果文件 139 | for connID := range tasksMap { 140 | os.Remove(fmt.Sprintf("%s.%s", replayOutputFilePath, connID)) 141 | } 142 | 143 | // 第二次回放,使用 100.0 倍速 144 | start2 := time.Now() 145 | StartSQLReplay("root1@tcp(127.0.0.1:4000)/test", speed100, replayFilePath, replayOutputFilePath, "all", "all", "all", "en") 146 | duration2 := time.Since(start2) 147 | 148 | // 验证 100.0 倍速回放时间 149 | if duration2 >= 1*time.Second { 150 | t.Errorf("100x speed replay did not complete in expected time: expected less than 1s, got %v", duration2) 151 | } 152 | 153 | // 验证回放结果 154 | expectedOutputs := map[string][]SQLExecutionRecord{ 155 | "1": { 156 | {SQL: "SELECT * FROM table1;", QueryTime: 100000, RowsSent: 1, ExecutionTime: 100000, RowsReturned: 1, ErrorInfo: "", 
FileName: ""}, 157 | {SQL: "UPDATE table1 SET col1 = 'value1';", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 158 | }, 159 | "2": { 160 | {SQL: "INSERT INTO table2 (col1) VALUES ('value2');", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 161 | {SQL: "DELETE FROM table2 WHERE col1 = 'value2';", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 162 | }, 163 | "3": { 164 | {SQL: "SELECT * FROM table3;", QueryTime: 100000, RowsSent: 1, ExecutionTime: 100000, RowsReturned: 1, ErrorInfo: "", FileName: ""}, 165 | {SQL: "UPDATE table3 SET col1 = 'value3';", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 166 | }, 167 | "4": { 168 | {SQL: "INSERT INTO table4 (col1) VALUES ('value4');", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 169 | {SQL: "DELETE FROM table4 WHERE col1 = 'value4';", QueryTime: 100000, RowsSent: 0, ExecutionTime: 100000, RowsReturned: 0, ErrorInfo: "", FileName: ""}, 170 | }, 171 | } 172 | 173 | for connID, expectedOutput := range expectedOutputs { 174 | outputFilePath := fmt.Sprintf("%s.%s", replayOutputFilePath, connID) 175 | outputFile, err := os.Open(outputFilePath) 176 | if err != nil { 177 | t.Fatalf("Failed to open replay output file: %v", err) 178 | } 179 | defer outputFile.Close() 180 | 181 | scanner := bufio.NewScanner(outputFile) 182 | var actualOutput []SQLExecutionRecord 183 | for scanner.Scan() { 184 | var record SQLExecutionRecord 185 | err := json.Unmarshal(scanner.Bytes(), &record) 186 | if err != nil { 187 | t.Fatalf("Failed to unmarshal JSON: %v", err) 188 | } 189 | actualOutput = append(actualOutput, record) 190 | } 191 | 192 | if err := scanner.Err(); err != nil { 193 | t.Fatalf("Error reading replay output file: %v", err) 194 | } 195 | 196 | if len(actualOutput) != len(expectedOutput) { 197 | t.Fatalf("Output length does not match expected length for connection %s.\nActual: %v\nExpected: %v", connID, len(actualOutput), len(expectedOutput)) 198 | } 199 | 200 | for i := range actualOutput { 201 | if actualOutput[i].SQL != expectedOutput[i].SQL || 202 | actualOutput[i].QueryTime != expectedOutput[i].QueryTime || 203 | actualOutput[i].RowsSent != expectedOutput[i].RowsSent || 204 | actualOutput[i].ExecutionTime != expectedOutput[i].ExecutionTime || 205 | actualOutput[i].RowsReturned != expectedOutput[i].RowsReturned || 206 | actualOutput[i].ErrorInfo != expectedOutput[i].ErrorInfo { 207 | t.Errorf("Output does not match expected output at index %d for connection %s.\nActual: %v\nExpected: %v", i, connID, actualOutput[i], expectedOutput[i]) 208 | } 209 | } 210 | } 211 | 212 | // 清理测试文件 213 | os.Remove(replayFilePath) 214 | for connID := range tasksMap { 215 | os.Remove(fmt.Sprintf("%s.%s", replayOutputFilePath, connID)) 216 | } 217 | } 218 | -------------------------------------------------------------------------------- /report.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "database/sql" 5 | "fmt" 6 | "time" 7 | "html/template" 8 | "log" 9 | "net/http" 10 | 11 | _ "github.com/go-sql-driver/mysql" 12 | ) 13 | 14 | type QueryResult struct { 15 | SQL string 16 | Columns []string 17 | Rows [][]interface{} 18 | Error error 19 | } 20 | 21 | func Report(dbConnStr, replayOut, Port string) { 22 | if dbConnStr == "" || 
replayOut == "" { 23 | fmt.Println("Usage: ./sql-replay -mode report -db -replay-name -port ':8081'") 24 | return 25 | } 26 | // 连接数据库 27 | db, err := sql.Open("mysql", dbConnStr) 28 | if err != nil { 29 | log.Fatal(err) 30 | } 31 | defer db.Close() 32 | 33 | // 定义 SQL 查询 34 | queries := map[string]string{ 35 | "Replay Summary": `select min(SUBSTRING_INDEX(file_name,'.',1)) replay_name,count(*) sql_cnts, 36 | sum(case when query_time>execution_time and error_info='' then 1 else 0 end) faster_cnts, 37 | sum(case when query_time'' then 1 else 0 end) err_cnts, 39 | round(sum(case when error_info='' then ri.query_time else 0 end)/1000000/60,2) "before_sql_time(min)", 40 | round(sum(case when error_info='' then ri.execution_time else 0 end)/1000000/60,2) "now_sql_time(min)" 41 | from replay_info ri where ri.file_name like concat(?,'%')`, 42 | "Sample1: <500us": `SELECT 43 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type, 44 | COUNT(*) AS exec_cnts, 45 | round(AVG(execution_time / 1000),2) AS current_ms, 46 | round(AVG(query_time / 1000),2) AS before_ms, 47 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct, 48 | MIN(sql_text) AS sample_sql_text 49 | FROM 50 | replay_info 51 | WHERE 52 | file_name like concat(?,'%') and error_info='' 53 | GROUP BY 54 | sql_digest 55 | HAVING 56 | AVG(query_time) <= 500 57 | ORDER BY 58 | avg(execution_time)/avg(query_time) desc`, 59 | "Sample2: 500us~1ms": `SELECT 60 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type, 61 | COUNT(*) AS exec_cnts, 62 | round(AVG(execution_time / 1000),2) AS current_ms, 63 | round(AVG(query_time / 1000),2) AS before_ms, 64 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct, 65 | MIN(sql_text) AS sample_sql_text 66 | FROM 67 | replay_info 68 | WHERE 69 | file_name like concat(?,'%') and error_info='' 70 | GROUP BY 71 | sql_digest 72 | HAVING 73 | AVG(query_time) > 500 AND AVG(query_time) <= 1000 74 | ORDER BY 75 | avg(execution_time)/avg(query_time) desc`, 76 | "Sample3: 1ms~10ms": `SELECT 77 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type, 78 | COUNT(*) AS exec_cnts, 79 | round(AVG(execution_time / 1000),2) AS current_ms, 80 | round(AVG(query_time / 1000),2) AS before_ms, 81 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct, 82 | MIN(sql_text) AS sample_sql_text 83 | FROM 84 | replay_info 85 | WHERE 86 | file_name like concat(?,'%') and error_info='' 87 | GROUP BY 88 | sql_digest 89 | HAVING 90 | AVG(query_time) > 1000 AND AVG(query_time) <= 10000 91 | ORDER BY 92 | avg(execution_time)/avg(query_time) desc`, 93 | "Sample4: 10ms~100ms": `SELECT 94 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type, 95 | COUNT(*) AS exec_cnts, 96 | round(AVG(execution_time / 1000),2) AS current_ms, 97 | round(AVG(query_time / 1000),2) AS before_ms, 98 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct, 99 | MIN(sql_text) AS sample_sql_text 100 | FROM 101 | replay_info 102 | WHERE 103 | file_name like concat(?,'%') and error_info='' 104 | GROUP BY 105 | sql_digest 106 | HAVING 107 | AVG(query_time) > 10000 AND AVG(query_time) <= 100000 108 | ORDER BY 109 | avg(execution_time)/avg(query_time) desc`, 110 | "Sample5: 100ms~1s": `SELECT 111 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type, 112 
112 | COUNT(*) AS exec_cnts,
113 | round(AVG(execution_time / 1000),2) AS current_ms,
114 | round(AVG(query_time / 1000),2) AS before_ms,
115 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct,
116 | MIN(sql_text) AS sample_sql_text
117 | FROM
118 | replay_info
119 | WHERE
120 | file_name like concat(?,'%') and error_info=''
121 | GROUP BY
122 | sql_digest
123 | HAVING
124 | AVG(query_time) > 100000 AND AVG(query_time) <= 1000000
125 | ORDER BY
126 | avg(execution_time)/avg(query_time) desc`,
127 | "Sample6: 1s~10s": `SELECT
128 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type,
129 | COUNT(*) AS exec_cnts,
130 | round(AVG(execution_time / 1000),2) AS current_ms,
131 | round(AVG(query_time / 1000),2) AS before_ms,
132 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct,
133 | MIN(sql_text) AS sample_sql_text
134 | FROM
135 | replay_info
136 | WHERE
137 | file_name like concat(?,'%') and error_info=''
138 | GROUP BY
139 | sql_digest
140 | HAVING
141 | AVG(query_time) > 1000000 AND AVG(query_time) <= 10000000
142 | ORDER BY
143 | avg(execution_time)/avg(query_time) desc`,
144 | "Sample7: >10s": `SELECT
145 | sql_digest,max(concat(sql_type,':',ifnull(db_name,''))) sql_type,
146 | COUNT(*) AS exec_cnts,
147 | round(AVG(execution_time / 1000),2) AS current_ms,
148 | round(AVG(query_time / 1000),2) AS before_ms,
149 | concat(ROUND((AVG(execution_time / 1000) - AVG(query_time / 1000)) / AVG(query_time / 1000) ,2)*100,'%') AS reduce_pct,
150 | MIN(sql_text) AS sample_sql_text
151 | FROM
152 | replay_info
153 | WHERE
154 | file_name like concat(?,'%') and error_info=''
155 | GROUP BY
156 | sql_digest
157 | HAVING
158 | AVG(query_time) > 10000000
159 | ORDER BY
160 | avg(execution_time)/avg(query_time) desc`,
161 | "Sql Error Info": `select sql_digest,count(*) exec_cnts,concat(ifnull(max(db_name),''),':',substr(min(error_info),1,256)) as error_info,min(sql_text) as sample_sql_text from replay_info where error_info <>'' and file_name like concat(?,'%') group by sql_digest,substr(error_info,1,10) order by count(*) desc`,
162 | }
163 | 
164 | ts_begin_query := time.Now()
165 | fmt.Printf("[%s] Begin executing queries\n", ts_begin_query.Format("2006-01-02 15:04:05.000"))
166 | 
167 | // Execute the queries up front and cache the results
168 | results := make(map[string]QueryResult)
169 | for name, query := range queries {
170 | rows, err := db.Query(query, replayOut)
171 | if err != nil {
172 | results[name] = QueryResult{SQL: name, Error: err}
173 | continue
174 | }
175 | 
176 | columns, err := rows.Columns()
177 | if err != nil {
178 | rows.Close()
179 | results[name] = QueryResult{SQL: name, Error: err}
180 | continue
181 | }
182 | 
183 | var rowsData [][]interface{}
184 | for rows.Next() {
185 | values := make([]interface{}, len(columns))
186 | valuePtrs := make([]interface{}, len(columns))
187 | for i := range values {
188 | valuePtrs[i] = &values[i]
189 | }
190 | if err := rows.Scan(valuePtrs...); err != nil {
191 | results[name] = QueryResult{SQL: name, Error: err}
192 | continue
193 | }
194 | rowData := make([]interface{}, len(columns))
195 | for i, v := range values {
196 | b, ok := v.([]byte)
197 | if ok {
198 | rowData[i] = string(b)
199 | } else {
200 | rowData[i] = v
201 | }
202 | }
203 | rowsData = append(rowsData, rowData)
204 | }
205 | rows.Close() // close explicitly: Report never returns, so a defer here would leak
206 | if err := rows.Err(); err != nil {
207 | results[name] = QueryResult{SQL: name, Error: err}
208 | continue // keep the error result instead of overwriting it below
209 | }
210 | results[name] = QueryResult{SQL: name, Columns: columns, Rows: rowsData}
211 | }
212 | 
213 | tmpl := `
214 | 
215 | 
216 | 
217 | replay report
218 | 
303 | 
304 | 
305 | 
313 | 
314 | {{range $key, $query := .}} 315 | {{ if eq $key "Replay Summary" }} 316 |
{{ $key }}
317 | {{ else if eq $key "Sample1: <500us" }} 318 |
{{ $key }}
319 | {{ else if eq $key "Sample2: 500us~1ms" }} 320 |
{{ $key }}
321 | {{ else if eq $key "Sample3: 1ms~10ms" }} 322 |
{{ $key }}
323 | {{ else if eq $key "Sample4: 10ms~100ms" }} 324 |
{{ $key }}
325 | {{ else if eq $key "Sample5: 100ms~1s" }} 326 |
{{ $key }}
327 | {{ else if eq $key "Sample6: 1s~10s" }} 328 |
{{ $key }}
329 | {{ else if eq $key "Sample7: >10s" }} 330 |
{{ $key }}
331 | {{ else if eq $key "Sql Error Info" }} 332 |
{{ $key }}
333 | {{ else }} 334 |

{{ $key }}

335 | {{ end }} 336 | {{with $query.Error}} 337 |

Error: {{ . }}

338 | {{else}} 339 | 340 | 341 | {{range $query.Columns}} 342 | 343 | {{end}} 344 | 345 | {{range $query.Rows}} 346 | 347 | {{range $index, $value := .}} 348 | {{if eq (index $query.Columns $index) "sample_sql_text"}} 349 | 350 | {{else}} 351 | 352 | {{end}} 353 | {{end}} 354 | 355 | {{end}} 356 |
{{.}}
{{$value}}{{$value}}
357 | {{end}} 358 | {{end}} 359 |
360 | 361 | 362 |
363 | 
364 | 
410 | 
411 | 
412 | 
413 | `
414 | 
415 | t, err := template.New("webpage").Parse(tmpl)
416 | if err != nil {
417 | log.Fatal(err)
418 | }
419 | 
420 | http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
421 | if err := t.Execute(w, results); err != nil {
422 | log.Println(err) // log render/write failures instead of killing the server
423 | }
424 | })
425 | 
426 | ts_finish_query := time.Now()
427 | 
428 | fmt.Printf("[%s] Server is running on port %s\n", ts_finish_query.Format("2006-01-02 15:04:05.000"), Port)
429 | if err := http.ListenAndServe(Port, nil); err != nil {
430 | log.Fatal(err)
431 | }
432 | }
433 | 
--------------------------------------------------------------------------------
/translations.go:
--------------------------------------------------------------------------------
1 | package main
2 | 
3 | var translations = map[string]map[string]string{
4 | "en": {
5 | "usage": "Usage: ./sql-replay -mode replay -db <db_conn_str> -speed 1.0 -slow-out <slow_output_file> -replay-out <replay_output_file> -username <username> -sqltype <sql_type> -dbname <db_name> -lang <language>",
6 | "invalid_speed": "Invalid replay speed. The speed must be a positive number.",
7 | "replay_info": "Filter Rule: Source user - %s, Source database - %s, Source SQL type - %s, Replay speed: %f",
8 | "parsing_start": "Parameters read successfully, starting data parsing",
9 | "file_open_error": "Error opening file:",
10 | "parsing_complete": "Data parsing completed",
11 | "parsing_time": "Data parsing time:",
12 | "replay_start": "Starting SQL replay",
13 | "db_open_error": "Error opening database for %s:",
14 | "sql_exec_error": "Error executing SQL for %s:",
15 | "replay_complete": "SQL replay completed",
16 | "replay_time": "SQL replay time:",
17 | },
18 | "zh": {
19 | "usage": "用法: ./sql-replay -mode replay -db <数据库连接串> -speed 1.0 -slow-out <慢查询输出文件> -replay-out <回放输出文件> -username <用户名> -sqltype <SQL类型> -dbname <数据库名> -lang <语言代码>",
20 | "invalid_speed": "无效的回放速度。速度必须是正数。",
21 | "replay_info": "过滤规则:源端用户 - %s,源端数据库 - %s,源端 SQL 类型 - %s,回放速度: %f",
22 | "parsing_start": "参数读取成功,开始解析数据",
23 | "file_open_error": "打开文件错误:",
24 | "parsing_complete": "完成数据解析",
25 | "parsing_time": "数据解析耗时:",
26 | "replay_start": "开始 SQL 回放",
27 | "db_open_error": "为 %s 打开数据库时出错:",
28 | "sql_exec_error": "执行 %s 的 SQL 时出错:",
29 | "replay_complete": "SQL 回放完成",
30 | "replay_time": "SQL 回放时间:",
31 | },
32 | }
33 | 
--------------------------------------------------------------------------------
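To make the hand-off between the replay and load stages concrete, here is a small, hypothetical sketch (not part of the repository) of the one-JSON-object-per-line format that `ExecuteSQLAndRecord` appends to each per-connection output file and that the load stage later parses. The values are invented, and the snippet assumes the `SQLExecutionRecord` type from replay.go is in scope.

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Invented values for illustration only.
	rec := SQLExecutionRecord{
		SQL:           "SELECT c FROM sbtest1 WHERE id=250438",
		QueryTime:     38, // microseconds, from the source slow log
		RowsSent:      1,  // rows the source database sent
		ExecutionTime: 52, // microseconds, measured during replay
		RowsReturned:  1,  // rows the target database returned
		DBName:        "db1",
	}
	b, _ := json.Marshal(rec)
	fmt.Println(string(b))
	// {"sql":"SELECT c FROM sbtest1 WHERE id=250438","query_time":38,"rows_sent":1,"execution_time":52,"rows_returned":1,"FileName":"","dbname":"db1"}
}
```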