├── CONTRIBUTING.md ├── LICENSE ├── NOTICE ├── README.md ├── databases └── mysql.go ├── local_database └── users.sql ├── qan.toml ├── query_analyzer.go └── remote_database ├── remote_schema.sql └── users.sql /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | Contribution Agreement 2 | ====================== 3 | 4 | As a contributor, you represent that the code you submit is your 5 | original work or that of your employer (in which case you represent 6 | you have the right to bind your employer). By submitting code, you 7 | (and, if applicable, your employer) are licensing the submitted code 8 | to LinkedIn and the open source community subject to the BSD 2-Clause 9 | license. 10 | 11 | Responsible Disclosure of Security Vulnerabilities 12 | ================================================== 13 | 14 | Please do not file reports on Github for security issues. Please 15 | review the guidelines at (link to more info). Reports should be 16 | encrypted using PGP (link to PGP key) and sent to 17 | security@linkedin.com, preferably with the title "Github 18 | linkedin/ - ". 19 | 20 | Tips for Getting Your Pull Request Accepted 21 | =========================================== 22 | 23 | *Note: These are suggestions. Customize as needed.* 24 | 25 | 1. Make sure all new features are tested and the tests pass. 26 | 2. Bug fixes must include a test case demonstrating the error they 27 | fix. 28 | 3. Open an issue first and seek advice for your change before 29 | submitting a pull request. Large features that have never been 30 | discussed are unlikely to be accepted. **You have been warned.** 31 | 32 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 2-CLAUSE LICENSE 2 | 3 | Copyright 2020 LinkedIn Corporation 4 | All Rights Reserved. 
5 | 6 | Redistribution and use in source and binary forms, with or 7 | without modification, are permitted provided that the following 8 | conditions are met: 9 | 10 | 1. Redistributions of source code must retain the above copyright 11 | notice, this list of conditions and the following disclaimer. 12 | 13 | 2. Redistributions in binary form must reproduce the above 14 | copyright notice, this list of conditions and the following 15 | disclaimer in the documentation and/or other materials provided 16 | with the distribution. 17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 19 | "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 20 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 21 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 22 | HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 23 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 24 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 25 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 26 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 27 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 28 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | -------------------------------------------------------------------------------- /NOTICE: -------------------------------------------------------------------------------- 1 | Copyright 2020 LinkedIn Corporation 2 | All Rights Reserved. 3 | 4 | Licensed under the BSD 2-Clause License (the "License"). See License in the project root for license information. 5 | 6 | Please note, this project may automatically load third party code from external repositories. 7 | If so, such third party code may be subject to other license terms than as set forth above. 
8 | In addition, such third party code may also depend on and load multiple tiers of dependencies. 9 | Please review the applicable licenses of the additional dependencies. 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Query Analyzer Agent - Capture and analyze queries without overhead. 2 | 3 | Query Analyzer Agent runs on the database server. It captures all the queries by sniffing the network port, aggregates the queries, and sends the results to a remote server for further analysis. Refer to [LinkedIn's Engineering Blog](https://engineering.linkedin.com/blog/2017/09/query-analyzer--a-tool-for-analyzing-mysql-queries-without-overh) for more details. 4 | 5 | ## Getting Started 6 | ### Prerequisites 7 | Query Analyzer Agent is written in Go, so before you get started you should [install and set up Go](https://golang.org/doc/install). You can also follow the steps below to install and set up Go. 
8 | ``` 9 | $ wget https://dl.google.com/go/go1.14.linux-amd64.tar.gz 10 | $ sudo tar -C /usr/local -xzf go1.14.linux-amd64.tar.gz 11 | $ mkdir ~/projects 12 | $ export PATH=$PATH:/usr/local/go/bin 13 | $ export GOPATH=~/projects 14 | $ export GOBIN=~/projects/bin 15 | ``` 16 | 17 | Query Analyzer Agent requires the following external libraries: 18 | - pcap.h (provided by the libpcap-dev package), plus gcc or build-essential for building this package 19 | - RHEL/CentOS/Fedora: 20 | ``` 21 | $ sudo yum install gcc libpcap libpcap-devel git 22 | ``` 23 | - Debian/Ubuntu: 24 | ``` 25 | $ sudo apt-get install build-essential libpcap-dev git 26 | ``` 27 | - [Go-MySQL-Driver](https://github.com/go-sql-driver/mysql) 28 | ``` 29 | $ go get github.com/go-sql-driver/mysql 30 | ``` 31 | 32 | ### Third Party Libraries 33 | The Go build system automatically downloads the following third-party libraries from their respective GitHub repositories when compiling this project. 34 | ``` 35 | GoPacket 36 | https://github.com/google/gopacket 37 | Copyright (c) 2012 Google, Inc. All rights reserved. 38 | Copyright (c) 2009-2011 Andreas Krennmair. All rights reserved. 39 | License: BSD 3-Clause "New" or "Revised" License 40 | 41 | Percona Go packages for MySQL 42 | https://github.com/percona/go-mysql 43 | Copyright (C) 2007 Free Software Foundation, Inc. 44 | License: BSD 3-Clause "New" or "Revised" License 45 | 46 | Viper 47 | https://github.com/spf13/viper 48 | Copyright (c) 2014 Steve Francia 49 | License: MIT 50 | ``` 51 | 52 | ### Setting up the remote database 53 | Query Analyzer Agent either prints the aggregated queries to a local log file or sends them to a remote database that can store queries collected from all the agents. We have chosen MySQL as the remote database. 54 | 55 | Execute the following SQL statements on the remote database server. 
56 | ``` 57 | CREATE DATABASE IF NOT EXISTS `query_analyzer`; 58 | 59 | CREATE TABLE IF NOT EXISTS `query_analyzer`.`query_info` ( 60 | `hostname` varchar(64) NOT NULL DEFAULT '', 61 | `checksum` char(16) NOT NULL DEFAULT '', 62 | `fingerprint` longtext NOT NULL, 63 | `sample` longtext CHARACTER SET utf8mb4, 64 | `firstseen` datetime NOT NULL, 65 | `mintime` float NOT NULL DEFAULT '0', 66 | `mintimeat` datetime NOT NULL, 67 | `maxtime` float NOT NULL DEFAULT '0', 68 | `maxtimeat` datetime NOT NULL, 69 | `is_reviewed` enum('0','1','2') NOT NULL DEFAULT '0', 70 | `reviewed_by` varchar(20) DEFAULT NULL, 71 | `reviewed_on` datetime DEFAULT NULL, 72 | `comments` mediumtext, 73 | PRIMARY KEY (`hostname`,`checksum`), 74 | KEY `checksum` (`checksum`) 75 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4; 76 | 77 | CREATE TABLE IF NOT EXISTS `query_analyzer`.`query_history` ( 78 | `hostname` varchar(64) NOT NULL DEFAULT '', 79 | `checksum` char(16) NOT NULL DEFAULT '', 80 | `src` varchar(39) NOT NULL DEFAULT '', 81 | `user` varchar(16) DEFAULT NULL, 82 | `db` varchar(64) NOT NULL DEFAULT '', 83 | `ts` datetime NOT NULL, 84 | `count` int unsigned NOT NULL DEFAULT '1', 85 | `querytime` float NOT NULL DEFAULT '0', 86 | `bytes` int unsigned NOT NULL DEFAULT '0', 87 | PRIMARY KEY (`hostname`,`checksum`,`ts`), 88 | KEY `checksum` (`checksum`), 89 | KEY `user` (`user`), 90 | KEY `covering` (`hostname`,`ts`,`querytime`,`count`,`bytes`) 91 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 92 | /*!50100 PARTITION BY RANGE (TO_DAYS(ts)) 93 | (PARTITION p202004 VALUES LESS THAN (TO_DAYS('2020-05-01')) ENGINE = InnoDB, 94 | PARTITION p202005 VALUES LESS THAN (TO_DAYS('2020-06-01')) ENGINE = InnoDB, 95 | PARTITION p202006 VALUES LESS THAN (TO_DAYS('2020-07-01')) ENGINE = InnoDB, 96 | PARTITION p202007 VALUES LESS THAN (TO_DAYS('2020-08-01')) ENGINE = InnoDB, 97 | PARTITION p202008 VALUES LESS THAN (TO_DAYS('2020-09-01')) ENGINE = InnoDB, 98 | PARTITION p202009 VALUES LESS 
THAN (TO_DAYS('2020-10-01')) ENGINE = InnoDB, 99 | PARTITION p202010 VALUES LESS THAN (TO_DAYS('2020-11-01')) ENGINE = InnoDB, 100 | PARTITION p202011 VALUES LESS THAN (TO_DAYS('2020-12-01')) ENGINE = InnoDB, 101 | PARTITION pMAX VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */; 102 | /* You can use a different partition scheme based on your retention */ 103 | 104 | CREATE USER /*!50706 IF NOT EXISTS*/ 'qan_rw'@'qan_agent_ip' IDENTIFIED BY 'Complex_P@ssw0rd'; 105 | 106 | GRANT SELECT, INSERT, UPDATE ON `query_analyzer`.* TO 'qan_rw'@'qan_agent_ip'; 107 | ``` 108 | The SQL statements above can be found in the remote_database/remote_schema.sql and remote_database/users.sql files. 109 | 110 | ## Build and Install 111 | ``` 112 | $ git clone https://github.com/linkedin/QueryAnalyzerAgent 113 | $ cd QueryAnalyzerAgent 114 | $ go get 115 | $ go build -o $GOBIN/QueryAnalyzerAgent 116 | ``` 117 | 118 | ## Configuration 119 | The QueryAnalyzerAgent config is in TOML format, organized into several sections. For basic use, you need to specify the network interface, the port, and the connection details of the remote database endpoint in the config file, qan.toml. 120 | 121 | Once the remote database is set up, update qan.toml: 122 | ``` 123 | [remoteDB] 124 | Enabled = 1 125 | 126 | # remote database hostname to send results to 127 | Hostname = "remote_database_hostname" 128 | 129 | # remote database port to send results to 130 | Port = 3306 131 | 132 | # remote database username to send results to 133 | Username = "qan_rw" 134 | 135 | # remote database password to send results to 136 | Password = "Complex_P@ssw0rd" 137 | 138 | # remote database name to send results to 139 | DBName = "query_analyzer" 140 | ``` 141 | 142 | If the user and db details of each connection are needed, create a user to connect to the local database and update the localDB section. 
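For reference, the matching localDB section in qan.toml looks like the fragment below. The username and socket mirror the defaults shipped in this repo; the password is a placeholder, and Enabled = 1 is an illustrative choice (see the comments for the meaning of each level).

```
[localDB]
# 0 - Do not check processlist
# 1 - Check processlist only once at startup
# 2 - Check processlist as and when required
Enabled = 1

# Credentials and socket for the local database connection
LocalUsername = "qan_ro"
LocalPassword = "Complex_P@ssw0rd"
LocalSocket = "/var/lib/mysql/mysql.sock"
```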
The CREATE USER statement can be found in local_database/users.sql. 143 | 144 | 145 | ## Running Query Analyzer Agent 146 | ``` 147 | Since the agent sniffs the network interface, it should have the net_raw capability. 148 | $ sudo setcap cap_net_raw+ep $GOBIN/QueryAnalyzerAgent 149 | $ $GOBIN/QueryAnalyzerAgent --config-file qan.toml (or complete path to qan.toml) 150 | 151 | If you do not set the net_raw capability, you can run the agent as the root user. 152 | $ sudo $GOBIN/QueryAnalyzerAgent --config-file qan.toml (or complete path to qan.toml) 153 | ``` 154 | 155 | ## Query Analytics 156 | Once you understand the schema, you can write queries and build a fancy UI to extract the information you want. 157 | Examples: 158 | 159 | * Top 5 queries with the maximum total run time during a specific interval. If a query takes 1 second and executes 1000 times, the total run time is 1000 seconds. 160 | ``` 161 | SELECT 162 | SUM(count), 163 | SUM(querytime) 164 | INTO 165 | @count, @qt 166 | FROM 167 | query_history history 168 | WHERE 169 | history.hostname='mysql.database-server-001.linkedin.com' 170 | AND ts>='2020-03-11 09:00:00' 171 | AND ts<='2020-03-11 10:00:00'; 172 | 173 | SELECT 174 | info.checksum, 175 | info.firstseen AS first_seen, 176 | info.fingerprint, 177 | info.sample, 178 | SUM(count) as count, 179 | ROUND(((SUM(count)/@count)*100),2) AS pct_total_query_count, 180 | ROUND((SUM(count)/(TIME_TO_SEC(TIMEDIFF(MAX(history.ts),MIN(history.ts))))),2) AS qps, 181 | ROUND((SUM(querytime)/SUM(count)),6) AS avg_query_time, 182 | ROUND(SUM(querytime),6) AS sum_query_time, 183 | ROUND((SUM(querytime)/@qt)*100,2) AS pct_total_query_time, 184 | MIN(info.mintime) AS min_query_time, 185 | MAX(info.maxtime) AS max_query_time 186 | FROM 187 | query_history history 188 | JOIN 189 | query_info info 190 | ON 191 | info.checksum=history.checksum 192 | AND info.hostname=history.hostname 193 | WHERE 194 | info.hostname='mysql.database-server-001.linkedin.com' 195 | AND ts>='2020-03-11 
09:00:00' 196 | AND ts<='2020-03-11 10:00:00' 197 | GROUP BY 198 | info.checksum 199 | ORDER BY 200 | pct_total_query_time DESC 201 | LIMIT 5\G 202 | ``` 203 | 204 | * Trend for a particular query 205 | ``` 206 | SELECT 207 | UNIX_TIMESTAMP(ts), 208 | ROUND(querytime/count,6) 209 | FROM 210 | query_history history 211 | WHERE 212 | history.checksum='D22AB75FA3CC05DC' 213 | AND history.hostname='mysql.database-server-001.linkedin.com' 214 | AND ts>='2020-03-11 09:00:00' 215 | AND ts<='2020-03-11 10:00:00'; 216 | ``` 217 | * Queries fired from a particular IP 218 | ``` 219 | SELECT 220 | info.checksum, 221 | info.fingerprint, 222 | info.sample 223 | FROM 224 | query_history history 225 | JOIN 226 | query_info info 227 | ON 228 | info.checksum=history.checksum 229 | AND info.hostname=history.hostname 230 | WHERE 231 | history.src='10.251.225.27' 232 | LIMIT 5; 233 | ``` 234 | * New queries on a particular day 235 | ``` 236 | SELECT 237 | info.firstseen, 238 | info.checksum, 239 | info.fingerprint, 240 | info.sample 241 | FROM 242 | query_info info 243 | WHERE 244 | info.hostname = 'mysql.database-server-001.linkedin.com' 245 | AND info.firstseen >= '2020-03-10 00:00:00' 246 | AND info.firstseen < '2020-03-11 00:00:00' 247 | LIMIT 5; 248 | ``` 249 | 250 | ## Limitations 251 | * As of now, it works only for MySQL. 252 | 253 | * Does not account for 254 | * SSL 255 | * Compressed packets 256 | * Replication traffic 257 | * Big queries, for performance reasons 258 | 259 | * The number of unique query fingerprints should be limited (e.g., fewer than 100K). For example, if there is some blob in the query and the tool is unable to generate the correct fingerprint, this will lead to a huge number of fingerprints and can increase the memory footprint of QueryAnalyzerAgent.
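One way to keep the number of fingerprints bounded is to drop known-noisy statements before they reach the fingerprinter. The sketch below is a hypothetical pre-filter, not code from the agent — the function name, the pattern list, and the 32-character hex threshold are all illustrative assumptions:

```go
package main

import (
	"fmt"
	"regexp"
)

// noisyPatterns lists statement shapes whose fingerprints come out
// different on every execution. Both patterns are illustrative:
// long hex/blob literals defeat fingerprinting, and per-event object
// names (such as Orchestrator's pseudo-GTID views, discussed below)
// produce a brand-new fingerprint each time.
var noisyPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)x'[0-9a-f]{32,}'`), // long hex/blob literal
	regexp.MustCompile("_pseudo_gtid_"),        // per-event marker views
}

// skipNoisyQuery reports whether a query should be excluded from
// fingerprinting so the in-memory fingerprint map stays small.
func skipNoisyQuery(q string) bool {
	for _, p := range noisyPatterns {
		if p.MatchString(q) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(skipNoisyQuery("drop view if exists `_pseudo_gtid_`.`_asc:5d8a58c6`"))
	fmt.Println(skipNoisyQuery("select * from users where id = 7"))
}
```

Calling such a filter before fingerprinting would keep pathological statements from ever creating new fingerprint entries.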

260 | Another example: if you are using GitHub's Orchestrator in pseudo-GTID mode, it generates queries like 261 | ``` 262 | drop view if exists `_pseudo_gtid_`.`_asc:5d8a58c6:0911a85c:865c051f49639e79` 263 | ``` 264 | 265 | The fingerprint for those queries will be unique each time, leading to a larger number of distinct queries in QueryAnalyzerAgent. The code to ignore those queries is commented out; uncomment it if needed. 266 | 267 | * Test the performance of QueryAnalyzerAgent in your staging environment before running it in production. 268 | -------------------------------------------------------------------------------- /databases/mysql.go: -------------------------------------------------------------------------------- 1 | package MySQLProtocol 2 | 3 | import ( 4 | "database/sql" 5 | "errors" 6 | "fmt" 7 | _ "github.com/go-sql-driver/mysql" 8 | "github.com/percona/go-mysql/query" 9 | "log" 10 | "net" 11 | "regexp" 12 | "strings" 13 | "time" 14 | ) 15 | 16 | /* 17 | MySQL packet type constants 18 | */ 19 | const ( 20 | COM_SLEEP = 0x00 21 | COM_QUIT = 0x01 22 | COM_INIT_DB = 0x02 23 | COM_QUERY = 0x03 24 | COM_FIELD_LIST = 0x04 25 | COM_CREATE_DB = 0x05 26 | COM_DROP_DB = 0x06 27 | COM_REFRESH = 0x07 28 | COM_SHUTDOWN = 0x08 29 | COM_STATISTICS = 0x09 30 | COM_PROCESS_INFO = 0x0a 31 | COM_CONNECT = 0x0b 32 | COM_PROCESS_KILL = 0x0c 33 | COM_DEBUG = 0x0d 34 | COM_PING = 0x0e 35 | COM_TIME = 0x0f 36 | COM_DELAYED_INSERT = 0x10 37 | COM_CHANGE_USER = 0x11 38 | COM_BINLOG_DUMP = 0x12 39 | COM_TABLE_DUMP = 0x13 40 | COM_CONNECT_OUT = 0x14 41 | COM_REGISTER_SLAVE = 0x15 42 | COM_STMT_PREPARE = 0x16 43 | COM_STMT_EXECUTE = 0x17 44 | COM_STMT_SEND_LONG_DATA = 0x18 45 | COM_STMT_CLOSE = 0x19 46 | COM_STMT_RESET = 0x1a 47 | COM_SET_OPTION = 0x1b 48 | COM_STMT_FETCH = 0x1c 49 | iOK byte = 0x00 50 | iEOF byte = 0xfe 51 | iERR byte = 0xff 52 | ) 53 | 54 | // Config struct to hold config related to connecting to local db 55 | type LocalDBConfig struct { 56 | LocalUsername string 
57 | LocalPassword string 58 | LocalSocket string 59 | Enabled uint8 60 | } 61 | 62 | // Config struct to hold qan agent config section 63 | type QanAgentConfig struct { 64 | ReportInterval uint32 65 | MaxRequestLength uint32 66 | MaxDBConnections int 67 | DebugLevels string 68 | LogFile string 69 | DebugMap map[uint8]bool 70 | } 71 | 72 | // Struct for packet related information 73 | type Source struct { 74 | Src string 75 | QueryStartTime time.Time 76 | QueryEndTime time.Time 77 | RequestCount uint16 78 | ResponseCount uint16 79 | RequestLength uint32 80 | QueryText string 81 | ResponseTime float64 82 | User string 83 | Db string 84 | NewConnection bool 85 | PreparedStatement bool 86 | FullLength bool 87 | } 88 | 89 | // Struct for storing the processlist 90 | type ProcUser struct { 91 | Src string 92 | User string 93 | Db string 94 | } 95 | 96 | // Hashmap for processlist 97 | var ProcUserMap map[string]*ProcUser = make(map[string]*ProcUser) 98 | 99 | // Hashmap for the packet information 100 | var SourceMap map[string]*Source = make(map[string]*Source) 101 | 102 | // hostname of the localhost 103 | var host string 104 | 105 | // LocalDBConfig instance 106 | var LocalDBConfigParams = LocalDBConfig{} 107 | 108 | // QanAgentConfig instance 109 | var QanAgentConfigParams = QanAgentConfig{} 110 | 111 | var parseUserRegex = regexp.MustCompile("([0-9a-z]+)(00{23})(?P<user>[0-9a-z]+)00([0-9a-z]{42})(?P<db>[1-9a-z]+0{0,1}[1-9a-z]*0{0,1})00([0-9a-z]+)") 112 | 113 | /* 114 | This function extracts queries from the packet. If the packet is a continuation of an earlier request, it will be added to the request. 115 | TODO: Need to add support for compressed packets and SSL connections. 116 | */ 117 | func ParseProtocol(src *string, payload *[]byte, fullLength bool) (err error) { 118 | var MySQLPacketType byte = 0xff 119 | var MySQLPacketData []byte 120 | defer func() { 121 | // recover from panic if one occurred; err stays nil otherwise. 
122 | if recover() != nil { 123 | log.Printf("Failed while parsing protocol. Details: %v", src) 124 | err = errors.New("Unknown Exception") 125 | } 126 | }() 127 | 128 | MySQLPacketType, MySQLPacketData = parsePacket(payload) 129 | packetLength := len(MySQLPacketData) 130 | 131 | if QanAgentConfigParams.DebugMap[5] { 132 | log.Printf("Source: %s\n, MySQLPacket Length: %d\nMySQLPacketType: %d\nMySQLPacketData: %s\n", *src, packetLength, MySQLPacketType, MySQLPacketData) 133 | } 134 | 135 | if MySQLPacketType == iERR { 136 | return 137 | } 138 | 139 | sourceMapHandle, exists := SourceMap[*src] 140 | if !exists { 141 | sourceMapHandle = &Source{Src: *src, RequestCount: 0, ResponseCount: 0, QueryStartTime: time.Now(), QueryEndTime: time.Now(), NewConnection: true} 142 | SourceMap[*src] = sourceMapHandle 143 | if QanAgentConfigParams.DebugMap[5] { 144 | log.Printf("New connection from %s\n", sourceMapHandle.Src) 145 | } 146 | } else { 147 | sourceMapHandle.QueryStartTime = time.Now() 148 | sourceMapHandle.QueryEndTime = time.Now() 149 | } 150 | 151 | if sourceMapHandle.FullLength == false { 152 | sourceMapHandle.FullLength = fullLength 153 | } 154 | 155 | // COM_QUIT Packet 156 | if MySQLPacketType == COM_QUIT { 157 | delete(SourceMap, sourceMapHandle.Src) 158 | if QanAgentConfigParams.DebugMap[4] { 159 | log.Printf("Deleting on COM_QUIT %s\nLength of Query Hashmap: %d\n", sourceMapHandle.Src, len(SourceMap)) 160 | } 161 | return 162 | } 163 | 164 | if sourceMapHandle.NewConnection == true { 165 | sourceMapHandle.User, sourceMapHandle.Db = parseUserInfo(src, *payload) 166 | sourceMapHandle.NewConnection = false 167 | } 168 | 169 | if MySQLPacketType == COM_INIT_DB { 170 | sourceMapHandle.Db = string(MySQLPacketData) 171 | sourceMapHandle.RequestCount++ 172 | return 173 | } 174 | 175 | if MySQLPacketType == COM_STMT_PREPARE { 176 | sourceMapHandle.QueryText = string(MySQLPacketData) 177 | sourceMapHandle.RequestCount++ 178 | sourceMapHandle.PreparedStatement = true 179 | 
return 180 | } 181 | 182 | if MySQLPacketType == COM_STMT_EXECUTE { 183 | if sourceMapHandle.QueryText != "" { 184 | sourceMapHandle.RequestCount++ 185 | sourceMapHandle.QueryStartTime = time.Now() 186 | sourceMapHandle.PreparedStatement = false 187 | } 188 | return 189 | } 190 | 191 | if MySQLPacketType == COM_QUERY { 192 | sourceMapHandle.RequestLength += uint32(packetLength) 193 | // Do not process requests more than 512K, probably huge insert 194 | if sourceMapHandle.RequestLength < QanAgentConfigParams.MaxRequestLength { 195 | queryString := string(MySQLPacketData) 196 | sourceMapHandle.RequestCount++ 197 | if sourceMapHandle.RequestCount <= 3 && strings.Contains(queryString, "select @@version_comment limit 1") { 198 | sourceMapHandle.QueryText = "" 199 | } else { 200 | // Sometimes the set autocommit = 0 and commit is sent along with the request. 201 | if !(strings.EqualFold(queryString, "commit") || strings.EqualFold(queryString, "rollback") || strings.EqualFold(queryString, "set autocommit = 0") || strings.EqualFold(queryString, "set autocommit = 1")) && sourceMapHandle.QueryText != queryString { 202 | sourceMapHandle.QueryText = queryString 203 | } 204 | } 205 | } 206 | return 207 | } 208 | 209 | // probably part of earlier request 210 | if exists { 211 | if sourceMapHandle.QueryText != "" { 212 | queryString := string(MySQLPacketData) 213 | sourceMapHandle.RequestCount++ 214 | sourceMapHandle.QueryText += queryString 215 | } 216 | if sourceMapHandle.RequestLength >= QanAgentConfigParams.MaxRequestLength { 217 | if QanAgentConfigParams.DebugMap[4] { 218 | log.Printf("Truncating huge insert: %s", sourceMapHandle.QueryText) 219 | } 220 | sourceMapHandle.QueryText = "" 221 | } 222 | } 223 | 224 | // Uncomment these lines to clean up the orphan packets. Cleaning up will wipe off the user and db information also. 
225 | /* 226 | if (len(SourceMap)>1024) { 227 | for key, _ := range SourceMap { 228 | if SourceMap[key].ResponseCount > 3 && SourceMap[key].QueryText == "" { 229 | delete(SourceMap, key) 230 | } 231 | } 232 | } 233 | */ 234 | return 235 | } 236 | 237 | /* 238 | This function returns the username and the database name if it is a connection request packet 239 | TODO: This only supports the mysql_native_password authentication mechanism; support for the sha256_password plugin and others still needs to be added 240 | */ 241 | 242 | func parseUserInfo(src *string, payload []byte) (string, string) { 243 | var user string 244 | var db string 245 | payloadLength := len(payload) 246 | // max username is 16 chars (32 chars in version >= 5.7.8) and max database name is 64 chars 247 | if payloadLength >= 59 && payloadLength <= 285 { 248 | 249 | var userOffset int 250 | var dbOffset int 251 | 252 | /* 253 | First 3 bytes - Packet Length 254 | Next byte - Packet Number 255 | Next 2 bytes - Client Capabilities 256 | Next 2 bytes - Extended Client Capabilities 257 | Next 5 bytes - Max packet size, charset 258 | Next 23 bytes - Filled with 00 259 | (so we can ignore first 36 bytes) 260 | (..)00 - Null-terminated user name 261 | ([0-9][a-f]){22} - Hashed Password 262 | (..)00 - Null-terminated database name 263 | */ 264 | 265 | // get user information 266 | for i := 1; i < payloadLength-36; i++ { 267 | if payload[36+i] == '\x00' { 268 | userOffset = i 269 | break 270 | } 271 | } 272 | 273 | if string(payload[36:36+userOffset]) != "" { 274 | user = string(payload[36 : 36+userOffset]) 275 | } 276 | 277 | // get db information 278 | for j := 36 + userOffset + 22; j < len(payload); j++ { 279 | if payload[j] == '\x00' { 280 | dbOffset = j 281 | break 282 | } 283 | } 284 | if 36+userOffset+22 < dbOffset && dbOffset < payloadLength && string(payload[36+userOffset+22:dbOffset]) != "" { 285 | db = string(payload[36+userOffset+22 : dbOffset]) 286 | if strings.Contains(db, "mysql_native_password") { 287 | db = "" 288 | } 
289 | } 290 | 291 | // Doing it the regex way 292 | /* 293 | matches := parseUserRegex.FindStringSubmatch(string(payload)) 294 | names := parseUserRegex.SubexpNames() 295 | for i, match := range matches { 296 | if i != 0 { 297 | if names[i] == "user" { 298 | t, _ := hex.DecodeString(match) 299 | user = string(t) 300 | } 301 | if names[i] == "db" { 302 | g, _ := hex.DecodeString(match) 303 | db = string(g) 304 | } 305 | } 306 | } 307 | */ 308 | 309 | if QanAgentConfigParams.DebugMap[4] { 310 | log.Printf("New Connection\nUser:%s\nDB:%s\n", user, db) 311 | } 312 | } 313 | return user, db 314 | } 315 | 316 | /* 317 | This function returns the packet type, which is the 5th byte of the packet, and the packet data 318 | */ 319 | 320 | func parsePacket(buf *[]byte) (byte, []byte) { 321 | dataLength := len(*buf) 322 | if dataLength <= 5 { 323 | if dataLength == 5 && (*buf)[4] == COM_QUIT { 324 | // COM_QUIT Packet 325 | return COM_QUIT, nil 326 | } 327 | return iERR, nil 328 | } 329 | MySQLPacketType := (*buf)[4] 330 | MySQLPacketData := (*buf)[5:] 331 | return MySQLPacketType, MySQLPacketData 332 | } 333 | 334 | /* 335 | This function is the same as parsePacket except that it only checks whether the response is an error, based on 336 | the packet type, which is the 5th byte of the packet. 337 | */ 338 | 339 | func IsErrPacket(payload *[]byte) bool { 340 | if len(*payload) >= 7 && (*payload)[4] == iERR { 341 | // ERROR 1044 (42000): Access denied for user 'xxxx'@'xxxx' to database 'xxxx' 342 | // ERROR 1045 (28000): Access denied for user 'xxxx'@'xxxx' (using password: YES) 343 | // ERROR 1049 (42000): Unknown database 'xxxx' 344 | if (*payload)[6] == 0x04 && ((*payload)[5] == 0x14 || (*payload)[5] == 0x15 || (*payload)[5] == 0x19) { 345 | return true 346 | } 347 | } 348 | return false 349 | } 350 | 351 | /* 352 | This function anonymizes the data in a query by replacing it with ? and calculates the checksum of the query using the Percona go-mysql query package. 
353 | */ 354 | func AnonymizeQuery(queryText string) (string, string) { 355 | fp := query.Fingerprint(queryText) 356 | checksum := query.Id(fp) 357 | return fp, checksum 358 | } 359 | 360 | func GetProcesslist() { 361 | var db *sql.DB 362 | var err error 363 | var src string 364 | var user string 365 | var dbname string 366 | 367 | // clear entries 368 | for key := range ProcUserMap { 369 | delete(ProcUserMap, key) 370 | } 371 | 372 | localSQLHandle := fmt.Sprintf("%s:%s@unix(%s)/information_schema", LocalDBConfigParams.LocalUsername, LocalDBConfigParams.LocalPassword, LocalDBConfigParams.LocalSocket) 373 | db, err = sql.Open("mysql", localSQLHandle) 374 | if err != nil { 375 | panic(err.Error()) 376 | } 377 | defer db.Close() 378 | 379 | err = db.Ping() 380 | if err != nil { 381 | log.Printf("ERROR: Unable to connect to local database to get processlist: %s", err) 382 | return 383 | } 384 | 385 | rows, err := db.Query("SELECT HOST, USER, DB FROM INFORMATION_SCHEMA.PROCESSLIST WHERE HOST!=''") 386 | for err == nil && rows.Next() { // if the query itself failed, rows is nil and the loop is skipped 387 | hostname := "" 388 | port := "" 389 | rows.Scan(&src, &user, &dbname) 390 | if strings.Contains(src, ":") { 391 | // src := "example-app1234.linkedin.com:55380" 392 | // src := "2405:2300:ff02:6005:7338:6962:863b:595a:55380" 393 | // src := "::1:55380" 394 | src_split := strings.Split(src, ":") 395 | hostname = strings.Join(src_split[:len(src_split)-1], ":") 396 | port = src_split[len(src_split)-1] 397 | } 398 | ips, err := net.LookupHost(hostname) 399 | if err == nil && len(ips) > 0 && len(port) > 0 { 400 | for _, ip := range ips { 401 | src = fmt.Sprintf("%s@%s", ip, port) 402 | ProcUserMapHandle, exists := ProcUserMap[src] 403 | if !exists { 404 | ProcUserMapHandle = &ProcUser{Src: src, User: user, Db: dbname} 405 | ProcUserMap[src] = ProcUserMapHandle 406 | } 407 | } 408 | } else if len(port) > 0 { 409 | src = fmt.Sprintf("%s@%s", hostname, port) 410 | ProcUserMapHandle, exists := ProcUserMap[src] 411 | if !exists { 412 | ProcUserMapHandle 
= &ProcUser{Src: src, User: user, Db: dbname} 413 | ProcUserMap[src] = ProcUserMapHandle 414 | } 415 | } 416 | } 417 | if QanAgentConfigParams.DebugMap[10] { 418 | for key, value := range ProcUserMap { 419 | log.Printf("Processlist:%s, %s\n", key, *value) 420 | } 421 | } 422 | } 423 | -------------------------------------------------------------------------------- /local_database/users.sql: -------------------------------------------------------------------------------- 1 | CREATE USER /*!50706 IF NOT EXISTS*/ 'qan_ro'@'localhost' IDENTIFIED BY 'Complex_P@ssw0rd'; 2 | GRANT PROCESS ON *.* TO 'qan_ro'@'localhost'; 3 | -------------------------------------------------------------------------------- /qan.toml: -------------------------------------------------------------------------------- 1 | [sniffer] 2 | # Database type. As of now only MySQL is supported 3 | Database = "MySQL" 4 | 5 | # Port to sniff 6 | ListenPort = 3306 7 | 8 | # interface to listen on (ex: eth0, eth1, bond0 etc) 9 | ListenInterface = "eth0" 10 | 11 | # The max payload of TCP is 64K, but it will be a big CPU overhead to capture 64K. 12 | # The capture length can be reduced (most of the queries won't take full payload length) to suit your query length 13 | CaptureLength = 8192 14 | 15 | # You can get packets bigger than the capture length and logging the number of such packets helps us in tuning the capture length. 
16 | # You can also get the number of aborted connections due to timeout, reset connections etc. 17 | # Print such instances every ReportStatsInterval seconds 18 | ReportStatsInterval = 60 19 | 20 | # Comma separated list of IPs to be ignored from sniffing 21 | IgnoreIPs = "" 22 | 23 | [qan-agent] 24 | # Comma separated debug levels 25 | # Example: 26 | # DebugLevels = "2,3,4,5,9,10" 27 | # 1 - Processed query info, this can be used if you do not send results to remote server 28 | # 2 - Source, Query, Query Time 29 | # 3 - Queries greater than capture length 30 | # 4 - User and connection related information, memory related info 31 | # 5 - MySQL packet info 32 | # 6 - Query Response info 33 | # 7 - Orphan packets garbage collection information 34 | # 8 - Orphan packets garbage collection detailed information 35 | # 9 - Access denied requests 36 | # 10 - Processlist 37 | DebugLevels = "" 38 | 39 | # If the query is bigger than MaxRequestLength, it will be ignored. Probably it is a huge insert 40 | MaxRequestLength = 524288 41 | 42 | # Maximum number of db connections. This decides the connection buffer for qan agent. Buffer will be set to 1.5 times the max connections. It is fine to have connections to db more than what is specified here. 43 | MaxDBConnections = 1024 44 | 45 | # Send the query report to remote server every ReportInterval seconds 46 | ReportInterval = 300 47 | 48 | # Log file to print 49 | LogFile = "/var/log/qan.log" 50 | 51 | [localDB] 52 | # Some connections might have been established before the sniffer was started. It is not possible to get those connection details like user and db. If enabled, the agent connects to the local database, checks the processlist, and gets the user and db info 53 | # 0 - Do not check processlist 54 | # 1 - Check processlist only once at startup 55 | # 2 - Check processlist as and when required 56 | Enabled = 0 57 | 58 | # Username to connect to the local database 59 | LocalUsername = "qan_ro" 60 | 61 | # Password to connect to the local database 62 | LocalPassword = "xxxx" 63 | 64 | # Socket to connect to the local database 65 | LocalSocket = "/var/lib/mysql/mysql.sock" 66 | 67 | ### Send the query reports to remote database server 68 | [remoteDB] 69 | Enabled = 0 70 | 71 | # remote database hostname to send results to. If it is IPv6, enclose with [] example: [::1] 72 | Hostname = "" 73 | 74 | # remote database port to send results to 75 | Port = 3306 76 | 77 | # remote database username to send results to 78 | Username = "qan_rw" 79 | 80 | # remote database password to send results to 81 | Password = "xxxx" 82 | 83 | # remote database name to send results to 84 | DBName = "query_analyzer" 85 | 86 | # Sample query is the query which took the maximum time for that fingerprint. 87 | # Sample query contains data. 
If you do not want to send data, disable this 88 | IncludeSample = 1 89 | 90 | # send the reports over SSL 91 | # 0 - disabled 92 | # 1 - enabled with skip verify 93 | # 2 - self signed SSL certificates taken from Ca_cert, Client_cert, Client_key config 94 | # 3 - SSL certificates taken from Ca_cert, Client_cert, Client_key config 95 | SSLEnabled = 1 96 | 97 | # SSL certificate details 98 | Ca_cert = "" 99 | Client_cert = "" 100 | Client_key = "" 101 | 102 | ### post to remote API instead of remote database (not implemented yet) 103 | [remoteAPI] 104 | Enabled = 0 105 | URL = "https://xxxx" 106 | apikey = "xxxx" 107 | -------------------------------------------------------------------------------- /query_analyzer.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "crypto/tls" 5 | "crypto/x509" 6 | "database/sql" 7 | _ "encoding/hex" 8 | "errors" 9 | "flag" 10 | "fmt" 11 | mysql "github.com/go-sql-driver/mysql" 12 | "github.com/google/gopacket" 13 | "github.com/google/gopacket/layers" 14 | "github.com/google/gopacket/pcap" 15 | MySQLProtocol "github.com/linkedin/QueryAnalyzerAgent/databases" 16 | "github.com/spf13/viper" 17 | "io/ioutil" 18 | "log" 19 | "net" 20 | "os" 21 | "path/filepath" 22 | _ "regexp" 23 | "strconv" 24 | "strings" 25 | "sync" 26 | "time" 27 | ) 28 | 29 | // Config struct to hold various sections of the config 30 | type Config struct { 31 | Sniffer snifferConfig `mapstructure:"sniffer"` 32 | QanAgent qanAgentConfig `mapstructure:"qan-agent"` 33 | LocalDB localDBConfig `mapstructure:"localDB"` 34 | RemoteDB remoteDBConfig `mapstructure:"remoteDB"` 35 | } 36 | 37 | // Config struct to hold sniffer config section 38 | type snifferConfig struct { 39 | ListenInterface string 40 | ListenPort uint16 41 | CaptureLength int32 42 | ReportStatsInterval int32 43 | IgnoreIPs string 44 | } 45 | 46 | // Config struct to hold qan agent config section 47 | type qanAgentConfig struct { 48 | 
ReportInterval uint32 49 | MaxRequestLength uint32 50 | MaxDBConnections int 51 | DebugLevels string 52 | LogFile string 53 | DebugMap map[uint8]bool 54 | } 55 | 56 | // Config struct to hold config related to connecting to local db 57 | type localDBConfig struct { 58 | LocalUsername string 59 | LocalPassword string 60 | LocalSocket string 61 | Enabled uint8 62 | } 63 | 64 | // Config struct to hold config related to connecting to remote db 65 | type remoteDBConfig struct { 66 | Hostname string 67 | Port uint16 68 | Username string 69 | Password string 70 | DBName string 71 | Ca_cert string 72 | Client_cert string 73 | Client_key string 74 | Enabled uint8 75 | SSLEnabled uint8 76 | IncludeSample uint8 77 | } 78 | 79 | // struct for query information 80 | type queryInfo struct { 81 | checksum string 82 | src string 83 | user string 84 | db string 85 | fingerprint string 86 | sample string 87 | queryTime float64 88 | minTime float64 89 | maxTime float64 90 | count uint32 91 | bytesIn uint32 92 | } 93 | 94 | // struct for query metadata information 95 | type queryMeta struct { 96 | checksum string 97 | minTime float64 98 | maxTime float64 99 | } 100 | 101 | // hashmap for query info 102 | var queryInfoMap map[string]*queryInfo = make(map[string]*queryInfo) 103 | var queryInfoCopyMap map[string]*queryInfo = make(map[string]*queryInfo) 104 | 105 | // hashmap for query metadata info 106 | var queryMetaMap map[string]*queryMeta = make(map[string]*queryMeta) 107 | 108 | // send results or not to the remote system 109 | var sendResults bool 110 | 111 | // hostname of the localhost 112 | var host string 113 | 114 | // config instance 115 | var Params = Config{} 116 | 117 | // Lock for query info 118 | var queryInfoMapMutex = &sync.Mutex{} 119 | 120 | // count of queries with packet size equal to capture size 121 | var fullLengthQueryCount uint16 122 | var abortedConnectionCount uint16 123 | var accessDeniedCount uint16 124 | 125 | /* 126 | Read config from toml file and 
unmarshal it into the Config struct 127 | */ 128 | func ReadConfig() Config { 129 | ConfigFile := flag.String("config-file", "/etc/qan.toml", "Path to the configuration file") 130 | flag.Parse() 131 | viper.SetConfigType("toml") 132 | if len(os.Args) > 1 { 133 | ConfigPath := filepath.Dir(*ConfigFile) 134 | ConfigFullName := filepath.Base(*ConfigFile) 135 | Extension := filepath.Ext(ConfigFullName) 136 | ConfigName := ConfigFullName[0 : len(ConfigFullName)-len(Extension)] 137 | viper.AddConfigPath(ConfigPath) 138 | viper.SetConfigName(ConfigName) 139 | } else { 140 | viper.SetConfigName("qan") 141 | viper.AddConfigPath(".") 142 | viper.AddConfigPath("/etc/") 143 | } 144 | if err := viper.ReadInConfig(); err != nil { 145 | fmt.Printf("Unable to load config: %s", err) 146 | os.Exit(1) 147 | } 148 | if err := viper.Unmarshal(&Params); err != nil { 149 | fmt.Printf("Unable to read config: %s", err) 150 | } 151 | return Params 152 | } 153 | 154 | /* 155 | This function opens the interface and sniffs packets over the port. 156 | It extracts the source IP, source port, dest IP and dest port and determines whether it is a request or a response. 157 | If the source port in the TCP header is the same as the port being sniffed, the packet is considered a response; otherwise, it is a request. 158 | After extracting the info from the TCP header, processPayload is called to process the payload.
159 | */ 160 | func StartSniffer(ipaddrs []string) { 161 | var handle *pcap.Handle 162 | var fullLength bool = false 163 | handle, err := pcap.OpenLive(Params.Sniffer.ListenInterface, Params.Sniffer.CaptureLength, false, 0) 164 | if handle == nil || err != nil { 165 | msg := "unknown error" 166 | if err != nil { 167 | msg = err.Error() 168 | } 169 | log.Fatalf("ERROR: Failed to open device: %s", msg) 170 | } 171 | 172 | filter := fmt.Sprintf("tcp and port %d", uint16(Params.Sniffer.ListenPort)) 173 | if len(ipaddrs) > 0 { 174 | ipFilter := "src host " 175 | // Ignore request originating from this server to remote server 176 | ipFilter += strings.Join(ipaddrs, " or src host ") 177 | filter = fmt.Sprintf("%s and not (dst port %d and (%s))", filter, uint16(Params.Sniffer.ListenPort), ipFilter) 178 | } 179 | 180 | if Params.Sniffer.IgnoreIPs != "" { 181 | ignoreIPs := strings.Split(Params.Sniffer.IgnoreIPs, ",") 182 | ignoreFilter := "host " 183 | ignoreFilter += strings.Join(ignoreIPs, " or host ") 184 | filter = fmt.Sprintf("%s and not (%s)", filter, ignoreFilter) 185 | } 186 | 187 | err = handle.SetBPFFilter(filter) 188 | if err != nil { 189 | log.Fatalf("ERROR: Failed to set port filter: %s", err) 190 | } 191 | 192 | packetSource := gopacket.NewPacketSource(handle, handle.LinkType()) 193 | for packet := range packetSource.Packets() { 194 | var src string 195 | var request bool = false 196 | SrcIP, DstIP := getIP(packet) 197 | if SrcIP != "" && DstIP != "" { 198 | SrcPort, DstPort, isRST := getTCPPort(packet) 199 | if SrcPort != 0 && DstPort != 0 { 200 | if Params.QanAgent.DebugMap[5] { 201 | log.Printf("From %s-%d to %s-%d\n", SrcIP, SrcPort, DstIP, DstPort) 202 | } 203 | if SrcPort == Params.Sniffer.ListenPort { 204 | src = fmt.Sprintf("%s-%d", DstIP, DstPort) 205 | request = false 206 | } else if DstPort == Params.Sniffer.ListenPort { 207 | src = fmt.Sprintf("%s-%d", SrcIP, SrcPort) 208 | request = true 209 | } else { 210 | log.Fatalf("ERROR: Got packet src = %d, 
dst = %d", SrcPort, DstPort) 211 | } 212 | if len(packet.Data()) == int(Params.Sniffer.CaptureLength) { 213 | fullLength = true 214 | } else { 215 | fullLength = false 216 | } 217 | processPayload(&src, request, packet, isRST, fullLength) 218 | } 219 | } 220 | } 221 | } 222 | 223 | /* 224 | Returns the source IP and dest IP 225 | */ 226 | func getIP(packet gopacket.Packet) (string, string) { 227 | ipv4Layer := packet.Layer(layers.LayerTypeIPv4) 228 | if ipv4Layer != nil { 229 | ip, _ := ipv4Layer.(*layers.IPv4) 230 | return ip.SrcIP.String(), ip.DstIP.String() 231 | } 232 | 233 | ipv6Layer := packet.Layer(layers.LayerTypeIPv6) 234 | if ipv6Layer != nil { 235 | ip, _ := ipv6Layer.(*layers.IPv6) 236 | return ip.SrcIP.String(), ip.DstIP.String() 237 | } 238 | return "", "" 239 | } 240 | 241 | /* 242 | Returns the source port and dest port 243 | */ 244 | func getTCPPort(packet gopacket.Packet) (uint16, uint16, bool) { 245 | tcpLayer := packet.Layer(layers.LayerTypeTCP) 246 | if tcpLayer != nil { 247 | tcp, _ := tcpLayer.(*layers.TCP) 248 | SrcPort, _ := strconv.Atoi(fmt.Sprintf("%d", tcp.SrcPort)) 249 | DstPort, _ := strconv.Atoi(fmt.Sprintf("%d", tcp.DstPort)) 250 | return uint16(SrcPort), uint16(DstPort), tcp.RST 251 | } 252 | return 0, 0, false 253 | } 254 | 255 | /* 256 | Returns the payload of the packet 257 | */ 258 | func getPayload(packet gopacket.Packet) []byte { 259 | applicationLayer := packet.ApplicationLayer() 260 | if applicationLayer != nil { 261 | return applicationLayer.Payload() 262 | } 263 | return nil 264 | } 265 | 266 | /* 267 | Pass the payload to the protocol parser of the database. 268 | */ 269 | func processPayload(src *string, request bool, packet gopacket.Packet, isRST bool, fullLength bool) { 270 | payload := getPayload(packet) 271 | if len(payload) == 0 && !(isRST == true && request == false) { 272 | return 273 | } 274 | 275 | /* 276 | If this is not a request, it means it is a response to an earlier request. 
If a response arrives, it means the query execution was 277 | completed. Record the end time, calculate the total response time, and pass the query to processQuery for anonymization and further analytics. 278 | */ 279 | if !request { 280 | sourceMapHandle, exists := MySQLProtocol.SourceMap[*src] 281 | if exists { 282 | // Discard entries with access denied requests 283 | if MySQLProtocol.IsErrPacket(&payload) { 284 | accessDeniedCount++ 285 | delete(MySQLProtocol.SourceMap, sourceMapHandle.Src) 286 | if Params.QanAgent.DebugMap[9] { 287 | log.Printf("Error:%s\n", payload) 288 | } 289 | return 290 | } 291 | if isRST == false { 292 | if Params.QanAgent.DebugMap[6] { 293 | log.Printf("Source: %s\nResponse Packet: %s\nResponse Count: %d\nResponse Length: %d\n", sourceMapHandle.Src, string(getPayload(packet)), sourceMapHandle.ResponseCount, len(payload)) 294 | } 295 | sourceMapHandle.ResponseCount++ 296 | sourceMapHandle.ResponseTime = float64(uint64(time.Since(sourceMapHandle.QueryEndTime).Nanoseconds())) / 1000000000 297 | // If you want to capture the response time from the first packet of the query till the first packet of response.
298 | // sourceMapHandle.totalResponseTime = float64(uint64(time.Since(sourceMapHandle.QueryStartTime).Nanoseconds())) / 1000000000 299 | if sourceMapHandle.FullLength == true { 300 | fullLengthQueryCount++ 301 | if Params.QanAgent.DebugMap[3] { 302 | log.Printf("Full Length Query\n*************\nSource: %s\nTime: %f\nUser: %s\nDB: %s\nQuery: %s\n------------------------------\n", sourceMapHandle.Src, sourceMapHandle.ResponseTime, sourceMapHandle.User, sourceMapHandle.Db, sourceMapHandle.QueryText) 303 | } 304 | } 305 | if sourceMapHandle.QueryText != "" && sourceMapHandle.PreparedStatement == false { 306 | if sourceMapHandle.FullLength == false { 307 | if Params.QanAgent.DebugMap[2] { 308 | log.Printf("Source: %s\nTime: %f\nUser: %s\nDB: %s\nQuery: %s\n------------------------------\n", sourceMapHandle.Src, sourceMapHandle.ResponseTime, sourceMapHandle.User, sourceMapHandle.Db, sourceMapHandle.QueryText) 309 | if Params.QanAgent.DebugMap[4] { 310 | log.Printf("Length of Query Hashmap: %d\n******************************\n", len(MySQLProtocol.SourceMap)) 311 | } 312 | } 313 | // Asynchronous query processing 314 | go processQuery(*sourceMapHandle) 315 | } 316 | // Reset the hashmap for the source to handle the next requests from the same connection 317 | sourceMapHandle.QueryText = "" 318 | sourceMapHandle.RequestLength = 0 319 | sourceMapHandle.NewConnection = false 320 | if len(MySQLProtocol.SourceMap) > Params.QanAgent.MaxDBConnections { 321 | if Params.QanAgent.DebugMap[7] { 322 | log.Printf("Length of Query Hashmap before cleanup: %d\n******************************\n", len(MySQLProtocol.SourceMap)) 323 | log.Printf("Query Hashmap:\n******************************\n") 324 | } 325 | for key, _ := range MySQLProtocol.SourceMap { 326 | if Params.QanAgent.DebugMap[8] { 327 | log.Printf("%s:%v\n", key, MySQLProtocol.SourceMap[key]) 328 | } 329 | if MySQLProtocol.SourceMap[key].QueryText == "" || 
uint64((uint64(time.Since(MySQLProtocol.SourceMap[key].QueryEndTime).Nanoseconds()))/1000000000) > 3600 { 330 | if Params.QanAgent.DebugMap[7] { 331 | log.Printf("Length of Query Hashmap after deleting queries > 3600 secs: %d\n******************************\n", len(MySQLProtocol.SourceMap)) 332 | log.Printf("SRC: %s, DURATION: %d, Sync: %t\n", key, uint64(time.Since(MySQLProtocol.SourceMap[key].QueryStartTime).Seconds()), MySQLProtocol.SourceMap[key].NewConnection) 333 | log.Printf("SRC: %s, Query: %s, Sync: %t\n", key, MySQLProtocol.SourceMap[key].QueryText, MySQLProtocol.SourceMap[key].NewConnection) 334 | } 335 | delete(MySQLProtocol.SourceMap, key) 336 | } 337 | } 338 | if Params.QanAgent.DebugMap[7] { 339 | log.Printf("Length of Query Hashmap after cleanup: %d\n******************************\n", len(MySQLProtocol.SourceMap)) 340 | } 341 | // If entries in MySQLProtocol.SourceMap get deleted, the user and db information is lost. LocalDB.Enabled = 2 repopulates the user and db information from the processlist. 342 | if Params.LocalDB.Enabled == 2 { 343 | MySQLProtocol.GetProcesslist() 344 | } 345 | } 346 | } 347 | } else { 348 | delete(MySQLProtocol.SourceMap, sourceMapHandle.Src) 349 | if Params.QanAgent.DebugMap[7] { 350 | log.Printf("Length of Query Hashmap after delete RST: %d\n******************************\n", len(MySQLProtocol.SourceMap)) 351 | } 352 | abortedConnectionCount++ 353 | } 354 | } 355 | return 356 | } 357 | 358 | // If it is a request, call the ParseProtocol function to extract the query from the packet 359 | MySQLProtocol.ParseProtocol(src, &payload, fullLength) 360 | } 361 | 362 | func processQuery(sourceMapHandle MySQLProtocol.Source) (err error) { 363 | defer func() { 364 | // Recover from a panic if one occurred and set err; otherwise err remains nil.
365 | if recover() != nil { 366 | log.Printf("Failed while processing query.\nDetails: %v", sourceMapHandle) 367 | err = errors.New("Unknown Exception") 368 | } 369 | }() 370 | fp, checksum := MySQLProtocol.AnonymizeQuery(sourceMapHandle.QueryText) 371 | procUserMapHandle, ok := MySQLProtocol.ProcUserMap[sourceMapHandle.Src] 372 | if ok { 373 | sourceMapHandle.User = procUserMapHandle.User 374 | sourceMapHandle.Db = procUserMapHandle.Db 375 | } else { 376 | if (len(sourceMapHandle.User) == 0) || (len(sourceMapHandle.User) > 16) { 377 | sourceMapHandle.User = "unknown" 378 | } 379 | if (len(sourceMapHandle.Db) == 0) || (len(sourceMapHandle.Db) > 64) { 380 | sourceMapHandle.Db = "unknown" 381 | } 382 | } 383 | src := strings.Split(sourceMapHandle.Src, "-")[0] 384 | queryInfoMapMutex.Lock() 385 | queryInfoMapHandle, ok := queryInfoMap[checksum] 386 | if !ok { 387 | queryInfoMapHandle = &queryInfo{checksum: checksum, src: src, user: sourceMapHandle.User, db: sourceMapHandle.Db, fingerprint: fp, count: 1, sample: sourceMapHandle.QueryText, queryTime: sourceMapHandle.ResponseTime, minTime: sourceMapHandle.ResponseTime, maxTime: sourceMapHandle.ResponseTime, bytesIn: sourceMapHandle.RequestLength} 388 | queryInfoMap[checksum] = queryInfoMapHandle 389 | } else { 390 | queryInfoMapHandle.queryTime += sourceMapHandle.ResponseTime 391 | queryInfoMapHandle.count++ 392 | queryInfoMapHandle.bytesIn += sourceMapHandle.RequestLength 393 | if queryInfoMapHandle.maxTime < sourceMapHandle.ResponseTime { 394 | queryInfoMapHandle.maxTime = sourceMapHandle.ResponseTime 395 | queryInfoMapHandle.sample = sourceMapHandle.QueryText 396 | } 397 | if queryInfoMapHandle.minTime > sourceMapHandle.ResponseTime { 398 | queryInfoMapHandle.minTime = sourceMapHandle.ResponseTime 399 | } 400 | } 401 | if Params.QanAgent.DebugMap[1] { 402 | log.Printf("Source: %s\nChecksum: %s\nUser: %s\nDB: %s\nFingerprint: %s\nCount: %d\nSample: %s\nqueryTime: %g\nminTime: %g\nmaxTime: %g\nbytesIn: 
%d\n------------------------------------\n", queryInfoMapHandle.src, queryInfoMapHandle.checksum, queryInfoMapHandle.user, queryInfoMapHandle.db, queryInfoMapHandle.fingerprint, queryInfoMapHandle.count, queryInfoMapHandle.sample, queryInfoMapHandle.queryTime, queryInfoMapHandle.minTime, queryInfoMapHandle.maxTime, queryInfoMapHandle.bytesIn) 403 | } 404 | queryInfoMapMutex.Unlock() 405 | return 406 | } 407 | 408 | func sendResultsToDB(host string) (err error) { 409 | defer func() { 410 | // Recover from a panic if one occurred and set err; otherwise err remains nil. 411 | if recover() != nil { 412 | log.Printf("Failed while sending results to remote system\n") 413 | err = errors.New("Unknown Exception") 414 | } 415 | }() 416 | if Params.RemoteDB.Enabled != 1 { 417 | return 418 | } 419 | var db *sql.DB 420 | var tlsmode string = "false" 421 | var selfsigned bool = false 422 | var conn string 423 | var queryInfoMapLength int 424 | 425 | switch Params.RemoteDB.SSLEnabled { 426 | case 0: 427 | tlsmode = "false" 428 | case 1: 429 | tlsmode = "skip-verify" 430 | case 2: 431 | tlsmode = "custom" 432 | selfsigned = true 433 | case 3: 434 | tlsmode = "custom" 435 | selfsigned = false 436 | default: 437 | tlsmode = "skip-verify" 438 | } 439 | 440 | // fall back to skip-verify if any of Ca_cert, Client_cert or Client_key is not specified 441 | if tlsmode == "custom" && Params.RemoteDB.Ca_cert != "" && Params.RemoteDB.Client_cert != "" && Params.RemoteDB.Client_key != "" { 442 | tlsmode = "custom" 443 | } else if tlsmode == "custom" { 444 | tlsmode = "skip-verify" 445 | } 446 | 447 | if tlsmode == "custom" { 448 | rootCertPool := x509.NewCertPool() 449 | pem, err := ioutil.ReadFile(Params.RemoteDB.Ca_cert) 450 | if err != nil { 451 | log.Fatalf("Unable to open the CA certs - %s", err) 452 | } 453 | if ok := rootCertPool.AppendCertsFromPEM(pem); !ok { 454 | log.Fatal("Failed to append PEM.") 455 | } 456 | clientCert := make([]tls.Certificate, 0, 1) 457 | certs, err :=
tls.LoadX509KeyPair(Params.RemoteDB.Client_cert, Params.RemoteDB.Client_key) 458 | if err != nil { 459 | log.Fatalf("Unable to open the Client certs and Client Key - %s", err) 460 | } 461 | clientCert = append(clientCert, certs) 462 | mysql.RegisterTLSConfig("custom", &tls.Config{ 463 | RootCAs: rootCertPool, 464 | Certificates: clientCert, 465 | InsecureSkipVerify: selfsigned, 466 | }) 467 | conn = fmt.Sprintf("%s:%s@tcp(%s:%d)/%s?tls=custom&autocommit=true&charset=utf8mb4", Params.RemoteDB.Username, Params.RemoteDB.Password, Params.RemoteDB.Hostname, Params.RemoteDB.Port, Params.RemoteDB.DBName) 468 | } else { 469 | conn = fmt.Sprintf("%s:%s@tcp(%s:%d)/%s?tls=%s&autocommit=true&charset=utf8mb4", Params.RemoteDB.Username, Params.RemoteDB.Password, Params.RemoteDB.Hostname, Params.RemoteDB.Port, Params.RemoteDB.DBName, tlsmode) 470 | } 471 | 472 | db, err = sql.Open("mysql", conn) 473 | if err != nil { 474 | log.Fatalf("Unable to open the DB Connection - %s", err.Error()) 475 | } 476 | // sql.DB should be long lived "defer" closes it once this function ends 477 | defer db.Close() 478 | 479 | // Test the connection to the database 480 | err = db.Ping() 481 | if err != nil { 482 | log.Fatalf("ERROR: Unable to ping database - %s", err) 483 | } 484 | 485 | queryInfoSQL := `INSERT INTO query_info (hostname, checksum, fingerprint, sample, firstseen, mintime, mintimeat, maxtime, maxtimeat) 486 | VALUES (?, ?, ?, ?, UTC_TIMESTAMP(), ?, UTC_TIMESTAMP(), ?, UTC_TIMESTAMP()) 487 | ON DUPLICATE KEY UPDATE 488 | sample = CASE WHEN maxtime >= ? THEN sample ELSE ? END, 489 | maxtimeat = CASE WHEN maxtime >= ? THEN maxtimeat ELSE UTC_TIMESTAMP() END, 490 | mintimeat = CASE WHEN mintime <= ? THEN mintimeat ELSE UTC_TIMESTAMP() END, 491 | maxtime = CASE WHEN maxtime >= ? THEN maxtime ELSE ? END, 492 | mintime = CASE WHEN mintime <= ? THEN mintime ELSE ? 
END` 493 | queryInfoSQLHandle, err := db.Prepare(queryInfoSQL) 494 | 495 | if err != nil { 496 | log.Fatalf("ERROR: Unable to prepare SQL statement") 497 | } 498 | 499 | for { 500 | // Make a copy to avoid data race condition. TODO - implement concurrent maps or use channels 501 | queryInfoMapMutex.Lock() 502 | for checksum, data := range queryInfoMap { 503 | queryInfoCopyMap[checksum] = data 504 | delete(queryInfoMap, checksum) 505 | } 506 | queryInfoMapMutex.Unlock() 507 | queryInfoMapLength = len(queryInfoCopyMap) 508 | 509 | if sendResults == true && queryInfoMapLength > 0 { 510 | queryHistoryCols := make([]string, 0, queryInfoMapLength) 511 | vals := make([]interface{}, 0, queryInfoMapLength*6) 512 | utcnow := time.Now().UTC().Format("2006-01-02 15:04:05") 513 | 514 | for checksum, data := range queryInfoCopyMap { 515 | var upsert bool 516 | if Params.RemoteDB.IncludeSample == 0 { 517 | data.sample = "" 518 | } 519 | 520 | // Uncomment these lines if pseudo GTID has to be ignored 521 | /* 522 | // Ignore pseudo GTID 523 | if strings.Contains(data.sample, "drop view if exists `_pseudo_gtid_`") { 524 | delete(queryMetaMap, data.checksum) 525 | delete(queryInfoCopyMap, checksum) 526 | continue 527 | } 528 | */ 529 | 530 | queryHistoryCols = append(queryHistoryCols, "(?,?,?,?,?,?,?,?,?)") 531 | vals = append(vals, host, data.checksum, data.src, data.user, data.db, utcnow, data.count, data.queryTime, data.bytesIn) 532 | 533 | delete(queryInfoCopyMap, checksum) 534 | 535 | queryMetaMapHandle, exists := queryMetaMap[data.checksum] 536 | if !exists { 537 | queryMetaMapHandle = &queryMeta{checksum: checksum, minTime: data.minTime, maxTime: data.maxTime} 538 | queryMetaMap[checksum] = queryMetaMapHandle 539 | _, err = queryInfoSQLHandle.Exec(host, data.checksum, data.fingerprint, data.sample, data.minTime, data.maxTime, data.maxTime, data.sample, data.maxTime, data.minTime, data.maxTime, data.maxTime, data.minTime, data.minTime) 540 | if err != nil { 541 | 
log.Printf("ERROR: Unable to insert into query_info table: %s", err) 542 | log.Printf("SQL (query_info): %s\nChecksum: %s\nFingerprint: %s\nSample: %s\nMaxTime: %f\nMinTime: %f\n------------------------------", queryInfoSQL, data.checksum, data.fingerprint, data.sample, data.maxTime, data.minTime) 543 | } 544 | } else { 545 | if data.maxTime > queryMetaMapHandle.maxTime { 546 | queryMetaMapHandle.maxTime = data.maxTime 547 | upsert = true 548 | } 549 | 550 | if data.minTime < queryMetaMapHandle.minTime { 551 | queryMetaMapHandle.minTime = data.minTime 552 | upsert = true 553 | } 554 | 555 | if upsert == true { 556 | _, err = queryInfoSQLHandle.Exec(host, data.checksum, data.fingerprint, data.sample, data.minTime, data.maxTime, data.maxTime, data.sample, data.maxTime, data.minTime, data.maxTime, data.maxTime, data.minTime, data.minTime) 557 | if err != nil { 558 | log.Printf("ERROR: Unable to upsert into query_info table: %s", err) 559 | } 560 | upsert = false 561 | } 562 | } 563 | } 564 | // Doing batch insert 565 | queryHistorySQL := fmt.Sprintf("INSERT INTO query_history (hostname, checksum, src, user, db, ts, count, querytime, bytes) VALUES %s", strings.Join(queryHistoryCols, ",")) 566 | _, err := db.Exec(queryHistorySQL, vals...) 
567 | if err != nil { 568 | log.Printf("ERROR: Unable to insert into query_history table %s %s: %s", queryHistorySQL, vals, err) 569 | } 570 | sendResults = false 571 | } else { 572 | time.Sleep(time.Second * time.Duration(Params.QanAgent.ReportInterval)) 573 | sendResults = true 574 | if Params.QanAgent.DebugMap[4] { 575 | log.Printf("Length of Query Info Hashmap: %d\n******************************\n", queryInfoMapLength) 576 | } 577 | } 578 | } 579 | } 580 | 581 | /* 582 | This function logs the number of queries with packet size equal to or more than the capture size, aborted connection count (wait timeout, connection reset etc) 583 | */ 584 | func logStats() { 585 | for { 586 | if fullLengthQueryCount > 65535 { 587 | fullLengthQueryCount = 1 588 | } 589 | if abortedConnectionCount > 65535 { 590 | abortedConnectionCount = 1 591 | } 592 | if accessDeniedCount > 65535 { 593 | accessDeniedCount = 1 594 | } 595 | log.Printf("%s\nNumber of queries more than Capture Length: %d\nNumber of aborted connections: %d\nNumber of access denied requests: %d\n", time.Now().Format("2006-01-02 15:04:05"), fullLengthQueryCount, abortedConnectionCount, accessDeniedCount) 596 | time.Sleep(time.Second * time.Duration(Params.Sniffer.ReportStatsInterval)) 597 | } 598 | } 599 | 600 | func main() { 601 | var ip string 602 | var ipaddrs []string 603 | 604 | Params = ReadConfig() 605 | if Params.QanAgent.MaxDBConnections == 0 { 606 | Params.QanAgent.MaxDBConnections = 1024 607 | } 608 | levels := strings.Split(Params.QanAgent.DebugLevels, ",") 609 | // Create hashmap of debug levels 610 | Params.QanAgent.DebugMap = make(map[uint8]bool) 611 | for _, element := range levels { 612 | level, _ := strconv.Atoi(element) 613 | Params.QanAgent.DebugMap[uint8(level)] = true 614 | } 615 | 616 | LogFile, err := os.OpenFile(Params.QanAgent.LogFile, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666) 617 | if err != nil { 618 | log.Printf("Error opening log file, using STDOUT: %v", err) 619 | } else { 620 | 
log.SetOutput(LogFile) 621 | } 622 | defer LogFile.Close() 623 | 624 | log.SetPrefix("") 625 | log.SetFlags(0) 626 | log.Printf("%s\nStarting Sniffer\n", time.Now().Format("2006-01-02 15:04:05")) 627 | 628 | host, err := os.Hostname() 629 | if err != nil { 630 | log.Fatalf("Unable to identify the hostname: %s", err) 631 | } 632 | 633 | ifaces, _ := net.Interfaces() 634 | // handle err 635 | for _, i := range ifaces { 636 | addrs, _ := i.Addrs() 637 | // handle err 638 | for _, addr := range addrs { 639 | switch v := addr.(type) { 640 | case *net.IPNet: 641 | ip = fmt.Sprintf("%s", v.IP) 642 | if ip != "127.0.0.1" && ip != "::1" { 643 | ipaddrs = append(ipaddrs, ip) 644 | } 645 | case *net.IPAddr: 646 | ip = fmt.Sprintf("%s", v.IP) 647 | ipaddrs = append(ipaddrs, ip) 648 | } 649 | } 650 | } 651 | 652 | // Pass the config params to the Protocol parser file 653 | MySQLProtocol.LocalDBConfigParams = MySQLProtocol.LocalDBConfig(Params.LocalDB) 654 | MySQLProtocol.QanAgentConfigParams = MySQLProtocol.QanAgentConfig(Params.QanAgent) 655 | 656 | // Send the results to the remote server 657 | go sendResultsToDB(host) 658 | 659 | // Print the number of queries with packet size equal to or more than the capture size and also aborted connections 660 | if Params.Sniffer.ReportStatsInterval > 0 { 661 | go logStats() 662 | } 663 | if Params.LocalDB.Enabled >= 1 { 664 | MySQLProtocol.GetProcesslist() 665 | } 666 | StartSniffer(ipaddrs) 667 | } 668 | -------------------------------------------------------------------------------- /remote_database/remote_schema.sql: -------------------------------------------------------------------------------- 1 | CREATE DATABASE IF NOT EXISTS `query_analyzer`; 2 | 3 | CREATE TABLE IF NOT EXISTS `query_analyzer`.`query_info` ( 4 | `hostname` varchar(64) NOT NULL DEFAULT '', 5 | `checksum` char(16) NOT NULL DEFAULT '', 6 | `fingerprint` longtext NOT NULL, 7 | `sample` longtext CHARACTER SET utf8mb4, 8 | `firstseen` datetime NOT NULL, 9 | `mintime`
float NOT NULL DEFAULT '0', 10 | `mintimeat` datetime NOT NULL, 11 | `maxtime` float NOT NULL DEFAULT '0', 12 | `maxtimeat` datetime NOT NULL, 13 | `is_reviewed` enum('0','1','2') NOT NULL DEFAULT '0', 14 | `reviewed_by` varchar(20) DEFAULT NULL, 15 | `reviewed_on` datetime DEFAULT NULL, 16 | `comments` mediumtext, 17 | PRIMARY KEY (`hostname`,`checksum`), 18 | KEY `checksum` (`checksum`) 19 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4; 20 | 21 | CREATE TABLE IF NOT EXISTS `query_analyzer`.`query_history` ( 22 | `hostname` varchar(64) NOT NULL DEFAULT '', 23 | `checksum` char(16) NOT NULL DEFAULT '', 24 | `src` varchar(39) NOT NULL DEFAULT '', 25 | `user` varchar(16) DEFAULT NULL, 26 | `db` varchar(64) NOT NULL DEFAULT '', 27 | `ts` datetime NOT NULL, 28 | `count` int unsigned NOT NULL DEFAULT '1', 29 | `querytime` float NOT NULL DEFAULT '0', 30 | `bytes` int unsigned NOT NULL DEFAULT '0', 31 | PRIMARY KEY (`hostname`,`checksum`,`ts`), 32 | KEY `checksum` (`checksum`), 33 | KEY `user` (`user`), 34 | KEY `covering` (`hostname`,`ts`,`querytime`,`count`,`bytes`) 35 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 36 | /*!50100 PARTITION BY RANGE (TO_DAYS(ts)) 37 | (PARTITION p202002 VALUES LESS THAN (TO_DAYS('2020-03-01')) ENGINE = InnoDB, 38 | PARTITION p202003 VALUES LESS THAN (TO_DAYS('2020-04-01')) ENGINE = InnoDB, 39 | PARTITION p202004 VALUES LESS THAN (TO_DAYS('2020-05-01')) ENGINE = InnoDB, 40 | PARTITION p202005 VALUES LESS THAN (TO_DAYS('2020-06-01')) ENGINE = InnoDB, 41 | PARTITION p202006 VALUES LESS THAN (TO_DAYS('2020-07-01')) ENGINE = InnoDB, 42 | PARTITION p202007 VALUES LESS THAN (TO_DAYS('2020-08-01')) ENGINE = InnoDB, 43 | PARTITION pMAX VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */; 44 | -------------------------------------------------------------------------------- /remote_database/users.sql: -------------------------------------------------------------------------------- 1 | CREATE USER /*!50706 IF NOT EXISTS*/ 
'qan_rw'@'qan_agent_ip' IDENTIFIED BY 'Complex_P@ssw0rd'; 2 | GRANT SELECT, INSERT, UPDATE ON `query_analyzer`.* TO 'qan_rw'@'qan_agent_ip'; 3 | --------------------------------------------------------------------------------
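The BPF capture filter that `StartSniffer` assembles (the base `tcp and port` filter, plus exclusions for requests this host originates and for the `IgnoreIPs` config value) can be sketched as a standalone helper. `buildFilter`, its parameter names, and the sample IPs below are illustrative and not part of the repository:

```go
package main

import (
	"fmt"
	"strings"
)

// buildFilter assembles a BPF filter string the same way StartSniffer does:
// capture TCP traffic on listenPort, skip requests that originate from this
// host's own IPs (so the agent's own reporting traffic is not sniffed), and
// skip any comma-separated IgnoreIPs from the config.
func buildFilter(listenPort uint16, localIPs []string, ignoreIPs string) string {
	filter := fmt.Sprintf("tcp and port %d", listenPort)
	if len(localIPs) > 0 {
		// Ignore requests originating from this server to remote servers
		ipFilter := "src host " + strings.Join(localIPs, " or src host ")
		filter = fmt.Sprintf("%s and not (dst port %d and (%s))", filter, listenPort, ipFilter)
	}
	if ignoreIPs != "" {
		ignoreFilter := "host " + strings.Join(strings.Split(ignoreIPs, ","), " or host ")
		filter = fmt.Sprintf("%s and not (%s)", filter, ignoreFilter)
	}
	return filter
}

func main() {
	// Prints: tcp and port 3306 and not (dst port 3306 and (src host 10.0.0.5)) and not (host 10.0.0.9)
	fmt.Println(buildFilter(3306, []string{"10.0.0.5"}, "10.0.0.9"))
}
```

The resulting string is what the agent hands to `handle.SetBPFFilter`, so all filtering happens in the kernel before packets ever reach the Go process.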