├── .gitignore ├── Gemfile ├── Gemfile.lock ├── Dockerfile ├── README.md └── redis-audit.rb /.gitignore: -------------------------------------------------------------------------------- 1 | .bundle/ 2 | vendor/ 3 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source 'https://rubygems.org' 2 | 3 | gem 'redis' 4 | gem 'hiredis' 5 | -------------------------------------------------------------------------------- /Gemfile.lock: -------------------------------------------------------------------------------- 1 | GEM 2 | remote: https://rubygems.org/ 3 | specs: 4 | hiredis (0.6.0) 5 | redis (3.2.2) 6 | 7 | PLATFORMS 8 | ruby 9 | 10 | DEPENDENCIES 11 | hiredis 12 | redis 13 | 14 | BUNDLED WITH 15 | 1.11.2 16 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ruby:alpine 2 | 3 | COPY . /redis-audit 4 | 5 | RUN set -ex; \ 6 | apk --no-cache add build-base; \ 7 | cd redis-audit; \ 8 | bundle install; \ 9 | apk del build-base; 10 | 11 | WORKDIR "/redis-audit" 12 | 13 | ENTRYPOINT ["ruby", "redis-audit.rb"] -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Redis-audit 2 | 3 | This script samples a number of the Redis keys in a database and then groups them with other *similar* looking keys. It then displays key 4 | metrics around those groups of keys to help you spot where efficiencies can be made in the memory usage of your Redis database. 5 | _Warning_: The script cannot be used with AWS Elasticache Redis instances, as the debug command is restricted. 6 | 7 | ## Installation 8 | `bundle install` will take care of everything! 
9 | 10 | 11 | ## Example 12 | 13 | If you have a Redis database that contains two sets of keys "user\_profile\_#{user\_id}" and "notification\_#{user\_id}", this script will 14 | help you work out which group of keys is taking up more memory. It will also help you spot keys that should have an expiry but don't, as well 15 | as providing you with statistics on how often keys are accessed within each group. 16 | 17 | ## Usage 18 | 19 | The script can be run in two ways: with declared (named) arguments, or with a legacy form based on the order of the arguments passed in. 20 | 21 | The legacy option looks like this: 22 | 23 | redis-audit.rb [host] [port] [password] [dbnum] [(optional)sample_size] 24 | 25 | You can also specify the arguments with declarations, which also adds the ability to use a Redis URL and pass in authentication credentials: 26 | 27 | redis-audit.rb -h/--host [host] -p/--port [port] -a/--password [password] -d/--dbnum [dbnum] -s/--sample [(optional)sample_size] 28 | 29 | or 30 | 31 | redis-audit.rb -u/--url [url] -s/--sample [(optional)sample_size] 32 | 33 | - **Host**: Generally this is 127.0.0.1 (Please note, running it remotely will cause the script to take significantly longer) 34 | - **Port**: The port to connect to (e.g. 6379) 35 | - **Password**: The Redis password if authentication is required 36 | - **DBNum**: The Redis database to connect to (e.g. 0) 37 | - **Sample size**: This optional parameter controls how many keys to sample. I recommend starting with 10, then going to 100 initially. This 38 | will enable you to see that keys are being grouped properly. If you omit this parameter the script samples 10% of your keys. If the sample size is 39 | greater than or equal to the number of keys in the database the script will walk all the keys in the Redis database. **DO NOT** run this with a lot of keys on 40 | a production master database. `KEYS *` will block for a long time and cause timeouts! 
41 | - **Url**: Follows the normal syntax for Redis Urls 42 | 43 | `redis-audit.rb --help` will print out the argument options as well. 44 | 45 | ## Outputs 46 | Auditing 127.0.0.1:6379 db:0 sampling 26000 keys 47 | DB has 8951491 keys 48 | Sampled 32.88 MB of Redis memory 49 | 50 | Found 2 key groups 51 | 52 | ============================================================================== 53 | Found 10000 keys containing strings, like: 54 | user_profile_3897016, user_profile_3339430, user_profile_3240266, user_profile_2883394, user_profile_3969781, user_profile_3256693, user_profile_3766796, user_profile_2051997, user_profile_2817842, user_profile_1453480 55 | 56 | These keys use 11.86% of the total sampled memory (3.9 MB) 57 | 99.98% of these keys expire (10067), with maximum ttl of 4 days, 23 hours, 59 minutes, 44 seconds 58 | Average last accessed time: 1 days, 8 hours, 27 minutes, 56 seconds - (Max: 12 days, 20 hours, 13 minutes Min:20 seconds) 59 | 60 | ============================================================================== 61 | Found 16000 keys containing zsets, like: 62 | notification_3109439, notification_3634040, notification_2318378, notification_3871169, notification_3980323, notification_3427141, notification_1639845, notification_2823390, notification_2658377, notification_4153039 63 | 64 | These keys use 88.14% of the total sampled memory (28.98 MB) 65 | None of these keys expire 66 | Average last accessed time: 10 days, 6 hours, 13 minutes, 23 seconds - (Max: 12 days, 20 hours, 13 minutes, 10 seconds Min:2 minutes) 67 | 68 | ============================================================================== 69 | Summary 70 | 71 | ---------------------------------------------------+--------------+-------------------+--------------------------------------------------- 72 | Key | Memory Usage | Expiry Proportion | Last Access Time 73 | 
---------------------------------------------------+--------------+-------------------+--------------------------------------------------- 74 | notification_3109439 | 88.14% | 0.0% | 2 minutes 75 | user_profile_3897016 | 11.86% | 99.98% | 20 seconds 76 | ---------------------------------------------------+--------------+-------------------+--------------------------------------------------- 77 | 78 | ## Key Grouping Algorithm 79 | The key grouping algorithm is a good default, but you may require more control over it. There is an array of regular expressions that can be used to help force a group. 80 | If the key being sampled matches a regular expression, it is grouped with all the keys that match that regex. 81 | 82 | @@key_group_regex_list = [/notification/,/user_profile/] 83 | 84 | If you don't configure the regular expressions, the script has to find a good match for each key that it finds, which can 85 | take a significant amount of time, depending on the number of types of keys. If you find the script takes too long to run, 86 | I recommend setting up the regular expressions. Even if you only set the regular expressions for 50% of the keys it will encounter, 87 | the speedup will still be noticeable. 88 | 89 | **Please note:** If the namespace is appended to your keys, rather than prepended, then you will have to configure a full set 90 | of regular expressions. 91 | 92 | ## Memory Usage 93 | The memory usage that the script calculates is based on the serialized length as reported by Redis using the DEBUG OBJECT command. 94 | This memory usage is not equal to the resident memory taken by the key, but is (hopefully) proportional to it. 95 | 96 | ## Other Redis Audit Tools 97 | - [Redis Sampler](https://github.com/antirez/redis-sampler) - Samples keys for statistics around how often you use each Redis value type, and how big the values are. By Antirez. 
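
As a rough illustration of the Memory Usage section above, the sketch below shows how the serialized length and idle time can be pulled out of a `DEBUG OBJECT` reply. The regular expression is the one used in `redis-audit.rb`; the helper name `parse_debug_object` and the sample reply string are assumptions made up for this example, and no Redis connection is needed to run it.

```ruby
# Regex taken from redis-audit.rb (@@debug_regex).
DEBUG_REGEX = /serializedlength:(\d*).*lru_seconds_idle:(\d*)/

# Hypothetical helper (not part of the script): returns nil when the reply
# does not contain the expected fields.
def parse_debug_object(reply)
  fields = DEBUG_REGEX.match(reply)
  return nil if fields.nil?

  { serialized_length: fields[1].to_i, idle_seconds: fields[2].to_i }
end

# Illustrative reply string; real values depend on your server and key.
sample_reply = "Value at:0x7f8c6a0b1c40 refcount:1 encoding:raw " \
               "serializedlength:1234 lru:9876543 lru_seconds_idle:42"

stats = parse_debug_object(sample_reply)
puts "serialized length: #{stats[:serialized_length]} bytes"
puts "idle for: #{stats[:idle_seconds]} seconds"
```

Because the figure is a serialized length, it is best read as a relative weight between key groups rather than as resident memory.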
98 | -------------------------------------------------------------------------------- /redis-audit.rb: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env ruby 2 | 3 | # Copyright (c) 2012, Simon Maynard 4 | # http://snmaynard.com 5 | # 6 | # Permission is hereby granted, free of charge, to any person obtaining a 7 | # copy of this software and associated documentation files (the "Software"), 8 | # to deal in the Software without restriction, including without limitation 9 | # the rights to use, copy, modify, merge, publish, distribute, sublicense, 10 | # and/or sell copies of the Software, and to permit persons to whom the 11 | # Software is furnished to do so, subject to the following conditions: 12 | # 13 | # The above copyright notice and this permission notice shall be included 14 | # in all copies or substantial portions of the Software. 15 | # 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, 20 | # WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 21 | # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
22 | 23 | require 'bundler/setup' 24 | require 'redis' 25 | require 'redis/connection/hiredis' 26 | require 'optparse' 27 | 28 | # Container class for stats around a key group 29 | class KeyStats 30 | attr_accessor :total_instances, 31 | :total_idle_time, 32 | :total_serialized_length, 33 | :total_expirys_set, 34 | :min_serialized_length, 35 | :max_serialized_length, 36 | :min_idle_time, 37 | :max_idle_time, 38 | :max_ttl, 39 | :sample_keys 40 | 41 | def initialize 42 | @total_instances = 0 43 | @total_idle_time = 0 44 | @total_serialized_length = 0 45 | @total_expirys_set = 0 46 | 47 | @min_serialized_length = nil 48 | @max_serialized_length = nil 49 | @min_idle_time = nil 50 | @max_idle_time = nil 51 | @max_ttl = nil 52 | 53 | @sample_keys = {} 54 | end 55 | 56 | def add_stats_for_key(key, type, idle_time, serialized_length, ttl) 57 | @total_instances += 1 58 | @total_idle_time += idle_time 59 | @total_expirys_set += 1 if ttl != nil 60 | @total_serialized_length += serialized_length 61 | 62 | @min_idle_time = idle_time if @min_idle_time.nil? || @min_idle_time > idle_time 63 | @max_idle_time = idle_time if @max_idle_time.nil? || @max_idle_time < idle_time 64 | @min_serialized_length = serialized_length if @min_serialized_length.nil? || @min_serialized_length > serialized_length 65 | @max_serialized_length = serialized_length if @max_serialized_length.nil? 
|| @max_serialized_length < serialized_length 66 | @max_ttl = ttl if ttl != nil && ( @max_ttl == nil || @max_ttl < ttl ) 67 | 68 | @sample_keys[key] = type if @sample_keys.count < 10 69 | end 70 | end 71 | 72 | class RedisAudit 73 | @@key_regex = /^(.*):(.*)$/ 74 | @@debug_regex = /serializedlength:(\d*).*lru_seconds_idle:(\d*)/ 75 | 76 | # Configure regular expressions here if you need to guarantee that certain keys are grouped together 77 | @@key_group_regex_list = [] 78 | 79 | def initialize(redis, sample_size) 80 | @redis = redis 81 | @keys = Hash.new {|h,k| h[k] = KeyStats.new} 82 | @sample_size = sample_size 83 | @dbsize = 0 84 | end 85 | 86 | def audit_keys 87 | @dbsize = @redis.dbsize.to_i 88 | 89 | if @sample_size == 0 || @sample_size.nil? 90 | @sample_size = (0.1 * @dbsize).to_i 91 | end 92 | 93 | if @sample_size < @dbsize 94 | puts "Sampling #{@sample_size} keys..." 95 | sample_progress = @sample_size/10 96 | 97 | @sample_size.times do |index| 98 | key = @redis.randomkey 99 | audit_key(key) 100 | if sample_progress > 0 && (index + 1) % sample_progress == 0 101 | puts "#{index + 1} keys sampled - #{(((index + 1)/@sample_size.to_f) * 100).round}% complete - #{Time.now}" 102 | end 103 | end 104 | else 105 | sample_progress = @dbsize/10 106 | 107 | puts "Getting a list of all #{@dbsize} keys..." 108 | keys = @redis.keys("*") 109 | puts "Auditing #{@dbsize} keys..." 
110 | keys.each_with_index do |key, index| 111 | audit_key(key) 112 | if sample_progress > 0 && (index + 1) % sample_progress == 0 113 | puts "#{index + 1} keys sampled - #{(((index + 1)/@dbsize.to_f) * 100).round}% complete - #{Time.now}" 114 | end 115 | end 116 | end 117 | end 118 | 119 | def audit_key(key) 120 | pipeline = @redis.pipelined do 121 | @redis.debug("object", key) 122 | @redis.type(key) 123 | @redis.ttl(key) 124 | end 125 | debug_fields = @@debug_regex.match(pipeline[0]) 126 | serialized_length = debug_fields[1].to_i 127 | idle_time = debug_fields[2].to_i 128 | type = pipeline[1] 129 | ttl = pipeline[2] == -1 ? nil : pipeline[2] 130 | @keys[group_key(key, type)].add_stats_for_key(key, type, idle_time, serialized_length, ttl) 131 | rescue Redis::CommandError 132 | $stderr.puts "Skipping key #{key}" 133 | end 134 | 135 | # This function defines what keys are grouped together. Currently it looks for a key that 136 | # matches at least a third of the key from the start, and groups those together. It also 137 | # removes any numbers as they are (generally) ids. 
138 | def group_key(key, type) 139 | @@key_group_regex_list.each_with_index do |regex, index| 140 | return "#{regex.to_s}:#{type}" if regex.match(key) 141 | end 142 | 143 | # This makes the odds of finding a correct match higher, as mostly these are ids 144 | key = key.delete("0-9") 145 | 146 | matching_key = nil 147 | length_of_best_match = 0 148 | threshold = key.length / 3 149 | matching_portion = nil 150 | key_codepoints = key.codepoints.to_a 151 | 152 | @keys.keys.each do |current_key| 153 | next if matching_key && !current_key.start_with?(matching_portion) # we know it wont be longer 154 | length_of_match = 0 155 | 156 | current_key.each_codepoint.with_index do |codepoint, index| 157 | next if index < length_of_best_match 158 | break unless key_codepoints[index] == codepoint 159 | length_of_match += 1 160 | end 161 | 162 | # Minimum length of match is 1/3 of the new key length 163 | if length_of_match >= threshold && length_of_match > length_of_best_match && @@key_regex.match(current_key)[2] == type 164 | matching_key = current_key 165 | length_of_best_match = length_of_match 166 | matching_portion = matching_key[0...length_of_match] 167 | end 168 | end 169 | if matching_key != nil 170 | return matching_key 171 | else 172 | return "#{key}:#{type}" 173 | end 174 | end 175 | 176 | def output_duration(seconds) 177 | m, s = seconds.divmod(60) 178 | h, m = m.divmod(60) 179 | d, h = h.divmod(24) 180 | 181 | output = [] 182 | output << "#{d} days" if d != 0 183 | output << "#{h} hours" if h != 0 184 | output << "#{m} minutes" if m != 0 185 | output << "#{s} seconds" if s != 0 186 | return "0 seconds" if output.count == 0 187 | return output.join(", ") 188 | end 189 | 190 | def output_bytes(bytes) 191 | kb, b = bytes.divmod(1024) 192 | mb, kb = kb.divmod(1024) 193 | gb, mb = mb.divmod(1024) 194 | 195 | if gb != 0 196 | result = ((gb + mb/1024.0)*100).round()/100.0 197 | return "#{result} GB" 198 | elsif mb != 0 199 | result = ((mb + kb/1024.0)*100).round()/100.0 200 
| return "#{result} MB" 201 | elsif kb != 0 202 | result = ((kb + b/1024.0)*100).round()/100.0 203 | return "#{result} kB" 204 | else 205 | return "#{b} bytes" 206 | end 207 | end 208 | 209 | def output_stats 210 | complete_serialized_length = @keys.map {|key, value| value.total_serialized_length }.reduce(:+) 211 | sorted_keys = @keys.keys.sort{|a,b| @keys[a].total_serialized_length <=> @keys[b].total_serialized_length} 212 | 213 | if complete_serialized_length == 0 || complete_serialized_length.nil? 214 | complete_serialized_length = 0 215 | end 216 | 217 | puts "DB has #{@dbsize} keys" 218 | puts "Sampled #{output_bytes(complete_serialized_length)} of Redis memory" 219 | puts 220 | puts "Found #{@keys.count} key groups" 221 | puts 222 | sorted_keys.each do |key| 223 | value = @keys[key] 224 | key_fields = @@key_regex.match(key) 225 | common_key = key_fields[1] 226 | common_type = key_fields[2] 227 | 228 | puts "==============================================================================" 229 | puts "Found #{value.total_instances} keys containing #{common_type}s, like:" 230 | puts "\e[0;33m#{value.sample_keys.keys.join(", ")}\e[0m" 231 | puts 232 | puts "These keys use \e[0;1;4m#{make_proportion_percentage(value.total_serialized_length/complete_serialized_length.to_f)}\e[0m of the total sampled memory (#{output_bytes(value.total_serialized_length)})" 233 | if value.total_expirys_set == 0 234 | puts "\e[0;1;4mNone\e[0m of these keys expire" 235 | else 236 | puts "\e[0;1;4m#{make_proportion_percentage(value.total_expirys_set/value.total_instances.to_f)}\e[0m of these keys expire (#{value.total_expirys_set}), with maximum ttl of #{output_duration(value.max_ttl)}" 237 | end 238 | 239 | puts "Average last accessed time: \e[0;1;4m#{output_duration(value.total_idle_time/value.total_instances)}\e[0m - (Max: #{output_duration(value.max_idle_time)} Min:#{output_duration(value.min_idle_time)})" 240 | puts 241 | end 242 | summary_columns = [{ 243 | :title => "Key", 244 | 
:width => 50 245 | },{ 246 | :title => "Memory Usage", 247 | :width => 12 248 | },{ 249 | :title => "Expiry Proportion", 250 | :width => 17 251 | },{ 252 | :title => "Last Access Time", 253 | :width => 50 254 | }] 255 | format = summary_columns.map{|c| "%-#{c[:width]}s" }.join(' | ') 256 | 257 | puts "==============================================================================" 258 | puts "Summary" 259 | puts 260 | puts format.tr(' |', '-+') % summary_columns.map{|c| '-'*c[:width] } 261 | puts format % summary_columns.map{|c| c[:title]} 262 | puts format.tr(' |', '-+') % summary_columns.map{|c| '-'*c[:width] } 263 | sorted_keys.reverse.each do |key| 264 | value = @keys[key] 265 | puts format % [value.sample_keys.keys[0][0...50], make_proportion_percentage(value.total_serialized_length/complete_serialized_length.to_f), make_proportion_percentage(value.total_expirys_set/value.total_instances.to_f), output_duration(value.min_idle_time)[0...50]] 266 | end 267 | puts format.tr(' |', '-+') % summary_columns.map{|c| '-'*c[:width] } 268 | end 269 | 270 | def make_proportion_percentage(value) 271 | return "#{(value * 10000).round/100.0}%" 272 | end 273 | end 274 | 275 | # take in our command line options and parse 276 | options = {} 277 | OptionParser.new do |opts| 278 | opts.banner = "Usage: redis-audit.rb [options]" 279 | 280 | opts.on("-u", "--url URL", "Connection Url") do |url| 281 | options[:url] = url 282 | end 283 | 284 | opts.on("-h", "--host HOST", "Redis Host") do |host| 285 | options[:host] = host 286 | end 287 | 288 | opts.on("-p", "--port PORT", "Redis Port") do |port| 289 | options[:port] = port 290 | end 291 | 292 | opts.on("-a", "--password PASSWORD", "Redis Password") do |password| 293 | options[:password] = password 294 | end 295 | 296 | opts.on("-d", "--dbnum DBNUM", "Redis DB Number") do |dbnum| 297 | options[:dbnum] = dbnum 298 | end 299 | 300 | opts.on("-s", "--sample NUM", "Sample Size") do |sample_size| 301 | options[:sample_size] = 
sample_size.to_i 302 | end 303 | 304 | opts.on('--help', 'Displays Help') do 305 | puts opts 306 | exit 307 | end 308 | end.parse! 309 | 310 | # allows non-parameterized/backwards-compatible command line 311 | if options[:host].nil? && options[:url].nil? 312 | if ARGV.length < 4 || ARGV.length > 5 313 | puts "Run redis-audit.rb --help for information on how to use this tool." 314 | exit 1 315 | else 316 | options[:host] = ARGV[0] 317 | options[:port] = ARGV[1].to_i 318 | options[:password] = ARGV[2] unless ARGV[2].empty? # keep the password as a string; an empty value means no auth 319 | options[:dbnum] = ARGV[3].to_i 320 | options[:sample_size] = ARGV[4].to_i 321 | end 322 | end 323 | 324 | # create our connection to the redis db 325 | if !options[:url].nil? 326 | redis = Redis.new(:url => options[:url]) 327 | else 328 | # with url empty, assume that --host has been set, but since we don't enforce 329 | # port or dbnum to be set, allow sane defaults 330 | # set default port if no port is set 331 | if options[:port].nil? 332 | options[:port] = 6379 333 | end 334 | # set default dbnum if no dbnum is set 335 | if options[:dbnum].nil? 336 | options[:dbnum] = 0 337 | end 338 | # don't pass the password argument unless it is set 339 | if options[:password].nil? 340 | redis = Redis.new(:host => options[:host], :port => options[:port], :db => options[:dbnum]) 341 | else 342 | redis = Redis.new(:host => options[:host], :port => options[:port], :password => options[:password], :db => options[:dbnum]) 343 | end 344 | end 345 | 346 | # set sample_size to a default if not passed in 347 | if options[:sample_size].nil? 348 | options[:sample_size] = 0 349 | end 350 | 351 | # audit our data 352 | auditor = RedisAudit.new(redis, options[:sample_size]) 353 | if !options[:url].nil? 
354 | puts "Auditing #{options[:url]} sampling #{options[:sample_size]} keys" 355 | else 356 | puts "Auditing #{options[:host]}:#{options[:port]} dbnum:#{options[:dbnum]} sampling #{options[:sample_size]} keys" 357 | end 358 | auditor.audit_keys 359 | auditor.output_stats 360 | --------------------------------------------------------------------------------
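
The default grouping done by `group_key` in `redis-audit.rb` can be hard to follow in the full script, so here is a simplified, standalone sketch of the same idea: strip digits from the key, then join an existing group only if the shared prefix covers at least a third of the stripped key. The method names `common_prefix_length` and `group_for` are hypothetical, and this version deliberately ignores the value type and the regex override list that the real method handles.

```ruby
# Number of leading characters that two strings share.
def common_prefix_length(a, b)
  length = 0
  a.each_char.with_index do |char, index|
    break unless b[index] == char
    length += 1
  end
  length
end

# Simplified grouping: returns the name of the group the key falls into,
# adding a new group to `groups` when no existing one is close enough.
def group_for(key, groups)
  stripped = key.delete("0-9")    # digits are usually ids, so drop them
  threshold = stripped.length / 3 # the match must cover a third of the key

  best = groups.max_by { |group| common_prefix_length(group, stripped) }
  best_length = best ? common_prefix_length(best, stripped) : 0

  if best_length > 0 && best_length >= threshold
    best
  else
    groups << stripped
    stripped
  end
end

groups = []
%w[user_profile_3897016 user_profile_3339430 notification_3109439].each do |key|
  puts "#{key} -> #{group_for(key, groups)}"
end
```

Comparing against every known group on each key is what makes the unconfigured path slow on databases with many key shapes, which is why the README suggests filling in `@@key_group_regex_list`.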