├── README.md
└── kafka-reassign-tool
/README.md:
--------------------------------------------------------------------------------
1 | # kafka-reassign-tool
2 | A helper script for Kafka that makes it easier to change replicas for existing topics
3 |
4 | `kafka-reassign-tool` builds on Kafka's standard `kafka-reassign-partitions.sh` script and generates input data for it.
5 | Its main purpose is to make changing the replication factor of existing topics easier:
6 | it simply generates a partition reassignment JSON file that can be fed into `kafka-reassign-partitions.sh`
7 | for execution.
8 |
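The file it produces follows the standard reassignment JSON format consumed by `kafka-reassign-partitions.sh`; a minimal Ruby sketch of its shape (the topic name, partitions and broker ids below are purely illustrative):

```ruby
require 'json'

# Illustrative values only - the tool derives real ones from the cluster.
assignments = {
  version: 1,
  partitions: [
    { topic: 'mytopic', partition: 0, replicas: [1001, 1002, 1003] },
    { topic: 'mytopic', partition: 1, replicas: [1002, 1003, 1004] }
  ]
}

# This is the payload that ends up in new-assignments.json.
puts JSON.pretty_generate(assignments)
```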
9 | What it can do:
10 | * increase the replication factor of existing topics
11 | * decrease the replication factor of existing topics
12 | * remove all replicas from a particular broker so it can be decommissioned
13 | * balance leaders
14 |
15 | In all these cases `kafka-reassign-tool` tries to minimise the number of changes. This sets it apart
16 | from the standard Kafka tools, which may produce a better overall distribution but can also cause a lot of data movement
17 | by reallocating each partition to a completely different set of brokers (fine for a small installation, but
18 | an issue when you have lots of partitions and brokers).
19 |
20 | Note: I wrote this tool before discovering, for example, LinkedIn's [kafka-tools](https://github.com/linkedin/kafka-tools). I still use my tool, but you may have a better shot with these more mature ones.
21 |
22 | ## Configuration
23 | `kafka-reassign-tool` needs to know where the standard Kafka scripts are located as well as the Zookeeper URL.
24 | If your Kafka is installed in `/opt/kafka` you do not need to supply any additional command line options. Otherwise you
25 | need to provide the `--kafka-home` command line option.
26 | Either way, the script will attempt to read the Zookeeper URL from `config/server.properties` under the Kafka home directory.
27 | If that does not work for any reason, you may need to add the `--zookeeper` option.
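The lookup described above amounts to scanning `server.properties` for the `zookeeper.connect` key; a minimal Ruby sketch (the default path matches the tool's, the helper name is illustrative):

```ruby
# Scan Kafka's server.properties for the zookeeper.connect setting.
# Returns the URL string, or nil if the file or key is missing.
def read_zk_url(config_file = '/opt/kafka/config/server.properties')
  return nil unless File.file?(config_file)
  File.foreach(config_file) do |line|
    return $1.strip if line =~ /^zookeeper\.connect=(.*)/
  end
  nil
end
```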
28 |
29 | ## Changing replication factor
30 | Running the tool with, for example, `--replication-factor 3` changes the partition assignment map so that each partition ends up with 3 replicas.
31 | `kafka-reassign-tool` goes over all partitions one by one and:
32 | * if a partition has fewer replicas than needed, adds replicas on the least used brokers (those holding fewer replicas of the topic than others)
33 | * if a partition has more replicas than needed, removes the replicas on the most used brokers (those holding more replicas of the topic than others)
34 | * if a partition is already at the target replication factor, may still re-order its replicas to even out leader distribution among brokers
35 |
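The broker selection in the first two cases boils down to per-topic replica counts; a simplified Ruby sketch of the grow/shrink logic (counts and broker ids are illustrative, and the real script also rebalances leaders):

```ruby
# Illustrative state: broker id => how many replicas of this topic it holds.
replica_counts = { 1001 => 3, 1002 => 2, 1003 => 1, 1004 => 3 }
replicas = [1001, 1002]   # current replica list of one partition
target = 3                # desired replication factor

# Grow: add the least used broker that is not already a replica.
while replicas.size < target
  broker = (replica_counts.keys - replicas).min_by { |b| replica_counts[b] }
  replicas << broker
  replica_counts[broker] += 1
end

# Shrink: drop the most used broker among the current replicas.
while replicas.size > target
  broker = replicas.max_by { |b| replica_counts[b] }
  replicas.delete(broker)
  replica_counts[broker] -= 1
end
```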
36 | This is example output when the tool is asked to increase the replication factor of a topic from 2 to 3:
37 | ```
38 | $ kafka-reassign-tool --topic mytopic --replication-factor 3
39 |
40 | Reading /opt/kafka/config/server.properties
41 | Using zookeeper URL: localhost:2181/kafka
42 | Reading list of brokers...
43 | Reading list of topics...
44 | ------------------------
45 | Brokers:
46 | 1001
47 | 1002
48 | 1003
49 | 1004
50 | Topics:
51 | mytopic
52 | ------------------------
53 | Getting current assignments...
54 | Building new assignments...
55 | mytopic-0 : [1001, 1002] => [1001, 1002, 1003]
56 | mytopic-2 : [1002, 1004] => [1002, 1004, 1001]
57 | mytopic-1 : [1002, 1004] => [1002, 1003, 1004]
58 | mytopic-3 : [1003, 1001] => [1003, 1001, 1004]
59 | Saving new assignments into new-assignments.json...
60 | Done
61 | ```
62 | And similar output when it is asked to decrease the replication factor of a topic from 2 to 1:
63 | ```
64 | $ kafka-reassign-tool --topic mytopic --replication-factor 1
65 |
66 | Reading /opt/kafka/config/server.properties
67 | Using zookeeper URL: localhost:2181/kafka
68 | Reading list of brokers...
69 | Reading list of topics...
70 | ------------------------
71 | Brokers:
72 | 1001
73 | 1002
74 | 1003
75 | 1004
76 | Topics:
77 | mytopic
78 | ------------------------
79 | Getting current assignments...
80 | Building new assignments...
81 | mytopic-0 : [1001, 1002] => [1001]
82 | mytopic-2 : [1002, 1004] => [1002]
83 | mytopic-1 : [1002, 1004] => [1004]
84 | mytopic-3 : [1003, 1001] => [1003]
85 | Saving new assignments into new-assignments.json...
86 | Done
87 | ```
88 |
89 | Of course, it always makes sense to eyeball what changes `kafka-reassign-tool` plans before actually executing them.
90 |
91 | ### Bulk change
92 | You can supply `--topic` option multiple times:
93 | ```
94 | $ kafka-reassign-tool --topic mytopic1 --topic mytopic2 --replication-factor 3
95 | ```
96 | or you can omit it completely, in which case the same replication factor will be applied to all topics in your cluster:
97 | ```
98 | $ kafka-reassign-tool --replication-factor 3
99 | ```
100 | This, of course, only makes sense when all your topics are meant to share the same replication factor.
101 |
102 | ## Decommissioning a broker
103 | The `--brokers` command line option allows you to specify which brokers can be used for assignment.
104 | `kafka-reassign-tool` removes all assignments from brokers that are not in that list (replacing them with others to maintain the replication factor).
105 | This can be used to decommission a broker by removing all replicas from it.
106 |
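Conceptually, for each partition the tool first drops replicas that live on disallowed brokers and then backfills from the allowed set up to the target replication factor; a simplified Ruby sketch (broker ids are illustrative; the real script picks the least-used broker when backfilling):

```ruby
allowed  = [1001, 1002, 1003]   # brokers that may hold replicas; 1004 is going away
target   = 2                    # desired replication factor
replicas = [1002, 1004]         # one partition's current replica list

# Drop replicas that live on disallowed brokers (Array#& keeps receiver order)...
replicas &= allowed

# ...then backfill from unused allowed brokers until the target is reached.
replicas << (allowed - replicas).first while replicas.size < target
```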
107 | For example, the command below does not list broker 1004, so any partition that used it as a replica will be moved to another broker:
108 | ```
109 | $ kafka-reassign-tool --topic mytopic --replication-factor 2 --brokers 1001,1002,1003
110 |
111 | Reading /opt/kafka/config/server.properties
112 | Using zookeeper URL: localhost:2181/kafka
113 | Reading list of brokers...
114 | Reading list of topics...
115 | ------------------------
116 | Brokers:
117 | 1001
118 | 1002
119 | 1003
120 | 1004
121 | Topics:
122 | mytopic
123 | ------------------------
124 | Getting current assignments...
125 | Building new assignments...
126 | mytopic-0 : [1001, 1002] => [1001, 1002]
127 | mytopic-2 : [1002, 1004] => [1002, 1003]
128 | mytopic-1 : [1002, 1004] => [1002, 1001]
129 | mytopic-3 : [1003, 1001] => [1003, 1001]
130 | Saving new assignments into new-assignments.json...
131 | Done
132 | ```
133 |
134 | Note that this operation can also be used in bulk by providing multiple `--topic` options or omitting it completely to select all topics.
135 | However, keep in mind that the same value from `--replication-factor` will be used for all selected topics.
136 |
137 | ## Applying the change
138 | After a successful invocation, `kafka-reassign-tool` generates a `new-assignments.json` file which can then be applied with
139 | ```
140 | kafka-reassign-partitions.sh --zookeeper <zookeeper_url> --reassignment-json-file new-assignments.json --execute --throttle 100000000
141 | ```
142 | The example above throttles replication to 100 MB/s (the throttle value is in bytes per second). You may decide to use a different limit or to omit it completely.
143 | If throttling was used, at the end you must verify the reassignment with
144 | ```
145 | kafka-reassign-partitions.sh --zookeeper <zookeeper_url> --reassignment-json-file new-assignments.json --verify
146 | ```
147 | so the replication quota is reset. For more details see https://kafka.apache.org/documentation.html#rep-throttle
148 |
--------------------------------------------------------------------------------
/kafka-reassign-tool:
--------------------------------------------------------------------------------
1 | #!/usr/bin/ruby
2 |
3 | require 'yaml'
4 | require 'json'
5 | require 'open3'
6 | require 'tempfile'
7 | require 'optparse'
8 |
9 | DEFAULT_KAFKA_ROOT='/opt/kafka'
10 | DEFAULT_KAFKA_CONFIG='/config/server.properties'
11 |
12 | $kafka_root = nil
13 | $zookeeper_url = nil
14 |
15 | def run(command, args = [], opts = {})
16 |
17 | opts = {
18 | :raise_on_exitstatus => true,
19 | :raise_on_err => true,
20 | :input => ''
21 | }.merge(opts)
22 |
23 | out = nil
24 | err = nil
25 | exit_status = nil
26 | Open3::popen3(command, *args) do |stdin, stdout, stderr, wait_thr|
27 | stdin << opts[:input]
28 | stdin.close
29 | out = stdout.read
30 | err = stderr.read
31 | exit_status = wait_thr.value
32 | end
33 |
34 | if (opts[:raise_on_exitstatus] && exit_status.exitstatus != 0) || (opts[:raise_on_err] && !err.empty?) then
35 | puts "#{command} failed. exitstatus=#{exit_status.exitstatus}, stderr:"
36 | puts err
37 | raise "#{command} failed"
38 | end
39 |
40 | {:out => out, :err => err, :exit_status => exit_status}
41 | end
42 |
43 | def get_kafka_root()
44 | return $kafka_root if $kafka_root
45 | DEFAULT_KAFKA_ROOT
46 | end
47 |
48 | def get_zk_url()
49 | return $zookeeper_url if $zookeeper_url
50 |
51 | config_file = get_kafka_root + DEFAULT_KAFKA_CONFIG
52 |
53 | if File.file? config_file then
54 | puts "Reading #{config_file}"
55 | File.open(config_file, 'r') do |file|
56 | file.each_line do |line|
57 | if line =~ /^zookeeper.connect=(.*)/ then
58 | $zookeeper_url = $1.strip
59 | end
60 | end
61 | end
62 | else
63 | puts "Config file #{config_file} does not exist"
64 | end
65 |
66 | raise "No zookeeper URL given" unless $zookeeper_url
67 |
68 | $zookeeper_url
69 | end
70 |
71 | def get_brokers()
72 | result = run(get_kafka_root + '/bin/zookeeper-shell.sh', [get_zk_url], {:input => 'ls /brokers/ids'})
73 |
74 | # Look for a line
75 | # [1003, 1001, 1002]
76 | #
77 | brokers = nil
78 | result[:out].lines.each {|line|
79 | if line =~ /^ \[ ([^\]]+) \] /x then
80 | brokers = $1.split(',').collect(&:strip).collect(&:to_i)
81 | end
82 | }
83 |
84 | brokers
85 | end
86 |
87 | def get_topics()
88 | result = run(get_kafka_root + '/bin/kafka-topics.sh', [
89 | '--zookeeper', get_zk_url,
90 | '--list'
91 | ])
92 | result[:out].lines.collect(&:strip)
93 | end
94 |
95 | def topics_json(topics)
96 | # {
97 | # "topics": [
98 | # {"topic": "topicname1"},
99 | # {"topic": "topicname2"}
100 | # ],
101 | # "version":1
102 | # }
103 |
104 | data = {
105 | :version => 1,
106 | :topics => topics.collect{|t| { :topic => t }}
107 | }
108 |
109 | data.to_json
110 | end
111 |
112 | def get_current_assignments(topics, brokers)
113 |
114 | topics_file = Tempfile.new(File.basename(__FILE__) + '-topics')
115 | topics_file << topics_json(topics)
116 | topics_file.close
117 |
118 | result = run(get_kafka_root + '/bin/kafka-reassign-partitions.sh', [
119 | '--zookeeper', get_zk_url,
120 | '--topics-to-move-json-file', topics_file.path,
121 | '--broker-list', brokers.join(','),
122 | '--generate'
123 | ])
124 |
125 | # Current partition replica assignment
126 | #
127 | # {"version":1,"partitions":[]}
128 | # Proposed partition reassignment configuration
129 | #
130 | # {"version":1,"partitions":[]}
131 |
132 | # Note that on newer Kafka versions all that is prepended by this line:
133 | # Warning: --zookeeper is deprecated, and will be removed in a future version of Kafka.
134 |
135 | unless result[:out] =~ /Current partition replica assignment(.*)Proposed partition reassignment configuration(.*)\z/m
136 | puts "kafka-reassign-partitions.sh output:"
137 | puts result[:out]
138 | raise "Cannot parse assignment data"
139 | end
140 |
141 | JSON.parse($1)
142 | end
143 |
144 | # Returns a hash
145 | # {
146 | # 'topic_name' => {
147 | # 'broker_id' => {
148 | # :replica => 0, // for how many partitions this broker has a replica
149 | # :leader => 0, // for how many partitions this broker is the leader
150 | # },
151 | # ...
152 | # },
153 | # ...
154 | # }
155 | #
156 | def get_broker_stats(assignments)
157 | stats = {}
158 |
159 | assignments['partitions'].each {|item|
160 | topic = item['topic']
161 | replicas = item['replicas']
162 |
163 | stats[topic] ||= {}
164 | topic_stats = stats[topic]
165 |
166 | leader = nil
167 | replicas.each {|replica|
168 | leader = replica if leader.nil?
169 |
170 | topic_stats[replica] ||= {:replica => 0, :leader => 0}
171 | topic_stats[replica][:replica] += 1
172 | }
173 |
174 | topic_stats[leader][:leader] += 1 if leader
175 | }
176 |
177 | stats
178 | end
179 |
180 | def set_replication(assignments, brokers, replication_factor)
181 |
182 | stats = get_broker_stats(assignments)
183 |
184 | # Make sure we have an entry for each broker even if given topic never uses it
185 | stats.each {|topic, topic_data|
186 | brokers.each {|broker|
187 | topic_data[broker] ||= {:replica => 0, :leader => 0}
188 | }
189 | }
190 |
191 | result = assignments.clone()
192 | new_assignments = []
193 | result['partitions'] = new_assignments
194 |
195 | puts
196 | puts "Initial stats:"
197 | puts stats
198 | puts
199 |
200 | assignments['partitions'].each {|item|
201 | topic = item['topic']
202 | partition = item['partition']
203 | replicas = item['replicas']
204 |
205 | topic_stats = stats[topic]
206 |
207 | # Calculate for how many partitions a broker should be the leader.
208 | # The partition count is computed as the sum of how many partitions each broker leads;
209 | # this kinda sucks - it would be easier if get_broker_stats returned that info, but
210 | # that would require changing its result structure once again. So next time.
211 | partition_count = topic_stats.values.collect {|data| data[:leader]}.reduce(0, :+)
212 | leader_target = partition_count / brokers.size
213 |
214 | # Remember what replicas we had before so we can see later if anything has changed
215 | orig_replicas = replicas.clone()
216 |
217 | # Remove replicas that are not in the given list of allowed brokers
218 | old_leader = replicas[0]
219 | (replicas - brokers).each {|broker|
220 | replicas.delete(broker)
221 | topic_stats[broker][:replica] -= 1
222 | }
223 |
224 | while replicas.size > replication_factor do
225 | # Remove the most used (for this topic) broker
226 | broker = topic_stats.select {|broker, data| replicas.include? broker}.max_by {|broker, data| data[:replica]}[0]
227 |
228 | topic_stats[broker][:replica] -= 1
229 | replicas.delete(broker)
230 | end
231 |
232 | while replicas.size < replication_factor do
233 | # Use least used (for this topic) allowed broker as a new replica
234 | broker = topic_stats.select {|broker, data| (brokers - replicas).include? broker}.min_by {|broker, data| data[:replica]}[0]
235 |
236 | # Make the new broker an ordinary replica; the code below will promote it to leader if needed
237 | replicas.push broker
238 | topic_stats[broker][:replica] += 1
239 | end
240 |
241 | # If removing brokers that were not allowed or adding/removing replicas changed the leader, update stats too
242 | # Need to have up-to-date stats before using them to select a new leader
243 | new_leader = replicas[0]
244 | if new_leader != old_leader then
245 | topic_stats[old_leader][:leader] -= 1 unless old_leader.nil?
246 | topic_stats[new_leader][:leader] += 1 unless new_leader.nil?
247 | end
248 |
249 | old_leader = new_leader
250 |
251 | # Select the best leader for this partition (based on which broker is a leader for fewest partitions)
252 |
253 | candidate_leader_pos = nil
254 | candidate_leader_leads = leader_target
255 |
256 | replicas.each_with_index{|broker, pos|
257 | # For how many partitions this broker is currently the leader
258 | leader_for = topic_stats[broker][:leader]
259 | if leader_for < candidate_leader_leads then
260 | candidate_leader_pos = pos
261 | candidate_leader_leads = leader_for
262 | end
263 | }
264 |
265 | # Make our leader the first element in the list
266 | replicas.unshift(replicas.delete_at(candidate_leader_pos)) if candidate_leader_pos
267 |
268 | # Update leader stats again if leader has changed
269 | new_leader = replicas[0]
270 | if new_leader != old_leader then
271 | topic_stats[old_leader][:leader] -= 1 unless old_leader.nil?
272 | topic_stats[new_leader][:leader] += 1 unless new_leader.nil?
273 | end
274 |
275 | next if replicas == orig_replicas
276 |
277 | puts " #{topic}-#{partition} : #{orig_replicas} => #{replicas}"
278 |
279 | item = item.clone()
280 | item['replicas'] = replicas
281 |
282 | # If log_dirs configuration is present (it is optional) and it only
283 | # contains 'any' and not something else, replace it with a new one
284 | # as number of replicas may have changed.
285 | if item['log_dirs'] == ['any'] * orig_replicas.count
286 | item['log_dirs'] = ['any'] * replicas.count
287 | end
288 |
289 | new_assignments << item
290 | }
291 |
292 | puts
293 | puts "Final stats:"
294 | puts stats
295 | puts
296 |
297 | result
298 | end
299 |
300 | topics = []
301 | brokers = []
302 | replication_factor = nil
303 |
304 | optparse = OptionParser.new do |opts|
305 | opts.on("--kafka-home DIR",
306 | "Root directory of the Kafka installation",
307 | " (standard Kafka scripts must be under the bin/ directory there)",
308 | " Default: #{DEFAULT_KAFKA_ROOT}") { |v| $kafka_root = v }
309 | opts.on("--zookeeper URL",
310 | "The connection string for the Zookeeper connection",
311 | " If not specified, an attempt is made to read it from the Kafka config file") { |v| $zookeeper_url = v }
312 | opts.on("--topic TOPIC",
313 | "Can be specified multiple times to list one or more topics to apply the operation to.",
314 | " If the option is not used, the operation applies to all existing topics") { |v| topics << v }
315 | opts.on("--brokers LIST",
316 | "Comma-separated list of brokers that partitions will be reassigned to",
317 | " If the option is not used, all brokers of the cluster are used") { |v| brokers = v.split(',').collect(&:strip).collect(&:to_i) }
318 | opts.on("--replication-factor N",
319 | "Target replication factor. Required") { |v| replication_factor = v.to_i }
320 | end
321 |
322 | begin
323 | optparse.parse!
324 | raise OptionParser::MissingArgument.new('replication-factor') if replication_factor.nil?
325 | rescue OptionParser::InvalidOption, OptionParser::MissingArgument
326 | puts $!.to_s
327 | puts optparse
328 | exit -1
329 | end
330 |
331 |
332 | zk_url = get_zk_url()
333 | puts "Using zookeeper URL: #{zk_url}"
334 |
335 | puts "Reading list of brokers..."
336 | known_brokers = get_brokers()
337 |
338 | if brokers.empty? then
339 | brokers = known_brokers
340 | else
341 | brokers = brokers.sort.uniq
342 | invalid = brokers - known_brokers
343 | unless invalid.empty?
344 | raise "Unknown brokers: #{invalid}"
345 | end
346 | end
347 |
348 | puts "Reading list of topics..."
349 | known_topics = get_topics()
350 |
351 | if topics.empty? then
352 | topics = known_topics
353 | else
354 | topics = topics.sort.uniq
355 | invalid = topics - known_topics
356 | unless invalid.empty?
357 | raise "Unknown topics: #{invalid}"
358 | end
359 | end
360 |
361 | puts "------------------------"
362 | puts "Brokers:"
363 | brokers.each {|broker| puts " #{broker}" }
364 | puts "Topics:"
365 | topics.each {|topic| puts " #{topic}" }
366 | puts "------------------------"
367 |
368 | if brokers.size < replication_factor then
369 | raise "Cannot achieve replication factor of #{replication_factor} with #{brokers.size} broker(s)"
370 | end
371 |
372 | puts "Getting current assignments..."
373 | assignments = get_current_assignments(topics, known_brokers)
374 |
375 | puts "Building new assignments..."
376 | new_assignments = set_replication(assignments, brokers, replication_factor)
377 |
378 | if new_assignments['partitions'].empty? then
379 | puts "No changes needed"
380 | else
381 | puts "Saving new assignments into new-assignments.json..."
382 | File.open('new-assignments.json', 'w') { |file|
383 | file.write(new_assignments.to_json)
384 | }
385 | puts "Done"
386 | puts "To apply, run:"
387 | puts " #{get_kafka_root + '/bin/kafka-reassign-partitions.sh'} --zookeeper #{zk_url} --reassignment-json-file new-assignments.json --execute --throttle XXXXXXXX"
388 | puts "Then verify with:"
389 | puts " #{get_kafka_root + '/bin/kafka-reassign-partitions.sh'} --zookeeper #{zk_url} --reassignment-json-file new-assignments.json --verify"
390 | puts
391 | end
392 |
393 |
--------------------------------------------------------------------------------