├── LICENSE
├── README.md
└── consistent_hashing.h


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2016 Phaistos Networks
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | This is a C++14 implementation of [Consistent hashing](https://en.wikipedia.org/wiki/Consistent_hashing), abstracted as a `Ring` of tokens, with a `ring_segment` data structure that represents a segment of the ring. [We](http://phaistosnetworks.gr/) have been using this implementation for many years in building multiple distributed systems, including our massive scale, high performance distributed store (CloudDS).
  2 | 
  3 | Please check the comments in the single header file to learn how to use the data structures and their APIs, and how they work.
  4 | It is easy to use, and it implements various methods useful for building robust distributed services. You should also check the wiki, starting with the [transition plan page](https://github.com/phaistos-networks/ConsistentHashing/wiki/Transition-Plan).
  5 | 
  6 | ### Using it in your Project
  7 | Just include [consistent_hashing.h](https://github.com/phaistos-networks/ConsistentHashing/blob/master/consistent_hashing.h), and make sure you set the `-std=c++14` (or higher) compiler option.
  8 | 
  9 | If you need a token space larger than 2^64, and you probably do, you will need a struct or class to represent a token (because a `uint64_t` won't suffice). In this case, you need to implement a few things:
 10 | 
 11 | - You need a `TrivialCmp()` implementation for your token type. This should return < 0, 0, or > 0 depending on the comparison result of two tokens. C++20 introduced the [spaceship operator](https://en.wikipedia.org/wiki/Three-way_comparison), which would solve this more elegantly, but this C++14 code has to make do with `TrivialCmp()`.
 12 | ```cpp
 13 | template<>
 14 | inline int8_t TrivialCmp<hugetoken_t>(const hugetoken_t &a, const hugetoken_t &b)
 15 | {
 16 | 	// return comparison result
 17 | }
 18 | ```
 19 | 
 20 | - You will need to implement an appropriate `std::numeric_limits<hugetoken_t>::min()`, like so:
 21 | ```cpp
 22 | namespace std
 23 | {
 24 | 	template<>
 25 | 	struct numeric_limits<hugetoken_t>
 26 | 	{
 27 | 		static inline const hugetoken_t min()
 28 | 		{
 29 | 			//return minimum possible token (e.g 0)
 30 | 		}
 31 | 	};
 32 | }
 33 | ```
 34 | 
 35 | - Implement appropriate `std::min()` and `std::max()` methods like so:
 36 | ```cpp
 37 | namespace std
 38 | {
 39 | 	template<>
 40 | 	inline const hugetoken_t &min<hugetoken_t>(const hugetoken_t &a, const hugetoken_t &b)
 41 | 	{
 42 | 		// return whichever is lower
 43 | 	}
 44 | 
 45 | 	template<>
 46 | 	inline const hugetoken_t &max<hugetoken_t>(const hugetoken_t &a, const hugetoken_t &b)
 47 | 	{
 48 | 		// return whichever is higher 
 49 | 	}
 50 | }
 51 | ```
 52 | 
 53 | - Finally, you should implement appropriate `operator==`, `operator!=`, `operator<` and `operator>` methods for your `hugetoken_t`.
 54 | 
 55 | This is really not that much work, and chances are you are already doing that anyway to support other needs of your codebase.
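
To make the steps above concrete, here is a minimal sketch of a custom token type. The layout of `hugetoken_t` (two 64-bit halves named `hi` and `lo`) is an assumption made for illustration, and the primary `TrivialCmp` template is a stand-in declared only so that the snippet compiles on its own; when you use the real header, drop the stand-in and keep the specialization. The sketch leaves out the `std::min()` and `std::max()` step.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical 128-bit token; name and layout are assumptions for illustration.
struct hugetoken_t
{
        uint64_t hi{0}; // most significant 64 bits
        uint64_t lo{0}; // least significant 64 bits

        constexpr hugetoken_t() = default;
        constexpr hugetoken_t(const uint64_t h, const uint64_t l) : hi{h}, lo{l} {}
};

constexpr bool operator==(const hugetoken_t &a, const hugetoken_t &b) noexcept { return a.hi == b.hi && a.lo == b.lo; }
constexpr bool operator!=(const hugetoken_t &a, const hugetoken_t &b) noexcept { return !(a == b); }
constexpr bool operator<(const hugetoken_t &a, const hugetoken_t &b) noexcept { return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo); }
constexpr bool operator>(const hugetoken_t &a, const hugetoken_t &b) noexcept { return b < a; }

// Stand-in for the library's primary template (assumed signature, not copied from the header).
template <typename T>
inline int8_t TrivialCmp(const T &a, const T &b)
{
        return a < b ? -1 : (b < a ? 1 : 0);
}

// The specialization the README asks for: compare the high halves first, then the low halves.
template <>
inline int8_t TrivialCmp<hugetoken_t>(const hugetoken_t &a, const hugetoken_t &b)
{
        if (a.hi != b.hi)
                return a.hi < b.hi ? -1 : 1;
        if (a.lo != b.lo)
                return a.lo < b.lo ? -1 : 1;
        return 0;
}

namespace std
{
        template <>
        struct numeric_limits<hugetoken_t>
        {
                static constexpr bool is_specialized = true;

                // Minimum possible token: all bits zero.
                static constexpr hugetoken_t min() noexcept { return hugetoken_t{0, 0}; }
                // Maximum possible token: all bits set.
                static constexpr hugetoken_t max() noexcept { return hugetoken_t{UINT64_MAX, UINT64_MAX}; }
        };
}
```

With these in place, `TrivialCmp(hugetoken_t(1, 0), hugetoken_t(0, 5))` returns a positive value, because the high halves decide the comparison before the low halves are consulted.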
 56 | 
 57 | ### Example
 58 | Please read about [transition plans](https://github.com/phaistos-networks/ConsistentHashing/wiki/Transition-Plan) first.
 59 | 
 60 | ```cpp
 61 | #include <consistent_hashing.h>
 62 | 
 63 | int main()
 64 | {
 65 |         using token_t = uint32_t;
 66 |         using segment_t = ConsistentHashing::ring_segment<token_t>;
 67 |         using ring_t = ConsistentHashing::Ring<token_t>;
 68 |         using node_t = uint32_t;
 69 |         // Suppose we have a simple ring, and of those tokens, only
 70 |         // one is owned by the node we wish to update (node 1)
 71 |         std::pair<node_t, token_t> ringStructure[] =
 72 |             {
 73 |                 {100, 10},
 74 |                 {200, 20},
 75 |                 {300, 30},
 76 |                 {400, 40},
 77 |                 {500, 50},
 78 |                 {600, 60},
 79 |                 {1, 70}, /* this is the only token owned by node 1 */
 80 |                 {800, 80},
 81 |                 {900, 90},
 82 |                 {1000, 100},
 83 |                 {150, 110},
 84 |                 {1200, 120}};
 85 |         std::vector<node_t> ringTokensNodes, ringTokens;
 86 | 
 87 |         // We need all ring tokens, and all associated nodes for those tokens.
 88 |         // We also need to collect the tokens owned by the node
 89 |         for (auto it = std::begin(ringStructure), end = std::end(ringStructure); it != end; ++it)
 90 |         {
 91 |                 const auto &v = *it;
 92 | 
 93 |                 ringTokens.push_back(v.second);
 94 |                 ringTokensNodes.push_back(v.first);
 95 |         }
 96 | 
 97 |         // This is a simple method that returns the replicas for a given token
 98 |         // In practice, you wouldn't allocate any memory here (i.e. no std::vector<> use), you'd
 99 |         // pay a lot of attention to performance, and you'd possibly consider the physical placement of nodes
100 |         // (i.e. at least one from the local DC and the rest from other DCs)
101 |         const auto replicas_of = [](const auto &ring, const auto ringTokensNodes, const token_t token, node_t *const out) {
102 |                 static constexpr uint8_t replicationFactor{2}; // how many copies of each ring segment we need at any given time
103 |                 const auto base = ring.index_owner_of(token);  // index in the ring tokens for the token-owner of token(input)
104 |                 uint32_t i{base}, n{0};
105 | 
106 |                 // walk the ring clockwise until we have enough nodes to return
107 |                 do
108 |                 {
109 |                         const auto node = ringTokensNodes[i];
110 | 
111 |                         if (std::find(out, out + n, node) == out + n)
112 |                         {
113 |                                 // haven't collected that node yet
114 |                                 // we only care for distinct nodes
115 |                                 out[n++] = node;
116 |                                 if (n == replicationFactor)
117 |                                         break;
118 |                         }
119 | 
120 |                         i = (i + 1) % ring.size();
121 |                 } while (i != base);
122 | 
123 |                 return n;
124 |         };
125 | 
126 |         // This is our ring
127 |         const ring_t ring(ringTokens.data(), ringTokens.size());
128 |         // those are the changes
129 |         const std::unordered_map<node_t, std::vector<token_t>> topologyUpdates{
130 |             {1, {35, 95}},
131 |             {110, {}}, // will remove node from the ring
132 |             {64, {7}}};
133 |         // Create the transition plan
134 |         auto plan = ring.transition(ringTokensNodes.data(), topologyUpdates, replicas_of);
135 | 
136 | 
137 |         for (auto &it : plan)
138 |         {
139 |                 const auto segment = it.first;
140 |                 const auto &toFrom = it.second;
141 |                 const auto target = toFrom.first;
142 |                 auto sources = toFrom.second;
143 | 
144 |                 // Let's filter the sources, so that we only pull from the nodes closest to us in terms of node hops
145 |                 sources.resize(ring_t::filter_by_distance(sources.data(), sources.data() + sources.size(), [target](const auto node) {
146 |                                        return 1; // TODO: return an appropriate distance from target to node
147 |                                }) -
148 |                                sources.data());
149 |         }
150 | 
151 |      
152 |        // Please read https://github.com/phaistos-networks/ConsistentHashing/wiki/Transition-Plan
153 |         schedule_transfer(std::move(plan),
154 |                           [ ringTokensNodes, ring, topologyUpdates ]() {
155 |                                   const auto newTopology = ring.new_topology(ringTokensNodes.data(), topologyUpdates);
156 | 
157 |                                   switch_to_ring(newTopology);
158 |                           });
159 |         return 0;
160 | }
161 | ```
162 | --
163 | 
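
The example calls `ring.index_owner_of(token)` to find the ring position that owns a token. The ownership rule behind such a lookup, where each token owns `(previous token, token]` and the range wraps around at the ends of the array, can be sketched in a few lines. `owner_index` and `in_segment` below are hypothetical helpers written for illustration with plain `uint32_t` tokens; they are not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the token that owns `key` in a ring of distinct tokens sorted ascending.
// Each token owns (previous token, token]; keys above the last token wrap to index 0.
// Hypothetical helper. Precondition: `tokens` is not empty.
inline std::size_t owner_index(const std::vector<uint32_t> &tokens, const uint32_t key)
{
        // lower_bound finds the first token that is >= key
        const auto it = std::lower_bound(tokens.begin(), tokens.end(), key);

        return it == tokens.end() ? 0 : static_cast<std::size_t>(it - tokens.begin());
}

// True iff `key` lies in (left, right]; left >= right is treated as a wrapping segment.
inline bool in_segment(const uint32_t left, const uint32_t right, const uint32_t key)
{
        return left >= right ? (key > left || key <= right)
                             : (key > left && key <= right);
}
```

For the 12-token ring used in the example (10, 20, ..., 120), key 35 is owned by token 40 at index 3, while keys 7 and 125 are both owned by token 10 at index 0, because the segment `(120, 10]` wraps around.
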
164 | With the included data structures and implemented algorithms, it should be trivial to build robust consistent-hashing based replication for your distributed systems.
165 | 
166 | Have Fun!
167 | 


--------------------------------------------------------------------------------
/consistent_hashing.h:
--------------------------------------------------------------------------------
   1 | // In Consistent Hashing, a Ring is represented as an array of tokens sorted in ascending order, and each of those
   2 | // tokens identifies a segment in the ring.
   3 | //
   4 | // The key property is that every segment owns the ring-space defined by the range:
   5 | // (prev_segment.token, segment.token]
   6 | // that is, starting at but excluding the token of the previous segment in the ring, up to and including the token of the segment.
   7 | //
   8 | // The two exceptions are for tokens that are <= the first token in the ring or > the last token in the ring (ring semantics)
   9 | // -- For the last segment in the array, its next segment is the first segment in the array
  10 | // -- For the first segment in the array, its previous segment is the last segment in the array
  11 | //
  12 | //
  13 | // You should operate on ring segments, and for a typical distributed service, each segment will be owned by a primary replica, and based on
  14 | // replication strategies and factors, more segments (usually the successors) will also get to hold the same segment's data.
  15 | #pragma once
  16 | #ifdef HAVE_SWITCH
  17 | #include "switch.h"
  18 | #include "switch_vector.h"
  19 | #include <experimental/optional>
  20 | #else
  21 | #include <algorithm>
  22 | #include <experimental/optional>
  23 | #include <limits>
  24 | #include <stdint.h>
  25 | #include <string.h>
  26 | #include <vector>
  27 | #endif
  28 | #include <unordered_map>
  29 | 
  30 | namespace ConsistentHashing
  31 | {
  32 |         // Returns lowest index where token <= tokens[index]
  33 |         // if it returns cnt, use 0 (because tokens[0] owns ( tokens[cnt - 1], tokens[0] ])
  34 |         template <typename T>
  35 |         static uint32_t search(const T *const tokens, const uint32_t cnt, const T token)
  36 |         {
  37 |                 int32_t h = cnt - 1, l{0};
  38 | 
  39 |                 while (l <= h)
  40 |                 {
  41 |                         // This protects from overflows: see http://locklessinc.com/articles/binary_search/
  42 |                         // The addition can be split up bitwise. The carry between bits can be obtained by
  43 |                         // the logical-and of the two summands. The resultant bit can be obtained by a XOR
  44 |                         //
  45 |                         // https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
  46 |                         // The problem: if (l + h) adds up to a value greater than INT32_MAX, it exceeds
  47 |                         // the range of the integer type used to store the midpoint, even if l and h
  48 |                         // are both within range. If l and h are non-negative, this can be avoided
  49 |                         // by calculating the midpoint as: (l + (h - l) / 2)
  50 |                         //
  51 |                         // We are not using unsigned integers though -- though we should look for a way
  52 |                         // to do that so that we could safely use (l + (h - l) / 2)
  53 |                         // so we can't use (l + h) >> 1 here because (l + h) may result in a negative number
  54 |                         // and shifting by >> 1 won't divide that number by two.
  55 |                         const auto m = (l & h) + ((l ^ h) >> 1);
  56 |                         const auto v = tokens[m];
  57 |                         const auto r = TrivialCmp(token, v);
  58 | 
  59 |                         if (!r)
  60 |                                 return m;
  61 |                         else if (r < 0)
  62 |                                 h = m - 1;
  63 |                         else
  64 |                                 l = m + 1;
  65 |                 }
  66 | 
  67 |                 return l;
  68 |         }
  69 | 
  70 | 	// A 128-bit token representation
  71 | 	// You should probably use 128 or more bits for the token space
  72 |         struct token128
  73 |         {
  74 |                 uint64_t ms;
  75 |                 uint64_t ls;
  76 | 
  77 |                 constexpr token128()
  78 | 			: ms{0}, ls{0}
  79 |                 {
  80 |                 }
  81 | 
  82 | 		constexpr token128(const uint64_t m, const uint64_t l)
  83 | 			: ms{m}, ls{l}
  84 |                 {
  85 |                 }
  86 | 
  87 | 		constexpr bool is_minimum() const noexcept
  88 | 		{
  89 | 			return ms == 0 && ls == 0;
  90 | 		}
  91 | 
  92 |                 constexpr operator bool() const noexcept
  93 |                 {
  94 |                         return is_valid();
  95 |                 }
  96 | 
  97 |                 constexpr bool is_valid() const noexcept
  98 |                 {
  99 |                         return ms || ls;
 100 |                 }
 101 | 
 102 |                 constexpr bool operator==(const token128 &o) const noexcept
 103 |                 {
 104 |                         return ms == o.ms && ls == o.ls;
 105 |                 }
 106 | 
 107 |                 constexpr bool operator!=(const token128 &o) const noexcept
 108 |                 {
 109 |                         return ms != o.ms || ls != o.ls;
 110 |                 }
 111 | 
 112 |                 constexpr bool operator>(const token128 &o) const noexcept
 113 |                 {
 114 |                         return ms > o.ms || (ms == o.ms && ls > o.ls);
 115 |                 }
 116 | 
 117 |                 constexpr bool operator<(const token128 &o) const noexcept
 118 |                 {
 119 |                         return ms < o.ms || (ms == o.ms && ls < o.ls);
 120 |                 }
 121 | 
 122 |                 constexpr bool operator>=(const token128 &o) const noexcept
 123 |                 {
 124 |                         return ms > o.ms || (ms == o.ms && ls >= o.ls);
 125 |                 }
 126 | 
 127 |                 constexpr bool operator<=(const token128 &o) const noexcept
 128 |                 {
 129 |                         return ms < o.ms || (ms == o.ms && ls <= o.ls);
 130 |                 }
 131 | 
 132 |                 constexpr auto &operator=(const token128 &o) noexcept
 133 |                 {
 134 |                         ms = o.ms;
 135 |                         ls = o.ls;
 136 | 
 137 |                         return *this;
 138 |                 }
 139 | 
 140 |                 constexpr void reset() noexcept
 141 |                 {
 142 |                         ms = 0;
 143 |                         ls = 0;
 144 |                 }
 145 |         };
 146 | 
 147 |         // A segment in a ring. The segment is responsible for (owns) the token range
 148 |         // (left, right]        i.e. left exclusive, right inclusive
 149 |         // where left is the token of the predecessor segment and right is the token of this segment
 150 |         // See also: https://en.wikipedia.org/wiki/Circular_segment
 151 |         template <typename token_t>
 152 |         struct ring_segment
 153 |         {
 154 |                 token_t left;
 155 |                 token_t right;
 156 | 
 157 |                 constexpr uint64_t span() const noexcept
 158 |                 {
 159 |                         if (wraps())
 160 |                         {
 161 |                                 require(left >= right);
 162 |                                 return uint64_t(std::numeric_limits<token_t>::max()) - left + right;
 163 |                         }
 164 |                         else
 165 |                         {
 166 |                                 require(right >= left);
 167 |                                 return right - left;
 168 |                         }
 169 |                 }
 170 | 
 171 |                 constexpr ring_segment()
 172 |                 {
 173 |                 }
 174 | 
 175 |                 constexpr ring_segment(const token_t l, const token_t r)
 176 |                     : left{l}, right{r}
 177 |                 {
 178 |                 }
 179 | 
 180 |                 constexpr void set(const token_t l, const token_t r)
 181 |                 {
 182 |                         left = l;
 183 |                         right = r;
 184 |                 }
 185 | 
 186 |                 // this segment's token
 187 |                 constexpr auto token() const noexcept
 188 |                 {
 189 |                         return right;
 190 |                 }
 191 | 
 192 |                 constexpr bool operator==(const ring_segment &o) const noexcept
 193 |                 {
 194 |                         return left == o.left && right == o.right;
 195 |                 }
 196 | 
 197 |                 constexpr bool operator!=(const ring_segment &o) const noexcept
 198 |                 {
 199 |                         return left != o.left || right != o.right;
 200 |                 }
 201 | 
 202 |                 constexpr bool operator<(const ring_segment &o) const noexcept
 203 |                 {
 204 |                         return left < o.left || (left == o.left && right < o.right);
 205 |                 }
 206 | 
 207 |                 constexpr bool operator>(const ring_segment &o) const noexcept
 208 |                 {
 209 |                         return left > o.left || (left == o.left && right > o.right);
 210 |                 }
 211 | 
 212 |                 constexpr int8_t cmp(const ring_segment &rhs) const noexcept
 213 |                 {
 214 |                         if (tokens_wrap_around(left, right))
 215 |                         {
 216 |                                 // there is only one segment that wraps around in the ring
 217 |                                 return -1;
 218 |                         }
 219 |                         else if (tokens_wrap_around(rhs.left, rhs.right))
 220 |                         {
 221 |                                 // there is only one segment that wraps around in the ring
 222 |                                 return 1;
 223 |                         }
 224 |                         else
 225 |                         {
 226 |                                 if (right == rhs.right)
 227 |                                         return 0;
 228 |                                 else if (right > rhs.right)
 229 |                                         return 1;
 230 |                                 else
 231 |                                         return -1;
 232 |                         }
 233 |                 }
 234 | 
 235 |                 static constexpr bool tokens_wrap_around(const token_t &l, const token_t &r) noexcept
 236 |                 {
 237 |                         // true iff extends from last to the first ring segment
 238 |                         return l >= r;
 239 |                 }
 240 | 
 241 |                 bool contains(const ring_segment &that) const noexcept
 242 |                 {
 243 |                         if (left == right)
 244 |                         {
 245 |                                 // Full ring always contains all other ranges
 246 |                                 return true;
 247 |                         }
 248 | 
 249 |                         const bool thisWraps = tokens_wrap_around(left, right);
 250 |                         const bool thatWraps = tokens_wrap_around(that.left, that.right);
 251 | 
 252 |                         if (thisWraps == thatWraps)
 253 |                                 return left <= that.left && that.right <= right;
 254 |                         else if (thisWraps)
 255 |                         {
 256 |                                 // a wrapping segment contains a non-wrapping one if both tokens of the latter lie in one of our two wrap halves
 257 |                                 return left <= that.left || that.right <= right;
 258 |                         }
 259 |                         else
 260 |                         {
 261 |                                 // non-wrapping cannot contain wrapping
 262 |                                 return false;
 263 |                         }
 264 |                 }
 265 | 
 266 |                 // masks a segment `mask` from a segment `s`, if they intersect, and return 0+ segments
 267 |                 //
 268 |                 // It is very important that we get this right, otherwise other methods that depend on it will produce crap
 269 |                 // returns a pair, where the first is true if the segment was intersected by the mask, false otherwise, and the second
 270 |                 // is the number of segments it was partitioned to (can be 0)
 271 |                 std::pair<bool, uint8_t> mask(const ring_segment mask, ring_segment *const out) const noexcept
 272 |                 {
 273 |                         if (false == intersects(mask))
 274 |                                 return {false, 0};
 275 |                         else if (mask.contains(*this))
 276 |                         {
 277 |                                 // completely masked
 278 |                                 return {true, 0};
 279 |                         }
 280 |                         else
 281 |                         {
 282 |                                 // partially masked
 283 |                                 uint8_t n{0};
 284 | 
 285 |                                 if (mask.wraps() || wraps())
 286 |                                         n = mask.difference(*this, out);
 287 |                                 else if (mask.right > left)
 288 |                                 {
 289 |                                         if (mask.left < right && mask.left > left)
 290 |                                                 out[n++] = {left, mask.left};
 291 | 
 292 |                                         if (mask.right < right)
 293 |                                                 out[n++] = {mask.right, right};
 294 |                                 }
 295 | 
 296 |                                 return {true, n};
 297 |                         }
 298 |                 }
 299 | 
 300 |                 static void mask_segments_impl(const ring_segment *it, const ring_segment *const end, const std::vector<ring_segment> &toExclude, std::vector<ring_segment> *const out)
 301 |                 {
 302 |                         ring_segment list[2];
 303 | 
 304 |                         for (auto i{it}; i != end; ++i)
 305 |                         {
 306 |                                 const auto in = *i;
 307 | 
 308 |                                 for (const auto mask : toExclude)
 309 |                                 {
 310 |                                         if (const auto res = in.mask(mask, list); res.first)
 311 |                                         {
 312 |                                                 // OK, either completely or partially masked
 313 | 
 314 |                                                 if (res.second)
 315 |                                                         mask_segments_impl(list, list + res.second, toExclude, out);
 316 | 
 317 |                                                 goto next;
 318 |                                         }
 319 |                                 }
 320 | 
 321 |                                 out->push_back(in);
 322 | 
 323 |                         next:;
 324 |                         }
 325 |                 }
 326 | 
 327 |                 static void mask_segments(const ring_segment *it, const ring_segment *const end, const std::vector<ring_segment> &toExclude, std::vector<ring_segment> *const out)
 328 |                 {
 329 |                         if (toExclude.size())
 330 |                         {
 331 |                                 mask_segments_impl(it, end, toExclude, out);
 332 |                                 // Just in case (this is cheap)
 333 |                                 sort_and_deoverlap(out);
 334 |                         }
 335 |                         else
 336 |                                 out->insert(out->end(), it, end);
 337 |                 }
 338 | 
 339 |                 static void mask_segments(const std::vector<ring_segment> &in, const std::vector<ring_segment> &toExclude, std::vector<ring_segment> *const out)
 340 |                 {
 341 |                         mask_segments(in.data(), in.data() + in.size(), toExclude, out);
 342 |                 }
 343 | 
 344 |                 static auto mask_segments(const std::vector<ring_segment> &in, const std::vector<ring_segment> &toExclude)
 345 |                 {
 346 |                         std::vector<ring_segment> out;
 347 | 
 348 |                         mask_segments(in, toExclude, &out);
 349 |                         return out;
 350 |                 }
 351 | 
 352 |                 // For a list of wrapped segments sorted by left token ascending, process the list to produce
 353 |                 // an equivalent set of ranges, without the overlapping ranges.
 354 |                 // It will also merge ranges together,
 355 |                 // i.e. [(8, 10],(8, 15],(14, 18],(17, 18]] => [ (8, 18] ]
 356 |                 //
 357 |                 // This will only work if the segments are properly sorted; see sort_and_deoverlap().
 358 |                 // This utility method deals with invalid segments as well (e.g. you can't really have more than one segment that wraps)
 359 |                 static void deoverlap(std::vector<ring_segment> *const segments)
 360 |                 {
 361 |                         auto out = segments->data();
 362 | 
 363 |                         for (auto *it = segments->data(), *const end = it + segments->size(); it != end;)
 364 |                         {
 365 |                                 auto s = *it;
 366 | 
 367 |                                 if (it->right <= it->left)
 368 |                                 {
 369 |                                         // This segment wraps
 370 |                                         // deal with e.g. [30, 4], [35, 8], [40, 2]
 371 |                                         // that'd be an invalid list of segments (there can only be one wrapping segment), but we'll deal with it anyway
 372 |                                         const auto wrappedSegmentIt = it;
 373 | 
 374 |                                         for (++it; it != end; ++it)
 375 |                                         {
 376 |                                                 if (it->right > s.right)
 377 |                                                         s.right = it->right;
 378 |                                         }
 379 | 
 380 |                                         // we may need to drop some of those segments if the wrapping segment overlaps them
 381 |                                         if (wrappedSegmentIt != (it = segments->data()) && s.right >= it->right)
 382 |                                         {
 383 |                                                 s.right = it->right;
 384 |                                                 memmove(it, it + 1, (out - it) * sizeof(ring_segment));
 385 |                                                 --out;
 386 |                                         }
 387 | 
 388 |                                         *out++ = s;
 389 |                                         break;
 390 |                                 }
 391 |                                 else
 392 |                                 {
 393 |                                         for (++it; it != end && ((*it == s) || (it->left >= s.left && s.right > it->left)); ++it)
 394 |                                                 s.right = it->right;
 395 | 
 396 |                                         if (out == segments->data() || false == out[-1].contains(s))
 397 |                                         {
 398 |                                                 // deal with (8, 30],(9, 18]
 399 |                                                 *out++ = s;
 400 |                                         }
 401 |                                 }
 402 |                         }
 403 | 
 404 |                         segments->resize(out - segments->data());
 405 | 
 406 |                         if (segments->size() == 1 && segments->back().left == segments->back().right)
 407 |                         {
 408 |                                 // spans the whole ring
 409 |                                 const auto MinTokenValue = std::numeric_limits<token_t>::min();
 410 | 
 411 |                                 segments->pop_back();
 412 |                                 segments->push_back({MinTokenValue, MinTokenValue});
 413 |                         }
 414 |                 }
 415 | 
 416 |                 // utility method; sorts segments so that deoverlap() can process them
 417 |                 static void sort_and_deoverlap(std::vector<ring_segment> *const segments)
 418 |                 {
 419 |                         std::sort(segments->begin(), segments->end(), [](const auto &a, const auto &b) { return a.left < b.left || (a.left == b.left && a.right < b.right); });
 420 |                         deoverlap(segments);
 421 |                 }
 422 | 
 423 |                 // Copy of input list, with all segments unwrapped, sorted by left bound, and with overlapping bounds merged
 424 |                 static void normalize(const ring_segment *const segments, const uint32_t segmentsCnt, std::vector<ring_segment> *const out)
 425 |                 {
 426 |                         ring_segment res[2];
 427 | 
 428 |                         for (uint32_t i{0}; i != segmentsCnt; ++i)
 429 |                         {
 430 |                                 if (const uint8_t n = segments[i].unwrap(res))
 431 |                                         out->insert(out->end(), res, res + n);
 432 |                         }
 433 | 
 434 |                         sort_and_deoverlap(out);
 435 |                 }
 436 | 
 437 |                 static auto normalize(const ring_segment *const segments, const uint32_t segmentsCnt)
 438 |                 {
 439 |                         std::vector<ring_segment> res;
 440 | 
 441 |                         normalize(segments, segmentsCnt, &res);
 442 |                         return res;
 443 |                 }
 444 | 
 445 |                 // true iff segment contains the token
 446 |                 bool contains(const token_t &token) const noexcept
 447 |                 {
 448 |                         if (wraps())
 449 |                         {
 450 |                                 // We are wrapping around. The interval is (a, b] where a >= b.
 451 |                                 // Exactly one of 3 cases holds for any given token k; we return true for the first two:
 452 |                                 // 1. a < k (contained)
 453 |                                 // 2. k <= b (contained)
 454 |                                 // 3. b < k <= a (not contained)
 455 |                                 return token > left || right >= token;
 456 |                         }
 457 |                         else
 458 |                         {
 459 |                                 // Range (a, b], a < b
 460 |                                 return token > left && right >= token;
 461 |                         }
 462 |                 }
 463 | 
 464 |                 constexpr bool wraps() const noexcept
 465 |                 {
 466 |                         return tokens_wrap_around(left, right);
 467 |                 }
 468 | 
 469 |                 inline bool intersects(const ring_segment that) const noexcept
 470 |                 {
 471 |                         ring_segment out[2];
 472 | 
 473 |                         return intersection(that, out);
 474 |                 }
 475 | 
 476 |                 static uint8_t _intersection_of_two_wrapping_segments(const ring_segment &first, const ring_segment &that, ring_segment *intersection) noexcept
 477 |                 {
 478 |                         if (that.right > first.left)
 479 |                         {
 480 |                                 intersection[0] = ring_segment(first.left, that.right);
 481 |                                 intersection[1] = ring_segment(that.left, first.right);
 482 |                                 return 2;
 483 |                         }
 484 |                         else
 485 |                         {
 486 |                                 intersection[0] = ring_segment(that.left, first.right);
 487 |                                 return 1;
 488 |                         }
 489 |                 }
 490 | 
 491 |                 static uint8_t _intersection_of_single_wrapping_segment(const ring_segment &wrapping, const ring_segment &other, ring_segment *intersection) noexcept
 492 |                 {
 493 |                         uint8_t size{0};
 494 | 
 495 |                         if (other.contains(wrapping.right))
 496 |                                 intersection[size++] = ring_segment(other.left, wrapping.right);
 497 |                         if (other.contains(wrapping.left) && wrapping.left < other.right)
 498 |                                 intersection[size++] = ring_segment(wrapping.left, other.right);
 499 | 
 500 |                         return size;
 501 |                 }
 502 | 
 503 |                 // Returns the intersection of two segments. That can be two disjoint ranges, e.g. if one is wrapping and the other is not.
 504 |                 // e.g for two nodes G and M, and a query range (D, T]; the intersection is (M, T] and (D, G]
 505 |                 // If there is no intersection, 0 is returned and @out is left untouched (@out must have room for 2 segments)
 506 |                 //
 507 |                 // (12,7)^(5,20) => [(5,7), (12, 20)]
 508 |                 // ring_segment(10, 100).intersection(50, 120) => [ ring_segment(50, 100) ]
 509 |                 // see also mask()
 510 |                 //
 511 |                 // this is the result of the logical operation: ((*this) & that)
 512 |                 uint8_t intersection(const ring_segment &that, ring_segment *out) const noexcept
 513 |                 {
 514 |                         if (that.contains(*this))
 515 |                         {
 516 |                                 *out = *this;
 517 |                                 return 1;
 518 |                         }
 519 |                         else if (contains(that))
 520 |                         {
 521 |                                 *out = that;
 522 |                                 return 1;
 523 |                         }
 524 |                         else
 525 |                         {
 526 |                                 const bool thisWraps = tokens_wrap_around(left, right);
 527 |                                 const bool thatWraps = tokens_wrap_around(that.left, that.right);
 528 | 
 529 |                                 if (!thisWraps && !thatWraps)
 530 |                                 {
 531 |                                         // Neither wraps; fast path
 532 |                                         if (!(left < that.right && that.left < right))
 533 |                                                 return 0;
 534 | 
 535 |                                         *out = ring_segment(std::max<token_t>(left, that.left), std::min<token_t>(right, that.right));
 536 |                                         return 1;
 537 |                                 }
 538 |                                 else if (thisWraps && thatWraps)
 539 |                                 {
 540 |                                         // Two wrapping ranges always intersect.
 541 |                                         // We have determined that neither this nor that contains the other, so we are left
 542 |                                         // with two possibilities (and the mirror image of each):
 543 |                                         // 1. both of that's (1, 2] endpoints lie in this's (A, B] right segment
 544 |                                         // 2. only that's start endpoint lies in this's right segment
 545 |                                         if (left < that.left)
 546 |                                                 return _intersection_of_two_wrapping_segments(*this, that, out);
 547 |                                         else
 548 |                                                 return _intersection_of_two_wrapping_segments(that, *this, out);
 549 |                                 }
 550 |                                 else if (thisWraps && !thatWraps)
 551 |                                         return _intersection_of_single_wrapping_segment(*this, that, out);
 552 |                                 else
 553 |                                         return _intersection_of_single_wrapping_segment(that, *this, out);
 554 |                         }
 555 |                 }
 556 | 
 557 |                 // Subtracts a portion of this range
 558 |                 // @contained : The range to subtract from `this`: must be totally contained by this range
 559 |                 // @out: List of ranges left after subtracting contained from `this` (@return value is size of @out)
 560 |                 //
 561 |                 // e.g ring_segment(10, 100).subdivide(ring_segment(50, 55)) => [ ring_segment(10, 50), ring_segment(55, 100) ]
 562 |                 //
 563 |                 // You may want to use mask() instead, which is more powerful and covers wrapping cases, etc
 564 |                 uint8_t subdivide(const ring_segment &contained, ring_segment *const out) const noexcept
 565 |                 {
 566 |                         if (contained.contains(*this))
 567 |                         {
 568 |                                 // contained actually contains this segment
 569 |                                 return 0;
 570 |                         }
 571 | 
 572 |                         uint8_t size{0};
 573 | 
 574 |                         if (left != contained.left)
 575 |                                 out[size++] = ring_segment(left, contained.left);
 576 |                         if (right != contained.right)
 577 |                                 out[size++] = ring_segment(contained.right, right);
 578 |                         return size;
 579 |                 }
 580 | 
 581 |                 // if this segment wraps, it will return two segments
 582 |                 // 1. (left, std::numeric_limits<token_t>::min())
 583 |                 // 2. (std::numeric_limits<token_t>::min(), right)
 584 |                 // otherwise, it will return itself
 585 |                 uint8_t unwrap(ring_segment *const out) const noexcept
 586 |                 {
 587 |                         const auto MinTokenValue = std::numeric_limits<token_t>::min();
 588 | 
 589 |                         if (false == wraps() || right == MinTokenValue)
 590 |                         {
 591 |                                 *out = *this;
 592 |                                 return 1;
 593 |                         }
 594 |                         else
 595 |                         {
 596 |                                 out[0] = ring_segment(left, MinTokenValue);
 597 |                                 out[1] = ring_segment(MinTokenValue, right);
 598 |                                 return 2;
 599 |                         }
 600 |                 }
 601 | 
 602 |                 // Compute difference between two ring segments
 603 |                 // This is very handy for computing, e.g the segments a node will need to fetch, when moving to a given token
 604 |                 // e.g segment(5, 20).difference(segment(2, 25)) => [ (2, 5), (20, 25) ]
 605 |                 // e.g segment(18, 25).difference(segment(5, 20)) => [ (5, 18) ]
 606 |                 //
 607 |                 // In other words, compute the segments (ranges) of rhs that (*this) does not cover
 608 |                 // There is an opposite operation, mask()
 609 |                 //
 610 |                 // This is the result of the logical operation: (rhs & (~(rhs & (*this))) )
 611 |                 uint8_t difference(const ring_segment &rhs, ring_segment *const result) const
 612 |                 {
 613 |                         ring_segment intersectionSet[2];
 614 | 
 615 |                         switch (intersection(rhs, intersectionSet))
 616 |                         {
 617 |                                 case 0:
 618 |                                         // not intersected
 619 |                                         *result = rhs;
 620 |                                         return 1;
 621 | 
 622 |                                 case 1:
 623 |                                         // compute missing sub-segments
 624 |                                         return rhs.subdivide(intersectionSet[0], result);
 625 | 
 626 |                                 default:
 627 |                                 {
 628 |                                         const auto first = intersectionSet[0], second = intersectionSet[1];
 629 |                                         ring_segment tmp[2];
 630 | 
 631 |                                         rhs.subdivide(first, tmp);
 632 |                                         // two intersections; subtracting only one of them will yield a single segment
 633 |                                         return tmp[0].subdivide(second, result);
 634 |                                 }
 635 |                         }
 636 |                 }
 637 | 
 638 |                 // split the segment in two at segmentToken (only if segmentToken lies strictly inside the segment; otherwise an empty optional is returned)
 639 |                 //
 640 |                 // e.g ring_segment(10, 20).split(18) => ( ring_segment(10, 18), ring_segment(18, 20) )
 641 |                 std::experimental::optional<std::pair<ring_segment, ring_segment>> split(const token_t segmentToken) const noexcept
 642 |                 {
 643 |                         if (left == segmentToken || right == segmentToken || !contains(segmentToken))
 644 |                                 return {};
 645 | 
 646 |                         return {{ring_segment(left, segmentToken), ring_segment(segmentToken, right)}};
 647 |                 }
 648 | 
 649 | #ifdef HAVE_SWITCH
 650 |                 void serialize(IOBuffer *const b) const
 651 |                 {
 652 |                         b->Serialize(left);
 653 |                         b->Serialize(right);
 654 |                 }
 655 | 
 656 |                 void deserialize(ISerializer *const b)
 657 |                 {
 658 |                         b->Unserialize<token_t>(&left);
 659 |                         b->Unserialize<token_t>(&right);
 660 |                 }
 661 | #endif
 662 | 
 663 |                 // true iff any of the segments contains the token (binary search)
 664 |                 // segments must be properly ordered and deoverlapped; see sort_and_deoverlap()
 665 |                 static bool segments_contain(const token_t token, const ring_segment *const segments, const uint32_t cnt)
 666 |                 {
 667 |                         if (!cnt)
 668 |                                 return false;
 669 | 
 670 |                         int32_t h = cnt - 1;
 671 | 
 672 |                         if (segments[h].wraps())
 673 |                         {
 674 |                                 if (segments[h--].contains(token))
 675 |                                 {
 676 |                                         // there can only be one segment that wraps, and that should be the last one (see sort_and_deoverlap() impl.)
 677 |                                         return true;
 678 |                                 }
 679 |                         }
 680 | 
 681 |                         for (int32_t l{0}; l <= h;)
 682 |                         {
 683 |                                 const auto m = (l & h) + ((l ^ h) >> 1);
 684 |                                 const auto segment = segments[m];
 685 | 
 686 |                                 if (segment.contains(token))
 687 |                                         return true;
 688 |                                 else if (token <= segment.left)
 689 |                                         h = m - 1;
 690 |                                 else
 691 |                                         l = m + 1;
 692 |                         }
 693 | 
 694 |                         return false;
 695 |                 }
 696 |         };
 697 | 
 698 |         // A Ring of tokens
 699 |         template <typename T>
 700 |         struct Ring
 701 |         {
 702 |                 using token_t = T;
 703 |                 using segment_t = ring_segment<T>;
 704 | 
 705 |                 const T *const tokens;
 706 |                 const uint32_t cnt;
 707 | 
 708 |                 constexpr Ring(const T *const v, const uint32_t n)
 709 |                     : tokens{v}, cnt{n}
 710 |                 {
 711 |                 }
 712 | 
 713 |                 constexpr Ring(const std::vector<T> &v)
 714 |                     : Ring{v.data(), static_cast<uint32_t>(v.size())}
 715 |                 {
 716 |                 }
 717 | 
 718 |                 constexpr auto size() const noexcept
 719 |                 {
 720 |                         return cnt;
 721 |                 }
 722 | 
 723 |                 uint32_t index_of(const T token) const noexcept
 724 |                 {
 725 |                         for (int32_t h = cnt - 1, l{0}; l <= h;)
 726 |                         {
 727 |                                 const auto m = (l & h) + ((l ^ h) >> 1);
 728 |                                 const auto v = tokens[m];
 729 |                                 const auto r = TrivialCmp(token, v);
 730 | 
 731 |                                 if (!r)
 732 |                                         return m;
 733 |                                 else if (r < 0)
 734 |                                         h = m - 1;
 735 |                                 else
 736 |                                         l = m + 1;
 737 |                         }
 738 | 
 739 |                         return UINT32_MAX;
 740 |                 }
 741 | 
 742 |                 inline bool is_set(const T token) const noexcept
 743 |                 {
 744 |                         return index_of(token) != UINT32_MAX;
 745 |                 }
 746 | 
 747 |                 inline uint32_t search(const T token) const noexcept
 748 |                 {
 749 |                         return ConsistentHashing::search(tokens, cnt, token);
 750 |                 }
 751 | 
 752 |                 // In a distributed system, you should map the token to a node (or the segment index returned by this method)
 753 |                 inline uint32_t index_owner_of(const T token) const noexcept
 754 |                 {
 755 |                         // modulo is not cheap, and comparisons are much cheaper, but branchless is nice
 756 |                         return search(token) % cnt;
 757 |                 }
 758 | 
 759 |                 inline auto token_owner_of(const T token) const noexcept
 760 |                 {
 761 |                         return tokens[index_owner_of(token)];
 762 |                 }
 763 | 
 764 |                 constexpr const T &token_predecessor_by_index(const uint32_t idx) const noexcept
 765 |                 {
 766 |                         return tokens[(idx + (cnt - 1)) % cnt];
 767 |                 }
 768 | 
 769 |                 constexpr const T &token_predecessor(const T token) const noexcept
 770 |                 {
 771 |                         return token_predecessor_by_index(index_of(token));
 772 |                 }
 773 | 
 774 |                 constexpr const T &token_successor_by_index(const uint32_t idx) const noexcept
 775 |                 {
 776 |                         return tokens[(idx + 1) % cnt];
 777 |                 }
 778 | 
 779 |                 constexpr const T &token_successor(const T token) const noexcept
 780 |                 {
 781 |                         return token_successor_by_index(index_of(token));
 782 |                 }
 783 | 
 784 |                 constexpr auto index_segment(const uint32_t idx) const noexcept
 785 |                 {
 786 |                         return ring_segment<T>(tokens[(idx + (cnt - 1)) % cnt], tokens[idx]);
 787 |                 }
 788 | 
 789 |                 // segment in the ring that owns this token
 790 |                 // based on the (prev segment.token, this segment.token] ownership rule
 791 |                 constexpr auto token_segment(const T token) const noexcept
 792 |                 {
 793 |                         return index_segment(index_of(token));
 794 |                 }
 795 | 
 796 |                 // see also sort_and_deoverlap()
 797 |                 void segments(std::vector<ring_segment<T>> *const res) const
 798 |                 {
 799 |                         if (cnt)
 800 |                         {
 801 |                                 res->reserve(cnt + 2);
 802 |                                 for (uint32_t i{1}; i != cnt; ++i)
 803 |                                         res->push_back({tokens[i - 1], tokens[i]});
 804 |                                 res->push_back({tokens[cnt - 1], tokens[0]});
 805 |                         }
 806 |                 }
 807 | 
 808 |                 auto segments() const
 809 |                 {
 810 |                         std::vector<ring_segment<T>> res;
 811 | 
 812 |                         segments(&res);
 813 |                         return res;
 814 |                 }
 815 | 
 816 |                 auto tokens_segments(const std::vector<token_t> &t) const
 817 |                 {
 818 |                         std::vector<segment_t> res;
 819 | 
 820 |                         res.reserve(t.size());
 821 |                         for (const auto token : t)
 822 |                         {
 823 |                                 const auto idx = index_owner_of(token);
 824 | 
 825 |                                 res.push_back({token_predecessor_by_index(idx), token});
 826 |                         }
 827 | 
 828 |                         std::sort(res.begin(), res.end(), [](const auto &a, const auto &b) { return a.left < b.left; });
 829 |                         return res;
 830 |                 }
 831 | 
 832 |                 // Assume a node is a replica for tokens in segments `current`, and it then assumes ownership of a different
 833 |                 // set of segments, `updated`.
 834 |                 //
 835 |                 // This handy utility method will generate a pair of segment lists:
 836 |                 // 1. The first holds the segments the node will need to *fetch* from other nodes in the ring, because it will now also be responsible
 837 |                 // for those segments, but it does not have the data, based on the currently owned segments.
 838 |                 // 2. The second holds the segments the node will need to *stream* to other nodes in the ring, because it will no longer hold data for them.
 839 |                 //
 840 |                 // Obviously, if a node is just introduced to a ring (i.e it has only updated and no current segments), it should
 841 |                 // just fetch data for all the updated segments. Conversely, if the node is exiting the ring, it should
 842 |                 // consider streaming all the data it has to other nodes if needed, and not fetch any data to itself.
 843 |                 //
 844 |                 // make sure that current and updated are in order  e.g std::sort(start, end, [](const auto &a, const auto &b) { return a.left < b.left; });
 845 |                 //
 846 |                 // Because the output will be an array of segments (_not_ tokens), you will need to determine the segments of the ring that intersect it
 847 |                 // in order to figure out which nodes have which parts of the segments.
 848 |                 //
 849 |                 // This is a fairly expensive method (although it should be easy to optimize it if necessary), but given how rarely it should be used, that's not a real concern
 850 |                 //
 851 |                 // Example: current segment [10, 20), updated segment [10, 25)
 852 |                 // Example: current segment [10, 20), updated segment [8, 30)
 853 |                 static auto compute_segments_ownership_updates(const std::vector<segment_t> &currentSegmentsInput, const std::vector<segment_t> &updatedSegmentsInput)
 854 |                 {
 855 |                         std::vector<segment_t> toFetch, toStream, current, updated, toFetchFinal, toStreamFinal;
 856 |                         segment_t segmentsList[2];
 857 | 
 858 |                         // We need to work on normalized lists of segments
 859 |                         current = currentSegmentsInput;
 860 |                         ring_segment<T>::sort_and_deoverlap(&current);
 861 | 
 862 |                         updated = updatedSegmentsInput;
 863 |                         ring_segment<T>::sort_and_deoverlap(&updated);
 864 | 
 865 |                         for (const auto curSegment : current)
 866 |                         {
 867 |                                 const auto n = toStream.size();
 868 | 
 869 |                                 for (const auto updatedSegment : updated)
 870 |                                 {
 871 |                                         if (curSegment.intersects(updatedSegment))
 872 |                                                 toStream.insert(toStream.end(), segmentsList, segmentsList + updatedSegment.difference(curSegment, segmentsList));
 873 |                                 }
 874 | 
 875 |                                 if (toStream.size() == n)
 876 |                                 {
 877 |                                         // no intersection; accept whole segment
 878 |                                         toStream.push_back(curSegment);
 879 |                                 }
 880 |                         }
 881 | 
 882 |                         for (const auto updatedSegment : updated)
 883 |                         {
 884 |                                 const auto n = toFetch.size();
 885 | 
 886 |                                 for (const auto curSegment : current)
 887 |                                 {
 888 |                                         if (updatedSegment.intersects(curSegment))
 889 |                                                 toFetch.insert(toFetch.end(), segmentsList, segmentsList + curSegment.difference(updatedSegment, segmentsList));
 890 |                                 }
 891 | 
 892 |                                 if (toFetch.size() == n)
 893 |                                 {
 894 |                                         // no intersection; accept whole segment
 895 |                                         toFetch.push_back(updatedSegment);
 896 |                                 }
 897 |                         }
 898 | 
 899 |                         // normalize output
 900 |                         ring_segment<T>::sort_and_deoverlap(&toFetch);
 901 |                         ring_segment<T>::sort_and_deoverlap(&toStream);
 902 | 
 903 |                         // mask segments:
 904 |                         // 	from segments to fetch, mask currently owned segments
 905 |                         //	from segments to stream, mask segments we will own (updated segments)
 906 |                         ring_segment<T>::mask_segments(toFetch, current, &toFetchFinal);
 907 |                         ring_segment<T>::mask_segments(toStream, updated, &toStreamFinal);
 908 | 
 909 |                         return std::make_pair(toFetchFinal, toStreamFinal);
 910 |                 }
 911 | 
 912 |                 // When a node acquires ring tokens (joins a cluster), it only disrupts the segments its token(s) fall into.
 913 |                 // Assuming a ring of tokens:  (10, 100, 150, 180, 200)
 914 |                 // and a node joins a cluster, and acquires token 120
 915 |                 // then it will only affect requests for (100, 120]
 916 |                 // so it will need to fetch content for (100, 120] from somewhere. Where? well, from whichever node owned (100, 150]
 917 |                 // which is just the successor node, which we can find using index_owner_of()
 918 |                 // This is a simple replication strategy implementation; we'll just walk the ring clockwise and collect nodes that own
 919 |                 // the tokens, skipping already collected nodes
 920 |                 //
 921 |                 // EXAMPLE: This is an illustrative example; you shouldn't really use this in production as is
 922 |                 template <typename L>
 923 |                 auto token_replicas_basic(const token_t token, const uint8_t replicationFactor, L &&endpoint_token) const
 924 |                 {
 925 |                         using node_t = typename std::result_of<L(uint32_t)>::type;
 926 |                         std::vector<node_t> nodes;
 927 |                         const auto base = index_owner_of(token);
 928 |                         auto idx = base;
 929 | 
 930 |                         do
 931 |                         {
 932 |                                 const auto node = endpoint_token(idx);
 933 | 
 934 |                                 if (std::find(nodes.begin(), nodes.end(), node) == nodes.end())
 935 |                                 {
 936 |                                         nodes.push_back(node);
 937 |                                         if (nodes.size() == replicationFactor)
 938 |                                                 break;
 939 |                                 }
 940 | 
 941 |                                 idx = (idx + 1) % size();
 942 |                         } while (idx != base);
 943 | 
 944 |                         return nodes;
 945 |                 }
 946 | 
 947 |                 // This generates the list of tokens and the list of matching nodes that own them, based on the new ownership state that results
 948 |                 // from applying the changes in futureNodesTokens. ringTokensNodes[i] is the node that currently owns tokens[i].
 949 |                 // Specifically, in the resulting topology, the tokens of each node in futureNodesTokens are replaced with their updated set
 950 |                 template <typename node_t>
 951 |                 std::pair<std::vector<token_t>, std::vector<node_t>> new_topology(const node_t *const ringTokensNodes,
 952 |                                                                                         const std::unordered_map<node_t, std::vector<token_t>> &futureNodesTokens) const
 953 |                 {
 954 |                         std::vector<token_t> transientRingTokens;
 955 |                         std::vector<node_t> transientRingTokensNodes;
 956 |                         std::unordered_map<token_t, node_t> map;
 957 | 
 958 |                         for (uint32_t i{0}; i != cnt; ++i)
 959 |                         {
 960 |                                 const auto token = tokens[i];
 961 | 
 962 |                                 if (futureNodesTokens.find(ringTokensNodes[i]) == futureNodesTokens.end())
 963 |                                 {
 964 |                                         transientRingTokens.push_back(token);
 965 |                                         map.insert({token, ringTokensNodes[i]});
 966 |                                 }
 967 |                         }
 968 | 
 969 |                         for (const auto &it : futureNodesTokens)
 970 |                         {
 971 |                                 const auto node = it.first;
 972 | 
 973 |                                 transientRingTokens.insert(transientRingTokens.end(), it.second.data(), it.second.data() + it.second.size());
 974 |                                 for (const auto token : it.second)
 975 |                                         map.insert({token, node});
 976 |                         }
 977 | 
 978 |                         std::sort(transientRingTokens.begin(), transientRingTokens.end());
 979 | 
 980 |                         // The associated nodes for each token in the transient ring
 981 |                         transientRingTokensNodes.reserve(transientRingTokens.size());
 982 |                         for (const auto token : transientRingTokens)
 983 |                                 transientRingTokensNodes.push_back(map[token]);
 984 | 
 985 |                         return {std::move(transientRingTokens), std::move(transientRingTokensNodes)};
 986 |                 }
 987 | 
 988 |                 template <typename node_t, typename L>
 989 |                 static node_t *filter_by_distance(node_t *const nodes, const node_t *const end, L &&l)
 990 |                 {
 991 |                         using dist_t = typename std::result_of<L(node_t)>::type;
 992 |                         dist_t min;
 993 |                         uint32_t out{0};
 994 | 
 995 |                         for (auto it = nodes; it != end; ++it)
 996 |                         {
 997 |                                 const auto d = l(*it);
 998 |                                 if (!out || d < min)
 999 |                                 {
1000 |                                         // first node, or a new minimum distance: restart the output at this node
1001 |                                         min = d;
1002 |                                         nodes[0] = *it;
1003 |                                         out = 1;
1004 |                                 }
1005 |                                 else if (d == min)
1006 |                                 {
1007 |                                         // ties with the current minimum are kept (C++14 friendly; no if-with-initializer)
1008 |                                         nodes[out++] = *it;
1009 |                                 }
1010 |                         }
1011 | 
1012 |                         return nodes + out;
1013 |                 }
1014 | 
1015 | 
1016 | 		// Builds a ring transition plan that is to be committed in order to transition to a new ring state.
1017 | 		// 
1018 |                 // Whenever one or more nodes alter the ring topology (by joining the cluster, leaving the cluster, or acquiring a different set of tokens), we need to
1019 |                 // account for that change by copying data to nodes that will serve segments they were not already serving (and thus don't have the data for that ring space),
1020 |                 // and by copying data to nodes that will now serve segments as a result of one or more nodes dropping segments they used to serve. This is necessary in order to
1021 |                 // support replication semantics.
1022 |                 //
1023 |                 // You should initiate a transition PLAN, and when it is COMMITTED, create the final ring topology using
1024 |                 // new_topology(), the same way it is used here for the transient ring, and switch to it by advertising the new tokens for all nodes in the ring.
1025 |                 //
1026 |                 // GUIDELINES
1027 |                 // - There can be only one active transition in progress. If you allow for concurrent transitions, you will almost certainly end up with
1028 |                 // invalid rings that are likely missing data. You can queue new transition plans to be executed after the current plan is complete (see the Riak notes below).
1029 |                 // - For existing nodes participating in the transition: they should not be stopped or otherwise treated in any special way.
1030 |                 // - For nodes that are to join the cluster (i.e. are not already participating in the ring), you should wait until the transition has completed successfully,
1031 |                 // 	and then initialize them with the tokens you used for them in the transition.
1032 |                 // - For nodes that are to leave the cluster (i.e. nodes in the current cluster, but not in the cluster topology after the transition), you should
1033 |                 //	wait until the transition has completed successfully, and then stop them, and make sure you won't start them again with the same tokens.
1034 |                 //
1035 |                 // With this method, the only tricky operation becomes the coordination required for switching to the new topology after the
1036 |                 // streaming required for the transition is complete. You will need to (re)start nodes using specific tokens, etc.
1037 |                 //
1038 |                 // OPTIMIZATION OPPORTUNITIES
1039 |                 // - You should use filter_by_distance(), or a similar function, to filter sources so that you always select among the nodes closest (in terms of network hops) to
1040 |                 // the ring target node for streaming, in order to minimize streaming time. Use a best-effort strategy in order to minimize data motion.
1041 |                 // - You should try to schedule the streaming operations fairly among the involved nodes. If you overload a node and underload the rest, or vice versa, the time
1042 |                 // and effort (cost) will be much higher.
1043 |                 //
1044 |                 // EXAMPLES
1045 |                 // - If you'd like to add 5 new nodes to your cluster, you can pick appropriate tokens (functionality for selecting tokens from the ring based on current distribution will be
1046 |                 // implemented later) for those new nodes, initiate a transition that involves them and their new tokens, and when you are done streaming, you should start those 5 new nodes, and each
1047 |                 // should be configured to use the tokens you selected for transition().
1048 |                 // - If you'd like to decommission 1 node, you just need to initiate a new transition that involves that node, where the list of tokens it will own is empty. Once the streaming is
1049 |                 // complete, you should stop the node.
1050 | 		//
1051 | 		//
1052 | 		// ## Riak and Cassandra
1053 | 		// Effectively, this creates a transition plan. 
1054 | 		// Once you have successfully executed the transition plan, you should *commit* the changes.
1055 | 		// According to @justinsheehy and @tsantero, Riak supports on-demand resizing by arbitrary factors, and you can
1056 | 		// issue multiple resize operations (they are queued and executed whenever the previous one commits;
1057 | 		// it's possible to cancel them).
1058 | 		// It makes a best effort to minimize movement. If a node failed during movement, HH (hinted handoff) would
1059 | 		// kick in.
1060 | 		// Like Cassandra, it would keep track of 'pending segments' and the coordinator would push writes
1061 | 		// to them if needed. See Cassandra's method for calculating "pending ranges" and how
1062 | 		// that is being used in the proxy where, for each write, the pending ranges list is consulted
1063 | 		// and if any pending segments match the token, the targets of those segments also receive the update.
1064 | 		// See also: https://billo.gitbooks.io/lfe-little-riak-book/content/ch4/5.html 
1065 |                 template <typename node_t, typename L>
1066 |                 auto transition(
1067 |                     const node_t *const ringTokensNodes,
1068 |                     const std::unordered_map<node_t, std::vector<token_t>> &futureNodesTokens,
1069 |                     L &&replicas_for) const
1070 |                 {
1071 |                         static constexpr size_t maxReplicasCnt{16};
1072 |                         const auto segments_of = [&replicas_for](const Ring &ring, const node_t *const ringTokensNodes, const node_t node, std::vector<segment_t> *const res) {
1073 |                                 node_t replicas[maxReplicasCnt];
1074 | 
1075 |                                 for (uint32_t i{0}; i != ring.cnt; ++i)
1076 |                                 {
1077 |                                         const auto token = ring.tokens[i];
1078 |                                         const auto replicasCnt = replicas_for(ring, ringTokensNodes, token, replicas);
1079 | 
1080 |                                         if (std::find(replicas, replicas + replicasCnt, node) != replicas + replicasCnt)
1081 | 					{
1082 | 						// We need the distinct replicas
1083 |                                                 res->push_back({ring.token_predecessor_by_index(i), token});
1084 | 					}
1085 |                                 }
1086 | 
1087 |                                 std::sort(res->begin(), res->end(), [](const auto &a, const auto &b) { return a.left < b.left; });
1088 |                         };
1089 | 
1090 |                         const auto transientRingTopology = new_topology(ringTokensNodes, futureNodesTokens);
1091 |                         const auto &transientRingTokens = transientRingTopology.first;
1092 |                         const auto &transientRingTokensNodes = transientRingTopology.second;
1093 |                         const Ring transientRing(transientRingTokens.data(), transientRingTokens.size());
1094 |                         const auto transientRingSegments = transientRing.segments();
1095 |                         const auto currentRingSegments = segments();
1096 |                         std::vector<segment_t> outSegments;
1097 |                         segment_t segmentsList[2];
1098 | 			// The plan to execute to transition to the new ring
1099 | 			// A list of:
1100 | 			// (ring segment, target node, replicas)
1101 |                         std::vector<std::pair<segment_t, std::pair<node_t, std::vector<node_t>>>> plan;
1102 |                         std::unordered_map<node_t, std::vector<segment_t>> curRingServeMap;
1103 |                         std::vector<node_t> replicas;
1104 |                         node_t tokenReplicas[maxReplicasCnt], futureReplicas[maxReplicasCnt];
1105 | 			std::vector<segment_t> replicaForSegmentsFuture, replicaForSegmentsNow;
1106 | 
1107 |                         // Build (node => [segments]) map for the current ring
1108 |                         {
1109 |                                 std::vector<std::pair<node_t, segment_t>> v;
1110 | 
1111 |                                 for (const auto segment : currentRingSegments)
1112 |                                 {
1113 |                                         const auto n = replicas_for(*this, ringTokensNodes, segment.right, tokenReplicas);
1114 | 
1115 |                                         for (uint8_t i{0}; i != n; ++i)
1116 |                                                 v.push_back({tokenReplicas[i], segment});
1117 |                                 }
1118 | 
1119 |                                 std::sort(v.begin(), v.end(), [](const auto &a, const auto &b) { return a.first < b.first; });
1120 | 
1121 |                                 const auto n = v.size();
1122 |                                 const auto all = v.data();
1123 | 
1124 |                                 for (uint32_t i{0}; i != n;)
1125 |                                 {
1126 |                                         const auto node = v[i].first;
1127 |                                         std::vector<segment_t> list;
1128 | 
1129 |                                         do
1130 |                                         {
1131 |                                                 list.push_back(v[i].second);
1132 |                                         } while (++i != n && v[i].first == node);
1133 | 
1134 |                                         curRingServeMap.insert({node, std::move(list)});
1135 |                                 }
1136 |                         }
1137 | 
1138 |                         for (const auto &it : futureNodesTokens)
1139 |                         {
1140 |                                 const auto node = it.first;
1141 | 
1142 | 				replicaForSegmentsFuture.clear();
1143 | 				replicaForSegmentsNow.clear();
1144 | 
1145 |                                 segments_of(transientRing, transientRingTokensNodes.data(), node, &replicaForSegmentsFuture);
1146 |                                 segments_of(*this, ringTokensNodes, node, &replicaForSegmentsNow);
1147 | 
1148 | 
1149 |                                 // Compute what needs to be delivered to _this_ node
1150 |                                 for (const auto futureSegment : replicaForSegmentsFuture)
1151 |                                 {
1152 |                                         //  Mask segments this node serves already, no need to acquire any content we already have
1153 | 					const auto futureSegmentWraps = futureSegment.wraps();
1154 | 
1155 |                                         outSegments.clear();
1156 |                                         segment_t::mask_segments(&futureSegment, (&futureSegment) + 1, replicaForSegmentsNow, &outSegments);
1157 | 
1158 |                                         if (outSegments.empty())
1159 |                                         {
1160 |                                                 // No need to acquire extra data
1161 |                                                 continue;
1162 |                                         }
1163 | 
1164 | 					// TODO: use binary search to locate the next segment in currentRingSegments
1165 | 					// No need for the linear scan: https://github.com/phaistos-networks/ConsistentHashing/issues/1
1166 | 
1167 | 
1168 |                                         // OK, so who's going to provide content for those segments, based on the current ring?
1169 |                                         for (const auto it : currentRingSegments)
1170 |                                         {
1171 | 						if (it.right <= futureSegment.left)
1172 | 						{
1173 | 							// can safely skip it
1174 | 							continue;
1175 | 						}
1176 | 						else if (it.left > futureSegment.right && !futureSegmentWraps && !it.wraps())
1177 | 						{
1178 | 							// can safely stop here
1179 | 							break;
1180 | 						}
1181 | 
1182 |                                                 const auto cnt = it.intersection(futureSegment, segmentsList);
1183 | 
1184 |                                                 if (!cnt)
1185 |                                                         continue;
1186 | 
1187 |                                                 const std::vector<node_t> replicas(tokenReplicas, tokenReplicas + replicas_for(*this, ringTokensNodes, it.right, tokenReplicas));
1188 | 
1189 |                                                 for (uint8_t i{0}; i != cnt; ++i)
1190 |                                                         plan.push_back({segmentsList[i], {node, replicas}}); // from replicas, to node, that segment
1191 |                                         }
1192 |                                 }
1193 | 
1194 |                                 // Whenever a node gives up (part of) a ring segment, we need to shift data around in order
1195 |                                 // to account for the fact that the replication factor for the data spanning that segment will drop by 1.
1196 |                                 for (const auto currentSegment : replicaForSegmentsNow)
1197 |                                 {
1198 |                                         const auto token = currentSegment.right;
1199 | 					const auto currentSegmentWraps = currentSegment.wraps();
1200 |                                         bool haveSources{false};
1201 | 
1202 | 					// TODO: use binary search to locate the next segment in transientRingSegments
1203 | 					// No need for linear scan: https://github.com/phaistos-networks/ConsistentHashing/issues/1
1204 | 
1205 |                                         for (const auto futureSegment : transientRingSegments)
1206 |                                         {
1207 | 						if (futureSegment.right <= currentSegment.left)
1208 | 						{
1209 | 							// can safely skip it
1210 | 							continue;
1211 | 						}
1212 | 						else if (futureSegment.left > currentSegment.right && !currentSegmentWraps && !futureSegment.wraps())
1213 | 						{
1214 | 							// can safely stop here
1215 | 							break;
1216 | 						}
1217 | 
1218 |                                                 const auto cnt = futureSegment.intersection(currentSegment, segmentsList);
1219 | 
1220 |                                                 if (!cnt)
1221 |                                                         continue;
1222 | 
1223 |                                                 const auto futureReplicasCnt = std::remove_if(futureReplicas,
1224 |                                                                                               futureReplicas + replicas_for(transientRing, transientRingTokensNodes.data(), futureSegment.right, futureReplicas),
1225 |                                                                                               [node, &futureNodesTokens](const node_t target) {
1226 |                                                                                                       if (target == node)
1227 |                                                                                                       {
1228 |                                                                                                               // exclude self
1229 |                                                                                                               return true;
1230 |                                                                                                       }
1231 |                                                                                                       else if (futureNodesTokens.find(target) != futureNodesTokens.end())
1232 |                                                                                                       {
1233 |                                                                                                               // exclude nodes that are also involved in this process, otherwise we may output the same value twice in plan
1234 |                                                                                                               return true;
1235 |                                                                                                       }
1236 |                                                                                                       else
1237 |                                                                                                       {
1238 |                                                                                                               return false;
1239 |                                                                                                       }
1240 | 
1241 |                                                                                               }) -
1242 |                                                                                futureReplicas;
1243 | 
1244 |                                                 for (uint8_t i{0}; i != cnt; ++i)
1245 |                                                 {
1246 |                                                         const auto subSegment = segmentsList[i]; // intersection
1247 | 
1248 | 							for (uint32_t ri{0}; ri != futureReplicasCnt; ++ri)
1249 |                                                         {
1250 |                                                                 const auto target = futureReplicas[ri];
1251 | 
1252 |                                                                 if (!haveSources)
1253 |                                                                 {
1254 |                                                                         // lazy generation of the sources(replicas) for this segment
1255 |                                                                         // replicas should include this node
1256 |                                                                         replicas.clear();
1257 |                                                                         replicas.insert(replicas.end(), tokenReplicas, tokenReplicas + replicas_for(*this, ringTokensNodes, token, tokenReplicas));
1258 |                                                                         haveSources = true;
1259 |                                                                 }
1260 | 
1261 |                                                                 if (auto s = curRingServeMap.find(target); s != curRingServeMap.end())
1262 |                                                                 {
1263 |                                                                         // this target already serves 1+ segments in the current ring
1264 |                                                                         // mask subSegment with them; we don't want to send data to nodes if they already have any of it
1265 |                                                                         outSegments.clear();
1266 |                                                                         segment_t::mask_segments(&subSegment, (&subSegment) + 1, s->second, &outSegments);
1267 | 
1268 |                                                                         for (const auto masked : outSegments)
1269 |                                                                                 plan.push_back({masked, {target, replicas}});
1270 |                                                                 }
1271 |                                                                 else
1272 |                                                                 {
1273 |                                                                         // this target does not currently serve any segments in the current ring
1274 |                                                                         plan.push_back({subSegment, {target, replicas}});
1275 |                                                                 }
1276 |                                                         }
1277 |                                                 }
1278 |                                         }
1279 |                                 }
1280 |                         }
1281 | 
1282 |                         return plan;
1283 |                 }
1284 |         };
1285 | }
1286 | 
1287 | #ifdef HAVE_SWITCH
1288 | template <typename token_t>
1289 | static inline void PrintImpl(Buffer &b, const ConsistentHashing::ring_segment<token_t> &segment)
1290 | {
1291 |         b.append("(", segment.left, ", ", segment.right, "]");
1292 | }
1293 | 
1294 | template <typename T>
1295 | static inline void PrintImpl(Buffer &b, const ConsistentHashing::Ring<T> &ring)
1296 | {
1297 |         b.append(_S32("(( "));
1298 |         if (const auto cnt = ring.cnt)
1299 |         {
1300 |                 for (uint32_t i{1}; i != cnt; ++i)
1301 |                         b.append(ConsistentHashing::ring_segment<T>(ring.tokens[i - 1], ring.tokens[i]), ",");
1302 | 
1303 |                 b.append(ConsistentHashing::ring_segment<T>(ring.tokens[cnt - 1], ring.tokens[0]));
1304 |         }
1305 |         b.append(_S32(" ))"));
1306 | }
1307 | #endif
1308 | 


--------------------------------------------------------------------------------
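
The OPTIMIZATION OPPORTUNITIES notes in consistent_hashing.h suggest narrowing the source replicas of a plan entry to the closest ones with filter_by_distance(). The following is a minimal standalone sketch of that idea, not the library's code: `keep_closest` is a hypothetical free-function stand-in that mirrors the in-place compaction done by filter_by_distance(), and the node ids and the `hopsTable` distances are made up for illustration. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for Ring::filter_by_distance(): compacts [nodes, end) in place so that
// only the elements with the smallest distance remain, and returns the new end of the range.
template <typename node_t, typename L>
node_t *keep_closest(node_t *const nodes, const node_t *const end, L &&distance)
{
        using dist_t = std::invoke_result_t<L &, node_t>;
        dist_t min{};
        std::size_t out{0};

        for (auto it = nodes; it != end; ++it)
        {
                const auto d = distance(*it);

                if (!out || d < min)
                {
                        // first element, or a new minimum: restart the output at index 0
                        min = d;
                        nodes[0] = *it;
                        out = 1;
                }
                else if (d == min)
                {
                        // ties with the current minimum are kept as well
                        nodes[out++] = *it;
                }
        }

        return nodes + out;
}

// Assumed setup: node ids 1..5 are the candidate sources for a segment, and hopsTable holds an
// invented hop count from each of them to the target node (index 0 is unused).
inline std::vector<uint32_t> closest_sources()
{
        std::vector<uint32_t> sources{1, 2, 3, 4, 5};
        const uint32_t hopsTable[] = {0, 3, 1, 2, 1, 4};
        const auto hops = [&hopsTable](const uint32_t node) { return hopsTable[node]; };
        const auto newEnd = keep_closest(sources.data(), sources.data() + sources.size(), hops);

        assert(newEnd >= sources.data());
        sources.resize(static_cast<std::size_t>(newEnd - sources.data()));
        return sources;
}
```

With the assumed hop counts, nodes 2 and 4 tie for the smallest distance, so both are kept as candidate sources; a scheduler could then pick among them to spread the streaming load.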