├── .gitignore
├── README.md
├── examples
    ├── http_server.py
    └── tcp_server.py
├── redis_router
    ├── __init__.py
    ├── http_interface.py
    ├── router.py
    └── tcp_interface.py
├── requirements.txt
├── setup.py
├── shardacross.png
├── tests.py
└── workflow.png


/.gitignore:
--------------------------------------------------------------------------------
*.py[cod]

# C extensions
*.so

# Packages
*.egg
*.egg-info
dist
build
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
.tox
nosetests.xml

# Translations
*.mo

# Mr Developer
.mr.developer.cfg
.project
.pydevproject
temp.py
serverlist
.idea
.idea/*

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
redis-router
============

redis_router, a redis sharding library/api for your redis sharding needs.



how it works
==============

wikipedia/consistent_hashing

> Consistent hashing is a special kind of hashing. When a hash table is resized and consistent hashing is used,
> only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots.
> In contrast, in most traditional hash tables, a change in the number of array slots causes
> nearly all keys to be remapped.

redis_router uses last.fm's
libketama under the hood.

installation
==========

install libketama/ketama_python first.

After that:

```
pip install redis-router
```
or, if you like the 90s:

```
easy_install redis-router
```

or add the redis_router directory to your path.
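
The consistent hashing idea quoted under "how it works" can be sketched in a few lines of plain Python 3. This is only an illustration, not libketama's actual point placement: `HashRing`, `_hash`, `moved_fraction` and `points_per_weight` are hypothetical names made up for this sketch, and the third server (`127.0.0.1:6381`) is an assumed example node.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 32-bit integer (first 4 bytes of its MD5 digest)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing(object):
    """A toy consistent-hash ring (hypothetical, not libketama's algorithm)."""

    def __init__(self, servers, points_per_weight=1):
        # servers: dict mapping "host:port" to an integer weight.
        ring = []
        for server, weight in servers.items():
            # Heavier servers get more points on the ring, hence more keys.
            for i in range(weight * points_per_weight):
                ring.append((_hash("{0}-{1}".format(server, i)), server))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    def get_server(self, key):
        """Return (key_hash, server) for the first point at or after the key's hash."""
        key_hash = _hash(key)
        # Wrap around to the first point when the hash is past the last one.
        index = bisect.bisect_left(self._points, key_hash) % len(self._ring)
        return key_hash, self._ring[index][1]


def moved_fraction(before, after, keys):
    """Fraction of keys that map to a different server after a ring change."""
    moved = sum(1 for key in keys
                if before.get_server(key)[1] != after.get_server(key)[1])
    return moved / float(len(keys))


keys = ["key{0}".format(i) for i in range(10000)]
two = HashRing({"127.0.0.1:6379": 100, "127.0.0.1:6380": 100})
three = HashRing({"127.0.0.1:6379": 100, "127.0.0.1:6380": 100,
                  "127.0.0.1:6381": 100})

# Only keys that now belong to the new server move; the rest stay put.
print("moved after adding a server: {0:.1%}".format(
    moved_fraction(two, three, keys)))
```

With equal weights, the moved share should land somewhere near one third, which is the K/n behaviour described in the quote. Every key that moves ends up on the newly added server.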

quick start
============


servers.txt (host:port weight)
```
127.0.0.1:6379 100
127.0.0.1:6380 100
```

your python code:

``` python
from redis_router.router import Router

router = Router("servers.txt")

router.set("forge", 13)
router.set("spawning_pool", 18)
```

output with loglevel=DEBUG

```
DEBUG:key 'forge' hashed as 4113771093 and mapped to 127.0.0.1:6379
DEBUG:key 'spawning_pool' hashed as 1434709819 and mapped to 127.0.0.1:6380
DEBUG:key 'forge' hashed as 4113771093 and mapped to 127.0.0.1:6379
DEBUG:key 'spawning_pool' hashed as 1434709819 and mapped to 127.0.0.1:6380
13 6
```

redis_router as a server
========================================
If your clients are written in a language other than Python, you can use the HTTP or TCP interface to connect
and send commands to redis_router.

running TCP interface
=======================

``` python
from redis_router.tcp_interface import RouterServer

r = RouterServer('0.0.0.0', 5000)
r.run()
```

playing with it
```
$ telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
set selam timu
True
get selam
timu
dbsize
13
```

HTTP API
=============

``` python
from redis_router.http_interface import start_server

start_server('0.0.0.0', 5000)
```

example request:

* initialize a set with two members.

``` bash
$ curl -X POST --data "command=sadd&arguments=teams,galatasaray,fenerbahce" http://localhost:5000
```
``` json
{
    "response": 2
}
```
* get members

``` bash
$ curl -X POST --data "command=smembers&arguments=teams" http://localhost:5000
```

``` json
{
    "response": [
        "fenerbahce",
        "galatasaray"
    ]
}
```

running tests
=================
``` bash
$ py.test tests.py
=============================================== test session starts =========================
platform linux2 -- Python 2.7.3 -- pytest-2.3.4
collected 11 items

tests.py ...........

============================================ 11 passed in 0.33 seconds ======================
```

FAQ
=========
> Q: What about data invalidation if I move servers, change the config, etc.?

It's minimal. With consistent hashing, only about K/n keys are remapped on average (see "how it works"), which is at least better than:
```
Node = Hash(key) MOD N
```

> Q: I want to see some stats about sharding efficiency.

Results for 100,000 random keys across three shards:
```
results: {
  redis.client.Redis object at 0x8df75a4: 33558,
  redis.client.Redis object at 0x8df7644: 31207,
  redis.client.Redis object at 0x8df7504: 35235
}
```


> Q: Can I use this with PHP or [INSERT RANDOM LANGUAGE HERE]?

Yes.

There are TCP server
and HTTP server options.
You can also use libketama's implementations in your own language.
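
To put a rough number on the `Hash(key) MOD N` comparison from the first FAQ answer, the following Python 3 sketch counts how many keys change nodes when a plain modulo scheme grows from 3 to 4 servers. The helper names (`key_hash`, `modulo_node`, `remapped_fraction`) and the MD5-based hash are assumptions made for this illustration; they are not part of redis_router or libketama.

```python
import hashlib


def key_hash(key):
    """Map a string to a 32-bit integer (first 4 bytes of its MD5 digest)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def modulo_node(key, node_count):
    """Naive sharding: pick the node index as hash(key) mod N."""
    return key_hash(key) % node_count


def remapped_fraction(keys, old_count, new_count):
    """Fraction of keys whose node index changes when N changes."""
    moved = sum(1 for key in keys
                if modulo_node(key, old_count) != modulo_node(key, new_count))
    return moved / float(len(keys))


keys = ["key{0}".format(i) for i in range(10000)]
fraction = remapped_fraction(keys, 3, 4)
print("{0:.0%} of keys change nodes going from 3 to 4 servers".format(fraction))
```

For a uniform hash, a key keeps its node only when `h % 3 == h % 4`, which holds for 3 of every 12 residues, so roughly three quarters of the keys move. Consistent hashing would move about a quarter of them in the same situation.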

[![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/emre/redis-router/trend.png)](https://bitdeli.com/free "Bitdeli Badge")

--------------------------------------------------------------------------------
/examples/http_server.py:
--------------------------------------------------------------------------------
# -*- coding: utf8 -*-

from redis_router.http_interface import start_server

start_server('0.0.0.0', 5000)

--------------------------------------------------------------------------------
/examples/tcp_server.py:
--------------------------------------------------------------------------------
# -*- coding: utf8 -*-

from redis_router.tcp_interface import RouterServer

r = RouterServer('0.0.0.0', 5000)
r.run()

"""
$ telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
set selam timu
True
get selam
timu
dbsize
13
"""
--------------------------------------------------------------------------------
/redis_router/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/emre/redis-router/0bc35886c2e1a66a8f5ef3c439db818bcb7493d3/redis_router/__init__.py
--------------------------------------------------------------------------------
/redis_router/http_interface.py:
--------------------------------------------------------------------------------
# -*- coding: utf8 -*-

try:
    from flask import Flask, render_template, jsonify, request
except ImportError:
    raise ImportError('flask library is not installed.')

from router import Router

import os

# initialize flask application
app = Flask(__name__)

config_file = os.getenv('ROUTER_CONFIG_FILE', '/etc/redis_router/servers.config')

# main view
@app.route('/',
           methods=['POST', ])
def index():
    router = Router(config_file)
    command, arguments = request.form['command'], request.form['arguments']

    # arguments arrive as one comma-separated string, e.g. "teams,a,b"
    arguments = arguments.split(",")
    router_response = getattr(router, command)(*arguments)
    if isinstance(router_response, set):
        # sets are not JSON serializable
        router_response = list(router_response)

    return jsonify({"response": router_response})

from gevent.wsgi import WSGIServer


def start_server(host, port):
    http_server = WSGIServer((host, port), app)
    http_server.serve_forever()

--------------------------------------------------------------------------------
/redis_router/router.py:
--------------------------------------------------------------------------------

try:
    import ketama
except ImportError:
    raise ImportError('libketama is not installed.')

import redis
import re
import logging


class Router(object):

    SERVERS = {}
    METHOD_BLACKLIST = [
        'smove',  # hard to do atomically when the keys live on different shards.
        'move',
    ]

    def __init__(self, ketama_server_file):
        self.server_list = self.parse_server_file(ketama_server_file)
        self.continuum = ketama.Continuum(ketama_server_file)

        for hostname, port in self.server_list:
            server_string = "{0}:{1}".format(hostname, port)

            # create an empty record; the connection is opened lazily on first use.
            self.SERVERS.update({
                server_string: None,
            })

    def strict_connection(self, hostname, port, timeout=None):

        if not isinstance(port, int):
            try:
                port = int(port)
            except ValueError:
                raise ValueError('port must be int or int convertible.')

        return redis.StrictRedis(host=hostname, port=port, db=0, socket_timeout=timeout)

    def get_connection(self, key):
        key_hash, connection_uri = self.continuum.get_server(key)
        hostname, port = connection_uri.split(":")

        logging.debug("key '{0}' hashed as {1} and mapped to {2}".format(
            key,
            key_hash,
            connection_uri
        ))

        connection = self.SERVERS.get(connection_uri)
        if not connection:
            self.SERVERS.update({
                connection_uri: self.strict_connection(hostname, port),
            })

            connection = self.SERVERS.get(connection_uri)

        return connection

    def __getattr__(self, methodname):

        if methodname in self.METHOD_BLACKLIST:
            raise AttributeError('this method is not allowed with redis_router')

        def method(*args, **kwargs):
            if len(args) < 1:
                raise AttributeError("not enough arguments.")

            # the first argument is the key, which decides the shard.
            connection = self.get_connection(args[0])

            if hasattr(connection, methodname):
                return getattr(connection, methodname)(*args, **kwargs)
            else:
                raise AttributeError("invalid method name:{0}".format(methodname))

        return method

    def __set_generator(self, *args):
        """
        iterable for the custom set methods: ["sinter", "sdiff", "sunion"]
        returns related set's members as python's built-in set.
        """
        for key in args:
            yield set(self.smembers(key))

    def sinter(self, *args):
        return set.intersection(*self.__set_generator(*args))

    def sinterstore(self, destination, *args):
        intersection = self.sinter(*args)
        if len(intersection) > 0:
            self.sadd(destination, *intersection)

        return len(intersection)

    def sdiff(self, *args):
        return set.difference(*self.__set_generator(*args))

    def sdiffstore(self, destination, *args):
        difference = self.sdiff(*args)
        if len(difference) > 0:
            self.sadd(destination, *difference)

        return len(difference)

    def sunion(self, *args):
        return set.union(*self.__set_generator(*args))

    def sunionstore(self, destination, *args):
        union = self.sunion(*args)
        if len(union) > 0:
            self.sadd(destination, *union)

        # like sinterstore and sdiffstore, return the size of the computed set.
        return len(union)

    def ping_all(self, timeout=None):
        """
        pings all shards and returns the results.
        if a shard is down, returns 'DOWN' for the related shard.
        """
        results = list()
        for connection_uri, connection in self.SERVERS.items():
            try:
                # open a temporary connection for shards that were never used.
                if not connection:
                    connection = self.strict_connection(*connection_uri.split(":"), timeout=timeout)
                results.append({
                    "result": connection.ping(),
                    "connection_uri": connection_uri,
                })
            except redis.exceptions.ConnectionError:
                results.append({
                    "result": 'DOWN',
                    "connection_uri": connection_uri,
                })

        return results

    def dbsize(self):
        """
        returns the number of keys across all the shards.
        """
        result = 0
        for connection_uri, connection in self.SERVERS.items():
            if not connection:
                connection = self.strict_connection(*connection_uri.split(":"))

            result += int(connection.dbsize())

        return result

    def flush_all(self):
        """
        flushes all the keys from all the instances.
        """
        for connection_uri, connection in self.SERVERS.items():
            if not connection:
                connection = self.strict_connection(*connection_uri.split(":"))

            connection.flushall()

    def parse_server_file(self, ketama_server_file):
        # each line looks like "host:port weight"; returns a list of (host, port) tuples.
        with open(ketama_server_file) as server_file:
            file_content = server_file.read()
        result = re.findall(r'([^:]*):([^\s]*)\s[^\n]*\n', file_content)

        return result


--------------------------------------------------------------------------------
/redis_router/tcp_interface.py:
--------------------------------------------------------------------------------
import logging
import sys
import os


try:
    from gevent.server import StreamServer
except ImportError:
    raise Exception('gevent library is not installed.')

from router import Router


class RouterServer(object):

    CONFIG_FILE = '/etc/redis_router/servers.config'

    def __init__(self, host, port):
        self.server = StreamServer((host, port), self.main)
        self.init_router()

    def main(self, socket, address):
        logging.debug('New connection from %s:%s' % address)
        fileobj = socket.makefile()
        while True:
            client_call = fileobj.readline().replace("\n", "")

            if not client_call:
                logging.debug("client disconnected")
                break

            if client_call.strip() == '\\quit':
                logging.debug("client quit")
                sys.exit(0)
            elif len(client_call) > 2:
                # "set selam timu" -> method "set", args ["selam", "timu"]
                splitted_query = client_call.strip().split(" ")
                method, args = splitted_query[0], splitted_query[1:]

                response = getattr(self.r,
                                   method)(*args)
                fileobj.write(response)

            fileobj.flush()

    def init_router(self):
        if not os.path.exists(self.CONFIG_FILE):
            raise IOError('config file could not be found: {0}'.format(self.CONFIG_FILE))

        self.r = Router(self.CONFIG_FILE)
        return self.r

    def run(self):
        self.server.serve_forever()


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
Flask==0.9
gevent==0.13.8
ketama==0.1
py==1.4.13
pytest==2.3.4
redis==2.7.2

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from distutils.core import setup

setup(
    name='redis-router',
    version='0.2',
    packages=['redis_router'],
    url='https://github.com/emre/redis-router',
    license='MIT',
    author='Emre Yilmaz',
    author_email='mail@emreyilmaz.me',
    description='A redis sharding library/api for your sharding needs.',
    install_requires=['redis'],
)

--------------------------------------------------------------------------------
/shardacross.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/emre/redis-router/0bc35886c2e1a66a8f5ef3c439db818bcb7493d3/shardacross.png
--------------------------------------------------------------------------------
/tests.py:
--------------------------------------------------------------------------------
# -*- coding: utf8 -*-

import unittest
import os
import ketama

from redis_router.router import Router


class RouterTests(unittest.TestCase):

    def setUp(self):
        # localhost:6379 and localhost:6380 must be accessible redis instances for testing.
        self.valid_list_file = os.tmpnam()
        self.valid_list = file(self.valid_list_file, "w")
        self.valid_list.write("127.0.0.1:6379\t600\n")
        self.valid_list.write("127.0.0.1:6380\t400\n")
        self.valid_list.flush()

        self.invalid_list_file = os.tmpnam()
        self.invalid_list = file(self.invalid_list_file, "w")
        self.invalid_list.write("127.0.0.1:11211 600\n")
        self.invalid_list.write("127.0.0.1:11212 foo\n")
        self.invalid_list.flush()

        self.router = Router(self.valid_list_file)

    def tearDown(self):
        self.valid_list.close()
        os.unlink(self.valid_list_file)

        self.invalid_list.close()
        os.unlink(self.invalid_list_file)

    def test_valid_configuration(self):
        r = Router(self.valid_list_file)
        self.assertEqual(isinstance(r, Router), True)

    def test_invalid_configuration(self):
        self.assertRaises(ketama.KetamaError, Router, self.invalid_list_file)

    def test_continuum(self):
        cont = Router(self.valid_list_file).continuum
        self.assertEqual(type(cont), ketama.Continuum)

    def test_invalid_null(self):
        self.assertRaises(ketama.KetamaError, Router, "/dev/null")

    def test_hashing(self):
        router = Router(self.valid_list_file)
        router.set('forge', 13)
        router.set("spawning_pool", 18)

        key_hash, connection_uri = router.continuum.get_server('forge')
        self.assertEqual(key_hash, 4113771093)
        self.assertEqual(connection_uri, '127.0.0.1:6379')

        key_hash, connection_uri = router.continuum.get_server('spawning_pool')
        self.assertEqual(key_hash, 1434709819)
        self.assertEqual(connection_uri, '127.0.0.1:6380')

    def test_sinter(self):
        self.router.sadd('X', 'a', 'b', 'c')
        self.router.sadd('Y', 'a', 'd', 'e')

        self.assertEqual(self.router.sinter('X', 'Y'), set(['a', ]))

    def test_sinterstore(self):
        self.router.sadd('X1', 'a', 'b', 'c')
        self.router.sadd('Y1', 'a',
                         'd', 'e')
        self.router.sinterstore('Z1', 'X1', 'Y1')

        self.assertEqual(self.router.smembers('Z1'), set(['a', ]))

    def test_sunion(self):
        self.router.sadd('T1', 'a', 'b', 'c')
        self.router.sadd('M1', 'a', 'd', 'e')

        self.assertEqual(self.router.sunion('T1', 'M1'), set(['a', 'b', 'c', 'd', 'e']))

    def test_sunionstore(self):
        self.router.sadd('T2', 'a', 'b', 'c')
        self.router.sadd('M2', 'a', 'd', 'e')

        self.router.sunionstore('Z2', 'T2', 'M2')

        self.assertEqual(self.router.smembers('Z2'), set(['a', 'b', 'c', 'd', 'e']))

    def test_dbsize(self):
        self.router.flush_all()

        for index in xrange(1, 10):
            self.router.set('data{0}'.format(index), '1')

        self.assertEqual(self.router.dbsize(), 9)

    def test_flush_all(self):
        for index in xrange(1, 10):
            self.router.set('random_data{0}'.format(index), '1')

        self.router.flush_all()

        self.assertEqual(self.router.dbsize(), 0)

--------------------------------------------------------------------------------
/workflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/emre/redis-router/0bc35886c2e1a66a8f5ef3c439db818bcb7493d3/workflow.png
--------------------------------------------------------------------------------