python 一致性hash 一致性hash以及python代码实现

一致性 hash 以及 python 代码实现：自己之前的项目里面使用了 redis 作为 KV 存储，不仅是因为性能，主要是需要用 redis 的 hash 数据结构。

后来随着业务发展，读写压力越来越大，一开始的做法是读写分离，接着一主多从，发现还是不能很好的解决写 redis 的压力，又因为自己使用的 redis 版本比较低还不支持分布式的功能。

所以自己想去部署一套分布式的 redis 存储系统，开始想到的做法是简单的做个 hash，hashcode=hash(key/machine_num)，接着将对应的 key 放在对应的机器，可是考虑到机器异常宕机，或者新增机器，数据迁移的代价都比较大，所以自己了解了一下一致性 hash，准备用它实现一套分布式的 KV 存储系统。

主要思想：
一致性 hash 的经典介绍网上有很多，我感觉下面这篇文章介绍的不错。

http://blog.csdn.net/cywosp/article/details/23397179/

其实主要的思想我认为如下：

把机器按照某种 hash 算法（比如 MD5）计算得到机器的 hashcode 值对于存储的数据，根据数据的 key，使用与机器相同的 hash 算法获取到相应的 hashcode 值，然后将 key 写入到顺时针最近的机器。

可以是 hashcode(key) <= hashcode(machine) 的机器当有新机器加入时，只需要把新加入机器影响到的数据进行重新分配；当删除机器时，只需要把被删除机器的数据重新分配一下，这样可以减小数据的迁移代价。为了维持平衡性，防止雪崩效应，使用虚拟节点代替真实机器，一个真实机器对应多个虚拟节点，这样可以保证数据的分布均衡

下面是 Python 的代码实现，实现思路就是上面提到的。
代码也是参考网上的：
http://www.cnblogs.com/xuxm2007/archive/2011/08/28/2156015.html

# ! /usr/bin / env python# - *-coding: utf8 - *-import md5 class HashRing(object) : def __init__(self, nodes = None, replicas = 3) : """            Manages a hash ring            :nodes                 a list of object that have a proper __str__ representation            :replicas                indicates how many virtual points should be used per node,                replicas are required to improve the distribution            """self.replicas = replicas self.ring = {}
self._sorted_keys = []
if nodes: for node in nodes: self.add_node(node) def add_node(self, node) : """            Adds a node to the hash ring(including a number of replicas)            """
for i in xrange(0, self.replicas) : key = self.gen_key("%s:%s" % (node, i)) self.ring[key] = node self._sorted_keys.append(key) self._sorted_keys.sort() def remove_node(self, node) : """            Removes node from the hash ring and its replicas            """
for i in xrange(0, self.replicas) : key = self.gen_key("%s:%s" % (node, i)) del self.ring[key] self._sorted_keys.remove(key) def get_node_keys(self, node) : """            Given a node return all its virturl node keys            """keys = []
for i in xrange(0, self.replicas) : key = self.gen_key("%s:%s" % (node, i)) keys.append(key) return key def get_node(self, string_key) : """            Given a string key corresponding node in the hash ring is returned.            If the hash ring is empty, None is returned.            """
return self.get_node_pos(string_key)[0] def get_node_pos(self, string_key) : """            Given a string key corresponding node in the hash ring is returned            along with it's position in the ring.            If the hash ring is empty, (None, None) is returned            """
if not self.ring: return None,
None key = self.gen_key(string_key) nodes = self._sorted_keys
for i in xrange(0, len(nodes)) : node = nodes[i]
if key <= node: return self.ring[node],
i
return self.ring[nodes[0]],
0 def get_nodes(self, string_key) : """            Given a string key it returns the nodes as a generator that can hold the key.            The generator is never ending and iterates through the ring starting at the correct position.            """
if not self.ring: yield None node,
pos = self.get_node_pos(string_key) for key in self._sorted_keys: yield self.ring[key] def gen_key(self, key) : """            Given a string key it return a long value,            this long value represent a place on the hash ring.            md5 is currently used because it mixes well.            """m = md5.new() m.update(key) return long(m.hexdigest(), 16)

就爱阅读 www.92to.com 网友整理上传, 为您提供最全的知识大全, 期待您的分享，转载请注明出处。

来源: http://www.92to.com/bangong/2017/02-17/17249840.html

与本文相关文章

暂无,快来抢沙发吧！