国产aⅴ自拍,日韩精品不卡,三级黄特色

C#實戰：一致性Hash算法

轉帖|其它|編輯：郝浩|2010-09-02 15:02:58.000|閱讀 671 次

概述：最近在研究"一致性HASH算法"(Consistent Hashing)，用于解決memcached集群中當服務器出現增減變動時對散列值的影響。后來在JAVAEYE上的一篇文章中，找到了其中的 KetamaHash 算法的JAVA實現(一種基于虛擬結點的HASH算法)，于是為了加深理解，對照 JAVA版本，用C#重寫了一個。　　

# 界面/圖表報表/文檔/IDE等千款熱門軟控件火熱銷售中 >>

　　最近在研究"一致性HASH算法"(Consistent Hashing)，用于解決memcached集群中當服務器出現增減變動時對散列值的影響。后來在JAVAEYE上的一篇文章中，找到了其中的 KetamaHash 算法的JAVA實現(一種基于虛擬結點的HASH算法)，于是為了加深理解，對照 JAVA版本，用C#重寫了一個。　　

下面是對Ketama的介紹：

#div_code img{border:0px;}

Ketama is an implementation of a consistent hashing algorithm, meaning you can add or remove servers from the memcached pool without causing a complete remap of all keys.
Here’s how it works:
* Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
* Hash each server string to several (100-200) unsigned ints
* Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
* Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
* To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
* If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
If you then add or remove a server from the list, only a small proportion of keys end up mapping to different servers.

　　我的理解，這類方法其實有點像大學里的微積分的思想(通過連續函數的取值區間來計算圖形面積)。通過把已知的實結點(memcached服務IP端口)列表結成一個圓，然后在兩兩實結點之間“放置”盡可能多的虛擬節點(上面文中的unsigned ints)，假設用戶數據映射在虛擬節點上(用戶數據真正存儲位置是在該虛擬節點代表的實際物理服務器上)，這樣就能抑制分布不均勻，最大限度地減小服務器增減時的緩存重新分布。如下圖：

　　如這篇文章所說(總結一致性哈希(Consistent Hashing) )：

　　Consistent Hashing最大限度地抑制了hash鍵的重新分布。另外要取得比較好的負載均衡的效果，往往在服務器數量比較少的時候需要增加虛擬節點來保證服務器能均勻的分布在圓環上。因為使用一般的hash方法，服務器的映射地點的分布非常不均勻。使用虛擬節點的思想，為每個物理節點(服務器)在圓上分配100～200個點。這樣就能抑制分布不均勻，最大限度地減小服務器增減時的緩存重新分布。用戶數據映射在虛擬節點上，就表示用戶數據真正存儲位置是在該虛擬節點代表的實際物理服務器上。

　　了解了原理，下面看一下具體實現。

　　JAVA實現代碼取自Spy Memcached client中的類，原文的作者也盡可能的對代碼進行注釋說明，所以讓我剩了不少時間。　　

　　下面是相應的.NET實現(注釋參考JAVA版本):

#div_code img{border:0px;}

public class KetamaNodeLocator
{
//原文中的JAVA類TreeMap實現了Comparator方法，這里我圖省事，直接用了net下的SortedList，其中Comparer接口方法）
private SortedList<long, string> ketamaNodes = new SortedList<long, string>();
private HashAlgorithm hashAlg;
private int numReps = 160;

//此處參數與JAVA版中有區別，因為使用的靜態方法，所以不再傳遞HashAlgorithm alg參數
public KetamaNodeLocator(List<string> nodes, int nodeCopies)
{
ketamaNodes = new SortedList<long, string>();

numReps = nodeCopies;
//對所有節點，生成nCopies個虛擬結點
foreach (string node in nodes)
{
//每四個虛擬結點為一組
for (int i = 0; i < numReps / 4; i++)
{
//getKeyForNode方法為這組虛擬結點得到惟一名稱
byte[] digest = HashAlgorithm.computeMd5(node + i);
/** Md5是一個16字節長度的數組，將16字節的數組每四個字節一組，分別對應一個虛擬結點，這就是為什么上面把虛擬結點四個劃分一組的原因*/
for (int h = 0; h < 4; h++)
{
long m = HashAlgorithm.hash(digest, h);
ketamaNodes[m] = node;
}
}
}
}

public string GetPrimary(string k)
{
byte[] digest = HashAlgorithm.computeMd5(k);
string rv = GetNodeForKey(HashAlgorithm.hash(digest, 0));
return rv;
}

string GetNodeForKey(long hash)
{
string rv;
long key = hash;
//如果找到這個節點，直接取節點，返回
if (!ketamaNodes.ContainsKey(key))
{
//得到大于當前key的那個子Map，然后從中取出第一個key，就是大于且離它最近的那個key 說明詳見: //www.javaeye.com/topic/684087
var tailMap = from coll in ketamaNodes
where coll.Key > hash
select new { coll.Key };
if (tailMap == null || tailMap.Count() == 0)
key = ketamaNodes.FirstOrDefault().Key;
else
key = tailMap.FirstOrDefault().Key;
}
rv = ketamaNodes[key];
return rv;
}
}

　　下面的代碼與JAVA中有所不同，它使用靜態方法實現：

#div_code img{border:0px;}

public class HashAlgorithm
{
public static long hash(byte[] digest, int nTime)
{
long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
| ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
| ((long)(digest[1 + nTime * 4] & 0xFF) << 8)
| ((long)digest[0 + nTime * 4] & 0xFF);

return rv & 0xffffffffL; /* Truncate to 32-bits */
}

/**
* Get the md5 of the given key.
*/
public static byte[] computeMd5(string k)
{
MD5 md5 = new MD5CryptoServiceProvider();

byte[] keyBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(k));
md5.Clear();
//md5.update(keyBytes);
//return md5.digest();
return keyBytes;
}
}

　　下面是.net版本下的測試結果

　　分布平均性測試:測試隨機生成的眾多key是否會平均分布到各個結點上測試結果如下: