redis的hash用法,redis如何rehash

虽然很久没有写过纯粹的C代码了，但是最近在学习redis，很佩服它的力量和优雅，在空闲的时候看看源代码。结构非常清晰，各模块功能也很清晰，非常适合阅读和学习。

众所周知redis支持各种各样的数据结构，其中dict的使用频率相当高，是非常实用的结构。 redis的具体实现使用渐进式gxdxl(rehashing)机制来提高dict的缩放效率。看这部分的源代码，真的有很优雅的东西。

其实关于渐进式gxdxl的报道不少，但我决定自己写。在重新考虑想法的同时，可以加深以下印象。

在查看rehash的函数主体之前，我们来看看dict的数据结构是如何定义的：

/* gxdxl表节点*/typedef struct dictEntry { //键void *key； //值union { void *val； uint64_t u64； int64_t s64； ) v； //指向下一个gxdxl表节点，然后单击链表struct dictEntry *next； (} dictEntry；/* thisisourhashtablestructure.everydictionaryhastwoofthisaswe * implementincrementalrehashing， fortheoldtothenewtable.* */typedef struct dict ht {//gxdxl表数组//可以看作是一个GX dxl表数组，作为条目链表的开头节点。 (链寻址解决gxdxl冲突) dictEntry **table； //gxdxl表大小unsigned long size； //gxdxl表的大小掩码。计算索引值//以使其始终等于size - 1 unsigned long sizemask； //此gxdxl表中的现有节点数unsigned long used； (} dictht； /*字典*/typedef struct dict { //类型指定函数dictType *type； //私人数据void *privdata； //gxdxl表dictht ht[2]； //rehash索引//rehash如果不存在，则值为-1 int rehashidx；/* rehashingnotinprogressifrehashidx==-1 *///当前正在运行的安全迭代器的数量int iterators；/* numberofiteratorscurrentlyrunning */} dict； dict的结构大致如上。接下来，我们来分析一下其中最重要的数据成员。

dictht:table:gxdxl表内部的table结构使用链地址法解决了gxdxl的冲突。开始看的时候，我很奇怪这为什么是二维排列。这实际上是一个指向数组的指针，数组中的每个项目都是entry链表的开头节点。 dict htht [2] :在dict内部维护两个gxdxl表。角色与一对滚动数组相同。一张表是旧表，一张表是新表。如果需要动态调整hashtable的大小，旧表中的元素会迁移到新打开的新表中，当前新表将调整大小成为旧表。因为这是rehashidx:http://www.Sina.com/的gxdxl，所以数据迁移不是一步，所以需要一个表示当前rehash进度的索引。如果rehashidx为-1，则没有gxdxl操作。接下来，我们来看看rehash的主体部分。直接从github的最新版本中获取的。

/* performsnstepsofincrementalrehashing.return S1 iftherearestill * keystomovefromtheoldtothenewhashtable， otherwise0is returned.* * notethatarehashingstepconsistsinmovingabucket (thatmayhavemore * Thanonekeyasweusechaining ) ) ) ) 652 from the old to the new hash table，however * sincepartofthehashtablemaybech ) itis not * guaranteedthatthisfunctionwillrehashevenasinglebucket，since it * willvisitatmaxn * 10 empty bu

ckets in total, otherwise the amount of * work it does would be unbound and the function may block for a long time. */int dictRehash(dict *d, int n) { int empty_visits = n*10; /* Max number of empty buckets to visit. */ if (!dictIsRehashing(d)) return 0; while(n-- && d->ht[0].used != 0) { dictEntry *de, *nextde; /* Note that rehashidx can't overflow as we are sure there are more * elements because ht[0].used != 0 */ assert(d->ht[0].size > (unsigned long)d->rehashidx); while(d->ht[0].table[d->rehashidx] == NULL) { d->rehashidx++; if (--empty_visits == 0) return 1; } de = d->ht[0].table[d->rehashidx]; /* Move all the keys in this bucket from the old to the new hash HT */ while(de) { uint64_t h; nextde = de->next; /* Get the index in the new hash table */ h = dictHashKey(d, de->key) & d->ht[1].sizemask; de->next = d->ht[1].table[h]; d->ht[1].table[h] = de; d->ht[0].used--; d->ht[1].used++; de = nextde; } d->ht[0].table[d->rehashidx] = NULL; d->rehashidx++; } /* Check if we already rehashed the whole table... */ if (d->ht[0].used == 0) { zfree(d->ht[0].table); d->ht[0] = d->ht[1]; _dictReset(&d->ht[1]); d->rehashidx = -1; return 0; } /* More to rehash... */ return 1;}

了解一个函数功能最好的入口就是它的注释。我们可以大致了解到：

rehash是以bucket(桶)为基本单位进行渐进式的数据迁移的，每步完成一个bucket的迁移，直至所有数据迁移完毕。一个bucket对应gxdxl表数组中的一条entry链表。新版本的dictRehash()还加入了一个最大访问空桶数(empty_visits)的限制来进一步减小可能引起阻塞的时间。

接下来我们深扒一下这个函数的具体实现。

判断dict是否正在rehashing，只有是，才能继续往下进行，否则已经结束gxdxl过程，直接返回。接着是分n步进行的渐进式gxdxl主体部分（n由函数参数传入），在while的条件里面加入对.used旧表中剩余元素数目的观察，增加安全性。一个runtime的断言保证一下渐进式gxdxl的索引没有越界。接下来一个小while是为了跳过空桶，同时更新剩余可以访问的空桶数，empty_visits这个变量的作用之前已经说过了。现在我们来到了当前的bucket，在下一个while(de)中把其中的所有元素都迁移到ht[1]中，索引值是辅助了gxdxl表的大小掩码计算出来的，可以保证不会越界。同时更新了两张表的当前元素数目。每一步rehash结束，都要增加索引值，并且把旧表中已经迁移完毕的bucket置为空指针。最后判断一下旧表是否全部迁移完毕，若是，则回收空间，重置旧表，重置渐进式gxdxl的索引，否则用返回值告诉调用方，dict内仍然有数据未迁移。

渐进式gxdxl的精髓在于：数据的迁移不是一次性完成的，而是可以通过dictRehash()这个函数分步规划的，并且调用方可以及时知道是否需要继续进行渐进式gxdxl操作。如果dict数据结构中存储了海量的数据，那么一次性迁移势必带来redis性能的下降，别忘了redis是单线程模型，在实时性要求高的场景下这可能是致命的。而渐进式gxdxl则将这种代价可控地分摊了，调用方可以在dict做插入，删除，更新的时候执行dictRehash()，最小化数据迁移的代价。
在迁移的过程中，数据是在新表还是旧表中并不是一个非常急迫的需求，迁移的过程并不会丢失数据，在旧表中找不到再到新表中寻找就是了。

所以在后面的dict相关的函数里，会大量的看到

if(dictIsRehashing(d)) _dictRehashStep(d); // 单步rehash

这样的代码。

最后是从《Redis设计与实现》中copy来的图解，可以帮助大家更形象地理解整个incremental rehash的过程：

总结一下

redis高性能的保障采取了各式各样的措施，不乏很多优雅惊艳的工程技巧，非常值得我们学习。在阅读源码的过程中，还给我留下深刻印象的一点就是：redis对于内存的管理到了精细的程度，也可能是我太久没看pure c的项目了吧，收获还是颇丰的。希望能和大家一起共同进步。