HashMap源码解析-红黑树操作

距离上一篇分析HashMap的文章已经过去一年了，今天偶尔翻到之前的那篇，还记得当时是打算另起一篇来分析链表的树化，以及树的链表化过程，结果这一拖就???，还是来把坑填上吧，毕竟要有始有终。

在我看来HashMap大致分为三个部分：散列表，红黑树算法，链表与树之间的转化。关于散列表部分，关于红黑树分析，这里就来分析下树化部分。

接下来先分析TreeNode 中的各个方法，之后再在相关操作中看它们的应用。

    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }

TreeNode 继承自 LinkedHashMap.Entry，LinkedHashMap.Entry 继承自HashMap.Node，这绕了一圈是因为LinkedHashMap 的实现是依赖于HashMap的。关于LinkedHashMap的解析看我的这篇：LinkedHashMap源码解析（JDK8）

对于TreeNode 它有指向父节点的 parent，指向左节点的 left，右节点 right，前一个节点 prev，以及继承自 Node 的next，还有LinkedHashMap.Entry 中的 before，after指针，这两个与插入顺序的实现有关，在HashMap中并无对二者的任何相关操作。这就造成我们可以从两种不同视图角度来看树结构，一个是 parent，left，right 形成的二叉树，另一个是 next ，previous 形成的双向链表。

moveRootToFront：
就是将 root 前移至链头，不改变其在树中的位置。该方法是为了确保 root 即是红黑树的根节点，又是双向链表的表头。因为插入，删除操作可能会改变树的根节点，在插入删除后会调用该方法确保这种一致性，只有一处在删除后没有调用该方法，后文再介绍。
该方法本质上就是将节点从双向链原位置处删除，之后插入到链表的头部。

        static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
            int n;
            if (root != null && tab != null && (n = tab.length) > 0) {
                int index = (n - 1) & root.hash;
                TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
                if (root != first) {
                    Node<K,V> rn;
                    tab[index] = root;
                    TreeNode<K,V> rp = root.prev;
                    if ((rn = root.next) != null)
                        ((TreeNode<K,V>)rn).prev = rp;
                    if (rp != null)
                        rp.next = rn;
                    if (first != null)
                        first.prev = root;
                    root.next = first;
                    root.prev = null;
                }
                assert checkInvariants(root);
            }
        }

treeify

在介绍 treeify 方法之前先来介绍下 TreeNode 节点之间的大小比较，假设要插入的节点 x，其关键字为 k，比较位置处节点 p，其关键字 pk ：首先比较 x 与 p 的 hash字段(即 key 的hash)，若相同则要看 k 的类是否是 class X implements Comparable<X>这样的类型，是则调用 compareTo( pk )，前提是 pk 与 k 是同一类型(即 k.getClass == pk.getClass)，否则则比较二者的类名，若仍相同则比较二者对象的hashCode，若 k 对象的hashCode 小于等于 pk 的，则返回 -1，代表 x 应插入到 p 的左子树中。

上述说明的实现对应 comparableClassFor，compareComparables，tieBreakOrder 三个方法：

comparableClassFor：
如果 x 类是这样的：class X implements Comparable<X>，则返回 x 的class 对象，否则返回null。

    static Class<?> comparableClassFor(Object x) {
        if (x instanceof Comparable) { // x必须实现Comparable接口
            Class<?> c; Type[] ts, as; Type t; ParameterizedType p;
            // String的化直接返回String.class
            if ((c = x.getClass()) == String.class) // bypass checks
                return c;
            // 获取x直接实现的所有接口
            if ((ts = c.getGenericInterfaces()) != null) {
            	// 遍历这些接口，找寻是否存在 Comparable<X>
                for (int i = 0; i < ts.length; ++i) {
                    if (((t = ts[i]) instanceof ParameterizedType) &&
                        ((p = (ParameterizedType)t).getRawType() ==
                         Comparable.class) &&
                        (as = p.getActualTypeArguments()) != null &&
                        as.length == 1 && as[0] == c) // type arg is c
                        return c;
                }
            }
        }
        return null;
    }

关于 ParameterizedType 泛型参数化类型：ParameterizedType详解

compareComparables：
若 x 与 k 是同一个类，k 的类是这样的：class C implements Comparable<C>，
则调用 compareTo 方法进行比较，否则返回0。

    static int compareComparables(Class<?> kc, Object k, Object x) {
        return (x == null || x.getClass() != kc ? 0 :
                ((Comparable)k).compareTo(x));
    }

tieBreakOrder
若 a ，b 对象其中有一个为null，或者是同一个类，则比较对象的hashCode；否则比较它们的类名。System.identityHashCode(null) 大小为 0。

        static int tieBreakOrder(Object a, Object b) {
            int d;
            if (a == null || b == null ||
                (d = a.getClass().getName().
                 compareTo(b.getClass().getName())) == 0)
                d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
                     -1 : 1);
            return d;
        }

treeify
普通的二叉查找树的插入操作，关于比较上面已经分析过，插入后调用 balanceInsertion 进行红黑平衡，这属于红黑树算法这一部分——红黑树，这里就不再赘述了。

        final void treeify(Node<K,V>[] tab) {
            TreeNode<K,V> root = null;
            for (TreeNode<K,V> x = this, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                if (root == null) {
                    x.parent = null;
                    x.red = false;
                    root = x;
                }
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    for (TreeNode<K,V> p = root;;) {
                        int dir, ph;
                        K pk = p.key;
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);

                        TreeNode<K,V> xp = p;
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            root = balanceInsertion(root, x);
                            break;
                        }
                    }
                }
            }
            moveRootToFront(tab, root);
        }

如果从 left , tight , parent 的角度来看，这是一颗二叉树，如果从 prev, next 的角度则是一个双向链表，moveRootToFront 就是将 root 节点移到双向列表的头部，这并不会破坏树，确保树的根同时又是链的头。

find

在以当前节点为根的子树中找寻目标节点。二叉查找树的搜寻很简单，关键是由于HashMap的节点大小比较设计的很健壮，也就造成了搜索的代码有些绕。

总结一下代码的逻辑：首先比较hash，若相同则确认是否是要找寻的目标，不是则先看左或右是否为空，有一个为空的化就不需要甄别直接从非空一边继续查找，都不为空，则尝试使用 comparaTo 方法来比较大小(前提是二者皆是X类的实例 class X implements Comparable<X>)，不满足前提或是比较了仍为0，则从该位置的右子树中查找，若未搜索到则跳回到该位置从其左子树中进行搜索。

        final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
            TreeNode<K,V> p = this;//注意find是TreeNode的方法，this指的是当前树节点
            do {
                int ph, dir; K pk;
                TreeNode<K,V> pl = p.left, pr = p.right, q;
                if ((ph = p.hash) > h)
                    p = pl;
                else if (ph < h)
                    p = pr;
                else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                    return p;
                else if (pl == null)
                    p = pr;
                else if (pr == null)
                    p = pl;
                else if ((kc != null ||
                          (kc = comparableClassFor(k)) != null) &&
                         (dir = compareComparables(kc, k, pk)) != 0)
                    p = (dir < 0) ? pl : pr;
                else if ((q = pr.find(h, k, kc)) != null)
                    return q;
                else
                    p = pl;
            } while (p != null);
            return null;
        }

untreeify

replacementNode：根据 p 来构建一个新的 Node 节点，untreeify 利用它来生成Node链表节点。

    Node<K,V> replacementNode(Node<K,V> p, Node<K,V> next) {
        return new Node<>(p.hash, p.key, p.value, next);
    }

树的链化。就是沿着TreeNode 的 next指针遍历整个树构建 Node 节点链，返回链头。

        final Node<K,V> untreeify(HashMap<K,V> map) {
            Node<K,V> hd = null, tl = null;
            for (Node<K,V> q = this; q != null; q = q.next) {
                Node<K,V> p = map.replacementNode(q, null);
                if (tl == null)
                    hd = p;
                else
                    tl.next = p;
                tl = p;
            }
            return hd;
        }

由于我们TreeNode 始终保持着 next 指针，其链化操作很容易得以实现。

putTreeVal

代码实现是常规的二叉搜索树的插入实现。

关键在于对 hash 相同 key 不相同情况的处理：首先尝试使用Comparable来比较确定往左还是往右，若无法比较或是仍相等，导致无法确定左/右，则尝试从左子树中寻找，找不到就去右子树，找到说明已存在直接返回该节点，仍未找到调用tieBreakOrder，通过类名或对象的 hash 值来比较大小，最终你将得到一个确切的方向-左/右，沿着该方向找到确定的插入位置。

注意：方法采用无限 for 循环的方式来找寻插入位置，当初次遇到 hash 相同 key 不相同且无法利用Comparable确定搜寻方向时，会搜索左右子树来查找节点是否已存在，不存在则调用 tieBreakOrder 利用类名或hashcode来确定方向，若是接下来再次遇到这样的情况是没有必要再搜索的，直接调用 tieBreakOrder 确定方向。

当找到插入的位置，比如要插入到 p 的左侧：构建TreeNode节点 x ，加入树中，即 p.left = x，x.parent = p 到此树构建完毕，可是TreeNode 还维持了 prev及next 指针，维持着一个双向链表的视图，我们需要将 x 插入到 p 在双向链表中位置的后面。

最后调用 balanceInsertion 红黑平衡，该方法返回根节点 root，因为平衡可能会改变头节点，所以平衡后调用 moveRootToFront 保证红黑树的头节点同时也是双向链表的表头。

        final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
                                       int h, K k, V v) {
            Class<?> kc = null;
            boolean searched = false;
            TreeNode<K,V> root = (parent != null) ? root() : this;
            for (TreeNode<K,V> p = root;;) {
                int dir, ph; K pk;
                if ((ph = p.hash) > h)
                    dir = -1;
                else if (ph < h)
                    dir = 1;
                else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                    return p;
                else if ((kc == null &&
                          (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0) {
                    if (!searched) {
                        TreeNode<K,V> q, ch;
                        searched = true;
                        if (((ch = p.left) != null &&
                             (q = ch.find(h, k, kc)) != null) ||
                            ((ch = p.right) != null &&
                             (q = ch.find(h, k, kc)) != null))
                            return q;
                    }
                    dir = tieBreakOrder(k, pk);
                }

                TreeNode<K,V> xp = p;
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    Node<K,V> xpn = xp.next;
                    TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
                    if (dir <= 0)
                        xp.left = x;
                    else
                        xp.right = x;
                    xp.next = x;
                    x.parent = x.prev = xp;
                    if (xpn != null)
                        ((TreeNode<K,V>)xpn).prev = x;
                    moveRootToFront(tab, balanceInsertion(root, x));
                    return null;
                }
            }
        }

removeTreeNode

二叉查找树的删除操作就是找到删除节点右子树的最小节点，同最小节点的key与value替换删除节点的(红黑树多一步颜色的替换)，将问题转化为删除该子树的最小节点，该节点的左子树为null，这就降低了问题复杂性。

而HashMap的删除不同于一般红黑树，因为TreeNode 同时还是双向链表中的一个节点，所以不能像一般二叉查找树那样替换，会破坏链表。

HashMap采用的是将树中这两个节点进行换位，颜色也互换保证红黑平衡，并不改变二者在链表中的位置，互换后，删除节点此时的左子树是空的，将问题转化为了对左子树为空的节点的删除。

源码主要分三个部分：

第一部分：将删除节点从双向链表中删除。

        final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab,
                                  boolean movable) {
            int n;
            if (tab == null || (n = tab.length) == 0)
                return;
            int index = (n - 1) & hash;
            TreeNode<K,V> first = (TreeNode<K,V>)tab[index], root = first, rl;
            TreeNode<K,V> succ = (TreeNode<K,V>)next, pred = prev;
            if (pred == null) // 删除的是链表头节点
                tab[index] = first = succ;
            else
                pred.next = succ;
            if (succ != null)
                succ.prev = pred;
            if (first == null) // table该位置为空
                return;
            // table[index]处一定是链表的头节点，但却并不一定是树的根节点
            if (root.parent != null) 
                root = root.root();
            // 树太小，进行链表化，之后返回
            if (root == null || root.right == null ||
                (rl = root.left) == null || rl.left == null) {
                tab[index] = first.untreeify(map);  // too small
                return;
            }

第一部分就是将节点从双向链表中删除，注意边界的处理。

注意：table[index]处一定是链表的头节点，但却并不一定是树的根节点，虽然每次 putTreeVal 会保证树的根节点是链表的头节点，通过 moveRootToFront 方法，扩容拆分树的 split 方法调用 treeify 来重构树，而 treeify 会在最后调用 moveRootToFront，那么究竟是哪里造成了树的根节点与链表头节点的不相等呢？
就是 removeTreeNode 自身，该方法有个参数 movable，用来控制是否在删除节点后确保树的根节点与链的头节点的一致性，也就是是否调用moveRootToFront，关于究竟哪些方法传了true，哪些传了false，在所有调用该方法中只有一处传了 false，就是内部抽象类 HashIterator 的 remove 方法，之后分析。

若树过小——通过对树高度的探测来判断，则链表化后直接返回。

第二部分：将删除节点与其右子树最小节点互换，之后重新平衡树。

            TreeNode<K,V> p = this, pl = left, pr = right, replacement;
            if (pl != null && pr != null) {
                TreeNode<K,V> s = pr, sl;
                // s 为删除节点p 的右子树最小节点
                while ((sl = s.left) != null) // find successor
                    s = sl;
                // 交换颜色，保证s换到p位置后不会破坏红黑平衡
                boolean c = s.red; s.red = p.red; p.red = c; // swap colors
                
                // 下面代码就是将 p 与 s在树中的位置互换
                
                TreeNode<K,V> sr = s.right;
                TreeNode<K,V> pp = p.parent;
                if (s == pr) { // p 的右子树只有s这一个节点
                    p.parent = s;
                    s.right = p;
                }
                else {
                    TreeNode<K,V> sp = s.parent;
                    if ((p.parent = sp) != null) {
                        if (s == sp.left)
                            sp.left = p;
                        else
                            sp.right = p;
                    }
                    if ((s.right = pr) != null)
                        pr.parent = s;
                }
                p.left = null;
                if ((p.right = sr) != null)
                    sr.parent = p;
                if ((s.left = pl) != null)
                    pl.parent = s;
                if ((s.parent = pp) == null)
                    root = s;
                else if (p == pp.left)
                    pp.left = s;
                else
                    pp.right = s;
                if (sr != null)
                    replacement = sr;
                else
                    replacement = p;
            }
            // 这里对replacement的确定很重要，
            // replacement可能是p本身或是它不为空的那个子节点。
            // 对于是其本身我们会在红黑平衡后再删除它；否则我们先删除后平衡。
            else if (pl != null)
                replacement = pl;
            else if (pr != null)
                replacement = pr;
            else
                replacement = p;
            if (replacement != p) { // 非p本身则先删除后平衡
                TreeNode<K,V> pp = replacement.parent = p.parent;
                if (pp == null)
                    root = replacement;
                else if (p == pp.left)
                    pp.left = replacement;
                else
                    pp.right = replacement;
                p.left = p.right = p.parent = null;
            }

			// r 代表树的根节点，若p的颜色为红(说明交换前的s节点颜色为红)，
			// 则无需进行平衡，否则调用balanceDeletion进行平衡，还方法返回树的根节点
            TreeNode<K,V> r = p.red ? root : balanceDeletion(root, replacement);

            if (replacement == p) {  // 是其本身则在平衡后删除该节点
                TreeNode<K,V> pp = p.parent;
                p.parent = null;
                if (pp != null) {
                    if (p == pp.left)
                        pp.left = null;
                    else if (p == pp.right)
                        pp.right = null;
                }
            }
        }

第二部分主要是将删除节点 p 与其右子树最小节点 s 进行互换，包括颜色，之后进行红黑平衡，需要注意的地方都在上述代码中注释了，具体红黑树算法请看我关于红黑树的文章。

第三部分：

            // 参数movable控制是否在删除节点后确保树的根节点为链的头节点，
            // 因为删除操作可能会改变根节点
            if (movable)
                moveRootToFront(tab, r);

split

只被扩容方法 resize 调用，该方法作用是将红黑树拆分，同链的扩容一样，通过最高位的 bit 是否为0来拆分，比如大小16的数组，扩容后为32，bit 大小为16，原先 arr[ index ]处的节点，一部分仍留在原地，另一部分则被分到 arr[ 16+index ] 处，详见我分析HashMap的第一篇，关于数组为何维持2的倍数，一是 & 代替 % 提高效率，二是避免扩容时hash的重定位操作，文章里都有介绍。

// 拆分table[index]处的红黑树，bit用于区分；比如扩容为32，则bit为10000(即16)
        final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
            TreeNode<K,V> b = this;

            //用于构建留在原地arr[index]处的节点
            TreeNode<K,V> loHead = null, loTail = null;
            //用于构建arr[index+bit]处的节点
            TreeNode<K,V> hiHead = null, hiTail = null;
            int lc = 0, hc = 0;
            for (TreeNode<K,V> e = b, next; e != null; e = next) {
                next = (TreeNode<K,V>)e.next;
                e.next = null;
                if ((e.hash & bit) == 0) {
                    if ((e.prev = loTail) == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                    ++lc;
                }
                else {
                    if ((e.prev = hiTail) == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                    ++hc;
                }
            }
            // 到这里，从红黑树的视图来看树并没右便，从链表的视图来看
            // 已经分割成了两块链，接下来就是重新树化这两条链，当然若是太小则链化

            if (loHead != null) {
                if (lc <= UNTREEIFY_THRESHOLD) // 6
                    tab[index] = loHead.untreeify(map);
                else {
                    tab[index] = loHead;
                    if (hiHead != null) // (else is already treeified)
                        loHead.treeify(tab);
                }
            }
            if (hiHead != null) {
                if (hc <= UNTREEIFY_THRESHOLD)
                    tab[index + bit] = hiHead.untreeify(map);
                else {
                    tab[index + bit] = hiHead;
                    if (loHead != null)
                        hiHead.treeify(tab);
                }
            }
        }

TreeNode中剩下的 rotateLeft ，roatateRight，balanceInsertion，balanceDeletion 都是红黑树算法中关于红黑平衡的操作，这里就不赘述了，详见我的红黑树文章。

我们来看看HashMap如何使用这些TreeNode方法的：

put 调用 putVal ，方法在检测到插入位置处是红黑树时，调用 putTreeVal 方法，传进必要的参数信息。

            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);

若是检测到链表需要树化则调用 treeifyBin 方法，在该方法中将 Node 链表转化为TreeNode 链，之后头节点的 treeify 构造树。注意当table数组大小小于64时，treeifyBin 的调用不会进行树化而是调用 resize 进行扩容，这里的设计意图是这样的：当由于数组过小(<64)导致的hash冲突，我们选择扩容而不是树化，毕竟链的树化以及树的链化也是种消耗，而且TreeNode比Node更占空间。

get 调用 getNode，调用 getTreeNode 在红黑树中查找目标，底层调用的就是我们上面分析的 find 方法

                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);

remove 调用 removeNode，注意这里的 movable 参数传的是true，在removeNode方法中会调用 getTreeNode 获取目标，之后 removeTreeNode 删除树节点。

                if (node instanceof TreeNode)
                	// movable 为true
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);

movable 为true 说明我们在调用remove删除时，最后会保证根与链头的一致性。

resize 调用 split 来拆分树，传入新数组 newTab 及高位bit大小。

                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);

HashIterator

HashIterator 是从 table 数组 0 开始沿着链表的顺序遍历完数组的全部节点，遍历完一个位置就跳到下一个不为null的位置继续遍历，通过它我们可以遍历完所有节点。

上面在分析 removeTreeNode 是说当参数 movable 为false 时不会调用 moveRootToFront 来确保根与链头的一致性，而只有 HashIterator 的remove 方法这一处调用传了 false ，这是因为HashIterator 是按照链顺序迭代的，若是删除某个节点后根发生变化，moveRootToFront 会将新根节点从双向链表原位置处删除并插入到链头处，而新的根可能是还未迭代到的节点，这样会造成某些数据无法获取到，所以才不允许 moveRootToFront 方法的调用。

总结

到此HashMap 的三大部分已经分析完毕，若是觉得还不错的化就点个赞吧！
在这里插入图片描述

原文链接：https://blog.csdn.net/sinat_34976604/article/details/90404647