caffeine 淘汰策略 – 源码巴士

Caffeine在开启了淘汰策略的时候，维护了三条队列，分别是eden，protected，probation队列。具体的三条队列大小分配方案如下所示，

long max = Math.min(maximum, MAXIMUM_CAPACITY);
long eden = max - (long) (max * PERCENT_MAIN);
long mainProtected = (long) ((max - eden) * PERCENT_MAIN_PROTECTED);

其中，eden为总大小的百分之1，当总大小为1000的时候，eden也只有10的长度，而protected则占剩下百分之99的百分之80，也就是800，剩下的百分之20则为probation队列，大小则为190。

当数据刚添加的时候将会将其加入到eden队列，eden队列中的数据不会具体参与到淘汰的过程中，保证在lfu算法下新进入的数据不会由于刚进入而被淘汰。

淘汰的具体方法为evictEntries()方法。

void evictEntries() {
  if (!evicts()) {
    return;
  }
  int candidates = evictFromEden();
  evictFromMain(candidates);
}

其中，一共分为两步，第一步则通过evictFromEden()方法将eden队列中的元素个数平衡到eden的最大数量以下，并将移出的元素放到probation队列中，而后通过evictFromMain()方法开始从probation和protected淘汰数据。

int evictFromEden() {
  int candidates = 0;
  Node<K, V> node = accessOrderEdenDeque().peek();
  while (edenWeightedSize() > edenMaximum()) {
    // The pending operations will adjust the size to reflect the correct weight
    if (node == null) {
      break;
    }

    Node<K, V> next = node.getNextInAccessOrder();
    if (node.getWeight() != 0) {
      node.makeMainProbation();
      accessOrderEdenDeque().remove(node);
      accessOrderProbationDeque().add(node);
      candidates++;

      lazySetEdenWeightedSize(edenWeightedSize() - node.getPolicyWeight());
    }
    node = next;
  }

  return candidates;
}

evictFromEden()方法的实现如上，逻辑很简单，当eden元素个数大于总量的百分之一的时候，从eden队首获取元素，并将其移到probation的队尾重复该操作，直到eden剩余数量小于预设的eden长度为止。该方法会返回从eden移入到probation的个数，这几个新加入probation的成员会优先与probation的队首键进行频度比较。

而后，通过evictFromMain()方法开始正式进行淘汰，其中，优先会从probation中将数据进行淘汰，直到probation队列为空才会从protected队列中获取元素尝试淘汰，而当protected也而无法获取元素，则会从protected队列中进行获取。probation中的元素被外部访问一次就会被晋升到protected队列。

int victimQueue = PROBATION;
Node<K, V> victim = accessOrderProbationDeque().peekFirst();
Node<K, V> candidate = accessOrderProbationDeque().peekLast();

evictFromMain()方法很长，首先，会从probation中获取队首和队尾两个元素，其中，队尾的元素为刚刚上一步从eden移入到probation的元素。

if ((candidate == null) && (victim == null)) {
  if (victimQueue == PROBATION) {
    victim = accessOrderProtectedDeque().peekFirst();
    victimQueue = PROTECTED;
    continue;
  } else if (victimQueue == PROTECTED) {
    victim = accessOrderEdenDeque().peekFirst();
    victimQueue = EDEN;
    continue;
  }

  // The pending operations will adjust the size to reflect the correct weight
  break;
}
if ((victim != null) && (victim.getPolicyWeight() == 0)) {
  victim = victim.getNextInAccessOrder();
  continue;
} else if ((candidate != null) && (candidate.getPolicyWeight() == 0)) {
  candidate = candidate.getPreviousInAccessOrder();
  candidates--;
  continue;
}

首先，如果队首队尾都无法正常获取元素，则会变更淘汰处理的队列，变更顺序已经有提，并略过0权重下的数据。

if (victim == null) {
  candidates--;
  Node<K, V> evict = candidate;
  candidate = candidate.getPreviousInAccessOrder();
  evictEntry(evict, RemovalCause.SIZE, 0L);
  continue;
} else if (candidate == null) {
  Node<K, V> evict = victim;
  victim = victim.getNextInAccessOrder();
  evictEntry(evict, RemovalCause.SIZE, 0L);
  continue;
}

如果队首队尾元素有一个为null，那么就直接淘汰其中不为null的元素。

K victimKey = victim.getKey();
K candidateKey = candidate.getKey();
if (victimKey == null) {
  Node<K, V> evict = victim;
  victim = victim.getNextInAccessOrder();
  evictEntry(evict, RemovalCause.COLLECTED, 0L);
  continue;
} else if (candidateKey == null) {
  candidates--;
  Node<K, V> evict = candidate;
  candidate = candidate.getPreviousInAccessOrder();
  evictEntry(evict, RemovalCause.COLLECTED, 0L);
  continue;
}

之后，如果其中任意一个元素的key已经被回收，那么也直接就淘汰掉。

candidates--;
if (admit(candidateKey, victimKey)) {
  Node<K, V> evict = victim;
  victim = victim.getNextInAccessOrder();
  evictEntry(evict, RemovalCause.SIZE, 0L);
  candidate = candidate.getPreviousInAccessOrder();
} else {
  Node<K, V> evict = candidate;
  candidate = candidate.getPreviousInAccessOrder();
  evictEntry(evict, RemovalCause.SIZE, 0L);
}

最后，才采用admit()方法通过lfu算法进行淘汰，其中，lfu的具体实现在admit()方法中。

boolean admit(K candidateKey, K victimKey) {
  int victimFreq = frequencySketch().frequency(victimKey);
  int candidateFreq = frequencySketch().frequency(candidateKey);
  if (candidateFreq > victimFreq) {
    return true;
  } else if (candidateFreq <= 5) {
    // The maximum frequency is 15 and halved to 7 after a reset to age the history. An attack
    // exploits that a hot candidate is rejected in favor of a hot victim. The threshold of a warm
    // candidate reduces the number of random acceptances to minimize the impact on the hit rate.
    return false;
  }
  int random = ThreadLocalRandom.current().nextInt();
  return ((random & 127) == 0);
}

frequecy()方法来具体获得两个key的频度，其中具体的实现算法介绍在上一篇博文中已经提及。

如果队尾元素的频度大于队首，那么就直接淘汰队首，当队尾频度小于等于队首，且频度小于5的时候，直接将其淘汰，否则，通过随机的方式进行淘汰任意一个键。

这里的5的选择，与caffeine底层具体的lfu实现有关，因为底层通过4位记录一个频度，最大值是15，老的键可能由于超过15，导致频度减半变为7，所以队尾新加入的元素如果小于5会直接淘汰。

重复以上流程，淘汰到数据总量小于设定的最大容量。

原文链接：https://blog.csdn.net/weixin_40318210/article/details/106044398