Efficiently Filtering Out the Elements That Differ Between Two Lists

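Problem:

During development I needed to pick out the elements that differ between two Lists, both of which are large. The usual approach, iterating over both Lists and comparing every element of one against every element of the other, costs on the order of m*n comparisons (m and n being the two list sizes), which is clearly too slow.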
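
For contrast, the naive approach described above might look like the following sketch. The names `NaiveDiff` and `diff` are illustrative and do not come from the original code. Each `contains` call on a List is a linear scan, which is where the m*n cost comes from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical baseline, shown only to illustrate the m*n cost.
class NaiveDiff {

    // Returns the elements that appear in exactly one of the two lists.
    static <T> List<T> diff(List<T> a, List<T> b) {
        List<T> result = new ArrayList<>();
        for (T x : a) {
            if (!b.contains(x)) { // linear scan of b for every element of a
                result.add(x);
            }
        }
        for (T y : b) {
            if (!a.contains(y)) { // linear scan of a for every element of b
                result.add(y);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(3, 4, 5);
        System.out.println(diff(a, b)); // prints [1, 2, 5]
    }
}
```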

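Approach:

Use a Map to count the elements of the two Lists:

Each List element becomes a Map key, and the Integer value of its entry records how many times the element appears across the two collections.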

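Solution:

First iterate over all the elements of one List and put each into the Map with an initial count of 1.

Then iterate over all the elements of the second List and check each against what is already in the Map:

If the Map does not contain the element, add it to the result set.

If the Map does contain the element, set its count to 2.

Finally, any key whose count is still 1 appears only in the first List, so it is added to the result set as well.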

Code example:

Sample entity class Product:

import java.util.Objects;

public class Product {

	private Integer id;
	
	private String name;

	public Product(Integer id, String name) {
		this.id = id;
		this.name = name;
	}

	public Integer getId() {
		return id;
	}

	public String getName() {
		return name;
	}
	
	@Override
	public String toString() {
		return "Product [id=" + id + ", name=" + name + "]";
	}
	
	// Logical equality: two products are equal when both id and name match.
	@Override
	public boolean equals(Object o){
		if (this == o) {
			return true;
		}
		if (!(o instanceof Product)) {
			return false; // also covers o == null
		}
		Product p = (Product) o;
		// Objects.equals compares by value and is null-safe;
		// == on two Integer objects would compare references.
		return Objects.equals(id, p.id) && Objects.equals(name, p.name);
	}
	
	// Must agree with equals(): built from the same fields; a null field contributes 0.
	@Override
	public int hashCode(){
		int result = 17;
		result = result*37 + (id == null ? 0 : id.hashCode());
		result = result*37 + (name == null ? 0 : name.hashCode());
		return result;
	}
}

Sample solution method:

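// Requires java.util.ArrayList, java.util.Collection, java.util.HashMap and java.util.Map.
public static Collection<Product> getDiffrent(Collection<Product> col1, Collection<Product> col2){
		// Result to return
		Collection<Product> diffrentResult = new ArrayList<>();
		// Work out which collection is larger; the larger one is loaded into the map first,
		// so the lookup loop below runs over the smaller one and does fewer checks
		Collection<Product> bigCol = null;
		Collection<Product> smallCol = null;
		if (col1.size() > col2.size()) {
			bigCol = col1;
			smallCol = col2;
		}else {
			bigCol = col2;
			smallCol = col1;
		}
		// Map<element, occurrence count>, pre-sized from the larger collection to cut down on resizing
		// (HashMap may still resize once the entry count passes capacity * 0.75, the default load factor)
		Map<Product, Integer> map = new HashMap<>(bigCol.size());
		// Put every element of the larger collection into the map with an initial count of 1
		for(Product p : bigCol) {
			map.put(p, 1);
		}
		// Walk the smaller collection: an element missing from the map goes straight into the result,
		// an element already in the map gets its count set to 2
		for(Product p : smallCol) {
			if (map.get(p) == null) {
				diffrentResult.add(p);
			}else {
				map.put(p, 2);
			}
		}
		// Collect the entries whose count is still 1 and add their keys to the result
		for(Map.Entry<Product, Integer> entry : map.entrySet()) {
			if (entry.getValue() == 1) {
				diffrentResult.add(entry.getKey());
			}
		}
		
		return diffrentResult;
	}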

Test code:

public static void main(String[] args) {
		List<Product> list1 = new ArrayList<>();
		List<Product> list2 = new ArrayList<>();
		for (int i = 0; i < 10; i++) {
			list1.add(new Product(i, "Product"+String.valueOf(i)));
		}
		for (int i = 0; i < 10; i = i + 2) {
			list2.add(new Product(i, "Product"+String.valueOf(i)));
		}
		Collection<Product> result = getDiffrent(list1, list2);
		for(Product p : result) {
			System.out.println(p.toString());
		}
	}

Test output (the order follows HashMap iteration order, which is not guaranteed):

Product [id=7, name=Product7]
Product [id=1, name=Product1]
Product [id=9, name=Product9]
Product [id=3, name=Product3]
Product [id=5, name=Product5]
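
The test above only exercises the case where list2 is a subset of list1. As a hedged sketch, the same counting idea can be written generically and run on input where each side has elements of its own. The names `SymmetricDiffDemo` and `symmetricDiff` are hypothetical, and the use of LinkedHashMap is an assumption made here so that the output order is predictable; the size comparison from the sample method is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic variant of the counting approach; names are illustrative.
class SymmetricDiffDemo {

    static <T> List<T> symmetricDiff(Collection<T> first, Collection<T> second) {
        List<T> result = new ArrayList<>();
        // LinkedHashMap (an assumption, not in the original) keeps insertion order.
        Map<T, Integer> counts = new LinkedHashMap<>();
        for (T item : first) {
            counts.put(item, 1);
        }
        for (T item : second) {
            if (counts.containsKey(item)) {
                counts.put(item, 2);    // seen in both collections
            } else {
                result.add(item);       // only in the second collection
            }
        }
        for (Map.Entry<T, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                result.add(e.getKey()); // only in the first collection
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c", "d");
        List<String> b = Arrays.asList("c", "d", "e");
        System.out.println(symmetricDiff(a, b)); // prints [e, a, b]
    }
}
```

Elements found only in the second collection come out first because they are added during the lookup loop; elements found only in the first collection follow when the map is scanned.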

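Problem encountered along the way:

Using a custom class as the Map key inevitably raises one issue:

By default, get cannot find the entry for a key when it is handed a different instance that is only logically equal, because the equals() and hashCode() inherited from Object are identity-based.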
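
The lookup failure can be reproduced with a small sketch. `KeyLookupDemo` and `PlainKey` are hypothetical names introduced for illustration; `PlainKey` deliberately does not override equals() or hashCode().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of the original code.
class KeyLookupDemo {

    // A key type that does NOT override equals()/hashCode().
    static class PlainKey {
        final int id;
        PlainKey(int id) { this.id = id; }
    }

    // Puts one key, then looks it up through a second instance with the same id.
    static boolean foundByEqualInstance() {
        Map<PlainKey, Integer> map = new HashMap<>();
        map.put(new PlainKey(1), 1);
        return map.get(new PlainKey(1)) != null;
    }

    // Looks the entry up through the very same instance that was inserted.
    static boolean foundBySameInstance() {
        Map<PlainKey, Integer> map = new HashMap<>();
        PlainKey key = new PlainKey(1);
        map.put(key, 1);
        return map.get(key) != null;
    }

    public static void main(String[] args) {
        System.out.println(foundByEqualInstance()); // prints false
        System.out.println(foundBySameInstance());  // prints true
    }
}
```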

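Solution:

Start from the source of HashMap's get method (this listing comes from an older JDK, before the JDK 8 rewrite of HashMap):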

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

Now look at the source of the getEntry method:

  final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

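As the source shows, when HashMap looks up a key it locates the entry using the key's hashCode value and its equals method, so the custom class needs to override both equals() and hashCode().

Since only logical equality matters here, the overridden equals() just needs to check that the corresponding field values are equal: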

	// Uses java.util.Objects (import java.util.Objects;)
	@Override
	public boolean equals(Object o){
		if (this == o) {
			return true;
		}
		if (!(o instanceof Product)) {
			return false; // also covers o == null
		}
		Product p = (Product) o;
		// Objects.equals compares by value and is null-safe;
		// == on two Integer objects would compare references.
		return Objects.equals(id, p.id) && Objects.equals(name, p.name);
	}
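
One detail worth keeping in mind when an equals() method compares Integer fields: `==` on two Integer objects compares references, not values, so it only appears to work for small ids that fall inside the boxed-integer cache (-128 to 127 by default). The sketch below, with the hypothetical names `BoxedIdCompare` and `sameId`, shows a value-based, null-safe comparison using `java.util.Objects.equals`.

```java
import java.util.Objects;

// Hypothetical demo; class and method names are illustrative.
class BoxedIdCompare {

    // Null-safe value comparison of two boxed ids.
    static boolean sameId(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        Integer a = Integer.valueOf(1000);
        Integer b = Integer.valueOf(1000);
        // == compares references; outside the default cache range the two
        // boxes are usually different objects, so the result is not reliable.
        System.out.println(a == b);
        System.out.println(sameId(a, b));       // prints true
        System.out.println(sameId(null, null)); // prints true
        System.out.println(sameId(a, null));    // prints false
    }
}
```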

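Overriding hashCode()

This follows a simple, general-purpose hashCode recipe from Effective Java:

1. Initialize an int variable with a non-zero constant, for example int result = 17;

2. Take every field that equals() compares and compute an int value for each field f:

(1) boolean: f ? 1 : 0

(2) byte/char/short/int: (int)f

(3) long: (int)(f ^ (f >>> 32))

(4) float: Float.floatToIntBits(f)

(5) double: Double.doubleToLongBits(f), which returns a long; then apply rule (3) to turn that long into an int

(6) Object reference: if equals() compares this field by calling equals() recursively, call hashCode() on the field recursively as well. Otherwise compute a canonical representation for the field; for example, when the field is null, use 0.

(7) Array: treat each element as a separate field. On JDK 1.5 or later there is no need to loop over the array yourself: java.util.Arrays.hashCode has overloads for the eight primitive array types and for reference arrays, and follows the same scheme.

3. Fold each computed value c into the result with result = 37 * result + c, and return result at the end.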

	@Override
	public int hashCode(){
		int result = 17;
		// A null field contributes 0, as in rule (6)
		result = result*37 + (id == null ? 0 : id.hashCode());
		result = result*37 + (name == null ? 0 : name.hashCode());
		return result;
	}
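
Product only has an Integer and a String field, so most of the per-type rules are not exercised by it. The sketch below applies them to a hypothetical class `Sku` whose fields were invented purely for illustration, with the multiplier 37 kept to match the code above.

```java
import java.util.Arrays;

// Hypothetical class, used only to exercise the per-type rules.
class Sku {
    private final boolean active; // rule (1)
    private final int stock;      // rule (2)
    private final long barcode;   // rule (3)
    private final double price;   // rule (5)
    private final String label;   // rule (6), may be null
    private final int[] sizes;    // rule (7), may be null

    Sku(boolean active, int stock, long barcode, double price, String label, int[] sizes) {
        this.active = active;
        this.stock = stock;
        this.barcode = barcode;
        this.price = price;
        this.label = label;
        this.sizes = sizes;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Sku)) return false;
        Sku s = (Sku) o;
        return active == s.active
                && stock == s.stock
                && barcode == s.barcode
                && Double.compare(price, s.price) == 0
                && (label == null ? s.label == null : label.equals(s.label))
                && Arrays.equals(sizes, s.sizes);
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 37 * result + (active ? 1 : 0);
        result = 37 * result + stock;
        result = 37 * result + (int) (barcode ^ (barcode >>> 32));
        long priceBits = Double.doubleToLongBits(price);
        result = 37 * result + (int) (priceBits ^ (priceBits >>> 32));
        result = 37 * result + (label == null ? 0 : label.hashCode());
        result = 37 * result + Arrays.hashCode(sizes); // returns 0 for a null array
        return result;
    }

    public static void main(String[] args) {
        Sku a = new Sku(true, 5, 4006381333931L, 9.99, "pen", new int[] {1, 2});
        Sku b = new Sku(true, 5, 4006381333931L, 9.99, "pen", new int[] {1, 2});
        System.out.println(a.equals(b));                  // prints true
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```

The point to check is that hashCode() uses exactly the fields equals() compares, so that two equal objects always land in the same bucket.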

With that the problem is solved. Overall, this is a solution that trades space for time.

Original link: https://blog.csdn.net/qq_35235028/article/details/78553514