Java 8 New Features: Stream Operations

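Stream operations are one of the most important new features in Java 8: powerful and used all the time. Arguably every Java coder has to master them. This time the theory comes last, because you may not need that much detail; you may just want to look up a particular operation, and its name alone often tells you roughly what it does.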

Here is a simple example: given the students' information and scores, compute the class average.

// The old way
public static double avgScore(List<Student> stus){
    int sum=0;
    for (Student student : stus) {
        int core = student.core;
        sum += core;
    }
    return  ((double)sum / stus.size());
}
// With Stream
public static Double avgScore(List<Student> stus){
        OptionalDouble average = stus.stream().mapToDouble(stu -> stu.core).average();
        // getAsDouble() throws NoSuchElementException if stus is empty
        return average.getAsDouble();
    }
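The snippets in this article rely on a Student class that is never shown. The following is a minimal sketch of what it might look like, with the field and accessor names inferred from the calls used here (an assumption, not the author's actual class), together with an average that tolerates an empty list:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AvgScoreDemo {
    // Hypothetical Student; names are inferred from the article's snippets.
    static class Student {
        String name;
        int core; // the score

        Student(String name, int core) { this.name = name; this.core = core; }

        String getName() { return name; }
        int getCore() { return core; }
        void setCore(int core) { this.core = core; }

        @Override
        public String toString() { return name + ":" + core; }
    }

    // average() yields an empty OptionalDouble for an empty stream,
    // so orElse supplies a fallback instead of getAsDouble() throwing.
    static double avgScore(List<Student> stus) {
        return stus.stream().mapToInt(Student::getCore).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Student> stus = Arrays.asList(new Student("a", 80), new Student("b", 90));
        System.out.println(avgScore(stus));                              // 85.0
        System.out.println(avgScore(Collections.<Student>emptyList())); // 0.0
    }
}
```

Whether 0.0 is a sensible fallback depends on the caller; returning the OptionalDouble itself is another option.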

Some stream operations:

filter

filter returns a stream containing only the elements that satisfy the given predicate.

// Print the students in the list whose score is greater than 80
public static void  sFilter(List<Student> stus){
        stus.stream()
                .filter(student -> student.getCore()>80)
                .forEach(System.out::println);
    }

distinct

distinct returns a stream of the distinct elements, using Object.equals(Object) to decide whether two elements are the same.

// Find the distinct objects in the list
public static void  sDistinct(List<Student> stus){
        stus.stream()
                .distinct()
                .forEach(System.out::println);
    }
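Because distinct compares with equals, a class that does not override it falls back to identity comparison, and two students with the same data count as different. A minimal sketch, assuming a hypothetical Student whose equality is defined by name and score:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class DistinctDemo {
    // Hypothetical Student; equality on name and core is an assumption of this sketch.
    static class Student {
        final String name;
        final int core;

        Student(String name, int core) { this.name = name; this.core = core; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Student)) return false;
            Student other = (Student) o;
            return core == other.core && name.equals(other.name);
        }

        // equals and hashCode must be overridden together.
        @Override
        public int hashCode() { return Objects.hash(name, core); }

        @Override
        public String toString() { return name + ":" + core; }
    }

    static List<Student> unique(List<Student> stus) {
        return stus.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Student> stus = Arrays.asList(
                new Student("a", 80), new Student("a", 80), new Student("b", 90));
        System.out.println(unique(stus)); // [a:80, b:90]
    }
}
```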

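map, peek and forEach compared

map applies a function to every element of the stream and returns the result as the new element (the type may stay the same), so it is used for element conversion.

peek is mostly used for debugging. It runs a Consumer on each element as it passes through, but the Consumer's result is not put back into the stream; the same elements continue downstream.

forEach resembles map and peek in that it visits every element, but it returns void rather than a stream, so it is a terminal operation.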

// map: collect the names from the Student list into a new list
public static void  sMap(List<Student> stus){
        List<String> collect = stus.stream()
                .map(student -> student.getName())
                .collect(Collectors.toList());
        stus.forEach(System.out::println);
        collect.forEach(System.out::println);
    }
// forEach: add 1 to the score of every Student in the list
public static void  sForEach(List<Student> stus){
        stus.stream()
                .forEach(student -> {
                    student.setCore(student.getCore()+1);
                });
        stus.forEach(System.out::println);
    }

The difference between peek and map is illustrated here; the theory section later covers the underlying principles.

// peek
IntStream.range(1,5).boxed().peek(i-> {
            i=i+1;
        }).forEach(System.out::print);
// Output: 1234
// map
IntStream.range(1,5).boxed().map(i-> i+1)
    .forEach(System.out::print);
// Output: 2345

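As you can see, adding 1 to i inside peek only reassigns the lambda's local parameter; the elements flowing through the stream are unchanged.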
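Since peek is described as a debugging aid, a small sketch of that use may help. It records what passes through the pipeline before and after a map step; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PeekDemo {
    // Records what peek observes before and after map, then appends the final result.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> result = Stream.of(1, 2, 3)
                .peek(i -> log.add("in:" + i))
                .map(i -> i * 10)
                .peek(i -> log.add("out:" + i))
                .collect(Collectors.toList());
        log.add("result:" + result);
        return log;
    }

    public static void main(String[] args) {
        // [in:1, out:10, in:2, out:20, in:3, out:30, result:[10, 20, 30]]
        System.out.println(trace());
    }
}
```

The trace also shows that a sequential stream pushes each element through the whole pipeline before moving on to the next one.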

flatMap

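flatMap flattens a stream. It is similar to map, but its mapping function returns a Stream for each element, whereas map's function returns a single object. flatMap then merges the elements of all the returned Streams into one resulting stream.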

Map<String, List<Integer>> map = new LinkedHashMap<>();
map.put("a", Arrays.asList(1, 2, 3));
map.put("b", Arrays.asList(4, 5, 6));

List<Integer> allValues = map.values() // Collection<List<Integer>>
        .stream()                      // Stream<List<Integer>>
        .flatMap(List::stream)         // Stream<Integer>
        .collect(Collectors.toList());

System.out.println(allValues);

In this example, the lists in map.values() are flattened into a single Stream.
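Another common use, sketched here with made-up names, is splitting each line of text into words and collecting all the words into one list:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Each line becomes a Stream<String> of words; flatMap merges them into one stream.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", "c"))); // [a, b, c]
    }
}
```

With map instead of flatMap, the result would be a stream of String[] arrays rather than a stream of words.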

sorted

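sorted returns a stream with the elements sorted; the source itself is not modified. The no-argument sorted() uses natural ordering, so the elements must implement Comparable; sorted(Comparator) takes an explicit comparator instead.

If the elements do not implement Comparable, sorted() may throw java.lang.ClassCastException when the terminal operation runs.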

// Sort by score, highest first
public static void sSorted(List<Student> stus) {
        List<Student> collect = stus.stream()
            .sorted(Comparator.comparingInt(Student::getCore).reversed())
            .collect(Collectors.toList());
        collect.forEach(System.out::println);
    }
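To make the Comparable requirement concrete, here is a small sketch with a hypothetical Box class that does not implement Comparable. The no-argument sorted() fails at the terminal operation, while sorted(Comparator) works:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedDemo {
    // Hypothetical element type that does NOT implement Comparable.
    static class Box {
        final int v;
        Box(int v) { this.v = v; }
    }

    // sorted() compiles, but fails when the terminal operation runs.
    static boolean failsWithoutComparable() {
        try {
            Stream.of(new Box(2), new Box(1)).sorted().collect(Collectors.toList());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Supplying a Comparator removes the need for Comparable.
    static List<Integer> sortedByComparator() {
        return Stream.of(new Box(2), new Box(1))
                .sorted(Comparator.comparingInt((Box b) -> b.v))
                .map(b -> b.v)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutComparable()); // true
        System.out.println(sortedByComparator());     // [1, 2]
    }
}
```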

limit

limit is a short-circuiting operation. It returns a stream truncated to at most the given number of elements.

public static void sLimit(List<Student> stus) {
        List<Student> collect = stus.stream()
                .sorted(Comparator.comparingInt(Student::getCore).reversed())
                .limit(3)
                .collect(Collectors.toList());
        collect.forEach(System.out::println);
    }

boxed

boxed converts a primitive stream into a stream of boxed values, that is, it turns the primitive types int, long and double into Integer, Long and Double.

// Compute the average
OptionalDouble average = IntStream.range(1, 10).average();
System.out.println(average.getAsDouble());
// Inside peek, i is an Integer
IntStream.range(1,5).boxed().peek(i-> {
            i=i+1;
        }).forEach(System.out::print);

skip

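skip discards the first n elements of the stream and returns a new stream with the rest. If the stream has n or fewer elements, an empty stream is returned.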

 public static void sSkip(List<Student> stus){
        List<Student> collect = stus.stream()
                .limit(3)
                .skip(2)
                .collect(Collectors.toList());
        collect.forEach(System.out::println);
    }
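The order of skip and limit matters. One typical combination is simple paging, sketched below with a hypothetical helper (pageIndex is zero-based): skip the earlier pages first, then limit to one page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PageDemo {
    // Hypothetical helper: returns the zero-based page pageIndex of size pageSize.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize) // drop the earlier pages
                .limit(pageSize)                   // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(items, 1, 3)); // [4, 5, 6]
        System.out.println(page(items, 2, 3)); // [7]
        System.out.println(page(items, 3, 3)); // []
    }
}
```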

reduce

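reduce combines the elements of a stream into a single result. Its accumulator function takes two arguments:

pre: the result returned by the previous step
cur: the current element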

   Optional<Integer> reduce = IntStream.range(1,5).boxed().reduce((pre,cur)->{
            return pre+cur;
        });
        System.out.println(reduce.get());

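reduce has several overloads. In brief:

// Uses the first element of the stream as the initial value
public Optional<T> reduce(BinaryOperator<T> accumulator)
// Uses the supplied identity as the initial value
public T reduce(T identity, BinaryOperator<T> accumulator)
// Same as above, but partial results are merged with combiner; the result type U may differ from the element type T
public <U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

Note that accumulator should be associative.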
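The three overloads can be compared side by side. This is a minimal sketch with made-up method names; the third variant reduces a stream of strings to an int, which is the case where the result type differs from the element type:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ReduceDemo {
    // No identity: the result is empty when the stream is empty.
    static Optional<Integer> sumNoIdentity(List<Integer> xs) {
        return xs.stream().reduce(Integer::sum);
    }

    // With identity: an empty stream yields the identity itself.
    static int sumWithIdentity(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    // Accumulator adds a word's length to the running total;
    // combiner merges two partial totals (relevant for parallel runs).
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (len, w) -> len + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumNoIdentity(Arrays.asList(1, 2, 3, 4)));   // Optional[10]
        System.out.println(sumWithIdentity(Arrays.asList(1, 2, 3, 4))); // 10
        System.out.println(totalLength(Arrays.asList("ab", "cde")));    // 5
    }
}
```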

match

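match has three methods:

allMatch
Returns true if every element of the stream satisfies the predicate, otherwise false. It always returns true for an empty stream.
anyMatch
Returns true if at least one element of the stream satisfies the predicate.
noneMatch
Returns true if no element of the stream satisfies the predicate.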

public static void sMatch(List<Student> stus){
        boolean b = stus.stream().allMatch(stu -> stu.getCore() > 80);
        System.out.println(b);
        b = stus.stream().anyMatch(stu -> stu.getCore() > 90);
        System.out.println(b);
        b = stus.stream().noneMatch(stu -> stu.getCore() > 80);
        System.out.println(b);
    }
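The empty-stream behaviour is easy to get wrong, so here is a short sketch (class and method names are illustrative) that evaluates all three methods on an empty list:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchDemo {
    // Returns {allMatch, anyMatch, noneMatch} evaluated on an empty stream.
    static boolean[] onEmpty() {
        List<Integer> empty = Collections.emptyList();
        return new boolean[] {
            empty.stream().allMatch(x -> x > 80),
            empty.stream().anyMatch(x -> x > 80),
            empty.stream().noneMatch(x -> x > 80)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(onEmpty())); // [true, false, true]
    }
}
```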

count

count returns the number of elements in the stream. In Java 8 it is implemented as

mapToLong(e->1L).sum();
// where sum is a method of the primitive stream

collect

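collect is used very often. It gathers the elements of the stream as specified by its arguments and returns the assembled object.

// Collect into a List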
List<String> asList = stringStream.collect(
    ArrayList::new, 
    ArrayList::add,
    ArrayList::addAll
); 
// Collect into a String
String concat = stringStream.collect(
    StringBuilder::new, 
    StringBuilder::append,
    StringBuilder::append)
.toString();
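In everyday code the one-argument collect(Collector) overload with the ready-made collectors in java.util.stream.Collectors is more common than the three-argument form. A minimal sketch with made-up method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectDemo {
    // Joins the words with a separator.
    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(", "));
    }

    // Groups the words by their length.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "three");
        System.out.println(joined(words));          // one, two, three
        System.out.println(byLength(words).get(3)); // [one, two]
    }
}
```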

find

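findAny
Returns any element of the stream. On a sequential stream the effect is hard to see, because in practice it usually returns the first element.
findFirst
Returns the first element of the stream.

// If the stream is empty, an empty Optional is returned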
Optional<Student> any = stus.stream().findAny();
Optional<Student> first = stus.stream().findFirst();

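max, min

max: returns the largest element of the stream
min: returns the smallest element of the stream

Object streams (including boxed ones) require you to supply a comparator; primitive streams do not.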
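A short sketch of the difference, using illustrative names: the primitive stream needs no comparator and returns an OptionalInt, while the object stream takes a Comparator and returns an Optional.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

class MaxMinDemo {
    // Primitive stream: no comparator needed.
    static int maxOf(int... xs) {
        return IntStream.of(xs).max().getAsInt();
    }

    static int minOf(int... xs) {
        return IntStream.of(xs).min().getAsInt();
    }

    // Object stream: a comparator is required.
    static String longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length)).get();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 9, 4));                          // 9
        System.out.println(minOf(3, 9, 4));                          // 3
        System.out.println(longest(Arrays.asList("aa", "b", "cccc"))); // cccc
    }
}
```

getAsInt() and get() throw on an empty stream, so real code would normally check the Optional first.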

toArray()

Puts the elements of the stream into an array.

concat

Stream.concat joins two streams of the same element type into one.
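Neither toArray nor concat has an example so far, so here is a brief sketch of both (class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcatDemo {
    // Elements of the first stream come before those of the second.
    static List<String> joinLists(List<String> first, List<String> second) {
        return Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toList());
    }

    // The generator String[]::new gives a typed array instead of Object[].
    static String[] asArray(List<String> words) {
        return words.stream().toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(joinLists(Arrays.asList("a", "b"), Arrays.asList("c"))); // [a, b, c]
        System.out.println(asArray(Arrays.asList("x", "y")).length);                // 2
    }
}
```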

A Cup of Theory

The theory below is compiled from the Javadoc.

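Basics

How streams differ from concrete collections:

No storage
A stream operates on a data source but does not store elements itself; it is gone once it has been used (the caller never notices). It carries elements from the source through a pipeline of operations.
Functional in nature
A stream operation does not modify its source; filter and map, for example, leave the source data untouched.
Laziness
Many stream operations, such as the intermediate operations filter and map, are lazy: they only record what is to be done, and nothing runs until a terminal operation executes the recorded steps in order.
It is like ordering food: nothing is cooked while you are still choosing, and the kitchen starts only once the order is confirmed.
Possibly unbounded
Even on an infinite stream, some operations can finish in finite time. limit(n) and findFirst() are short-circuiting: they can return after visiting a finite number of elements.
Consumable
The elements of a stream can be visited only once, as with an Iterator. There is no going back; to visit the elements again from the start you must create a new stream.
In other words, each step of a pipeline works on the result of the previous step, and the first step works on the source data.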
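Laziness, short-circuiting and single use can all be observed directly. The sketch below (names are illustrative) logs when the filter actually runs and shows that a second terminal operation on the same stream fails:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyDemo {
    // Logs when the filter predicate actually executes.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Stream<Integer> s = Stream.of(1, 2, 3)
                .filter(i -> { log.add("filter:" + i); return i > 1; });
        log.add("built");   // nothing has been filtered yet
        s.findFirst();      // terminal operation; stops at the first match
        log.add("done");
        return log;
    }

    // A stream can be consumed only once.
    static boolean reuseFails() {
        Stream<Integer> s = Stream.of(1, 2, 3);
        s.count();
        try {
            s.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());        // [built, filter:1, filter:2, done]
        System.out.println(reuseFails()); // true
    }
}
```

Element 3 never reaches the filter, because findFirst returns as soon as 2 passes.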

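Stream operations are chained into a pipeline. A stream pipeline consists of a data source, followed by zero or more intermediate operations, and ends with a terminal operation.

Besides Stream, there are specializations for the primitives int, long and double: IntStream, LongStream and DoubleStream. This article calls them primitive streams. They operate on primitive values, although most stream code works with objects.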

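Creating streams

There are many ways to create a stream. The main ones are:

The stream() method that every collection has
From an array, via the static Arrays.stream(Object[]) method
Static factory methods on the stream classes, such as Stream.of(Object[])
From a file, via BufferedReader.lines()
The path methods of the Files class, such as list, find and walk
Streams of random numbers, via Random.ints()
The low-level StreamSupport, which provides methods for turning a Spliterator into a stream.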
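A few of the creation methods listed above, sketched in one place. The class and method names are illustrative, and a StringReader stands in for a real file so the example stays self-contained:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CreateDemo {
    static List<Integer> fromCollection() {
        return Arrays.asList(1, 2, 3).stream().collect(Collectors.toList());
    }

    static int fromArray() {
        return Arrays.stream(new int[] {1, 2, 3}).sum(); // an IntStream
    }

    static long fromOf() {
        return Stream.of("a", "b").count();
    }

    // A StringReader replaces a file here; lines() works the same way.
    static List<String> fromReader() {
        try (BufferedReader reader = new BufferedReader(new StringReader("x\ny"))) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Five random ints in the range [0, 10).
    static long fromRandom() {
        return new Random().ints(5, 0, 10).count();
    }

    public static void main(String[] args) {
        System.out.println(fromCollection()); // [1, 2, 3]
        System.out.println(fromArray());      // 6
        System.out.println(fromOf());         // 2
        System.out.println(fromReader());     // [x, y]
        System.out.println(fromRandom());     // 5
    }
}
```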

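Intermediate operations

An intermediate operation returns a new stream, so calls can be chained into a pipeline in a fluent style. Intermediate operations are lazy and do not modify the source data; they actually run only when the terminal operation executes.

Common intermediate operations

filter, distinct, map, peek, sorted, limit, boxed, skip, flatMap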

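Terminal operations

A terminal operation is what actually executes the pipeline; it produces the final result or a side effect.

Common terminal operations

allMatch, anyMatch, noneMatch, count, collect, findAny, findFirst, forEach, forEachOrdered, max, min, reduce, average, toArray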

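1. Sequential and parallel streams

Every stream operation can run sequentially or in parallel. Unless a parallel stream is requested explicitly, the streams created by the Java library are sequential.

Collection.stream() creates a sequential stream; parallel() turns it into a parallel one.
Collection.parallelStream() creates a parallel stream; sequential() turns it into a sequential one.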
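The mode of a stream can be checked with isParallel(). The sketch below uses illustrative names; note that the Javadoc describes parallelStream() as returning a "possibly parallel" stream, so the third flag reflects the behaviour of the list type used here rather than a guarantee for every collection.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class ParallelDemo {
    // Returns the isParallel() flag for four ways of obtaining a stream.
    static boolean[] flags() {
        List<Integer> xs = Arrays.asList(1, 2, 3);
        return new boolean[] {
            xs.stream().isParallel(),                     // false
            xs.stream().parallel().isParallel(),          // true
            xs.parallelStream().isParallel(),             // true for this list
            xs.parallelStream().sequential().isParallel() // false
        };
    }

    // Addition is associative, so the parallel sum equals the sequential one.
    static int parallelSum() {
        return IntStream.rangeClosed(1, 100).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(flags())); // [false, true, true, false]
        System.out.println(parallelSum());            // 5050
    }
}
```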

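2. Non-interference

Streams can be created from collections that are not thread-safe. While the pipeline is executing, a non-concurrent data source must not be modified.

That is, during the terminal operation (when the pipeline runs) the source must not be changed; otherwise concurrency problems can produce unpredictable results.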

// This throws java.util.ConcurrentModificationException
List<Integer> l = new ArrayList<>(Arrays.asList(1, 2));
        Stream<Integer> sl = l.stream();
        sl.forEach(s -> l.add(3));

// Fine
List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
l.add("three"); // modified before the terminal operation starts
sl.forEach(System.out::println);

// A concurrent collection also works, but the result may be surprising
List<String> l = new CopyOnWriteArrayList<>(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
sl.forEach(s -> l.add("three"));


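In particular, replacing add with set in the first example does not throw, because set is not a structural modification: the size of the list stays the same and only an element is replaced.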
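The add and set cases can be compared in a small sketch (names are illustrative). This shows the behaviour observed with ArrayList; fail-fast detection is best-effort, so it should not be relied on for correctness, and modifying the source during a terminal operation remains a bug either way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class InterferenceDemo {
    // Structural modification (add) during forEach is detected.
    static boolean addThrows() {
        List<Integer> l = new ArrayList<>(Arrays.asList(1, 2));
        try {
            l.stream().forEach(x -> l.add(3));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // set replaces an element without changing the size, so nothing is thrown.
    static List<Integer> setDoesNotThrow() {
        List<Integer> l = new ArrayList<>(Arrays.asList(1, 2));
        l.stream().forEach(x -> l.set(0, 9));
        return l;
    }

    public static void main(String[] args) {
        System.out.println(addThrows());       // true
        System.out.println(setDoesNotThrow()); // [9, 2]
    }
}
```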

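3. Associativity

An operation or function op is associative if it satisfies

(a op b) op c == a op (b op c)

For a parallel stream, an associative operation can be evaluated in parallel:

a op b op c op d == (a op b) op (c op d)

Addition, min, max and string concatenation are all associative.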
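To see why associativity matters, the two groupings above can be evaluated by hand for an associative operation (addition) and a non-associative one (subtraction). The helper names are illustrative; pairwise mimics how a parallel run might split the work.

```java
import java.util.function.IntBinaryOperator;

class AssocDemo {
    // ((a op b) op c) op d, the order a sequential reduce uses.
    static int leftToRight(IntBinaryOperator op, int a, int b, int c, int d) {
        return op.applyAsInt(op.applyAsInt(op.applyAsInt(a, b), c), d);
    }

    // (a op b) op (c op d), one possible parallel grouping.
    static int pairwise(IntBinaryOperator op, int a, int b, int c, int d) {
        return op.applyAsInt(op.applyAsInt(a, b), op.applyAsInt(c, d));
    }

    public static void main(String[] args) {
        IntBinaryOperator add = (x, y) -> x + y;
        IntBinaryOperator sub = (x, y) -> x - y;
        System.out.println(leftToRight(add, 1, 2, 3, 4)); // 10
        System.out.println(pairwise(add, 1, 2, 3, 4));    // 10
        System.out.println(leftToRight(sub, 1, 2, 3, 4)); // -8
        System.out.println(pairwise(sub, 1, 2, 3, 4));    // 0
    }
}
```

Subtraction gives different answers for the two groupings, which is why it is unsafe as a reduce accumulator on a parallel stream.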

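4. Ordering

Some streams return their elements in a defined order, which is called the encounter order.

Whether a stream has an encounter order depends on its source and its intermediate operations. Streams created from a List or an array are ordered, while a stream created from a HashSet is not.

sorted imposes an encounter order on a stream, and unordered removes the ordering constraint.

Note that unordered does not sort or shuffle the elements; it only returns a stream that no longer guarantees encounter order.

map may replace each element with a value of a different type, so any sortedness of the input no longer applies, although the encounter order is preserved.
filter only discards some elements; the element type does not change.

For sequential streams, whether a stream is ordered does not affect performance, only determinism: running the same pipeline on an unordered stream several times may give different results.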
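Two consequences of encounter order can be sketched as follows (names are illustrative): an ordered source keeps its order through map and collect even when the stream is parallel, and sorted gives a defined order to elements that come from an unordered HashSet.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

class OrderDemo {
    // The List source is ordered, so the result keeps that order even in parallel.
    static List<Integer> parallelKeepsEncounterOrder() {
        return Arrays.asList(3, 1, 2).parallelStream()
                .map(i -> i * 10)
                .collect(Collectors.toList());
    }

    // A HashSet has no encounter order; sorted() imposes one.
    static List<Integer> sortedImposesOrder() {
        return new HashSet<>(Arrays.asList(3, 1, 2)).stream()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parallelKeepsEncounterOrder()); // [30, 10, 20]
        System.out.println(sortedImposesOrder());          // [1, 2, 3]
    }
}
```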

Original link: https://blog.csdn.net/weixin_40877427/article/details/116022849