모던 자바 인 액션 Chapter 6

doxxx·2022년 5월 29일

모던자바인액션

목록 보기

3/14

Chapter 6. 스트림으로 데이터 수집

java.util.stream.Collectors를 사용하게 된다.

6.1 컬렉터란 무엇인가?

6.2 리듀싱과 요약

https://junhyunny.blogspot.com/2019/04/collectors-reducing.html
https://junhyunny.blogspot.com/2019/04/stream-collect.html

6.2.3 문자열 연결

https://junhyunny.blogspot.com/2019/04/collectors-joining.html

6.3 그룹화

.collect(groupingBy(Function classifier, (Supplier mapFactory), (Collector downstream)))

위와 같은 형태를 갖는다.

groupingBy 팩토리 메서드는 인수를 1,2,3개를 갖게 오버로드하고있다.
- classifier: 입력 요소를 키에 매핑하는 분류자 함수
- mapFactory: 결과를 넣을 빈 Map을 제공하는 공급자.
- downstream: 값을 리스트 외 다른 타입으로 반환하려면 이 인수를 명시해야 된다.

https://junhyunny.blogspot.com/2019/04/collectors-groupingby.html

public final class Collectors {

    // 인수 1개를 갖는 경우: classifier(분류자)를 넣는다.
    // 인수를 두개 갖는 메서드를 리턴함
    public static <T, K> Collector<T, ?, Map<K, List<T>>>
    groupingBy(Function<? super T, ? extends K> classifier) {
        return groupingBy(classifier, toList());
    }

    // 인수 2개를 갖는 경우:
    public static <T, K, A, D>
    Collector<T, ?, Map<K, D>> groupingBy(Function<? super T, ? extends K> classifier,
        Collector<? super T, A, D> downstream) {
        return groupingBy(classifier, HashMap::new, downstream);
    }

    Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
        Supplier<M> mapFactory,
        Collector<? super T, A, D> downstream) {
        Supplier<A> downstreamSupplier = downstream.supplier();
        BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
        BiConsumer<Map<K, A>, T> accumulator = (m, t) -> {
            K key = Objects.requireNonNull(classifier.apply(t),
                "element cannot be mapped to a null key");
            A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
            downstreamAccumulator.accept(container, t);
        };
        BinaryOperator<Map<K, A>> merger = Collectors.<K, A, Map<K, A>>mapMerger(
            downstream.combiner());
        @SuppressWarnings("unchecked")
        Supplier<Map<K, A>> mangledFactory = (Supplier<Map<K, A>>) mapFactory;

        if (downstream.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) {
            return new CollectorImpl<>(mangledFactory, accumulator, merger, CH_ID);
        } else {
            @SuppressWarnings("unchecked")
            Function<A, A> downstreamFinisher = (Function<A, A>) downstream.finisher();
            Function<Map<K, A>, M> finisher = intermediate -> {
                intermediate.replaceAll((k, v) -> downstreamFinisher.apply(v));
                @SuppressWarnings("unchecked")
                M castResult = (M) intermediate;
                return castResult;
            };
            return new CollectorImpl<>(mangledFactory, accumulator, merger, finisher, CH_NOID);
        }
    }
}

6.4 분할

.collect(partitioningBy(Predicate predicate, (Collector downstream)))

위와 같은 형태를 갖는다.

partitioningBy 팩토리 메서드는 인수를 1,2,3개를 갖게 오버로드하고있다.
- predicate: 뷴류 함수
- downstream: 값을 리스트 외 다른 타입으로 반환하려면 이 인수를 명시해야 된다.

https://junhyunny.blogspot.com/2019/04/collectors-partitioningby.html

public final class Collectors {

    public static <T>
    Collector<T, ?, Map<Boolean, List<T>>> partitioningBy(Predicate<? super T> predicate) {
        return partitioningBy(predicate, toList());
    }

    public static <T, D, A>
    Collector<T, ?, Map<Boolean, D>> partitioningBy(Predicate<? super T> predicate,
        Collector<? super T, A, D> downstream) {
        BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
        BiConsumer<Partition<A>, T> accumulator = (result, t) ->
            downstreamAccumulator.accept(predicate.test(t) ? result.forTrue : result.forFalse, t);
        BinaryOperator<A> op = downstream.combiner();
        BinaryOperator<Partition<A>> merger = (left, right) ->
            new Partition<>(op.apply(left.forTrue, right.forTrue),
                op.apply(left.forFalse, right.forFalse));
        Supplier<Partition<A>> supplier = () ->
            new Partition<>(downstream.supplier().get(),
                downstream.supplier().get());
        if (downstream.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) {
            return new CollectorImpl<>(supplier, accumulator, merger, CH_ID);
        } else {
            Function<Partition<A>, Map<Boolean, D>> finisher = par ->
                new Partition<>(downstream.finisher().apply(par.forTrue),
                    downstream.finisher().apply(par.forFalse));
            return new CollectorImpl<>(supplier, accumulator, merger, finisher, CH_NOID);
        }
    }

}

6.5 Collector 인터페이스

public interface Collector<T, A, R> {

    // 새로운 결과 컨테이너 만들기
    Supplier<A> supplier();

    // 결과 컨테이너에 요소 추가하기
    BiConsumer<A, T> accumulator();

    // 최종 변환값을 결과 컨테이너로 적용하기
    Function<A, R> finisher();

    // 두 결과 컨테이너 병합
    BinaryOperator<A> combiner();

    // 스트림을 병렬로 리듀스할 것인지
    // 병렬로 리듀스 한다면 어떤 최적화를 선택할지 힌트 제공
    Set<Characteristics> characteristics();
}

`toList` 구현

public class ToListCollector<T> implements Collector<T, List<T>, List<T>> {


    @Override
    public Supplier<List<T>> supplier() {
        // 수집 연산
        return ArrayList::new;
    }

    @Override
    public BiConsumer<List<T>, T> accumulator() {
        // 탐색한 항목 누적, 누적자를 고친다
        return List::add;
    }

    @Override
    public Function<List<T>, List<T>> finisher() {
        // 항등 함수
        return Function.identity();
    }

    @Override
    public BinaryOperator<List<T>> combiner() {
        // 두번째 콘텐츠와 합쳐서 첫 번째 누적자를 고친다
        return (list1, list2) -> {
            // 변경된 첫 번째 누적자를 반환한다
            list1.addAll(list2);
            return list1;
        };
    }

    @Override
    public Set<Characteristics> characteristics() {
        // 컬렉터의 플래그를 IDENTITY_FINISH와 CONCURRENT로 설정한다
        return Collections.unmodifiableSet(EnumSet.of(IDENTITY_FINISH, CONCURRENT));
    }
}

public class PartitionPrimeNumbers {

    // 커스텀 컬렉터 구현
    // 1단계: Collector 클래스 시그니처 정의
    public static class PrimeNumbersCollector implements
        Collector<Integer // 스트림 요소의 형식
            , Map<Boolean, List<Integer>> // 누적자 형식
            , Map<Boolean, List<Integer>>> { // 수집 연산의 결과 형식

        // 2단계: 리듀싱 연산 구현
        // supplier 메서드는 누적자를 만드는 함수를 반환
        @Override
        public Supplier<Map<Boolean, List<Integer>>> supplier() {
            // 두 개의 빈 리스트를 포함하는 맵으로 수집 동작 시작
            return () -> new HashMap<Boolean, List<Integer>>() {{
                put(true, new ArrayList<>()); // 소수가 추가될 빈 리스트로 초기화
                put(false, new ArrayList<>()); // 비소수가 추가될 빈 리스트로 초기화
            }};
        }

        // 스트림의 요소를 어떻게 수집할지 결정하는 accumulator 메서드
        @Override
        public BiConsumer<Map<Boolean, List<Integer>>, Integer> accumulator() {
            return (Map<Boolean, List<Integer>> acc, Integer candidate) -> {
                // isPrime의 결과에 따라 소수 리스트와 비소수 리스트를 만든다
                // 지금까지 발견한 소수 리스트를 isPrime 메서드로 전달한다
                acc.get(isPrime(acc.get(true), candidate))
                    // candidate를 알맞은 리스트에 추가한다
                    // isPrime 메서드의 결과에 따라 맵에서 알맞은 리스트를 받아 현재 candidate를 추가한다
                    .add(candidate);
            };
        }

        // 3단계: 병렬 실행할 수 있는 컬렉터 만들기(as possible)
        @Override
        public BinaryOperator<Map<Boolean, List<Integer>>> combiner() {
            // 두 번째 맵을 첫번째 맵에 병합한다
            return (Map<Boolean, List<Integer>> map1, Map<Boolean, List<Integer>> map2) -> {
                map1.get(true).addAll(map2.get(true));
                map1.get(false).addAll(map2.get(false));
                return map1;
            };
        }

        // 4단계: finisher 메서드와 컬렉터의 characteristics 메서드
        // accumulator의 형식은 컬렉터 결과 형식과 같으므로 반환 과정이 필요 없다.
        @Override
        public Function<Map<Boolean, List<Integer>>, Map<Boolean, List<Integer>>> finisher() {
            return Function.identity(); // 항등 함수
        }

        // 커스텀 컬렉터는 CONCURRENT도 아니고 UNORDERED도 아니지만 IDENTITY_FINISH
        @Override
        public Set<Characteristics> characteristics() {
            return Collections.unmodifiableSet(EnumSet.of(IDENTITY_FINISH));
        }
    }
}

doxxx

이전 포스트

모던 자바 인 액션 Chap 5

다음 포스트

모던 자바 인 액션 Chapter 6

모던자바인액션

Chapter 6. 스트림으로 데이터 수집

6.1 컬렉터란 무엇인가?

6.2 리듀싱과 요약

6.2.3 문자열 연결

6.3 그룹화

6.4 분할

6.5 Collector 인터페이스

`toList` 구현

모던 자바 인 액션 Chap 5

모던 자바 인 액션 Chap 7. 병렬 데이터 처리와 성능

0개의 댓글

관련 채용 정보

모던 자바 인 액션 Chapter 6

모던자바인액션

Chapter 6. 스트림으로 데이터 수집

6.1 컬렉터란 무엇인가?

6.2 리듀싱과 요약

6.2.3 문자열 연결

6.3 그룹화

6.4 분할

6.5 Collector 인터페이스

toList 구현

모던 자바 인 액션 Chap 5

모던 자바 인 액션 Chap 7. 병렬 데이터 처리와 성능

0개의 댓글

관련 채용 정보

`toList` 구현