Class PairStream<K,V>

java.lang.Object
org.apache.storm.streams.Stream<Pair<K,V>>
org.apache.storm.streams.PairStream<K,V>
Type Parameters:
K - the key type
V - the value type

@Unstable public class PairStream<K,V> extends Stream<Pair<K,V>>
Represents a stream of key-value pairs.
  • Method Details

    • mapValues

      public <R> PairStream<K,R> mapValues(Function<? super V,? extends R> function)
      Returns a new stream by applying a Function to the value of each key-value pair in this stream.
      Parameters:
      function - the mapping function
      Returns:
      the new stream
    • flatMapValues

      public <R> PairStream<K,R> flatMapValues(FlatMapFunction<? super V,? extends R> function)
      Returns a new stream by applying a FlatMapFunction to the value of each key-value pair in this stream.
      Parameters:
      function - the flatmap function
      Returns:
      the new stream
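
      The difference between mapValues and flatMapValues can be sketched in plain Java. This is only a model of the semantics over in-memory lists, not Storm code: the class ValueMappingSketch and its methods are hypothetical, and a FlatMapFunction is assumed to behave like a Function that returns an Iterable.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of the semantics only; this is not Storm code.
class ValueMappingSketch {

    // mapValues: exactly one output pair per input pair, key unchanged.
    static <K, V, R> List<Map.Entry<K, R>> mapValues(
            List<Map.Entry<K, V>> pairs, Function<? super V, ? extends R> function) {
        List<Map.Entry<K, R>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.add(new SimpleEntry<K, R>(pair.getKey(), function.apply(pair.getValue())));
        }
        return out;
    }

    // flatMapValues: zero or more output pairs per input pair, each keeping the input key.
    static <K, V, R> List<Map.Entry<K, R>> flatMapValues(
            List<Map.Entry<K, V>> pairs,
            Function<? super V, ? extends Iterable<? extends R>> function) {
        List<Map.Entry<K, R>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            for (R value : function.apply(pair.getValue())) {
                out.add(new SimpleEntry<K, R>(pair.getKey(), value));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs =
                List.of(Map.entry("doc1", "a b"), Map.entry("doc2", "c"));
        List<Map.Entry<String, Integer>> lengths = mapValues(pairs, String::length);
        List<Map.Entry<String, String>> words =
                flatMapValues(pairs, text -> Arrays.asList(text.split(" ")));
        System.out.println(lengths);
        System.out.println(words);
    }
}
```

      In this model the key is never touched; only the number of values emitted per input pair differs between the two operations.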
    • aggregateByKey

      public <R> PairStream<K,R> aggregateByKey(R initialValue, BiFunction<? super R,? super V,? extends R> accumulator, BiFunction<? super R,? super R,? extends R> combiner)
      Aggregates the values for each key of this stream using the given initial value, accumulator and combiner.
      Parameters:
      initialValue - the initial value of the result
      accumulator - the accumulator
      combiner - the combiner
      Returns:
      the new stream
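
      How the accumulator and the combiner might divide the work can be sketched in plain Java. The sketch assumes, as one plausible reading, that the accumulator folds values into a per-partition partial result starting from the initial value, and that the combiner then merges partial results for the same key. The class AggregateByKeySketch and the list-of-partitions input are hypothetical; this is not Storm code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of the semantics only; this is not Storm code.
class AggregateByKeySketch {

    static <K, V, R> Map<K, R> aggregateByKey(
            List<List<Map.Entry<K, V>>> partitions,
            R initialValue,
            BiFunction<? super R, ? super V, ? extends R> accumulator,
            BiFunction<? super R, ? super R, ? extends R> combiner) {
        Map<K, R> merged = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Step 1 (assumed): fold this partition's values into per-key partial results.
            Map<K, R> partial = new LinkedHashMap<>();
            for (Map.Entry<K, V> pair : partition) {
                R soFar = partial.getOrDefault(pair.getKey(), initialValue);
                partial.put(pair.getKey(), accumulator.apply(soFar, pair.getValue()));
            }
            // Step 2 (assumed): merge partial results for the same key across partitions.
            for (Map.Entry<K, R> entry : partial.entrySet()) {
                if (merged.containsKey(entry.getKey())) {
                    R combined = combiner.apply(merged.get(entry.getKey()), entry.getValue());
                    merged.put(entry.getKey(), combined);
                } else {
                    merged.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> first = List.of(Map.entry("a", 1), Map.entry("b", 2));
        List<Map.Entry<String, Integer>> second = List.of(Map.entry("a", 3));
        List<List<Map.Entry<String, Integer>>> partitions = List.of(first, second);
        Map<String, Integer> sums = aggregateByKey(partitions, 0, Integer::sum, Integer::sum);
        System.out.println(sums);
    }
}
```

      For a sum the accumulator and combiner happen to be the same function; they differ whenever the result type R differs from the value type V, for example when collecting values into a list.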
    • aggregateByKey

      public <A, R> PairStream<K,R> aggregateByKey(CombinerAggregator<? super V,A,? extends R> aggregator)
      Aggregates the values for each key of this stream using the given CombinerAggregator.
      Parameters:
      aggregator - the combiner aggregator
      Returns:
      the new stream
    • countByKey

      public PairStream<K,Long> countByKey()
      Counts the values for each key of this stream.
      Returns:
      the new stream
    • reduceByKey

      public PairStream<K,V> reduceByKey(Reducer<V> reducer)
      Performs a reduction on the values for each key of this stream by repeatedly applying the reducer.
      Parameters:
      reducer - the reducer
      Returns:
      the new stream
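
      "Repeatedly applying the reducer" can be illustrated with a small plain-Java model. It assumes the reducer behaves like a BinaryOperator over the value type; the class ReduceByKeySketch is hypothetical and this is not Storm code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Plain-Java model of the semantics only; this is not Storm code.
class ReduceByKeySketch {

    static <K, V> Map<K, V> reduceByKey(List<Map.Entry<K, V>> pairs, BinaryOperator<V> reducer) {
        Map<K, V> reduced = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // The first value seen for a key is stored as-is; each later value
            // is combined with the running result through the reducer.
            reduced.merge(pair.getKey(), pair.getValue(), reducer);
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                List.of(Map.entry("a", 1), Map.entry("b", 5), Map.entry("a", 3));
        Map<String, Integer> sums = reduceByKey(pairs, Integer::sum);
        Map<String, Integer> maxima = reduceByKey(pairs, (x, y) -> Math.max(x, y));
        System.out.println(sums);
        System.out.println(maxima);
    }
}
```

      Unlike aggregateByKey, the result type here is the same as the value type, so no initial value is needed.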
    • groupByKey

      public PairStream<K,Iterable<V>> groupByKey()
      Returns a new stream where the values are grouped by the keys.
      Returns:
      the new stream
    • groupByKeyAndWindow

      public PairStream<K,Iterable<V>> groupByKeyAndWindow(Window<?,?> window)
      Returns a new stream where the values are grouped by keys and the given window. The values that arrive within a window having the same key will be merged together and returned as an Iterable of values mapped to the key.
      Parameters:
      window - the window configuration
      Returns:
      the new stream
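
      Grouping by key within a window can be modelled in plain Java. The sketch below assumes the simplest case, a count-based tumbling window, and emits one key-to-values map per full window; the class WindowedGroupingSketch and the windowSize parameter are hypothetical stand-ins for a Window configuration, and this is not Storm code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the semantics only; this is not Storm code.
class WindowedGroupingSketch {

    // Cuts the input into consecutive, non-overlapping windows of windowSize pairs
    // and groups the values by key inside each window. A trailing partial window
    // is not emitted, on the assumption that it has not filled up yet.
    static <K, V> List<Map<K, List<V>>> groupByKeyAndWindow(
            List<Map.Entry<K, V>> pairs, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<Map<K, List<V>>> windows = new ArrayList<>();
        for (int start = 0; start + windowSize <= pairs.size(); start += windowSize) {
            Map<K, List<V>> grouped = new LinkedHashMap<>();
            for (Map.Entry<K, V> pair : pairs.subList(start, start + windowSize)) {
                grouped.computeIfAbsent(pair.getKey(), key -> new ArrayList<>())
                       .add(pair.getValue());
            }
            windows.add(grouped);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3),
                Map.entry("a", 4), Map.entry("b", 5), Map.entry("b", 6));
        List<Map<String, List<Integer>>> windows = groupByKeyAndWindow(pairs, 3);
        System.out.println(windows);
    }
}
```

      Values with the same key that fall into different windows stay separate, which is the main difference from the unwindowed groupByKey.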
    • reduceByKeyAndWindow

      public PairStream<K,V> reduceByKeyAndWindow(Reducer<V> reducer, Window<?,?> window)
      Returns a new stream where the values that arrive within a window having the same key will be reduced by repeatedly applying the reducer.
      Parameters:
      reducer - the reducer
      window - the window configuration
      Returns:
      the new stream
    • peek

      public PairStream<K,V> peek(Consumer<? super Pair<K,V>> action)
      Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as it is consumed from the resulting stream.
      Overrides:
      peek in class Stream<Pair<K,V>>
      Parameters:
      action - the action to perform on each element as it is consumed from the stream
      Returns:
      the new stream
    • filter

      public PairStream<K,V> filter(Predicate<? super Pair<K,V>> predicate)
      Returns a stream consisting of the elements of this stream that match the given predicate.
      Overrides:
      filter in class Stream<Pair<K,V>>
      Parameters:
      predicate - the predicate to apply to each element to determine if it should be included
      Returns:
      the new stream
    • join

      public <V1> PairStream<K,Pair<V,V1>> join(PairStream<K,V1> otherStream)
      Joins the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      Returns:
      the new stream
    • join

      public <R, V1> PairStream<K,R> join(PairStream<K,V1> otherStream, ValueJoiner<? super V,? super V1,? extends R> valueJoiner)
      Joins the values of this stream with the values having the same key from the other stream, combining each matching pair of values with the given ValueJoiner.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      valueJoiner - the ValueJoiner
      Returns:
      the new stream
    • leftOuterJoin

      public <V1> PairStream<K,Pair<V,V1>> leftOuterJoin(PairStream<K,V1> otherStream)
      Does a left outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      Returns:
      the new stream
    • leftOuterJoin

      public <R, V1> PairStream<K,R> leftOuterJoin(PairStream<K,V1> otherStream, ValueJoiner<? super V,? super V1,? extends R> valueJoiner)
      Does a left outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      valueJoiner - the ValueJoiner
      Returns:
      the new stream
    • rightOuterJoin

      public <V1> PairStream<K,Pair<V,V1>> rightOuterJoin(PairStream<K,V1> otherStream)
      Does a right outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      Returns:
      the new stream
    • rightOuterJoin

      public <R, V1> PairStream<K,R> rightOuterJoin(PairStream<K,V1> otherStream, ValueJoiner<? super V,? super V1,? extends R> valueJoiner)
      Does a right outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      valueJoiner - the ValueJoiner
      Returns:
      the new stream
    • fullOuterJoin

      public <V1> PairStream<K,Pair<V,V1>> fullOuterJoin(PairStream<K,V1> otherStream)
      Does a full outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      Returns:
      the new stream
    • fullOuterJoin

      public <R, V1> PairStream<K,R> fullOuterJoin(PairStream<K,V1> otherStream, ValueJoiner<? super V,? super V1,? extends R> valueJoiner)
      Does a full outer join of the values of this stream with the values having the same key from the other stream.

      Note: The parallelism of this stream is carried forward to the joined stream.

      Parameters:
      otherStream - the other stream
      valueJoiner - the ValueJoiner
      Returns:
      the new stream
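
      The inner, left outer, right outer and full outer joins above differ only in what happens to keys that appear on one side but not the other. The plain-Java model below makes that concrete. It assumes that a missing side is represented by null and treats both inputs as finite batches; the class JoinSketch and its Kind enum are hypothetical, and this is not Storm code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the semantics only; this is not Storm code.
class JoinSketch {

    enum Kind { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    // Each result row is [key, leftValue, rightValue]; null marks a missing side (assumption).
    static <K, V, V1> List<List<Object>> join(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, V1>> right, Kind kind) {
        List<List<Object>> rows = new ArrayList<>();
        Set<K> matchedRightKeys = new HashSet<>();
        for (Map.Entry<K, V> l : left) {
            boolean matched = false;
            for (Map.Entry<K, V1> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    rows.add(Arrays.<Object>asList(l.getKey(), l.getValue(), r.getValue()));
                    matchedRightKeys.add(r.getKey());
                    matched = true;
                }
            }
            // Left and full outer joins keep left values that found no partner.
            if (!matched && (kind == Kind.LEFT_OUTER || kind == Kind.FULL_OUTER)) {
                rows.add(Arrays.<Object>asList(l.getKey(), l.getValue(), null));
            }
        }
        // Right and full outer joins keep right values that found no partner.
        if (kind == Kind.RIGHT_OUTER || kind == Kind.FULL_OUTER) {
            for (Map.Entry<K, V1> r : right) {
                if (!matchedRightKeys.contains(r.getKey())) {
                    rows.add(Arrays.<Object>asList(r.getKey(), null, r.getValue()));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> left = List.of(Map.entry("k1", "v1"), Map.entry("k2", "v2"));
        List<Map.Entry<String, String>> right = List.of(Map.entry("k1", "x1"), Map.entry("k3", "x3"));
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + join(left, right, kind));
        }
    }
}
```

      The overloads that take a ValueJoiner would, in this model, replace the [leftValue, rightValue] part of each row with the joiner's result.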
    • window

      public PairStream<K,V> window(Window<?,?> window)
      Returns a new stream consisting of the elements that fall within the window as specified by the window parameter. The Window specification could be used to specify sliding or tumbling windows based on time duration or event count. For example,
       // time duration based sliding window
       stream.window(SlidingWindows.of(Duration.minutes(10), Duration.minutes(1)));
      
       // count based sliding window
       stream.window(SlidingWindows.of(Count.of(10), Count.of(2)));
      
       // time duration based tumbling window
       stream.window(TumblingWindows.of(Duration.seconds(10)));
       
      Overrides:
      window in class Stream<Pair<K,V>>
      Parameters:
      window - the window configuration
      Returns:
      the new stream
    • repartition

      public PairStream<K,V> repartition(int parallelism)
      Returns a new stream with the given parallelism. Subsequent operations on the returned stream execute at this level of parallelism.
      Overrides:
      repartition in class Stream<Pair<K,V>>
      Parameters:
      parallelism - the parallelism value
      Returns:
      the new stream
    • branch

      public PairStream<K,V>[] branch(Predicate<? super Pair<K,V>>... predicates)
      Returns an array of streams by splitting this stream into multiple branches based on the given predicates. The predicates are applied in the given order to the elements of this stream, and each element is forwarded to the result stream whose index corresponds to the index of the predicate that matches.

      Note: If none of the predicates match a value, that value is dropped.

      Overrides:
      branch in class Stream<Pair<K,V>>
      Parameters:
      predicates - the predicates
      Returns:
      an array of result streams (branches) corresponding to the given predicates
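
      The index-based routing can be sketched in plain Java. The sketch assumes that an element goes only to the branch of the first predicate it satisfies, which is one reading of "applied in the given order"; the class BranchSketch is hypothetical and works on lists rather than streams, so it is not Storm code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of the semantics only; this is not Storm code.
class BranchSketch {

    @SafeVarargs
    static <T> List<List<T>> branch(List<T> elements, Predicate<? super T>... predicates) {
        // One result list per predicate, at the same index.
        List<List<T>> branches = new ArrayList<>();
        for (int i = 0; i < predicates.length; i++) {
            branches.add(new ArrayList<>());
        }
        for (T element : elements) {
            for (int i = 0; i < predicates.length; i++) {
                if (predicates[i].test(element)) {
                    branches.get(i).add(element);
                    break; // assumption: the first matching predicate wins
                }
            }
            // An element that matches no predicate is dropped.
        }
        return branches;
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        List<List<Integer>> branches = branch(numbers, n -> n % 2 == 0, n -> n > 3);
        System.out.println(branches);
    }
}
```

      Under this assumption the order of the predicates matters: in the example, 4 and 6 satisfy both predicates but land only in the first branch, while 1 and 3 are dropped.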
    • updateStateByKey

      public <R> StreamState<K,R> updateStateByKey(R initialValue, BiFunction<? super R,? super V,? extends R> stateUpdateFn)
      Updates the state by applying the given state update function to the previous state of the key and the new value for the key. This internally uses IStatefulBolt to save the state. Use Config.TOPOLOGY_STATE_PROVIDER to choose the state implementation.
      Parameters:
      initialValue - the initial value of the state
      stateUpdateFn - the state update function
      Returns:
      the StreamState which can be used to query the state
    • updateStateByKey

      public <R> StreamState<K,R> updateStateByKey(StateUpdater<? super V,? extends R> stateUpdater)
      Updates the state by applying the given StateUpdater to the previous state of the key and the new value for the key. This internally uses IStatefulBolt to save the state. Use Config.TOPOLOGY_STATE_PROVIDER to choose the state implementation.
      Parameters:
      stateUpdater - the state updater
      Returns:
      the StreamState which can be used to query the state
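
      The update rule itself, "previous state of the key plus new value gives new state", can be modelled in plain Java. In the sketch an in-memory map stands in for the persisted state, and a key with no previous state is assumed to start from the initial value; the class UpdateStateByKeySketch is hypothetical and this is not Storm code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of the semantics only; this is not Storm code.
class UpdateStateByKeySketch {

    static <K, V, R> Map<K, R> updateStateByKey(
            List<Map.Entry<K, V>> pairs,
            R initialValue,
            BiFunction<? super R, ? super V, ? extends R> stateUpdateFn) {
        // Stand-in for the state store; the real state would be persisted.
        Map<K, R> state = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // Assumption: a key without previous state starts from initialValue.
            R previous = state.getOrDefault(pair.getKey(), initialValue);
            state.put(pair.getKey(), stateUpdateFn.apply(previous, pair.getValue()));
        }
        return state;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> purchases =
                List.of(Map.entry("alice", 10), Map.entry("bob", 5), Map.entry("alice", 7));
        Map<String, Integer> totals = updateStateByKey(purchases, 0, Integer::sum);
        System.out.println(totals);
    }
}
```

      The returned map plays the role of the queryable state: after each new pair, the entry for that key reflects every value seen so far.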
    • coGroupByKey

      public <V1> PairStream<K,Pair<Iterable<V>,Iterable<V1>>> coGroupByKey(PairStream<K,V1> otherStream)
      Groups the values of this stream with the values having the same key from the other stream.

      If stream1 has values - (k1, v1), (k2, v2), (k2, v3)
      and stream2 has values - (k1, x1), (k1, x2), (k3, x3)
      then the co-grouped stream would contain - (k1, ([v1], [x1, x2])), (k2, ([v2, v3], [])), (k3, ([], [x3]))

      Note: The parallelism of this stream is carried forward to the co-grouped stream.

      Parameters:
      otherStream - the other stream
      Returns:
      the new stream
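
      The stream1 and stream2 example above can be reproduced with a plain-Java model of co-grouping. The class CoGroupByKeySketch is hypothetical and operates on in-memory lists, so it only illustrates the shape of the result; it is not Storm code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the semantics only; this is not Storm code.
class CoGroupByKeySketch {

    // A fresh (left values, right values) pair with both sides empty.
    private static <V, V1> Map.Entry<List<V>, List<V1>> emptyGroups() {
        return new SimpleEntry<List<V>, List<V1>>(new ArrayList<V>(), new ArrayList<V1>());
    }

    static <K, V, V1> Map<K, Map.Entry<List<V>, List<V1>>> coGroupByKey(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, V1>> right) {
        Map<K, Map.Entry<List<V>, List<V1>>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : left) {
            grouped.computeIfAbsent(pair.getKey(), key -> CoGroupByKeySketch.<V, V1>emptyGroups())
                   .getKey().add(pair.getValue());
        }
        for (Map.Entry<K, V1> pair : right) {
            grouped.computeIfAbsent(pair.getKey(), key -> CoGroupByKeySketch.<V, V1>emptyGroups())
                   .getValue().add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> stream1 =
                List.of(Map.entry("k1", "v1"), Map.entry("k2", "v2"), Map.entry("k2", "v3"));
        List<Map.Entry<String, String>> stream2 =
                List.of(Map.entry("k1", "x1"), Map.entry("k1", "x2"), Map.entry("k3", "x3"));
        Map<String, Map.Entry<List<String>, List<String>>> grouped = coGroupByKey(stream1, stream2);
        for (Map.Entry<String, Map.Entry<List<String>, List<String>>> e : grouped.entrySet()) {
            System.out.println("(" + e.getKey() + ", (" + e.getValue().getKey()
                    + ", " + e.getValue().getValue() + "))");
        }
    }
}
```

      A key present in only one input still appears in the result, paired with an empty list for the other side, which distinguishes co-grouping from an inner join.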