Class SlidingWindowCounter<T>

java.lang.Object
org.apache.storm.starter.tools.SlidingWindowCounter<T>
Type Parameters:
T - The type of those objects we want to count.
All Implemented Interfaces:
Serializable

public final class SlidingWindowCounter<T> extends Object implements Serializable
This class counts objects in a sliding window fashion.

It is designed 1) to give multiple "producer" threads write access to the counter, i.e. being able to increment counts of objects, and 2) to give a single "consumer" thread (e.g. RollingCountBolt) read access to the counter. Whenever the consumer thread performs a read operation, this class will advance the head slot of the sliding window counter. This means that the consumer thread indirectly controls where writes of the producer threads will go to. Also, by itself this class will not advance the head slot.

A note for analyzing data based on a sliding window count: During the initial windowLengthInSlots iterations, this sliding window counter will always return object counts that are equal or greater than in the previous iteration. This is the effect of the counter "loading up" at the very start of its existence. Conceptually, this is the desired behavior.

To give an example, using a counter with 5 slots which for the sake of this example represent 1 minute of time each:

 
 Sliding window counts of an object X over time

 Minute (timeline):
 1    2   3   4   5   6   7   8

 Observed counts per minute:
 1    1   1   1   0   0   0   0

 Counts returned by counter:
 1    2   3   4   4   3   2   1
 
 

As you can see in this example, for the first windowLengthInSlots (here: the first five minutes) the counter will always return counts equal or greater than in the previous iteration (1, 2, 3, 4, 4). This initial load effect needs to be accounted for whenever you want to perform analyses such as trending topics; otherwise your analysis algorithm might falsely identify the object to be trending as the counter seems to observe continuously increasing counts. Also, note that during the initial load phase every object will exhibit increasing counts.

On a high-level, the counter exhibits the following behavior: If you asked the example counter after two minutes, "how often did you count the object during the past five minutes?", then it should reply "I have counted it 2 times in the past five minutes", implying that it can only account for the last two of those five minutes because the counter was not running before that time.

See Also:
  • Constructor Details

    • SlidingWindowCounter

      public SlidingWindowCounter(int windowLengthInSlots)
  • Method Details

    • incrementCount

      public void incrementCount(T obj)
    • getCountsThenAdvanceWindow

      public Map<T,Long> getCountsThenAdvanceWindow()
      Return the current (total) counts of all tracked objects, then advance the window.

      Whenever this method is called, we consider the counts of the current sliding window to be available to and successfully processed "upstream" (i.e. by the caller). Knowing this we will start counting any subsequent objects within the next "chunk" of the sliding window.

      Returns:
      The current (total) counts of all tracked objects.