public class RollingCountBolt extends BaseRichBolt
This bolt performs rolling counts of incoming objects, i.e. sliding window based counting.
The bolt is configured by two parameters: the length of the sliding window in seconds (which influences the output data of the bolt, i.e. how it will count objects) and the emit frequency in seconds (which influences how often the bolt will output the latest window counts). For instance, if the window length is set to the equivalent of five minutes and the emit frequency to one minute, then the bolt will output the latest five-minute sliding window every minute.
The bolt emits a rolling count tuple per object, consisting of the object itself, its latest rolling count, and the actual duration of the sliding window. The latter is included in case the expected sliding window length (as configured by the user) differs from the actual length, e.g. due to high system load. Note that the actual window length is tracked and calculated for the window as a whole, not individually for each object within a window.
Note: During the startup phase you will usually observe that the bolt warns you about the actual sliding window length being smaller than the expected length. This behavior is expected and is caused by the way the sliding window counts are initially “loaded up”. You can safely ignore this warning during startup (e.g. you will see this warning for roughly the first five minutes after startup if the window length is set to five minutes).
Constructor and Description |
---|
RollingCountBolt() |
RollingCountBolt(int windowLengthInSeconds, int emitFrequencyInSeconds) |
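As a rough illustration of the sliding-window counting described above, the sketch below splits the window into slots, one per emit interval, and sums the slots on every emit. It is not taken from this bolt's source: the class name `SlidingWindowSketch` and its methods are hypothetical, it assumes the window length is a whole multiple of the emit frequency, and it does not track the actual window duration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative slot-based sliding window counter (hypothetical, not the Storm implementation). */
final class SlidingWindowSketch<T> {
    private final int numSlots;
    private final Map<T, long[]> slotsByObject = new HashMap<>();
    private int headSlot = 0;

    /** Assumption: windowLengthInSeconds is a whole multiple of emitFrequencyInSeconds. */
    SlidingWindowSketch(int windowLengthInSeconds, int emitFrequencyInSeconds) {
        if (emitFrequencyInSeconds <= 0
                || windowLengthInSeconds < emitFrequencyInSeconds
                || windowLengthInSeconds % emitFrequencyInSeconds != 0) {
            throw new IllegalArgumentException("window length must be a positive multiple of emit frequency");
        }
        this.numSlots = windowLengthInSeconds / emitFrequencyInSeconds;
    }

    /** Counts one occurrence of obj in the current (head) slot. */
    void increment(T obj) {
        slotsByObject.computeIfAbsent(obj, k -> new long[numSlots])[headSlot]++;
    }

    /** Called once per emit interval: returns the window totals, then slides the window by one slot. */
    Map<T, Long> getCountsThenAdvanceWindow() {
        Map<T, Long> totals = new HashMap<>();
        for (Map.Entry<T, long[]> entry : slotsByObject.entrySet()) {
            long sum = 0;
            for (long count : entry.getValue()) {
                sum += count;
            }
            totals.put(entry.getKey(), sum);
        }
        // The next head slot holds the oldest counts; clear it so it can be reused.
        headSlot = (headSlot + 1) % numSlots;
        for (long[] slots : slotsByObject.values()) {
            slots[headSlot] = 0;
        }
        // Forget objects whose counts have dropped to zero in every slot.
        slotsByObject.values().removeIf(slots -> {
            for (long count : slots) {
                if (count != 0) {
                    return false;
                }
            }
            return true;
        });
        return totals;
    }

    public static void main(String[] args) {
        // A 3-second window emitted every second gives 3 slots.
        SlidingWindowSketch<String> window = new SlidingWindowSketch<>(3, 1);
        window.increment("a");
        window.increment("a");
        System.out.println(window.getCountsThenAdvanceWindow()); // {a=2}
        window.increment("a");
        System.out.println(window.getCountsThenAdvanceWindow()); // {a=3}
    }
}
```

With a five-minute window and a one-minute emit frequency, this sketch would keep five slots per object, and each emit would report the sum of the most recent five.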
Modifier and Type | Method and Description |
---|---|
void | declareOutputFields(OutputFieldsDeclarer declarer): Declare the output schema for all the streams of this topology. |
void | execute(Tuple tuple): Process a single tuple of input. |
Map<String,Object> | getComponentConfiguration(): Declare configuration specific to this component. |
void | prepare(Map<String,Object> topoConf, TopologyContext context, OutputCollector collector): Called when a task for this component is initialized within a worker on the cluster. |
Methods inherited from class BaseRichBolt: cleanup
public RollingCountBolt()
public RollingCountBolt(int windowLengthInSeconds, int emitFrequencyInSeconds)
public void prepare(Map<String,Object> topoConf, TopologyContext context, OutputCollector collector)
Description copied from interface: IBolt
Called when a task for this component is initialized within a worker on the cluster. It provides the bolt with the environment in which the bolt executes.
This includes the following:
topoConf - The Storm configuration for this bolt. This is the configuration provided to the topology, merged with the cluster configuration on this machine.
context - This object can be used to get information about this task’s place within the topology, including the task id and component id of this task, input and output information, etc.
collector - The collector is used to emit tuples from this bolt. Tuples can be emitted at any time, including in the prepare and cleanup methods. The collector is thread-safe and should be saved as an instance variable of this bolt object.
public void execute(Tuple tuple)
Description copied from interface: IBolt
Process a single tuple of input. The Tuple object carries metadata about which component, stream, and task it came from. The values of the Tuple can be accessed using Tuple#getValue. The bolt does not have to process the Tuple immediately. It is perfectly fine to hang onto a tuple and process it later (for instance, to do an aggregation or join).
Tuples should be emitted using the OutputCollector provided through the prepare method. It is required that all input tuples are acked or failed at some point using the OutputCollector. Otherwise, Storm will be unable to determine when tuples coming off the spouts have been completed.
For the common case of acking an input tuple at the end of the execute method, see IBasicBolt which automates this.
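The "hold now, ack later" contract described above can be sketched without any Storm dependency. Everything in this example is a hypothetical stand-in: `AckSink` only mimics the ack and fail calls of a collector, plain strings stand in for tuples, and `BufferingSummer` is an invented aggregator. The point it tries to show is that every buffered input ends up either acked or failed exactly once.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the ack/fail part of a collector; not a Storm type. */
interface AckSink<T> {
    void ack(T input);

    void fail(T input);
}

/** Buffers inputs, aggregates them later, and acks or fails every one of them. */
final class BufferingSummer {
    private final AckSink<String> sink;
    private final List<String> pending = new ArrayList<>();
    private long total = 0;

    BufferingSummer(AckSink<String> sink) {
        this.sink = sink;
    }

    /** Plays the role of execute(): hold on to the input instead of processing it right away. */
    void accept(String input) {
        pending.add(input);
    }

    /** Processes everything held so far; each input is acked or failed exactly once. */
    long flush() {
        for (String input : pending) {
            try {
                total += Long.parseLong(input.trim());
                sink.ack(input);
            } catch (NumberFormatException e) {
                // Unparseable input: report failure rather than dropping it silently.
                sink.fail(input);
            }
        }
        pending.clear();
        return total;
    }
}
```

In a real bolt the same bookkeeping would go through the OutputCollector saved in prepare; the sketch only illustrates that deferring work does not remove the obligation to ack or fail.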
tuple - The input tuple to be processed.
public void declareOutputFields(OutputFieldsDeclarer declarer)
Description copied from interface: IComponent
Declare the output schema for all the streams of this topology.
declarer - This is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream.
public Map<String,Object> getComponentConfiguration()
Description copied from interface: IComponent
Declare configuration specific to this component. Only a subset of the “topology.*” configs can be overridden. The component configuration can be further overridden when constructing the topology using TopologyBuilder.
Specified by: getComponentConfiguration in interface IComponent
Overrides: getComponentConfiguration in class BaseComponent
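This page does not say which settings the bolt declares here. One plausible use, assumed for this sketch, is a per-component tick tuple frequency equal to the emit frequency, so that the bolt is prompted to emit its window counts on schedule. The class and method names below are hypothetical; the key string is the value of Storm's Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, written out so the example compiles without Storm on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a component configuration map; not this bolt's actual code. */
final class ComponentConfigSketch {
    // Value of Storm's Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, spelled out to avoid a Storm dependency.
    static final String TICK_TUPLE_FREQ_SECS = "topology.tick.tuple.freq.secs";

    /** Assumption: the component asks for one tick tuple per emit interval. */
    static Map<String, Object> componentConfiguration(int emitFrequencyInSeconds) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TICK_TUPLE_FREQ_SECS, emitFrequencyInSeconds);
        return conf;
    }
}
```

Because this is a “topology.*” setting scoped to one component, other components in the same topology would keep their own tick frequency, or none at all.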
Copyright © 2022 The Apache Software Foundation. All rights reserved.