Interface ISpout

All Superinterfaces:
Serializable
All Known Subinterfaces:
IRichSpout
All Known Implementing Classes:
AnchoredWordCount.RandomSentenceSpout, BaseRichSpout, BlobStoreAPIWordCountTopology.RandomSentenceSpout, BucketTestHiveTopology.UserDataSpout, CheckpointSpout, ClojureSpout, ConstSpout, DRPCSpout, EsIndexTopology.UserDataSpout, ExclamationTopology.FixedOrderWordSpout, FastWordCountTopology.FastRandomSentenceSpout, FeederSpout, FileReadSpout, FixedTupleSpout, FluxShellSpout, HdfsFileTopology.SentenceSpout, HdfsSpout, HiveTopology.UserDataSpout, HiveTopologyPartitioned.UserDataSpout, IncrementingSpout, InOrderDeliveryTest.InOrderSpout, JmsSpout, KafkaSpout, LambdaSpout, LoadSpout, MasterBatchCoordinator, PythonShellMetricsSpout, RandomIntegerSpout, RandomSentenceSpout, RandomSentenceSpout.TimeStamped, RichShellSpout, RichSpoutBatchTriggerer, SequenceFileTopology.SentenceSpout, ShellSpout, SocketSpout, SpoutTracker, StringGenSpout, TestEventLogSpout, TestPlannerSpout, TestWordSpout, ThroughputVsLatency.FastRandomSentenceSpout, TimeDataIncrementingSpout, UserSpout, WordCountTopologyNode.RandomSentence, WordGenSpout, WordSpout

public interface ISpout extends Serializable
ISpout is the core interface for implementing spouts. A Spout is responsible for feeding messages into the topology for processing. For every tuple emitted by a spout, Storm will track the (potentially very large) DAG of tuples generated based on a tuple emitted by the spout. When Storm detects that every tuple in that DAG has been successfully processed, it will send an ack message to the Spout.

If a tuple fails to be fully processed within the configured timeout for the topology (see Config), Storm will send a fail message to the spout for that tuple.

When a Spout emits a tuple, it can tag the tuple with a message id. The message id can be any type. When Storm acks or fails a message, it will pass back to the spout the same message id to identify which tuple it's referring to. If the spout leaves out the message id, or sets it to null, then Storm will not track the message and the spout will not receive any ack or fail callbacks for the message.
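
The message-id round trip described above can be sketched in plain Java. To stay runnable without Storm on the classpath, the sketch uses a hypothetical `TupleSink` interface in place of `SpoutOutputCollector`. `MessageIdSketch` and its method names are likewise illustrative and are not part of the Storm API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SpoutOutputCollector (not a Storm type).
interface TupleSink {
    void emit(List<Object> tuple, Object msgId);
}

class MessageIdSketch {
    private final TupleSink sink;
    // Tuples awaiting an ack or fail, keyed by the message id supplied at emit time.
    private final Map<Object, List<Object>> pending = new HashMap<>();
    private long nextId = 0;

    MessageIdSketch(TupleSink sink) {
        this.sink = sink;
    }

    // Tracked emit: the tuple is tagged with an id, so an ack or fail is expected later.
    Object emitTracked(List<Object> tuple) {
        Object msgId = nextId++; // any type works as an id; a boxed Long is used here
        pending.put(msgId, tuple);
        sink.emit(tuple, msgId);
        return msgId;
    }

    // Untracked emit: a null id means no ack or fail callback will ever arrive.
    void emitUntracked(List<Object> tuple) {
        sink.emit(tuple, null);
    }

    // An ack or fail hands back the same id, which identifies the original tuple.
    List<Object> resolve(Object msgId) {
        return pending.remove(msgId);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Only the tracked emit leaves an entry in `pending`, which mirrors the rule that tuples without a message id are never acked or failed.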

Storm executes ack, fail, and nextTuple all on the same thread. This means that an implementor of an ISpout does not need to worry about concurrency issues between those methods. However, it also means that an implementor must ensure that nextTuple is non-blocking: otherwise the method could block acks and fails that are pending to be processed.

  • Method Summary

    Modifier and Type
    Method
    Description
    void
    ack(Object msgId)
    Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed.
    void
    activate()
    Called when a spout has been activated out of a deactivated mode. nextTuple will be called on this spout soon.
    void
    close()
    Called when an ISpout is going to be shut down.
    void
    deactivate()
    Called when a spout has been deactivated. nextTuple will not be called while a spout is deactivated.
    void
    fail(Object msgId)
    The tuple emitted by this spout with the msgId identifier has failed to be fully processed.
    void
    nextTuple()
    When this method is called, Storm is requesting that the Spout emit tuples to the output collector.
    void
    open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector)
    Called when a task for this component is initialized within a worker on the cluster.
  • Method Details

    • open

      void open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector)
      Called when a task for this component is initialized within a worker on the cluster. It provides the spout with the environment in which the spout executes.

      This includes the configuration, the topology context, and the output collector:

      Parameters:
      conf - The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
      context - This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
      collector - The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including from within the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
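
A minimal sketch of what an open implementation typically does: keep the collector in a field and read any settings it needs from conf. The nested `Collector` interface is a hypothetical stand-in for `SpoutOutputCollector`, and the `TopologyContext` argument is omitted so the example compiles without Storm. The key `topology.message.timeout.secs` is the tuple timeout setting; the fallback of 30 seconds used when the key is absent is an assumption of this sketch.

```java
import java.util.Map;

class OpenSketch {
    // Hypothetical stand-in for SpoutOutputCollector (not a Storm type).
    interface Collector {}

    private Collector collector;
    private int timeoutSecs;

    // Mirrors the shape of ISpout.open, minus the TopologyContext argument.
    void open(Map<String, Object> conf, Collector collector) {
        // Save the collector so later calls such as nextTuple can emit through it.
        this.collector = collector;

        // Numeric config values may arrive as Integer or Long, so go through Number.
        Object raw = conf.get("topology.message.timeout.secs");
        this.timeoutSecs = (raw instanceof Number) ? ((Number) raw).intValue() : 30;
    }

    boolean isOpen() {
        return collector != null;
    }

    int timeoutSecs() {
        return timeoutSecs;
    }
}
```

Reading numeric settings through `Number` avoids a `ClassCastException` when the merged configuration holds a `Long` where an `Integer` was expected.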
    • close

      void close()
      Called when an ISpout is going to be shut down. There is no guarantee that close will be called, because the supervisor terminates worker processes on the cluster with kill -9.

      The one context where close is guaranteed to be called is when a topology is killed while running Storm in local mode.

    • activate

      void activate()
      Called when a spout has been activated out of a deactivated mode. nextTuple will be called on this spout soon. A spout can become activated after having been deactivated when the topology is manipulated using the `storm` client.
    • deactivate

      void deactivate()
      Called when a spout has been deactivated. nextTuple will not be called while a spout is deactivated. The spout may or may not be reactivated in the future.
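
One possible use of activate and deactivate is to hold an external connection only while the spout is active. The sketch below assumes a hypothetical `Source` class that stands in for such a connection; neither `Source` nor `LifecycleSketch` is a Storm type. Because close is not guaranteed to run, it is treated here as best-effort cleanup only.

```java
class LifecycleSketch {
    // Hypothetical external connection, for example to a message broker.
    static class Source {
        private boolean connected;

        void connect() {
            connected = true;
        }

        void disconnect() {
            connected = false;
        }

        boolean isConnected() {
            return connected;
        }
    }

    private final Source source = new Source();

    // nextTuple will be called again soon, so make the source available.
    void activate() {
        source.connect();
    }

    // nextTuple will not be called while deactivated, so release the source.
    void deactivate() {
        source.disconnect();
    }

    // Best-effort cleanup only: a worker killed with kill -9 never reaches this.
    void close() {
        source.disconnect();
    }

    boolean isConnected() {
        return source.isConnected();
    }
}
```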
    • nextTuple

      void nextTuple()
      When this method is called, Storm is requesting that the Spout emit tuples to the output collector. This method should be non-blocking, so if the Spout has no tuples to emit, this method should return. nextTuple, ack, and fail are all called in a tight loop in a single thread in the spout task. When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.
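
The non-blocking pattern described above can be sketched as follows. The sketch assumes messages arrive on an in-memory queue filled by some other thread, and it records emitted values in a list where a real spout would call the collector. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class NonBlockingNextTupleSketch {
    // Filled by a separate producer thread, so a thread-safe queue is used.
    private final Queue<String> incoming = new ConcurrentLinkedQueue<>();
    // Stands in for the output collector in this sketch.
    private final List<String> emitted = new ArrayList<>();

    void offer(String message) {
        incoming.add(message);
    }

    void nextTuple() {
        // poll() returns null immediately when the queue is empty; it never blocks.
        String message = incoming.poll();
        if (message == null) {
            try {
                Thread.sleep(1); // brief pause so an idle spout does not spin the CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
            return;
        }
        emitted.add(message); // a real spout would emit through the collector here
    }

    List<String> emitted() {
        return emitted;
    }
}
```

Using `poll()` rather than a blocking `take()` keeps the shared thread free to deliver pending ack and fail callbacks.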
    • ack

      void ack(Object msgId)
      Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. Typically, an implementation of this method will take that message off the queue and prevent it from being replayed.
    • fail

      void fail(Object msgId)
      The tuple emitted by this spout with the msgId identifier has failed to be fully processed. Typically, an implementation of this method will put that message back on the queue to be replayed at a later time.
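
The typical ack and fail behaviour described above can be sketched with an in-memory queue. This is a simplified model under stated assumptions, not Storm code: `ReplaySketch` is a hypothetical class, `nextTuple` returns the message id instead of emitting through a collector, and failed messages are replayed without any retry limit or backoff.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class ReplaySketch {
    // Messages waiting to be emitted.
    private final Deque<String> queue = new ArrayDeque<>();
    // Messages emitted but not yet acked or failed, keyed by message id.
    private final Map<Object, String> inFlight = new HashMap<>();
    private long nextId = 0;

    void enqueue(String message) {
        queue.addLast(message);
    }

    // Returns the message id used for the emit, or null if nothing was available.
    Object nextTuple() {
        String message = queue.pollFirst();
        if (message == null) {
            return null;
        }
        Object msgId = nextId++;
        inFlight.put(msgId, message);
        // A real spout would emit the message with msgId through its collector here.
        return msgId;
    }

    // Fully processed: forget the message so it is never replayed.
    void ack(Object msgId) {
        inFlight.remove(msgId);
    }

    // Not fully processed: put the message back on the queue for a later replay.
    void fail(Object msgId) {
        String message = inFlight.remove(msgId);
        if (message != null) {
            queue.addLast(message);
        }
    }

    int queuedCount() {
        return queue.size();
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

Since ack, fail, and nextTuple run on the same thread, the plain `HashMap` and `ArrayDeque` need no synchronization in this model.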