Package org.apache.storm.hdfs.spout
Class HdfsSpout
java.lang.Object
org.apache.storm.topology.base.BaseComponent
org.apache.storm.topology.base.BaseRichSpout
org.apache.storm.hdfs.spout.HdfsSpout
- All Implemented Interfaces:
Serializable, ISpout, IComponent, IRichSpout
-
Constructor Summary
HdfsSpout()
-
Method Summary
Modifier and Type, Method: Description
void ack: Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed.
void close(): Called when an ISpout is going to be shut down.
void declareOutputFields(OutputFieldsDeclarer declarer): Declare the output schema for all the streams of this topology.
protected void emitData
void fail: The tuple emitted by this spout with the msgId identifier has failed to be fully processed.
getCollector
org.apache.hadoop.fs.Path getLockDirPath()
void nextTuple(): When this method is called, Storm is requesting that the Spout emit tuples to the output collector.
void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector): Called when a task for this component is initialized within a worker on the cluster.
setArchiveDir(String archiveDir)
setBadFilesDir(String badFilesDir)
setClocksInSync(boolean clocksInSync)
setCommitFrequencyCount(int commitFrequencyCount)
setCommitFrequencySec(int commitFrequencySec)
setHdfsUri(String hdfsUri)
setIgnoreSuffix(String ignoreSuffix)
setLockDir(String lockDir)
setLockTimeoutSec(int lockTimeoutSec)
setMaxOutstanding(int maxOutstanding)
setReaderType(String readerType)
setSourceDir(String sourceDir)
withConfigKey(String configKey): Set the key name under which HDFS options are placed.
withOutputFields(String... fields): Output field names.
withOutputStream(String streamName): Set output stream name.
Methods inherited from class org.apache.storm.topology.base.BaseRichSpout
activate, deactivate
Methods inherited from class org.apache.storm.topology.base.BaseComponent
getComponentConfiguration
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.storm.topology.IComponent
getComponentConfiguration
-
Constructor Details
-
HdfsSpout
public HdfsSpout()
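As a rough illustration of how the no-argument constructor and the setters listed on this page fit together, the fragment below configures a spout step by step. It is a sketch under assumptions, not a recommended setup: the URI, the directory paths, the "text" reader type value, and the "line" field name are hypothetical placeholders, and the class name HdfsSpoutConfigSketch is invented for the example. The calls are written as separate statements because this page does not show the setters' return types.

```java
import org.apache.storm.hdfs.spout.HdfsSpout;

// Hypothetical helper class; only the HdfsSpout methods come from this page.
final class HdfsSpoutConfigSketch {

    static HdfsSpout buildSpout() {
        HdfsSpout spout = new HdfsSpout();

        // Placeholder cluster address and directories (assumptions).
        spout.setHdfsUri("hdfs://namenode.example:8020");
        spout.setSourceDir("/data/in");
        spout.setArchiveDir("/data/done");
        spout.setBadFilesDir("/data/bad");

        // Assumed reader type value; the number of output fields
        // depends on the reader type, per withOutputFields below.
        spout.setReaderType("text");
        spout.withOutputFields("line");

        return spout;
    }
}
```

The fragment needs the storm-hdfs library on the classpath to compile.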
-
-
Method Details
-
setHdfsUri
-
setReaderType
-
setSourceDir
-
setArchiveDir
-
setBadFilesDir
-
setLockDir
-
setCommitFrequencyCount
-
setCommitFrequencySec
-
setMaxOutstanding
-
setLockTimeoutSec
-
setClocksInSync
-
setIgnoreSuffix
-
withOutputFields
Output field names. The number of fields depends on the reader type.
-
withConfigKey
Set the key name under which HDFS options are placed (similar to the HDFS bolt). The default key name is 'hdfs.config'.
-
withOutputStream
Set output stream name.
-
getLockDirPath
public org.apache.hadoop.fs.Path getLockDirPath() -
getCollector
-
nextTuple
public void nextTuple()
Description copied from interface: ISpout
When this method is called, Storm is requesting that the Spout emit tuples to the output collector. This method should be non-blocking, so if the Spout has no tuples to emit, this method should return. nextTuple, ack, and fail are all called in a tight loop in a single thread in the spout task. When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.
-
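The non-blocking contract described for nextTuple can be sketched with plain Java collections. This is not HdfsSpout's implementation; PollingSpoutSketch, offer, and emitted are hypothetical names, and a list stands in for the output collector. The point is only the shape: poll without waiting, sleep about one millisecond when there is nothing to emit, and return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a spout; not part of Storm or HdfsSpout.
final class PollingSpoutSketch {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Stands in for the output collector in this sketch.
    private final List<String> emitted = new ArrayList<>();

    void offer(String record) {
        pending.add(record);
    }

    List<String> emitted() {
        return emitted;
    }

    // Emits at most one record per call and never waits for input.
    void nextTuple() {
        String record = pending.poll(); // returns null immediately if empty
        if (record == null) {
            idleBriefly();
            return;
        }
        emitted.add(record);
    }

    // Short sleep so an idle caller loop does not spin at full CPU.
    private void idleBriefly() {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }
}
```

Calling nextTuple on an empty instance returns after the short sleep without emitting anything; once records are offered, each call emits one record in arrival order.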
emitData
-
open
Description copied from interface: ISpout
Called when a task for this component is initialized within a worker on the cluster. It provides the spout with the environment in which the spout executes. This includes the following:
- Parameters:
conf - The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
context - This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
collector - The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
-
close
public void close()
Description copied from interface: ISpout
Called when an ISpout is going to be shut down. There is no guarantee that close will be called, because the supervisor terminates worker processes on the cluster with kill -9. The one context where close is guaranteed to be called is when a topology is killed while running Storm in local mode.
- Specified by:
close in interface ISpout
- Overrides:
close in class BaseRichSpout
-
ack
Description copied from interface: ISpout
Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. Typically, an implementation of this method will take that message off the queue and prevent it from being replayed.
- Specified by:
ack in interface ISpout
- Overrides:
ack in class BaseRichSpout
-
fail
Description copied from interface: ISpout
The tuple emitted by this spout with the msgId identifier has failed to be fully processed. Typically, an implementation of this method will put that message back on the queue to be replayed at a later time.
- Specified by:
fail in interface ISpout
- Overrides:
fail in class BaseRichSpout
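The "take it off the queue" and "put it back on the queue" behavior described for ack and fail can be illustrated with a small bookkeeping class. This is a generic sketch of the pattern, not how HdfsSpout tracks its files; ReplayTrackerSketch and all of its members are hypothetical, and a counter is assumed as the source of message ids.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for emitted-but-unconfirmed records.
final class ReplayTrackerSketch {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Map<Object, String> inFlight = new HashMap<>();
    private long nextId = 0; // assumed id scheme for this sketch

    void enqueue(String record) {
        queue.addLast(record);
    }

    // Takes the next record, remembers it under a fresh msgId, and
    // returns that id; returns null when there is nothing to emit.
    Object emitNext() {
        String record = queue.pollFirst();
        if (record == null) {
            return null;
        }
        Object msgId = nextId++;
        inFlight.put(msgId, record);
        return msgId;
    }

    // Fully processed: forget the record so it is never replayed.
    void ack(Object msgId) {
        inFlight.remove(msgId);
    }

    // Failed: move the record back onto the queue for a later replay.
    void fail(Object msgId) {
        String record = inFlight.remove(msgId);
        if (record != null) {
            queue.addLast(record);
        }
    }

    int queuedCount() {
        return queue.size();
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

Because nextTuple, ack, and fail are called from a single thread in the spout task, the sketch uses unsynchronized collections; that choice would need revisiting if other threads touched the same state.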
-
declareOutputFields
Description copied from interface: IComponent
Declare the output schema for all the streams of this topology.
- Parameters:
declarer - This is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream.
-