Package org.apache.storm.starter
Class ReachTopology
java.lang.Object
org.apache.storm.starter.ReachTopology
This is a good example of doing complex Distributed RPC on top of Storm. This program creates a topology that can
compute the reach for any URL on Twitter in realtime by parallelizing the whole computation.
Reach is the number of unique people exposed to a URL on Twitter. To compute reach, you have to get all the people who tweeted the URL, get all the followers of all those people, unique that set of followers, and then count the unique set. It's an intense computation that can involve thousands of database calls and tens of millions of follower records.
This Storm topology does every piece of that computation in parallel, turning what would be a computation that takes minutes on a single machine into one that takes just a couple seconds.
For the purposes of demonstration, this topology replaces the use of actual DBs with in-memory hashmaps.
- See Also:
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic class
static class
static class
static class
-
Field Summary
Modifier and TypeFieldDescription -
Constructor Summary
-
Method Summary
-
Field Details
-
TWEETERS_DB
-
FOLLOWERS_DB
-
-
Constructor Details
-
ReachTopology
public ReachTopology()
-
-
Method Details
-
construct
-
main
- Throws:
Exception
-