Using Flume: Flexible, Scalable, and Reliable Data Streaming by Hari Shreedharan, O’Reilly Media Book Review

Using Flume is one of the books in the so-called Big Data series. Flume is one of the projects that graduated from the Apache Software Foundation incubator, and as of this writing it is at version 1.5, which (in OSS terms) means a mature product. How battle-tested it is I cannot say, as I am not using it, but our world increasingly relies on fast, distributed methods of processing log data. I truly believe it is worth investing time in learning tomorrow’s technology and proposing it at the right moment and opportunity. I am confident my own journey through data will naturally take me to Flume some day, and I would not be surprised if that happens soon. Either way, I am sure this book will take a Big Data practitioner (like myself) a step or two further. If you are about to join a project or POC involving Flume, this book is a must. If you are using Flume already, the book is worth your buck too, and not only “just in case”. The work of Hari (who worked at such iconic companies as Yahoo! before Cloudera) is probably fundamental to Flume. Here is what the book helps with:

  • Assessing whether Flume is the appropriate fit for your project or business needs and goals
  • Seemingly enough code (Java only) to create simple Flume extensions or indices
  • Full coverage of the three popular data serialization techniques
  • Persisting logs and even in-transit processing
  • Optimization of Flume
  • Performance tuning and monitoring

I gave this book 4 out of 5 stars for the following reasons:

  1. The structure, or flow, of the book should be different: the basics should come first, and the chapters are not laid out very logically.
  2. The book is a tad dry (to my taste, maybe): there are no practical, “from the trenches” examples of why a given setup or configuration is needed and in what circumstances.
  3. It is Java-centric and discusses only the Apache products.

Disclaimer: I received a free copy of this book in exchange for writing a review, as per the reader review program rules.

Posted on October 11, 2014, in Book Review.