R for Cloud Computing by A. Ohri, Springer, Book Review


R for Cloud Computing is not the first book on R I have read; it is, however, my first from Springer.

I picked this title for review for several reasons. The first, though not the main one, is that Springer is seen as a publisher of advanced, specialized, or narrow-subject books rather than of popular technology or educational material. The other deal maker is that the book’s title sounded like the next step in the corporate, small-business, or even personal computational statistics space, and not just in R itself.

It turned out to be the case!

In short, the main idea of this book is to state and prove that using R in the Cloud is more than a workable idea: it is possible in a vast number of ways. And it is, I now thankfully agree. And the competition is tight.

Why does it make sense? In short, R (like many other programming languages) is designed to use the local machine’s memory (RAM) and CPU by default (as of the end of 2014, and except when the snowfall package is used). One can therefore reap enormous benefits from any R scripts developed locally and deployed to the Cloud (let me stress, without any changes), where there is as much RAM and CPU power at your disposal as you need (or can pay for), and so the limit on how much data you can process is lifted.

But let me speculate. What remains to be discovered or seen, and what the book does not mention, is how parallelized R in the Cloud would work. To me personally, this is a huge thing, bigger than harnessing the power of the local GPU. Some groundwork has been laid: http://cran.r-project.org/web/views/HighPerformanceComputing.html but again, it seems to me that it is not progressing fast enough (like, perhaps, many other grid computing technologies). To me, passing this milestone is of the utmost importance if we are to process the data volumes of 2020. But please read the book to learn more, a lot more.

Another point I can add in support of the Cloud + R scheme is that most end results are shared on the Web anyway, in the form of a publication, a chart, or even a web application. And the Cloud and the Web are close neighbours.

OK, more on the book itself. Maybe I should start with an item I did not expect to find in a Springer book: personal interviews. It seems that every chapter in the book has at least one. This tells me that Mr. A. Ohri is in close contact with, and very well respected by, the technology leaders in his area of interest, and that the author keeps abreast of the latest happenings in the R space. I enjoyed many of the interviews and found them technologically tasteful and professional. The one I liked most is with Jeroen Ooms, the inventor of OpenCPU, whom I admire as an advanced data scientist. How useful all the interviews are, hmm, I will let you decide.

Needless to say, Ajay made sure to cover comprehensively what is available to a person working with R in the Cloud, or planning to. The coverage seems unbiased and based on thorough prior research, exactly as I would expect from a Springer book. I made a dozen bookmarks or so while discovering articles and projects I was not aware of. Thank you, Ajay!

Otherwise, the book invites experimentation and thought. It is full of practical examples and tons of relevant references. Alas, several things did not work for me, and some links appeared to be dead.

In closing, do not expect to find this book on a student’s desk. I mean, it is not for learning R, even though there is runnable code in images. In my opinion, it targets a mature R user who wants to expand their horizons, or a corporate decision maker willing to take their enterprise one notch (well, actually a lot) further ahead in the game.

My verdict: 4 out of 5. A deserving read, even though it is more like a collection of stories and technologies. A possibly convincing approach, and surely an inspiring one, that could take the R community to new heights.


