Foreword
I want to welcome you to the world of Hadoop. If you are a novice or an expert looking to expand your
knowledge of the technology, then you have arrived at the right place. This book contains a wealth of
knowledge that can help the former become the latter. Even most experts in a technology focus on particular
aspects. This book will broaden your horizon and add valuable tools to your toolbox of knowledge.
When Deepak asked me to write the foreword, I was honored and excited. Those of you who know me
usually find I have no shortage of words. This case was no exception, but I found myself thinking more about
what to say, and about how to keep it simple.
Every few years, technology has a period of uncertainty. It always seems we are on the cusp of the next
“great” thing. Most of the time, we find that it is a fad that is soon replaced by the next shiny bauble. There
are some moments that have had an impact, and some that leave the community guessing. Let’s take a look
at a couple of examples to make a point.
Java appeared like manna from the heavens in 1995. Well, that is perhaps a bit dramatic. It did burst onto
the scene and made development easier because you didn’t need to worry about memory management
or networking. It also had this marketing mantra, which was “write once, run anywhere”. It turned out to be
mostly true. This was the next “great” thing.
Roll ahead to 1999 and the release of J2EE. Again, we encounter Java doing all the right things. J2EE
technologies allowed, in a standard way, enterprises to focus on business value and not worry about the
technology stack. Again, this was mostly true.
Next we take a quantum leap to 2006. I attended JavaOne 2005 and 2006 and listened to numerous
presentations of where J2EE technology was going. I met a really passionate developer named Rod Johnson
who was talking about Spring. Some of you may have heard of it. I also listened as Sun pushed Java EE 5,
which was the next big change in the technology stack. I was also sold on a new component-based web UI
framework called Woodstock, which was based on JavaServer Faces. I was in a unique position; I was in
charge of making decisions for a variety of business systems at my employer at the time. I had to make a
series of choices. On the one hand I could use Spring, or on the other, Java EE 5. I chose Java EE 5 because
of the relationships I had developed at Sun, and because I wanted something based on a “standard”.
Woodstock, which I thought was the next “great” thing, turned out to be a flash in the pan. Sun abandoned
Woodstock, and well… I guess on occasion I maintain it along with some former Sun and Oracle employees.
Spring, like Java EE 5, turned out to be a “great” thing.
Next was the collapse of Sun and its acquisition by the dark side. This doomsday scenario seemed to
be on everyone’s mind in 2009. The darkness consumed them in 2010. What would happen to Java? It turned
out everyone’s initial assessment was incorrect. Oracle courted the Java community initially, spent time and
treasure to fix a number of issues in the Java SE stack, and worked on Java EE as well. It was a phenomenal
wedding, and the first fruits of the union were fantastic—Java SE 7 and Java EE 7 were “great”. They allowed a
number of the best ideas to become reality. Java SE 8, the third child, was developed in conjunction with the
Java community. The lambda, you would have thought, was a religious movement.
While the Java drama was unfolding, a really bright fellow named Doug Cutting came along in 2006 and
created an Apache project called Hadoop. The funny name came from his son’s toy elephant. Today it
literally is the elephant in the room. This project was based on the Google File System and MapReduce. The