ThorneLabs

A Basic Understanding of MapReduce on Hadoop

• Updated May 20, 2019


If you read any article on big data, you will inevitably come across the terms MapReduce and Hadoop. If big data technologies are not your technical focus, those two names are not intuitive enough (though what technical terms are?) to tell you what they are or what they do.

Back in June, Linux Journal published an Introduction to MapReduce with Hadoop on Linux that provides a clear explanation of what MapReduce is, with example code, and how it is used in Hadoop.
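To give a rough feel for the idea, here is a minimal single-process sketch of the classic word count problem, split into map, shuffle, and reduce steps. This is not the code from the Linux Journal article and it does not use Hadoop at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration. On a real cluster the framework would run the map and reduce steps in parallel across many machines and handle the shuffle itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key.

    In Hadoop the framework does this between the map and reduce steps;
    here it is simulated with an in-memory dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "data is data"]
    print(word_count(sample))
```

The point of the split is that each call to `map_phase` only looks at one line and each call to `reduce_phase` only looks at one key, so neither step needs to see the whole dataset at once. That independence is what lets a framework like Hadoop spread the work across many nodes.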

You might have also noticed how much of a buzzword “big data” has become and, as with anything in technology, people frequently try to solve a problem with an over-engineered solution - in this case, Hadoop. Hadoop was created to process hundreds of terabytes to petabytes of data, but it is often used on smaller datasets that can be processed much more easily and quickly on a modern workstation. Below are two articles that talk, very directly, about just that.

Thanks for reading and take care.