Last evening, I spent some time reading a few materials on Hadoop. I know, it has been a long time since I could do more work on Hadoop. In the past, I have used the HDInsight Preview from Microsoft and HortonWorks 1.1 for Windows. This time I wanted to install and play with the latest and greatest Hadoop in its raw form, directly from the Apache site. The bad news is that there is no installer for Hadoop on Windows, and there was no virtual machine readily available for download. When you have to learn Hadoop, you don't want to spend time learning CentOS first.
Anyway, you need to build it from scratch. I found this site very useful, and it saved me tons of time: http://www.srccodes.com/p/article/38/build-install-configure-run-apache-hadoop-2.2.0-microsoft-windows-os
However, it was not flawless (your mileage may vary too), and I had to struggle to bring the pieces together.
My Environment
- A Microsoft Windows Server 2008 R2 virtual machine hosted on Hyper-V
- Visual Studio 2012 and SQL Server 2012, including the Microsoft Visual C++ 2010 Redistributable
What didn’t work for me?
I faced problems creating the native distribution ("Build Hadoop bin distribution for Windows", step g).
- Installing Java in the default path of c:\program files\...
I installed the JDK in the C:\Program Files\Java folder. However, after reading the installation material on the Apache site, I realized that the folder path should not contain any spaces. So I un-installed and re-installed Java in C:\Java, and made sure that JAVA_HOME reflected the same (a quick sanity check is shown below).
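For reference, a quick sanity check from a command prompt could look like this, assuming JAVA_HOME points at something like C:\Java\jdk1.7.0_45 (the exact JDK folder name will differ on your machine):
rem confirm JAVA_HOME has no spaces and actually points at a JDK
echo %JAVA_HOME%
dir "%JAVA_HOME%\lib\tools.jar"
"%JAVA_HOME%\bin\java" -version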
- Struggling with installation of Win7 SDK
The Apache installation guidelines also mention that you must have Visual Studio 2010. I had Visual Studio 2012. After a few tries, I searched and found a thread on StackOverflow.com (I love this site as a developer) saying that you must un-install existing versions of the Microsoft Visual C++ 2010 Redistributable before installing the Windows 7 SDK. After un-installing this redistributable, life was much easier and the Windows 7 SDK (also good for my OS, i.e. Windows Server 2008 R2) installed flawlessly. On a side note, my Google search gave me many possible causes, including registry permissions etc., but nothing worked till I hit the jackpot of un-installing the existing C++ 2010 redistributable (one way to list what is installed is shown below).
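If you want to see which redistributables are present before removing them through Programs and Features, a quick check from an elevated command prompt could look like this; this is my own addition, not from the guide, and the name filter is approximate:
rem list installed Visual C++ 2010 redistributables (typed at the prompt; in a .bat file escape % as %%)
wmic product where "name like 'Microsoft Visual C++ 2010%'" get name, version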
- Missing tools.jar file exception
I kept getting a missing tools.jar file exception during the build. Based on Google searches, I added the following to the pom.xml file in the C:\hdfs\hadoop-dist directory:
<dependency>
  <groupId>jdk.tools</groupId>
  <artifactId>jdk.tools</artifactId>
  <version>1.7.0_45</version>
  <scope>system</scope>
  <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>
- hadoop-annotations error (missing annotations)
This was a known bug, so I edited the pom.xml file in the C:\hdfs\hadoop-common-project\hadoop-auth directory and added:
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
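With both pom.xml tweaks in place, the build has to be re-run. For reference, the distribution build command documented for Hadoop 2.2.0 on Windows (and used in the srccodes guide) is roughly the following, run from the Windows SDK command prompt inside the source folder (C:\hdfs here); double-check the exact flags against the guide:
rem build the Windows native distribution from the Windows SDK 7.1 command prompt
cd C:\hdfs
mvn package -Pdist,native-win -DskipTests -Dtar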
What did I do differently?
- Installed JDK 7 and not 6
- Changed order of the steps
I wanted to save time and be more efficient (subjective). While the build of the native distribution hadoop-2.2.0.tar.gz (Build Hadoop bin distribution for Windows, step g) was still running, I went ahead with the "Configure Hadoop" step (a sketch of the configuration files follows below).
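For context, the "Configure Hadoop" step mostly means editing a handful of XML files under c:\hadoop\etc\hadoop. A minimal single-node sketch, based on the srccodes guide and the Hadoop 2.2.0 defaults, could look like this; the localhost port and the data directories are illustrative examples, so use whatever the guide prescribes:
<!-- core-site.xml : default file system for a single-node setup -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml : replication and local storage directories (example paths) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///c:/hadoop/data/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///c:/hadoop/data/dfs/datanode</value>
  </property>
</configuration>
The mapred-site.xml and yarn-site.xml files need similar small edits (the MapReduce framework name and the YARN shuffle settings), as described in the guide.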
- Created System Variables rather than User Variables
The blog mentions creating User Variables such as JAVA_HOME, M2_HOME and Platform. I created them at the System level rather than the user level (see the commands below). I also had to restart the system, because I noticed that changes to the environment variables were not taking effect without it (still thinking about that).
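For reference, machine-level (System) variables can be created from an elevated command prompt with setx /M. The values below are illustrative examples; adjust the JDK and Maven paths to your own layout (Platform is x64 for a 64-bit build, Win32 otherwise):
rem machine-level (system) environment variables; run from an elevated command prompt
setx JAVA_HOME "C:\Java\jdk1.7.0_45" /M
setx M2_HOME "C:\maven\apache-maven-3.1.1" /M
setx Platform "x64" /M
rem open a new command prompt (or reboot) for the new values to become visible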
- Copied files to bin directory after successful build
After fixing all the issues related to the native distribution build, I ran the start-dfs and start-yarn commands but got exceptions that winutils.exe was missing. The native distribution is built in the C:\hdfs\hadoop-dist\target\hadoop-2.2.0\bin folder, and you need to copy/paste/overwrite all files, including winutils.exe, to the c:\hadoop\bin directory before the HDFS/YARN start commands will work (the equivalent commands are shown below).
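A rough command-line equivalent of that copy step, assuming the same source and target folders as above:
rem overwrite c:\hadoop\bin with the freshly built native binaries (winutils.exe, *.dll, ...)
xcopy /E /Y "C:\hdfs\hadoop-dist\target\hadoop-2.2.0\bin\*" "c:\hadoop\bin\"
rem format the namenode once (if not already done), then start the daemons
c:\hadoop\bin\hdfs namenode -format
c:\hadoop\sbin\start-dfs.cmd
c:\hadoop\sbin\start-yarn.cmd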
At the end of the night (almost 1:00 AM), things finally worked and I could sleep. I would call this a sleepless night in Atlanta. Now that I have a raw, full and latest Hadoop running, I will configure Eclipse and run a MapReduce program in Java. I am also planning to learn Python quickly for Machine Learning (still debating Python vs. R). Ideas?