Processing large data sets in Hadoop requires a full Hadoop installation on a real cluster of anywhere from tens to several thousand machines.
A sandbox installation of Hadoop is a ready-to-run installation in which the core Hadoop module and other related Hadoop software packages are bundled in a virtual machine (VM) image.
It typically runs on a single node, which is good enough for learning Hadoop.

The three main sandbox distributions of Hadoop are Cloudera QuickStart VM, Hortonworks Sandbox, and MapR Sandbox for Hadoop. All of them can be downloaded for free from their respective websites. We will go ahead and install Cloudera QuickStart VM on Windows for our Hadoop learning.

Cloudera QuickStart VM comes with the CentOS 6 operating system and the following Hadoop ecosystem and development tools pre-installed:

Apache Hadoop ecosystem tools: Apache Hadoop, Apache Spark, Apache Pig, Apache Hive, Apache HBase, Apache Impala, Hue, Apache Oozie, Apache Solr

Development tools: JDK 7, Eclipse IDE (Luna) with Maven, MySQL database, Git command line, Perl, Python, PHP

So there is no need to worry about installing all this software separately. Instead, we can simply install Cloudera QuickStart VM and get our hands dirty developing Hadoop MapReduce code.

Before we can install and configure Cloudera QuickStart VM, we need VirtualBox to run it.

Note: VirtualBox lets us run multiple operating systems as virtual machines on our computer at the same time. For instance, we can run Linux on a Windows PC, or Windows and Linux on a Mac.

Let's watch the following video tutorial to install VirtualBox and to install and configure Cloudera QuickStart VM 5.8.0 on Windows. The video tutorial also shows how to share our computer's files with a virtual machine.

In this post, we learned how to install VirtualBox and Cloudera QuickStart VM 5.8.0 on Windows, and how to share files between the Windows host and the virtual machine. If you have any questions or comments regarding this blog post, or would like to suggest another way to share files with the VM, please feel free to post them in the comment section below.
She has 20 years of experience in software application development, the majority of which was spent leading an Enterprise Application Development Team. As part of the Mining Massive Data Sets Graduate NDO Program from Stanford University, she had an opportunity to work on projects in Machine Learning and Social Network Analysis. In her spare time, she loves working on applying Machine Learning Algorithms on Kaggle Open Data Sets.