Channel: Tech Tutorials
Browsing all 862 articles

Installing Hadoop on a Single Node Cluster in Pseudo-Distributed Mode

In this post we’ll see how to install Hadoop on a single node cluster in pseudo-distributed mode. The steps shown here were done on Ubuntu 16.04, and the Hadoop version used is 2.9.0. Modes in Hadoop: Before...

What is Big Data

Big Data, as the name suggests, is data so huge, complex and ever-growing that conventional technologies are not able to store or process it. Examples of Big Data: Some of the examples of...

Introduction to Hadoop Framework

The post What is Big Data has already discussed the two challenges that such huge data poses: how to store it and how to process it. This post gives an introduction to...

What is HDFS

When you store a file, it is divided into blocks of fixed size. In a local file system these blocks are stored on a single system. In a distributed file system the blocks of the file are stored...
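
The fixed-size block split described in this teaser can be illustrated with a small toy model. This is only a sketch, not HDFS code: the function name `split_into_blocks` is hypothetical, and the 128 MB block size is an assumption based on the Hadoop 2.x default for `dfs.blocksize`, which is configurable.

```python
# Toy model of dividing a file into fixed-size blocks.
# Assumption: 128 MB block size (the Hadoop 2.x default for dfs.blocksize).
BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that together cover file_size bytes.

    Hypothetical helper for illustration; it only does the arithmetic and
    does not read or write any data.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    mb = 1024 * 1024
    for offset, length in split_into_blocks(300 * mb):
        print(f"block at offset {offset // mb} MB, length {length // mb} MB")
```

Under these assumptions a 300 MB file yields two full blocks and one shorter final block; in a distributed file system each of those blocks could be placed on a different machine.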

NameNode, DataNode And Secondary NameNode in HDFS

HDFS has a master/slave architecture. Within an HDFS cluster there is a single NameNode and a number of DataNodes, usually one per node in the cluster. In this post we'll see in detail what NameNode...

Replica Placement Policy in Hadoop Framework

HDFS, as the name says, is a distributed file system designed to store large files. A large file is divided into blocks of a defined size, and these blocks are stored across machines in a cluster....

HDFS Federation in Hadoop Framework

In this post we’ll talk about the HDFS Federation feature introduced in Hadoop 2.x versions. With HDFS Federation we can have more than one NameNode in the Hadoop cluster, each managing a part of the...

What is SafeMode in Hadoop

When the NameNode starts in a Hadoop cluster, it performs the following tasks. The NameNode reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory...
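
The idea of applying logged transactions to a loaded snapshot can be sketched as a toy replay loop. Everything here is hypothetical and only illustrative: the namespace is modelled as a plain set of paths, and the operation names `create`, `delete` and `rename` are made up for the sketch. They are not the actual FsImage or EditLog formats.

```python
def replay_edit_log(fs_image, edit_log):
    """Apply logged operations, in order, to a copy of a namespace snapshot.

    fs_image: set of paths (toy stand-in for a loaded snapshot).
    edit_log: list of tuples such as ("create", path), ("delete", path)
              or ("rename", src, dst). These op names are assumptions.
    """
    namespace = set(fs_image)  # work on a copy; the snapshot stays unchanged
    for entry in edit_log:
        op = entry[0]
        if op == "create":
            namespace.add(entry[1])
        elif op == "delete":
            namespace.discard(entry[1])
        elif op == "rename":
            src, dst = entry[1], entry[2]
            namespace.remove(src)  # raises KeyError if src does not exist
            namespace.add(dst)
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace


if __name__ == "__main__":
    image = {"/a", "/b"}
    log = [("create", "/c"), ("delete", "/a"), ("rename", "/b", "/d")]
    print(sorted(replay_edit_log(image, log)))
```

The point of the sketch is the ordering: the snapshot is loaded first, and the log entries are then applied one by one to bring the in-memory state up to date.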

HDFS High Availability

This post gives an overview of HDFS High Availability (HA), why it is required and how HDFS High Availability can be managed. Problem with single NameNode: To guard against the vulnerability of having a...

File Read in HDFS - Hadoop Framework Internal Steps

In this post we’ll see what happens internally within the Hadoop framework when a file is read in HDFS. Reading file in HDFS: Within the Hadoop framework it is the DFSClient class which communicates...

File Write in HDFS - Hadoop Framework Internal Steps

In this post we’ll see what happens internally within the Hadoop framework when a file is written in HDFS. Writing file in HDFS: When a client application wants to create a file in HDFS, it calls...

Java Program to Read File in HDFS

In this post we’ll see a Java program to read a file in HDFS. You can read a file in HDFS in two ways: Create an object of FSDataInputStream and use that object to read data from the file. You can use...

HDFS Commands Reference List

In this post I have compiled a list of some frequently used HDFS commands along with examples. Note that you can use either hadoop fs -<command> or hdfs dfs -<command>. The difference...

Java Program to Write File in HDFS

In this post we’ll see a Java program to write a file in HDFS. You can write a file in HDFS in two ways: Create an object of FSDataOutputStream and use that object to write data to the file. You can use...

How MapReduce Works in Hadoop

The post Word Count MapReduce Program in Hadoop already shows a word count MapReduce program written in Java. In this post, using that program as a reference, we’ll see how MapReduce works in Hadoop...

Word Count MapReduce Program in Hadoop

The first MapReduce program most people write after installing Hadoop is invariably the word count program. That’s what this post shows: writing a word count MapReduce program in Java...
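
For readers who want the shape of word count before opening the article, here is a minimal single-process sketch of the map, shuffle and reduce steps. It is not the article's Java program and does not use the Hadoop API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and whitespace tokenization is an assumption.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.split():
        yield (word, 1)


def shuffle(pairs):
    """Group emitted values by key, mimicking the shuffle step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Sum the counts collected for a single word."""
    return (word, sum(counts))


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["hello world", "hello hadoop"]))
```

In a real cluster the map and reduce steps run as distributed tasks and the framework performs the grouping; the sketch only shows how the data moves between the stages.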

YARN in Hadoop

YARN (Yet Another Resource Negotiator) is the cluster resource management and job scheduling layer of Hadoop. YARN was introduced in Hadoop 2.x to address the scalability issues in MRv1. It also...

Fair Scheduler in YARN

In the post YARN in Hadoop we have already seen that it is the scheduler component of the ResourceManager which is responsible for allocating resources to the running jobs. The scheduler component is...

Capacity Scheduler in YARN

In the post YARN in Hadoop we have already seen that it is the scheduler component of the ResourceManager which is responsible for allocating resources to the running jobs. The scheduler component is...

Uber Mode in Hadoop

When a MapReduce job is submitted, the ResourceManager launches the ApplicationMaster process (for MapReduce the ApplicationMaster is MRAppMaster) in a container. The ApplicationMaster then retrieves the...
