Explain the architecture of the Hadoop ecosystem?

Vivek Shrivastava

Posted On: Dec 31, 2020


Apache Hadoop is used to process huge amounts of data. The architecture of the Hadoop ecosystem consists of the core Hadoop components and various supporting technologies that help solve complex data problems easily.

A description of each component of the Hadoop ecosystem architecture is as follows:

NameNode: the master node of HDFS; it keeps the filesystem metadata and controls how data operations are carried out on the cluster.

DataNode: writes the actual data blocks to local storage on the worker nodes. Storing all data in a single place is not recommended, since an outage could cause data loss, so blocks are distributed and replicated across DataNodes.
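As an illustration of how these two roles interact, here is a minimal sketch (assuming a reachable HDFS cluster configured via fs.defaultFS, and a hypothetical /tmp/example.txt path) that writes a file through the standard Hadoop FileSystem API. The client asks the NameNode for metadata, while the bytes themselves are stored on DataNodes.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();        // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);            // contacts the NameNode for metadata
        Path file = new Path("/tmp/example.txt");        // hypothetical path, for illustration only
        try (FSDataOutputStream out = fs.create(file)) { // blocks are written to and replicated across DataNodes
            out.writeUTF("hello hadoop");
        }
        fs.close();
    }
}
```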

TaskTracker: runs on each slave node and accepts the map and reduce tasks assigned to it.

Map: takes data from the input stream and processes each line after splitting it into fields.

Reduce: the fields produced by the Map step are grouped together by key and concatenated with each other; see the sketch below.
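The following minimal sketch illustrates these two steps, assuming tab-separated input lines and a hypothetical job that groups records by their first field; the class and field names are purely illustrative.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class FieldGrouping {

    // Map: read one line of the input, split it into fields,
    // and emit (first field, second field) as the key/value pair.
    public static class FieldMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 1) {
                context.write(new Text(fields[0]), new Text(fields[1]));
            }
        }
    }

    // Reduce: values that share the same key are grouped together
    // and concatenated into a single comma-separated output value.
    public static class ConcatReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            StringBuilder joined = new StringBuilder();
            for (Text v : values) {
                if (joined.length() > 0) {
                    joined.append(",");
                }
                joined.append(v.toString());
            }
            context.write(key, new Text(joined.toString()));
        }
    }
}
```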
