Posted On: Mar 02, 2020
In MapReduce, an InputSplit is the logical representation of the chunk of data that a single mapper processes, so the number of InputSplits equals the number of map tasks. Every InputSplit has a length, measured in bytes, and a set of storage locations. The important thing to note is that an InputSplit only references the data; it doesn't contain the data itself. In Hadoop, the InputFormat is responsible for creating the InputSplits. The split size can be defined by the user, based on the size of the data the MapReduce program processes.
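To make the idea of a logical split concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class `SplitSketch`, the type `LogicalSplit`, and the method `planSplits` are hypothetical names introduced only for illustration. The size rule in `computeSplitSize` is modeled on the one Hadoop's `FileInputFormat` applies, where the block size is clamped between a minimum and a maximum that the user can set (in Hadoop, through the `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize` properties). The sketch is simplified and ignores details such as the slack the real implementation allows on the last split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
class SplitSketch {

    // A logical split: an offset and a length in bytes. It holds no data.
    static final class LogicalSplit {
        final long start;
        final long length;

        LogicalSplit(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    // Clamp the block size between the user-defined minimum and maximum.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Cut a file of fileLength bytes into consecutive splits of splitSize bytes.
    // The last split may be shorter. One split corresponds to one map task.
    static List<LogicalSplit> planSplits(long fileLength, long splitSize) {
        List<LogicalSplit> splits = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            long length = Math.min(splitSize, fileLength - offset);
            splits.add(new LogicalSplit(offset, length));
            offset += length;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed example values: a 300 MB file and a 128 MB block size.
        long splitSize = computeSplitSize(128 * mb, 1L, Long.MAX_VALUE);
        List<LogicalSplit> splits = planSplits(300 * mb, splitSize);
        System.out.println("map tasks: " + splits.size());
        for (LogicalSplit s : splits) {
            System.out.println("start=" + s.start + " length=" + s.length);
        }
    }
}
```

Under these assumed values the file is cut into three splits, so three map tasks would run. Lowering the maximum split size below the block size would produce more, smaller splits, and raising the minimum above the block size would produce fewer, larger ones.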