Restraining the Storage Capacity of DataNode in Hadoop Clusters

Task Objectives:

🔹 Launch a NameNode and two DataNodes in the AWS cloud.

🔹 Create an EBS volume of 4 GiB for each DataNode and attach one volume to each instance.

🔹 In the DN1 volume, create a 2 GB partition and mount it on the folder that the DataNode contributes to the cluster.

🔹 Similarly, create a 610 MB partition in the DN2 volume.

🔹 Check the contribution of the DataNodes in the WebUI.

→ Launching:

Launch the NameNode and DataNode instances with the capacity of your choice.

After configuring them, attach EBS volumes of whatever size you prefer.

In my case, I used AWS and created and attached a 4 GiB volume to each DataNode instance.
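If you prefer the command line over the console, the same can be done with the AWS CLI. A minimal sketch, assuming the CLI is configured and using placeholder volume/instance IDs, an example availability zone, and the device name /dev/sdf (which usually shows up inside the instance as /dev/xvdf):

# Placeholder IDs and availability zone; replace with your own values.
aws ec2 create-volume --size 4 --volume-type gp2 --availability-zone ap-south-1a
# Attach the volume created above to one of the DataNode instances.
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf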

→ Setup:

Edit the hdfs-site.xml and core-site.xml files on both the NameNode and the DataNodes, pointing them at the NameNode's IP and the right storage directories, and allow the required traffic between the instances as per your requirement.
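As a rough sketch of what those edits look like, assuming a Hadoop 2.x/3.x layout under $HADOOP_HOME/etc/hadoop, a NameNode at the placeholder IP 10.0.0.10, and /dn1 as the DataNode's storage folder:

# core-site.xml -- same on NameNode and DataNodes; points the daemons at the NameNode.
cat > $HADOOP_HOME/etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.0.0.10:9000</value>
  </property>
</configuration>
EOF

# hdfs-site.xml on a DataNode -- /dn1 is the folder contributed to the cluster.
# (On the NameNode, set dfs.namenode.name.dir instead.)
cat > $HADOOP_HOME/etc/hadoop/hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>
EOF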

Once the setup is done, run the following on the NameNode:

hadoop namenode -format

After formatting, we can start the Hadoop services and verify them with the jps command, as sketched below.
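A minimal sketch of that step, assuming Hadoop 3.x (on Hadoop 2.x the equivalent is hadoop-daemon.sh start namenode / datanode):

hdfs --daemon start namenode    # run on the NameNode
hdfs --daemon start datanode    # run on each DataNode
jps                             # NameNode / DataNode should appear in the output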

→ Partitioning:

Using the fdisk command, we can create the disk partition; a sketch of the interactive session follows.
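A minimal sketch of the fdisk session, assuming the attached EBS volume shows up as /dev/xvdr (adjust the device name and size for your DataNode):

fdisk /dev/xvdr
#   n    -> create a new partition
#   p    -> primary
#   1    -> partition number
#        -> accept the default first sector
#   +2G  -> partition size (use +610M on DN2)
#   w    -> write the partition table and exit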

mkfs.ext4 /dev/xvdr1 will then format the new partition with the ext4 filesystem.

This is where the partition was created.

After running the fdisk -l command again, the newly created partition now shows up in the listing.

At last, we mount that partition using the mount command, whose general form is:

mount <device> <folder>

In this case, the device is /dev/xvdr1 and the folder is the DataNode's data directory configured in hdfs-site.xml.
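Concretely, a sketch of that step, again assuming /dn1 is the DataNode folder from the earlier configuration:

mkdir -p /dn1               # create the mount point if it does not exist
mount /dev/xvdr1 /dn1       # mount the 2 GB partition onto the DataNode folder
df -h /dn1                  # confirm roughly 2 GB is available on the mount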

Now, if we check back in Hadoop's WebUI, the configured capacity has been limited from 9.99 GB to a mere 1.91 GB.
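The same figures can also be read from the command line with the dfsadmin report, which prints the configured capacity of each DataNode:

hdfs dfsadmin -report       # run from the NameNode; shows per-DataNode configured capacity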

And that’s how easy it is to limit the storage capacity of the DataNodes.

Thank you for your time!
