Restraining the Storage Capacity of DataNode in Hadoop Clusters
Task Objectives:
🔹 Launch a NameNode and two DataNodes in the AWS cloud.
🔹 Create an EBS volume of 4 GiB each and attach one to each of the DataNodes.
🔹 Create a 2 GB partition on the DN1 volume and mount it on the folder that the DataNode contributes to the cluster.
🔹 Similarly, create a 610 MB partition on the DN2 volume.
🔹 Check the contribution of the DataNodes in the WebUI.
→ Launching:
Launch a NameNode instance and DataNode instance(s) with the capacity of your choice.
After configuring them, attach a volume of whatever size you need.
In my case, I used AWS and created and attached a 4 GiB EBS volume to my DataNode instance.
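If you prefer the AWS CLI over the console, the volume can be created and attached roughly as follows. The availability zone, volume ID, instance ID and device name below are placeholders, and the disk may show up under a different name (such as /dev/xvdr) inside the instance:

aws ec2 create-volume --size 4 --volume-type gp2 --availability-zone ap-south-1a
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf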
→ Setup:
Edit the hdfs-site.xml and core-site.xml files on both the NameNode and the DataNode(s), setting the IPs, ports and directories as required. A minimal sketch is shown below.
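This sketch assumes Hadoop 1.x property names (on Hadoop 2+/3 the equivalents are fs.defaultFS, dfs.namenode.name.dir and dfs.datanode.data.dir); the IP, port and directory paths are placeholders.

core-site.xml (on both nodes):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<NameNode-IP>:9001</value>
  </property>
</configuration>

hdfs-site.xml on the DataNode (the directory below is the folder we will later mount the 2 GB partition on):

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>

hdfs-site.xml on the NameNode:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>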
Once the setup is done, format the NameNode's metadata directory by running:
hadoop namenode -format
After formatting, we can start the Hadoop services and verify them with the jps command.
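A minimal sketch of that step, assuming a Hadoop 1.x install where hadoop-daemon.sh is on the PATH (newer releases use hdfs --daemon start instead):

hadoop-daemon.sh start namenode   # run on the NameNode
hadoop-daemon.sh start datanode   # run on each DataNode
jps                               # should list NameNode or DataNode along with Jps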
→ Partitioning:
Using the fdisk command, we can create a disk partition on the attached volume.
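A rough outline of the interactive session for carving out the 2 GB partition (the device name matches the one used later in this post; the keystrokes are annotated):

fdisk /dev/xvdr
  n          # create a new partition
  p          # make it a primary partition
  1          # partition number
  <Enter>    # accept the default first sector
  +2G        # set the partition size to 2 GiB
  w          # write the partition table and exit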
mkfs.ext4 /dev/xvdr1 formats the newly created partition with an ext4 filesystem (the partition appears as the disk name with a numeric suffix; the attached disk here is /dev/xvdr).
Running the fdisk -l command again now lists the newly created partition.
At last, we mount the partition on the DataNode's data directory (the folder configured in hdfs-site.xml; /dn1 is used as a stand-in here) using the mount command:
mount /dev/xvdr1 /dn1
Now, if we check back in Hadoop's WebUI, the storage capacity contributed by the DataNode has dropped from 9.99 GB to a mere 1.91 GB.
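The same numbers can be cross-checked from the terminal; on Hadoop 1.x the dfsadmin report prints the configured and remaining capacity of each DataNode (newer releases use hdfs dfsadmin -report):

hadoop dfsadmin -report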
And that’s how easy it is to limit the storage capacity of the DataNodes.
Thank you for your time!