Hadoop Cluster Configuration via Ansible

B.V.Rohan Bharadwaj
Jan 29, 2021

What is Ansible?

Ansible is an agentless software platform that automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs. It can configure both Unix-like systems and Microsoft Windows.

Ansible Playbook

An Ansible Playbook is the core feature of Ansible; it tells Ansible what to execute.

A Playbook is sort of a to-do list for Ansible that contains a list of tasks.
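To make the to-do-list idea concrete, here is a minimal sketch of a playbook. The target group `all` and both tasks are purely illustrative and are not part of the Hadoop setup described in this post.

```yaml
# Minimal illustrative playbook: one play containing a list of tasks,
# which Ansible runs in order on every targeted host.
- name: Example play
  hosts: all
  tasks:
    - name: Check that the managed node is reachable
      ansible.builtin.ping:

    - name: Print a message for each host
      ansible.builtin.debug:
        msg: "Hello from {{ inventory_hostname }}"
```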

Hadoop

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power, and the ability to handle virtually limitless concurrent tasks or jobs.

Task Objectives:

🔹 Configure Hadoop and start the cluster services using an Ansible Playbook

Procedure

→NameNode Configuration:

I’ve launched a VM with the IP 192.168.127.130; this will be configured as the NameNode. In order to set up a NameNode, we must edit two files:

hdfs-site.xml

core-site.xml

An Ansible playbook automates the process of setting up the NameNode by generating both files from Jinja templates.
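A minimal sketch of what such a NameNode playbook could look like is given here. It is not the exact playbook used in this post: the group name `namenode`, the paths `/etc/hadoop` and `/nn`, the `.j2` template file names and the variable names are all assumptions. It also assumes Hadoop is already installed on the node and uses Hadoop 1.x style commands; the Hadoop version is not stated here, so adjust the commands for your release.

```yaml
# Sketch only: group name, paths, variables and template names are assumptions.
- name: Configure the NameNode
  hosts: namenode                      # hypothetical inventory group
  vars:
    hadoop_conf_dir: /etc/hadoop       # assumed Hadoop config directory
    namenode_dir: /nn                  # assumed NameNode metadata directory
  tasks:
    - name: Create the NameNode metadata directory
      ansible.builtin.file:
        path: "{{ namenode_dir }}"
        state: directory

    - name: Render hdfs-site.xml and core-site.xml from Jinja2 templates
      ansible.builtin.template:
        src: "{{ item }}.j2"           # e.g. hdfs-site.xml.j2 (assumed name)
        dest: "{{ hadoop_conf_dir }}/{{ item }}"
      loop:
        - hdfs-site.xml
        - core-site.xml

    - name: Format the NameNode (skipped once the directory is formatted)
      ansible.builtin.shell: echo Y | hadoop namenode -format
      args:
        creates: "{{ namenode_dir }}/current"

    - name: Start the NameNode daemon
      ansible.builtin.command: hadoop-daemon.sh start namenode
```

The `creates` argument keeps the format step from running again on later playbook runs, which matters because reformatting wipes the NameNode metadata.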

hdfs-site.xml template file:

core-site.xml template file:

→DataNode Configuration:

I’ve launched a VM with the IP 192.168.127.129; this will be configured as the DataNode.
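The DataNode can be set up in much the same way. What follows is a hedged sketch, not the playbook from this post: the group name `datanode`, the storage directory `/dn`, the config directory and the template file names are assumptions, and the daemon command follows Hadoop 1.x conventions. The core-site.xml template on this node would need to point at the NameNode (192.168.127.130).

```yaml
# Sketch only: group name, paths, variables and template names are assumptions.
- name: Configure the DataNode
  hosts: datanode                      # hypothetical inventory group
  vars:
    hadoop_conf_dir: /etc/hadoop       # assumed Hadoop config directory
    datanode_dir: /dn                  # assumed DataNode storage directory
  tasks:
    - name: Create the DataNode storage directory
      ansible.builtin.file:
        path: "{{ datanode_dir }}"
        state: directory

    - name: Render hdfs-site.xml and core-site.xml from Jinja2 templates
      ansible.builtin.template:
        src: "{{ item }}.j2"           # e.g. core-site.xml.j2 (assumed name)
        dest: "{{ hadoop_conf_dir }}/{{ item }}"
      loop:
        - hdfs-site.xml
        - core-site.xml

    - name: Start the DataNode daemon
      ansible.builtin.command: hadoop-daemon.sh start datanode
```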

→Managed Nodes:
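The managed nodes are listed in an Ansible inventory. A minimal sketch in YAML inventory format, using the two IPs from this post, could look like the following; the file name `inventory.yml` and the group names `namenode` and `datanode` are assumptions, and connection settings such as the SSH user are left out.

```yaml
# inventory.yml (hypothetical file name); group names are assumptions.
all:
  children:
    namenode:
      hosts:
        192.168.127.130:
    datanode:
      hosts:
        192.168.127.129:
```

With an inventory like this, a playbook would typically be run with `ansible-playbook -i inventory.yml <playbook>.yml`.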

→Playbook Execution:

NameNode Setup:

DataNode Setup:

Now we can check whether Ansible configured the NameNode and DataNode correctly.
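One way to run this check from the control node is a short play that asks the NameNode for a cluster report. This is only a sketch: the group name `namenode` is an assumption, and `hadoop dfsadmin -report` is the Hadoop 1.x form of the command (newer releases prefer `hdfs dfsadmin -report`).

```yaml
# Sketch only: the group name is an assumption.
- name: Check the HDFS cluster state
  hosts: namenode                      # hypothetical inventory group
  tasks:
    - name: Ask the NameNode for a cluster report
      ansible.builtin.command: hadoop dfsadmin -report
      register: dfs_report
      changed_when: false              # read-only check, never reports a change

    - name: Show the report
      ansible.builtin.debug:
        var: dfs_report.stdout_lines
```

If the DataNode has registered with the NameNode, it should appear in the report output.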

→Cluster:

Thank you for your time!
