Integration of LVM with a Hadoop Cluster

Let’s understand a few concepts related to it 🧐

What is LVM? Why do we use it?

Logical Volume Management enables the combining of multiple individual hard drives or disk partitions into a single volume group (VG). That volume group can then be subdivided into logical volumes (LV) or used as a single large volume. Regular file systems, such as EXT3 or EXT4, can then be created on a logical volume.

The EXT3 and EXT4 filesystems allow both offline (unmounted) and online (mounted) resizing when growing a filesystem; shrinking always requires offline resizing. EXT2 supports offline resizing only.

LVM provides elasticity to storage devices and is a more flexible alternative to traditional partitioning.

LVM allows combining partitions and entire hard drives into volume groups.

What is Elasticity ?

Do you remember what college physics taught us about elasticity? 🤔 Let me define it 👉🏻 Elasticity is the property of a substance that enables it to change its length, volume, or shape in direct response to a force, and to recover its original form once the force is removed.

First, we create a Hadoop cluster with one NameNode (master) and one DataNode.

Before increasing or decreasing the storage, we check the existing partitions on the OS using: fdisk -l

Next, we attach secondary storage to the OS. Since I am using AWS, I create and attach an EBS volume as the secondary storage.

Now we create a physical volume on the new disk using the command: pvcreate /dev/xvdf — and to confirm, we use the command: pvdisplay

After creating the physical volume, we create a volume group using the command: vgcreate [group-name] /dev/xvdf — and to confirm, we use the command: vgdisplay

Finally, we create a logical volume from the volume group we just created, using the command: lvcreate --size [size] --name [lv-name] [group-name]
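Putting those three steps together, here is a hedged example of the sequence (the volume-group name hadoop_vg, the LV name datanode_lv, and the 5 GiB size are placeholders — substitute your own; these commands need root on a real system):

```shell
# Create a physical volume on the new EBS disk (device name assumed /dev/xvdf)
pvcreate /dev/xvdf
pvdisplay

# Create a volume group (name "hadoop_vg" is a placeholder)
vgcreate hadoop_vg /dev/xvdf
vgdisplay

# Carve a 5 GiB logical volume named "datanode_lv" out of the group
lvcreate --size 5G --name datanode_lv hadoop_vg
lvdisplay
```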

We need to format the volume before mounting it, using the command:

mkfs.ext4 /dev/[group-name]/[lv-name]

Finally, we mount the volume using the command:

mount /dev/[group-name]/[lv-name] /[mount-point]
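As a concrete sketch with the same placeholder names as above (the mount point /datanode is an assumption — it should match the DataNode's dfs.datanode.data.dir in hdfs-site.xml):

```shell
# Format the new logical volume with ext4
mkfs.ext4 /dev/hadoop_vg/datanode_lv

# Mount it on the DataNode's storage directory
mkdir -p /datanode
mount /dev/hadoop_vg/datanode_lv /datanode
df -h /datanode
```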

Thus the DataNode's storage has been increased; to confirm, we can use the command:

hadoop dfsadmin -report (on newer Hadoop versions: hdfs dfsadmin -report)

Thus it is confirmed that the storage has been increased by 5 GB.
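The elasticity LVM promises really shows when we later need more space: because ext4 supports online growth, we can extend the logical volume without unmounting it or stopping the DataNode. A hedged sketch, reusing the placeholder names from above:

```shell
# Grow the logical volume by another 2 GiB while it stays mounted
lvextend --size +2G /dev/hadoop_vg/datanode_lv

# Grow the ext4 filesystem online to fill the enlarged LV
resize2fs /dev/hadoop_vg/datanode_lv

# HDFS reports the new capacity
hadoop dfsadmin -report
```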

This whole process can be automated using Python scripting:
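A minimal sketch of such a script, assuming the same placeholder device, names, and mount point used above (adjust them for your cluster; running it requires root on a machine with the disk attached):

```python
import subprocess

# Placeholders -- substitute your own device, names, size, and mount point
DEVICE = "/dev/xvdf"
VG, LV, SIZE, MOUNT = "hadoop_vg", "datanode_lv", "5G", "/datanode"


def lvm_commands(device=DEVICE, vg=VG, lv=LV, size=SIZE, mount=MOUNT):
    """Return the manual LVM steps from the article, in order."""
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", device],                              # physical volume
        ["vgcreate", vg, device],                          # volume group
        ["lvcreate", "--size", size, "--name", lv, vg],    # logical volume
        ["mkfs.ext4", lv_path],                            # format
        ["mkdir", "-p", mount],                            # mount point
        ["mount", lv_path, mount],                         # mount
    ]


def run_all():
    """Execute each step, stopping on the first failure (needs root)."""
    for cmd in lvm_commands():
        subprocess.run(cmd, check=True)
```

Call run_all() as root once the EBS volume is attached; afterwards, hadoop dfsadmin -report should show the added capacity.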

This is how Logical Volume Manager can be integrated with Hadoop.

If you found it useful, please upvote it.

Connect with me on LinkedIn | GitHub

Lifelong learning technologies