VDO – optimizing your EuroLinux 8 disk space

Virtual Data Optimizer (VDO) includes everything you need to create a transparent layer for data compression and deduplication. It reduces disk space usage on block devices, minimises replication, and increases data throughput.

VDO uses three main techniques:

  • zero-block elimination – filters out data blocks that contain only zeros and records them in metadata only. The non-zero data blocks are then passed to the next processing phase. This phase enables the thin provisioning feature of VDO devices
  • deduplication – eliminates redundant data blocks. When multiple copies of the same data are written, VDO detects the duplicate blocks and updates the metadata so that they reference the original block instead of being stored again
  • compression – the kvdo kernel module compresses data blocks using the LZ4 algorithm.
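
To make the first two phases more concrete, here is a rough sketch that splits a file into 4 KB blocks and counts how many are all zeros, how many repeat an earlier block, and how many are unique. It is only an illustration of the idea: the function name block_report, the 4 KB block size and the use of SHA-256 checksums are our assumptions, and this is not how kvdo works internally.

```shell
# Illustrative sketch only: classify the 4 KB blocks of a file as
# zero, duplicate or unique. block_report is a hypothetical helper.
block_report() {
    file=$1
    block_size=4096
    size=$(wc -c < "$file")
    total=$(( (size + block_size - 1) / block_size ))

    # Checksum of an all-zero block, used to recognise zero blocks.
    zero_sum=$(head -c "$block_size" /dev/zero | sha256sum | cut -d' ' -f1)

    # Emit one checksum per block, then classify the checksums with awk.
    i=0
    while [ "$i" -lt "$total" ]; do
        dd if="$file" bs="$block_size" skip="$i" count=1 2>/dev/null |
            sha256sum | cut -d' ' -f1
        i=$((i + 1))
    done | awk -v zero="$zero_sum" -v total="$total" '
        $1 == zero { zeros++; next }    # zero block: metadata only
        seen[$1]++ { dups++; next }     # checksum seen before: duplicate
                   { unique++ }         # first occurrence: must be stored
        END { printf "total=%d zero=%d duplicate=%d unique=%d\n",
                     total, zeros, dups, unique }'
}
```

Running block_report on a disk image or a backup file gives a rough feeling for how much of the data would never reach the compression phase. Only the blocks counted as unique would have to be stored at all.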

With these techniques, VDO can significantly improve both storage efficiency and network bandwidth utilisation. The VDO layer is placed on top of an existing block device, such as a RAID device or a local disk, and that underlying device can also be encrypted.

Logical devices created with VDO are named VDO volumes. They are similar to disk partitions – they can be formatted with the desired file system and mounted just like a regular file system. You can also use a VDO volume as an LVM physical volume.

Since a VDO volume is thinly provisioned, the file system and applications see only the logical space in use and are not aware of the physical space actually available.

When hosting virtual machines or containers, it is recommended to provision storage at a 10:1 logical-to-physical ratio (for example, 1 TB of physical storage is presented as 10 TB of logical storage). For object-based storage platforms such as Ceph, a 3:1 logical-to-physical ratio is recommended (1 TB of physical storage is presented as 3 TB of logical storage).
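
The ratios above are simple multiplication, which can be sketched as a small helper. The function name logical_size_gb and the 1024 GB example device are our assumptions, used only for illustration:

```shell
# Hypothetical helper: logical size in GB for a given physical size in GB
# and a logical-to-physical ratio of RATIO:1.
logical_size_gb() {
    physical_gb=$1
    ratio=$2
    echo $(( physical_gb * ratio ))
}

# Suggested values for the --vdoLogicalSize option on a 1024 GB device.
echo "--vdoLogicalSize=$(logical_size_gb 1024 10)G"   # virtual machines or containers
echo "--vdoLogicalSize=$(logical_size_gb 1024 3)G"    # object storage such as Ceph
```

These are starting points only. The ratio that is actually safe depends on how well the stored data deduplicates and compresses.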

VDO Installation

Installing VDO on EuroLinux 8 (as well as on CentOS, Red Hat® Enterprise Linux® and Oracle® Linux) involves running the following command:

    [eurolinux@el84 ~]$ sudo dnf install vdo kmod-kvdo

Creating a VDO volume

To create a new VDO volume, prepare the following information:

  • the name of the underlying block device
  • the name of the optimised block device that will be presented by VDO
  • the logical size to be presented to the storage layers above VDO.

If the logical size is not specified, VDO creates a volume with a 1:1 mapping between physical and logical blocks. You can later increase the physical and logical size of the volume with the vdo growPhysical and vdo growLogical commands.
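
As a sketch of how growing a volume might look, the wrappers below build the two commands for a volume named vdo1. The wrapper names and the DRY_RUN switch are our own assumptions; by default the commands are only printed so they can be reviewed before anything is changed.

```shell
# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
vdo_run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Present a larger logical size to the layers above VDO.
grow_vdo_logical() {
    vdo_run sudo vdo growLogical --name="$1" --vdoLogicalSize="$2"
}

# Let VDO use space added to the underlying block device.
grow_vdo_physical() {
    vdo_run sudo vdo growPhysical --name="$1"
}

grow_vdo_logical vdo1 100G
grow_vdo_physical vdo1
```

Note that vdo growPhysical only makes sense after the underlying block device itself has been enlarged. To really run the commands, set DRY_RUN=0.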

As a simple example, we will create a VDO volume named vdo1 on the device /dev/vdb with a logical size of 50 GB by running the vdo create command:

    [eurolinux@el84 ~]$ sudo vdo create --name=vdo1 \
                                        --device=/dev/vdb \
                                        --vdoLogicalSize=50G
    Creating VDO vdo1
          The VDO volume can address 2 GB in 1 data slab.
          It can grow to address at most 16 TB of physical storage in 8192 slabs.
          If a larger maximum size might be needed, use bigger slabs.
    Starting VDO vdo1
    Starting compression on VDO vdo1
    VDO instance 0 volume is ready at /dev/mapper/vdo1

VDO Information and Statistics

To analyse a VDO volume, run the vdo status command. It displays a report on the VDO system and the status of the volume in YAML format. The --name= option limits the report to a specific volume. With only one volume configured, the option is unnecessary.

    [eurolinux@el84 ~]$ sudo vdo status
    VDO status:
      Date: '2021-09-10 13:57:42-04:00'
      Node: el84
    Kernel module:
      Loaded: true
      Name: kvdo
      Version information:
        kvdo version: 6.2.4.26
    Configuration:
      File: /etc/vdoconf.yml
      Last modified: '2021-09-10 12:58:27'
    VDOs:
      vdo1:
        Acknowledgement threads: 1
        Activate: enabled
        Bio rotation interval: 64
        Bio submission threads: 4
        Block map cache size: 128M
        Block map period: 16380
    (...)

To check the space usage of the volume, we can use the vdostats command. Since VDO is thinly provisioned, this tool should also be used to determine how much free physical space is left on the underlying storage device:

    [eurolinux@el84 ~]$ sudo vdostats --human-readable
    Device                    Size      Used Available Use% Space saving%
    /dev/mapper/vdo1          5.0G      3.0G      2.0G  60%           N/A

The output of the vdostats command displays the VDO volume device name (Device) along with the total physical size (Size, shown as 1K-blocks when --human-readable is not used), the space in use (Used), the space remaining (Available), the percentage of space in use (Use%), and the percentage of space saved (Space saving%).
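
Because the file system cannot see when physical space is running out, it can be useful to watch the Use% column automatically. Below is a minimal sketch that reads vdostats output on standard input and reports volumes at or above a given percentage. The function name vdo_usage_alert and the threshold are our assumptions, and the sketch relies on the column layout shown above.

```shell
# Hypothetical helper: read vdostats output on stdin and print every
# device whose Use% is at or above the threshold given as $1.
vdo_usage_alert() {
    threshold=$1
    awk -v limit="$threshold" '
        NR > 1 {                    # skip the header line
            use = $5                # fifth column is Use%, e.g. "60%"
            sub(/%$/, "", use)      # strip the trailing percent sign
            if (use + 0 >= limit)
                printf "%s is %s%% full\n", $1, use
        }'
}
```

It could be used, for example, as sudo vdostats --human-readable | vdo_usage_alert 80 in a cron job or a monitoring script.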

File System

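Next, we can format the VDO volume with the XFS file system. The -K option tells mkfs.xfs not to discard blocks at creation time, which speeds up formatting on a VDO volume: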

    [eurolinux@el84 ~]$ sudo mkfs.xfs -K /dev/mapper/vdo1
    meta-data=/dev/mapper/vdo1       isize=512    agcount=4, agsize=3276800 blks
             =                       sectsz=4096  attr=2, projid32bit=1
             =                       crc=1        finobt=1, sparse=1, rmapbt=0
             =                       reflink=1
    data     =                       bsize=4096   blocks=13107200, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
    log      =internal log           bsize=4096   blocks=6400, version=2
             =                       sectsz=4096  sunit=1 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0

Then we mount the volume:

    [eurolinux@el84 ~]$ sudo mount /dev/mapper/vdo1 /mnt && df -h
    Filesystem                  Size  Used Avail Use% Mounted on
    devtmpfs                    627M     0  627M   0% /dev
    tmpfs                       657M     0  657M   0% /dev/shm
    tmpfs                       657M  9.3M  648M   2% /run
    tmpfs                       657M     0  657M   0% /sys/fs/cgroup
    /dev/mapper/eurolinux-root   21G  4.7G   16G  23% /
    /dev/vda1                  1014M  246M  769M  25% /boot
    tmpfs                       132M  4.0K  132M   1% /run/user/1000
    /dev/mapper/vdo1             50G  390M   50G   1% /mnt

At system startup, the vdo systemd unit automatically starts all VDO devices that are configured as activated. The unit is installed and enabled by default when the vdo package is installed. If the system restarts after an unclean shutdown, VDO rebuilds its metadata to verify consistency and repairs it as needed.
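
If the file system should also be mounted at boot, the mount has to wait for the vdo unit. The article does not cover this step, so treat the following as a sketch: it assumes the x-systemd.requires=vdo.service mount option described in the Red Hat Enterprise Linux 8 documentation, and the helper name vdo_fstab_line is hypothetical.

```shell
# Hypothetical helper: print an /etc/fstab line for an XFS file system on a
# VDO volume, ordered after vdo.service so the device exists at mount time.
vdo_fstab_line() {
    device=$1
    mount_point=$2
    printf '%s %s xfs defaults,x-systemd.requires=vdo.service 0 0\n' \
        "$device" "$mount_point"
}

vdo_fstab_line /dev/mapper/vdo1 /mnt
```

Review the printed line before adding it to /etc/fstab, for example with vdo_fstab_line /dev/mapper/vdo1 /mnt | sudo tee -a /etc/fstab.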

Summary

VDO is designed to save disk space and reduce costs. Savings can be seen in both traditional data centres and cloud-based deployments. Depending on your needs, this can translate into lower costs per compute instance, lower costs of external block storage, and lower costs of long-term snapshot storage.

The degree of data reduction that can be observed using VDO will vary depending on the type of data stored. Compressed video or audio files will not take full advantage of this technology, but backups, virtual machines and container deployments will provide very tangible savings.

Authors

The blog articles are written by people from the EuroLinux team. We owe 80% of the content to our developers, the rest is prepared by the sales or marketing department. We make every effort to ensure that the content is the best in terms of content and language, but we are not infallible. If you see anything that needs to be corrected or clarified, we'd love to hear from you.