Virtual Data Optimizer (VDO) includes everything you need to create a transparent layer for data compression and deduplication. It reduces disk space usage on block devices, minimises replication, and increases data throughput.
VDO uses three main techniques:
- zero-block elimination – filters out data blocks that contain only zeros and records them in metadata only. The non-zero data blocks are then passed to the next processing phase. This phase is what makes thin provisioning possible on VDO devices
- deduplication – eliminates redundant data blocks. When multiple copies of the same data are written, VDO detects the duplicate blocks and updates the metadata so that they reference the original block instead of storing the data again
- compression – the kvdo kernel module compresses data blocks using the LZ4 algorithm.
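To get a rough feel for the first two techniques, the sketch below counts how many 4 KiB blocks of a file are all zeros and how many distinct non-zero blocks remain. It is a toy estimate only, not how kvdo works internally: the function name block_summary is hypothetical, and the use of split and sha256sum is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Toy estimate of zero-block elimination and deduplication for one file.
# Illustration only: this is NOT how the kvdo module is implemented.

BLOCK_SIZE=4096   # assumed block size for this sketch

# Usage: block_summary FILE
# Prints: "<total blocks> <all-zero blocks> <unique non-zero blocks>"
block_summary() {
    local file=$1 tmp total zero_hash hashes zero unique
    tmp=$(mktemp -d) || return 1

    # Cut the file into fixed-size blocks (the last one may be shorter).
    split -b "$BLOCK_SIZE" -a 6 "$file" "$tmp/blk."
    total=$(find "$tmp" -type f -name 'blk.*' | wc -l)
    total=$((total))   # normalise any padding from wc

    # Hash of a single all-zero block, used to recognise zero blocks.
    zero_hash=$(head -c "$BLOCK_SIZE" /dev/zero | sha256sum | cut -d' ' -f1)

    # One hash per block.
    hashes=$(find "$tmp" -type f -name 'blk.*' -exec sha256sum {} + | cut -d' ' -f1)

    # Zero blocks need no data storage; identical hashes need one stored copy.
    zero=$(printf '%s\n' "$hashes" | grep -c -x "$zero_hash")
    unique=$(printf '%s\n' "$hashes" | grep -v -x "$zero_hash" | sort -u | grep -c .)

    rm -rf "$tmp"
    echo "$total $zero $unique"
}
```

Calling block_summary on a disk image prints three numbers. The closer the third number is to the first, the less this simple model would save.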
With these techniques, VDO can significantly improve both storage efficiency and the use of network bandwidth. The VDO layer is placed on top of an existing block device, such as a RAID device or a local disk, and the underlying block device can also be encrypted.
Logical devices created with VDO are called VDO volumes. They are similar to disk partitions: they can be formatted with the desired file system and mounted like a regular file system. You can also use a VDO volume as an LVM physical volume.
Since a VDO volume is thinly provisioned, the file system and applications only see the logical space in use and are not aware of the actual physical space available.
When hosting virtual machines or containers, it is recommended to provide storage at a 10:1 logical-to-physical ratio (for example, 1 TB of physical storage is presented as 10 TB of logical storage). For object-based storage platforms such as Ceph, a logical-to-physical ratio of 3:1 is recommended (1 TB of physical storage is presented as 3 TB of logical storage).
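The arithmetic behind these ratios is simple enough to script. The helper below is a minimal sketch; the function name logical_size_gb is hypothetical and sizes are assumed to be whole gigabytes.

```shell
# Usage: logical_size_gb PHYSICAL_GB RATIO
# Prints the logical size in GB for a RATIO:1 logical-to-physical ratio.
# (Hypothetical helper, not part of the vdo tooling.)
logical_size_gb() {
    local physical_gb=$1 ratio=$2
    echo $(( physical_gb * ratio ))
}

logical_size_gb 1000 10   # virtual machines or containers on 1 TB: prints 10000
logical_size_gb 1000 3    # object storage such as Ceph on 1 TB:    prints 3000
```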
Installing VDO on EuroLinux 8 (as well as on CentOS, Red Hat® Enterprise Linux® and Oracle® Linux) involves running the following command:
$ sudo dnf install vdo kmod-kvdo
Creating a VDO volume
To create a new VDO volume, prepare the following information:
- the name of the underlying block device
- the name of the optimised block device that will be presented by VDO
- the logical size to be presented to storage layers above the VDO.
Without the last parameter, VDO creates a volume with a 1:1 mapping between physical and logical blocks. You can later increase the physical and logical size of the volume using the vdo growPhysical and vdo growLogical commands.
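As a sketch of how the two grow commands might be invoked, the snippet below wraps them in small functions. The volume name vdo1 and the target size 100G are example values, and the run, grow_physical and grow_logical helpers are hypothetical. By default the helper only prints each command; set DRY_RUN=0 to execute it.

```shell
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# After the underlying block device has been enlarged, let VDO use the new space.
grow_physical() { run sudo vdo growPhysical --name="$1"; }

# Present a larger logical size to the layers above VDO.
grow_logical() { run sudo vdo growLogical --name="$1" --vdoLogicalSize="$2"; }

grow_physical vdo1
grow_logical vdo1 100G
```

Growing the logical size does not resize a file system that already sits on the volume; that has to be grown separately (for XFS, with xfs_growfs).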
As a simple example, we will create a VDO volume named vdo1 on the device /dev/vdb with a logical size of 50 GB by running the vdo create command:
$ sudo vdo create --name=vdo1 \
    --device=/dev/vdb \
    --vdoLogicalSize=50G
Creating VDO vdo1
      The VDO volume can address 2 GB in 1 data slab.
      It can grow to address at most 16 TB of physical storage in 8192 slabs.
      If a larger maximum size might be needed, use bigger slabs.
Starting VDO vdo1
Starting compression on VDO vdo1
VDO instance 0 volume is ready at /dev/mapper/vdo1
VDO Information and Statistics
To analyse a VDO volume, run the vdo status command. It displays a report on the VDO system and the status of the volume in YAML format. You can limit the output to a specific volume with the --name= option; with only one volume, the option is unnecessary.
$ sudo vdo status
VDO status:
  Date: '2021-09-10 13:57:42-04:00'
  Node: el84
Kernel module:
  Loaded: true
  Name: kvdo
  Version information:
    kvdo version: 188.8.131.52
Configuration:
  File: /etc/vdoconf.yml
  Last modified: '2021-09-10 12:58:27'
VDOs:
  vdo1:
    Acknowledgement threads: 1
    Activate: enabled
    Bio rotation interval: 64
    Bio submission threads: 4
    Block map cache size: 128M
    Block map period: 16380
    (...)
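Because the report is long, it can be handy to pull out a single setting. The filter below is a minimal sketch: status_field is a hypothetical helper that matches on the key text only and does not parse YAML properly. The sample input is taken from the report shown above.

```shell
# Usage: sudo vdo status --name=vdo1 | status_field "Block map cache size"
# Prints the value of the first line of the form "<indent>KEY: VALUE".
# (Hypothetical helper; simple text matching, not a real YAML parser.)
status_field() {
    local key=$1
    awk -v key="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop the indentation
            prefix = key ": "
            if (index(line, prefix) == 1) {   # line starts with "KEY: "
                print substr(line, length(prefix) + 1)
                exit
            }
        }'
}

# Demonstration on a fragment of the report:
status_field "Block map cache size" <<'EOF'
VDOs:
  vdo1:
    Acknowledgement threads: 1
    Block map cache size: 128M
    Block map period: 16380
EOF
```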
To check the volume, we can use the vdostats command. Since VDO provides thin provisioning, this tool should also be used to determine how much free physical space is left on the underlying storage device:
$ sudo vdostats --human-readable
Device                    Size      Used Available Use% Space saving%
/dev/mapper/vdo1          5.0G      3.0G      2.0G  60%           N/A
The output of the vdostats command displays the VDO volume device name (Device) along with statistics that indicate the total physical size (Size, shown as 1K-blocks when --human-readable is omitted), the space in use (Used), the space remaining (Available), the percentage of space in use (Use%), and the percentage of space saved (Space saving%).
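Since running out of physical space on a thinly provisioned volume is the main risk, the Use% column is worth watching. The filter below is a minimal sketch that assumes the six-column layout shown above; check_vdo_usage and the default threshold of 80 are hypothetical choices.

```shell
# Usage: sudo vdostats | check_vdo_usage [THRESHOLD_PERCENT]
# Prints a warning for every device whose Use% is at or above the threshold
# and returns 1 if at least one warning was printed.
# (Hypothetical helper; assumes Use% is the fifth column.)
check_vdo_usage() {
    local threshold=${1:-80}
    awk -v limit="$threshold" '
        NR == 1 { next }                  # skip the header line
        {
            use = $5                      # fifth column is Use%
            sub(/%$/, "", use)            # strip the trailing percent sign
            if (use + 0 >= limit) {
                printf "WARNING: %s is %s%% full\n", $1, use
                warned = 1
            }
        }
        END { exit warned }'
}
```

With the sample output above, a threshold of 50 would print a warning for /dev/mapper/vdo1, while the default of 80 would stay silent.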
Next, we can format the VDO volume with the XFS file system:
$ sudo mkfs.xfs -K /dev/mapper/vdo1
meta-data=/dev/mapper/vdo1       isize=512    agcount=4, agsize=3276800 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=13107200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=6400, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
and mount the resource:
$ sudo mount /dev/mapper/vdo1 /mnt && df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                    627M     0  627M   0% /dev
tmpfs                       657M     0  657M   0% /dev/shm
tmpfs                       657M  9.3M  648M   2% /run
tmpfs                       657M     0  657M   0% /sys/fs/cgroup
/dev/mapper/eurolinux-root   21G  4.7G   16G  23% /
/dev/vda1                  1014M  246M  769M  25% /boot
tmpfs                       132M  4.0K  132M   1% /run/user/1000
/dev/mapper/vdo1             50G  390M   50G   1% /mnt
At system startup, the vdo systemd unit automatically starts all VDO devices that are configured as active. The unit is installed and enabled by default when the vdo package is installed. If the system restarts after an unclean shutdown, VDO rebuilds its metadata to verify consistency and repairs it as needed.
VDO is designed to save disk space and reduce costs. Savings can be seen in both traditional data centres and cloud-based deployments. Depending on your needs, this can translate into lower costs per compute instance, lower costs of external block storage, and lower costs of long-term snapshot storage.
The degree of data reduction that can be observed using VDO will vary depending on the type of data stored. Compressed video or audio files will not take full advantage of this technology, but backups, virtual machines and container deployments will provide very tangible savings.