Create RAID6 Array with mdadm and xfs

I recently used mdadm to set up RAID6 on my GNU/Linux file servers. The following is a tutorial covering the commands I ran on Debian 9 Stretch to accomplish this task.

Prerequisites

You will need to install the parted, mdadm, and xfsprogs packages. GNU parted is used to create partition tables. mdadm is used to create and maintain multi devices. The xfsprogs package contains several xfs-related utilities, including mkfs.xfs and xfs_info.

$ sudo apt-get install parted mdadm xfsprogs
Create Partition Table

The first step in preparing your disk drives is to create the GPT partition table.

$ sudo parted /dev/sdb mklabel gpt
Create Partition

Next you need to create the partitions. This step is technically optional since the RAID array can be built from whole devices, but I recommend using partitions so that down the road you have the option of replacing a drive with one of equal or greater capacity. For my purposes, I just created a single partition that consumes all of the available space.

The “-a optimal” option tells parted to use optimal alignment.

$ sudo parted -a optimal /dev/sdb mkpart primary 0% 100%
Enable the RAID Flag

Next you need to set the raid flag on the partition.

$ sudo parted /dev/sdb set 1 raid on
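
If you are preparing several drives, the three parted steps above can be run in a small loop. The following is only a sketch that assumes the member drives are /dev/sdb through /dev/sdh, as in the array built below; adjust the device list to match your system. The --script option simply suppresses parted's interactive prompts.

$ for d in /dev/sd{b,c,d,e,f,g,h}; do
>   sudo parted --script "$d" mklabel gpt
>   sudo parted --script -a optimal "$d" mkpart primary 0% 100%
>   sudo parted --script "$d" set 1 raid on
> done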
Verify the Configuration

You can verify that the partition table is type gpt, that the primary partition has been created with the appropriate size, and that the raid flag is set on that partition.

$ sudo parted /dev/sdb print
Model: ATA HGST HDN724040AL (scsi)
Disk /dev/sdb: 4001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  4001GB  4001GB               primary  raid
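
The same check can be repeated across all of the member drives in one go, again assuming the /dev/sdb through /dev/sdh drives used here:

$ for d in /dev/sd{b,c,d,e,f,g,h}; do sudo parted "$d" print; done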
Create the Multi Device

After you have completed the previous steps on all disks, the next step is to create the multi device (md). In my case, I am creating a RAID6 multi device from 7 partitions.

$ sudo mdadm --create --verbose /dev/md0 --level=6 --raid-devices=7 /dev/sd{b,c,d,e,f,g,h}1
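Creating the array kicks off an initial resync in the background, which can take many hours on drives of this size. You can keep an eye on its progress while you carry on with the remaining steps:

$ watch cat /proc/mdstat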

The multi device is now created, but without a configuration entry it may not be assembled under the same name on reboot. To preserve the configuration, append the scan output to the mdadm configuration file. You may also need to update the initramfs and grub to ensure the multi device is assembled automatically at boot.

$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
$ sudo update-initramfs -u
$ sudo update-grub
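
The line appended by mdadm --detail --scan should look roughly like the following; the metadata version, name, and UUID come from your own array (the values below match the --detail output shown at the end of this article):

ARRAY /dev/md0 metadata=1.2 name=euclid:0 UUID=a7faccce:58914d5e:b26cce7e:795b773b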
Create the XFS Filesystem

Now that we have a multi device, we can create the filesystem. I recommend xfs if the effective capacity of your multi device may eventually grow beyond 16TB.

The “-d” option allows us to specify one or more disk parameters.

The sunit disk parameter stands for stripe unit. The optimal value is the RAID chunk size (the amount of data written to one disk before moving on to the next) in bytes divided by 512. In my case the chunk size is 512KB, and thus the optimal sunit is 1024.

The swidth disk parameter stands for stripe width. The optimal value is the stripe unit multiplied by the number of data (non-parity) disks. In my case, I am using RAID6 with a total of 7 disks. Since RAID6 uses two disks' worth of parity, that leaves me with 5 data disks, so the optimal swidth is 5120 (5 data disks times the 1024 stripe unit).
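
The same arithmetic can be done in the shell as a quick sanity check. This is only a sketch assuming the 512KB chunk size and 5 data disks described above:

$ CHUNK_KB=512; DATA_DISKS=5
$ echo "sunit=$((CHUNK_KB * 1024 / 512)) swidth=$((CHUNK_KB * 1024 / 512 * DATA_DISKS))"
sunit=1024 swidth=5120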

$ sudo mkfs.xfs -d sunit=1024,swidth=5120 /dev/md0

You can verify the sunit and swidth settings afterwards, but it can be tricky because the units are different. With mkfs.xfs they are given as multiples of 512 bytes. With xfs_info, they are reported as multiples of the block size (bsize), which in this case is 4096. So the values appear as 1/8th of what we specified in the mkfs.xfs command.
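
Converting from 512-byte units to 4096-byte blocks is just a division by 8, which you can confirm quickly in the shell:

$ echo $((1024 * 512 / 4096)) $((5120 * 512 / 4096))
128 640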

$ sudo xfs_info /dev/md0
meta-data=/dev/md0               isize=256    agcount=32, agsize=152612736 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=4883607040, imaxpct=5
         =                       sunit=128    swidth=640 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
Alternative: Create the ext4 Filesystem

If you are certain you will never exceed the 16TB limit, then ext4 is a reasonable alternative choice for a filesystem. Note that you should only run this command if you are using ext4 instead of xfs.

$ sudo mkfs.ext4 -F /dev/md0
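
ext4 has analogous stripe tuning options if you want to set them explicitly; recent versions of mkfs.ext4 can usually detect the geometry from the md device on their own, so treat this as an optional sketch. Here stride is the 512KB chunk size expressed in 4KB blocks (128) and stripe-width is stride times the 5 data disks (640):

$ sudo mkfs.ext4 -F -E stride=128,stripe-width=640 /dev/md0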
Mount the Multi Device

A mount point is simply the directory at which the device is mounted, so we just need to make sure that directory exists and then mount the multi device there.

$ sudo mkdir -p /mnt/md0
$ sudo mount /dev/md0 /mnt/md0

Next we update fstab so that the filesystem is mounted automatically on reboot. If you chose ext4 instead of xfs, adjust the filesystem type in the entry accordingly.

$ echo '/dev/md0 /mnt/md0 xfs defaults,nofail 0 0' | sudo tee -a /etc/fstab
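
If you would rather not depend on the /dev/md0 name, you can mount by filesystem UUID instead. blkid prints the UUID; the value below is only a placeholder, so substitute the one reported for your filesystem:

$ sudo blkid /dev/md0
$ echo 'UUID=<uuid-from-blkid> /mnt/md0 xfs defaults,nofail 0 0' | sudo tee -a /etc/fstab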

Finally, df can be used to verify that the filesystem is mounted and to display its usage.

$ df -h -t xfs
Filesystem Size Used Avail Use% Mounted on
/dev/md0   19T  0T   0T    0%   /mnt/md0
Verify the Setup

You can verify the setup with lsblk. It shows each disk device and its capacity, the raid partition on it, and the multi device with its capacity, filesystem type, raid level, and mount point.

$ lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT /dev/sd{b,c,d,e,f,g,h}
NAME       SIZE FSTYPE            TYPE  MOUNTPOINT
sdb        3.7T                   disk
└─sdb1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sdc        3.7T                   disk
└─sdc1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sdd        3.7T                   disk
└─sdd1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sde        3.7T                   disk
└─sde1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sdf        3.7T                   disk
└─sdf1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sdg        3.7T                   disk
└─sdg1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0
sdh        3.7T                   disk
└─sdh1     3.7T linux_raid_member part
  └─md0   18.2T xfs               raid6 /mnt/md0

You can view the state of multi devices using the proc virtual filesystem.

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid6 sdc1[4] sdg1[3] sde1[2] sdd1[7] sdf1[6] sdh1[0] sdb1[5]
 19534428160 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]
 bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

Lastly, you can view detailed information about a multi device using mdadm.

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Mon May 22 21:28:46 2017
     Raid Level : raid6
     Array Size : 19534428160 (18629.48 GiB 20003.25 GB)
  Used Dev Size : 3906885632 (3725.90 GiB 4000.65 GB)
   Raid Devices : 7
  Total Devices : 7
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Sep 8 20:07:07 2017
          State : clean
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : euclid:0  (local to host euclid)
           UUID : a7faccce:58914d5e:b26cce7e:795b773b
         Events : 48719

    Number   Major   Minor   RaidDevice State
       0       8      113        0      active sync   /dev/sdh1
       6       8       81        1      active sync   /dev/sdf1
       7       8       49        2      active sync   /dev/sdd1
       3       8       97        3      active sync   /dev/sdg1
       4       8        1        4      active sync   /dev/sdc1
       5       8       17        5      active sync   /dev/sdb1
       2       8       65        6      active sync   /dev/sde1