Monthly Archives: September 2017

A home NAS

So, my MythTV box was starting to fill up.  It had three 3TB drives in it.  I also had three 3TB drives in my main desktop machine to hold a backup of the Myth box.  With space running low, and with the cases pretty full of hard drives, it was time to do something.

I decided I would build a NAS, and buy some 6TB drives.  My reasoning was that I could stripe pairs of 3TB drives together into logical 6TB drives.  I would then have a Myth box with three 6TB drives, and a NAS with six 3TB drives and one 6TB drive.  The NAS drives would be set up to look like four 6TB drives, and I could do a RAIDZ1 on those.

Over time, as the 3TB drives failed, I would buy 6TB drives when necessary, and pair together the survivors of striped pairs that had failed.  Eventually, I would have four 6TB drives in the box.

One reason for having a NAS is that I could just pick up the whole box, carry it to a relative’s house, and leave it there if I was going on a vacation.  I don’t like having all my backups in the same place, even if important files are backed up in a fireproof/waterproof safe.

I do have sensitive files, so it was also important that the contents of the NAS be unavailable when it was booted up, until a passphrase was supplied.

So, I took my starting point from two articles on Brian Moses’ blog.  https://blog.brianmoses.net/2016/02/diy-nas-2016-edition.html and https://blog.brianmoses.net/2017/03/diy-nas-2017-edition.html.

I bought some hardware:

I installed FreeNAS-11.0-U2 (e417d8aa5) on the boot USB drive, and started configuring.

I wasn’t able to figure out how to stripe the 3TB drives together into 6TB logical discs for the RAID array, and asked on the forum: https://forums.freenas.org/index.php?threads/using-striped-disks-as-raid-members.57245/.

It turns out you can’t do that, but the suggestion I received there, to partition the 6TB drive into two 3TB logical units and then put everything together as RAIDZ2, was a workable alternative.

So, here’s the procedure I worked out.  My 3TB drives are on ada0, ada1, ada2, ada5, ada6, and ada7.  My 6TB drive is on ada8.  A lot of this had to be done on the command line.  As I was eventually to figure out, there’s also a lot of stuff that I traditionally do on the command line that can’t be done that way anymore.

First, I create an encryption key:

dd if=/dev/random of=MainPool.key bs=64 count=1

I uploaded this key to Google Drive, as an off-site backup.  The passphrase that is used with the key means that the key isn’t particularly useful by itself.

So, we create the encrypted drives:

for i in ada0 ada1 ada2 ada5 ada6 ada7 ada8
do
    geli init -s 4096 -K MainPool.key /dev/$i
done

It will ask for the passphrase twice for each drive, so 14 times.  Then we attach the encrypted devices.

for i in ada0 ada1 ada2 ada5 ada6 ada7 ada8
do
    geli attach -k MainPool.key /dev/$i
done

It will ask for the passphrase once for each drive.

Next, we put a single partition on each of the 3TB drives:

for i in ada0 ada1 ada2 ada5 ada6 ada7
do
    gpart create -s gpt /dev/${i}.eli
    gpart add -t freebsd-zfs -b 128 /dev/${i}.eli
done

Now, we have to partition the 6TB encrypted drive.  Note that the size of the drive looks different on the encrypted device than on the bare device.  I think the block sizes are different, since geli was told to use 4096-byte sectors.  So, I used this sequence of commands, with the size argument on the third line being half of the size reported by the ‘show’ command on the second line:

gpart create -s gpt /dev/ada8.eli
gpart show /dev/ada8.eli
gpart add -t freebsd-zfs -s 732565317 /dev/ada8.eli
gpart add -t freebsd-zfs /dev/ada8.eli
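
One way to sanity-check that size argument is to convert it back to bytes.  This is only a sketch: it assumes the 4096-byte sector size passed to geli init earlier, the two helper functions are made up for the example, and the total handed to half_sectors is simply twice the partition size used above, not real gpart output.

```shell
#!/bin/sh
# Hypothetical helpers for sector arithmetic; they are not part of gpart or geli.

# half_sectors TOTAL: half of a sector count, rounded down.
half_sectors() {
    echo $(( $1 / 2 ))
}

# sectors_to_bytes SECTORS SECTOR_SIZE: size in bytes.
sectors_to_bytes() {
    echo $(( $1 * $2 ))
}

# Illustrative total only: twice the partition size used above.
half_sectors 1465130634

# The first partition, at 4096 bytes per sector (the -s value given to geli init).
sectors_to_bytes 732565317 4096
```

The second number comes out at roughly 3.0TB, which is what you would hope for when the partition has to match the 3TB drives.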

Running glabel status allows me to identify the gptids of the partitions.  That’s important because I don’t know whether the adaN identifiers change when drives are removed, or depending on boot-time hardware probing.  The gptids are UUID labels that we can use to identify the partitions unambiguously.

Next, we create the pool:

zpool create MainPool raidz2 \
      gptid/8377bd7e-8d2c-11e7-8faf-d05099c2b71d \
      gptid/83d3e21a-8d2c-11e7-8faf-d05099c2b71d \
      gptid/8425b23d-8d2c-11e7-8faf-d05099c2b71d \
      gptid/8478358d-8d2c-11e7-8faf-d05099c2b71d \
      gptid/84d9db6d-8d2c-11e7-8faf-d05099c2b71d \
      gptid/852b9055-8d2c-11e7-8faf-d05099c2b71d \
      gptid/e7e945b1-8d2c-11e7-8faf-d05099c2b71d \
      gptid/eb13e372-8d2c-11e7-8faf-d05099c2b71d

Of course, you substitute your own gptids there.
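
For a rough idea of what this layout yields, a RAIDZ vdev gives up about one member’s worth of space per level of parity.  The sketch below is back-of-the-envelope arithmetic only: the raidz_usable_tb helper is hypothetical, and it ignores ZFS metadata, padding, and the difference between TB and TiB.

```shell
#!/bin/sh
# raidz_usable_tb MEMBERS PARITY MEMBER_TB
# Nominal usable space: members that are not parity, times the member size.
raidz_usable_tb() {
    echo $(( ($1 - $2) * $3 ))
}

# The pool created above: eight 3TB members, RAIDZ2.
raidz_usable_tb 8 2 3

# The original plan: four 6TB logical drives, RAIDZ1.
raidz_usable_tb 4 1 6
```

Both layouts come out at the same nominal 18TB.  One thing to keep in mind is that two members of the RAIDZ2 pool live on the single physical 6TB drive, so losing that drive uses up both levels of parity at once.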

Finally, we export the pool:

zpool export MainPool

All of this happened on the command line, but we can now switch over to the GUI.  In the GUI, go to “Storage”, and select “Import volume”.  Choose the encrypted pool option.  You’ll have to supply the key (MainPool.key), which has to be on the computer that is running your web browser, so you can upload it.  You will then be asked for the passphrase, which you type in.  The system ponders for a while, and then the pool appears.

Next, I created datasets for the backups of the computers in my house.  The main computer is called Londo, so I created a Londo-BACKUP dataset.  Underneath that, I created separate datasets for my root partition, my home partition, and my encrypted partition.  The MythTV box is called “mythtv”, so I created a MythTV-BACKUP dataset, and underneath that separate datasets for the non-media partitions, and one for each media partition.  I turned off compression on the media partition datasets, as that wasn’t going to achieve anything on H.264 files.  With this granularity of datasets, I can snapshot the root filesystem and the user files separately, and I can avoid snapshotting the MythTV media partitions, which see a lot of churn in multi-GB files.

We now move on to the configuration.  This was more difficult than it had to be, mostly because by now, having set up the drives that way, I was primed to use the command line for things.  I know how to configure ssh, rsync, and so on, and I made my changes, but nothing worked.  Turns out that the FreeNAS system overwrites those configuration files when it wants to, so my changes were being silently discarded.  Many of these settings have to be altered through the GUI, not from the shell.

After FreeNAS installation, the sshd is set up to refuse port forwarding requests.  I wanted to use those for my rsync jobs.  I would alter the /etc/ssh/sshd_config file, and the changes would do nothing.  The file wasn’t modified.  Turns out that there are two sshd_config files.  The /etc/ssh directory, while present and populated, is unused.  The actual location of the sshd files is /etc/local/ssh, and they are subject to being overwritten, so I used the GUI to turn on port forwarding.

I was getting very slow throughput on ssh, about 1/8 of the wire speed.  I confirmed that wget ran at the expected speed, and that two ssh sessions each got 1/8 of the wire speed, so I was CPU bound on the decryption of the data stream.  That was a bit surprising; it’s been a while since I saw a CPU that couldn’t handle a vanilla ssh session at gigabit ethernet speeds.  So, I checked to see what ciphers were supported on the FreeNAS sshd, and tested them for throughput.  I settled on “aes128-gcm@openssh.com”, which allowed me to use half the wire speed.  Good enough, though the initial backup would take over 40 hours, rather than just 20.  I avoided that by backing up over three separate ssh channels, so I could go at full wire speed by backing up different datasets in parallel.
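
Those 20 and 40 hour figures are consistent with simple arithmetic.  As a sketch, assume about 9TB to copy (three full 3TB drives, which is an assumed figure rather than one stated above) and gigabit ethernet at a nominal 125 MB/s with no protocol overhead.  The transfer_hours helper is made up for the example.

```shell
#!/bin/sh
# transfer_hours BYTES BYTES_PER_SECOND: whole hours needed, rounded down.
transfer_hours() {
    echo $(( $1 / $2 / 3600 ))
}

# Assumed 9TB of data at full gigabit wire speed (125 MB/s).
transfer_hours 9000000000000 125000000

# The same data at half the wire speed.
transfer_hours 9000000000000 62500000
```

Real transfers will be somewhat slower than this, since the arithmetic ignores ssh and rsync overhead.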

On to rsync.  I like to have security inside my firewall, and don’t like the idea of backups being sent over unencrypted channels.  I also don’t want the compromise of a single machine to endanger other machines if that’s at all avoidable.  So, simple rsync over port 873 wasn’t what I was looking for.  I also wanted the machines in the house to be able to decide if and when to perform a backup, rather than having the NAS pull backups.  That way my scripts could prepare the backup and ensure that their filesystems are healthy before starting to write to the NAS.  The obvious choice, then, is rsync tunneled over ssh.

First, I generated an rsync key:

ssh-keygen -b 521 -t ecdsa -N "" -C "Rsync access key" -f rsync_key

I copied the private key to all the machines needing to make backups, and I put the public key into the authorized_keys file:

no-pty,command="/bin/echo No commands permitted" ecdsa-sha2-nistp521 <KEYTEXT> Rsync access key

The default umask in bash on the FreeNAS box is 0022, so you have to be careful with permissions.  Make sure to set files in .ssh to 0400 or 0600, to ensure that they are not ignored.
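
A small helper can take care of those permissions in one go.  This is a minimal sketch, not something FreeNAS provides: the function name is made up, and it only touches regular files directly inside the directory it is given.

```shell
#!/bin/sh
# lock_down_ssh_dir DIR (hypothetical helper)
# Make DIR accessible to its owner only (0700), and every regular file
# directly inside it readable and writable by the owner only (0600).
lock_down_ssh_dir() {
    chmod 700 "$1" || return 1
    for f in "$1"/*; do
        [ -f "$f" ] && chmod 600 "$f"
    done
    return 0
}

# Example use, left commented out so that nothing changes by accident:
# lock_down_ssh_dir /root/.ssh
```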

This authorized_keys file does not allow the user of the key to execute any commands.  They can open a connection, and they can forward ports.  So, each machine on my network can use this same key to open an encrypted connection to the rsyncd port on the NAS.

Next, we have to set up rsyncd.  This has to happen in the GUI.  There’s a box labelled “auxiliary parameters”; you just copy everything in there.  It opens in the global section, so put in the individual section headers, and you can append anything you like to the base rsyncd setup.  Here’s mine:

address = 127.0.0.1

[Londo]
path = /mnt/MainPool/Londo-BACKUP
use chroot = yes
numeric ids = yes
read only = no
write only = no
uid = 0
gid = 0
auth users = londobackup
secrets file = /root/rsync-secrets.txt

[MythTV]
path = /mnt/MainPool/MythTV-BACKUP
use chroot = yes
numeric ids = yes
read only = no
write only = no
uid = 0
gid = 0
auth users = mythbackup
secrets file = /root/rsync-secrets.txt

[Djinn]
path = /mnt/MainPool/Djinn-BACKUP
use chroot = yes
numeric ids = yes
read only = no
write only = no
uid = 0
gid = 0
auth users = djinnbackup
secrets file = /root/rsync-secrets.txt

EDIT #1 (2017-09-05): Originally, I had the secrets file at /usr/local/etc/rsync/secrets.txt, but it turns out that, on reboot, extra files in the /usr/local/etc/rsync directory are deleted, so my secrets file disappeared and my backups failed.  I have moved it to the parent directory now.

EDIT #2 (2017-09-07): Turns out the /usr/local/etc directory isn’t safe either.  After performing an upgrade, I lost the passwords file again.  I have moved it to /root.

With the ‘address’ keyword in the global section, we restrict access to the rsyncd to localhost.  That’s fine for us, we’ll be coming in through ssh, so our connections will appear to come from 127.0.0.1.  Other accesses will be blocked.  I use chroot and numeric IDs because I do not have common UIDs between machines in my home network anyway, so I don’t care to remap IDs on the NAS.  I run as UID/GID zero so that the backup has permission to create any files and ownerships that are appropriate.  There is a secrets file that contains the plaintext passwords needed to have access to each module.  Mine looks like this:

mythbackup:<PASSWORD1>
londobackup:<PASSWORD2>
djinnbackup:<PASSWORD3>

The appropriate password is also copied into a file on each machine being backed up.  I’ve chosen to put it in ~/.ssh/rsync-password.  Only the password, not the username.  Make sure the permissions are 0400 or 0600.

Now, the backup configuration.  Let’s look at the MythTV box.  Here’s its /root/.ssh/config file:

host freenas-1
hostname freenas-1.i.cneufeld.ca
compression yes
IdentityFile /root/.ssh/rsync_key
protocol 2
RequestTTY no
Ciphers aes128-gcm@openssh.com
LocalForward 4322 127.0.0.1:873

This says that when the root user on the MythTV box just types “ssh freenas-1”, ssh will go to the correct machine, with stream compression, using the rsync key, protocol 2, no terminal, and the cipher we identified as acceptably fast.  It will also open a local forwarding port, 4322, on the MythTV box, which encrypts all traffic and sends it to port 873 on the NAS box.

Now, the backup script:

 

#!/bin/bash
#
# Back up the MythTV box to the NAS through an ssh tunnel.
# Needs bash rather than plain sh: pushd and popd are bash builtins.

RSYNC_OPTS="--password-file=/root/.ssh/rsync-password -avx --del -H"

# Refuse to run unless every media partition is mounted.
ls /myth/tv1/xfs-vol \
   /myth/tv2/xfs-vol \
   /myth/tv3/xfs-vol > /dev/null 2>&1 || exit 1

# Open the tunnel in the background and remember its PID.
ssh -N freenas-1 &
killme=$!

# Give the connection a few seconds to come up.
sleep 5

/root/bin/generate-sql-dump.sh

# Non-media partitions: one rsync per partition, because of -x.
rsync ${RSYNC_OPTS} / \
      rsync://mythbackup@127.0.0.1:4322/MythTV/Non-media_Filesystem
rsync ${RSYNC_OPTS} --exclude=.mythtv/cache \
      --exclude=.mythtv/Cache* \
      /home \
      rsync://mythbackup@127.0.0.1:4322/MythTV/Non-media_Filesystem/home
rsync ${RSYNC_OPTS} /data/srv/mysql \
      rsync://mythbackup@127.0.0.1:4322/MythTV/Non-media_Filesystem/data/srv/mysql
rsync ${RSYNC_OPTS} \
      /data/storage/disk0 \
      rsync://mythbackup@127.0.0.1:4322/MythTV/Non-media_Filesystem/data/storage/disk0

# Media partitions: back up each one from inside its mount point.
pushd /myth
rsync ${RSYNC_OPTS} . rsync://mythbackup@127.0.0.1:4322/MythTV/Media_Disk_1
popd

pushd /myth/tv2
rsync ${RSYNC_OPTS} . rsync://mythbackup@127.0.0.1:4322/MythTV/Media_Disk_2
popd

pushd /myth/tv3
rsync ${RSYNC_OPTS} . rsync://mythbackup@127.0.0.1:4322/MythTV/Media_Disk_3
popd

# Shut down the tunnel.
kill "$killme"

What does this do?  First, it verifies that all the media drives are mounted.  I created directories called xfs-vol on each drive.  If those directories are not all present, it means that at least one partition is not correctly mounted, and we don’t want to run a backup.  If a power spike bounced the box while killing a drive, it would start up, but maybe /myth/tv3/ would be empty.  I don’t want the backup procedure to run, and delete the entire /myth/tv3 backup.
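
The ls test in the script is silent about which volume is missing.  If you wanted the check to say so, something along these lines might do.  It’s a sketch only, and the require_sentinels name is made up.

```shell
#!/bin/sh
# require_sentinels DIR... (hypothetical helper)
# Succeed only if every sentinel directory exists; name the first one missing.
require_sentinels() {
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "missing sentinel directory: $d" >&2
            return 1
        fi
    done
    return 0
}

# How it could replace the ls test in the backup script (left commented out):
# require_sentinels /myth/tv1/xfs-vol /myth/tv2/xfs-vol /myth/tv3/xfs-vol || exit 1
```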

Next, we create the ssh connection to the NAS and record the PID of the ssh.  We wait a few seconds for the connection to complete.
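
A fixed sleep works, but a retry loop is an alternative if the tunnel is sometimes slow to come up.  This is a sketch under assumptions: the wait_for function and its WAIT_TRIES and WAIT_DELAY variables are invented here, and the commented usage line assumes a netcat that supports the -z option is installed on the MythTV box.

```shell
#!/bin/sh
# wait_for CMD... (hypothetical helper)
# Run CMD until it succeeds, up to WAIT_TRIES attempts (default 10),
# sleeping WAIT_DELAY seconds (default 1) after each failed attempt.
wait_for() {
    tries=${WAIT_TRIES:-10}
    delay=${WAIT_DELAY:-1}
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# How it could stand in for the sleep in the backup script (left commented out):
# wait_for nc -z 127.0.0.1 4322 || { kill "$killme"; exit 1; }
```

The advantage over a plain sleep is that the script gives up cleanly if the tunnel never appears, rather than launching rsync against a port that isn’t there.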

We generate a mysql dump.  Backing up the raw MySQL files is rarely a good strategy, because the resulting files generally can’t be used.  The mysql dump is an ASCII snapshot of the database at a given instant, and can be used to rebuild the database during a restore from backup.

Because I use the ‘-x’ switch in rsync, we don’t descend into mount points, so each partition has to be backed up explicitly.  The next four rsync commands send four non-media partitions into a single backup directory and dataset on the NAS.  The “mythbackup” user is the username in /root/rsync-secrets.txt on the NAS box; it need not exist in /etc/passwd on either box.

Next are the media partitions.  They are mounted on /myth, /myth/tv2, and /myth/tv3.  To avoid leading cruft, we chdir to each mount point and then send the backup to the appropriate subdirectory of the module.  Once everything’s backed up, we kill the ssh tunnel.
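
One caveat with the chdir approach: if a pushd ever failed, the rsync after it would still run, from whatever directory the script happened to be in, with --del in effect.  A subshell that only runs the command when the cd succeeds avoids that.  The in_dir helper below is a made-up name, and this is a sketch rather than what my script does.

```shell
#!/bin/sh
# in_dir DIR CMD... (hypothetical helper)
# Run CMD inside DIR in a subshell.  If DIR cannot be entered, CMD is not
# run at all, and the caller's working directory is never changed.
in_dir() {
    dir=$1
    shift
    ( cd "$dir" && "$@" )
}

# How one of the media backups could be written with it (left commented out):
# in_dir /myth/tv2 rsync ${RSYNC_OPTS} . rsync://mythbackup@127.0.0.1:4322/MythTV/Media_Disk_2
```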

That’s pretty well everything for now.  I might write another little article soon about what I did with the pair of 250GB SSDs.  They’re for swap, ZIL, and L2ARC (see this article), with about 120GB left over for a mirrored, unencrypted pool holding things like family photos that can be recovered even if the decryption key is lost.