How to configure S3 storage backend for Bareos

We haven’t had a chance to mention the Bareos backup solution on our blog yet, so we’ll make amends and begin with a short intro.

Bareos stands for Backup Archiving Recovery Open Sourced and, as the name suggests, it’s an open-source, cross-network backup solution. The project was started in 2010 as a fork of the well-known Bacula backup solution. It supports Linux, FreeBSD, AIX, HP-UX, Solaris, Windows, and macOS, and thanks to its pluggable architecture it supports multiple storage backends and can back up data stored on the filesystem, in relational databases (MySQL, MariaDB, PostgreSQL, MSSQL), in LDAP directories, in virtual machines, etc.

Why use S3 as a storage backend

The beauty of Bareos is its flexibility: you can combine multiple storage backends within one backup infrastructure. Ours consists of an on-premises storage system and S3 object storage.

We use on-premises storage for new and recent backups because it offers better backup creation and restore performance. The downside is the cost of keeping backups there long-term, at least compared to the price of S3 object storage.

For long-term backups, we use the AWS S3 service. While backup creation and restore can’t match the performance of on-premises storage, S3 object storage has multiple benefits: it’s cheaper, data is encrypted in transit and at rest, and the service offers high availability and durability. On top of that, having your backups replicated to a different geographical location is like having a disaster recovery plan for your disaster recovery plan. 🙂
S3 also offers different storage classes, so depending on your use case you can choose a more performant but more expensive class, or a slower and less durable, but cheaper one.

Configuring S3 storage backend for Bareos

The prerequisites for configuring the S3 storage backend are:

  • S3 bucket
  • IAM credentials (preferably with a least-privilege IAM policy; see the sketch below)
  • bareos-storage-droplet (DEB/RPM) package (available in the Bareos repository)
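
As a rough sketch, a least-privilege policy could look like the following. Treat it as a starting point rather than a definitive policy: the exact action set is our assumption and may vary with your Bareos and droplet versions, and the bucket name is a placeholder.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<YOUR_S3_BUCKET_NAME>"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:AbortMultipartUpload"],
      "Resource": "arn:aws:s3:::<YOUR_S3_BUCKET_NAME>/*"
    }
  ]
}

Attach the policy to the IAM user whose access keys you will later reference in the s3.profile.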

First, we need to create a new Bareos Director Storage Resource. We recommend defining multiple S3 storage resources with multiple devices to allow parallel backup jobs to run against the S3 storage.

/etc/bareos/bareos-dir.d/storage/s3.conf

Storage {
  Name = s3_bareos
  Address  = "backup-server.example"
  Password = "<YOUR_BAREOS_STORAGE_PASSWORD>"
  Device = "s3_dev1"
  Device = "s3_dev2"
  Media Type = "s3_object1"
  Maximum Concurrent Jobs = 2 # equals the number of devices
}

Storage {
  Name = s3_bareos2
  Address  = "backup-server.example"
  Password = "<YOUR_BAREOS_STORAGE_PASSWORD>"
  Device = "s3_dev3"
  Device = "s3_dev4"
  Media Type = "s3_object2"
  Maximum Concurrent Jobs = 2 # equals the number of devices
}
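
Before moving on, you can verify that the Director configuration parses cleanly; the -t flag only tests the configuration and exits:

bareos-dir -t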

With the storage resource definitions in place, we now need to configure the storage daemon devices.

/etc/bareos/bareos-sd.d/device/s3.conf

Device {
  Name = "s3_dev1"
  Media Type = "s3_object1"
  Archive Device = "S3 Object Storage"

  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/s3.profile,bucket=<YOUR_S3_BUCKET_NAME>,chunksize=100M,iothreads=10,retries=0,storageclass=standard_ia"

  Device Type = droplet
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Description = "s3 device"
  Maximum Concurrent Jobs = 1
}
Device {
  Name = "s3_dev2"
  Media Type = "s3_object1"
  Archive Device = "S3 Object Storage"

  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/s3.profile,bucket=<YOUR_S3_BUCKET_NAME>,chunksize=100M,iothreads=10,retries=0,storageclass=standard_ia"

  Device Type = droplet
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Description = "s3 device"
  Maximum Concurrent Jobs = 1
}

Device {
  Name = "s3_dev3"
  Media Type = "s3_object2"
  Archive Device = "S3 Object Storage"

  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/s3.profile,bucket=<YOUR_S3_BUCKET_NAME>,chunksize=100M,iothreads=10,retries=0,storageclass=standard_ia"

  Device Type = droplet
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Description = "s3 device"
  Maximum Concurrent Jobs = 1
}

Device {
  Name = "s3_dev4"
  Media Type = "s3_object2"
  Archive Device = "S3 Object Storage"

  Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/s3.profile,bucket=<YOUR_S3_BUCKET_NAME>,chunksize=100M,iothreads=10,retries=0,storageclass=standard_ia"

  Device Type = droplet
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Description = "s3 device"
  Maximum Concurrent Jobs = 1
}

Make sure to update the storageclass parameter if you wish to use a different S3 storage class.
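
For example, switching the devices to the S3 Standard class would change only that one key in the Device Options string. We haven’t verified every class name against the droplet backend, so double-check the values your Bareos version supports in its documentation:

Device Options = "profile=/etc/bareos/bareos-sd.d/device/droplet/s3.profile,bucket=<YOUR_S3_BUCKET_NAME>,chunksize=100M,iothreads=10,retries=0,storageclass=standard"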

Finally, we need to configure the S3 profile and define our IAM access key ID, secret access key, and AWS region.

/etc/bareos/bareos-sd.d/device/droplet/s3.profile

aws_region = eu-central-1
access_key = "<YOUR_IAM_ACCESS_KEY_ID>"
secret_key = "<YOUR_IAM_SECRET_KEY>"
host = s3.eu-central-1.amazonaws.com:443
use_https = true
backend = s3
aws_auth_sign_version = 4
pricing_dir = ""
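
Before pointing Bareos at the bucket, it can be worth sanity-checking the credentials outside of Bareos. Assuming the AWS CLI is installed, a quick listing with the same keys should succeed:

# List the bucket using the same credentials as the s3.profile above
AWS_ACCESS_KEY_ID="<YOUR_IAM_ACCESS_KEY_ID>" \
AWS_SECRET_ACCESS_KEY="<YOUR_IAM_SECRET_KEY>" \
aws s3 ls "s3://<YOUR_S3_BUCKET_NAME>" --region eu-central-1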

With the configuration now in place, restart the Bareos storage daemon (bareos-sd) and reload the Bareos Director (bareos-dir).
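
On a systemd-based distribution, that boils down to:

# Restart the storage daemon so it picks up the new devices
systemctl restart bareos-sd

# Reload the Director configuration from within bconsole
echo "reload" | bconsole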

The final piece of the puzzle is to associate your storage pool with the s3_bareos or s3_bareos2 storage resource, e.g.:

/etc/bareos/bareos-dir.d/pool/longterm_storage.conf

Pool {
  Name = longterm-s3-storage
  Pool Type = Backup
  Storage = "s3_bareos"          # S3 storage has to be defined here, see bug https://bugs.bareos.org/view.php?id=824
  Recycle = yes                  # Bareos can automatically recycle Volumes
  AutoPrune = yes                # Prune expired volumes
  Volume Retention = 365 days    # How long should the backups be kept?
  Maximum Volumes = 50           # Limit number of Volumes in Pool
  Maximum Volume Jobs = 1        # Number of jobs per volume
  Label Format = "s3-longterm-"  # Volumes will be labeled with this prefix
}
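
With everything in place, you can kick off a test job against the new pool from bconsole; the job name below is a placeholder for one of your existing job definitions:

*run job=<YOUR_BACKUP_JOB> pool=longterm-s3-storage
*status storage=s3_bareos
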
Things to look out for

Due to a limitation of the Droplet library, which Bareos uses to communicate with the AWS S3 service, you must set the “Maximum Concurrent Jobs = 1” parameter in the Device resource settings; otherwise, the Bareos storage daemon will refuse to start. This behavior acts as a safety mechanism to avoid data corruption in your S3 volumes, as the Droplet backend doesn’t support block interleaving.

Make sure to adjust (or omit) the Pool’s settings to fit your environment, especially parameters such as “Volume Retention”, “Maximum Volumes”, and “AutoPrune”, to avoid premature volume recycling and data loss.

Take a close look at the documentation for the iothreads Device Option, especially the note about the retries setting in combination with cached writing.
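
To keep an eye on retention and recycling, you can periodically inspect the pool’s volumes from bconsole:

*list volumes pool=longterm-s3-storage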

Speaking of AWS…

Did you know that we are a certified AWS Select Tier partner? If you require help with managing your AWS costs or if you are interested in a fully managed AWS infrastructure, feel free to contact us. We’ll gladly help you out!
