Seafile Mirror - Simple automatic backup of your Seafile libraries

Wouldn’t it be a shame if your library were to be destroyed?

I have been using Seafile for years to host and synchronise files on my own server. It’s fast and reliable, even with a large number of files or very large ones. But making reliable backups of all its files isn’t so trivial. This is because the files are stored in a layout similar to bare Git repositories, and Seafile’s headless tool, seafile-cli, is… suboptimal. So I created what started out as a wrapper for it and ended up as a full-blown tool for automatically synchronising your libraries to a backup location: Seafile Mirror.

My requirements

Of course, you could just take snapshots of the whole server, or copy the raw Seafile data files and import them into a newly created Seafile instance for disaster recovery. But in an emergency, I want to be able to directly access the current state of the files whenever I need them.

It was also important for me to have a snapshot, not just another real-time sync of a library. This is because I also want to have a backup in case I (or an attacker) mess up a Seafile library. A real-time sync would immediately replicate that broken state.

I also want to take a snapshot at a configurable interval. Some libraries should be synchronised more often than others. For example, my picture albums do not change as often as my miscellaneous documents, but they use at least 20 times the disk space and therefore network traffic when running a full sync.
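As a rough illustration of such per-library intervals, the decision boils down to something like the following Python sketch. The library names, the interval values and the is_due helper are made up for this example; they are not the tool’s actual configuration format or code.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical per-library intervals: large, rarely changing libraries
# are mirrored less often than small, frequently changing ones.
INTERVALS = {
    "documents": timedelta(days=1),
    "pictures": timedelta(days=7),
}


def is_due(last_sync: Optional[datetime], interval: timedelta,
           now: datetime) -> bool:
    """Return True if a library should be mirrored again."""
    if last_sync is None:
        # The library has never been mirrored before
        return True
    return now - last_sync >= interval


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    last = datetime(2024, 1, 8, 12, 0)  # last mirrored two days earlier
    for name, interval in INTERVALS.items():
        print(name, is_due(last, interval, now))
    # documents True
    # pictures False
```

With values like these, the small documents library would be fetched daily, while the much larger picture albums would only cause traffic once a week.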

Also, the backup service must have read-only access to the files.

A version controlled backup of the backup (i.e. the plain files) wasn’t in scope. I handle this separately by backing up my backup location, which also contains similar backups of other services and machines. For this reason, my current solution does not do incremental backups, even though this may be relevant for other use cases.

The problems

Actually, seafile-cli should have been everything you’d need to fulfill the requirements. But no. It turned out that this tool has a number of fundamental issues:

  • You can make the host the tool is running on a sync peer. However, this easily leads to sync errors if the user only has read-only permissions to the library.
  • You can also download a library, but that, too, may lead to strange sync errors.
  • It requires a running daemon which crashes irregularly during larger sync tasks or has other issues.
  • Download/sync intervals cannot be set manually.

The solution

seafile-mirror takes care of all these stumbling blocks:

  • It downloads/syncs defined libraries at customisable intervals
  • It de-syncs libraries immediately after they have been downloaded to avoid sync errors
  • You can force a re-sync of a library even if its re-sync interval hasn’t been reached yet
  • Extensive informative and error logging is provided
  • It was created with automation in mind, so you can run it in cronjobs or systemd triggers
  • And as explained, it deals with the numerous caveats of seaf-cli and Seafile in general
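The “sync, then de-sync” order from the list above could be sketched roughly like this in Python. This is not the tool’s real implementation: mirror_library, run and wait_until_synced are hypothetical names, and the seaf-cli arguments are my assumption of what the calls look like, so please check them against seaf-cli sync --help and seaf-cli desync --help before relying on them.

```python
from typing import Callable, List

# A runner receives the command as an argument list; in real use this
# could be something like functools.partial(subprocess.run, check=True).
Runner = Callable[[List[str]], object]


def mirror_library(lib_id: str, target_dir: str, server: str, user: str,
                   password: str, run: Runner,
                   wait_until_synced: Callable[[], None]) -> None:
    """Sync one library into target_dir, then detach the folder again."""
    # Assumed seaf-cli invocation (flags are not verified here)
    run(["seaf-cli", "sync", "-l", lib_id, "-s", server,
         "-d", target_dir, "-u", user, "-p", password])
    try:
        # Hypothetical hook that blocks until the download has finished
        wait_until_synced()
    finally:
        # Always de-sync, even if waiting failed, so the folder stays a
        # plain snapshot instead of a live sync peer
        run(["seaf-cli", "desync", "-d", target_dir])


if __name__ == "__main__":
    recorded: List[List[str]] = []
    mirror_library("library-id", "/backup/documents",
                   "https://seafile.example.org", "backup@example.org",
                   "secret", run=recorded.append,
                   wait_until_synced=lambda: None)
    print([cmd[1] for cmd in recorded])
```

Passing the runner in as a parameter keeps the sketch testable without a running Seafile daemon. The de-sync step sits in a finally block so that a failed run does not leave a half-configured sync peer behind.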

Full installation and usage documentation can be found in the project repository. Installation is as simple as running pip3 install seafile-mirror, and a sample configuration is provided.

In my setup, I run this application on a headless server with systemd under a separate user account. Therefore the systemd service needs to be set up first. This is also covered in the tool’s documentation. And as an Ansible power user, I also provide an Ansible role that does all the setup and configuration.

Possible next steps

The tool has been running every day for a couple of months without any issues. However, I could imagine a few more features that would be helpful for more people:

  • Support of login tokens: Currently, only user/password auth is supported, which is fine for my use case as it’s just a read-only user. This wouldn’t be hard to add either, since seafile-cli supports it (at least in theory). (#2)
  • Support of encrypted libraries: This shouldn’t be a big issue, as it would require passing the password to the underlying seafile-cli command. (#3)

If you have encountered problems or would like to point out the need for specific features, please feel free to contact me or comment on the Mastodon post. I’d also love to hear if you’ve become a happy user of the tool 😊.


