Skip to content. | Skip to navigation

Personal tools
Log in
Sections
You are here: Home Cloud backup & sync for Linux, Android: Comparison table & a winner

Cloud backup & sync for Linux, Android: Comparison table & a winner

published Sep 25, 2018 01:20   by admin ( last modified Oct 04, 2018 10:51 )

Summary

After having reviewed some 28 components including Borg, Restic, Duplicati, Seafile and others (see big comparison table below), the rather boring answer as the selected winner I will start using is Rsync + Encfs for personal backup, with cloud solutions Rsync.net or Hetzner Box as snapshotting back ends, or your own server of course. I have used rsync+btrfs previously for snapshotting backups so it is not a completely new solution for me. The new thing is using the cloud.

For sync ("magic folder") functionality I still need to evaluate Syncthing and Sparkleshare.

The only reason Rsync is even in the running, is because there are at least two specialized storage services that do rsync backups and snapshots them automatically on the server side. This server-side snapshotting means that the client cannot delete old backups, something the other cloud backup solutions run the risk of.

Encfs has some problems of metadata leaking in normal mode, but it has keystretching (meaning your password is not directly used but a more difficult one is made from your password).

This post has two sections, the comparison table and after that a section on needs and why I chose what I chose.

Rationale for cloud backup

I've had some really nice personal backup services in place: rsync, rdiff-backup, Camlistore (now called perkeep), time-machine-like contraptions on Btrfs (jury is still out for me if Btrfs is good though, I gotta check what's happened given the error messages. Update: It looks like a hardware fault in the Ext4 system SSD disk, so no fault of Btrfs; it still runs!). Problem is that they have all relied on my own servers, and as life goes by these servers tend to break or be repurposed.

Lists

Comparison table

Preliminary list of software and service for backup,  file synchronization and cloud storage components that work on and with Linux.

  • "Linux/Android"— If it is available for Linux and Android. "Standard" for Linux means it is already in the normal Ubuntu repositories.
  • "Their storage" means that you do not need to configure your own storage server. They, or a third party, supply server storage for free, or for a fee. Ideally combined with "Least authority". If free, capacity is listed.
  • "Magic folder" Files are transparently updated between devices.
  • "Multi sync"— Sync more than two devices/machines to the same backup or set of files.
  • "Libre" means if the source code is libre. I haven't checked client libraries always.
  • "LA—Least authority" means that you're in control of the encryption on the client side, and servers can't break it (hopefully). Comes in handy if servers get hacked. Some refer to this as "end to end encryption", however that is slightly different in definition. "Zero knowledge" is also used as a term.
  • "Post check"—Means you can verify backups by sampling them
  • "Go back"— That you can go back in time, for example with snapshots or versions

There are most likely mistakes below. Some non-Linux services included so you do not need to check them out because I may have missed them. "Yes" and anything that is not a "No" in a cell is good (from my perspective).

Service Linux/
Android
 
Magic
folder/
schedule
Multi
sync
Their
storage
$ Price
1TB/yr
LA/
Key
Stretch
Post
Check
Integration
Frontend/Back
Libre Redundant Go
back
Conflict
res.
Lang/
last
comm.
Extra
Features
DropBox Yes/Yes Yes Yes 2 GB $120 No   Magic folder No No No Branch    
siacoin Yes/No ?   For pay   Yes/?   Nextcloud, FS Yes Yes       Crypto coin

tahoe-lafs

Yes/No Yes   Optional   Yes/?   Magic folder,
web, sftp
Yes m of n Yes Shaky Python m of n redundant
cozy Yes/Yes No   5GB $120 No     Server No No      
nextcloud Yes/Yes Yes Yes 10GB $420 Beta   WebDAV/ Yes     Branch PHP Sprycloud price
owncloud Standard/Yes Yes   Optional       WebDAV/ Yes     Branch PHP  
seafile Standard/Wobbly Yes   No   Yes/?   Magic folder/Stand-alone Yes     Branch C Android app keeps crashing
perkeep
(camlistore)
Yes/Upload     Optional       Stand-alone Yes Replicas Yes Branch Golang Content
adressable
Sparkleshare Standard/No Yes   No   Yes/?   Magic folder Yes No Yes     "Not for backups"

Syncthing

Standard/Yes Yes   No   No   Browser Yes   Yes Branch Golang,days, 3 "Not for backups"
Rclone Standard/No No No Optional       34 backends Yes No backend   No    
Duplicati Yes/No No No Optional   Yes/sha256   26 backends,
incl. Sia, Tahoe
Yes No backend Yes   C# minutes, 5  
Restic Standard/No No/No   Optional   Yes/Scrypt
e.g. n18,r1,p3
Yes B2, GDrive, S3, etc. Yes No Yes   Golang, days verifiable backups
Borg Standard/No No/No No     Yes/PBKDF2 Yes Stand-alone Yes No     Python, days Mountable backups
Bup Standard/No No                        
Back in time                            
IDrive No/Yes                          
Amazon Drive Yes/Yes     5GB $60 No                
Backblaze B2         $100                 Only pay what you use
Jottacloud Yes/Yes No   5GB $84                  
Rsync.net Yes/No       $480                 ZFS backend
Hetzner SB         $96     Borg, rsync, WebDAV            
upspin Yes/No                         Early stages proj.
safe network - -   -   -   - - - - -   Beta crypto coin
Mega Sync/Yes                          

OneDrive

No                          
storj (N/A) - -   For pay   Yes   - Yes Yes       Alpha Crypto coin
Google Drive Yes/Yes No   15GB $100 No   Browser No No No     Editors
Tarsnap Yes/No No   For pay $3000+ Yes         Yes     Deduplicates
Rsync+
Rsync.net/Hetzner
Standard/Yes No/No Yes No $480/$96 No No No Yes No Yes None    
Rsync+
Rsync.net/Hetzner
+EncFS/gocryptfs
Standard/Yes No/No Yes No $480/$96 Yes/PBKDF2 (Scrypt for GocryptFS) No No Yes No Yes None    

Other offerings (or lack of such)

  • For syncany, the team has gone missing… Maybe they have been bought to work on some well-funded solution?
  • Filecoin has been funded to the tune of $250 million dollars. I hope to see something produced from them soon!

What I'm looking for

I would like to have full redundancy, all the way from the device. I had this before with two independent systems: synology diskstation and rsync. Fully independent, all the way from the data. I did try to use obnam at one time, but it did not work for me in a reliable way.

Magical folder

It's probably not a good idea to have two different programs share or nest magical folders. I guess the update algorithms could start fighting. It therefore seems like a better idea to use one magical folder service, such as dropbox, and then apply one or several backup services on that magical folder using a completely different backup system. Or even different systems.

Versioning

Your data could be accidentally overwritten by user processes. In that case you want to be able to go back.

Quick restore

You want to be able to be up and running quickly again, both on user devices and get a new backup server up and running again.

Redundancy in backups

This means using different systems already at the client, and also monitor what is going on.

Somebody else's storage

I'd like to try to use remote storage services. One way of doing that more securely is to have things encrypted client side, something called "Zero knowledge" on e.g. Wikipedia's comparison page. I prefer the term "Least authority" which is the "LA" in Tahoe-LAFS.

Least authority

One way of establishing this separately is to use EncFS and backup the encypted version. An interesting way is to keep the encrypted folder with read/write rights to it so it can be used with a backup client with low privileges. A downside with EncFS is that you more than double your storage need on the client computer, unless you use the reverse mount option, which actually is pretty handy.

One guy has tested how well deltas work with EncFS, and the further back in the file the change is, the better it works. A project called gocryptfs seeks to perform faster than EncryptFS paranoia mode.

Some quotes from Tahoe-LAFS which are a bit worrying

It seems to be a really solid system, but as with all complex systems, the behaviour is not always what you'd like. Some quotes from their pages:

"Just always remember that once you create a directory, you need to save the directory's URI, or you won't be able to find it again."

"This means that every so often, users need to renew their leases, or risk having their data deleted." — If you do not periodically renew, things may disappear. If you perish, so does your data. Maybe you can set the lease to infinity?

"If there is more than one simultaneous attempt to change a mutable file or directory […]. This might, in rare cases, cause the file or directory contents to be accidentally deleted."

Deduplication and versioning file systems

It seems like a good idea to use deduplicating file system to create snapshots which on a deduplicating file system could just be folders. Two file system that can do snapshots and that are regarded as stable are:

  • ZFS— Has license problems on Linux, but it is possible to use. Needs 8GB of RAM or thereabouts for efficient deduplication.
  • NILFS—Regarded as stable on Linux, according to Wikipedia: "In this list, this one is [the] only stable and included in mainline kernel.". According to NILFS FAQ it does not support SELinux that well: "At present, NILFS does not fully support SELinux since extended attributes are not implemented yet

Another way of doing snapshots seems to be on a slightly higher level:

RedHat has decided not to support btrfs in the future and are working on something called Stratis, which is similar to LVM thin pools, built ontop of the XFS file system.

We may also get an alternative to btrfs on Linux with bcacheFS.

Cloud backups — Rsync, borg, restic or  duplicati?

For cloud backup purposes it has narrowed down to four choices of which I may deploy more than one:

Rsync

rsync — Rsync can  work with the rsync.com site . Overall a simple and time-trusted setup. They use ZFS on their side and they do snapshots, and you can decide when those snapshots are happening. It can be a bit expensive though. The setup with rsync.com would be very similar to the setup I already have for local backups, with rsync to btrfs snapshots. However push instead of pull. It should also work fine with my phone with Termux or Syncopoli. No scheduling built in. Hetzner box is a cheaper alternative that does the same as rsync.com, although probably less reiably, which they are open about.

+ simple, tried and trusted. Available by default on all Linux distributions.

+ with rsync.com, the client cannot overwrite old backups. This is a truly big point!

+ There are any number of rsync clients for Android, such as Syncopoli.

- no scheduling

- you need to learn rsync syntax (I already know it though)

- No encryption. Although it may be a benefit if you use a good complement. Question is, what is? There is EncFS and a new competitor gocryptfs.

Borg

I had given up on Borg, since it needs a Borg back-end until I found Hetzner storage boxes. These work out of the box (pun intended) with Borg. However do I want to learn yet another configuration language?

Restic

Restic — Restic seems to get the nod from some very intelligent programmers, check for example this review by Filippo Valsorda. However it has no scheduling or process management of backups. That is kind of important, also in the respect of recovering from errors. But maybe the other alternatives have not put too much work into that anyway?

The parameters for scrypt in restic are something like,"N":262144,"r":1,"p":3. This is on the low side, consuming only about 32 MB RAM I believe. Restic is set up to read whatever values of these parameters so if you feel adventurous you can change the key files in the repo to higher values and make sure you know what the answer is of course.

+ In Ubuntu repo

+ Liked by some smart people

- No scheduling

- Need to learn the language

Duplicati

duplicati — This also comes recommended, however it is the only one of these four that is not in the Ubuntu repositories, and it has slightly less glowing reviews than restic. Currently one version, version 2.0, is in beta and the old version, 1.3.4, is not supported anymore. That is in itself weird.

+ Great user interface

+ Includes scheduling. The only one that does so of the shortlisted candidates

- Not in Ubuntu repos

- Keystretching is there but not as well-implemented. See next section for more info.

How good is the client side encryption & keystretching in EncFS, GocryptFS, Borg, Duplicate and Restic?

There are at least three components here:

1) The encryption used. They all use AES but there might be subtle differences.

2) Overall design and leaking of metadata

3) Keystretching. Passwords I believe can often be the weakest link, and some good keystretching could mitigate that.

Encryption

They all use AES although Borg is thinking of Chacha20, not sure if they have implemented it?

Keystretching

Of the techniques used by the components, the best one is Scrypt as long as it uses enough memory, followed by PBKDF2, and then after that at the last place applying sha256 over and over again.

Scrypt is used by Restic and GocryptFS.

Duplicati uses sha256 8192 times keystretching if you use the AES option. A sha256 miner could do short work of that, I guess, evaluating password candidates in parallel. There is however also a gpg encryption option. Not sure why they use sha256 8192 times, seems like a subpar choice. It can use OpenPGP too, and in GPG libraries there is a GCRY_KDF_SCRYPT option, not sure how much it is used though: https://www.gnupg.org/documentation/manuals/gcrypt/Key-Derivation.html. I can see no mention on the web of using scrypt for generating keys in GPG, so I'm not sure it can even be used in practice.

Borg uses PBKDF2.

EncFS uses PBKDF2 for whatever number of iterations that take 0.5 seconds, or 3 seconds in paranoia mode.

Overall design and leaking of metadata

Taylor Hornby has made audits of both EncFS and GocryptFS, the latter here: https://defuse.ca/audits/gocryptfs.htm

EncFS has some problems with leaking metadata that are widely known. But leaking metadata about files may not be all that bad for my use case?

Restic has actually been reviewed (sort of) by a cryptography specialist (Filippo Valsorda) and he gave a thumbs up, if not an all clear. It also has keystretching which I see as a requirement more or less! It uses scrypt for keystretching which I think is a good choice as long as you're not inside the parameters of a scrypt miner. It encrypts with AES-256-CTR-Poly1305-AES

And the winner is

Rsync

Rsync only wins because at least two storage providers have provided snapshots for it.