The Living Thing / Notebooks : Synchronising files and backing up your computer

Syncing/sharing

Pure network drives just aren’t as awesome as working locally, and synchronising changes globally. Realising this is why the Dropbox founders are now rich. Well done them. Dependence on single remote servers for every trifling step is stupid.

Peer to peer is more robust. (Taking it still further, how about everything be sneakernets?)

Anyway, file synchronising is handy, and tricky to do, so the solutions which do it easiest are also usually suboptimal. e.g. I have been using Dropbox, but their technical and legal shortcomings are awful.

Some alternatives follow:

Dropbox for the skeptical

If you must use Dropbox, you can at least run it in a container, such as Docker, so they can’t spy on your stuff. Probably. At least not on the stuff you haven’t explicitly put in Dropbox, which is presumably already enough stuff. This is not painful, but horridly nerdy, and still encourages unsafe Dropbox-trusting amongst your friends. At the end of it, you have made the tool so inconvenient that you may as well have been using Owncloud.

docker pull janeczku/dropbox
docker run -d --restart=always --name=dropbox \
  -v /path/to/localfolder:/dbox/Dropbox \
  janeczku/dropbox

Keybase, not quite a file sync

An in-principle secure alternative is keybase, although it’s not quite syncing, it’s a kind of syncing-rebooted-thing, which facilitates secure-ish peer sharing something something.

Owncloud

Owncloud is dubiously secure; they have security advisories all the time. But even without that silliness, they don’t store files encrypted, so your server host can see what you are doing. OTOH, it’s easy to run on your own server, so it is useful for sharing something public, such as open research, for only the cost of hosting, which is low. Additionally, Australian academics get a free 100GB from AARNet, so we may as well.

However, there are various quirks to survive.

Command-line usage is not obvious. First, you can access it as a WebDAV share, which is unwieldy but probably works. However it’s also probably slow. We really want sync here.
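For the WebDAV route, here is a rough sketch of uploading a single file with curl. It assumes the server exposes the classic `remote.php/webdav` endpoint; the server name `cloud.example.org`, the user `alice` and the function names are all made up for illustration.

```shell
#!/bin/sh
# Build the WebDAV URL for a path on an ownCloud server.
# Assumption: the server uses the classic remote.php/webdav endpoint.
oc_webdav_url() {
    # $1 = server base URL (hypothetical), $2 = remote path
    # Strip a trailing slash from the base and a leading slash from the path.
    printf '%s/remote.php/webdav/%s\n' "${1%/}" "${2#/}"
}

# Upload one file with an HTTP PUT; curl prompts for the password.
oc_put() {
    # $1 = local file, $2 = server base URL, $3 = user name, $4 = remote path
    curl --fail -u "$3" -T "$1" "$(oc_webdav_url "$2" "$4")"
}

# Example (not run here):
#   oc_put notes.txt https://cloud.example.org alice docs/notes.txt
```

This is one HTTP round trip per file, which is why it is fine for the odd upload but no substitute for real sync.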

The actual owncloud CLI documentation is deeply hidden. Tony Maro gives a walk-through.

git-annex

git-annex I have not yet tried, but it supports explicit and customisable folder-tree synchronisation, merging, and sneakernets, and as such I am well disposed toward it. You can choose whether to really sync files or not, and have various versions. It doesn’t support iOS. Windows support is experimental. Granularity is per-file. It has a weird symlink-based file access protocol which might be inconvenient for some uses. (I’m imagining this is trouble for Microsoft Word or whatever.)

The documentation is very nerdy and not very clear, but I think my needs are nerdy and unclear by modern standards.
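As far as I can tell from that documentation, a sneakernet round trip looks roughly like the sketch below. The repository label `laptop`, the remote name `usbdrive`, its path and the file `big.iso` are all hypothetical, and the `run` helper only prints the commands by default, so this is an outline rather than a tested recipe.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0,
# so the sketch can be read (and executed) without git-annex installed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

annex_sneakernet() {
    run git annex init laptop                       # turn an existing git repo into an annex
    run git annex add big.iso                       # track the file; git stores only a pointer
    run git commit -m "add big.iso"
    run git remote add usbdrive /media/usb/annex    # hypothetical clone of this repo on a USB key
    run git annex copy --to usbdrive big.iso        # send the file content to the USB clone
    run git annex drop big.iso                      # free local space; refuses if no other copy is known
    run git annex get big.iso                       # fetch the content back when needed
}

annex_sneakernet
```

Setting `DRY_RUN=0` inside a real git repository, with the USB clone in place, would run the commands for real.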

syncthing

syncthing has an elegant BitTorrent-like design, although it is too obtrusively technical. It is reminiscent of git-annex but doesn’t have a combinatorial explosion of options, just one single sync protocol. Granularity is per-folder. Like git-annex, it doesn’t support iOS. Unlike git-annex, it doesn’t support archiving stuff to USB keys or semi-offline stores.

Stated design criteria:

  • Private. None of your data is ever stored anywhere else than on your computers. There is no central server that might be compromised, legally or illegally.
  • Encrypted. All communication is secured using TLS. The encryption used includes perfect forward secrecy to prevent any eavesdropper from ever gaining access to your data.
  • Authenticated. Every node is identified by a strong cryptographic certificate. Only nodes you have explicitly allowed can connect to your cluster.

Mega

Mega is easy to run. Public source, but not open source. (Long story.)

Anyway it’s relatively easy to use because it works in the browser, so it won’t terrify your non-geek friends. Not too much. Much cheaper than dropbox. Host-blind encryption business from New Zealand. The UI is occasionally freaky but it’s reasonably functional, especially for its bargain-basement price. An OK tradeoff of respectability and affordability, like living in Bulgaria.

Rclone

Rclone is a command line program to sync files and directories to and from Google Drive, Amazon S3, Memset Memstore, dropbox etc.

Features:

  • MD5/SHA1 hashes checked at all times for file integrity
  • Timestamps preserved on files
  • Partial syncs supported on a whole file basis
  • Copy mode to just copy new/changed files
  • Sync (one way) mode to make a directory identical
  • Check mode to check for file hash equality
  • Can sync to and from network, eg two different cloud accounts
  • Optional encryption (Crypt)
  • Optional FUSE mount (rclone mount)
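The copy, sync and check modes in that list map onto rclone subcommands of the same names. A minimal sketch, assuming a remote called `mydrive` has already been set up with `rclone config` and that the local folder is `~/Photos` (both names are hypothetical, as are the wrapper functions):

```shell
#!/bin/sh
# The rclone binary; override it (e.g. RCLONE=echo) to preview the
# commands without running them.
RCLONE=${RCLONE:-rclone}

# Hypothetical source folder and pre-configured remote.
SRC="$HOME/Photos"
DEST="mydrive:Photos"

# Copy mode: upload new/changed files, never delete anything on the remote.
photos_copy()  { "$RCLONE" copy "$SRC" "$DEST"; }

# Sync mode (one way): make DEST identical to SRC, deleting extras on DEST.
# --dry-run only reports what would change; drop it once the output looks right.
photos_sync()  { "$RCLONE" sync --dry-run "$SRC" "$DEST"; }

# Check mode: compare sizes and hashes of the two sides.
photos_check() { "$RCLONE" check "$SRC" "$DEST"; }
```

Since sync deletes things on the destination, previewing with `--dry-run` first seems prudent.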

ad hoc

rsync, aws s3 sync.
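For the record, a one-way mirror with rsync might look like the following sketch. The function name is mine, and the bucket name in the S3 comment is a placeholder.

```shell
#!/bin/sh
# One-way mirror of a local directory with rsync.
# -a preserves permissions and timestamps; --delete removes files from the
# destination that no longer exist in the source. The trailing slash on the
# source means "the contents of this directory", not the directory itself.
mirror_dir() {
    # $1 = source directory, $2 = destination directory
    rsync -a --delete "${1%/}/" "${2%/}/"
}

# The S3 equivalent, assuming a configured AWS CLI and a hypothetical bucket:
#   aws s3 sync "$HOME/Documents" s3://my-backup-bucket/Documents --delete
```

Neither of these keeps history or encrypts anything, which is what makes them ad hoc.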

Others

Online backup

Listing encrypted backups only, because I am not crazy.

Also, I’m only listing open-source options or ones not in a jurisdiction with especially poor privacy, such as China, Russia or the USA.

Windows, OSX, Linux, duplicati:

Duplicati works with standard protocols like FTP, SSH, WebDAV as well as popular services like Microsoft OneDrive, Amazon Cloud Drive / S3, Google Drive, box.com, Mega, hubiC and many others.

Features:

  • Backup files and folders with strong AES-256 encryption. Save space with incremental backups and data deduplication.
  • Run backups on any machine through the web-based interface or via command line interface.
  • Duplicati has a built-in scheduler and auto-updater.

OSX, linux, more bare-bones, duplicity:

Duplicity backs directories by producing encrypted tar-format volumes and uploading them to a remote or local file server. Because duplicity uses librsync, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Because duplicity uses GnuPG to encrypt and/or sign these archives, they will be safe from spying and/or modification by the server.
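A sketch of the full/incremental cycle that description implies, using duplicity's `full`, `incremental` and `restore` actions. The source folder and the `file://` target on a mounted disk are hypothetical paths, and the wrapper function names are mine.

```shell
#!/bin/sh
# The duplicity binary; override it (e.g. DUPLICITY=echo) to preview commands.
DUPLICITY=${DUPLICITY:-duplicity}

# Hypothetical paths: back up ~/Documents to a mounted external disk.
SRC="$HOME/Documents"
TARGET="file:///mnt/backup/documents"

# duplicity reads the GnuPG passphrase from the PASSPHRASE environment
# variable, or prompts for it if the variable is unset.

# Complete snapshot of SRC.
backup_full()        { "$DUPLICITY" full "$SRC" "$TARGET"; }

# Only the changes since the previous backup; needs an earlier full backup.
backup_incremental() { "$DUPLICITY" incremental "$SRC" "$TARGET"; }

# Restore the latest state into $1, a directory that does not exist yet.
backup_restore()     { "$DUPLICITY" restore "$TARGET" "$1"; }
```

Swapping the `file://` URL for an SSH or cloud URL is where the "encrypted on someone else's server" part comes in.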

Linux, OSX, tarsnap. Comes with a server for $0.25/GB/month:

Tarsnap is a secure, efficient online backup service:

  • Encryption: your data can only be accessed with your personal keys. We can’t access your data even if we wanted to!
  • Source code: the client code is available. You don’t need to trust us; you can check the encryption yourself!
  • Deduplication: only the unique data between your current files and encrypted archives is uploaded. This reduces the bandwidth and storage required, saving you money!

Tarsnap runs on UNIX-like operating systems (BSD, Linux, MacOS X, Cygwin, etc).
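Day-to-day use is tar-flavoured. A minimal sketch, assuming tarsnap is installed and a key has already been registered for this machine; the wrapper names and the `docs` prefix are made up.

```shell
#!/bin/sh
# The tarsnap binary; override it (e.g. TARSNAP=echo) to preview commands.
TARSNAP=${TARSNAP:-tarsnap}

# Archive names must be unique, so stamp them with today's date.
archive_name() {
    # $1 = prefix (hypothetical, e.g. "docs")
    printf '%s-%s\n' "$1" "$(date +%Y-%m-%d)"
}

# Create a new archive from a directory. $1 = prefix, $2 = directory.
snap_create()  { "$TARSNAP" -c -f "$(archive_name "$1")" "$2"; }

# List the archives stored under this machine's key.
snap_list()    { "$TARSNAP" --list-archives; }

# Extract a named archive into the current directory. $1 = archive name.
snap_restore() { "$TARSNAP" -x -f "$1"; }
```

Because of the deduplication, a dated archive per day costs little more than storing the changes.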

syncing dotfiles

You might try mackup to sync settings for linux and osx machines alike to some folder somewhere. It’s a database of which actual settings of various apps are actually syncable. On second thoughts, this is a fragile approach. And it freaks out if you have non-ascii characters in your filenames. Do something different.

Revised recommendation:

Use a bare git repo:

git init --bare $HOME/.dotfiles
alias dotfiles='git --git-dir=$HOME/.dotfiles/ --work-tree=$HOME'
dotfiles config --local status.showUntrackedFiles no
echo "alias dotfiles='git --git-dir=$HOME/.dotfiles/ --work-tree=$HOME'" \
  >> $HOME/.bashrc

Yes, much less freaky.
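Day-to-day use then looks like ordinary git. Here is a sketch that uses a shell function instead of the alias, because aliases are not expanded inside scripts by default. The helper names `track_dotfile` and `restore_dotfiles` are hypothetical, and the restore step assumes you have pushed the bare repo to some remote URL.

```shell
#!/bin/sh
# Same command as the alias, as a function so it also works in scripts.
dotfiles() {
    git --git-dir="$HOME/.dotfiles" --work-tree="$HOME" "$@"
}

# Start tracking one file and commit it.
track_dotfile() {
    # $1 = path relative to $HOME, e.g. .vimrc
    dotfiles add "$HOME/$1" &&
    dotfiles commit -m "track $1"
}

# On a new machine: clone the bare repo and check the files out into $HOME.
restore_dotfiles() {
    # $1 = URL of the remote holding your dotfiles (assumed to exist)
    git clone --bare "$1" "$HOME/.dotfiles" &&
    dotfiles checkout &&   # fails if it would overwrite existing files
    dotfiles config --local status.showUntrackedFiles no
}
```

For example, `track_dotfile .vimrc` followed by `dotfiles push` once a remote is configured.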

Actually, do you know what is even easier? Just make a git repo in your home directory. No more overthinking.

cd $HOME
git init
git config --local status.showUntrackedFiles no

Now go forth and steal other people’s dotfile tricks.