Lightweight file synchronization for ownCloud and WebDAV
By Yves, Sunday 2 August 2015, 18:45
I recently began using ownCloud for file synchronization. All in all, although there are some minor hindrances, the experience is really satisfying. So much so, that I moved all my “personal cloud” data to ownCloud, from the previous NFS share. However, although the regular ownCloud client is just fine where available, it is not available everywhere. In particular:
- I carry around on a USB stick a lightweight Linux desktop based on TinyCore Linux, for which the client is not available.
- I also have an old laptop that is stuck with an obsolete operating system because the video chipset is buggy, and no newer OS will support it (even though the “same” chipset reference in another laptop works just fine…).
For these situations, I tried using DavFS, but this solution was much too slow; it is a great fall-back, though. Next I tried the Java program WebDAV-Sync, but although the initial download went fine, sync did not work all that well: the whole share was fully downloaded again each time!
So I created my own synchronization tool, the only dependencies of which are curl and bash, and optionally ssh. These dependencies are available everywhere, including Windows and some embedded systems.
With an utter lack of originality, I named my tool miniOCsync, since it is a lightweight bash script for synchronizing a local directory with ownCloud. This tool can be used with any WebDAV server, though. However, if the server is ownCloud, and SSH access to the server is available, then miniOCsync can connect to the server using ssh for the contents-scan part of the synchronization process, which is a huge time-saver! As an added bonus, using ssh makes computing “md5sums” on the server possible, which makes the file comparisons more reliable; without ssh, only file sizes can be compared. I might add date- and time-comparisons in the future but I’m not sure that these are reliably stored by both ownCloud and its official client, and DavFS.
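To make the size-versus-md5sum distinction concrete, here is a minimal sketch of a listing that records both for every file under a directory. It is not an extract from miniOCsync.sh: the function name list_files and the tab-separated output format are assumptions made for the illustration, and it presumes that find -print0, sort -z and md5sum are available (GNU or BusyBox tools).

```bash
#!/usr/bin/env bash
# Illustrative only: list_files is a hypothetical helper, not part of miniOCsync.sh.
# Prints one line per file: relative path, size in bytes, md5 checksum (tab-separated).
list_files() {
  local root=$1
  (
    cd -- "$root" || exit 1
    find . -type f -print0 | sort -z |
      while IFS= read -r -d '' f; do
        size=$(wc -c < "$f")
        sum=$(md5sum < "$f" | cut -d' ' -f1)
        # $((size)) strips the padding that some wc implementations add.
        printf '%s\t%s\t%s\n' "${f#./}" "$((size))" "$sum"
      done
  )
}

# Small demonstration on a throw-away directory.
demo=$(mktemp -d)
printf 'hello\n' > "$demo/a.txt"
list_files "$demo"
```

Two files of identical size but different contents produce the same second column and a different third column, which is why a checksum computed on the server through ssh makes the comparison more reliable than sizes alone.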
The tool is very simple. miniOCsync itself is just the miniOCsync.sh script. This script manages some working data in a directory named ~/.cache/miniOCsync, and synchronizes data with the local ~/ownCloud directory. A configuration file must be created in ~/.config/miniOCsync.conf; here is an example:
'~$' '\.part$' '^(.*/)?\.owncloudsync.log$' '^(.*/)?Thumbs\.db$' '^(.*/)?desktop\.ini$'
'^(.*/)?\.Trash' '^(.*/)?\.~lock\.' '^(.*/)?\..*\.swp$' '^(.*/)?~.*\.tmp$'
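The quoted strings above look like extended regular expressions meant to exclude temporary and housekeeping files. How miniOCsync.sh applies them is not shown here, so the following is only a sketch under that assumption: the array name exclude_patterns and the function is_excluded are hypothetical, and the patterns are matched against paths relative to the synchronized directory.

```bash
#!/usr/bin/env bash
# Illustrative only: exclude_patterns and is_excluded are hypothetical names.
# The patterns are copied from the configuration example.
exclude_patterns=(
  '~$' '\.part$' '^(.*/)?\.owncloudsync.log$' '^(.*/)?Thumbs\.db$'
  '^(.*/)?desktop\.ini$' '^(.*/)?\.Trash' '^(.*/)?\.~lock\.'
  '^(.*/)?\..*\.swp$' '^(.*/)?~.*\.tmp$'
)

# Returns 0 when the relative path matches at least one exclusion pattern.
is_excluded() {
  local path=$1 pattern
  for pattern in "${exclude_patterns[@]}"; do
    # The right-hand side is left unquoted so bash treats it as a regex.
    if [[ $path =~ $pattern ]]; then
      return 0
    fi
  done
  return 1
}

for f in 'notes/draft.txt~' 'video.mkv.part' 'docs/Thumbs.db' 'docs/report.txt'; do
  if is_excluded "$f"; then
    echo "skip $f"
  else
    echo "sync $f"
  fi
done
```

The optional `^(.*/)?` prefix lets a pattern match a file name at the top level as well as in any subdirectory.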
The command “miniOCsync.sh -h | less” gives a lot more information, including how to set up ssh access in a secure manner. For the first run of the tool, an initialization mode must be chosen:
- Parameters “-i l” initialize the reference meta-data using local data. As a consequence, the local data will appear in sync with the reference, while any different data in the “cloud” (WebDAV server) will appear different from the reference, hence newer. In short, those parameters give priority to the server’s data for the first run.
- Conversely, parameters “-i c” initialize the reference meta-data using cloud data. As a consequence, the cloud data will appear in sync with the reference, while any different data in the local ~/ownCloud directory will appear different from the reference, hence newer. In short, those parameters give priority to the local data for the first run.
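The effect of the two modes can be pictured with a small sketch. It assumes, purely for illustration, that each side is summarized by a listing file and that initialization copies one of those listings to the reference; the function name init_reference and the file names are hypothetical and say nothing about how miniOCsync.sh stores its meta-data.

```bash
#!/usr/bin/env bash
# Illustrative only: init_reference and the listing files are hypothetical.
# Mode "l" seeds the reference from the local listing, mode "c" from the cloud one.
init_reference() {
  local mode=$1 local_list=$2 cloud_list=$3 reference=$4
  case $mode in
    l) cp -- "$local_list" "$reference" ;;
    c) cp -- "$cloud_list" "$reference" ;;
    *) echo "unknown initialization mode: $mode" >&2; return 2 ;;
  esac
}

# Demonstration: the same file has a different size on each side.
work=$(mktemp -d)
printf 'a.txt\t6\n' > "$work/local.list"
printf 'a.txt\t9\n' > "$work/cloud.list"

init_reference l "$work/local.list" "$work/cloud.list" "$work/reference.list"
if cmp -s "$work/local.list" "$work/reference.list"; then
  echo "local side matches the reference"
fi
if ! cmp -s "$work/cloud.list" "$work/reference.list"; then
  echo "cloud side differs from the reference, so it is treated as newer"
fi
```

With mode l, only the cloud side looks changed, so the first synchronization brings the server’s version down; mode c produces the mirror situation.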
The script’s algorithm is really simple:
- If the local data is different from the reference, but the remote data is not, then the local data is uploaded to the server.
- If the remote data is different from the reference, but the local data is not, then the remote data is downloaded from the server.
- Whenever both the local and the remote data have changed since the last synchronization, the local data is moved to ~/ownCloud/lost+found/, and then the remote data is downloaded in its place.
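The three rules above amount to a three-way comparison against the reference. Here is a minimal sketch of that decision for a single file; the function name decide_action and the idea of passing one “signature” string per side (a size, or a size plus checksum) are assumptions made for the example, not the script’s actual code.

```bash
#!/usr/bin/env bash
# Illustrative only: decide_action is a hypothetical helper.
# Arguments: reference, local and remote signatures of one file.
decide_action() {
  local ref=$1 loc=$2 rem=$3
  local local_changed=0 remote_changed=0
  if [[ $loc != "$ref" ]]; then local_changed=1; fi
  if [[ $rem != "$ref" ]]; then remote_changed=1; fi
  case "$local_changed$remote_changed" in
    00) echo none ;;      # nothing changed since the last synchronization
    10) echo upload ;;    # only the local side changed
    01) echo download ;;  # only the remote side changed
    11) echo conflict ;;  # both changed: keep local copy in lost+found, then download
  esac
}

decide_action '120:aaa' '120:aaa' '120:aaa'
decide_action '120:aaa' '130:bbb' '120:aaa'
decide_action '120:aaa' '120:aaa' '140:ccc'
decide_action '120:aaa' '130:bbb' '140:ccc'
```

The four calls print none, upload, download and conflict, in that order.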
The “-n” parameter instructs miniOCsync to run the synchronization algorithm, and log its actions as usual, but without actually changing anything on disk, either locally or on the server. This is a great way to preview the synchronization’s outcome without taking any risk.
miniOCsync can be run on a one-shot basis, or it can be scheduled with a cron-like tool. The script exits with no side effects if it detects that another instance by the same user is already running, so that you can for example schedule a run every 5 minutes even if the synchronization sometimes takes longer than that.
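One common way to get this “exit if already running” behaviour in a bash script is a lock directory, since mkdir either creates the directory or fails, atomically. The sketch below shows that technique; it is an assumption about how such a check could be written, not a description of what miniOCsync.sh does, and the function names, the lock location and the crontab path are all hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: acquire_lock and release_lock are hypothetical helpers.
# A matching crontab entry could look like this (the script path is an assumption):
#   */5 * * * * $HOME/bin/miniOCsync.sh

# Succeeds only for the first caller: mkdir fails if the directory already exists.
acquire_lock() {
  local lock_dir=$1
  mkdir -p -- "$(dirname -- "$lock_dir")" || return 1
  mkdir -- "$lock_dir" 2>/dev/null
}

release_lock() {
  rmdir -- "$1"
}

# Demonstration with a throw-away lock location.
lock="$(mktemp -d)/lock"
if acquire_lock "$lock"; then
  trap 'release_lock "$lock"' EXIT   # remove the lock when the script ends
  echo "lock acquired, running the synchronization"
else
  echo "another instance is already running, exiting"
  exit 0
fi
```

A lock of this kind survives if the process is killed before the trap runs, so a real script would also need a way to detect and clear a stale lock.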
Attached is the script file, which may be used and adapted under the conditions of the GPLv3 license.
- 2015-08-06: It is now possible to initialize the synchronization when one side is empty (I did not need this use case up to now). Performance has been improved, and a few minor enhancements have been made.
- 2015-08-07: An important bug related to the management of the reference meta-data has been fixed, and curl bug #1063 has been circumvented. The initialization step should be run again using the new version of the program, in order to fix the reference file.