Commit Graph

518 Commits

Author SHA1 Message Date
Michał Matczuk
f396550934 backend/local: Avoid polluting page cache when uploading local files to remote backends
This patch makes rclone keep linux page cache usage under control when
uploading local files to remote backends. When opening a file it issues
FADV_SEQUENTIAL to configure read ahead strategy. While reading
the file it issues FADV_DONTNEED every 128kB to free page cache from
already consumed pages.

```
fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768
read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768
read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768
read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768
fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768
read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768
read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768
read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768
fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768
read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768
read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768
read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768
fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0
```

Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat).

This patch does not have a performance impact. Please find below results
of an experiment comparing local copy of 1GB file with and without this
patch.

With the patch:

```
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
(mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 13.19
        System time (seconds): 1.12
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2212
        Voluntary context switches: 5755
        Involuntary context switches: 9782
        Swaps: 0
        File system inputs: 4155264
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
```

Without the patch:

```
(master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 14.46
        System time (seconds): 0.81
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2228
        Voluntary context switches: 7190
        Involuntary context switches: 1980
        Swaps: 0
        File system inputs: 2097152
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(master)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 262144    | 100.000 |
+-----------+----------------+------------+-----------+---------+
```
2019-08-08 23:41:52 +01:00
Nick Craig-Wood
9e81fc343e swift: fix upload when using no_chunk to return the correct size
When using the VFS with swift and --swift-no-chunk, PutStream was
returning objects with size -1 which was causing corrupted transfer
messages.

This was fixed by counting the bytes transferred in a streamed file
and updating the metadata with that.
2019-08-08 12:41:46 +01:00
Nick Craig-Wood
e502be475a azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files
In 0386d22cc9 we introduced a test for 0 length files read the
way mount does.

This test failed on these backends which we fix up here.
2019-08-06 15:18:08 +01:00
Nick Craig-Wood
5065c422b4 lib/random: unify random string generation into random.String
This was factored from fstest as we were including the testing
enviroment into the main binary because of it.

This was causing opening the browser to fail because of 8243ff8bc8.
2019-08-06 12:44:08 +01:00
Nick Craig-Wood
0be14120e4 swift: use FixRangeOption to fix 0 length files via the VFS 2019-08-03 18:25:44 +01:00
Nick Craig-Wood
629b7eacd8 b2: fix integration tests after accounting changes
In 53a1a0e3ef we started returning non nil from NewObject when
an object isn't found.  This breaks the integration tests and the API
expected of a backend.

This fixes it.
2019-08-03 13:30:31 +01:00
yparitcher
d3149acc32 b2: link sharing 2019-08-03 13:30:31 +01:00
Nick Craig-Wood
5be968c0ca drive: update API for teamdrive use - fixes #3348 2019-08-02 16:06:23 +01:00
Nick Craig-Wood
57d5de6fba build: fix up package paths after repo move
git grep -l github.com/ncw/rclone | xargs -d'\n' perl -i~ -lpe 's|github.com/ncw/rclone|github.com/rclone/rclone|g'
goimports -w `find . -name \*.go`
2019-07-28 18:47:38 +01:00
Aleksandar Jankovic
53a1a0e3ef accounting: add reference to completed transfers
Add core/transferred call that lists completed transfers and their
status.
2019-07-28 14:48:19 +01:00
Aleksandar Jankovic
8243ff8bc8 accounting: isolate stats to groups
Introduce stats groups that will isolate accounting for logically
different transferring operations. That way multiple accounting
operations can be done in parallel without interfering with each other
stats.

Using groups is optional. There is dedicated global stats that will be
used by default if no group is specified. This is operating mode for CLI
usage which is just fire and forget operation.

For running rclone as rc http server each request will create it's own
group. Also there is an option to specify your own group.
2019-07-28 14:48:19 +01:00
yparitcher
ccc416e62b b2: Fix link sharing #3314 2019-07-28 11:47:31 +01:00
jaKa
a35aa1360e Support setting modification times on Koofr backend.
Configuration time option to disable the above for if using Dropbox (does not
allow setting mtime on copy) or Amazon Drive (neither on upload nor on copy).
2019-07-24 21:11:58 +01:00
Nick Craig-Wood
493dfb68fd opendrive: refactor to use existing lib/rest facilities for uploads
This also checks the return of the call to make sure the number of
bytes written was as expected.
2019-07-24 20:34:29 +01:00
Nick Craig-Wood
1f1ab179a6 webdav: refresh token when it expires with --webdav-bearer-token-command
Fixes #2380
2019-07-22 16:01:55 +01:00
Nick Craig-Wood
c642531a1e webdav: add --webdav-bearer-token-command - fixes #2380
This can be used with oidc-agent to get a bearer token
2019-07-22 15:59:54 +01:00
buengese
def790986c fichier: make FolderID int and adjust related code - fixes #3359 2019-07-20 02:49:08 +02:00
Yi FU
0a1169e659 ssh: opt-in support for diffie-hellman-group-exchange-sha256 diffie-hellman-group-exchange-sha1 - fixes #1810 2019-07-13 12:21:56 +02:00
Nick Craig-Wood
5433021e8b drive: fix server side copy of big files
Before this change rclone was sending a MimeType in the requests for
server side Move and Copy.

The conjecture is that if you attempt to set the MimeType to something
different in a Copy then Google Drive has to do an actual copy of the
file data.  This takes a very long time (since it is large) and fails
after a 90s timeout.

After the change we no longer set the MimeType in Move or Copy and the
copies happen instantly and correctly.

Many thanks to @darthShadow for discovering that this was causing the
problem.

Fixes #3070
Fixes #3033
Fixes #3300
Fixes #3155
2019-07-05 10:49:19 +01:00
Nick Craig-Wood
c9f77719e4 b2: enable server side copy to copy between buckets - fixes #3303 2019-07-05 10:07:05 +01:00
Nick Craig-Wood
d7016866e0 googlephotos: fix creation of duplicated albums
Also make sure we don't list the albums twice
2019-07-04 13:45:52 +01:00
yparitcher
d72e4105fb b2: Fix link sharing #3314 2019-07-04 11:53:59 +01:00
yparitcher
3f5767b94e b2: Implement link sharing #2178 2019-07-03 14:10:25 +01:00
Nick Craig-Wood
a1cfe61ffd googlephotos: Backend for accessing Google Photos #369 2019-07-02 15:26:55 +01:00
Matti Niemenmaa
a6dca4c13f s3: Add INTELLIGENT_TIERING storage class
For Intelligent-Tiering:
https://aws.amazon.com/s3/storage-classes/#Unknown_or_changing_access
2019-07-01 18:17:48 +01:00
Laura
dde4dd0198 fichier: 1fichier support - fixes #2908
This was started by Fionera, finished off by Laura with fixes and more
docs from Nick.

Co-authored-by: Fionera <fionera@fionera.de>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-06-30 18:35:01 +01:00
Fionera
6cd7c3b774 lib/rest: Calculate correct Content-Length on MultiPart Uploads
This is used in the pcloud backend and in the upcoming 1fichier backend.
2019-06-30 17:57:22 +01:00
Nick Craig-Wood
07e2c3a50f b2: fix nil pointer error introduced by context propagation patch
For some reason f78cd1e043 introduced an unrelated change -
perhaps a merge error.  Removing this change fixes the nil pointer
problem.
2019-06-28 22:38:41 +01:00
Jon Fautley
cd762f04b8 sftp: Completely ignore all modtime checks if SetModTime=false 2019-06-28 10:33:14 +01:00
Sandeep
6907242cae
azureblob: Updated config help details to remove connection string references (#3306) 2019-06-27 18:53:33 -07:00
nguyenhuuluan434
940d88b695 refactor code 2019-06-27 13:28:35 +01:00
nguyenhuuluan434
ca324b5084 trying to capture segments info during upload to swift backend and
delete if there is error duing upload object.
2019-06-27 13:28:35 +01:00
Nick Craig-Wood
9f4589a997 gcs: reduce oauth scope requested as suggested by Google
As part of getting the rclone oauth consent screen approved by Google,
it came up that the scope in use by the gcs backend was too large.

This change reduces it to the minimum scope which still allows rclone
to work correctly.

Old scope: https://www.googleapis.com/auth/devstorage.full_control
New scope: https://www.googleapis.com/auth/devstorage.read_write
2019-06-27 12:05:49 +01:00
Sandeep
fc44eb4093
Azure Storage Emulator support (#3285)
* azureblob - Add support for Azure Storage Emulator to test things locally.

Testing - Verified changes by testing manually.

* docs: update azureblob docs to reflect support of storage emulator
2019-06-26 20:46:22 -07:00
Nick Craig-Wood
a1840f6fc7 sftp: add missing interface check and fix About #3257
This bug was introduced as part of adding context to the backends and
slipped through the net because the About call did not have an
interface assertion in the sftp backend.

I checked there were no other missing interface assertions on all the
optional methods on all the backends.
2019-06-26 16:56:33 +01:00
Aleksandar Jankovic
f78cd1e043 Add context propagation to rclone
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions

Context propagation is needed for stopping transfers and passing other
request-scoped values.
2019-06-19 11:59:46 +01:00
Nick Craig-Wood
628530362a local: add --local-case-sensitive and --local-case-insensitive
This is to force the remote to declare itself as case sensitive or
insensitive where the defaults for the operating system are wrong.

See: https://forum.rclone.org/t/duplicate-object-found-in-source-ignoring-dedupe-not-finding-anything/10465
2019-06-17 17:09:48 +01:00
Nick Craig-Wood
3087c5d559 webdav: retry on 423 Locked errors #3263 2019-06-15 10:58:13 +01:00
Nick Craig-Wood
22368b997c b2: implement SetModTime #3210
SetModTime() is implemented by copying an object onto itself and
updating the metadata in the process.
2019-06-13 17:31:33 +01:00
Nick Craig-Wood
a5bed67016 b2: implement server side copy - fixes #3210 2019-06-13 17:31:33 +01:00
Nick Craig-Wood
4d195d5a52 gcs: Fix upload errors when uploading pre 1970 files
Before this change rclone attempted to set the "updated" field in
uploaded objects to the modification time.

However when this modification time was before 1970, google drive
would return the rather cryptic error:

    googleapi: Error 400: Invalid value for UnsignedLong: -42000, invalid

However API docs: https://cloud.google.com/storage/docs/json_api/v1/objects#resource
state the "updated" field is read only and tests confirm that.  Even
though the field is read only, it looks like Google parses it.

This change therefore removes the attempt to set the "updated" field
(which was doing nothing anyway) and fixes the problem uploading pre
1970 files.

See #3196 and https://forum.rclone.org/t/invalid-value-for-unsignedlong-file-missing-date-modified/3466
2019-06-12 10:51:49 +01:00
Nick Craig-Wood
e24cadc7a1 box: Fix ineffectual assignment (ineffassign) 2019-06-10 19:33:10 +01:00
Gary Kim
db8cd1a993 ftp: Add no_check_certificate option for FTPS 2019-06-09 16:06:39 +01:00
Gary Kim
2890b69c48 ftp: Add FTP over TLS support 2019-06-09 16:06:39 +01:00
Garry McNulty
e2fde62cd9 drive: add --drive-size-as-quota to show storage quota usage for file size - fixes #3135 2019-06-09 16:00:41 +01:00
Nick Craig-Wood
aa81957586 drive: add --drive-server-side-across-configs
In #2728 and 55b9a4e we decided to allow server side operations
between google drives with different configurations.

This works in some cases (eg between teamdrives) but does not work in
the general case, and this caused breakage in quite a number of
people's workflows.

This change makes the feature conditional on the
--drive-server-side-across-configs flag which defaults to off.

See: https://forum.rclone.org/t/gdrive-to-gdrive-error-404-file-not-found/9621/10

Fixes #3119
2019-06-07 12:12:49 +01:00
Cnly
e4c2468244 onedrive: More accurately check if root is found - fixes #3164 2019-06-06 15:48:46 +01:00
Nick Craig-Wood
5a941cdcdc local: fix preallocate warning on Linux with ZFS
Under Linux, rclone attempts to preallocate files for efficiency.

Before this change, pre-allocation would fail on ZFS with the error

    Failed to pre-allocate: operation not supported

After this change rclone tries a different flag combination for ZFS
then disables pre-allocate if that doesn't work.

Fixes #3066
2019-06-03 18:02:26 +01:00
Philip Harvey
1a2fb52266 s3: make SetModTime work for GLACIER while syncing - Fixes #3224
Before this change rclone would fail with

    Failed to set modification time: InvalidObjectState: Operation is not valid for the source object's storage class

when attempting to set the modification time of an object in GLACIER.

After this change rclone will re-upload the object as part of a sync if it needs to change the modification time.

See: https://forum.rclone.org/t/suspected-bug-in-s3-or-compatible-sync-logic-to-glacier/10187
2019-06-03 15:28:19 +01:00
buengese
25f7f2b60a jottacloud: Add support for selecting device and mountpoint. fixes #3069 2019-05-29 17:16:42 +01:00