mirror of
https://github.com/rclone/rclone.git
synced 2025-12-06 00:03:32 +00:00
fs: implement --metadata-mapper to transform metatadata with a user supplied program
This commit is contained in:
@@ -475,6 +475,10 @@ Note that arbitrary metadata may be added to objects using the
|
||||
`--metadata-set key=value` flag when the object is first uploaded.
|
||||
This flag can be repeated as many times as necessary.
|
||||
|
||||
The [--metadata-mapper](#metadata-mapper) flag can be used to pass the
|
||||
name of a program in which can transform metadata when it is being
|
||||
copied from source to destination.
|
||||
|
||||
### Types of metadata
|
||||
|
||||
Metadata is divided into two type. System metadata and User metadata.
|
||||
@@ -1504,12 +1508,123 @@ from reaching the limit. Only applicable for `--max-transfer`
|
||||
|
||||
Setting this flag enables rclone to copy the metadata from the source
|
||||
to the destination. For local backends this is ownership, permissions,
|
||||
xattr etc. See the [#metadata](metadata section) for more info.
|
||||
xattr etc. See the [metadata section](#metadata) for more info.
|
||||
|
||||
### --metadata-mapper SpaceSepList {#metadata-mapper}
|
||||
|
||||
If you supply the parameter `--metadata-mapper /path/to/program` then
|
||||
rclone will use that program to map metadata from source object to
|
||||
destination object.
|
||||
|
||||
The argument to this flag should be a command with an optional space separated
|
||||
list of arguments. If one of the arguments has a space in then enclose
|
||||
it in `"`, if you want a literal `"` in an argument then enclose the
|
||||
argument in `"` and double the `"`. See [CSV encoding](https://godoc.org/encoding/csv)
|
||||
for more info.
|
||||
|
||||
--metadata-mapper "python bin/test_metadata_mapper.py"
|
||||
--metadata-mapper 'python bin/test_metadata_mapper.py "argument with a space"'
|
||||
--metadata-mapper 'python bin/test_metadata_mapper.py "argument with ""two"" quotes"'
|
||||
|
||||
This uses a simple JSON based protocol with input on STDIN and output
|
||||
on STDOUT. This will be called for every file and directory copied and
|
||||
may be called concurrently.
|
||||
|
||||
The program's job is to take a metadata blob on the input and turn it
|
||||
into a metadata blob on the output suitable for the destination
|
||||
backend.
|
||||
|
||||
Input to the program (via STDIN) might look like this. This provides
|
||||
some context for the `Metadata` which may be important.
|
||||
|
||||
- `SrcFs` is the config string for the remote that the object is currently on.
|
||||
- `SrcFsType` is the name of the source backend.
|
||||
- `DstFs` is the config string for the remote that the object is being copied to
|
||||
- `DstFsType` is the name of the destination backend.
|
||||
- `Remote` is the path of the file relative to the root.
|
||||
- `Size`, `MimeType`, `ModTime` are attributes of the file.
|
||||
- `IsDir` is `true` if this is a directory (not yet implemented).
|
||||
- `ID` is the source `ID` of the file if known.
|
||||
- `Metadata` is the backend specific metadata as described in the backend docs.
|
||||
|
||||
```json
|
||||
{
|
||||
"SrcFs": "gdrive:",
|
||||
"SrcFsType": "drive",
|
||||
"DstFs": "newdrive:user",
|
||||
"DstFsType": "onedrive",
|
||||
"Remote": "test.txt",
|
||||
"Size": 6,
|
||||
"MimeType": "text/plain; charset=utf-8",
|
||||
"ModTime": "2022-10-11T17:53:10.286745272+01:00",
|
||||
"IsDir": false,
|
||||
"ID": "xyz",
|
||||
"Metadata": {
|
||||
"btime": "2022-10-11T16:53:11Z",
|
||||
"content-type": "text/plain; charset=utf-8",
|
||||
"mtime": "2022-10-11T17:53:10.286745272+01:00",
|
||||
"owner": "user1@domain1.com",
|
||||
"permissions": "...",
|
||||
"description": "my nice file",
|
||||
"starred": "false"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The program should then modify the input as desired and send it to
|
||||
STDOUT. The returned `Metadata` field will be used in its entirety for
|
||||
the destination object. Any other fields will be ignored. Note in this
|
||||
example we translate user names and permissions and add something to
|
||||
the description:
|
||||
|
||||
```json
|
||||
{
|
||||
"Metadata": {
|
||||
"btime": "2022-10-11T16:53:11Z",
|
||||
"content-type": "text/plain; charset=utf-8",
|
||||
"mtime": "2022-10-11T17:53:10.286745272+01:00",
|
||||
"owner": "user1@domain2.com",
|
||||
"permissions": "...",
|
||||
"description": "my nice file [migrated from domain1]",
|
||||
"starred": "false"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Metadata can be removed here too.
|
||||
|
||||
An example python program might look something like this to implement
|
||||
the above transformations.
|
||||
|
||||
```python
|
||||
import sys, json
|
||||
|
||||
i = json.load(sys.stdin)
|
||||
metadata = i["Metadata"]
|
||||
# Add tag to description
|
||||
if "description" in metadata:
|
||||
metadata["description"] += " [migrated from domain1]"
|
||||
else:
|
||||
metadata["description"] = "[migrated from domain1]"
|
||||
# Modify owner
|
||||
if "owner" in metadata:
|
||||
metadata["owner"] = metadata["owner"].replace("domain1.com", "domain2.com")
|
||||
o = { "Metadata": metadata }
|
||||
json.dump(o, sys.stdout, indent="\t")
|
||||
```
|
||||
|
||||
You can find this example (slightly expanded) in the rclone source code at
|
||||
[bin/test_metadata_mapper.py](https://github.com/rclone/rclone/blob/master/test_metadata_mapper.py).
|
||||
|
||||
If you want to see the input to the metadata mapper and the output
|
||||
returned from it in the log you can use `-vv --dump mapper`.
|
||||
|
||||
See the [metadata section](#metadata) for more info.
|
||||
|
||||
### --metadata-set key=value
|
||||
|
||||
Add metadata `key` = `value` when uploading. This can be repeated as
|
||||
many times as required. See the [#metadata](metadata section) for more
|
||||
many times as required. See the [metadata section](#metadata) for more
|
||||
info.
|
||||
|
||||
### --modify-window=TIME ###
|
||||
@@ -1752,9 +1867,9 @@ for more info.
|
||||
|
||||
Eg
|
||||
|
||||
--password-command echo hello
|
||||
--password-command echo "hello with space"
|
||||
--password-command echo "hello with ""quotes"" and space"
|
||||
--password-command "echo hello"
|
||||
--password-command 'echo "hello with space"'
|
||||
--password-command 'echo "hello with ""quotes"" and space"'
|
||||
|
||||
See the [Configuration Encryption](#configuration-encryption) for more info.
|
||||
|
||||
@@ -2503,6 +2618,12 @@ This dumps a list of the open files at the end of the command. It
|
||||
uses the `lsof` command to do that so you'll need that installed to
|
||||
use it.
|
||||
|
||||
#### --dump mapper ####
|
||||
|
||||
This shows the JSON blobs being sent to the program supplied with
|
||||
`--metadata-mapper` and received from it. It can be useful for
|
||||
debugging the metadata mapper interface.
|
||||
|
||||
### --memprofile=FILE ###
|
||||
|
||||
Write memory profile to file. This can be analysed with `go tool pprof`.
|
||||
|
||||
Reference in New Issue
Block a user