--- title: "Cluster" description: "Clustering rclone" versionIntroduced: "v1.72" --- # Cluster Rclone has a cluster mode invoked with the `--cluster` flag. This enables a group of rclone instances to work together on doing a sync. This is controlled by a group of flags starting with `--cluster-` and enabled with the `--cluster` flag. ```text --cluster string Enable cluster mode with remote to use as shared storage --cluster-batch-files int Max number of files for a cluster batch (default 1000) --cluster-batch-size SizeSuffix Max size of files for a cluster batch (default 1Ti) --cluster-cleanup ClusterCleanup Control which cluster files get cleaned up (default full) --cluster-id string Set to an ID for the cluster. An ID of 0 or empty becomes the controller --cluster-quit-workers Set to cause the controller to quit the workers when it finished ``` The command might look something like this which is a normal rclone command but with a new `--cluster` flag which points at an rclone remote defining the cluster storage. This is the signal to rclone that it should engage the cluster mode with a controller and workers. ```sh rclone copy source: destination: --flags --cluster /work rclone copy source: destination: --flags --cluster s3:bucket ``` This works only with the `rclone sync`, `copy` and `move` commands. If the remote specified by the `--cluster` command is inside the `source:` or `destination:` it must be excluded with the filter flags. Any rclone remotes used in the transfer must be defined in all cluster nodes. Defining remotes with connection strings will get around that problem. ## Terminology The cluster has two logical groups, the controller and the workers. There is one controller and many workers. The controller and the workers will communicate with each other by creating files in the remote pointed to by the `--cluster` flag. This could be for example an S3 bucket or a Kubernetes PVC. The files are JSON serialized rc commands. Multiple commands are sent using `rc/batch`. The commands flow `pending` →`processing` → `done` → `finished` ```text └── queue ├── pending ← pending task files created by the controller ├── processing ← claimed tasks being executed by a worker ├── done ← finished tasks awaiting the controller to read the result └── finished ← completed task files ``` The cluster can be set up in two ways as a persistent cluster or as a transient cluster. ### Persistent cluster Run a cluster of workers using ```sh rclone rcd --cluster /work ``` Then run rclone commands when required on the cluster: ```sh rclone copy source: destination: --flags --cluster /work ``` In this mode there can be many rclone commands executing at once. ### Transient cluster Run many copies of rclone simultaneously, for example in a Kubernetes indexed job. The rclone with `--cluster-id 0` becomes the controller and the others become the workers. For a Kubernetes indexed job, setting `--cluster-id $(JOB_COMPLETION_INDEX)` would work well. Add the `--cluster-quit-workers` flag - this will cause the controller to make sure the workers exit when it has finished. All instances of rclone run a command like this so the whole cluster can only run one rclone command: ```sh rclone copy source: destination: --flags --cluster /work --cluster-id $(JOB_COMPLETION_INDEX) --cluster-quit-workers ``` ## Controller The controller runs the sync and work distribution. - It does the listing of the source and destination directories comparing files in order to find files which need to be transferred. 
## Controller

The controller runs the sync and distributes the work.

- It lists the source and destination directories, comparing files to find those which need to be transferred.
- Files which need to be transferred are batched into jobs of at most `--cluster-batch-files` files or `--cluster-batch-size` total size and written to `queue/pending` for the workers to pick up.
- It watches `queue/done` for finished jobs, updates the transfer statistics, logs any errors, and moves the job to `queue/finished`.

Once the sync is complete, if `--cluster-quit-workers` is set, the controller sends the workers a special command which causes them all to exit.

The controller only sends transfer jobs to the workers. All the other tasks (eg listing, comparing) are done by the controller. The controller does not execute any transfer tasks itself.

The controller reads worker status as written to `queue/status` and will detect workers which have stopped. If it detects a failed worker then it will re-assign any outstanding work. Note that this failure handling is not yet implemented - see [Not implemented](#not-implemented) below.

## Workers

The workers' job is entirely to act as API endpoints that receive their work via files in `/work`. Each worker loops:

- Read work in `queue/pending`.
- Attempt to rename it into `queue/processing`. If the cluster work directory supports atomic renames then these are used, otherwise the worker reads the file, writes the copy and deletes the original. If the delete fails then the rename was not successful (possible on S3 backends).
- If the rename succeeded then do that item of work. If it did not, another worker got there first, so sleep for a bit and retry.
- After the work is complete, rename the `queue/processing` file into `queue/done` so the controller can read the result. Whether the job files are removed afterwards is controlled by the `--cluster-cleanup` flag.
- Repeat.

Every second the worker writes a status file in `queue/status` to be read by the controller (not yet implemented - see below).

## Layout of the work directory

The format of the files in this directory may change without notice, but the layout is documented here as it can help with debugging.

```text
/work - root of the work directory
└── queue - files to control the queue
    ├── done - job files that are finished but not yet read by the controller
    ├── finished - job files that are finished and have been read
    ├── pending - job files that are not started yet
    ├── processing - job files that are running
    └── status - worker status files
```

If debugging, use `--cluster-cleanup none` to leave the completed files in the directory layout.
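
For example, after a debugging run made with `--cluster-cleanup none`, the queue can be inspected with ordinary rclone commands. This is a sketch assuming the cluster storage was the local directory `/work`; `<job file>` stands in for one of the listed names:

```sh
# List every job and status file left in the work directory.
rclone lsf -R /work/queue

# Job files are JSON serialized rc commands, so they can be dumped
# directly for inspection ("<job file>" is a placeholder).
rclone cat /work/queue/finished/<job file>
```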
## Flags

### --cluster string

This enables the cluster mode. Without this flag all the other cluster flags are ignored.

This should be given a remote which can be a local directory, eg `/work`, or a remote directory, eg `s3:bucket`.

### --cluster-batch-files int

This controls the maximum number of files in a cluster batch. Setting this larger may be more efficient but it means the statistics will be less accurate on the controller (default 1000).

### --cluster-batch-size SizeSuffix

This controls the maximum total size of files in a cluster batch. If the size of the files in a batch exceeds this number then the batch will be sent to the workers. Setting this larger may be more efficient but it means the statistics will be less accurate on the controller (default 1Ti).

### --cluster-cleanup ClusterCleanup

Controls which cluster files get cleaned up.

- `full` - clean all work files (default)
- `completed` - clean completed work files but leave the errors and status
- `none` - leave all the files (useful for debugging)

### --cluster-id string

Set an ID for the rclone instance. This can be a string or a number. An ID of 0 becomes the controller, otherwise the instance becomes a worker.

If this flag isn't supplied or the value is empty then a random string will be used instead.

### --cluster-quit-workers

If this flag is set then when the controller finishes its sync task it will quit all the workers before it exits.

## Not implemented

Here are some features from the original design which are not implemented yet:

- the controller will not notice if workers die or fail to complete their tasks
- the controller does not re-assign the workers' work if necessary
- the controller does not restart the sync
- the workers do not write any status files (but the stats are correctly accounted)