1
0
mirror of https://github.com/gilbertchen/duplicacy synced 2025-12-06 00:03:38 +00:00

Compare commits

...

10 Commits

Author SHA1 Message Date
gilbertchen
50eaaa94f2 Update DESIGN.md 2016-03-23 18:17:18 -04:00
gilbertchen
ee682bad52 Update GUIDE.md 2016-03-12 13:05:06 -05:00
gilbertchen
38a778557b Update GUIDE.md 2016-03-11 23:11:34 -05:00
gilbertchen
a08e54d6d3 Update README.md 2016-03-08 21:09:51 -05:00
gilbertchen
de3e2b9823 Update GUIDE.md 2016-03-07 16:14:53 -05:00
gilbertchen
b44dfc3ba5 Update GUIDE.md 2016-03-07 16:13:52 -05:00
gilbertchen
09ebcf79d2 Update README.md 2016-03-03 22:06:02 -05:00
gilbertchen
a93faa84b1 Update README.md 2016-03-03 22:04:52 -05:00
gilbertchen
b51d979bec Update README.md 2016-03-02 13:26:55 -05:00
gilbertchen
c2cc41d47c Update README.md 2016-03-02 13:09:27 -05:00
3 changed files with 26 additions and 22 deletions

View File

@@ -16,7 +16,7 @@ time the rolling hash window is shifted by one byte, thus significantly reducing
What is novel about lock-free deduplication is the absence of a centralized indexing database for tracking all existing
chunks and for determining which chunks are not needed any more. Instead, to check if a chunk has already been uploaded
before, one can just perform a file lookup via the file storage API using the file name derived from the hash of the chunk.
This effectively turn a cloud storage offering only a very limited
This effectively turns a cloud storage offering only a very limited
set of basic file operations into a powerful modern backup backend capable of both block-level and file-level deduplication. More importantly, the absence of a centralized indexing database means that there is no need to implement a distributed locking mechanism on top of the file storage.
By eliminating the chunk indexing database, lock-free duplication not only reduces the code complexity but also makes the deduplication less error-prone. Each chunk is saved individually in its own file, and once saved there is no need for modification. Data corruption is therefore less likely to occur because of the immutability of chunk files. Another benefit that comes naturally from lock-free duplication is that when one client creates a new chunk, other clients that happen to have the same original file will notice that the chunk already exist and therefore will not upload the same chunk again. This pushes the deduplication to its highest level -- clients without knowledge of each other can share identical chunks with no extra effort.

View File

@@ -457,20 +457,20 @@ For the *restore* command, the include/exclude patterns are specified as the com
Duplicacy will attempt to retrieve in three ways the storage password and the storage-specific access tokens/keys.
* If a secret vault service is available, Duplicacy will store the password input by the user in such a secret vault and later retrieve it when needed. On Mac OS X it is Keychain, and on Linux it is gnome-keyring. On Windows the password is encrypted and decrypted by the Data Protection API, and encrypted password is stored in the file *.duplicacy/keyring*. However, if the -no-save-password option is specified for the storage, then Duplicacy will not save passwords this way
* If a secret vault service is available, Duplicacy will store passwords/keys entered by the user in such a secret vault and later retrieve them when needed. On Mac OS X it is Keychain, and on Linux it is gnome-keyring. On Windows the passwords/keys are encrypted and decrypted by the Data Protection API, and encrypted passwords/keys are stored in the file *.duplicacy/keyring*. However, if the -no-save-password option is specified for the storage, then Duplicacy will not save passwords this way.
* If an environment variable for a password is provided, Duplicacy will always take it. The table below shows the name of the environment variable for each kind of password. Note that if the storage is not the default one, the storage name will be included in the name of the environment variable.
* If a matching key and its value are saved to the preference file (.duplicacy/preferences) by the *set* command, the value will be used as the password. The last column in the table below lists the name of the preference key for each type of password.
| password type | environment variable (default storage) | environment variable (non-default storage) | key in preferences |
|:----------------:|:----------------:|:----------------:|:----------------:|
| storage password | DUPLICACY_PASSWORD | DUPLICACY_<STORAGENAME>_PASSWORD | password |
| sftp password | DUPLICACY_SSH_PASSWORD | DUPLICACY_<STORAGENAME>_SSH_PASSWORD | ssh_password |
| Dropbox Token | DUPLICACY_DROPBOX_TOKEN | DUPLICACY_<STORAGENAME>_DROPBOX_TOEKN | dropbox_token |
| S3 Access ID | DUPLICACY_S3_ID | DUPLICACY_<STORAGENAME>_S3_ID | s3_id |
| S3 Secret Key | DUPLICACY_S3_KEY | DUPLICACY_<STORAGENAME>_S3_KEY | s3_key |
| BackBlaze Account ID | DUPLICACY_B2_ID | DUPLICACY_<STORAGENAME>_B2_ID | b2_id |
| Backblaze Application Key | DUPLICACY_B2_KEY | DUPLICACY_<STORAGENAME>_B2_KEY | b2_key |
| Azure Access Key | DUPLICACY_AZURE_KEY | DUPLICACY_<STORAGENAME>_AZURE_KEY | azure_key |
| storage password | DUPLICACY_PASSWORD | DUPLICACY_&lt;STORAGENAME&gt;_PASSWORD | password |
| sftp password | DUPLICACY_SSH_PASSWORD | DUPLICACY_&lt;STORAGENAME&gt;_SSH_PASSWORD | ssh_password |
| Dropbox Token | DUPLICACY_DROPBOX_TOKEN | DUPLICACY_&lt;STORAGENAME>&gt;_DROPBOX_TOKEN | dropbox_token |
| S3 Access ID | DUPLICACY_S3_ID | DUPLICACY_&lt;STORAGENAME&gt;_S3_ID | s3_id |
| S3 Secret Key | DUPLICACY_S3_SECRET | DUPLICACY_&lt;STORAGENAME&gt;_S3_SECRET | s3_secret |
| BackBlaze Account ID | DUPLICACY_B2_ID | DUPLICACY_&lt;STORAGENAME&gt;_B2_ID | b2_id |
| Backblaze Application Key | DUPLICACY_B2_KEY | DUPLICACY_&lt;STORAGENAME&gt;_B2_KEY | b2_key |
| Azure Access Key | DUPLICACY_AZURE_KEY | DUPLICACY_&lt;STORAGENAME&gt;_AZURE_KEY | azure_key |
Note that the passwords stored in the environment variable and the preference need to be in plaintext and thus are insecure and should be avoided whenever possible.

View File

@@ -7,11 +7,12 @@ This repository contains only binary releases and documentation for Duplicacy.
Duplicacy supports major cloud storage providers (Amazon S3, Google Cloud Storage, Microsoft Azure, Dropbox, and BackBlaze) and offers all essential features of a modern backup tool:
* Incremental backup: only back up what has been changed
* Full snapshot : although each backup is incremental, it behaves like a full snapshot
* Full snapshot : although each backup is incremental, it must behave like a full snapshot for easy restore and deletion
* Deduplication: identical files must be stored as one copy (file-level deduplication), and identical parts from different files must be stored as one copy (block-level deduplication)
* Encryption: encrypt not only file contents but also file paths, sizes, times, etc.
* Deletion: every backup can be deleted independently without affecting others
* Concurrent access: multiple clients can back up to the same storage at the same time
* Snapshot migration: all or selected snapshots can be migrated from one storage to another
The key idea behind Duplicacy is a concept called **Lock-Free Deduplication**, which can be summarized as follows:
@@ -90,7 +91,7 @@ $ duplicacy copy -r 1 -to s3 # Copy snapshot at revision 1 to the s3 storage
$ duplicacy copy -to s3 # Copy every snapshot to the s3 storage
```
The [User Guide](https://github.com/gilbertchen/duplicacy-beta/blob/master/DESIGN.md) contains a complete reference to
The [User Guide](https://github.com/gilbertchen/duplicacy-beta/blob/master/GUIDE.md) contains a complete reference to
all commands and other features of Duplicacy.
## Storages
@@ -110,7 +111,7 @@ Storage URL: /path/to/storage (on Linux or Mac OS X)
Storage URL: sftp://username@server/path/to/storage
```
Login methods include password authentication and public key authentication. Due to a limitation of the underlying Go SSH library, the key pair for public key authentication be generated without a passphrase. To work with a key that has a passphrase, you can set up SSH agent forwarding which is also supported by Duplicacy.
Login methods include password authentication and public key authentication. Due to a limitation of the underlying Go SSH library, the key pair for public key authentication must be generated without a passphrase. To work with a key that has a passphrase, you can set up SSH agent forwarding which is also supported by Duplicacy.
#### Dropbox
@@ -124,7 +125,7 @@ For Duplicacy to access your Dropbox storage, you must provide an access token t
* Or authorize Duplicacy to access its app folder inside your Dropbox (following [this link](https://dl.dropboxusercontent.com/u/95866350/start_dropbox_token.html)), and Dropbox will generate the access token (which is not visible to us, as the redirect page showing the token is merely a static html hosted by Dropbox)
Dropbox has two advantages over other cloud providers. First, if you are already a paid user then to use the unused space as the backup storage is basically free. Second, unlike other providers Dropbox does not charge API usage fees.
Dropbox has two advantages over other cloud providers. First, if you are already a paid user then to use the unused space as the backup storage is basically free. Second, unlike other providers Dropbox does not charge bandwidth or API usage fees.
#### Amazon S3
@@ -185,12 +186,15 @@ A command to delete old backups is in the developer's [plan](https://github.com/
The following table compares the feature lists of all these backup tools:
| Tool | Incremental Backup | Full Snapshot | Deduplication | Encryption | Deletion | Concurrent Access |Cloud Support |
|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
| duplicity | Yes | No | Weak | Yes | No | No | Extensive |
| bup | Yes | Yes | Yes | Yes | No | No | No |
| Obnam | Yes | Yes | Weak | Yes | Yes | Exclusive locking | No |
| Attic | Yes | Yes | Yes | Yes | Yes | Not recommended | No |
| restic | Yes | Yes | Yes | Yes | No | Exclusive locking | S3 only |
| **Duplicacy** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Lock-free** | **S3, GCS, Azure, Dropbox, BackBlaze** |
| Feature/Tool | duplicity | bup | Obnam | Attic | restic | **Duplicacy** |
|:------------------:|:---------:|:---:|:-----------------:|:---------------:|:-----------------:|:-------------:|
| Incremental Backup | Yes | Yes | Yes | Yes | Yes | **Yes** |
| Full Snapshot | No | Yes | Yes | Yes | Yes | **Yes** |
| Deduplication | Weak | Yes | Weak | Yes | Yes | **Yes** |
| Encryption | Yes | Yes | Yes | Yes | Yes | **Yes** |
| Deletion | No | No | Yes | Yes | No | **Yes** |
| Concurrent Access | No | No | Exclusive locking | Not recommended | Exclusive locking | **Lock-free** |
| Cloud Support | Extensive | No | No | No | S3 only | **S3, GCS, Azure, Dropbox, BackBlaze**|
| Snapshot Migration | No | No | No | No | No | **Yes** |