Update DESIGN.md

2026-01-21 11:53:21 +00:00 · 2016-03-23 18:17:18 -04:00
parent ee682bad52
commit 50eaaa94f2
1 changed file with 1 addition and 1 deletion
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -16,7 +16,7 @@ time the rolling hash window is shifted by one byte, thus significantly reducing
 What is novel about lock-free deduplication is the absence of a centralized indexing database for tracking all existing
 chunks and for determining which chunks are not needed any more.  Instead, to check if a chunk has already been uploaded
 before, one can just perform a file lookup via the file storage API using the file name derived from the hash of the chunk.
-This effectively turn a cloud storage offering only a very limited
+This effectively turns a cloud storage offering only a very limited
 set of basic file operations into a powerful modern backup backend capable of both block-level and file-level deduplication.  More importantly, the absence of a centralized indexing database means that there is no need to implement a distributed locking mechanism on top of the file storage.

 By eliminating the chunk indexing database, lock-free deduplication not only reduces the code complexity but also makes the deduplication less error-prone.  Each chunk is saved individually in its own file, and once saved there is no need for modification.  Data corruption is therefore less likely to occur because of the immutability of chunk files.  Another benefit that comes naturally from lock-free deduplication is that when one client creates a new chunk, other clients that happen to have the same original file will notice that the chunk already exists and therefore will not upload the same chunk again.  This pushes the deduplication to its highest level -- clients without knowledge of each other can share identical chunks with no extra effort.
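
The lookup-before-upload check described in this hunk can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: a local directory stands in for the cloud storage, SHA-256 is assumed as the chunk hash, and `chunk_path` and `upload_if_missing` are hypothetical names.

```python
import hashlib
import os
import tempfile


def chunk_path(storage_dir: str, chunk: bytes) -> str:
    """Derive the chunk's file name from its content hash (SHA-256 is an assumption)."""
    return os.path.join(storage_dir, "chunks", hashlib.sha256(chunk).hexdigest())


def upload_if_missing(storage_dir: str, chunk: bytes) -> bool:
    """Store the chunk unless a file with its hash-derived name already exists.

    Returns True if the chunk was written, False if it was already present.
    No index database or lock is consulted: the existence check is the lookup.
    """
    path = chunk_path(storage_dir, chunk)
    if os.path.exists(path):
        return False  # another client (or an earlier backup) already stored it
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Write to a temporary name first, then rename, so readers never see a
    # partially written chunk. Two clients racing on the same chunk would both
    # write identical bytes, so the outcome is the same either way.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as f:
        f.write(chunk)
    os.replace(tmp, path)
    return True
```

Because a chunk file is never modified once written, a second client calling `upload_if_missing` with the same bytes simply finds the file and skips the upload.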