Deduplication Algorithm
Introduction to Deduplication

According to Wikipedia, "Data deduplication is a specific form of compression where redundant data is eliminated, typically to improve storage utilization. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. However, indexing of all data is still retained should that data ever be required. Deduplication is able to reduce the required storage capacity since only the unique data is stored."

Methods for the Deduplication Algorithm:

File-level Deduplication
Block-level Deduplication

File-level deduplication watches for multiple copies of the same file, stores the first copy, and then links the other references to that copy. Only one copy gets stored on the disk or tape archive. Ultimately, the space saved on disk depends on how many copies of the file there were in the file system.

Let's assume a company with 1000 employees who share a common file, say "data.txt", which ...
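The file-level approach described above (store the first copy, link every later copy to it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production design: the class name FileLevelStore, the in-memory dictionaries, the file paths, and the placeholder file contents are all hypothetical, and two files are treated as identical whenever their SHA-256 digests match.

```python
import hashlib


class FileLevelStore:
    """Toy in-memory store that keeps one copy of each distinct file content."""

    def __init__(self):
        self.blobs = {}  # digest -> file content, stored exactly once
        self.index = {}  # file name -> digest (the "link" to the stored copy)

    def put(self, name: str, data: bytes) -> bool:
        """Record a file; return True if its content had to be newly stored."""
        # Assumption: equal SHA-256 digests mean equal content.
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every name is indexed, so every file remains retrievable.
        self.index[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        """Return the content of a file by following its index entry."""
        return self.blobs[self.index[name]]

    def stored_bytes(self) -> int:
        """Bytes actually held after deduplication."""
        return sum(len(blob) for blob in self.blobs.values())


store = FileLevelStore()
content = b"shared report contents\n"  # placeholder content, not from the article
other = b"something different\n"       # a second, distinct file

# 1000 users each keep their own copy of the same file.
for i in range(1000):
    store.put(f"user{i}/data.txt", content)
store.put("user0/notes.txt", other)

# Size the files would occupy if every copy were stored separately.
logical_bytes = sum(len(store.get(name)) for name in store.index)
print(len(store.index), "files indexed,", len(store.blobs), "contents stored")
print(logical_bytes, "logical bytes vs", store.stored_bytes(), "stored bytes")
```

The index keeps an entry for every file name even though the content is stored once, which mirrors the point in the quoted definition that indexing of all data is retained. The saving grows with the number of identical copies, as the text notes.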