Compression saves disk space, but compressing data when it is stored and decompressing it when it is read both use CPU power. It's a trade-off: you need to decide which matters more, saving CPU cycles or saving disk storage. Generally, you would compress files that are rarely used and leave very active files uncompressed. Compression could reduce your disk usage by about 60%, although the saving depends on the data. Compression does not work well on files that are already compressed, for instance .jpg, .mp3 or .mpg files. Microsoft also does not recommend that you compress files bigger than 30 MB, as the files become fragmented and performance suffers.
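The difference between compressible and already-compressed data can be illustrated with a minimal sketch. It uses Python's `zlib` module as a stand-in, which is an assumption made for illustration: NTFS uses its own compression algorithm, so the exact ratios will differ. Random bytes stand in for a .jpg or .mp3 file, because well-compressed data looks statistically random.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)

# Repetitive text, standing in for a log or document file.
text_like = b"2024-01-01 INFO service started successfully\n" * 2000

# Random bytes, standing in for already-compressed data such as .jpg or .mp3.
already_compressed_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
random_ratio = compression_ratio(already_compressed_like)

print(f"text-like data:   {text_ratio:.1%} of original size")
print(f"random-like data: {random_ratio:.1%} of original size")
```

The text-like sample shrinks to a small fraction of its size, while the random sample does not shrink at all, yet both cost CPU time to process.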
So, compression saves some disk space but burns CPU and does not work well for some files. Is compression worthwhile? In my opinion, no, as it is cheap enough to add more disk space to a server. However, if you have an old folder of historical text files that you rarely use, it could be worthwhile to compress it. Another good reason to compress is if you are sending a few text files by e-mail.
On NTFS volumes, you can compress individual files, folders, or entire drives by right-clicking on the object, selecting 'Properties', then clicking the 'Advanced' button. In the dialog box that opens, tick the 'Compress contents to save disk space' check box. When you compress a whole folder, any new files added to that folder are automatically compressed as well.
How do you know if compression is active? In Windows 10, compressed folders have two small blue arrows in the top right corner of the icon. You can also have compressed items shown in a different color: open any folder window, choose 'View', 'Options', and on the View tab check the box labeled 'Display Compressed Files and Folders with Alternate Color'.
Older releases of Windows gave you the option to compress old files with the 'Disk Cleanup' utility, but that was removed in Windows 7, which probably indicates how useful Microsoft thinks compression is.
The other useful option is to zip up a folder or a group of files to send by e-mail. To do this, either select the files you want by clicking on them with your mouse while holding the Ctrl key down, or select an entire folder. Right-click on the selection, then choose 'Send to' and then 'Compressed (zipped) folder'. This will create a new, zipped folder while leaving your original data intact.
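If you need to do the same thing from a script rather than from Explorer, a rough equivalent can be sketched with Python's standard `zipfile` module. The helper name `zip_folder` and the demo file names are made up for this example. As with the 'Send to' option, the original files are left in place.

```python
import tempfile
import zipfile
from pathlib import Path

def zip_folder(folder: Path, archive: Path) -> None:
    """Write every file under `folder` into a new .zip archive; originals are untouched."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder so the archive has no absolute paths.
                zf.write(path, arcname=path.relative_to(folder).as_posix())

# Demo in a temporary directory with two hypothetical text files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "reports"
    folder.mkdir()
    (folder / "notes.txt").write_text("meeting notes\n" * 100)
    (folder / "todo.txt").write_text("buy more disk\n" * 100)

    archive = Path(tmp) / "reports.zip"
    zip_folder(folder, archive)

    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
    originals_intact = (folder / "notes.txt").exists() and (folder / "todo.txt").exists()
    print(names)
```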
If you are working with a very distributed infrastructure, maybe using Distributed File System, then you could end up transferring a lot of data around your network.
Remote Differential Compression (RDC) is intended to help manage this data transfer over limited-bandwidth networks. If a file is updated, RDC will only transfer the changed parts of files, called deltas, instead of the whole file. Microsoft claims that RDC can reduce bandwidth requirements by as much as 400:1.
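The idea of sending only deltas can be sketched as follows. This is a deliberately simplified illustration, not the actual RDC algorithm: it assumes fixed-size blocks and SHA-256 signatures, and the names `make_delta` and `apply_delta` are invented for the example. The receiver already holds the old version, so blocks it already has are sent as short references and only changed blocks are sent as data.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size chosen for this sketch

def split_blocks(data):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def signature(block):
    return hashlib.sha256(block).hexdigest()

def make_delta(old, new):
    """Describe `new` as references to blocks of `old` plus literal changed blocks."""
    known = {signature(b): i for i, b in enumerate(split_blocks(old))}
    delta = []
    for block in split_blocks(new):
        sig = signature(block)
        if sig in known:
            delta.append(("copy", known[sig]))   # receiver already has this block
        else:
            delta.append(("data", block))        # only this needs to cross the network
    return delta

def apply_delta(old, delta):
    """Rebuild the new file on the receiving side from the old file and the delta."""
    old_blocks = split_blocks(old)
    return b"".join(old_blocks[p] if kind == "copy" else p for kind, p in delta)

# A 100-block file in which a few bytes of one block are changed.
old_file = b"".join(bytes([i]) * BLOCK_SIZE for i in range(100))
changed = bytearray(old_file)
changed[5 * BLOCK_SIZE:5 * BLOCK_SIZE + 10] = b"X" * 10
new_file = bytes(changed)

delta = make_delta(old_file, new_file)
sent_bytes = sum(len(p) for kind, p in delta if kind == "data")
rebuilt = apply_delta(old_file, delta)
print(f"file size {len(new_file)} bytes, literal data sent {sent_bytes} bytes")
```

A fixed-block scheme like this one breaks down when data is inserted and everything after it shifts, which is one reason real implementations are more sophisticated.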
Data deduplication is an extension of compression. Whereas compression works on individual files, deduplication happens at block level and eliminates duplicate copies of repeating data, which includes duplicate files and duplicate data within several files. Data deduplication works by splitting data up into small variable-sized chunks, then comparing these chunks with existing chunks to identify duplicates. If a chunk is a duplicate, it is replaced with a reference, so that only a single copy of each chunk is stored.
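The chunk-and-reference idea can be shown with a small sketch. It makes two simplifying assumptions that differ from Windows: chunks are a fixed (and tiny) size rather than variable-sized, and duplicates are detected by SHA-256 hash. The class name `ChunkStore` is invented for this example.

```python
import hashlib

CHUNK_SIZE = 8  # tiny fixed size so the demo is easy to follow (an assumption)

class ChunkStore:
    def __init__(self):
        self.chunks = {}  # chunk hash -> chunk bytes, each stored only once
        self.files = {}   # file name -> ordered list of chunk hashes (references)

    def add_file(self, name, data):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if an identical one is not already present.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        self.files[name] = refs

    def read_file(self, name):
        """Reassemble a file by following its references into the chunk store."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())

file_a = b"AAAAAAAABBBBBBBBCCCCCCCC"
file_b = b"AAAAAAAABBBBBBBBDDDDDDDD"  # shares its first two chunks with file_a

store = ChunkStore()
store.add_file("a.dat", file_a)
store.add_file("b.dat", file_b)

logical_bytes = len(file_a) + len(file_b)
print(f"logical size {logical_bytes} bytes, stored {store.stored_bytes()} bytes")
```

Two 24-byte files that share two of their three chunks need only four unique chunks on disk.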
Windows Deduplication is post-process: files are initially created full size and are not deduplicated at once, but are retained full size for a minimum amount of time before they are processed. Chunks are stored in container files, and new chunks are appended to the current chunk store container. When its size reaches about 1 GB, that container file is sealed and a new container file is created.
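The append-then-seal behavior of the chunk store can be pictured with a short sketch. This is only a model of the behavior described above, not how Windows implements it: the class name `ChunkStoreWriter` is hypothetical, containers are plain in-memory lists, and the demo uses a 100-byte limit in place of the roughly 1 GB limit.

```python
class ChunkStoreWriter:
    def __init__(self, max_container_bytes):
        self.max_container_bytes = max_container_bytes
        self.sealed = []        # containers that will receive no more chunks
        self.current = []       # the container currently being appended to
        self.current_size = 0

    def append(self, chunk):
        """Append a chunk; seal the container once it reaches the size limit."""
        self.current.append(chunk)
        self.current_size += len(chunk)
        if self.current_size >= self.max_container_bytes:
            self.sealed.append(self.current)
            self.current = []
            self.current_size = 0

# Scaled-down demo: 25 chunks of 10 bytes with a 100-byte container limit.
writer = ChunkStoreWriter(max_container_bytes=100)
for n in range(25):
    writer.append(bytes([n]) * 10)

print(f"sealed containers: {len(writer.sealed)}, "
      f"bytes in current container: {writer.current_size}")
```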
Deduplicated data chunks are not deleted immediately when a file is deleted. The references (or reparse points) to the deduplicated chunks are deleted, and a garbage collection job runs later to reclaim the space held by obsolete chunks.
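The two-step delete can be sketched like this. It is a toy model under stated assumptions: the class `DedupVolume`, the chunk ids and the file names are all invented, and real garbage collection works on container files rather than a dictionary. The point it shows is that deleting a file removes only references, and space comes back only when a later pass finds chunks that nothing refers to.

```python
class DedupVolume:
    def __init__(self):
        self.chunks = {}  # chunk id -> chunk data
        self.files = {}   # file name -> list of chunk ids it references

    def delete_file(self, name):
        """Remove only the file's references; chunk data stays in place for now."""
        del self.files[name]

    def garbage_collect(self):
        """Reclaim chunks that no remaining file references. Returns bytes freed."""
        live = {cid for refs in self.files.values() for cid in refs}
        dead = [cid for cid in self.chunks if cid not in live]
        freed = sum(len(self.chunks[cid]) for cid in dead)
        for cid in dead:
            del self.chunks[cid]
        return freed

volume = DedupVolume()
volume.chunks = {"c1": b"1111", "c2": b"2222", "c3": b"3333"}
volume.files = {"a.dat": ["c1", "c2"], "b.dat": ["c2", "c3"]}

volume.delete_file("a.dat")
chunks_after_delete = len(volume.chunks)   # nothing reclaimed yet

freed = volume.garbage_collect()           # c1 is now unreferenced; c2 is still used by b.dat
print(f"chunks after delete: {chunks_after_delete}, bytes freed by GC: {freed}")
```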
Data Deduplication was introduced in Windows Server 2012, but up to and including Windows Server 2012 R2 it had some limitations. Volumes larger than 10 TB were not considered good candidates for deduplication, the very large files that are typical of backup processes were not good candidates either, and the Data Deduplication process used a single thread and a single I/O queue for each volume.
Data Deduplication in Windows Server 2016 supports volume sizes up to 64 TB and files up to 1 TB. The Data Deduplication process now runs multiple threads in parallel, using multiple I/O queues for each volume, which speeds up the post-processing operations.