Windows ReFS

Microsoft is developing Windows as a server operating system capable of hosting the most demanding applications, and one of the issues it faced was NTFS. NTFS is the principal Windows file system, but it has a couple of serious limitations: it cannot easily handle the multi-terabyte drives that are now in common use, and when the file system breaks, it needs a lengthy disk check operation to repair it. Most companies cannot afford to have business-critical applications down for an extended period while disk issues are being fixed.

To fix these issues, Microsoft developed a new file system called ReFS, or 'Resilient File System', which was first introduced with Windows Server 2012. Microsoft has a 'statement of intent' to move to ReFS as the default file system, but there is no timescale for this as yet; the fact that Windows cannot yet boot from a ReFS volume is an immediate show-stopper. ReFS was designed to support most of the NTFS features, so it does not need new system APIs, and most file system filters will continue to work with ReFS volumes.
The NTFS features that are supported include: Access Control Lists, BitLocker encryption, change notifications, file IDs, USN journals, junction points, symbolic links, mount points, reparse points, volume snapshots and oplocks.
Some NTFS features were removed in the initial release of ReFS, then restored in later editions. These include: alternate data streams, and automatic correction of corruption when integrity streams are used on parity spaces. Alternate data streams were required so that ReFS could support MSSQL servers.
Some features were dropped and have not been reinstated so far: object IDs, short 8.3 filenames, NTFS compression, file-level encryption (EFS), user data transactions, sparse files, hard links, extended attributes, and quotas.
The major remaining issues are that ReFS does not offer data deduplication, it does not support SAN-attached volumes, and Windows cannot be booted from a ReFS volume.

Ensuring Data Integrity

Metadata is 'data about data' and is used to describe disks, directories and files, so it is vital that the metadata does not get corrupted. ReFS uses a number of techniques to keep the metadata valid, including independently stored 64-bit checksums and never writing metadata in place, which avoids the possibility of 'torn writes'.
All ReFS metadata is checksummed at the level of a B+ tree page, and the checksum is stored independently from the page itself. This allows ReFS to detect all forms of disk corruption, including lost and misdirected writes and 'bit rot', the gradual degradation of data on the media.
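The key idea is that the checksum lives apart from the page it protects, so a write that damages the page cannot also silently "fix" its checksum. The following is a minimal sketch of that idea in Python; the class and names are illustrative, CRC32 stands in for the real 64-bit ReFS checksum, and nothing here reflects the actual on-disk structures.

```python
import zlib

def checksum(page: bytes) -> int:
    # ReFS uses its own 64-bit checksum; zlib's CRC32 stands in here.
    return zlib.crc32(page)

class MetadataStore:
    """Illustrative store that keeps checksums apart from the pages."""

    def __init__(self):
        self.pages = {}      # page_id -> page bytes (the data itself)
        self.checksums = {}  # stored independently of the pages

    def write_page(self, page_id: int, page: bytes):
        self.pages[page_id] = page
        self.checksums[page_id] = checksum(page)

    def read_page(self, page_id: int) -> bytes:
        page = self.pages[page_id]
        # A lost write, misdirected write, or bit rot changes the page
        # but not the independently stored checksum, so it is caught here.
        if checksum(page) != self.checksums[page_id]:
            raise IOError(f"metadata page {page_id} is corrupt")
        return page

store = MetadataStore()
store.write_page(1, b"directory entries...")
assert store.read_page(1) == b"directory entries..."

# Simulate bit rot on the media: the page no longer matches its
# checksum, so the corruption is detected on read rather than
# silently returned to the caller.
store.pages[1] = b"directory entrieX..."
try:
    store.read_page(1)
except IOError:
    pass  # corruption detected, as intended
```

A checksum stored inside the page itself could not catch a lost or misdirected write, because the stale page would still be internally consistent; the independent copy is what makes those failures detectable.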

The same technique can optionally be used for file data, in which case it is called an 'integrity stream'. If integrity streams are enabled, ReFS always writes file changes to a location different from the original one. This allocate-on-write technique ensures that pre-existing data is not lost by the new write. The action of writing the update and writing the checksum is atomic, that is, they must both be completed as a single transaction. An important result of this is that if a file does get corrupted, say by a power failure during a write, then that file can be deleted, then either restored from backup or re-created.
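The allocate-on-write behaviour described above can be sketched as follows. This is a hypothetical model, not the real ReFS implementation: the class, the location map, and the use of CRC32 in place of the real checksum are all assumptions made for illustration.

```python
import zlib

class IntegrityFile:
    """Illustrative allocate-on-write file with a per-copy checksum."""

    def __init__(self, data: bytes):
        # location -> (data, checksum); both are written together
        self.blocks = {0: (data, zlib.crc32(data))}
        self.current = 0   # which location holds the live copy
        self.next_loc = 1

    def update(self, data: bytes):
        loc = self.next_loc
        self.next_loc += 1
        # Write the new data and its checksum to a fresh location.
        # The old copy at self.current is untouched, so a crash here
        # cannot produce a torn write: the old version is still intact.
        self.blocks[loc] = (data, zlib.crc32(data))
        # Commit by switching the pointer; a single atomic step.
        self.current = loc

    def read(self) -> bytes:
        data, stored = self.blocks[self.current]
        if zlib.crc32(data) != stored:
            raise IOError("file data is corrupt")
        return data

f = IntegrityFile(b"version 1")
f.update(b"version 2")
assert f.read() == b"version 2"
```

The pointer switch at the end of `update` is the commit point: before it, readers see the old complete version; after it, the new complete version. There is no moment at which a half-written mixture of the two is visible.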
The older NTFS file system could not open or delete corrupted files, so for NTFS the only resolution for a corrupt file was to run chkdsk against the whole volume. The ReFS approach ensures that if a single file does get corrupted, access to the rest of the good data is not affected. This is especially important as volume sizes grow ever larger and volume checks take longer and longer to run.
