MSSQL User Databases

Some Big Numbers

The maximum size for a SQL Server 2016 database is 524,272 TB. Hardware limitations and physical management issues will mean that databases are likely to be restricted to a fraction of this size.
Each database can contain up to 32,767 file groups, and the maximum size of each file is 16TB.
The maximum size of Log files is 2 TB.
A maximum of 32,767 databases can be created on an SQL server and 50 instances can be defined on one physical server.
Microsoft recommends that SQL Server 2016 databases be placed on an NTFS file system. FAT32 is supported but not recommended. Physical storage can be local disk, shared storage or SMB. If used, a Windows file server should be 2012 or later.

Database Files

A database file is made up of a number of 8 KB pages, and a Disk I/O operation will read or write a whole 8 KB page at a time. Pages are managed in extents, which consist of eight contiguous pages. At the beginning of each page is a 96-byte header header record that contains metadata about the page, including the page number, and the amount of free space in the page.

A database can consist of three types of file:

Primary file
The primary data file is the starting point of the database and points to any other files in the database. You use the master database to find the locations of all the files in a database, but they are also recorded in the primary file for each database so that the master database can be rebuilt using data from the primary user database files. If a database is small, then all the user data will be held in the primary file.
The recommended file name extension for primary data files is .mdf.
Secondary files
If the primary file is not large enough to hold all the data in the database, then secondary files are required. They are basically used to hold the data that does not fit in the primary data file. Some databases will not need secondary data files, while others may be large enough to need several secondary data files. You may also want to define secondary files on separate disk drives to spread the IO workload across multiple disks.
The recommended file name extension for secondary data files is .ndf.
Transaction logs
These files hold records of updates to the data files, and can be used to recover the database after an error. Every database has at least one transaction log file, and some databases may have more than one.
The recommended file name extension for log files is .ldf.
The SQL Server initially writes its updates to a buffer, both database updates and log updates. These are not written out to disk until the application issues a checkpoint. The term 'dirty page' refers to a page in the buffer that has been updated, but not yet written out to disk. It is essential that SQL Server writes the log file updates out to disk before the data updates, as otherwise uncommitted transactions could not be rolled back. This is called a Write-ahead log or WAL.

The .mdf, .ndf, and .ldf file naming standards are not enforced, but if you use them, then anyone can identify the file type from the extension. These are the names of the physical files on disk, called the os-file-names. You should not confuse these with the logical-file-names, the name used by applications.

File sizes and growth

When you define an SQL Server file, you can specify that it should grow automatically if it fills up its initial space allocation. You say how much space you want added to the file, and each time the file fills, it increases its size by that space amount. If you have multiple files in a filegroup, they will not autogrow until all the files are full. Growth then occurs using a round-robin algorithm.
If you don't state what growth parameters you want when you create a database then you will get the ones that are specified for the model database and they might not be appropriate. When a database adds an extent, all database processing stops until the add extent process completes. New extents will probably be allocated to different parts of the physical disk, so as extents are added, the database becomes fragmented and this can affect performance. Bottom line is that you do not want the database to be extending too often. If the database is growing rapidly, you want to be adding larger extents to reduce the adding frequency. If the database grows slowly, consider adding smaller extents to use space effectively.
You can also specify a maximum size for each file, otherwise the file can continue to grow until the disk is full. If you use autogrow, then you do not need to spend so much administration time monitoring the state of the database files and increasing them manually when they start to fill up.

By default, all files are allocated onto a single disk, as some servers just have one disk. If you have more than one disk available for database files, then you should separate the database and log files onto different disks.

Database Filegroups

It is possible to group database files together in filegroups to simplify management. It may be possible to improve performance by storing tables and indexes on separate disks, to spread the IO workload and prevent head contention. If you define a separate filegroup for every disk you can simplify the management of the files.
Log files are never members of a filegroup. Log space is managed separately from data space.

The primary filegroup is the default. It contains the primary data file and any other files that are not specifically assigned to another filegroup. System tables are allocated in the primary filegroup.
You can place specific files into User-defined filegroups by using the FILEGROUP keyword in a CREATE DATABASE or ALTER DATABASE statement.
You can specify a default user-defined-filegroup for each database, so pages which are not given a specific filegroup are assigned to it, instead of the primary file group. If you do not specify a default filegroup, then the primary filegroup is the default.

It is possible to make all the files in a user-file-group read only, and it is also possible to detach a file group from one server, then reattach it to another server. This means that it is possible to copy an SQL database to removable media, then distribute it to remote clients, who would be able to read the data using their own SQL server.

You do not need to use file groups at all, especially if you are running a smaller system. In this case all files are included in the primary filegroup.

Azure Cloud Database as a Service

Azure Database as a service allows you to store your user databases in the Cloud as an Azure blob. Note that you should not store system databases in Azure blob storage as this is neither recommended nor supported.

There are several benefits to using Azure, some of which are:
Azure gives you limitless storage capacity, and you just pay for the storage you are using.
You can use Azure snapshots using the Azure Blob storage service to provide nearly instantaneous backups and quicker restores for database files.
You can move databases between machines without needing any application changes. The machines can be either on your premises, or in the Cloud.
You can create multiple copies of your databases on the local machines or virtual machines

You might be concerend about ther security of a database in the Cloud, but one way to ensure your data is to separate your compute instance from your storage instance.
You then encrypt your database using Transparent Data Encryption (TDE) certificates, which are physically separated from the data. The TDE keys can be stored in the master database, which must be stored locally on your site and should backed up locally (so the backups with the encryption key are not held in the Cloud). You then use these local held keys to access the data, which is hekd on Azure Storage. If an unauthosied person was able to access your cloud storage, they cannot access your data as they would not have access to the TDE certificates.

An Overview of Using Azure Databases

First, you need to create a storage account and at least one container in Windows Azure. A storage account can contain an unlimited number of containers within the storage limits. Each container will then hold a set of Blobs, either block blobs or page blobs. Page blobs are more efficient when ranges of bytes in a file are modified frequently.

You must then create a policy on the container, which contains things like read, write, and list rights. Wehn you create a policy on a container, you also generate a SAS key which is kept unencrypted in memory and is needed by the SQL Server to access the blob files in the container. For each container used by a data or a log file, you must create a SQL Server Credential whose name matches the container path. This includes information on the policy of the container as well as a shared access signature that is necessary to access the container. You store the information regarding Azure Storage container, its associated policy name, and SAS key in the SQL Server credential store.

back to top