Thursday, August 18, 2016

TempDB Configuration

Recently we had a discussion around best practices for tempdb configuration. The entire discussion was based on SQL Server 2012 Enterprise Edition. This blog post highlights some salient points of tempdb configuration and includes links to additional resources as well.
  • How many files (data/log) do we need to create? Is it based on logical cores or physical CPUs?
Any SQL Server database requires a minimum of two files (one data file and one log file). SQL Server will allow you to add multiple log files, but there is no benefit in doing so because SQL Server writes to the log file serially. As a result, we can safely state that you do not need multiple log files for tempdb.

What about the data files? SQL Server allows you to add multiple data files too, and there are benefits in doing so. Since our discussion is around tempdb, we can add multiple data files to address the very common phenomenon known as tempdb contention. The whole objective of adding multiple data files is to improve the concurrency of tempdb activities, in other words, to improve the throughput of tempdb. Beginning with SQL Server 2005, tempdb has tended to become a bottleneck, as SQL Server keeps adding new features which utilize tempdb heavily, and as a result contention around tempdb has increased. However, the history of tempdb contention goes back to SQL Server 7.0, because TF-1118 applies to SQL Server 7.0 as well. TF-1118 is used to reduce tempdb contention on the allocation pages (PFS, GAM and SGAM). One quick way to confirm this kind of contention is to look for PAGELATCH waits on tempdb pages, as in the sketch below.
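A minimal sketch, to be run while a busy workload is executing; it lists sessions currently latch-waiting on tempdb pages:

-- Sessions waiting on page latches in tempdb (database_id 2).
-- resource_description has the form dbid:fileid:pageid; waits on PFS, GAM
-- and SGAM pages are the classic sign of allocation contention.
SELECT wt.session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE 'PAGELATCH%'
  AND wt.resource_description LIKE '2:%';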


When deciding the number of data files for tempdb, a general rule of thumb is to have the number of data files equal to the number of logical cores, up to 8; meaning, if the machine has 16 logical cores, you still add only up to 8 data files. I've seen servers with 16 cores where tempdb has 16 data files too. That configuration is not appropriate according to the general rule of thumb.

Modern CPUs have a large number of cores. For example, the Intel Xeon E7 processor family includes a processor with 24 cores. If you consider an 8-socket server, you get 192 (24*8) physical cores, and when hyper-threading is enabled that becomes 384 (192*2) logical cores. So in such a server, are we going to configure 384 data files for tempdb? That is an extremely bad idea, because with that many data files SQL Server takes on the additional overhead of managing them and their worker threads.

Back to the number-of-files discussion: you can start with 8 data files (if there are 8 or more cores) and then increase the number of data files in multiples of 4 (up to the number of logical processors of the machine), but only if you still see contention around tempdb. A sketch of adding a file is shown below.
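This is a minimal, hypothetical example; the logical name, path and sizes are placeholders for your environment, and any new file should match the size of the existing ones:

-- How many logical cores does SQL Server see?
SELECT cpu_count FROM sys.dm_os_sys_info;

-- Add one more data file to tempdb (placeholder name, path and sizes).
ALTER DATABASE tempdb
ADD FILE
(
    NAME = tempdev5,                         -- hypothetical logical name
    FILENAME = 'T:\TempDB\tempdev5.ndf',     -- hypothetical path
    SIZE = 4096MB,                           -- match the existing files
    FILEGROWTH = 512MB
);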

  • How do we decide the initial size of tempdb?
There is no universal formula for this, so it is a non-trivial task. You need to identify the features you're planning to use which potentially utilize tempdb space, e.g. temp tables and table variables, sorting and ordering, the version store, etc. Then use your best judgement to decide the initial size, monitor tempdb usage by executing a workload in a test environment, and decide whether you need to resize it or not. The query below is one way to watch that usage.
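A simple sketch using sys.dm_db_file_space_usage; it breaks the current tempdb data-file usage down by consumer (pages are 8 KB, hence the arithmetic):

USE tempdb;
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024 AS free_space_mb
FROM sys.dm_db_file_space_usage;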
  • Do we need to enable autogrowth or not? If we do, is it for all the files or just one file?
Autogrowth should be treated as an emergency mechanism (for both data and log files), not as the normal way to grow a file. You should always size the data files and log file manually, and then set autogrowth so that, in an emergency, if tempdb needs additional space, SQL Server can grow the files without impacting current server activity. If you have multiple data files, enable autogrowth on all of them, as in the sketch below.
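A sketch using the default logical file names (tempdev and templog); the sizes here are placeholders:

-- Size the files manually; keep autogrowth only as a safety net.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 2048MB, FILEGROWTH = 512MB);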
  • What size should the autogrowth be?
The Microsoft RAP tool has a rule which detects non-standard database autogrowth sizes. The rule is as follows:

"Database files have been identified that have the next Auto Growth increment of 1GB or above, and instant file Initialization is not enabled"

When the tool detects this, it is classified as a high-severity risk. Basically, it does not recommend autogrowth sizes of 1 GB or above, so I would keep the autogrowth size at 1023 MB (yes, 1023 MB, not 1024 MB). This is especially important for the log file, because the number of VLFs created per growth varies with the autogrowth chunk size. The VLF logic is as follows (this applies to SQL Server 2016 as well):


chunks up to 64 MB = 4 VLFs
chunks larger than 64 MB and up to 1 GB = 8 VLFs
chunks larger than 1 GB = 16 VLFs


Note: it is the growth chunk size, not the total size of the t-log, that determines the number of VLFs added. The command below shows how to count them.
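Each row returned by DBCC LOGINFO represents one VLF (later versions also expose sys.dm_db_log_info for the same purpose):

-- One row per VLF in tempdb's transaction log.
DBCC LOGINFO ('tempdb');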
  • Do we need to keep all data files and the log file on the same drive?
It is better to keep them on separate drives if you have the luxury of multiple drives (to increase the IO bandwidth). The limitation comes in clustered environments, where the available drive letters are limited. (Please note that you can also place tempdb on SSD.)

Consider a 4-node cluster with 3 SQL Server instances: we have already utilized 20 drive letters (including the C drive for the OS). These drive letters are for system databases like master and msdb, data files, log files, local MSDTC, tempdb, backups and the quorum. The query below shows where the tempdb files currently reside.
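A quick check against sys.master_files (database_id 2 is always tempdb; size is reported in 8 KB pages):

SELECT name, physical_name, type_desc, size * 8 / 1024 AS size_mb
FROM sys.master_files
WHERE database_id = 2;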
  • Changes in SQL Server 2016
Trace flag (TF) 1117 is used to grow all the data files in a filegroup together when one file reaches its autogrowth threshold. Beginning with SQL Server 2016, this behavior is controlled by two options added to the ALTER DATABASE statement: AUTOGROW_SINGLE_FILE and AUTOGROW_ALL_FILES. However, this applies to user databases, not tempdb. The syntax is shown below.
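For example ([YourDatabase] is a placeholder):

-- Grow all files in the PRIMARY filegroup together (the old TF-1117 behavior),
-- set per filegroup on a user database.
ALTER DATABASE [YourDatabase]
MODIFY FILEGROUP [PRIMARY] AUTOGROW_ALL_FILES;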

Starting from SQL Server 2016, the behavior of TF-1117 and TF-1118 is enabled by default for tempdb, which makes DBAs' lives much easier. You can verify this with the catalog queries below.
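A sketch; both columns shown here were added in SQL Server 2016:

-- 0 means mixed-page allocation is off, i.e. the TF-1118 behavior.
SELECT name, is_mixed_page_allocation_on
FROM sys.databases
WHERE name = 'tempdb';

-- 1 means all files in the filegroup grow together, i.e. the TF-1117 behavior.
USE tempdb;
SELECT name, is_autogrow_all_files
FROM sys.filegroups;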


Just to make it clear, see the below table:

Behavior                               tempdb (SQL Server 2016)    User databases (SQL Server 2016)
TF-1117 (grow all files together)      On by default               Controlled by AUTOGROW_SINGLE_FILE / AUTOGROW_ALL_FILES
TF-1118 (uniform extent allocation)    On by default               On by default

Summary
There is a lot to talk about and learn regarding tempdb. Tempdb is the only database per instance which holds all the temporary work of that instance, so it can become a bottleneck and slow the entire system down. You always need to pay close attention to tempdb's behavior and activities, and then take remedial action to minimize any issues. Best practices are there to give you a good start, but what is important is to identify the optimal configuration and parameters for your system rather than following best practices blindly.

Cheers!

