Friday, December 16, 2016

Dedicated Administrator Connection (DAC)

The dedicated administrator connection (DAC) allows an administrator to connect to a SQL Server instance to run diagnostic functions or T-SQL statements even if the server is not responding to normal connections. This feature was introduced in SQL Server 2005, and by default the remote DAC option is disabled.

You can enable remote DAC using either of the two methods below;

Using SSMS
  1. Open SSMS.
  2. In Object Explorer, connect to the SQL Server instance on which you want to enable DAC.
  3. Right-click the server instance and select Facets.
  4. In the View Facets window, select Server Configuration in the drop-down box. See below figure:

Using T-SQL

-- Enable advanced options to expose 'remote admin connections'
EXEC sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
-- Turn on the remote dedicated administrator connection
EXEC sp_configure 'remote admin connections', 1;
GO
RECONFIGURE;
GO
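
If you want to double-check the setting afterwards, a quick query against sys.configurations should show value_in_use as 1:

-- Check whether remote DAC is enabled
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'remote admin connections';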

How to connect via DAC

Once you have enabled DAC for a SQL Server instance, it is pretty easy to connect to the server via DAC. You just need to put ADMIN: in front of the server name that you're connecting to. See below figure:



Note that you can establish only a single successful DAC connection to a SQL Server instance. If you try another DAC connection attempt, you get the following error message;

"A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The specified network name is no longer available.) (Microsoft SQL Server, Error: 64)"

There is a possibility that you will get the above error when you try to log in to a server via DAC to perform some urgent troubleshooting task. That means someone else has already logged into the server using remote DAC, and you may immediately go into panic mode. In such cases, it is very important to know who is using the DAC connection.

Use the below T-SQL code to get the information of already established DAC connection. 

SELECT CASE WHEN ses.session_id = @@SPID THEN 'It''s me! ' ELSE '' END +
       COALESCE(ses.login_name, '???') AS WhosGotTheDAC,
       ses.session_id,
       ses.login_time,
       ses.status,
       ses.original_login_name
FROM   sys.endpoints AS en
       JOIN sys.dm_exec_sessions AS ses
         ON en.endpoint_id = ses.endpoint_id
WHERE  en.name = 'Dedicated Admin Connection';


If you're using the server name when connecting, make sure the SQL Server Browser service is running. Otherwise you will receive the most common logon error, "A network-related or instance-specific error occurred…", which basically says it cannot find the server.


General guideline

It is always a good practice to enable remote DAC on all production servers. This option is extremely useful when you want to troubleshoot server issues.

Cheers!

Friday, December 2, 2016

Persistent Log Buffer with SQL Server 2016 SP1

You can get a significant performance gain just by moving existing applications to SQL Server 2016; there are several case studies to prove this. You can get even more performance gain (2x - 4x) with the combination of SQL Server 2016 SP1 and Windows Server 2016 (not yet released). This performance gain is achieved through the persistent (non-volatile) log buffer technique, which was released for SQL Server with SP1. SQL Server utilizes Windows Server 2016 direct access (DAX) to persistent memory. With this new feature, SQL Server can write transaction log data at memory-access speed instead of the earlier disk I/O speed. Because this new type of memory is persistent, the moment SQL Server writes to the log cache (the memory area for the transaction log), the data is durable.



Tuesday, November 29, 2016

How does SQL Server run on Linux?


I wrote this blog post after Microsoft announced that SQL Server is going to support Linux. I had many questions in mind about how MS was going to design SQL Server to run on Linux. The major question I had was: is it a complete re-write to work on the Linux kernel? Now Microsoft has released some details on how they have designed SQL Server to run on Linux. It is not a complete rewrite of SQL Server; instead, Microsoft has used a virtualization technique called Drawbridge. Drawbridge provides Windows OS library versions, consisting of a user-mode kernel of Windows NT running inside a process called a picoprocess. The picoprocess acts as a container and communicates with the Linux OS.

Cheers!

Friday, November 18, 2016

SQL Server on Mac

I'm super excited about the news: the first CTP (Community Technology Preview) release of SQL Server on Linux was announced at Microsoft Connect() 2016 on 11/16/2016. This is definitely an unforgettable day for SQL Server enthusiasts around the world, and I could not wait to install it on my Mac and give it a try. I thought of writing a blog post about the installation steps that I used to set up SQL Server on a Mac.

However, there are a few NEW things that you want to know before jumping into the installation. As SQL Server DBAs, I believe many are not very familiar with the Linux OS or with new development methods like Docker containers. The CTP released for Linux is available as a Docker container, and there are some prerequisites that you need to set up before SQL Server itself. The same applies to Windows as well. The CTP version of SQL Server is named SQL Server v.Next.

Install Docker on Mac

What is Docker? Docker site defines it as: "Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server. This guarantees that the software will always run the same, regardless of its environment."

Installing Docker is as easy as installing any other program on a Mac. You just need to download Docker.dmg and follow the instructions. Below are the screenshots that I took during my installation.






Once the installation is complete, you can see the Docker icon in the menu bar of the Mac, with the status shown as "Docker is running". I presume this is a thin virtualization layer on which you can host what are called containers. You can view additional settings of Docker by opening its Preferences (like a Properties window in the Windows OS).

By default, Docker for Mac has 2GB of memory allocated. However, to run SQL Server you need a minimum of 4GB of memory, and by clicking on Preferences you can configure the memory allocation as mentioned below;

After that, click on the "Apply & Restart" button for the changes to take effect. You will notice that the status changes to "Docker restarting...", and after a while it shows "Docker is running" again.

With that you have completed the infrastructure necessary to run the SQL Server docker container. 

You also need to be familiar with some Docker terminal commands for administrative work. The later part of this blog has that information.

Download and Install SQL Server Docker Container for Mac

For this task, we need to open the Mac Terminal window and enter the following commands.

1. Pull the Docker image from Docker Hub.

    sudo is required since this needs admin privileges. 

    sudo docker pull microsoft/mssql-server-linux

2. Run the Docker image.

docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=P@$$w0rd!' -p 1433:1433 -d microsoft/mssql-server-linux

Note: You can give any password for the SA account. 

Once completed;

Note: Sometimes copying and pasting commands into the Terminal window does weird things to formatting. After copying, hyphens and quotation marks change to different characters. If that happens you will get errors stating some characters are NOT VALID. DO NOT panic; just delete them and re-type the correct characters in the Terminal window itself. Then everything should work without any issues.

Once you come to this level, you have installed SQL Server and it is running as a Docker container. Now you need to know a few Docker commands to see the status of Docker. Again, you need to run those commands in the Mac Terminal.

Docker commands

1. docker ps

Lists the containers which are running at the moment.

2. docker ps -a

Lists all containers that are attached to Docker. The below screenshot shows the output of this command. You can see that the STATUS column says "Exited", meaning the SQL Server image is not running at the moment. So you need to start the SQL Server docker container.


3. docker start <container id>

You can see that after starting the particular container ID for the SQL Server docker image, SQL Server is running.

4. docker stop <container id>

Stops the docker container. I believe this is like stopping the SQL Server service on a Windows machine.

What About SQL Server Client Tools?

The favorite tool for SQL Server DBAs is SQL Server Management Studio (SSMS). The next question is how to install SSMS on Linux. Unfortunately, a native build of SSMS for Linux is NOT YET available as of today. However, you can use some other command-line tools such as PowerShell or SQLCMD. I've installed sql-cli on my Mac so that I can connect to SQL Server on the Mac.

Install SQL-CLI on Mac

First you need to install node.js on your Mac using the link mentioned under References. It's a dmg package, and you can install it the same way you did Docker. It's super easy. After that, run the below command in Terminal to install the sql-cli tool.

npm install -g sql-cli

See below screenshot for the output. 


With that you have a client tool (command line) to connect to the SQL Server on Mac. 

If you just type mssql in the terminal, you will get a login failure; that is because you've not specified the credentials. Let's see how you can connect to the SQL Server instance with proper credentials.

Connecting to SQL Server on Mac

Type the below command in Terminal. 

> mssql -u sa -p 'P@$$w0rd!'


Wow! Isn't it nice? I've just connected to SQL Server on a Mac for the first time. Now it's time to get busy with it.

Executing SQL Commands

1. SQL Server Version


2. List all the databases


Still I've not created any user databases, so you can see only the system databases. Nothing new here.

3. List all the tables in the msdb database
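
For reference, a minimal sketch of the T-SQL behind the three steps above, which you can type at the mssql prompt:

-- 1. SQL Server version
SELECT @@VERSION;

-- 2. List all the databases
SELECT name FROM sys.databases;

-- 3. List all the tables in the msdb database
SELECT name FROM msdb.sys.tables;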


Now you can play with the regular work that you do with SQL Server on a daily basis.

Cheers!

References


Friday, October 21, 2016

Short notes on Statistics


QUERY EXECUTION
  • Recompile would be the first step to influence the optimizer to recompile the code again.
  • Compilation is the most important stage.
  • Cost-based optimizer == needs to evaluate a query reasonably quickly.
  • When writing a query, there are many ways to write it, but do not focus on that first. Focus on the result.
  • WHERE msal * 12 > literal -> Bad
  • WHERE msal > literal / 12 -> Good (keep computations off the column, so the predicate stays sargable and the histogram can be used)
  • Query performance is all about statistics / estimation.
  • In most cases the issue is not out-of-date stats or indexes; it could be parameter sniffing.
  • EXEC <proc name> <parameters> -> uses the existing plan
  • EXEC <proc name> <parameters> WITH RECOMPILE -> generates a new plan
  • Using the above two, you can identify parameter sensitivity (sniffing).
  • Updating stats invalidates the plan.
  • Estimates vs Actual -> if they differ, it is not always a stats problem. It may be parameter sniffing.
STATISTICS
  • Selectivity
    • Optimizer loves highly selective predicates
    • Low no.of rows -> high selectivity
    • High no.of rows -> low selectivity
  • Statistics -> Summarized info about the data distribution of table columns
  • DBCC AUTOPILOT -> Undocumented
  • Hypothetical Indexes -> just the index structure.
    • What if analysis using AUTOPILOT with >= SQL Server 2008
  • sp_helpstats '<table name>', 'all'
  • Histogram -> 200 steps + 1 row for the null if the column allows null
    • SQL Server 7.0 had 300 rows
    • When the histogram is large, it increases the compilation time because histogram is not like an index
    • EQ_ROWS – Equal rows for the index key
    • Rows * All density = Avg no.of rows returned for that column or the combination of cols
    • If the table is huge, the data distribution present in the histogram is not quite accurate
    • Step compression -> when building the histogram, if adjacent steps have approximately similar values, the algorithm compresses those steps into one step
    • RANGE_ROWS -> values between two HI_KEYs and excluding the HI_KEYs at both ends
    • AVG_RANGE_ROWS = RANGE_ROWS / DISTINCT_RANGE_ROWS
  • EXEC sp_autostats '<table name>'
  • sys.stats -> shows all the stats, including index stats
  • sys.dm_db_stats_properties -> gives a lot of details that you can use to update stats more effectively, programmatically
  • Histogram -> direct hit
  • DBCC SHOW_STATISTICS WITH HISTOGRAM (see the example after these notes)
  • Stats will update on Index Rebuild but not on Index re-org.
  • Entire stats structure stored in db as a BLOB (Header, Density Vector and Histogram)
  • A partitioned table uses table-level stats, which have 200 steps for the whole table rather than per partition
  • Online partition-level index rebuild came in SQL Server 2014, but when did MS introduce partitioning? It was SQL Server 2005. So MS took 9 years to get online partition-level index rebuild
  • Tuple cardinality -> used to estimate distinct cardinality
  • Optimization | Compilation | Execution
    • Local variable values are not known at optimization time
  • Parameters and literals can be sniffed -> uses histogram
  • Variables cannot be sniffed -> density vector
  • UPDATE STATISTICS with any sample % will not be parallelized, but FULL SCAN can be parallelized.
  • OPTION (QUERYTRACEON 3602, QUERYTRACEON 9204, RECOMPILE)
  • Dynamic auto update threshold
    • For a large table to reach a 20% change, you need to wait a long time for a stats update.
  • SQL Server does not understand the correlation between columns
  • Calculation direction in query plan is from right to left.
CONCERNS AROUND THE HISTOGRAM
  • Problem is always with monster tables
  • Is 200 steps enough for such tables?
  • Even if the table is partitioned, the histogram is still @ table level
  • sp_recompile for a table is an expensive operation. It needs a Sch-M lock on the table
  • Filtered stats – consider creating filtered stats, even on a daily basis, to tackle estimate problems caused by skewed data, so that you get better execution plans
  • QUERYTRACEON (2353) -> Additional info.
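
As referenced in the notes above, a minimal sketch of inspecting statistics, using an illustrative dbo.Orders table and statistics name:

-- Histogram of a statistics object (table and stats names are illustrative)
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_CustomerID') WITH HISTOGRAM;

-- Auto-update settings for all stats on the table
EXEC sp_autostats 'dbo.Orders';

-- Row counts and modification counters, handy for updating stats programmatically
SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Orders');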

Monday, August 29, 2016

Powershell supports Linux

Microsoft this week announced that PowerShell is now open sourced and available on Linux, the start of a development process that aims to enable users to manage any platform from anywhere, on any device.

R Integration in SQL Server 2016

R is a popular programming language used in data science to analyse data. Data analytics needs special data structures such as arrays, vectors and matrices. Traditional RDBMS products do not have capabilities for data analytics because they lack the data structures stated above.

The latest release, MS SQL Server 2016, has R integration, so data processing and data analytics can be done in the same data management platform.

This blog post highlights R integration in SQL Server 2016 by using a simple example. 

Let's assume you have a requirement to generate a sequence of numbers. There are various ways to do this in SQL Server using T-SQL. The typical solution would be a WHILE loop and a few control variables, as mentioned below;
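
A minimal sketch of such a loop, generating the numbers 1 to 10:

-- Generate a sequence of numbers 1..10 with a WHILE loop
DECLARE @i INT = 1;
DECLARE @numbers TABLE (n INT);

WHILE @i <= 10
BEGIN
    INSERT INTO @numbers (n) VALUES (@i);
    SET @i += 1;
END

SELECT n FROM @numbers;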


There is an interesting msdn forum question which has many alternative ways to achieve this. The discussion is worth reading.

The same thing can be accomplished using R in a very simple way. R has a function named seq, and the below code snippet uses seq to generate a sequence of numbers from 1 to 10. You can use RStudio to explore the full R functionality.

> seq(1, 10)
 [1]  1  2  3  4  5  6  7  8  9 10

Now, let's see how we can integrate the above simple R function with SQL Server 2016. First you need to enable R integration (external scripts) in SQL Server 2016.
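
A minimal sketch of doing that with sp_configure:

-- Allow external scripts (R) to run inside SQL Server
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;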


I had to restart the SQL Server service after executing the above code, which should not have been necessary. Below is the version of SQL Server that I'm using for this demo.

Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64)

I also had to restart the SQL Server Launchpad service, a new service introduced in MS SQL Server 2016.

Then you can use the sp_execute_external_script programming interface to call an external script written in R, or in another language recognized by SQL Server.
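
A minimal sketch of calling the R seq function through this interface:

-- Run R's seq() and return the result as a SQL result set
EXEC sp_execute_external_script
    @language = N'R',
    @script   = N'OutputDataSet <- data.frame(n = seq(1, 10));'
WITH RESULT SETS ((n INT));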


Below is the output of the above code. 


There is a lot of complex data analytics work you can do in R, and you can now do the same inside SQL Server by combining it with the data stored in SQL Server.

Cheers!

Thursday, August 25, 2016

SQL Server on Linux

It is certain that many are excited about and welcome Microsoft's official announcement that SQL Server is going to support the Linux platform. Especially the open source community must be thrilled with the news.

However, there is still not much information available to the public about what SQL Server on Linux will look like. As per Microsoft, the initial release will be available in mid-2017.


Announcing SQL Server on Linux


History


If we look at the history of SQL Server, the very early versions ran on an operating system known as OS/2, which was jointly developed by IBM and Microsoft. As per Wikipedia, the kernel type of OS/2 is listed as hybrid. That was the 1988/89 period.

The SQL Server versions 1.0, 1.1 and 4.2 were all based on the OS/2 platform. Microsoft separately developed version 4.2 to support their first version of the Windows NT OS, but with the same features as the version that ran on OS/2.

Historically, SQL Server was born on a UNIX-based platform (the original Sybase SQL Server ran on UNIX) and was later ported to run only on Windows after the agreement with Sybase ended.

Some 25 years later, Microsoft has decided to make SQL Server available on the Linux platform, which is a good move.



source: Microsoft SQL Server - History


Challenges


In my opinion, the main challenge would be to develop an abstraction layer to support the Linux platform. The architectures of Windows and Linux are distinctly different. Both have preemptive scheduling, but for the process model, Linux has a very distinctive implementation. SQL Server's process/thread model on Windows is single-process multi-threading, meaning you see only a single process for a SQL Server instance while internally a thread pool services client requests. Consequently, the thread is the unit of work at the OS level for accomplishing SQL Server requests on Windows.

Conversely, Linux is built around multiple processes and has no concept of threads exactly as in the Windows OS. Linux calls the fork() system call to create a child process from the parent process, which then runs until it finishes or is killed. In a nutshell, in Linux, everything is done in terms of processes.

Bridging this architectural difference is probably the most challenging part for SQL Server. I'm no expert in operating systems, but thinking from a software engineering point of view, this is the image I get of the abstraction layer.

SQL Server has its own user-mode operating system known as SQLOS, a non-preemptive scheduler that interacts with the Windows OS to better serve the special needs of SQL Server. SQLOS was introduced in SQL Server 2005, code name Yukon.


To support the Linux platform, the SQLOS component may need a lot of additions, or else there could be a separate SQLOS-like component for Linux. We do not have that information yet.


SQLOS – unleashed


What features will be in SQL Server on Linux?


It is unclear what features will be available in the SQL Server version that runs on Linux. Providing the full feature set of SQL Server 2016 would be a real challenge. Another question is whether it will include the BI tools. The announcement blog clearly states "We are bringing the core relational database capabilities to preview today". What are the core relational database capabilities? I presume the following definitely need to be addressed: scheduling, memory management, I/O, exception handling, ACID support, etc. How about high availability options like clustering and AlwaysOn? I predict there will be some sort of high availability option in the first release on Linux. How about the page size: is it 8K or different? By default the Linux memory page size is 4K (4096 bytes), while SQL Server on Windows uses a non-configurable 8K (8192 bytes) page size. Will SQL Server on Linux have to work on top of a default 4K page size? Unlike Windows, the Linux page size is configurable, but it is not certain how feasible that is or what the consequences of doing it would be. I just checked the page size on macOS and it is 4K too. See below figure.




Will there be an In-Memory OLTP engine in SQL Server on Linux? In-Memory OLTP, code name Hekaton, was first introduced in SQL Server 2014 and developed further in SQL Server 2016. In-Memory OLTP is an enormously important feature for future RDBMS products. However, it is not yet clear whether the first release of SQL Server on Linux will have this feature.


What does this mean for DBAs?


The skill set required of future SQL Server DBAs is going to expand. They will need to learn Linux OS commands as well as Python and PowerShell for scripting tasks. In the future, companies will most probably have a mix of Linux and Windows based SQL Server installations, and licensing costs will decide which platform's version of SQL Server gets the bigger portion. In a nutshell, future SQL Server DBAs will be cross-platform DBAs. It's challenging, isn't it?


How about the Certifications?


Most likely, a new certification path will emerge to cater to the new trends and skills around SQL Server. To become a successful DBA, you might need to be qualified in SQL Server on both Linux and Windows platforms as well as on cloud platforms.


How will this impact other RDBMS products?


My sense is that this will directly impact MySQL. MS SQL Server is a proven enterprise-level data platform. If anyone has the chance to get it onto the Linux OS, it is highly unlikely they will choose MySQL over SQL Server, given that the SQL Server Developer edition is free.


I'm really excited and looking forward to getting some hands-on time with SQL Server on Linux in the very near future.


Cheers!

Thursday, August 18, 2016

TempDB Configuration

Recently we had a discussion around the best practices for tempdb configuration. The entire discussion was based on SQL Server 2012 EE. This blog post highlights some salient points of tempdb configuration and includes links to additional resources as well.
  • How many files (data / log) do we need to create? Is it based on logical cores or physical CPUs?
Any SQL Server database requires two files at minimum (1 data file and 1 log file). SQL Server will allow you to add multiple log files, but there is no benefit in doing so, as SQL Server writes to the log file serially. As a result, we can safely state that you do not need multiple log files for tempdb.

What about the data files? SQL Server allows you to add multiple data files too, and there are benefits to doing so. Since our discussion is around tempdb, we can add multiple data files to address the very common phenomenon known as tempdb contention. The whole objective of adding multiple data files is to improve the concurrency of tempdb activities, in other words to improve the throughput of tempdb. Beginning with SQL Server 2005, tempdb tends to become a bottleneck, as SQL Server keeps adding new features that utilize tempdb heavily; as a result, contention around tempdb has increased. However, the history of tempdb contention goes back to SQL Server 7.0, because TF-1118 also applies to SQL Server 7.0. TF-1118 is used to reduce tempdb contention on allocation pages (PFS, GAM and SGAM).


When deciding the number of data files for tempdb, a general rule of thumb is to have the number of data files equal to the number of logical cores, up to 8; if the machine has 16 logical cores, you still add only up to 8 data files. I've seen servers with 16 cores where tempdb has 16 data files too. That configuration is not appropriate according to the general rule of thumb.

Modern CPUs have large numbers of cores. For example, the Intel Xeon E7 processor family has a processor with 24 cores. On an 8-socket server that gives 192 (24*8) logical cores, and with hyperthreading enabled it becomes 384 (192*2) logical cores. In such a server, are we going to configure 384 data files for tempdb? That is an extremely bad idea, because with that many data files SQL Server incurs an additional overhead of thread management.

Back to the number-of-files discussion: you can start with 8 data files (if there are 8 or more cores) and then increase the number of data files in multiples of 4 (up to the number of logical processors of the machine), but only if you still see contention around tempdb. A sketch of adding a file follows.
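
A minimal sketch, with illustrative file name, path and sizes:

-- Add one more tempdb data file (name, path and sizes are illustrative)
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'T:\TempDB\tempdev2.ndf',
          SIZE = 4GB,
          FILEGROWTH = 1023MB);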

  • How to decide the initial size of the tempdb?
There is no universal formula for this, and therefore it is non-trivial. You need to identify the features you're planning to use which potentially utilize tempdb space, e.g. temp tables and table variables, sorting and ordering, the version store, etc. Then use your best judgement to decide the initial size, monitor tempdb usage by executing a workload in a test environment, and decide whether you need to resize it or not.
  • Do we need to enable autogrowth or not? If we do, is it for all the files or just one file?
Autogrowth should be reserved for emergency situations (for both data and log files) and should not be the mechanism by which the files grow. You always need to size the data files and log file manually, and set autogrowth so that in an emergency situation, if tempdb needs additional space, SQL Server can grow the files without impacting the current server activities. If you have multiple data files, enable autogrowth for all of them.
  • The size of autogrowth?
The Microsoft RAP tool has a rule which detects non-standard database autogrowth sizes. The rule is as below;

"Database files have been identified that have the next Auto Growth increment of 1GB or above, and instant file initialization is not enabled"

When the tool detects this, it is classified as a high-severity risk. Basically, it does not recommend autogrowth sizes of 1GB or above, so I would keep the autogrowth size at 1023 MB (yes, it is not 1024 MB). This is especially important for the log file because the number of VLFs varies depending on the autogrowth chunk size. The VLF logic is as below (this applies to SQL Server 2016 as well);


chunks up to 64MB = 4 VLFs

chunks larger than 64MB and up to 1GB = 8 VLFs

chunks larger than 1GB = 16 VLFs


Note: It is the autogrowth chunk size, not the total size of the t-log, that determines the number of VLFs added.
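
To see how many VLFs a log currently has, one option is the undocumented but widely used DBCC LOGINFO, which returns one row per VLF:

-- One row per VLF in the current database's transaction log
DBCC LOGINFO;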
  • Do we need to keep all data files and log file in the same drive?
It is better to keep them on separate drives if you have the luxury of multiple drives (to increase the I/O bandwidth). The limitation comes in a clustered environment, where you have a limited number of drive letters. (Please note that you can also place tempdb on SSD.)

Suppose you have a 4-node cluster with 3 SQL Server instances; we have already utilized 20 drive letters (including the C drive for the OS). These drive letters are for system databases like master and msdb, data files, log files, the local MSDTC, tempdb, backups and the quorum.
  • Changes in SQL Server 2016
The TF (trace flag) 1117 is used to grow all the data files in a filegroup when one file reaches the autogrowth threshold. Beginning with SQL Server 2016, this behavior is controlled by two options added to the ALTER DATABASE statement: AUTOGROW_SINGLE_FILE and AUTOGROW_ALL_FILES. However, this applies to user databases, not tempdb.

Starting from SQL Server 2016, the behaviour of TF-1117 and TF-1118 is automatically enabled by default for tempdb, which makes DBAs' lives much easier.


Just to make it clear, see the below table; 



Summary
There is a lot to talk about and learn regarding tempdb. Tempdb is the only database per instance that keeps all the temporary work of that instance, so it can become a bottleneck and slow down the entire system. You always need to pay close attention to tempdb behaviour and its activities, and take remedial actions to minimize contention. Best practices give you a good start, but what is important is to identify the optimal configuration and parameters for your system rather than following best practices blindly.

Cheers!

Monday, August 8, 2016

How to check the isolation level

SQL Server supports all four ANSI standard isolation levels. They are;
  1. Read Uncommitted (the lowest level: high concurrency, less blocking, more concurrency-related issues)
  2. Read Committed (default)
  3. Repeatable Read
  4. Serializable (the highest level: less concurrency, more blocking, no concurrency-related issues)
Other than the above four, SQL Server introduced two additional optimistic concurrency control isolation levels based on the row versioning technique;
  1. Read Committed Snapshot
  2. Snapshot isolation level
The DBCC USEROPTIONS command can be used to see the current isolation level in SQL Server.
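
A minimal sketch of checking it, with DBCC USEROPTIONS or sys.dm_exec_sessions:

-- Shows the SET options for the current session, including the isolation level
DBCC USEROPTIONS;

-- Alternatively, read it for the current session
-- (1 = Read Uncommitted, 2 = Read Committed, 3 = Repeatable Read,
--  4 = Serializable, 5 = Snapshot)
SELECT session_id, transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;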




