Channel: SQL Server DBA Diaries

Reading list for the week – 26/09/11

This week I have quite a few very good articles in the Reading list for the week. The list begins with a post from the master himself, Paul Randal (b | t). In his post How does DBCC CHECKDB WITH ESTIMATEONLY work?, Paul explains how DBCC CHECKDB consumes space in tempdb and how WITH ESTIMATEONLY can be used to forecast the tempdb space that CHECKDB needs to complete.
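For a quick feel of the option Paul describes, here is a minimal sketch. The database name SalesDB is a placeholder and is not taken from his post.

```sql
-- Estimate how much tempdb space a full consistency check would need.
-- With ESTIMATEONLY the database check itself is not performed.
-- SalesDB is a placeholder database name.
DBCC CHECKDB (N'SalesDB') WITH ESTIMATEONLY;
```

The output lists the estimated tempdb space in KB, which can then be compared with the free space available to tempdb.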

Paul White (b | t) explains in detail which statistics are used to compile an Execution Plan in his post How to Find the Statistics Used to Compile an Execution Plan.

We all come across bugs in SQL Server, and the SQL community is a major contributor in reporting them. Aaron Bertrand (b | t), who is an expert at filing SQL Server bugs, talks about the art of filing them in his post Want your bug fixed? File a good bug!

Laerte Poltronieri Junior explains how to store Event Log entries in SQL Server by using PowerShell in his post Storing Windows Event Viewer Output in a SQL Server table with PowerShell.

Cumulative update package 16 for SQL Server 2008 Service Pack 1 was released recently. More details about it here.

Happy learning!


Reading list for the week – 03/10/11

In this edition of Reading list for the week, I am presenting you with some of the Microsoft KB articles which were updated recently.

  • How to use Kerberos authentication in SQL Server – In this article we learn how Kerberos authentication works and how we can configure SQL Server to use it.
  • SQL Server databases can be configured to grow and shrink automatically. Questions such as “should Auto Grow/Shrink be configured?” and “what will the performance impact be?” are answered in detail in the article Considerations for the “autogrow” and “autoshrink” settings in SQL Server. This one is a must-read for every DBA.
  • SQL Server Denali introduces a new feature called Product Update. This feature ensures that the latest updates are applied to the SQL Server instance at the time of installation. Product Update can also use Windows Update to get the latest updates for SQL Server. This article explains how to troubleshoot issues with Product Update when it depends on Windows Update.

Alejandro Pelc (b) writes about an approach to presenting deadlock information in an easier-to-read format in his article Catching Deadlock Information in SQL Logs.

Happy learning!

Reading list for the week – 10/10/11

Here are the top items in my Reading List for this week.

  1. Itzik Ben-Gan (b | t) in his post Denali T-SQL at a Glance – New and Enhanced Functions talks about the new and improved functions in SQL Server Denali.
  2. If you have read SQL Server MVP Deep Dives, you will be thrilled to know that SQL Server MVP Deep Dives – Volume 2 is now published! Read more about it in Greg Low’s blog post.
  3. Michael Otey (b) presents his view on why the PC is not going away yet in his article Windows 8 and Windows Server 8 Dispel the Myth of the Post-PC Era.
  4. A simple yet informative article by Linchi Shea (b) explains multi-victim deadlocks.
  5. In other big news, SQL Server 2008 SP3 has been released. Read more about it here.

Happy learning!

Reading list for the week – 17/10/11

In this week’s edition, I have got quite a few good links lined up.

The big news first. SQL Server code-named “Denali” gets an official name. At the recently concluded PASS Summit 2011 it was announced that the next version of SQL Server will be called SQL Server 2012. Read more about this in this article.

Another big announcement made last week is that the Microsoft SQL Server ODBC Driver for Linux will be available along with SQL Server 2012! This is for sure a big step in the right direction.

We know that the Analysis Service cannot be added as a cluster instance. Amit Banerjee (b | t) explains how to add an Analysis Service as a failover cluster instance using the command line setup. This article is a very interesting read.

On clicking Fragmentation tab under the properties of an index in SSMS, the response time used to be very slow. This was because SSMS used to check the fragmentation of all the objects referenced by sys.dm_db_index_physical_stats DMV instead of the selected index. This hotfix resolves that bug.

Tibor Karaszi (b | t) in his article Who owns your jobs talks about the relationship between Active Directory users/groups and SQL Server Agent Job ownership.

Happy learning!

Reading list for the week – 24/10/11

I am starting off this week’s Reading List with an article on the Hotfix Service Model. It has good information on how the Microsoft SQL Server team uses the Incremental Model to deliver hotfixes for SQL Server.

A new hotfix was recently released to fix a series of messages that appear while restarting SQL Server 2005 through SSMS. The issue occurs when SQL Server 2005 and BizTalk Server 2006 are installed on the same server. Read more about this here.

Next is an interesting article by Kalen Delaney (b | t) on The Pros and Cons of Parameter Sniffing.

Are you using Database Mirroring and seeing the transaction log grow huge during index maintenance? If yes, Kimberly L. Tripp (b | t) and Paul Randal (b | t) discuss a different approach to index maintenance on mirrored databases in this article.

That’s all for this week. Happy learning!

Service does not start | TDSSNIClient initialization failed with error 0x80092004, status code 0x80

Yesterday on my local instance, SQL Server service failed to start. The SQL Server error log had the following entries in it.

2012-12-31 12:31:26.58 Server      Error: 17190, Severity: 16, State: 1.
2012-12-31 12:31:26.58 Server      Initializing the FallBack certificate failed with error code: 1, state: 1, error number: -2146893788.
2012-12-31 12:31:26.58 Server      Unable to initialize SSL encryption because a valid certificate could not be found, and it is not possible to create a self-signed certificate.
2012-12-31 12:31:26.58 spid7s      Informational: No full-text supported languages found.
2012-12-31 12:31:26.58 Server      Error: 17182, Severity: 16, State: 1.
2012-12-31 12:31:26.58 Server      TDSSNIClient initialization failed with error 0x80092004, status code 0x80. Reason: Unable to initialize SSL support. Cannot find object or property.

2012-12-31 12:31:26.58 Server      Error: 17182, Severity: 16, State: 1.
2012-12-31 12:31:26.58 Server      TDSSNIClient initialization failed with error 0x80092004, status code 0x1. Reason: Initialization failed with an infrastructure error. Check for previous errors. Cannot find object or property.

2012-12-31 12:31:26.58 Server      Error: 17826, Severity: 18, State: 3.
2012-12-31 12:31:26.58 Server      Could not start the network library because of an internal error in the network library. To determine the cause, review the errors immediately preceding this one in the error log.
2012-12-31 12:31:26.58 Server      Error: 17120, Severity: 16, State: 1.
2012-12-31 12:31:26.58 Server      SQL Server could not spawn FRunCM thread. Check the SQL Server error log and the Windows event logs for information about possible related problems.

Before getting too much into the error details, I started to analyse what had changed since the last service/server restart. Here is what I had done the day before, to troubleshoot some other issue.

  • Removed the Server from the domain and made it part of a workgroup
  • After completing the desired tasks, I had made the server part of the domain again.

That’s all. The SQL Server service was configured to start using a domain account. 

When I changed the SQL Server service account from the domain account to a local (built-in) account, the service started normally.

Configuration Manager

However, if I changed the service to start using a domain account the same error message would re-appear.

Cannot find object

The following portion of the error message caught my attention.

TDSSNIClient initialization failed with error 0x80092004, status code 0x80. Reason: Unable to initialize SSL support. Cannot find object or property

The SQL Server service is trying to initialize SSL support but cannot find the information it needs. As I mentioned earlier, the only thing that had changed on this instance was that the server was re-added to the domain. Had the permissions of the SQL Server service account changed? No. I verified that the permissions were intact. Did something go wrong with the local profile of the service account? Here is what I saw under User Profiles (Computer Properties –> Advanced –> User Profiles).

 

The SQL Server service account had *two* profiles on this computer. One had the status of “Backup” whereas the other had the status of “Temporary”. This was unusual because only one local profile is created for any user who logs on to a computer. The profile-related files are created under the C:\Users directory (on Windows Server 2003 and earlier it is C:\Documents and Settings). In this case only one folder had been created, but User Profiles was showing two. Since this looked like the problem, I deleted the profile which had the status of “Backup”. I then logged on to the system using the SQL Server service account to make sure that there were no visible errors during profile creation.

After this the SQL Server service started normally! Perfect way to end the year 2012 and I am welcoming the year 2013 with this blog post!

SQL Server Install | Use Role Management Tool to install Microsoft .Net framework 3.5 SP1

I was trying to set up SQL Server 2008 on my lab machine running Windows Server 2008 R2. When the setup was installing the prerequisites, the following error message popped up.

.net2

This is a very simple error message, which a DBA would come across often. However the next screen which popped up after this message was interesting.

.net1

Since Windows Server 2008 R2 has the .Net framework 3.5 already available as a feature, all we need to do is just enable that feature. There is no need to download the installer and install it separately. As the error message states, just enable it using the “Role Management Tool”. Role Management tool can be found under Server Manager –> Features –> Add features as seen in the following screenshot.

.net3

.net4

That’s all. Now the .Net framework 3.5.1 is installed and ready to use! After this the SQL Server setup completed successfully.

Database mirroring cannot be enabled because the database is not in full recovery mode on both partners

Recently I worked on a mirroring issue. While initializing mirroring, the error message shown in the screenshot below was encountered.

mirror2

Here’s how this issue was fixed:

  • Check if the recovery model of the database on the Principal Server was set to FULL. <– Yes, it was.
  • Take a full backup of the database on the Principal Server
  • Take the transaction log backup of the database on the Principal Server
  • Restore the full database backup on the mirrored instance WITH NORECOVERY
  • Restore the transaction log backup on the mirrored instance WITH NORECOVERY

Configure Mirroring again.
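
The steps above can be sketched in T-SQL roughly as follows. The database name MirrorDemo and the backup share are placeholders and do not come from the original issue. If the file layout differs on the mirror, the RESTORE DATABASE statement would also need MOVE clauses.

```sql
-- Run on the principal. MirrorDemo and the share path are placeholders.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'MirrorDemo';          -- should report FULL

BACKUP DATABASE [MirrorDemo]
    TO DISK = N'\\FileShare\Backups\MirrorDemo_full.bak'
    WITH INIT;

BACKUP LOG [MirrorDemo]
    TO DISK = N'\\FileShare\Backups\MirrorDemo_log.trn'
    WITH INIT;

-- Run on the mirror. NORECOVERY leaves the database in the restoring
-- state, which is what mirroring expects on the mirror side.
RESTORE DATABASE [MirrorDemo]
    FROM DISK = N'\\FileShare\Backups\MirrorDemo_full.bak'
    WITH NORECOVERY;

RESTORE LOG [MirrorDemo]
    FROM DISK = N'\\FileShare\Backups\MirrorDemo_log.trn'
    WITH NORECOVERY;
```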


Jobs fail with “Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission” error

Recently I worked on an issue in which jobs that had been running fine started failing all of a sudden. The job history had the following error message.

Cannot execute as the database principal because the principal "dbo" does not exist, this type of principal cannot be impersonated, or you do not have permission.

But the principal “dbo” exists, it can be impersonated and I have permission because I am the sysadmin! Now what changed?

Checked the job properties and “sa” was the job owner. This job was last modified a few years ago. The next step is to find out the properties of the database against which the job is executing the queries.

When I right-clicked the database and selected Properties, I got the following error.

Property Owner is not available for Database ‘[DatabaseName]‘. This property may not exist for this object, or may not be retrievable due to insufficient access rights.  (Microsoft.SqlServer.Smo)

This error is generated when the database owner’s login does not exist in Active Directory; I had blogged about it here.

When I executed sp_helpdb, the output looked like the one below.

sp_helpdb_invalidowner

As evident from the output, the database owner’s login was not present in the Active Directory and it was reported as ~~unknown~~. Now that we know the problem, the resolution is pretty simple. Just executed sp_changedbowner or ALTER AUTHORIZATION command against the databases in question. The job started running successfully.
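
As a rough sketch of the check and the fix, something along these lines can be used. The database name YourDatabase and the choice of sa as the new owner are placeholders for illustration; pick whatever login suits your environment.

```sql
-- List databases whose owner SID no longer maps to a login.
-- SUSER_SNAME typically returns NULL when the SID cannot be resolved.
SELECT d.name,
       d.owner_sid,
       SUSER_SNAME(d.owner_sid) AS owner_name
FROM sys.databases AS d
WHERE SUSER_SNAME(d.owner_sid) IS NULL;

-- Reassign ownership of an affected database.
-- [YourDatabase] and [sa] are placeholders.
ALTER AUTHORIZATION ON DATABASE::[YourDatabase] TO [sa];
```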

The next time you create a database, please take care to change the database owner to a different login. Otherwise, when you leave the organization, someone else will need to clean up your name from all the databases that you created!

Add node fails with Attempted to read or write protected memory error

Recently I worked on an issue where the end user was trying to add a third node to an existing two-node cluster. Soon after the setup was initiated, it would fail with the following error message (snippet from Summary.txt).

Exception type: System.AccessViolationException
    Message: 
        Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
    Data: 
      DisableWatson = true
    Stack: 
        at Microsoft.SqlServer.Chainer.Infrastructure.MsiNativeMethods.MsiOpenPackageEx(String szPackagePath, UInt32 dwOptions, UInt32& hProduct)
        at Microsoft.SqlServer.Configuration.MsiExtension.InstallPackage.SetCommonProperties()
        at Microsoft.SqlServer.Configuration.MsiExtension.InstallPackage.RunMsiCore(String commandLine)
        at Microsoft.SqlServer.Configuration.MsiExtension.InstallPackage.RunMsiWithRetry(String commandline)
        at Microsoft.SqlServer.Configuration.MsiExtension.InstallPackage.RunMsi(IEnumerable`1 commandLineProps)
        at Microsoft.SqlServer.Configuration.SetupExtension.MSIInstallerEngine.InstallPackage(PackageId pkg, InstallAction pkgAction)
        at Microsoft.SqlServer.Configuration.MsiExtension.PackageInstallAction.Execute(String actionId, TextWriter errorStream)
        at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.ExecuteActionHelper(TextWriter statusStream, ISequencedAction actionToRun)

“Attempted to read or write protected memory. This is often an indication that other memory is corrupt.” Scary little error message! Since this is a setup related issue, the best place to start troubleshooting it is from the Setup Logs.  This article has more details on how to read SQL Server Setup logs.

Detail.txt pointed in the right direction as seen below.

Opening existing patch 'c:\Windows\Installer\97363c.msp'.
Couldn't find local patch 'c:\Windows\Installer\97363c.msp'. Looking for it at its source.
Resolving Patch source.
SOURCEMGMT: Looking for sourcelist for product {D7C6A337-F6BB-46CB-AE32-204DD6A8825D}
SOURCEMGMT: Trying source c:\7cb5fee2d4775f9d53c7f95659\x64\setup\.
Note: 1: 2203 2: c:\7cb5fee2d4775f9d53c7f95659\x64\setup\sql_ssms.msp 3: -2147287037 
SOURCEMGMT: Source is invalid due to missing/inaccessible package.
Unable to create a temp copy of patch 'sql_ssms.msp'.
Searching provided command line patches for patch code {D7C6A337-F6BB-46CB-AE32-204DD6A8825D}
Could not find source for missing patch {D7C6A337-F6BB-46CB-AE32-204DD6A8825D} -- orphaning this patch
SequencePatches starts. Product code: {72AB7E6F-BC24-481E-8C45-1AB

Now it was very clear that this was the classic missing MSI/MSP file issue. It was just a matter of copying the right files to C:\Windows\Installer as explained here and here.

Once the missing files were replaced, the setup completed without any further errors.

SQL Server Agent failing to start with the error “StartServiceCtrlDispatcher failed (error 6)”

Recently I worked on an issue where SQL Server Agent was failing to start. Before this there were some problems with the disks on the server and the databases were restored from backups.

I started looking for errors in a few of the logs:

  • System Event Log – No errors
  • SQL Server Error Log – No errors

Looked for the SQL Server Agent logs in the directory where SQL Server Error Logs are located but could not find any. The attempts to start the service from services.msc and SQL Server Configuration Manager failed with a generic error message.

Hence I took the SQL Server Agent service’s binary path from the service properties.

BinaryPath

Then I ran the same command, “…..Binn\SQLAGENT.EXE” -i MSSQLSERVER, in the command prompt.

ErrorMessage_StartServiceCtrlDispatcher

The error message “StartServiceCtrlDispatcher failed (error 6).” doesn’t help! Hence I started SQLAGENT.EXE with the “-c” parameter, which runs SQL Server Agent in console mode.

“…..Binn\SQLAGENT.EXE” -i MSSQLSERVER -c

Now the details came out!

CannotFindPath

SQL Server Agent is trying to rename D:\Data3\SQLAGENT.OUT to D:\Data3\SQLAGENT.1
It is failing to start since the file doesn’t exist because the drive isn’t there!

Now it was time to change the SQL Server Agent log path. I tried executing the following command but it failed.

EXEC msdb.dbo.sp_set_sqlagent_properties @errorlog_file=N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Log\SQLAGENT.OUT'

Msg 15281, Level 16, State 1, Procedure sp_set_sqlagent_properties, Line 0
SQL Server blocked access to procedure ‘dbo.sp_set_sqlagent_properties’ of component ‘Agent XPs’ because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of ‘Agent XPs’ by using sp_configure. For more information about enabling ‘Agent XPs’, see “Surface Area Configuration” in SQL Server Books Online.

The only option left was to modify the SQL Server Agent Error Log path in the Registry.
Navigated to the following registry key and modified it to point to the correct path.

RegistryPath

As expected now the SQL Server Agent started successfully.

Log reader fails with “The process could not execute ‘sp_replcmds’ ” error

Recently I worked on an issue in which the Replication wasn’t working right after setting it up. The Publication and the subscriptions were created successfully but the subscription was still uninitialized.

As a first step checked the status of the Snapshot Agent under Replication Monitor. It was failing with the following error message.

The concurrent snapshot for publication 'Sitecore_301_Redirect' is not available because it has not been fully generated or the Log Reader Agent is not running to activate it.

The Snapshot agent was correct that the snapshot was not generated and also the Log Reader agent wasn’t running. The Log Reader agent was failing with the error below.

The process could not execute 'sp_replcmds' on {ServerName}

The above message is very generic in nature. Hence I added a verbose log to the Log Reader Agent as explained in this KB article.

-Continuous -Output "D:\VerboseLog\Log.txt" -Outputverboselevel 3
LogReader_VerboseLog

 

Here is what was recorded in the Log Reader agent when the job failed the next time

Status: 0, code: 15517, text: 'Cannot execute as the database principal because the principal "dbo" does not exist, this type of principal cannot be impersonated, or you do not have permission.'.

The sys.databases catalog view showed that the current owner of the job wasn’t the owner (dbo) of the Publication Database. Hence the next logical step was to make the job owner the dbo of the database by executing

sp_changedbowner '{LoginName}'
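
To compare the two owners side by side, a sketch like the one below can help. The database name PublicationDB is a placeholder, and filtering on job categories that start with REPL- is an assumption about how the replication agent jobs are categorized on the instance.

```sql
-- Owner of the publication database, as recorded in sys.databases.
-- PublicationDB is a placeholder name.
SELECT name, SUSER_SNAME(owner_sid) AS database_owner
FROM sys.databases
WHERE name = N'PublicationDB';

-- Owners of the replication agent jobs (assumed to be in REPL-* categories).
SELECT j.name AS job_name,
       c.name AS category_name,
       SUSER_SNAME(j.owner_sid) AS job_owner
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.syscategories AS c
    ON c.category_id = j.category_id
WHERE c.name LIKE N'REPL-%';
```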

After this change, all the replication agents worked without any problems and data started flowing to the subscriber.

Create and connect to a SQL Database on Windows Azure

Microsoft Azure is one of the fastest growing cloud platforms. In this simple post, I will create and connect to a SQL Database on Microsoft Azure.

To get started, log on to the Microsoft Azure Management Portal, navigate to SQL Database and click on New.

sqldatabases_new

In the next screen, give the database a unique name and select the appropriate values.

CreateNewDatabase

The database gets created in a few seconds and it would be ready to use when the status changes to ONLINE.

DatabaseOnline

Clicking on the database name (in this case it is TheTestDatabase) gives detailed information about the database, including the address of the database.

DatabaseProperties

Now you can use this information in the connection string or just connect to it using SQL Server Management Studio (SSMS). But when you try to connect to this database from a different computer, it fails with the following error.

firewallerror

Error text:


Cannot open server 'ServerName' requested by the login. Client with IP address '{IPAddress}' is not allowed to access the server.
To enable access, use the Windows Azure Management Portal or run sp_set_firewall_rule on the master database to create a firewall rule for this IP address or address range.
It may take up to five minutes for this change to take effect.
Login failed for user 'pradeep'.
This session has been assigned a tracing ID of '{TrackingID}'.
Provide this tracing ID to customer support when you need assistance. (Microsoft SQL Server, Error: 40615)

As seen in the error message, by default no IP address is allowed to connect to a SQL Database. The requisite IP addresses or IP address ranges need to be explicitly configured for the server hosting the SQL Database, else the connections will fail with the above error.
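
The error message also mentions sp_set_firewall_rule as an alternative to the portal. A minimal sketch of that route is shown here. The rule name and the address range are placeholders (the range is one reserved for documentation), and the statements have to be run against the master database from a client that is already allowed through the firewall.

```sql
-- Run while connected to the master database of the server that
-- hosts the SQL Database. Rule name and range are placeholders.
EXEC sp_set_firewall_rule
    @name = N'OfficeRange',
    @start_ip_address = '203.0.113.1',
    @end_ip_address   = '203.0.113.254';

-- Review the server-level firewall rules currently in place.
SELECT name, start_ip_address, end_ip_address
FROM sys.firewall_rules;
```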

To configure the Firewall rules, first we need to navigate to the server on which the SQL database is hosted.

ServerName

In the Configure screen, give the rule a name and specify the IP address range which is allowed to connect to the database.

FirewallRule

After entering the details make sure to click on the “Save” button at the bottom of the screen. Now if we try connecting to the database from the same host, the connection would succeed and the database would be listed in SSMS.

DatabaseConnected

Hope you found this post useful. Happy cloud computing!

How to extract the contents of .MSP files

I had encountered an issue while installing a SQL Server service pack. The resolution for it was to replace the existing SQL Server binaries with those from a Service Pack. To extract the contents of a Service Pack executable we just need to type “ServicePackExeName /extract” as I explained in this post.

I was looking for the sqlservr.exe but it wasn’t directly available in the extracted folder. The folder structure looked something like this.

ExtractedFolder

From my past experience I knew that the Service Pack installer would copy sqlservr.exe by extracting sqlrun_sql.msp file.

sqlrun_sql.msp

Now how do I get the sqlservr.exe from the .msp file? The command “sqlrun_sql.msp /extract” wouldn’t work here.

One of the options is to extract the contents of the .msp file using a file archiver like 7-Zip. After extracting the contents, look for a file which has CAB in its name. In this case, the file name is “PCW_CAB_Family01”.

extracted

Again extract this file to a folder and the contents would look something like this.

sqlservr.exe

Now just rename the sqlservr.exe.******** to sqlservr.exe. That’s all. You have just extracted the requisite file from an .msp file.

The other alternative is to use the MsiX utility by Heath Stewart. The usage of this tool is very straightforward. Just type the following command in the Command Prompt and it will extract the contents of the .msp file.

MsiX sqlrun_sql.msp

MsiX_cmdprompt

After this navigate to the folder where the file was extracted and extract the contents of the *CAB* file.

MsiX_extracted

Follow the same steps and rename the file that you need (sqlservr.exe.****) to the requisite file name.

Mission accomplished!

Export the Event Logs without opening MMC

I recently worked on a database corruption issue. In order to troubleshoot this I had to collect all the Event Logs from the server. But when I tried to open Event Viewer, MMC would fail with the following error.

MMC could not create the snap-in.
Name: Event Viewer
CLSID: FX:{b05566ad-fe9c-7a4cbb7cb510}

For that matter, any MMC snap-in on the server would fail with the same error. But getting the event logs was critical to get troubleshooting going. That’s when Wevtutil.exe, which ships with the Windows operating system, came to the rescue.

Wevtutil.exe is by default located in the C:\windows\system32 folder.

I had to just execute the following command to export the System Event Log to C:\SystemLogBackup.evtx

wevtutil.exe epl System C:\SystemLogBackup.evtx

The “epl” parameter exports the event log specified (System in this case) to the destination file.
All I had to do is copy the exported file to my local desktop and double click it to open in the Event Viewer snap-in.

SavedLogs

This is indeed a good tool to have in a DBA’s armory.


CDC job failing with “Unable to add entries to the Change Data Capture LSN time mapping table to reflect dml changes applied to the tracked tables” error

I got a chance to work on a CDC job failure issue recently. The customer had configured CDC on the database but it wasn’t working as expected. Hence he had disabled and re-enabled CDC multiple times on this database.

The “cdc.[DBName]_capture” job was failing with the following error:

Message: 22858, Level 16, State 1
Unable to add entries to the Change Data Capture LSN time mapping table to reflect dml changes applied to the tracked tables. Refer to previous errors in the current session to identify the cause and correct any associated problems. [SQLSTATE 42000] (Error 22858) The statement has been terminated. [SQLSTATE 01000] (Error 3621). NOTE: The step was retried the requested number of times (10) without succeeding. The step failed.

Since the error message referred to the “Change Data Capture LSN time mapping table”, I looked into the cdc.lsn_time_mapping table. In a normal CDC configuration this table has an entry for each transaction that was captured. But in this case there was only one entry, and its “tran_begin_time” and “tran_end_time” columns had a value which was a couple of days old. Also, tran_begin_lsn was shown as 0x00000000000000000000 and tran_id as 0x00. This isn’t normal.
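
A query along these lines shows the state of the mapping table. CDC_DB_NAME stands in for the CDC-enabled database, and the TOP (20) limit is just a convenience for eyeballing the most recent rows.

```sql
USE [CDC_DB_NAME];   -- placeholder for the CDC-enabled database

-- Most recent entries first. On a healthy setup the capture job
-- keeps adding rows here as transactions are captured.
SELECT TOP (20)
       start_lsn,
       tran_begin_time,
       tran_end_time,
       tran_id,
       tran_begin_lsn
FROM cdc.lsn_time_mapping
ORDER BY start_lsn DESC;
```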

To get more details about this error, I added a verbose log to the CDC capture job.

The verbose log printed the following message when the job was running.

session_id error_message
----------- ----------------------------------------------
5 Violation of PRIMARY KEY constraint 'lsn_time_mapping_clustered_idx'. Cannot insert duplicate key in object 'cdc.lsn_time_mapping'. The duplicate key value is (0x0008236700032c170001). CF8:0005
5 Unable to add entries to the Change Data Capture LSN time mapping table to reflect dml changes applied to the tracked tables.

From the verbose log it was evident that the CDC capture job was trying to insert a duplicate row into cdc.lsn_time_mapping and failing.

The sys.databases catalog view reported that the log_reuse_wait_desc was REPLICATION.

name             database_id log_reuse_wait_desc
---------------- ----------- ---------------------
master           1           NOTHING
tempdb           2           NOTHING
model            3           LOG_BACKUP
msdb             4           NOTHING
CDC_DB_NAME      5           REPLICATION
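
Output of this shape can be produced with a query like the following. The is_cdc_enabled and is_published columns are an addition for this sketch; they help tell CDC apart from replication as the reason for the REPLICATION wait.

```sql
-- What is preventing log truncation in each database, plus flags that
-- show whether CDC or transactional replication is enabled on it.
SELECT name,
       database_id,
       log_reuse_wait_desc,
       is_cdc_enabled,
       is_published
FROM sys.databases
ORDER BY database_id;
```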

This indicated that another CDC session or replication was active on this database. Since there was no replication configured on this instance, it had to be a CDC job.

DBCC OPENTRAN reported that another CDC transaction was active on this database.

Transaction information for database 'cdc_db_name'.
Replicated Transaction Information:
Oldest distributed LSN : (0:0:0)
Oldest non-distributed LSN : (533351:207666:1)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Since the customer had enabled and disabled CDC multiple times, there was a possibility that a stale CDC transaction was still active on the database. To clear the article cache of CDC, I executed sp_replflush against the database in question.
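
For reference, the two commands used at this stage look roughly like this. CDC_DB_NAME is a placeholder for the affected database.

```sql
USE [CDC_DB_NAME];   -- placeholder for the CDC-enabled database

-- Report the oldest active transaction and the oldest replicated
-- transactions (distributed and non-distributed) for this database.
DBCC OPENTRAN;

-- Flush the article cache for the current database.
EXEC sp_replflush;
```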

After restarting the job, it again failed but with a different error message now.

Another connection is already running 'sp_replcmds' for Change Data Capture in the current database.

DBCC OPENTRAN reported that there was an open transaction on the database, but it was a user transaction which wasn’t running sp_replcmds.

Executed sp_replflush one more time and restarted the job again. This time it didn’t fail and also we started seeing rows getting inserted into all the CDC related tables.

Since the issue at hand was resolved, didn’t delve into the root cause. Most likely it was a stale CDC transaction which wasn’t cleaned up when CDC was disabled on the database.

SQL Server Cluster resource fails with “Data source name not found and no default driver specified” error

Last week I worked on a SQL Server 2008 R2 installation on a Windows Cluster. The installation would go fine until the installer tried to bring the SQL Server resource online. At that point it would fail with the following error message.

The cluster resource 'SQL Server' could not be brought online. Error: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F)

On clicking OK, the installation would not fail but the SQL Server resource would continue to remain in failed state. Looking at System/Application Event Logs wasn’t of much help. From the SQL Server error logs it was evident that the Service was getting started successfully but it was getting stopped after the Cluster Manager failed to bring the resource online after 10 attempts.

The cluster log revealed that the resource was failing to come online with the following error

ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified

Since the SQL Server service was working fine, the problem was for sure outside SQL Server. The resolutions mentioned here and here did not help.

As we already know, the Windows Failover Cluster manager uses SQSRVRES.dll to connect to and manage the SQL Server cluster resource. Hence I looked at the properties of SQSRVRES.dll to check if there were any problems with it. The DLL itself was fine but its version was 11.0.2100. This was strange since the version of SQL Server that we were trying to install was SQL Server 2008 R2 (10.50.xxxx). On checking, the customer informed me that he had tried installing SQL Server 2012 earlier on this server but it had been uninstalled for some reason.

This clearly seemed to be the root cause. Hence I copied SQSRVRES.dll from a server which already had SQL Server 2008 R2 installed and overwrote the one on the server in question. After this the installation succeeded and the SQL Server cluster resource came online just fine.

Application timing out due to excessive blocking on tempdb PFS

Problem

Applications Connecting to SQL Server timing out

Symptoms

  1. Excessive blocking with the wait resource being tempdb PFS page
  2. High CPU utilization on the server

Troubleshooting

sys.dm_exec_requests reported blocking and the head blocker looked like this

session_id ecid  wait_type       resource
---------- ----- --------------- --------------------
58         4     PAGELATCH_UP    2:7:32352
58         1     CXPACKET
58         5     CXPACKET
58         6     CXPACKET
58         3     CXPACKET
58         2     CXPACKET
58         0     CXPACKET        exchangeEvent id=Pipe7dd014b80

The wait resource was always reported as a tempdb PFS page (2:7:(8088*x)).
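
To spot this pattern without doing the arithmetic by hand, a sketch like the one below can be used. It reads sys.dm_os_waiting_tasks rather than sys.dm_exec_requests, which is a substitution made here because that view exposes the per-task exec_context_id. It assumes the dbid:fileid:pageid format for page latch resources and uses TRY_CAST, so it needs SQL Server 2012 or later.

```sql
-- Tasks waiting on page latches in tempdb (database_id 2), with a flag
-- that marks PFS pages: page 1 and every 8088th page after it.
SELECT wt.session_id,
       wt.exec_context_id,
       wt.wait_type,
       wt.resource_description,
       CASE
           WHEN p.page_id = 1
                OR (p.page_id > 0 AND p.page_id % 8088 = 0) THEN 'PFS'
           ELSE 'other'
       END AS page_type
FROM sys.dm_os_waiting_tasks AS wt
CROSS APPLY (
    -- '2:7:32352' -> '2.7.32352' -> last part '32352'
    SELECT TRY_CAST(PARSENAME(REPLACE(wt.resource_description, ':', '.'), 1)
                    AS bigint) AS page_id
) AS p
WHERE wt.wait_type LIKE N'PAGELATCH[_]%'
  AND wt.resource_description LIKE N'2:%';
```

For the resource 2:7:32352 shown above, 32352 is 8088 * 4, so the page is flagged as PFS.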

The head blocker and the blocked sessions were executing a SELECT against an XML document.

As explained here, parsing an XML document creates a work table in tempdb. In this situation the application was parsing a lot of XML documents. Hence the contention on tempdb was expected.

  • tempdb had 5 data files, each with an initial size of 100 MB (and autogrowth of 10%). As per the recommendations here, increased the number of tempdb data files to 8 and raised the initial size so that the files would auto grow less often.
  • Enabled trace flag 1118. This forces uniform extent allocations instead of mixed page allocations.
  • Enabled trace flag 1117. This grows all the files in the filegroup whenever one of the files needs to auto grow.
  • Since this instance was running on SQL Server 2012, applied the latest Service Pack + CU as recommended in http://support.microsoft.com/kb/2964518
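The tempdb changes above could be scripted roughly as follows. This is a sketch under assumptions: the logical file names (tempdev, tempdev6), the file path and the sizes are hypothetical placeholders, since the actual values used on this server are not given.

```sql
-- Raise the initial size and use a fixed growth increment instead of 10%.
-- tempdev is the default logical name of the first tempdb data file;
-- the sizes are placeholders. The new SIZE must exceed the current size.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1024MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILEGROWTH = 256MB);

-- Add a data file; repeat with new names until there are 8 equally sized files.
-- The logical name and the path are hypothetical.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev6,
          FILENAME = 'T:\TempDB\tempdev6.ndf',
          SIZE = 1024MB,
          FILEGROWTH = 256MB);

-- Enable both trace flags globally (-1) for the running instance.
DBCC TRACEON (1117, 1118, -1);

-- Confirm which trace flags are active globally.
DBCC TRACESTATUS (-1);
```

DBCC TRACEON lasts only until the next restart of the instance. Adding -T1117 and -T1118 as startup parameters keeps the flags enabled across restarts.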

The tempdb contention reduced to some extent, but the CPU utilization was still very high (above 85%). Because of this the application was still timing out.

  • Changed the Power Plan to “High Performance” as explained in http://support.microsoft.com/kb/2207548. This didn’t help much.
  • As the sys.dm_exec_requests output shows, all the other threads in the session were waiting on the CXPACKET wait type for one thread to complete its work. Hence set an appropriate value for max degree of parallelism (3 in this case) as explained here.
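The max degree of parallelism change can be made with sp_configure. A minimal sketch follows; the value 3 is the one chosen for this server and is not a general recommendation.

```sql
-- max degree of parallelism is an advanced option, so expose those first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 3 is the value chosen for this server; size it for your own hardware.
EXEC sp_configure 'max degree of parallelism', 3;
RECONFIGURE;

-- Verify: value_in_use should now show the new setting.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';
```

The setting takes effect as soon as RECONFIGURE runs, without restarting the instance.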

CPU utilization came down from above 85% to 20-30%.

Status

Mission accomplished!

No transaction is active message when accessing Linked Server


Last week I worked on an issue related to Linked Servers. The customer had migrated their SQL Server instances to virtual servers and had quite a few Linked Servers set up. After the migration, any distributed transaction across the linked servers, like the one below, would fail immediately.

begin distributed tran
select * from RemoteServer.DBName.dbo.TableName
commit tran

Error message

OLE DB provider "SQLNCLI11" for linked server "linkedservername" returned message "No transaction is active.".
Msg 7391, Level 16, State 2, Line 2
The operation could not be performed because OLE DB provider "SQLNCLI11" for linked server "linkedservername" was unable to begin a distributed transaction.

The first place I checked for problems was the Component Services (run –> dcomcnfg).

[Screenshot: Local DTC Properties]

The options in the Local DTC Properties were correctly set as seen in this screenshot.

[Screenshot: DTC Properties]

Restarting the “Distributed Transaction Coordinator (MSDTC)” service didn’t help either.

The next step was to look for possible error messages in the Event Log. In the Application Event Log, the following error message was logged.

The local MS DTC detected that the MS DTC on ServerName has the same unique identity as the local MS DTC.
This means that the two MS DTC will not be able to communicate with each other.
This problem typically occurs if one of the systems were cloned using unsupported cloning tools.
MS DTC requires that the systems be cloned using supported cloning tools such as SYSPREP. Running 'msdtc -uninstall' and then 'msdtc -install' from the command prompt will fix the problem.
Note: Running 'msdtc -uninstall' will result in the system losing all MS DTC configuration information.

This error message indicates that the unique identity of MS DTC (the CID) was the same on both the local and the destination servers. How is this possible? The new servers had been built from a sysprepped server image, so the MS DTC configuration was propagated to every server built from that image.

Now that we knew the root cause, the resolution was pretty straightforward. Executed the following steps, as explained in this article.

  • Opened the Command Prompt as an Administrator and executed “msdtc -uninstall”
  • Deleted the following registry keys (after exporting them as a precautionary measure)

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSDTC
HKEY_CLASSES_ROOT\CID
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSDTC

  • Executed “msdtc -install” command in the Command Prompt
  • Rebooted the server

After reboot, the linked server queries returned the expected results.
