Diagnosing ADO.NET with ETW traces


The majority of modern applications use ORMs as a way to connect to a database. Those libraries simplify the usage of the underlying ADO.NET API in a way that we might even forget that under the hood we are dealing with a relational model. This object-oriented wonderland usually lasts till the moment when the first SQL exceptions crop up. If we configured our ORM correctly we should be able to diagnose database problems with just logs that it provides. But what if we didn’t or if the problem lies deep in the ADO.NET layer? We might then try to use a debugger or a SQL Server Profiler. Both those tools, although great, are usually too heavy (or too invasive) for a production environment. Fortunately there is still one option left – ADO.NET ETW tracing. In today’s post I will show you how to turn on this type of tracing and how to use it to quickly diagnose some database problems.

For this purpose we will use a buggy MVC application which consists of two controllers:

namespace LowLevelDesign.DiagnosingAdoNet 
{
    public class HomeController : Controller
    {
        //
        // GET: /Home/
        public ActionResult Index()
        {
            using (var conn = new SqlConnection(WebConfigurationManager.ConnectionStrings["dbconn"].ConnectionString)) 
            {
                conn.Open();
                return View(conn.Query<Product>("select * from Products"));
            }
        }
    }
}

namespace LowLevelDesign.DiagnosingAdoNet
{
    public class ProductController : Controller
    {
        public ActionResult Edit(String productKey) {
            Product prod = null;
            if (!String.IsNullOrEmpty(productKey)) {
                SqlConnection conn = new SqlConnection(WebConfigurationManager.ConnectionStrings["dbconn"].ConnectionString);
                conn.Open();

                prod = conn.Query<Product>("select * from Products where ProductKey = @productKey", new { productKey }).SingleOrDefault();
            }
            return View(prod ?? new Product());
        }

        [HttpPost]
        public ActionResult Edit(Product product) {
            if (!ModelState.IsValid) {
                return View(product);
            }
            using (SqlConnection conn = new SqlConnection(WebConfigurationManager.ConnectionStrings["dbconn"].ConnectionString)) {
              conn.Open();
              var t = conn.BeginTransaction();
              if (conn.Execute("update Products set Price = @Price where ProductKey = @ProductKey", product, t) == 0) {
                  conn.Execute("insert into Products values (@ProductKey, @Price)", product, t);
              }
            }
            // save or update product
            return RedirectToAction("Index", "Home");
        }
    }
}

two views and it’s using a simple database:

<configuration>
  ...  
  <connectionStrings>
    <add name="dbconn" connectionString="Data Source=(local)\SQLEXPRESS;Trusted_Connection=True;Initial Catalog=TestDb;Timeout=2;Max Pool Size=4" />
  </connectionStrings>
  ...
create database TestDb;

use TestDb;

create table Products (
  ProductKey varchar(20) primary key,
  Price smallmoney
);

insert into Products values ('P1', 20);
insert into Products values ('P2', 12);

You can download the whole application from http://lowleveldesign.codeplex.com/releases/view/94187. Let’s now run it (you can use the runiis.bat script from the application directory) to see how it behaves. If you have visited a product edit screen you should have noticed that changes are not saved – that’s our first issue. After few tries you should experience another problem: a SQL timeout exception which completely breaks the application:

Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.

Start ADO.NET tracing

It’s time to start trace collection in order to investigate what’s failing. There are two steps required to setup ADO.NET tracing on your (or server) machine. Firstly, you need to inform ADO.NET that it should provide traces (this setting is stored in a registry – just run setup-ado.net4_x64.reg or setup-ado.net4_x86.reg from my diagnostics toolkit page: http://diagnettoolkit.codeplex.com/). Secondly, you need to install the actual event provider – go to a .NET Framework directory (c:\Windows\Microsoft.NET\Framework64\v4.0.30319\) and run mofcomp adonetdiag.mof. Check if the provider was successfully installed by running logman query providers:


PS reader> logman query providers '{914ABDE2-171E-C600-3348-C514171DE148}'

Provider                                 GUID
-------------------------------------------------------------------------------
System.Data.1                            {914ABDE2-171E-C600-3348-C514171DE148}

The command completed successfully.
PS reader> logman query providers '{7ACDCAC8-8947-F88A-E51A-24018F5129EF}'

Provider                                 GUID
-------------------------------------------------------------------------------
ADONETDIAG.ETW                           {7ACDCAC8-8947-F88A-E51A-24018F5129EF}

The command completed successfully.

Finally it’s time to restart the application, but this time we will collect its traces in the background. Logman has an option to read providers configuration from a text file so let’s prepare such a file (you can find it also here) – we will name it ctrl.guid.adonet:

{7ACDCAC8-8947-F88A-E51A-24018F5129EF}  0xFFFFFFFF  5   ADONETDIAG.ETW
{914ABDE2-171E-C600-3348-C514171DE148}  0xFFFFFFFF  5   System.Data.1

The 0xFFFFFFFF value selects all events generated by the provider and 5 sets the log verbosity level to the most detailed one. Now let’s start the collection from an elevated command prompt:

> logman start adonettrace -pf .\ctrl.guid.adonet -o adonettrace.etl -ets
The command completed successfully.

Play a bit with the web application till a moment it’s broken and stop the trace collection:

> logman stop adonettrace -ets
The command completed successfully.

Analyze ADO.NET traces

A new adonettrace.etl file should be created in the directory from which we run the command. As it’s a binary file we need to convert it to something more readable (by using for example tracerpt):

> tracerpt -of xml .\adonettrace.etl

Input
----------------
File(s):
     .\adonettrace.etl

100.00%

Output
----------------
DumpFile:           dumpfile.xml
Summary:            summary.txt

The command completed successfully.

As you can see two files were created by tracerpt. Summary.txt is just a summary of events that the etl file contains and dumpfile.xml is the actual event data dump. If you setup everything correctly the dumpfile.xml should weigh at least few hundred kilobytes. An example event record from this file looks as follows:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
	<System>
		<Provider Guid="{914abde2-171e-c600-3348-c514171de148}" />
		<EventID>0</EventID>
		<Version>0</Version>
		<Level>0</Level>
		<Task>0</Task>
		<Opcode>18</Opcode>
		<Keywords>0x0</Keywords>
		<TimeCreated SystemTime="2012-09-06T21:46:53.915622600Z" />
		<Correlation ActivityID="{00000000-0000-0000-0000-000000000000}" />
		<Execution ProcessID="2008" ThreadID="6280" ProcessorID="1" KernelTime="30" UserTime="60" />
		<Channel />
		<Computer />
	</System>
	<EventData>
		<Data Name="ModID">       2</Data>
		<Data Name="msgStr">&lt;prov.DbConnectionHelper.ConnectionString_Set|API&gt; 1#, &apos;Data Source=(local)\SQLEXPRESS;Trusted_Connection=True;Initial Catalog=TestDb;Timeout=2;Max Pool Size=4&apos;
</Data>
	</EventData>
	<RenderingInfo Culture="pl-PL">
		<Opcode>TextW</Opcode>
		<Provider>Bid2Etw_System_Data_1</Provider>
		<EventName xmlns="http://schemas.microsoft.com/win/2004/08/events/trace">System.Data</EventName>
	</RenderingInfo>
	<ExtendedTracingInfo xmlns="http://schemas.microsoft.com/win/2004/08/events/trace">
		<EventGuid>{914abde3-171e-c600-3348-c514171de148}</EventGuid>
	</ExtendedTracingInfo>
</Event>

As you can see there is a lot of information provided in such a record. In most diagnostics cases it should suffice to check the <Execution> and <Data Name=”msgStr”> tags. The <Execution> tag is interesting for us because it contains a system process ID. What I forgot to mention is that ETW traces are collected globally so if there were more applications on your system using ADO.NET at the “collection” time their traces will be also included in the .etl file (and thus dumpfile.xml). The <Data Name=”msgStr”> tag contains the actual ADO.NET log. As this information might be a bit hard to read, Microsoft created a naming convention (more here) that helps you organize traces. You may also use AdoNetTraceReader which will extract message strings, indent log blocks and eventually filter them by PID. You can download this tool from my .NET diagnostics toolkit page and run it against your dumpfile.xml:

> .\AdoNetTraceReader.exe -i dumpfile.xml -o parsed.txt -p 2008

2008 represents id of the web application pool on my machine.

It’s time to look what traces will tell us about the database problems that we’ve experienced. Let’s first diagnose why the application breaks. The exception message informs us that our connection pool is leaking which means there are places in code where the database connection is not closed after usage. Look for START <sc.SqlConnection.Open|API> to find all connection openings. Each opening should have a corresponding closing phase: START <sc.SqlConnection.Close|API>. As you can see in a snippet below each opened connection receives its own ordinal number. Using this number we can track what happened to a particular connection:

START  <sc.SqlConnection.Open|API> 3#
<sc.SqlConnection.Open|API|Correlation> ObjectID3#, ActivityID fd1e3c4a-fd2e-48a3-aeed-4d040907b07f:4
<prov.DbConnectionPool.GetConnection|RES|CPOOL> 1#, Getting connection.
<prov.DbConnectionPool.GetFromGeneralPool|RES|CPOOL> 1#, Connection 5#, Popped from general pool.
<prov.DbConnectionInternal.PostPop|RES|CPOOL> 5#, Preparing to pop from pool,  owning connection 0#, pooledCount=0
<prov.DbConnectionInternal.ActivateConnection|RES|INFO|CPOOL> 5#, Activating
END 

...

START  <sc.SqlConnection.Close|API> 3#
<sc.SqlConnection.Close|API|Correlation> ObjectID3#, ActivityID fd1e3c4a-fd2e-48a3-aeed-4d040907b07f:6
<prov.DbConnectionInternal.PrePush|RES|CPOOL> 5#, Preparing to push into pool, owning connection 0#, pooledCount=0
<prov.DbConnectionPool.DeactivateObject|RES|CPOOL> 1#, Connection 5#, Deactivating.
<prov.DbConnectionInternal.DeactivateConnection|RES|INFO|CPOOL> 5#, Deactivating
<sc.SqlInternalConnection.Deactivate|ADV> 5# deactivating
<sc.TdsParser.Deactivate|ADV> 2# deactivating
<sc.TdsParser.Deactivate|STATE> 2#, 
	         _physicalStateObj = False
	         _pMarsPhysicalConObj = True
	         _state = OpenLoggedIn
	         _server = (local)\SQLEXPRESS
	         _fResetConnection = False
	         _defaultCollation = (LCID=1045, Opts=25)
	         _defaultCodePage = 1250
	         _defaultLCID = 1045
	         _defaultEncoding = System.Text.SBCSCodePageEncoding
	         _encryptionOption = OFF
	         _currentTransaction = (null)
	         _pendingTransaction = (null)
	         _retainedTransactionId = 0
	         _nonTransactedOpenResultCount = 0
	         _connHandler = 5
	         _fMARS = False
	         _sessionPool = (null)
	         _isShiloh = True
	         _isShilohSP1 = True
	         _isYukon = True
	         _sniSpnBuffer = 1290
	         _errors = (null)
	         _warnings = (null)
	         _attentionErrors = (null)
	         _attentionWarnings = (null)
	         _statistics = True
	         _statisticsIsInTransaction = False
	         _fPreserveTransaction = False         _fParallel = False
<prov.DbConnectionPool.PutNewObject|RES|CPOOL> 1#, Connection 5#, Pushing to general pool.
END

Armed with this knowledge we should be able to quickly find the first leaking connection – in my case:

START  <sc.SqlConnection.Open|API> 6#
<sc.SqlConnection.Open|API|Correlation> ObjectID6#, ActivityID fd1e3c4a-fd2e-48a3-aeed-4d040907b07f:10
<prov.DbConnectionPool.GetConnection|RES|CPOOL> 1#, Getting connection.
<prov.DbConnectionPool.GetFromGeneralPool|RES|CPOOL> 1#, Connection 5#, Popped from general pool.
<prov.DbConnectionInternal.PostPop|RES|CPOOL> 5#, Preparing to pop from pool,  owning connection 0#, pooledCount=0
<prov.DbConnectionInternal.ActivateConnection|RES|INFO|CPOOL> 5#, Activating

Now it’s time to find out which part of the application used this connection. Reading next lines of the trace reveals the executed SQL query with its parameters – @ProductKey = 'P1' (highlighted lines):

...
<sc.SqlCommand.set_CommandText|API> 7#, 'select * from Products where ProductKey = @productKey'
<sc.SqlCommand.ExecuteDbDataReader|API|Correlation> ObjectID7#, ActivityID fd1e3c4a-fd2e-48a3-aeed-4d040907b07f:11
<sc.TdsParser.GetSession|ADV> 2# getting physical session 2
<sc.SqlCommand.ExecuteReader|INFO> 7#, Command executed as RPC.
<sc.TdsParser.WritePacket|INFO|ADV>  Packet sent00000000:  03 09 01 00 00 00 01 00 | 16 00 00 00 12 00 00 00  ................
00000010:  02 00 00 00 00 00 00 00 | 00 00 01 00 00 00 FF FF  ................
00000020:  0A 00 02 00 00 00 E7 6A | 00 15 04 D0 00 00 6A 00  .......j......j.
00000030:  73 00 65 00 6C 00 65 00 | 63 00 74 00 20 00 2A 00  s.e.l.e.c.t. .*.
00000040:  20 00 66 00 72 00 6F 00 | 6D 00 20 00 50 00 72 00   .f.r.o.m. .P.r.
00000050:  6F 00 64 00 75 00 63 00 | 74 00 73 00 20 00 77 00  o.d.u.c.t.s. .w.
00000060:  68 00 65 00 72 00 65 00 | 20 00 50 00 72 00 6F 00  h.e.r.e. .P.r.o.
00000070:  64 00 75 00 63 00 74 00 | 4B 00 65 00 79 00 20 00  d.u.c.t.K.e.y. .
00000080:  3D 00 20 00 40 00 70 00 | 72 00 6F 00 64 00 75 00  =. .@.p.r.o.d.u.
00000090:  63 00 74 00 4B 00 65 00 | 79 00 00 00 E7 34 00 15  c.t.K.e.y....4..
000000A0:  04 D0 00 00 34 00 40 00 | 70 00 72 00 6F 00 64 00  ....4.@.p.r.o.d.
000000B0:  75 00 63 00 74 00 4B 00 | 65 00 79 00 20 00 6E 00  u.c.t.K.e.y. .n.
000000C0:  76 00 61 00 72 00 63 00 | 68 00 61 00 72 00 28 00  v.a.r.c.h.a.r.(.
000000D0:  34 00 30 00 30 00 30 00 | 29 00 0B 40 00 70 00 72  4.0.0.0.)..@.p.r
000000E0:  00 6F 00 64 00 75 00 63 | 00 74 00 4B 00 65 00 79  .o.d.u.c.t.K.e.y
000000F0:  00 00 E7 40 1F 15 04 D0 | 00 00 04 00 50 00 31 00  ...@........P.1.
...

In our simple application we quickly find that it’s ProductController‘s Edit action that leaks connections.

There is still one issue to investigate: prices that are not getting updated. Let’s then find a trace for the update statement: update Products set Price = @Price where ProductKey = @ProductKey:

...
<prov.DbConnectionInternal.ActivateConnection|RES|INFO|CPOOL> 7#, Activating
END 
START  <sc.SqlConnection.BeginTransaction|API> 3#, iso=-1{ds.IsolationLevel}, transactionName='None'
<sc.SqlInternalTransaction.ctor|RES|CPOOL> 1#, Created for connection 7#, outer transaction 1#, Type 2
<sc.TdsParser.WritePacket|INFO|ADV>  Packet sent    00000000:  0E 01 00 22 00 00 01 00 | 16 00 00 00 12 00 00 00  ..."............
00000010:  02 00 00 00 00 00 00 00 | 00 00 01 00 00 00 05 00  ................
00000020:  02 00                   |                          ..
<sc.TdsParser.ReadNetworkPacket|INFO|ADV> Packet read    00000000:  04 01 00 23 00 34 01 00 | E3 0B 00 08 08 01 00 00  ...#.4..........
00000010:  00 34 00 00 00 00 FD 00 | 00 FD 00 00 00 00 00 00  .4..............
00000020:  00 00 00                |                          ...
END 
<prov.DbConnectionHelper.ConnectionString_Get|API> 3#
START  <prov.DbConnectionHelper.CreateDbCommand|API> 3#
<sc.SqlCommand.set_Connection|API> 3#, 3#
END 
<sc.SqlCommand.set_Transaction|API> 3#
<sc.SqlCommand.set_CommandText|API> 3#, '    update Products set Price = @Price where ProductKey = @ProductKey    '
START  <sc.SqlCommand.ExecuteNonQuery|API> 3#
<sc.SqlCommand.ExecuteNonQuery|INFO> 3#, Command executed as RPC.
<sc.TdsParser.GetSession|ADV> 2# getting physical session 2
...

By looking few lines above the command we can see that it’s run in a transaction: START <sc.SqlConnection.BeginTransaction|API> 3#, iso=-1{ds.IsolationLevel}, transactionName='None'. From the further trace we read that there were no errors while executing the command so let’s focus on the closing phase:

START  <sc.SqlConnection.Close|API> 3#
<prov.DbConnectionInternal.CloseConnection|RES|CPOOL> 7# Closing.
...
<sc.TdsParser.Deactivate|STATE> 2#, 
       _physicalStateObj = False
       _pMarsPhysicalConObj = True
       _state = OpenLoggedIn
       _server = (local)\SQLEXPRESS
       _fResetConnection = False
       _defaultCollation = (LCID=1045, Opts=25)
       _defaultCodePage = 1250
       _defaultLCID = 1045
       _defaultEncoding = System.Text.SBCSCodePageEncoding
       _encryptionOption = OFF
       _currentTransaction = (ObjId=1, tranId=223338299393, state=Active, type=LocalFromAPI, open=0, disp=False
       _pendingTransaction = (null)
       _retainedTransactionId = 0
       _nonTransactedOpenResultCount = 0
       _connHandler = 7
       _fAsync = False
       _fMARS = False
       _fAwaitingPreLogin = False
       _fPreLoginErrorOccurred = False
       _sessionPool = (null)
       _isShiloh = True
       _isShilohSP1 = True
       _isYukon = True
       _sniSpnBuffer = 1290
       _errors = (null)
       _warnings = (null)
       _attentionErrors = (null)
       _attentionWarnings = (null)
       _statistics = True
       _statisticsIsInTransaction = True
       _fPreserveTransaction = False         _fParallel = False
<sc.SqlInteralTransaction.CloseFromConnection|RES|CPOOL> 1#, Closing

Highlighted lines:

  • _currentTransaction = (ObjId=1, tranId=223338299393, state=Active, type=LocalFromAPI, open=0, disp=False
  • <sc.SqlInteralTransaction.CloseFromConnection|RES|CPOOL> 1#, Closing

should raise our suspicion. They tell us that during the closing phase the transaction is still active (_currentTransaction is not null) and then it’s closed only because the connection is dropped. That explains why our changes were not reflected in the database. Adding t.Commit() just before the return statement in the [HttpPost]Edit action of the ProductController resolves our problem. I leave as an exercise for you to check how traces look like when the transaction is committed (or rolled back).

I hope the information provided in this post will help you diagnose any future ADO.NET exceptions. If you would like to experiment even more with ADO.NET tracing feel free to download the sample application code from http://lowleveldesign.codeplex.com/releases/view/94187. There is also a great article on data access tracing available on MSDN that presents even more providers and trace options :).

Diagnosing ADO.NET with ETW traces

2 thoughts on “Diagnosing ADO.NET with ETW traces

  1. Pashec says:

    Thanks, for the great article. Though I have an issue with collecting trace information. My adonettrace.etl file is always 8192 bytes. So is there something wrong with configuration? I’m using Windows 7 x64.

    1. Hi Pashec, thanks for a comment. 8192 is the size of en empty log – so ADO.NET tracing is not working on your machine. There might be unfortunately many reasons for that. First check the registry:

      – key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\BidInterface\Loader
      “:Path”=”C:\indows\Microsoft.NET\Framework64\v4.0.30319\AdoNetDiag.dll”

      – key HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\BidInterface\Loader
      “:Path”=”C:\Windows\Microsoft.NET\Framework\v4.0.30319\AdoNetDiag.dll”

      Then check if the provider is present: logman query providers ADONETDIAG.ETW

      What version of .NET is your application using? Is it 32-bit application?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s