A story of fixing a memory leak in MiniDumper


MiniDumper is a diagnostic tool for collecting memory dumps of .NET applications. Dumps created by MiniDumper are significantly smaller than the full-memory dumps collected by, for example, procdump. However, they contain enough information to diagnose most issues in managed applications. MiniDumper was initially developed by Sasha Goldshtein, and I have made a few contributions to its code base. You may learn more about this tool from Sasha’s or my blog posts.

Recently, one of MiniDumper’s users reported a memory leak in the application. The numbers looked scary, as there was a 20 MB leak on each memory dump. The issue stayed open for a few weeks before I finally found a moment to look at it. As it was quite a compelling case, I decided to share the diagnostic steps with you, in the hope they prove useful in your investigations.

Confirming the leak in VMMap

I prepared a simple test application to feed MiniDumper with exceptions:

using System;
using System.Threading;

public class Program 
{
    public static void Main() {
        while (true) {
            try {
                Console.ReadLine();
                throw new Exception("test");
            } catch (Exception) { }
        }
    }
}

I then configured MiniDumper to make a memory dump on each thrown first-chance exception: MiniDumper64.exe -mh -e 1 -n 20 -x .\ .\Test.exe. In the meantime, I started VMMap and attached it to the MiniDumper process. After each dump, I collected a snapshot of the MiniDumper memory (F5 key or View -> Refresh in the VMMap window). A few refreshes later, I opened the timeline window, and there was a noticeable increase in the yellow (Private Data) part of the graph:

After selecting two adjacent snapshots, VMMap produced the following Details View:

Address ranges that are in the new snapshot but not in the previous one (new allocations) are highlighted in green. As you can see, even for my simple app, more than 5 MB were allocated on each dump and never reclaimed. Interestingly, it wasn’t the managed heap or the native heap that was leaking (VMMap has separate categories for them), but Private Data. Applications may allocate Private Data for various purposes, and diagnosing leaks in those segments can be hard. MiniDumper relies heavily on ClrMD (aka Microsoft.Diagnostics.Runtime) for examining the managed memory. Moreover, ClrMD wraps tons of native COM interfaces exposed by .NET. In the hope of a quick fix, I checked the ClrMD repository for memory leak issues and soon found one. Unfortunately, updating MiniDumper to the latest version did not fix my problem.

In the next step, I wanted to find out who was to blame for those excessive allocations. Before doing so, however, I extracted from MiniDumper the code that was leaking (the code touching ClrMD) and prepared a MemoryLeak application:

using System;
using System.Linq; // required for the .Count() call below
using Microsoft.Diagnostics.Runtime;

class Program
{
    static void Main(string[] args)
    {
        int pid = int.Parse(args[0]);

        while (true)
        {
            EnumRegions(pid);

            GC.Collect();

            Console.WriteLine("Press any key to continue...");
            Console.ReadKey();
        }
    }
    private static void EnumRegions(int pid)
    {
        var readerType = typeof(DataTarget).Assembly.GetType("Microsoft.Diagnostics.Runtime.LiveDataReader");
        var reader = (IDataReader)Activator.CreateInstance(readerType, pid, false);
        var target = DataTarget.CreateFromDataReader(reader);

        foreach (var clrVersion in target.ClrVersions)
        {
            var runtime = clrVersion.CreateRuntime();

            TouchOtherRegions(runtime);
        }
    }

    private static void TouchOtherRegions(ClrRuntime runtime)
    {
        var heap = runtime.Heap;
        heap.EnumerateRoots(enumerateStatics: false).Count();
    }
}

It took a few iterations of the VMMap snapshotting to make the Details view look more compact, with about 3 MB leaking between subsequent snapshots. With the test application ready, it was time to look at the call stacks. VMMap has a tracing functionality and can group allocations by call stacks, but it correctly resolves only native stacks and leaves .NET frames as hex strings. Other debugging tools I know suffer from the same problem (for example, umdh or Application Verifier with Windbg). However, I recalled reading in the Inside Windows Debugging book (which is excellent, by the way) that ETW can also trace memory allocations. Moreover, ETW tools on Windows 10 can decode managed stacks as well as native ones. It was time to start WPRUI.exe (part of the Windows Performance Toolkit).

Collecting allocation trace using ETW

In the WPRUI window I chose the Heap and VirtualAlloc usage events and clicked the Start button:

Then I ran my MemoryLeak application and, after a few Enter presses, stopped the tracing. After opening the trace in Windows Performance Analyzer, I checked the graphs in the Memory section and double-clicked the VirtualAlloc Commit LifeTimes graph. Next, I filtered the graph to the MemoryLeak process, and the following staircase graph appeared:

In the table below the graph, I added the Commit Stack column and moved it to the left of the yellow line (to group the call stacks). After expanding the call stack tree, I found some interesting allocations of 2.7MB:

This time, however, I knew their source:

- MemoryLeak.exe!MemoryLeak.Program::Main
 |- MemoryLeak.exe!MemoryLeak.Program::EnumRegions
 |  |- MemoryLeak.exe!MemoryLeak.Program::TouchOtherRegions 0x0
 |  |  |- System.Core.ni.dll!System.Linq.Enumerable.Count[System.__Canon](System.Collections.Generic.IEnumerable`1)
 |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopGCHeap+d__39::MoveNext
 |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopGCHeap+d__37::MoveNext
 |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.V45Runtime+d__10::MoveNext
 |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopThread::get_StackTrace
 |  |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopRuntimeBase+d__82::MoveNext
 |  |  |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.V45Runtime::GetStackFrame
 |  |  |  |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopMethod::Create
 |  |  |  |  |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Desktop.DesktopModule::GetMetadataImport
 |  |  |  |  |  |  |  |  |  |  |  |- Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.DacInterface.SOSDac::GetMetadataImport
 |  |  |  |  |  |  |  |  |  |  |  |  Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Utilities.CallableCOMWrapper::.ctor
 |  |  |  |  |  |  |  |  |  |  |  |  Microsoft.Diagnostics.Runtime.dll!dynamicClass::IL_STUB_PInvoke
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!ClrDataModule::QueryInterface
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!ClrDataModule::GetMdInterface
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!Module::GetMDImport
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!ClrDataAccess::GetMDImport
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!DacInstantiateTypeByAddressHelper
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!DacInstanceManager::Alloc
 |  |  |  |  |  |  |  |  |  |  |  |  mscordacwks.dll!ClrVirtualAlloc
 |  |  |  |  |  |  |  |  |  |  |  |  KernelBase.dll!VirtualAlloc
 |  |  |  |  |  |  |  |  |  |  |  |  ntdll.dll!NtAllocateVirtualMemory
 |  |  |  |  |  |  |  |  |  |  |  |  ntoskrnl.exe!KiSystemServiceCopyEnd
 |  |  |  |  |  |  |  |  |  |  |  |  ntoskrnl.exe!NtAllocateVirtualMemory
 |  |  |  |  |  |  |  |  |  |  |  |  ntoskrnl.exe!MiAllocateVirtualMemory

As the allocations were buried in native code, I thought I would have a look at how the DacInstanceManager::Alloc method is implemented in CoreCLR. There I found an interesting comment at the beginning of this method:

DAC_INSTANCE*
DacInstanceManager::Alloc(TADDR addr, ULONG32 size, DAC_USAGE_TYPE usage)
{
    SUPPORTS_DAC_HOST_ONLY;
    DAC_INSTANCE_BLOCK* block;
    DAC_INSTANCE* inst;
    ULONG32 fullSize;

    static_assert_no_msg(sizeof(DAC_INSTANCE_BLOCK) <= DAC_INSTANCE_ALIGN);
    static_assert_no_msg((sizeof(DAC_INSTANCE) & (DAC_INSTANCE_ALIGN - 1)) == 0);

    //
    // All allocated instances must be kept alive as long
    // as anybody may have a host pointer for one of them.
    // This means that we cannot delete an arbitrary instance
    // unless we are sure no pointers exist, which currently
    // is not possible to determine, thus we just hold everything
    // until a Flush.  This greatly simplifies instance allocation
    // as we can then just sweep through large blocks rather
    // than having to use a real allocator.  The only
    // complication is that we need to keep all instance
    // data aligned.  We have guaranteed that the header will
    // preserve alignment of the data following if the header
    // is aligned, so as long as we round up all allocations
    // to a multiple of the alignment size everything just works.
    //
    ...
}

It looks like DacInstanceManager keeps some objects in memory until we call the Flush method. Flush is also called by the DacInstanceManager destructor. So was MiniDumper missing this call?
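The strategy described in that comment can be sketched in a few lines of C# (a simplified, hypothetical model, not the actual DAC code): every allocation is appended to a list and kept alive, and only Flush releases everything at once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the "hold everything until Flush" strategy
// described in the DAC comment -- not the actual CoreCLR implementation.
public class SweepAllocator
{
    private readonly List<byte[]> _blocks = new List<byte[]>();

    public byte[] Alloc(int size)
    {
        // Every allocation is kept alive; nothing is freed individually,
        // because a host pointer to it may still exist somewhere.
        var block = new byte[size];
        _blocks.Add(block);
        return block;
    }

    public int HeldBytes
    {
        get
        {
            int total = 0;
            foreach (var b in _blocks) total += b.Length;
            return total;
        }
    }

    // Flush releases all blocks at once -- the only way memory is reclaimed.
    public void Flush() => _blocks.Clear();
}
```

With such an allocator, repeatedly walking a target process without ever flushing produces exactly the kind of staircase growth VMMap showed.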

Analysing GC references in Windbg

As you can see in the call stack listed previously, the wrapper over the DAC is the SOSDac class, inheriting from the CallableCOMWrapper class, which has a proper finalizer and a Dispose() method. Logically, the GC should remove its instances after sweeping away the objects that were referencing them. That would typically happen, but there is a cross-reference dependency in ClrMD which blocks the disposal of the DataTarget instance and, in consequence, of the DAC wrappers. To find the culprit we could use either PerfView or Windbg. I’m a bit biased, so I picked Windbg and attached it to the MemoryLeak process (in PerfView you would use the “Take Heap Snapshot” option from the Memory menu). Then I found the method table address for the SOSDac class:

0:000> !Name2EE Microsoft.Diagnostics.Runtime.dll Microsoft.Diagnostics.Runtime.DacInterface.SOSDac
Module:      00007ff8cf134cb8
Assembly:    Microsoft.Diagnostics.Runtime.dll
Token:       0000000002000373
MethodTable: 00007ff8cf604408
EEClass:     00007ff8cf5f6cc8
Name:        Microsoft.Diagnostics.Runtime.DacInterface.SOSDac

Knowing the method table, we could list the instances of the SOSDac class on the GC Heap:

0:000> !dumpheap -mt 00007ff8cf604408
         Address               MT     Size
000001c380010a90 00007ff8cf604408      544     

Statistics:
              MT    Count    TotalSize Class Name
00007ff8cf604408        1         1632 Microsoft.Diagnostics.Runtime.DacInterface.SOSDac

Finally, I found out who was referencing it:

0:000> !gcroot 000001c380010a90
HandleTable:
    000001c3ea641318 (strong handle)
    -> 000001c38000f7f8 Microsoft.Diagnostics.Runtime.DacInterface.DacDataTargetWrapper
    -> 000001c380005438 Microsoft.Diagnostics.Runtime.DataTargetImpl
    -> 000001c38000c020 System.Collections.Generic.List`1[[Microsoft.Diagnostics.Runtime.DacLibrary, Microsoft.Diagnostics.Runtime]]
    -> 000001c38000c048 Microsoft.Diagnostics.Runtime.DacLibrary[]
    -> 000001c38000f640 Microsoft.Diagnostics.Runtime.DacLibrary
    -> 000001c380010a90 Microsoft.Diagnostics.Runtime.DacInterface.SOSDac

Found 1 unique roots (run '!GCRoot -all' to see all roots).

As we can see, the GC did not reclaim our DataTarget instance because a strong GC handle was indirectly referencing it. After checking the source code of the DacDataTargetWrapper, everything became clear (I stripped the unnecessary parts):

internal unsafe class DacDataTargetWrapper : COMCallableIUnknown, ICorDebugDataTarget
{
    private static readonly Guid IID_IDacDataTarget = new Guid("3E11CCEE-D08B-43e5-AF01-32717A64DA03");
    private static readonly Guid IID_IMetadataLocator = new Guid("aa8fa804-bc05-4642-b2c5-c353ed22fc63");

    private readonly DataTarget _dataTarget;

    ...
}

...

public unsafe class COMCallableIUnknown : COMHelper
{
    private readonly GCHandle _handle;
    private int _refCount;

    ...

    public COMCallableIUnknown()
    {
        _handle = GCHandle.Alloc(this);

        ...
    }

    ...
}

That’s the lesson I learned about the ClrMD library: ALWAYS dispose of the DataTarget instances. For the MemoryLeak application, the fix was to put the DataTarget creation into a using block:

private static void EnumRegions(int pid)
{
    var readerType = typeof(DataTarget).Assembly.GetType("Microsoft.Diagnostics.Runtime.LiveDataReader");
    var reader = (IDataReader)Activator.CreateInstance(readerType, pid, false);
    using (var target = DataTarget.CreateFromDataReader(reader)) // missing using statement
    {
        foreach (var clrVersion in target.ClrVersions)
        {
            var runtime = clrVersion.CreateRuntime();

            TouchOtherRegions(runtime);
        }
    }
}

In MiniDumper, I had two places in the code to update, and the new 2.2.19053 version is free of the memory leaks (at least the known ones :)), so please update.
