Have you ever had to deal with a .NET app that keeps crashing due to OutOfMemoryException? Or maybe it uses a lot more memory than you expect and you’re not quite sure why. Memory leaks aren’t supposed to be a thing in .NET, but we do still bear some responsibility for managing resources correctly as developers on the .NET platform.
Part of that responsibility is understanding the tools and libraries we use in our code. In this post, we’ll explore a sneaky memory problem related to one of our favorite modern tools—the IoC container.
TL;DR: If you’re having memory problems and have configured Transient registrations for your IoC container, try using Scoped registrations instead. And, as always, make sure objects constructed by your container are being disposed properly.
Memory problems in production
We recently came across a client app behaving curiously under production load. There weren’t any performance issues, crashes, or OOM exceptions, but memory usage numbers in the production monitoring system were higher than expected. The monitoring history also showed slow growth in the total memory used over time.
A memory leak? Say it ain’t so!
The application is a simple ETL process that reads XML input files and transforms their contents into a new file format (HL7). It uses Entity Framework to persist a very small amount of data: the saga state for the MassTransit library. All told, thousands of files are processed each day, but that adds up to only megabytes of data per day.
Investigating the issue
After establishing a local test environment, we ran a profile of the app using JetBrains dotMemory. The results made it clear: Somewhere we had Gen2 heap objects not getting garbage collected.
Drilling further into the heap contents showed that it was our EF DbContext that wasn’t getting garbage collected. The profile showed that our IoC container (Microsoft.Extensions.DependencyInjection) was retaining references to thousands of DbContext objects inside an array of IDisposables.
Our first stop was the container setup, where our StateMachineDbContext was registered through AddDbContext with ServiceLifetime.Transient.
Transient seems OK here. The basic understanding of the transient lifecycle is that you get a new object every time one is requested, which seems innocuous at first glance. Also remember that we’re not operating in ASP.NET, so we don’t think about a request scope in this app the way we would if we were handling individual HTTP requests. That turns out to be key here.
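To make the transient behavior concrete, here is a rough sketch of what a registration of this kind can look like. It is an illustration only, not our production setup: it assumes EF Core, and the in-memory provider, the database name "saga-state", and the empty context class are all placeholders.

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

// Minimal stand-in for the real context; a real one would define the saga mappings.
public class StateMachineDbContext : DbContext
{
    public StateMachineDbContext(DbContextOptions<StateMachineDbContext> options)
        : base(options) { }
}

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();

        // Assumption: the in-memory provider and database name are placeholders.
        services.AddDbContext<StateMachineDbContext>(
            options => options.UseInMemoryDatabase("saga-state"),
            ServiceLifetime.Transient);

        using var provider = services.BuildServiceProvider();

        // Each request to the container produces a brand new context.
        var first = provider.GetRequiredService<StateMachineDbContext>();
        var second = provider.GetRequiredService<StateMachineDbContext>();

        Console.WriteLine(ReferenceEquals(first, second)); // False
    }
}
```

Note that both instances here are resolved directly from the root provider, which is exactly the situation the rest of this post is about.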
Uncovering the hidden request pipeline
It’s worth understanding just why we’re getting so many instances of the StateMachineDbContext in the first place. Our context is being used by the MassTransit library to track process state within a saga. If you’re unfamiliar with MassTransit or the saga pattern, the most important thing to know is that our parsing process is implemented in multiple steps, coordinated by passing messages over a message bus.
Every step in the file handling process is implemented within a message handler. After a given handler performs its work, it fires off a new message to the bus to trigger the next handler in the parsing process. Each time a message is received on the bus, MassTransit is constructing a message handler to process the message and a StateMachineDbContext to save the resulting saga state.
In other words, even though we’re running in a Windows service, we’ve effectively got a request processing pipeline just like we would in ASP.NET.
Look closely and you’ll see there’s another request lifecycle in our app. Our process runs continuously (like an HTTP server) and parses input files periodically received from an external source (like an HTTP request). Our use of MassTransit amplifies the number of DbContexts we allocate per file processed, but we’d potentially see the problem even without MassTransit. The issue would be observed for any setup that constructed a handler-per-file with IDisposable dependencies.
Tracking down object disposal
Now that we understand why we’ve got so many contexts, the next question is why these contexts are hanging around in the container.
Even though we now understand we have a hidden request lifecycle, we don’t have the plumbing in place to create a neat processing pipeline like ASP.NET provides for us in a web app. In an ASP.NET world, we’d almost certainly register our context with a Scoped lifetime. The problem for our container is that it has no idea when it’s appropriate to handle the cleanup of disposable objects registered as Transient.
Some containers simply choose not to track IDisposables at all; it’s left to the developer to track them and call Dispose() as appropriate. It turns out the technical decision made by the implementers of the Microsoft.Extensions.DependencyInjection container (https://github.com/aspnet/DependencyInjection/issues/456) is to track IDisposables and dispose them when the container itself is disposed.
This is an absolutely critical detail for our problem! The container is keeping a reference to our StateMachineDbContext objects in memory (see the screenshot above), meaning the garbage collector won’t ever clean them up. Even if some other piece of code has called Dispose() on your objects, they still won’t be garbage collected until the container is disposed. In our case, that’s when the app is stopped.
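The tracking behavior is easy to reproduce in isolation. The following is a minimal sketch using only Microsoft.Extensions.DependencyInjection; FakeContext is a hypothetical stand-in for our StateMachineDbContext that simply counts how many instances have been disposed.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for StateMachineDbContext; counts disposals.
public sealed class FakeContext : IDisposable
{
    public static int DisposedCount;
    public void Dispose() => DisposedCount++;
}

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddTransient<FakeContext>();

        var provider = services.BuildServiceProvider();

        // Resolve many transient instances straight from the root provider,
        // the way a long-running service might do once per message.
        for (var i = 0; i < 1000; i++)
        {
            provider.GetRequiredService<FakeContext>();
        }

        // The provider still references every instance, so none are disposed yet.
        Console.WriteLine(FakeContext.DisposedCount); // 0

        // Only disposing the provider itself releases and disposes them.
        provider.Dispose();
        Console.WriteLine(FakeContext.DisposedCount); // 1000
    }
}
```

In a service that runs for days, that final Dispose() call effectively never happens, so the tracked list only grows.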
Resolving the root problem
The easiest path to solving our problem comes down to avoiding creation of our StateMachineDbContext as a Transient item in our root container. Note that our solution is completely dependent on how our container chooses to handle IDisposables.
In our original registration, it’s the p.GetService<StateMachineDbContext> call in conjunction with our AddDbContext(…, ServiceLifetime.Transient) registration that’s at the root of our problem (pun intended). We’re asking the container to construct the StateMachineDbContext for us, which means it gets tracked in our container’s IDisposable array since it’s registered with a Transient lifetime.
Either of two changes solves the problem: ensure the StateMachineDbContext registration is not Transient, or take charge of constructing the StateMachineDbContext outside of the container. There’s some irony in the first option, given the earlier discussion about how we don’t really have the plumbing for a neat Unit of Work pattern in a request processing pipeline; in effect, we’re exploiting the behavior of the container to stop it from tracking the StateMachineDbContext after construction**.
**Side Note: Even though the container is not tracking the StateMachineDbContext instances it creates for us, the resources still need to be cleaned up. DelegateSagaDbContextFactory is a MassTransit class that will handle calling Dispose() on our StateMachineDbContext after our saga state is persisted. You may need a similar mechanism to actually dispose your resources, depending on your scenario.
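For the second option described above (constructing the context outside of the container), one possible shape is to register a factory delegate rather than the context itself. This is a sketch under assumptions, not our actual MassTransit wiring: FakeContext is a hypothetical stand-in, and the calling code plays the role that DelegateSagaDbContextFactory plays for us by disposing each instance.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for StateMachineDbContext.
public sealed class FakeContext : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();

        // The container owns only the delegate. The contexts the delegate
        // creates are never seen by the container, so it cannot track them.
        services.AddSingleton<Func<FakeContext>>(_ => () => new FakeContext());

        using var provider = services.BuildServiceProvider();
        var createContext = provider.GetRequiredService<Func<FakeContext>>();

        FakeContext captured;
        using (var context = createContext())
        {
            captured = context;
            // ... perform one unit of work with the context ...
        } // the caller, not the container, disposes the context here

        Console.WriteLine(captured.IsDisposed); // True
    }
}
```

The trade-off is that disposal becomes entirely the caller’s job; if nothing calls Dispose(), the container will not step in.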
The Transient lifecycle is tricky to manage when you really dig into what’s happening under the covers. Use it with caution and a full understanding of how your IoC container of choice handles tracking of IDisposable objects. When in doubt, take a very close look at whether you should really use Transient or if a Scoped registration is more appropriate. And no matter what, make sure Dispose() is being called on your objects!
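If you do have a clear unit of work outside of ASP.NET, you can create the scope yourself. Here is a minimal sketch, assuming each loop iteration represents one incoming message or file; FakeContext is again a hypothetical stand-in for a real DbContext.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for a DbContext; counts disposals.
public sealed class FakeContext : IDisposable
{
    public static int DisposedCount;
    public void Dispose() => DisposedCount++;
}

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddScoped<FakeContext>();

        using var provider = services.BuildServiceProvider();

        // Pretend each iteration is one message arriving on the bus.
        for (var i = 0; i < 3; i++)
        {
            using (var scope = provider.CreateScope())
            {
                var context = scope.ServiceProvider.GetRequiredService<FakeContext>();
                // ... handle the message using the context ...
            } // disposing the scope disposes the context and releases the reference
        }

        Console.WriteLine(FakeContext.DisposedCount); // 3
    }
}
```

Because the scope, rather than the root container, tracks the disposable, each context becomes eligible for garbage collection as soon as its unit of work ends.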