.NET Performance Tip – Benchmarking



Micro-optimising has a bad reputation, although I’d argue that knowledge of such things can help you write better code. We should also make a clear distinction between micro-optimising, which is a little risky, and micro-benchmarking, which is a lot safer if you know how to do it right.

Now, if you’re making that mental link with premature optimization right now, off the back of a 44-year-old (mis)quote from Donald Knuth, there are plenty of good, more recent and balanced arguments on the subject out there now, and plenty of tools that help you do it better. I think some of the confusion comes from a misunderstanding of where it sits within a much broader range of knowledge and tools for performance. Remember, he did say that ‘we should not pass up our opportunities in that critical 3%’, which sounds about right in my experience.

Common Pitfalls

There are plenty of pitfalls to going old-school with System.Diagnostics.Stopwatch, such as:

  • not enough iterations of operations that take a negligible amount of time
  • forgetting to enable compiler optimizations (i.e. benchmarking a Debug build)
  • inadvertently measuring code not connected to what you are testing
  • failing to ‘warm up’ to account for JIT costs and processor caching
  • forgetting to separate setup code from code under test
  • and so on…
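For illustration, here’s roughly what a hand-rolled Stopwatch harness has to do just to dodge a few of these pitfalls (a sketch only – DoWork and the iteration count are placeholders of my own):

```csharp
using System;
using System.Diagnostics;

public static class StopwatchHarness
{
    public static int DoWork() => 21 * 2; // placeholder for the operation under test

    public static void Main()
    {
        const int iterations = 1_000_000;   // one call is negligible, so repeat many times

        DoWork();                           // crude warm-up to pay JIT costs before timing

        var sw = Stopwatch.StartNew();      // setup kept outside the timed loop
        int result = 0;
        for (int i = 0; i < iterations; i++)
        {
            result = DoWork();
        }
        sw.Stop();

        // Use the result so the work cannot be eliminated entirely
        Console.WriteLine($"{result}: ~{sw.Elapsed.TotalMilliseconds / iterations} ms/op");
    }
}
```

Even this gets plenty wrong – no statistical treatment, no outlier handling, and it’s still at the mercy of the optimiser – which is exactly why a purpose-built tool is preferable.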

Enter Adam Sitnik’s BenchmarkDotNet. It deals with all of the problems above and more.


It’s available as a NuGet package and has some excellent documentation.

You have a choice of configuration methods: via objects, attributes or a fluent API. Things you can configure include:

  • Compare RyuJIT (the default since .NET 4.6 for x64 and since Core 2.0 for x86) and the legacy JIT
  • Compare x86 with x64
  • Compare Core with full framework (aka Clr)
  • JIT inlining (and tail calls, which can be confusingly similar to inlining in 64-bit applications in my experience)
  • You can even test the difference between Server GC and Workstation GC from my last tip

A Very Simple Example

For many scenarios, it is fine to just fire it up with the standard settings. Here’s an example where I used it to get some comparisons between DateTime, DateTimeOffset and NodaTime.

  • [ClrJob, CoreJob] – I used the attribute approach to configuration, decorating the class to make BenchmarkDotNet run the tests on the .NET full framework and also Core.
  • [Benchmark] – used to decorate each method I wanted to benchmark

A call to BenchmarkRunner.Run gets things started.

Note if you want to try this code, you’ll need to install the NuGet packages for BenchmarkDotNet and NodaTime.
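Putting it together, the benchmark class and entry point would have looked something like this (the method bodies are my reconstruction of what was compared; SystemClock.Instance.GetCurrentInstant() assumes the NodaTime 2.x API, and newer BenchmarkDotNet versions replace [ClrJob, CoreJob] with [SimpleJob(...)]):

```csharp
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using NodaTime;

[ClrJob, CoreJob] // run every benchmark on both full framework and .NET Core
public class DateTimeBenchmarks
{
    [Benchmark]
    public DateTime GetDateTimeNow() => DateTime.Now;

    [Benchmark]
    public DateTimeOffset GetDateTimeOffsetNow() => DateTimeOffset.Now;

    [Benchmark]
    public Instant GetNodaTimeInstant() => SystemClock.Instance.GetCurrentInstant();
}

public class Program
{
    // the call to get things started
    public static void Main() => BenchmarkRunner.Run<DateTimeBenchmarks>();
}
```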


Obviously, this is not a substitute for understanding the underlying implementation details of the DateTime class in the Base Class Library (BCL); but it is a quick and easy way to initially identify problem areas. In fact, this was just a small part of a larger explanation I gave a colleague around ISO 8601, time zones, daylight saving and the pitfalls of DateTime.Now.

One Thing That Caught Me Out

One gotcha: if you are testing both Core and the full framework, make sure you create a new Core console application and edit your csproj file, switching out <TargetFramework> for e.g.
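A sketch of the multi-targeting element (the exact target framework monikers here are my assumptions – use the versions installed on your machine):

```xml
<TargetFrameworks>net472;netcoreapp2.0</TargetFrameworks>
```

Note the pluralised <TargetFrameworks> replacing the singular <TargetFramework>.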

A Super-Simplified Explanation of .NET Garbage Collection




Super happy to have won First Prize @ Codeproject for best article.

Garbage collection is so often at the root (excuse the pun) of many performance problems, very often because of misunderstanding, so please do set aside time to deepen your understanding after reading this.

This article is a super-simplified look at .NET garbage collection, with loads of intentional technical omissions. It aims to provide a baseline level of understanding that a typical C# developer realistically needs for their day-to-day work. I certainly won’t add complexity by mentioning the stack, value types, boxing etc.

Disclaimer: The .NET garbage collector is a dynamic beast that adapts to your application and its implementation is often changed.

What Developers Know To Say At Interview

The garbage collector automatically looks for objects that are no longer used and frees up that memory. This helps avoid memory leaks created through programmer error.

That’s fine to get the job but is not enough to engineer good, performant C#, trust me.

Managed Memory (The Brief Version)

The .NET CLR (Common Language Runtime) reserves a chunk of memory available to your application where it will manage any objects allocated by your application. When your application is finished with these objects, they are deallocated. This part is handled by the Garbage Collector (GC).

The GC can and does expand segment sizes when needed, but its preference is to reclaim space through its generational garbage collection…

The Generational .NET Garbage Collector

New, small objects go into generation 0. When a collection occurs, any objects that are no longer in use (no references to them) have their memory freed up (deallocated). Any objects still in use will survive and be promoted to the next generation.
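You can watch promotion happen with GC.GetGeneration – a minimal sketch (the generation numbers in the comments are typical workstation GC behaviour, not a guarantee):

```csharp
using System;

public static class GenerationDemo
{
    public static int[] Run()
    {
        var survivor = new object();                 // new small object: gen 0
        int before = GC.GetGeneration(survivor);     // typically 0

        GC.Collect();                                // survivor is still referenced, so it survives
        int afterOne = GC.GetGeneration(survivor);   // typically promoted to 1

        GC.Collect();
        int afterTwo = GC.GetGeneration(survivor);   // typically promoted to 2

        GC.Collect();
        int afterThree = GC.GetGeneration(survivor); // stays at 2 – the last stop

        return new[] { before, afterOne, afterTwo, afterThree };
    }

    public static void Main() => Console.WriteLine(string.Join(",", Run()));
}
```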

Live Long Short (or Forever) and Prosper

Ask any .NET expert and they will tell you the same thing – an object should be short-lived or else live forever. I won’t be going into detail about performance – this is the one and only rule I want you to take away.

To appreciate it, we need to answer: Why generational?

In a well-engineered C# application, typical objects will live and die without ever being promoted out of gen 0. I’m thinking of operations like:

  • Variables local to a short running method
  • Objects instantiated for the lifetime of a web API request

Gen 1 is the ‘generation in between’, which will catch any wannabe-short-lived objects that escape gen 0, with what is still a relatively quick collection.

A check on which objects are unused consumes resources and suspends application threads. GC gets increasingly expensive up the generations, as a collection of a particular generation also has to collect all those preceding it, e.g. if gen 2 is collected, then so must gen 1 and gen 0 (see above diagram). This is why we often refer to a gen 2 collection as a full GC. Also, objects that live longer tend to be more complicated to clean up!

Don’t worry though – if we know which objects are likely to live longer, we can just check them less frequently. And with this in mind:

.NET GC runs the most on gen 0, less on gen 1 and even less often on gen 2.

If an object makes it to gen 2 it needs to be for a good reason – like being a permanent, reusable object. If objects make it there unintentionally, they’ll stick around longer, using up memory and resulting in more of those bigger gen 2 full collections!

But Generations Are All Just A Front!

The biggest gotcha when exploring your application through profilers and debuggers, looking at GC for the first time, is that the Large Object Heap (LOH) is also referred to as generation 2.

Physically, your objects will end up in managed heap segments (in the memory allocated to the CLR, mentioned earlier).

Objects will be added onto gen 0 of the Small Object Heap (SOH) consecutively, avoiding having to look for free space. To reduce fragmentation when objects die, the heap may be compacted.

See the following simplified look at a gen 0 collection, followed by a gen 1 collection, with a new object allocation in between (my first go at such an animation):


Large objects go on the Large Object Heap, which does not compact, but will try to reuse space.

As of .NET 4.5.1 you can tell the GC to compact it on the next collection. But prefer other options for dealing with LOH fragmentation, such as pooling reusable objects, instead.

How Big Is a Large Object?

It is well established that an object of >= 85KB is a large object (as is an array of 1000 or more doubles). But you need to know a bit more than that…

You might be thinking of that large Bitmap image you’ve been working with – actually that object uses 24 bytes, and the bitmap itself is in unmanaged memory. It’s really rare to see a single object that is genuinely large; more typically, a large object is going to be an array.

In the following example, the object from LargeObjectHeapExample is actually 16 Bytes because it is just made up of general class info and pointers to the string and byte array.

By instantiating the LargeObjectHeapExample object we are actually allocating 3 objects on the heap: 2 of them on the small object heap; and the byte array on the large object heap.
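The example looked something like this (the field names and sizes are my reconstruction; the byte array just needs to be at least 85,000 bytes to land on the LOH):

```csharp
using System;

public class LargeObjectHeapExample
{
    // The object itself is tiny: class overhead plus two references
    public string AString = new string('x', 10);  // small object heap
    public byte[] Bytes = new byte[100_000];      // >= 85,000 bytes: large object heap
}

public static class Program
{
    public static void Main()
    {
        var example = new LargeObjectHeapExample();

        Console.WriteLine(GC.GetGeneration(example));         // 0 – small object heap
        Console.WriteLine(GC.GetGeneration(example.AString)); // 0 – small object heap
        Console.WriteLine(GC.GetGeneration(example.Bytes));   // 2 – the LOH reports as gen 2
    }
}
```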

Remember what I said earlier about stuff in the Large Object Heap – notice how the byte array reports as being in generation 2! One reason for the LOH being within gen 2 logically, is that large objects typically have longer lifetimes (think back to what I said earlier about objects that live longer in generational GC). The other reason is the expense of copying large objects while performing the compacting that occurs in earlier generations.

What Triggers a Collection?

  • An attempt to allocate exceeds the threshold for a generation or the large object heap
  • A call to GC.Collect (I’ll save this for another article)
  • The OS signals low system memory

Remember, gen 2 and the LOH are logically the same thing, so hitting the threshold on either will trigger a full (gen 2) collection on both heaps. Something to consider regarding performance (beyond this article).


  • A collection of a particular generation also collects all those below it i.e. collecting 2 also collects 1 and 0.
  • The GC promotes objects that survive collection (because they are still in use) to the next generation. Although see previous point – don’t expect an object in gen 1 to move to gen 2 when a gen 0 collection occurs.
  • GC runs the most on gen 0, less on gen 1 and even less often on gen 2. With this in mind, objects should be short-lived (die in gen 0 or gen 1 at worst) or live forever (intentionally of course) in gen 2.

C# Debug vs. Release builds and debugging in Visual Studio – from novice to expert in one blog article


Super happy to have won First Prize @ Codeproject for this article.

Repository for my PowerShell script to inspect the DebuggableAttribute of assemblies.


‘Out of the box’ the C# build configurations are Debug and Release.

I planned to write an introductory article but, as I delved deeper into the internals, I started exploring actual behaviour with Roslyn vs. previous commentary and what the documentation states. So, while I do start with the basics, I hope there is something for more experienced C# developers too.

Disclaimer: Details will vary slightly for .NET languages other than C#.

A reminder of C# compilation

C# source code passes through 2 compilation steps to become CPU instructions that can be executed.

Diagram showing the 2 steps of compilation in the C# .NET ecosystem

As part of your continuous integration, step 1 would take place on the build server and then step 2 would happen later, whenever the application is being run. When working locally in Visual Studio, both steps, for your convenience, fire off the back of starting the application from the Debug menu.

Compilation step 1: The application is built by the C# compiler. Your code is turned into Common Intermediate Language (CIL), which can be executed in any environment that supports CIL (which from now on I will refer to as IL). Note that the assembly produced is not readable IL text but actually metadata and byte code as binary data (tools are available to view the IL in a text format).

Some code optimisation will be carried out (more on this further on).

Compilation step 2: The Just-in-time (JIT) compiler will convert the IL into instructions that the CPU on your machine can execute. This won’t all happen upfront though – in the normal mode of operation, methods are compiled at the time of calling, then cached for later use.

The JIT compiler is just one of a whole bunch of services that make up the Common Language Runtime (CLR), enabling it to execute .NET code.

The bulk of code optimisation will be carried out here (more on this further on).

What is compiler optimisation (in one sentence)?

It is the process of improving factors such as execution speed, size of the code, power usage and, in the case of .NET, the time it takes to JIT compile the code – all without altering the functionality, aka the original intent of the programmer.

Why are we concerned with optimisation in this article?

I’ve stated that the compilers at both steps will optimise your code. One of the key differences between the Debug and Release build configurations is whether the optimisations are disabled or not, so you do need to understand the implications of optimisation.

C# compiler optimisation

The C# compiler does not do a lot of optimisation. It relies ‘…upon the jitter to do the heavy lifting of optimizations when it generates the real machine code’ (Eric Lippert). It will nonetheless still degrade the debugging experience. You don’t need in-depth knowledge of C# optimisations to follow this article, but I’ll look at one to illustrate the effect on debugging:

The IL nop instruction (no operation)

The nop instruction has a number of uses in low-level programming, such as introducing small, predictable delays or overwriting instructions you wish to remove. In IL, it is used to help breakpoints set in your source code behave predictably when debugging.

If we look at the IL generated for a build with optimisations disabled:

nop instruction

This nop instruction directly maps to a curly bracket and allows us to add a breakpoint on it:

curly bracket associated with nop instruction

This would be optimised out of IL generated by the C# compiler if optimisations were enabled, with clear implications for your debugging experience.
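To see this yourself, compile a trivial method with csc /optimize- and open the assembly in ildasm. The IL in the comment below is roughly what you’ll see (a sketch – exact instructions vary by compiler version):

```csharp
public static class Maths
{
    public static int Add(int a, int b)
    {   // in an unoptimised build, this opening brace maps to a nop instruction
        return a + b;
    }
}

/* Unoptimised IL, roughly:

   .method public hidebysig static int32 Add(int32 a, int32 b) cil managed
   {
     .locals init (int32 V_0)
     IL_0000: nop            // maps to '{' – a breakpoint can sit here
     IL_0001: ldarg.0
     IL_0002: ldarg.1
     IL_0003: add
     IL_0004: stloc.0
     IL_0005: br.s IL_0007
     IL_0007: ldloc.0
     IL_0008: ret
   }
*/
```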

For a more detailed discussion on C# compiler optimisations see Eric Lippert’s article: What does the optimize switch do?. There is also a good commentary of IL before and after being optimised here.

The JIT compiler optimisations

Despite having to perform its job swiftly at runtime, the JIT compiler performs a lot of optimisations. There’s not much info on its internals and it is a non-deterministic beast (like Forrest Gump’s box of chocolates) – varying in the native code it produces depending on many factors. Even while your application is running, it is profiling and possibly re-compiling code to improve performance. For a good set of examples of optimisations made by the JIT compiler, check out Sasha Goldshtein’s article.

I will just look at one example to illustrate the effect of optimisation on your debugging experience:

Method inlining

For the real-life optimisations made by the JIT compiler, I’d have to show you assembly instructions. This is just a mock-up in C# to give you the general idea:

Suppose I have:

The JIT compiler would likely perform an inline expansion on this, replacing the call to Add() with the body of Add():
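Something like the following mock-up (my own sketch – remember the real thing happens in native code, not C#):

```csharp
public static class InliningMockup
{
    private static int Add(int a, int b) => a + b;

    // Before inlining: the JIT emits a call to Add at the call site
    public static int BeforeInlining() => Add(1, 2);

    // After inlining: the JIT has substituted the body of Add into the caller,
    // removing the call overhead – and confusing your stack trace
    public static int AfterInlining() => 1 + 2;
}
```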

Clearly, trying to step through lines of code that have been moved is going to be difficult and you’ll also have a diminished stack trace.

The default build configurations

So now that you’ve refreshed your understanding of .NET compilation and the two ‘layers’ of optimisation, let’s take a look at the 2 build configurations available ‘out of the box’:

Visual Studio release and debug configurations

Pretty straightforward – Release is fully optimised, Debug is not at all, which, as you now know, is fundamental to how easy it is to debug your code. But this is just a superficial view of the possibilities with the debug and optimize arguments.

The optimize and debug arguments in depth

I’ve attempted to diagram these from the Roslyn and mscorlib code, including: CSharpCommandLineParser.cs, CodeGenerator.cs, ILEmitStyle.cs, debuggerattributes.cs, Optimizer.cs and OptimizationLevel.cs. Blue parallelograms represent command line arguments and the green ones are the resulting values in the codebase.

Diagram of optimize and debug command line arguments and their related settings in code

The OptimizationLevel enumeration

OptimizationLevel.Debug disables all optimizations by the C# compiler and disables JIT optimisations via DebuggableAttribute.DebuggingModes, which, with the help of ildasm, we can see is:

Manifest debuggable attribute

Given this is little-endian byte order, it reads as 0x107, which is 263, equating to: Default, DisableOptimizations, IgnoreSymbolStoreSequencePoints and EnableEditAndContinue (see debuggerattributes.cs).

OptimizationLevel.Release enables all optimizations by the C# compiler and enables JIT optimizations via DebuggableAttribute.DebuggingModes = ( 01 00 02 00 00 00 00 00 ) , which is just DebuggingModes.IgnoreSymbolStoreSequencePoints .

With this level of optimization, ‘sequence points may be optimized away. As a result it might not be possible to place or hit a breakpoint.’ Also, ‘user-defined locals might be optimized away. They might not be available while debugging.’ (OptimizationLevel.cs).

IL type explained

The type of IL is defined by the following enumeration from ILEmitStyle.cs.
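Its three values are as follows (a sketch from memory of the Roslyn source – the numeric values are my assumption, so check ILEmitStyle.cs for the current form):

```csharp
internal enum ILEmitStyle : byte
{
    // do all optimizations
    Release = 0,

    // no optimization, plus nop instructions to map sequence points to IL
    Debug = 1,

    // only optimizations that do not degrade debugging
    DebugFriendlyRelease = 2,
}
```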

As in the diagram above, the type of IL produced by the C# compiler is determined by the OptimizationLevel; the debug argument won’t change this, with the exception of debug+ when the OptimizationLevel is Release. That is, in all but the case of debug+, optimize is the only argument that has any impact on optimisation – a departure from pre-Roslyn behaviour*.

* In Jeffrey Richter’s CLR via C# (2014), he states that optimize- with debug- results in the C# compiler not optimising the IL and the JIT compiler optimising to native.

ILEmitStyle.Debug – no optimization of IL in addition to adding nop instructions in order to map sequence points to IL

ILEmitStyle.Release – do all optimizations

ILEmitStyle.DebugFriendlyRelease – only perform optimizations on the IL that do not degrade debugging. This is the interesting one. It comes off the back of a debug+ and only has an effect on optimized builds i.e. those with OptimizationLevel.Release. For optimize- builds debug+ behaves as debug.

The logic in CodeGenerator.cs describes it more clearly than I can.

The comment in the source file Optimizer.cs states that they do not omit any user-defined locals and do not carry values on the stack between statements. I’m glad I read this, as I was a bit disappointed with my own experiments in ildasm with debug+ – all I had been seeing was the retention of local variables and a lot more pushing and popping to and from the stack!

There is no intentional ‘deoptimizing’ such as adding nop instructions.

There’s no obvious, direct way to choose this debug flag from within Visual Studio for C# projects. Is anyone making use of this in their production builds?

No difference between debug, debug:full and debug:pdbonly?

Correct – despite the current documentation and the help stating otherwise:

csc command line help

They all achieve the same result – a .pdb file is created. A peek at CSharpCommandLineParser.cs  can confirm this. And for good measure I did check I could attach and debug with WinDbg for both the pdbonly and full values.

They have no impact on code optimisation.

On the plus side, the documentation on Github is more accurate, although I’d say, still not very clear on the special behaviour of debug+.

I’m new.. what’s a .pdb? Put simply, a .pdb file stores debugging information about your DLL or EXE, which will help a debugger map the IL instructions to the original C# code.

What about debug+?

debug+ is its own thing and cannot be suffixed by either full or pdbonly. Some commentators suggest it is the same thing as debug:full, which is not exactly true as stated above – used with optimize- it is indeed the same, but when used with optimize+ it has its own unique behaviour, discussed above under DebugFriendlyRelease .

And debug- or no debug argument at all?

The defaults in CSharpCommandLineParser.cs and the values it sets for debug- are the same. So we can confidently say that debug- and no debug argument result in the same single effect – no .pdb file is created.

They have no impact on code optimisation.

Suppress JIT optimizations on module load

A checkbox under Options->Debugging->General; this is an option on the debugger in Visual Studio and is not going to affect the assemblies you build.

You should now appreciate that the JIT compiler does most of the significant optimisations and is the bigger hurdle to mapping back to the original source code for debugging. With this enabled, the debugger will request that DisableOptimizations  is ignored by the JIT compiler.

Until circa 2015 the default was enabled. I earlier cited CLR via C#, in that pre-Roslyn we could supply optimise- and debug- arguments to csc.exe and get unoptimised C# that was then optimised by the JIT compiler – so there would have been some use for suppressing the JIT optimisations in the Visual Studio debugger. However, now that anything being JIT optimised is already degrading the debugging experience via C# optimisations, Microsoft decided to default to disabled on the assumption that if you are running the Release build inside Visual Studio, you probably wish to see the behaviour of an optimised build at the expense of debugging.

Typically you only need to switch it on if you need to debug into DLLs from external sources such as NuGet packages.

If you’re trying to attach from Visual Studio to a Release build running in production (with a .pdb or other source for symbols), then an alternative way to instruct the JIT compiler not to optimize is to add a .ini file with the same name as your executable alongside it, with the following:
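The file (MyApp.ini for MyApp.exe) contains:

```ini
[.NET Framework Debugging Control]
GenerateTrackingInfo=1
AllowOptimize=0
```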

Just My Code.. What?

By default, Options->Debugging->Enable Just My Code is enabled and the debugger considers optimised code to be non-user code. With this enabled, the debugger will not even attempt to debug non-user code.

You could uncheck this option, and then theoretically you can hit breakpoints. But now you are debugging code optimised by both the C# and JIT compilers that barely matches your original source code, with a super-degraded experience – stepping through code will be unpredictable and you will probably not be able to obtain the values of local variables.

You should only really be changing this option if working with DLLs from others where you have the .pdb file.

A closer look at DebuggableAttribute

Above, I mentioned using ildasm to examine the manifest of assemblies to examine DebuggableAttribute . I’ve also written a little PowerShell script to produce a friendlier result (available via download link at the start of the article).

Debug build:

Release build:

You can ignore IsJITTrackingEnabled, as it has been ignored by the JIT compiler since .NET 2.0. The JIT compiler will always generate tracking information during debugging to match up IL with its machine code and track where local variables and function arguments are stored (Source).

IsJITOptimizerDisabled simply checks DebuggingFlags for DebuggingModes.DisableOptimizations. This is the flag that turns off optimisation by the JIT compiler.

DebuggingModes.IgnoreSymbolStoreSequencePoints tells the debugger to work out the sequence points from the IL instead of loading the .pdb file, which would have performance implications. Sequence points are used to map locations in the IL code to locations in your C# source code, and the JIT compiler will not compile any 2 sequence points into a single native instruction. With this flag set, the debugger will not load the .pdb file. I’m not sure why this flag is being added to optimised builds by the C# compiler – any thoughts?

Key points

  • debug- (or no debug argument at all) now means: do not create a .pdb file.
  • debug, debug:full and debug:pdbonly all now cause a .pdb file to be output. debug+ will also do the same thing if used alongside optimize-.
  • debug+ is special when used alongside optimize+, creating IL that is easier to debug.
  • each ‘layer’ of optimisation (C# compiler, then JIT) further degrades your debugging experience. You will now get both ‘layers’ for optimize+ and neither of them for optimize-.
  • since .NET 2.0 the JIT compiler will always generate tracking information regardless of the attribute IsJITTrackingEnabled
  • whether building via VS or csc.exe, the DebuggableAttribute is now always present
  • the JIT can be told to ignore IsJITOptimizerDisabled during Visual Studio debugging via the general debugging option, Suppress JIT optimizations on module load. It can also be instructed to do so via a .ini file
  • optimize+ will create binaries that the debugger considers non-user code. You can disable the option Just My Code, but expect a severely degraded debugging experience.

You have a choice of:

  • Debug: debug|debug:full|debug:pdbonly optimize-
  • Release: debug-|no debug argument optimize+
  • DebugFriendlyRelease: debug+ optimize+

However, DebugFriendlyRelease is only possible by calling Roslyn csc.exe directly. I would be interested to hear from anyone that has been using this.

Addressing a simple yet common C# Async/Await misconception


Super happy to have won First Prize @ Codeproject for this article.

Git repository with example code discussed in this article.

Async/await has been part of C# since C# 5.0 yet many developers have yet to explore how it works under the covers, which I would recommend for any syntactic sugar in C#. I won’t be going into that level of detail now, nor will I explore the subtleties of IO and CPU bound operations.

The common misconception

That awaited code executes in parallel to the code that follows.

i.e. in the following code, LongOperation() is called and awaited and, while this is executing and before it has completed, the code ‘doing other things’ will start being executed.
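Reconstructed from the description (the method bodies are my own; Task.Delay stands in for the long IO-bound operation):

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitDemo
{
    public static bool LongOperationDone;

    public static async Task DemoAsyncAwait()
    {
        await WithAwaitAtCallAsync();
    }

    private static async Task WithAwaitAtCallAsync()
    {
        await LongOperation();  // suspends here; control returns to DemoAsyncAwait()

        // 'doing other things' – only runs once LongOperation() has completed
        Console.WriteLine($"doing other things (long operation done: {LongOperationDone})");
    }

    private static async Task LongOperation() // stand-in for IO-bound work
    {
        await Task.Delay(100);
        LongOperationDone = true;
    }

    public static void Main() => DemoAsyncAwait().Wait();
}
```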

This is not how it behaves.

In the above code, what actually happens is that the await operator causes WithAwaitAtCallAsync() to suspend at that line and return control back to DemoAsyncAwait() until the awaited task, LongOperation(), is complete.

When LongOperation() completes, then ‘do other things’ will be executed.

And if we don’t await when we call?

Then you do get the behaviour some developers innocently expect from awaiting the call, where LongOperation() is left to complete in the background while WithoutAwaitAtCallAsync() continues in parallel, ‘doing other things’:
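Again a reconstruction of the stripped example from its description:

```csharp
using System;
using System.Threading.Tasks;

public static class NoAwaitDemo
{
    public static async Task DemoAsyncAwait()
    {
        await WithoutAwaitAtCallAsync();
    }

    private static async Task WithoutAwaitAtCallAsync()
    {
        Task longOp = LongOperation();  // NOT awaited – left running in the background

        Console.WriteLine("doing other things");  // runs in parallel with LongOperation()

        await longOp;  // if LongOperation() isn't finished, suspend and yield control

        Console.WriteLine("more things to do");   // only once LongOperation() is complete
    }

    private static Task LongOperation() => Task.Delay(100); // stand-in for IO-bound work

    public static void Main() => DemoAsyncAwait().Wait();
}
```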

However, if LongOperation() is not complete when we reach the awaited Task it returned, then it yields control back to DemoAsyncAwait(), as above. It does not continue to complete ‘more things to do’ – not until the awaited task is complete.

Complete Console Application Example

Some notes about this code:

  • Always use await over Task.Wait() to retrieve the result of a background task (outside of this demo) to avoid blocking. I’ve used Task.Wait() in my demonstrations to force blocking and prevent the two separate demo results overlapping in time.
  • I have intentionally not used Task.Run() as I don’t want to confuse things with new threads. Let’s just assume LongOperation() is IO-bound.
  • I used Task.Delay() to simulate the long operation; Thread.Sleep() would block the thread.

This is what happens when the code is executed (with colouring):



If you use the await keyword when calling an async method from inside an async method, execution of the calling method is suspended to avoid blocking the thread, and control is passed (or yielded) back up the method chain. If, on its journey up the chain, it reaches a call that was not awaited, then code in that method is able to continue in parallel to the remaining processing in the chain of awaited methods, until it runs out of work to do and needs to await the result, which is inside the Task object returned by LongOperation().

.NET String Interning to Improve String Comparison Performance (C# examples)



String comparisons must be one of the most common things we do in C#, and string comparisons can be really expensive! So it’s worthwhile knowing the what, when and why of improving string comparison performance.

In this article I will explore one way – string interning.

What is string interning?

String interning refers to having a single copy of each unique string in a string intern pool, which is maintained via a hash table in the .NET common language runtime (CLR), where the key is a hash of the string and the value is a reference to the actual String object.

So if I have the same string occurring 100 times, interning will ensure only one occurrence of that string is actually allocated any memory. Also, when I wish to compare strings, if they are interned, then I just need to do a reference comparison.

String interning mechanics

In this example, I explicitly intern string literals just for demonstration purposes.
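Something like this (a reconstruction of the stripped snippet – the same literal is explicitly interned on each line):

```csharp
using System;

public static class InternMechanics
{
    public static void Main()
    {
        string s1 = string.Intern("stringy");   // line 1
        string s2 = string.Intern("stringy");   // line 2
        string s3 = string.Intern("stringy");   // line 3

        // All three variables reference the single interned copy
        Console.WriteLine(ReferenceEquals(s1, s2)); // True
        Console.WriteLine(ReferenceEquals(s2, s3)); // True
    }
}
```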

Line 1:

  • This new “stringy” is hashed and the hash is looked up in our pool (of course it’s not there yet)
  • A copy of the “stringy” object is made
  • A new entry is made in the interning hash table, with the key being the hash of “stringy” and the value being a reference to the copy of “stringy”
  • Assuming the application no longer references the original “stringy”, the GC can reclaim that memory

Line 2: This new “stringy” is hashed and the hash is looked up in our pool (where we just put it). The reference to the “stringy” copy is returned.
Line 3: Same as line 2

Interning depends on string immutability

Take a classic example of string immutability:
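A sketch of that example:

```csharp
using System;

public static class ImmutabilityExample
{
    public static void Main()
    {
        string s1 = "stringy";   // line 1
        s1 += " new string";     // line 2: the original "stringy" object is de-referenced;
                                 //         s1 now references a brand new String object

        Console.WriteLine(s1);   // stringy new string
    }
}
```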

We know that when line 2 is executed, the “stringy” object is de-referenced and left for garbage collection; and s1 then points to a new String object “stringy new string”.

String interning works because the CLR knows, categorically, that the String object referenced cannot change. Here I’ve added a fourth line to the earlier example:
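With the fourth line added (again my reconstruction):

```csharp
using System;

public static class InternImmutability
{
    public static void Main()
    {
        string s1 = string.Intern("stringy");   // line 1
        string s2 = string.Intern("stringy");   // line 2
        string s3 = string.Intern("stringy");   // line 3
        s1 += " new string";                    // line 4: s1 now references a new object

        Console.WriteLine(ReferenceEquals(s2, s3)); // True – still the interned copy
        Console.WriteLine(ReferenceEquals(s1, s2)); // False – s1 has moved on
    }
}
```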

Line 4: s1 doesn’t change because strings are immutable; it now points to a new String object “stringy new string”.
s2 and s3 will still safely reference the copy that was made at line 1.

Using Interning in the .NET CLR

You’ve already seen a bit in the examples above. .NET offers two static string methods:

Intern(String str)

It hashes string str and checks the intern pool hash table and either:

  • returns a reference to the (already) interned string, if interned; or
  • adds a reference to str to the intern pool and returns that reference

IsInterned(String str)

It hashes string str and checks the intern pool hash table. Rather than a standard bool, it returns either:

  • null, if not interned
  • a reference to the (already) interned string, if interned
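A quick sketch of both methods in action (building the string at runtime is my own device to guarantee it isn’t already interned as a literal):

```csharp
using System;

public static class InternMethodsExample
{
    public static bool Run()
    {
        // Built at runtime, so not automatically interned the way a literal would be.
        // Note: this is only true the first time it runs in a process.
        string runtime = string.Concat(123.ToString(), "-built-at-runtime");

        bool notPooledYet = string.IsInterned(runtime) == null;  // true: not in the pool

        string interned = string.Intern(runtime);                // adds it to the pool

        bool nowPooled = ReferenceEquals(string.IsInterned(runtime), interned); // true

        return notPooledYet && nowPooled;
    }

    public static void Main() => Console.WriteLine(Run());
}
```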

String literals (but not strings generated in code) are automatically interned, although CLR versions have varied in this behaviour, so if you expect strings to be interned, it is best to always intern them explicitly in your code.

My simple test: Setup

I’ve run some timed tests on my aging i5 Windows 10 PC at home to provide some numbers to help explore the potential gains from string interning. I used the following test setup:

  • Strings randomly generated
  • All string comparisons are ordinal
  • Strings are all the same length of 512 characters, as I want the CLR to compare every character, forcing the worst case (the CLR checks string length first for ordinal comparisons)
  • The string added last (so to the end of the List<T>) is also stored as a ‘known’ search term. This is because I am only exploring the worst-case approach
  • For the interned List<T>, every string added to the list, and also the search term, was passed through the string.Intern(String str) method

I timed how long populating each collection took, and then how long searching for the known search term took, to the nearest millisecond.

The collections/approaches used for my tests:

  • List<T> with no interning used, searched via a foreach loop and string.Equals on each element
  • List<T> with no interning used, searched via the Contains method
  • List<T> with interning used, searched via a foreach loop and object.ReferenceEquals
  • HashSet<T>, searched via the Contains method

The main objective is to observe string search performance gains from using string interning with List<T>; HashSet<T> is just included as a baseline for known fast searches.
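The interned List<T> search (the third approach) boils down to reference comparisons – a sketch of the idea, not the full test harness:

```csharp
using System;
using System.Collections.Generic;

public static class InternedSearch
{
    public static bool ContainsByReference(List<string> internedStrings, string internedTerm)
    {
        // Every element, and the term, have been through string.Intern, so equality
        // is just a reference comparison – no character-by-character work at all
        foreach (string s in internedStrings)
        {
            if (ReferenceEquals(s, internedTerm))
                return true;
        }
        return false;
    }

    public static void Main()
    {
        var list = new List<string>();
        for (int i = 0; i < 1_000; i++)
            list.Add(string.Intern(Guid.NewGuid().ToString()));

        // the 'known' search term is the last string added
        string term = string.Intern(list[list.Count - 1]);

        Console.WriteLine(ContainsByReference(list, term)); // True
    }
}
```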

My simple test: Results

In Figure 1 below, I have plotted the size of collections, in number of strings added, against the time taken to add that number of randomly generated strings. Clearly there is no significant difference in this operation between a HashSet<T> and a List<T> without interning. Perhaps if I had run the tests with larger sets, the gap would have widened further, based on the trend?

There is slightly more overhead when populating the List<T> with string interning; this is initially of no consequence, but it grows faster than the other options.

Figure 1: Populating List<T> and HashSet<T> collections with random strings

Figure 2 shows the times for searching for the known string. All the times are pretty small for these collection sizes, but the growth rates are clear.

Figure 2: Times taken searching for a known string, which was added last

As expected, HashSet<T> is O(1) and the others are O(N). The searches that do not utilise interning are clearly growing in time taken at a greater rate.


HashSet<T> is present in this article only as a baseline for fast searching and should obviously be your choice for speed where there are no constraints requiring a List<T>.

In scenarios where you must use a List<T> such as where you still wish to enumerate quickly through a collection, there are some performance gains to be had from using string interning, with benefits increasing as the size of the collection grows. The drawback is the slightly increased populating overhead (although I think it is fair to suggest that most real-world use cases would not involve populating the entire collection in one go).

The utility and behaviour of string interning reminds me of database indexes: it takes a bit longer to add a new item, but that item will be faster to find. So perhaps the same considerations that apply to database indexes are true for string interning?

There is also the added bonus that string interning will prevent any duplicate strings being stored, which in some scenarios could mean substantial memory savings.

Potential benefits:

  • Faster searching via object references
  • Reduced memory usage because duplicate interned strings will only be stored once

Potential performance hit:

  • Memory referenced by the intern hash table is unlikely to be released until the CLR terminates
  • You still need to create the string to be interned, which will be allocated memory (granted, this will then be garbage collected)


  • https://msdn.microsoft.com/en-us/library/system.string.intern.aspx

Erratic Behaviour from .NET MemoryCache Expiration Demystified


On a recent project I experienced first-hand how the .NET MemoryCache class, when used with either absolute or sliding expiration, can produce some unpredictable and undocumented results.

Sometimes cache items expire exactly when expected… yay. But mostly, they expire an arbitrary period of time late.

For example, a cache item with an absolute expiry of 5 seconds might expire after 5 seconds but could just as likely take up to a further 20 seconds to expire.

This might only be significant where precision down to a few seconds is required (such as where I have used it to buffer/throttle FileSystemWatcher events), but I thought it would be worthwhile decompiling System.Runtime.Caching.dll and then clearly documenting the behaviour we can expect.

When does a cache item actually expire?

There are 2 ways your expired item can leave the cache:

  • Every 20 seconds, on a Timer, the cache passes through all items and flushes out anything past its expiry
  • Whenever an item is accessed, its expiry is checked and that item will be removed if expired

This goes for both absolute and sliding expiration. The timer is enabled as soon as anything is added to the cache, whether or not it has an expiration set.

Note that this is all about observable behaviour, as witnessed by the bemused debugger, because once an item has passed its expiry you can no longer access it anyway – see point 2 above, where accessing an item forces a flush.

Just as weird with Sliding Expiration…

Sliding expiration is where an expiration time is set, just as with absolute expiration, but if the item is accessed the timer is reset back to the configured expiration length.

  • If the new expiry is not at least 1 second longer than the current (remaining) expiry, it will not be updated/reset on access

Essentially, this means that while you can add to the cache with a sliding expiration of <= 1 second, there is no chance of any access causing the expiration to reset.
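A minimal sketch of setting sliding expirations with System.Runtime.Caching (times are illustrative; on .NET Core/.NET 5+ this requires the System.Runtime.Caching NuGet package):

```csharp
using System;
using System.Runtime.Caching;

class SlidingDemo
{
    static void Main()
    {
        ObjectCache cache = MemoryCache.Default;

        // Each successful Get within the window pushes the expiry another
        // 5 seconds into the future...
        cache.Set("normal", "value", new CacheItemPolicy
        {
            SlidingExpiration = TimeSpan.FromSeconds(5)
        });

        // ...but with a sliding expiration of <= 1 second, the internal
        // MIN_UPDATE_DELTA rule described above means no access can extend the
        // lifetime: the new expiry is never at least 1 second beyond the
        // remaining one.
        cache.Set("short", "value", new CacheItemPolicy
        {
            SlidingExpiration = TimeSpan.FromSeconds(1)
        });

        Console.WriteLine(cache.Get("normal")); // value
        Console.WriteLine(cache.Get("short"));  // value
    }
}
```
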

Note that if you ever feel the urge to avoid triggering a reset on sliding expiration, you could do this by boxing up values and getting/setting via the reference to the object instead.

Conclusion / What’s so bewildering?

In short, it is undocumented behaviour and a little unexpected.

Consider the 20-second timer and the 5-second absolute expiry example. When the item is actually removed from the cache will depend on where we are in the 20-second Timer cycle; it could be up to an additional 20 seconds before the timer fires, giving a potential total of ~25 seconds between the item being added and actually being flushed.

Add to this the additional confusion you’ll come across while debugging, caused by items past their expiry time being flushed whenever they are accessed, and it’s easy to get caught out – it has even troubled the great Troy Hunt: https://twitter.com/troyhunt/status/766940358307516416. Granted, he was using ASP.NET caching, but the core is pretty much the same, as System.Runtime.Caching was just modified for general .NET usage.

Decompiling System.Runtime.Caching.dll

Some snippets from the .NET FCL for those wanting a peek at the inner workings themselves.


FlushExpiredItems is called from the TimerCallback (on the 20-second Timer) and can also be triggered manually via the MemoryCache method Trim. There must be an interval of >= 1 second between flushes.

Love the goto – so retro. EDIT: Eli points out that it might just be my decompiler!


UpdateSlidingExp updates/resets sliding expiration. Note the limit MIN_UPDATE_DELTA of 1 sec.


See how code accessing a cached item will trigger a check on its expiration and if expired, remove it from the cache.

FileSystemWatcher vs locked files


Git repository with example code discussed in this article.

Another Problem with FileSystemWatcher

You’ve just written your nice shiny new application to monitor a folder for new files arriving and added code to send that file off somewhere else and delete it. Perhaps you even spent some time packaging it in a nice Windows Service. It probably behaved well during debugging. You move it into a test environment and let the manual testers loose. They copy a file in, your eager file watcher spots the new file as soon as it starts writing, does the funky stuff and BANG!… an IOException:

The process cannot access the file X because it is being used by another process.

The copying had not finished before the event fired. It doesn’t even have to be a large file as your super awesome watcher is just so efficient.

A Solution

  1. When a file system event occurs, store its details in MemoryCache for X amount of time
  2. Set up a callback, which will execute when the event expires from MemoryCache
  3. In the callback, check the file is available for write operations
  4. If it is, then get on with the intended file operation
  5. Else, put it back in MemoryCache for X time and repeat the above steps

It would make sense to track and limit the number of retry attempts to get a lock on the file.
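A common way to perform the availability check in step 3 is to attempt an exclusive open and treat an IOException as ‘still locked’. A sketch (the helper name is my own, not from the example repository):

```csharp
using System.IO;

static class FileLockProbe
{
    // Returns true when the file can be opened for exclusive read/write,
    // i.e. no other process (such as an in-progress copy) still holds it.
    public static bool IsAvailable(string path)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            return false; // still locked – caller should re-cache and retry
        }
    }
}
```
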

Code Snippets

I’ve built this on top of the code discussed in a previous post on dealing with multiple FileSystemWatcher events.

Complete code for this example solution here

When a file system event is handled, store the file details and the retry count, using a simple POCO, in MemoryCache with an expiration of something like 60 seconds:

A simple POCO:

In the constructor, I initialised my cache item policy with a callback to execute when these cached POCOs expire:

The callback itself…

  1. Increment the number of retries
  2. Try to get a lock on the file
  3. If it is still locked, put it back into the cache for another 60 seconds (repeat this up to MaxRetries times)
  4. Else, get on with the intended file operation
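A sketch of what such a callback might look like (type and member names are illustrative and may differ from those in the linked repository):

```csharp
using System;
using System.IO;
using System.Runtime.Caching;

// Simple POCO holding the file path and how many times we have retried.
class CachedFileEvent
{
    public string FilePath { get; set; }
    public int RetryCount { get; set; }
}

class RetryingHandler
{
    const int MaxRetries = 3;
    static readonly TimeSpan CacheWindow = TimeSpan.FromSeconds(60);
    readonly MemoryCache _cache = MemoryCache.Default;

    void OnCacheItemRemoved(CacheEntryRemovedArguments args)
    {
        if (args.RemovedReason != CacheEntryRemovedReason.Expired) return;

        var item = (CachedFileEvent)args.CacheItem.Value;
        item.RetryCount++; // step 1

        if (IsFileLocked(item.FilePath) && item.RetryCount < MaxRetries)
        {
            // Step 3: still locked – put it back for another 60 seconds.
            _cache.Add(args.CacheItem.Key, item, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.Add(CacheWindow),
                RemovedCallback = OnCacheItemRemoved
            });
        }
        else
        {
            ProcessFile(item.FilePath); // step 4: the intended file operation
        }
    }

    // Step 2: an exclusive open attempt tells us whether the file is locked.
    static bool IsFileLocked(string path)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
                return false;
        }
        catch (IOException) { return true; }
    }

    static void ProcessFile(string path) { /* send it off somewhere, delete, etc. */ }
}
```
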

Other ideas / To do….

  • Could also store the actual event object
  • Could explore options for non-volatile persistence
  • Might find sliding expiration more appropriate in some scenarios

A robust solution for FileSystemWatcher firing events multiple times


Git repository with example code discussed in this article.

The Problem

FileSystemWatcher is a great little class to take the hassle out of monitoring activity in folders and files but, through no real fault of its own, it can behave unpredictably, firing multiple events for a single action.

Note that in some scenarios, like the example used below, the first event will be the start of the file writing and the second event will be the end, which, while not documented behaviour, is at least predictable. Try it with a very large file to see for yourself.

However, FileSystemWatcher cannot make any promises to behave predictably for all OS and application behaviours. See also, MSDN documentation:

Common file system operations might raise more than one event. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events. Likewise, some applications (for example, antivirus software) might cause additional file system events that are detected by FileSystemWatcher.

Example: Recreating a file edit in Notepad firing 2 events

As stated above, we know that the 2 events from this action mark the start and end of a write, meaning we could just focus on the second if we had full confidence this would be the consistent behaviour. For the purposes of this article, it makes for a convenient example to recreate.

If you edited a text file in c:\temp, you would get 2 events firing.

Complete Console applications for both available on Github.

A robust solution

Good use of NotifyFilter (see my post on how to select NotifyFilters) can help but there are still plenty of scenarios, like those above, where additional events will still get through for a file system event.

I worked on a nice little idea for utilising MemoryCache as a buffer to ‘throttle’ additional events.

  1. A file event ( Changed in the example below) is triggered
  2. The event is handled by OnChanged. But instead of completing the desired action, it stores the event in MemoryCache with a 1-second expiration and a CacheItemPolicy callback set up to execute on expiration.
  3. When it expires, the callback OnRemovedFromCache completes the behaviour intended for that file event.

Note that I use AddOrGetExisting as a simple way to block any additional events that fire within the cache period from being added to the cache.
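The whole pattern might be sketched like this (names are illustrative and may differ from the linked example code):

```csharp
using System;
using System.IO;
using System.Runtime.Caching;

class BufferedWatcher
{
    readonly MemoryCache _cache = MemoryCache.Default;
    readonly FileSystemWatcher _watcher;

    public BufferedWatcher(string path)
    {
        _watcher = new FileSystemWatcher(path);
        _watcher.Changed += OnChanged;
        _watcher.EnableRaisingEvents = true;
    }

    void OnChanged(object sender, FileSystemEventArgs e)
    {
        var policy = new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(1),
            RemovedCallback = OnRemovedFromCache
        };

        // AddOrGetExisting returns the existing entry (and adds nothing) when
        // the key is already cached, so duplicate events for the same file
        // within the 1-second window are simply swallowed.
        _cache.AddOrGetExisting(e.FullPath, e, policy);
    }

    void OnRemovedFromCache(CacheEntryRemovedArguments args)
    {
        if (args.RemovedReason != CacheEntryRemovedReason.Expired) return;

        // One callback per file per window: do the intended work here.
        var e = (FileSystemEventArgs)args.CacheItem.Value;
        Console.WriteLine($"Handling single event for {e.FullPath}");
    }
}
```
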

If you’re looking to handle multiple different events from a single/unique file, such as Created and Changed, then you could key the cache on the file name and event name, concatenated.