Everything is built on layers of abstraction. But how well do you understand the abstractions of the tools, libraries, and frameworks you’re using? We know everything comes with trade-offs, and being informed about the underlying implementation can help you make better decisions.
I posted a video, Refactoring code a better design, where I changed a call from First() to Single(). If you’re unfamiliar with LINQ or C#, First() is an extension method on IEnumerable<T> that returns the first element. Single() returns the only element that matches the specified condition and throws an exception if no element matches or if more than one does.
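To make the difference concrete, here is a minimal sketch against plain in-memory arrays. The class name, helper, and sample values are made up for illustration and are not taken from the video.

```csharp
using System;
using System.Linq;

public static class FirstVersusSingle
{
    // Illustrative helper: returns true when Single() rejects the sequence.
    public static bool SingleThrows(int[] source, int value)
    {
        try
        {
            source.Single(x => x == value);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int[] unique = { 10, 20, 30 };
        int[] duplicated = { 10, 20, 20, 30 };

        // Exactly one match: both calls return the same element.
        Console.WriteLine(unique.First(x => x == 20));
        Console.WriteLine(unique.Single(x => x == 20));

        // Two matches: First() quietly returns the first one...
        Console.WriteLine(duplicated.First(x => x == 20));

        // ...while Single() throws InvalidOperationException.
        Console.WriteLine(SingleThrows(duplicated, 20)); // True

        // No match at all: Single() throws as well.
        Console.WriteLine(SingleThrows(unique, 99));     // True
    }
}
```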
(Some) people have advocated for calling First() because it’s “faster”. I put “faster” in quotes because that depends on the underlying implementation of First() that you’re using. This is the issue with “best practices”: they are often taken as blanket statements without considering the trade-offs or understanding the context in which they apply.
Here’s an example that’s using Entity Framework Core to query a MySQL Database.
In this example, the primary key is the OrderId. The “best practice” is to call First() or FirstOrDefault() because it’s “faster”. Why? Because First() only needs to find the first matching element and doesn’t need to check whether there’s more than one. That would make sense if you were calling First() or Single() against an in-memory collection, but you’re not. This is against a database, and that matters.
I’ve created some tests to illustrate the SQL generated and the execution time for comparison. First, here’s a TestDbContext using EF Core that logs the generated SQL to the console.
Here are two tests to illustrate the differences between Single() and First():
Single() generated the following SQL:
You can see that the LIMIT 2 is there so that a duplicate can be detected: if the query returns more than one record, Single() throws an exception.
As you can guess, First() generates the following SQL:
It uses a LIMIT 1 because it’s only concerned with getting back one record from the database. If there’s more than one match, it doesn’t matter.
With EF Core’s implementation, you are fetching at most 2 records from the database if you call Single(). If you call First(), the query returns at most 1 record. So the difference between the two in terms of performance boils down to returning 1 or 2 records at most.
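The idea behind LIMIT 2 versus LIMIT 1 can be sketched in memory with Take(). This is not EF Core’s actual code; SingleViaTakeTwo and FirstViaTakeOne are hypothetical helpers that only mirror the logic: two rows are enough to tell “exactly one” from “more than one”, and one row is enough for First().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LimitTwoSketch
{
    // Hypothetical helper, not EF Core's implementation: fetch at most two
    // matching rows (like LIMIT 2), then decide based on how many came back.
    public static T SingleViaTakeTwo<T>(IEnumerable<T> rows, Func<T, bool> predicate)
    {
        List<T> fetched = rows.Where(predicate).Take(2).ToList();

        return fetched.Count switch
        {
            0 => throw new InvalidOperationException("Sequence contains no matching element"),
            1 => fetched[0],
            _ => throw new InvalidOperationException("Sequence contains more than one matching element"),
        };
    }

    // Hypothetical helper: First() only ever needs one row (like LIMIT 1).
    public static T FirstViaTakeOne<T>(IEnumerable<T> rows, Func<T, bool> predicate)
    {
        List<T> fetched = rows.Where(predicate).Take(1).ToList();

        return fetched.Count == 1
            ? fetched[0]
            : throw new InvalidOperationException("Sequence contains no matching element");
    }

    public static void Main()
    {
        int[] orderIds = { 1, 2, 3 };

        Console.WriteLine(SingleViaTakeTwo(orderIds, id => id == 2)); // 2
        Console.WriteLine(FirstViaTakeOne(orderIds, id => id == 2));  // 2
    }
}
```

Either way, the extra cost of Single() is bounded by one additional row, no matter how large the table is.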
But hang on, as I mentioned, given my understanding of this database and knowing that OrderId is unique (and my primary key), I know there’s only going to be one record. So what is the difference between calling First() and Single()?
In terms of performance, they both produce the exact same query plan. Why? Because the lookup uses the primary key, which we’re aware of.
I seeded a database with a million records and then ran some benchmarks. (Note: I removed the console logging from the sample above, so no SQL is output during the benchmark runs.)
Understand the Abstraction & Implementation
Is First() faster? Yes, it is, barely. However, is it worth the trade-off? Everything has trade-offs. Calling First() is explicitly saying in code, “give me the first record”. But what happens if there’s more than one record? Nothing, you’d never know. First() is masking a potential data issue. Single() is explicit in the code telling you, “there should only be one element/record in this collection that I’m trying to find”. If there’s more than one, an exception goes off to uncover the data issue.
First() on IEnumerable<T> is a totally different implementation from what EF Core uses. If you have a large in-memory collection and you need to squeeze out performance for a given situation, maybe it is worth calling First() even though you know there should be a single element.
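For the in-memory case, one rough way to see the difference is to count how often the predicate runs. The sketch below assumes the only match sits at the front of the array; PredicateCounter and CountEvaluations are illustrative names, and this is a counting exercise, not a benchmark.

```csharp
using System;
using System.Linq;

public static class PredicateCounter
{
    // Returns how many times the predicate ran for First() and for Single()
    // when the only match is the very first element of an in-memory array.
    public static (int First, int Single) CountEvaluations(int size)
    {
        int[] items = Enumerable.Range(1, size).ToArray();

        int firstCalls = 0;
        items.First(x => { firstCalls++; return x == 1; });

        int singleCalls = 0;
        items.Single(x => { singleCalls++; return x == 1; });

        return (firstCalls, singleCalls);
    }

    public static void Main()
    {
        var (first, single) = CountEvaluations(1_000_000);

        Console.WriteLine($"First():  {first} predicate call(s)");
        Console.WriteLine($"Single(): {single} predicate call(s)");
    }
}
```

First() can stop at the first match, while Single() has to keep scanning to prove there is no second one, so its call count grows with the size of the collection. That unbounded scan is what the in-memory advice is about, and it does not carry over to a query capped at two rows.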
But the “best practice” or rule of thumb that you should use First() over Single() is absurd. Understand the implementation of that abstraction. How it behaves on an IEnumerable<T> is very different from how it behaves against an IQueryable<T> from EF Core.
As another example, say you’re querying a database with EF Core, but not against a primary key or a unique index. Let’s say you have an “Active” column, where only one row within a given subset of rows should have a true value. You cannot put a unique constraint on this and have it enforced by the database.
This is exactly the case where your knowledge says there should only be one active record, and Single() will expose it if data is being persisted incorrectly by some other means and causing bad data. Would you rather have a logged exception that leads you to the root cause of the bad data, or have more bad data or unknowns occur because you’re only returning the first record? And what if you have different ordering in different places? You might not be getting the same record each time you call First().
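Here is a small in-memory sketch of that scenario. The PriceList record and its rows are hypothetical stand-ins for a table with an “Active” column; against a real database the same calls would go through EF Core instead of a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for a table with an "Active" column.
public sealed record PriceList(int Id, string Name, bool Active);

public static class ActiveRowCheck
{
    // Hypothetical data: two rows are flagged Active even though the
    // business rule says only one should be.
    public static readonly IReadOnlyList<PriceList> Rows = new[]
    {
        new PriceList(1, "Spring", Active: true),
        new PriceList(2, "Summer", Active: false),
        new PriceList(3, "Autumn", Active: true), // bad data
    };

    public static void Main()
    {
        // First() returns a different "active" row depending on the ordering.
        Console.WriteLine(Rows.OrderBy(r => r.Id).First(r => r.Active).Name);           // Spring
        Console.WriteLine(Rows.OrderByDescending(r => r.Id).First(r => r.Active).Name); // Autumn

        // Single() refuses to guess and surfaces the problem instead.
        try
        {
            Rows.Single(r => r.Active);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"Bad data detected: {ex.Message}");
        }
    }
}
```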
Understand your abstractions and the implementation of those abstractions so you can make better decisions with an understanding of the trade-offs.