How To Improve Query Performance Using EF Core Query Splitting
In the market for optimizing your database queries using EF Core? In this tutorial I will show you a practical example of using Query Splitting, the results in SQL and give you a heads up before implementing it in your application.
I had a request from a client the other day asking me if there was any way to make their EF Core queries faster? Always a question I would answer maybe to... 😅
We had a long discussion about database resources, network performance, how the setup was hosted, and how the code was made up.
I concluded that scaling up the resources on the database server wouldn't help in terms of performance and would also be a more expensive solution for the client.
Instead, I started testing the queries in their repositories and found out that they were having some heavy queries with multiple .Include
and .ThenInclude
to get child data. The problem with this is that it can be quite heavy for the database in one call.
So how can we fix that? Well... EF Core Query Splitting is the answer to your question. ✌️
What is EF Core Query Splitting?
Entity Framework Core 5.0 introduced a new feature called Query Splitting. This feature allows you to split a single query that includes related data into multiple queries, thus potentially improving performance by reducing the amount of redundant data retrieved from the database. 🚀
Here's an example of how you can use query splitting in your application.
var orders = context.Orders
.Include(o => o.OrderDetails)
.AsSplitQuery()
.ToList();
In this example, without AsSplitQuery()
, EF Core would generate a single SQL query with a JOIN
to retrieve the orders and their related order details. This can result in a lot of redundant data if there are many order details for each order, and we would like to avoid that.
With AsSplitQuery()
, EF Core generates two SQL queries.
- One to retrieve the orders
- Second to retrieve the order details.
The results are then combined in memory. This can reduce the amount of data retrieved from the database and improve performance, especially when there are many related entities.
A Pragmatic Practical Example
A good example of optimization using query splitting in Entity Framework Core might be when you have a one-to-many relationship with a large number of related entities. For instance, consider a Blog
entity that has many Post
entities, like the model below.
public class Blog
{
public int BlogId { get; set; }
public string Url { get; set; }
public List<Post> Posts { get; set; }
}
public class Post
{
public int PostId { get; set; }
public string Title { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public Blog Blog { get; set; }
}
If you want to retrieve all blogs along with their posts, you might initially write a query like this.
var blogs = context.Blogs
.Include(b => b.Posts)
.ToList();
Without query splitting, this would result in a single SQL query with a JOIN, which could retrieve a lot of redundant data if each blog has many posts. This is the query made from EF core to your database in raw SQL.
SELECT [b].[BlogId], [b].[Url], [p].[PostId], [p].[Title], [p].[Content], [p].[BlogId]
FROM [Blogs] AS [b]
LEFT JOIN [Posts] AS [p] ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId], [p].[PostId]
By using query splitting, you can split this into two queries, as I mentioned earlier. Here is how that is accomplished.
var blogs = context.Blogs
.Include(b => b.Posts)
.AsSplitQuery()
.ToList();
This can reduce the amount of data retrieved from the database and improve performance, especially when there are many posts for each blog in your database. Let's take a look at what that would look like in raw SQL during runtime.
-- Query for loading Blogs
SELECT [b].[BlogId], [b].[Url]
FROM [Blogs] AS [b]
-- Query for loading Posts
SELECT [p].[PostId], [p].[Title], [p].[Content], [p].[BlogId]
FROM [Posts] AS [p]
WHERE EXISTS (
SELECT 1
FROM [Blogs] AS [b]
WHERE [p].[BlogId] = [b].[BlogId])
ORDER BY [p].[BlogId]
As you can see the queries are now split into two separate queries just like we wanted them to be.
Enabling Query Splitting For All Queries In App
Well, you might come across an application where you will encounter hundreds of queries like the ones I showed above. If you know that you would like to use this on all of the queries, there is an easy way of enabling/adding query splitting for all EF Core queries in your application.
Enabling query splitting globally is done at the database context level when you configure it. All you have to do is call the UseQuerySplittingBehavior()
method as shown below.
services.AddDbContext<AppDbContext>(options =>
options.UseSqlServer(
"default-connection",
o => o.UseQuerySplittingBehavior(
QuerySplittingBehavior.SplitQuery)));
Now all queries with .Include()
and .ThenInclude()
will be split using query splitting.
Disable Global Query Splitting On Some Queries
Imagine having lots of queries and you have enabled the global query splitting feature for your application, but a few of them should not do it.
Instead of adding .AsSplitQuery()
on all your queries, you can use the global adder and then use .AdSingleQuery()
on the queries where you would like to have everything done in one query to the database.
Here is an example for your reference.
var blogs = context.Blogs
.Include(b => b.Posts)
.AsSingleQuery()
.ToList();
That would result in EF Core avoiding splitting the query even though it's globally enabled for the application.
The Catches
Alright.. now you know how to enable this simple, but pretty awesome EF core feature. Before you just rush out there and do it everywhere, because what can possibly go wrong, it's making the performance better?! 😅
I would like to introduce a few things you should be aware of.
- You will have no consistency guarantee when using multiple SQL queries.
- There is a possible chance that you might encounter issues if your application is performing concurrent updates.
The solution to the above? ⬆️ — You can wrap the queries into one transaction, but please don't as you would get a headick later in terms of performance. (I should say that from a friend 😆).
Ensure the network connection to and from your database is of high performance as splitting queries results in multiple calls and requires the database to handle more transactions, resulting in multiple results being transferred over the network.
Summary
Entity Framework Core Query Splitting is a nice addition to the toolbox when talking performance and would like to make the application faster.
Please read the section "The Catches" above as there are two warnings I would like you to take into consideration before you go ahead and implement it. Make sure your application is not affected by that, then you are safe!
With your new knowledge about EF Core, I can only say one thing - Go optimize those database queries! If you have any questions, please let me know in the comments below. Until next time - Happy coding! ✌️
My name is Christian. I am a 28-year-old Solution Architect & Software Engineer with a passion for .NET, Cloud, and Containers. I love to share my knowledge and teach other like-minded about tech.