This is a part of a series of blog posts on data access with Dapper. To see the full list of posts, visit the Dapper Series Index Page.
In today's post, we explore paging through large result sets. Paging is a common technique that is used when dealing with large results sets. Typically, it is not useful for an application to request millions of records at a time because there is no efficient way to deal with all those records in memory all at once. This is especially true when rendering data on a grid in a user interface. The screen can only display a limited number of records at a time so it is generally a bad use of system resources to hold everything in memory when only a small subset of those records can be displayed at any given time.
Modern versions of SQL Server support the OFFSET / FETCH clause to implement query paging.
In continuing with our airline theme, consider a Flight entity. A Flight represents a particular occurrence of a ScheduledFlight on a particular day. That is, it has a reference to the ScheduledFlight along with some properties indicating the scheduled arrival and departure times.
1 2 3 4 5 6 7 8 9
publicclassFlight { publicint Id {get; set;} publicint ScheduledFlightId {get; set;} public ScheduledFlight ScheduledFlight { get; set;} public DateTime Day {get; set;} public DateTime ScheduledDeparture {get; set;} public DateTime ScheduledArrival {get; set;} }
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
publicclassScheduledFlight { publicint Id {get; set;} publicstring FlightNumber {get; set;}
publicbool IsSundayFlight {get; set;} publicbool IsMondayFlight {get; set;} // Some other properties }
Writing the query
As we learned in a previous post, we can load the Flight entity along with it's related ScheduledFlight entity using a technique called multi-mapping.
In this case, loading all the flights to or from a particular airport, we would use the following query.
1 2 3 4 5 6
SELECT f.*, sf.* FROM Flight f INNERJOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNERJOIN Airport a ON sf.ArrivalAirportId = a.Id INNERJOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode
But this query could yield more results than we want to deal with at any given time. Using OFFSET/FETCH, we can ask for only a block of results at a time.
1 2 3 4 5 6 7 8 9
SELECT f.*, sf.* FROM Flight f INNERJOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNERJOIN Airport a ON sf.ArrivalAirportId = a.Id INNERJOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode ORDERBY f.Day, sf.FlightNumber OFFSET @OffsetROWS FETCHNEXT @PageSize ROWSONLY
Note that an ORDER BY clause is required when using OFFSET/FETCH.
//GET api/flights [HttpGet] publicasync Task<IEnumerable<Flight>> Get(string airportCode, int page=1, int pageSize=10) { IEnumerable<Flight> results;
using (var connection = new SqlConnection(_connectionString)) { await connection.OpenAsync(); var query = @" SELECT f.*, sf.* FROM Flight f INNER JOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNER JOIN Airport a ON sf.ArrivalAirportId = a.Id INNER JOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode ORDER BY f.Day, sf.FlightNumber OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY; ";
Here we calculate the offset by based on the page and pageSize arguments that were passed in. This allows the caller of the API to request a particular number of rows and the starting point.
One step further
When dealing with paged result sets, it can be useful for the caller of the API to also know the total number of records. Without the total number of records, it would be difficult to know how many records are remaining which in turn makes it difficult to render a paging control, a progress bar or a scroll bar (depending on the use case).
A technique I like to use here is to have my API return a PagedResults<T> class that contains the list of items for the current page along with the total count.
1 2 3 4 5
public class PagedResults<T> { public IEnumerable<T> Items { get; set; } publicint TotalCount { get; set; } }
To populate this using Dapper, we can add a second result set to the query. That second result set will simply be a count of all the records. Note that the same WHERE clause is used in both queries.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
SELECT f.*, sf.* FROM Flight f INNERJOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNERJOIN Airport a ON sf.ArrivalAirportId = a.Id INNERJOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode ORDERBY f.Day, sf.FlightNumber OFFSET @OffsetROWS FETCHNEXT @PageSize ROWSONLY;
SELECTCOUNT(*) FROM Flight f INNERJOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNERJOIN Airport a ON sf.ArrivalAirportId = a.Id INNERJOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode
Now in our code that executes the query, we will the QueryMultipleAsync method to execute both SQL statements in a single round trip.
//GET api/flights [HttpGet] publicasync Task<PagedResults<Flight>> Get(string airportCode, int page=1, int pageSize=10) { var results = new PagedResults<Flight>();
using (var connection = new SqlConnection(_connectionString)) { await connection.OpenAsync(); var query = @" SELECT f.*, sf.* FROM Flight f INNER JOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNER JOIN Airport a ON sf.ArrivalAirportId = a.Id INNER JOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode ORDER BY f.Day, sf.FlightNumber OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY; SELECT COUNT(*) FROM Flight f INNER JOIN ScheduledFlight sf ON f.ScheduledFlightId = sf.Id INNER JOIN Airport a ON sf.ArrivalAirportId = a.Id INNER JOIN Airport d ON sf.DepartureAirportId = d.Id WHERE a.Code = @AirportCode OR d.Code = @AirportCode ";
Paged result sets is an important technique when dealing with large amounts of data. When using a full ORM like Entity Framework, this is implemented easily using LINQ's Skip and Take methods. It's so easy in fact that it can look a little like magic. In reality, it is actually very simple to write your own queries to support paged result sets and execute those queries using Dapper.