Max Musing

📄 How we paginate queries across every SQL dialect

We had a comment in our codebase that said SQL Server doesn't handle pagination well, so we should just avoid pagination for now. And it did exactly that: shipped the entire result set to the browser and let it sort itself out. Same deal for Spanner.

A lot of Basedash is just "run the SQL the user wrote, show the rows." We support 10+ dialects (Postgres, MySQL, BigQuery, Snowflake, Athena, SQL Server, Spanner, and friends), and every query gets paginated because we can't stream a 12M row result set to a browser.

For years the approach was wrap-and-pray. Take whatever the user wrote, stuff it in a subquery, slap pagination on the outside:

SELECT * FROM (
  -- whatever the user wrote
) AS t
LIMIT 100 OFFSET 200;

The nice thing about this is the query stays opaque. You don't have to understand its grammar to paginate it, which is exactly why it shipped fast and lasted years.

It works great on the easy dialects but falls apart on the annoying ones. SQL Server has no LIMIT at all. Athena wants OFFSET before LIMIT, the opposite of Postgres. Spanner won't preserve ordering through a subquery, so you get 100 rows in some order, just not the one anybody asked for. And if the user already wrote their own LIMIT, page 2 came back empty.

So we ended up with a pile of provider-specific overrides taped onto a leaky abstraction.

The fix was to parse instead of wrap.

We pulled in a multi-dialect SQL parser, turn each query into an AST, and attach LIMIT/OFFSET directly to the root statement. That deleted 4 provider-specific overrides, got SQL Server and Spanner paginating for the first time, and cleaned up every stack trace that used to mention a mystery "t" alias. New dialects get correct pagination by adding 1 line to a switch.

What I keep thinking about is that the wrapper was the right call for a really long time. Parsing only won once a good multi-dialect parser was something we could install instead of build.

Curious how other people handle this. Do you parse user SQL or wrap it? And if you wrap, at what point did it stop being good enough?

51 views

Add a comment

Replies

Best
Sebastian Patterson

@maxmusing What made you switch from wrapping to parsing?

Gideon Henry

Was SQL Server the biggest challenge?

Xander Cole

@maxmusing Did this reduce support requests?

Bella Christine

@maxmusing Any plans to use parsing elsewhere?