Add a progressiveChunkSize option for Flight rendering #35089
Use the same default as ReactFizzServer. Setting it higher leads to even more performance gains (up to 2x), but this obviously needs to be balanced against blocking painting.
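For context, here's a minimal sketch of how the option might be consumed. The entry point and manifest handling vary by bundler, `progressiveChunkSize` is the name proposed in this PR (and may change), and `clientManifest` stands in for whatever the bundler provides:

```ts
// Hypothetical usage of the option proposed in this PR; entry point and
// manifest handling vary by bundler, and the option name may still change.
import { renderToReadableStream } from 'react-server-dom-webpack/server.edge';
import * as React from 'react';
import App from './App';

declare const clientManifest: any; // provided by the bundler in practice

const stream = renderToReadableStream(React.createElement(App), clientManifest, {
  // 12800 matches ReactFizzServer's default progressive chunk size; 65100
  // gave the larger wins shown below, at the cost of delaying paint.
  progressiveChunkSize: 12800,
});
```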
On Vercel, Next.js 16.0.2-canary.12, before:

[latency screenshot]

After, with `MAX_ROW_SIZE=65100` (1.72x better P95):

[latency screenshot]

Using the somewhat conservative default from this PR (`MAX_ROW_SIZE=12800`) delivers a ~1.3x gain in local testing (1.4x on Bun). Results are even more noticeable on Next.js 15 and on other platforms, and appear to be even larger in real-world testing on serverless platforms (though I haven't done extensive testing here).

Steady-state memory usage also appears somewhat lower after this change, though marginally so (402MB vs 428MB on the same Vercel function).
I've often seen even higher gains in real-world testing (e.g. a 2.2x gain, also on Next.js 16 canary), but I suspect this may just be due to serverless hardware noise / noisy neighbours.
Summary
#33030 introduced a fixed `MAX_ROW_SIZE=3200`, above which Flight tasks are deferred, so as to reduce blocking the painting of large non-lazy elements that may contain client components (I may have some of the exact details / terminology incorrect here).

However, this appears to have had a relatively large impact on the SSR performance of large pages / elements, especially in Next.js. Profiling the Next.js benchmark from t3dotgg, which SSRs a ~2MB page with no async components or Suspense, shows that a large amount of rendering time is spent handling these lazy chunks. A smaller reproduction is mentioned below.
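To make the behaviour concrete, here's a rough sketch of the check, simplified from my reading of ReactFlightServer (names and structure are approximate, not the actual implementation):

```ts
// Rough sketch of the check from #33030, simplified from my reading of
// ReactFlightServer (names approximate; not the actual implementation).
const MAX_ROW_SIZE = 3200;

interface Task { serializedSize: number }

declare function deferTask(task: Task): string;       // emits a reference to a new "lazy" row
declare function serializeInline(task: Task): string; // serializes into the current row

function renderRow(task: Task): string {
  if (task.serializedSize > MAX_ROW_SIZE) {
    // Over the row budget: defer to a lazy chunk so a huge element doesn't
    // block painting of content serialized earlier in the stream.
    return deferTask(task);
  }
  return serializeInline(task);
}
```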
Making this configurable would at least give frameworks (and/or users) the option to choose which trade-offs they'd prefer – and with some refactoring, may allow for larger chunk sizes for SSR, while still delivering smaller chunks for client-side rendering.
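For instance, a framework could (hypothetically) pick a larger row budget for its SSR pass while keeping the smaller one when streaming to the browser:

```ts
// Illustrative only: isSSRPass is a hypothetical framework-level flag, and
// the values are the ones discussed above, not recommendations.
declare const isSSRPass: boolean;

const progressiveChunkSize = isSSRPass
  ? 65100  // large rows: fastest server rendering, fewer lazy chunks
  : 12800; // smaller rows: more progressive client-side streaming
```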
How did you test this change?
More results and a reproduction (a ~120kB HTML page) can be found here: https://github.com/mhart/react-server-defer-task
Local testing with Bun and wrk (2 threads, 10 concurrent requests):
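The invocation was along these lines (URL and duration are illustrative):

```
wrk -t2 -c10 -d30s http://localhost:3000/
```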
Other runtimes:
More details
My initial investigation showed that a large amount of time was being spent in this "throw" line. Depending on the JS engine and its settings, throws can be quite expensive, as call-stack information may need to be gathered at the time of the throw – and many of these stacks were at least 30 frames deep.
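Roughly, the suspend-by-throwing pattern looks like this (a simplified sketch from my reading of the Flight client's `readChunk`, not the exact source):

```ts
// Simplified sketch of how a pending lazy chunk suspends rendering
// (based on my reading of ReactFlightClient; not the actual code).
type Chunk<T> =
  | { status: 'fulfilled'; value: T }
  | { status: 'pending'; then(onFulfill: (value: T) => void): void };

function readChunk<T>(chunk: Chunk<T>): T {
  if (chunk.status === 'fulfilled') {
    return chunk.value;
  }
  // Throwing the chunk (a thenable) suspends the render until the row
  // resolves; each throw may force the engine to capture a deep call stack.
  throw chunk;
}
```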
Following the trail of where the throws were coming from, I found they were "pending" lazy chunks that had been created by `deferTask` – and they were being created because the max row size had been reached.
In the aforementioned page, 9,485 chunks are thrown this way with the current max row size. Removing this check immediately yielded a 2x performance improvement (on Next.js 15).
NB: To be clear, I don't think most of the gains are due to the reduction in throwing (at least on Node.js; Bun may be different) – I think most of the slowdown is simply due to the introduction of many, many lazy chunks that need to be handled in the rendering pipeline.
Other thoughts
I'm not wedded to this approach – I'm not even sure `progressiveChunkSize` has the same meaning as `MAX_ROW_SIZE`, though it intuitively feels like it does. This PR is, if nothing else, a way to highlight that splitting into many lazy chunks has a noticeable impact on SSR performance, and that better batching would be beneficial.
The 2MB page renders in 24ms locally on Node.js using `renderToReadableStream` (the 120kB page renders in 1.2ms, ~15x faster than in Next.js) – so the serialization (and subsequent deserialization and reserialization) of RSC chunks clearly has a noticeable effect, and makes up the bulk of the Next.js time here. Early JSON serialization, then JSON parsing, then JSON serialization again show up predominantly in the profiles.
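By "locally" I mean a minimal harness along these lines (illustrative; the actual measurements are in the repro linked above):

```ts
// Minimal timing harness of the kind used for the numbers above (illustrative;
// the edge entry point exposes renderToReadableStream and works on Node 18+).
import { renderToReadableStream } from 'react-dom/server.edge';
import * as React from 'react';
import Page from './Page';

const start = performance.now();
const stream = await renderToReadableStream(React.createElement(Page));
// Drain the stream so we measure the full render, not just the first byte.
const reader = stream.getReader();
while (!(await reader.read()).done) {}
console.log(`rendered in ${(performance.now() - start).toFixed(1)}ms`);
```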
It seems to me that alternative approaches which avoid the need to deserialize at all just to render HTML on the server might be useful here (e.g. rendering to an object stream / async iterator) – and would also allow for other optimizations, such as knowing where the closing `</head>` or `</body>` tag is, instead of needing to deserialize bytes to find them (as Next.js currently has to do).
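Purely to illustrate the shape I mean (not a concrete API proposal – all names here are made up):

```ts
import type { ReactNode } from 'react';

// Hypothetical row types: structured output an SSR consumer could render
// directly, with explicit markers instead of byte-scanning for </head> etc.
type FlightRow =
  | { kind: 'html'; node: ReactNode }                  // server-rendered content
  | { kind: 'clientRef'; id: string; props: unknown }  // client component boundary
  | { kind: 'marker'; at: 'head-end' | 'body-end' };   // known positions in the document

declare function renderToAsyncIterable(node: ReactNode): AsyncIterable<FlightRow>;
```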
The need to serialize the entire RSC stream – doubling (or more like 2.5x-ing) the payload – also seems like extra work. Having markers in the already-serialized HTML for client component boundaries, and only serializing the extra properties that can't be rendered to HTML, would help avoid this.
However, these are larger changes – doing something about the lazy chunk serialization here feels like an easier quick win.