GLES2 2d Batch rendering

Rafał Mikrut requested to merge github/fork/lawnjelly/batch-2d into master

Created by: lawnjelly

2D rendering is currently bottlenecked by drawing rects one at a time, which limits OpenGL efficiency. This PR batches rects and renders them in fewer draw calls, resulting in significant performance improvements. It also speeds up text rendering.

The code dynamically chooses between a vertex format with and without color, depending on the input data for a frame, in order to optimize throughput and maximize batch size.
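To make the format selection concrete, here is a minimal sketch of one way such a per-frame choice could work. It is not the code in this PR: the types `Color`, `RectCommand` and `VertexFormat`, the function `choose_vertex_format`, and the `max_change_ratio` threshold are all hypothetical names invented for illustration. The assumption is that in the non-colored format a color change forces a new batch, so per-vertex color only pays off when colors change often.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not the PR's actual structures.
struct Color {
    float r, g, b, a;
};

inline bool operator==(const Color &x, const Color &y) {
    return x.r == y.r && x.g == y.g && x.b == y.b && x.a == y.a;
}

struct RectCommand {
    float x, y, w, h;
    Color modulate;
};

enum class VertexFormat { Plain, Colored };

// Assumed heuristic: count how often the color changes between consecutive
// rects. If changes are frequent, bake color into the vertices so the rects
// can stay in one batch; otherwise use the smaller format.
VertexFormat choose_vertex_format(const std::vector<RectCommand> &rects,
                                  float max_change_ratio = 0.1f) {
    assert(max_change_ratio >= 0.0f);
    if (rects.size() < 2) {
        return VertexFormat::Plain;
    }
    std::size_t changes = 0;
    for (std::size_t i = 1; i < rects.size(); ++i) {
        if (!(rects[i].modulate == rects[i - 1].modulate)) {
            ++changes;
        }
    }
    const float ratio =
            static_cast<float>(changes) / static_cast<float>(rects.size() - 1);
    return ratio > max_change_ratio ? VertexFormat::Colored : VertexFormat::Plain;
}

// Bytes per vertex in this sketch: position + UV (4 floats), plus RGBA
// (4 more floats) in the colored format.
constexpr std::size_t vertex_stride(VertexFormat f) {
    return (f == VertexFormat::Colored ? 8 : 4) * sizeof(float);
}
```

In this sketch the colored vertex is twice the size of the plain one, which illustrates the trade-off: a smaller vertex means more throughput, while per-vertex color means fewer batch breaks.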

Notes

  • Although it only batches rects to start with, the same framework can be used to batch the other primitives.

  • Can only batch within _canvas_item_render_commands. Notably this covers tilemaps, text, the _draw API but NOT separate sprites (we evaluated this and it would require far more complex changes).

  • A simple, high-speed, growable dynamic array template for POD types is included. For now it lives in the drivers/gles2 directory.

I hesitate to put it in core as it is really only intended for these rasterizers and is not general purpose. It will be very easy to replace when we have a high speed, non-COW vector in the future.

  • I'm using indexed primitives so that only 4 verts are required per quad. Indices in GLES2, however, can only be 16 bit, so they are limited to addressing 65535 verts (from the origin). For this reason the vertex buffer has a maximum size: 1 quad less than would fit into 65535 verts. When the buffer fills up, the routine ends the batch, draws the current batches, resets, and starts filling batches again from where it left off.

  • Swapping between the colored and non-colored vertex formats is currently done at runtime using a simple heuristic. It would also be possible to use hysteresis, or to let the user switch manually. Neither has been implemented yet, and they may not be required.

  • Now tested on Linux, Windows 10, Android, WebGL. The more testing the better, and should anyone be able to try it on their devices / more platforms that would be welcome.

  • Because there is still the possibility of bugs, I have (temporarily) added a project setting, rendering/quality/2d/use_batching, which defaults to true but can be turned off if there are problems. The legacy non-batched method is still included.

  • Just in case it affects people being able to even start up the editor in 2d, I have disabled the batching in the IDE until we have some feedback. If all goes well it could be enabled in the IDE by default with another PR.
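The indexed-quad note above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `QuadBatcher`, `Vertex`, `kMaxVerts` and `kMaxQuads` are hypothetical names, the cap used here is simply the number of whole quads that fit in 65535 verts (the PR's exact cap may differ), and the actual GL upload and draw are reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout for illustration: position + UV.
struct Vertex {
    float x, y, u, v;
};

// Hypothetical batcher: 4 verts and 6 16-bit indices per quad, flushing
// when another quad would no longer be addressable by the indices.
class QuadBatcher {
public:
    static constexpr std::size_t kMaxVerts = 65535;         // limit quoted in the notes
    static constexpr std::size_t kMaxQuads = kMaxVerts / 4; // whole quads that fit

    void add_rect(float x, float y, float w, float h) {
        if (quad_count() == kMaxQuads) {
            flush(); // buffer full: draw what we have, then continue filling
        }
        const std::size_t base = verts_.size();
        assert(base + 3 <= 0xFFFF); // every index must fit in 16 bits
        verts_.push_back({x, y, 0.0f, 0.0f});
        verts_.push_back({x + w, y, 1.0f, 0.0f});
        verts_.push_back({x + w, y + h, 1.0f, 1.0f});
        verts_.push_back({x, y + h, 0.0f, 1.0f});
        // Two triangles sharing the quad's diagonal.
        static const std::uint16_t kQuad[6] = {0, 1, 2, 0, 2, 3};
        for (std::uint16_t offset : kQuad) {
            indices_.push_back(static_cast<std::uint16_t>(base + offset));
        }
    }

    void flush() {
        if (verts_.empty()) {
            return;
        }
        // A real renderer would upload the buffers here and issue one
        // glDrawElements call with GL_UNSIGNED_SHORT indices.
        ++draw_calls_;
        verts_.clear();
        indices_.clear();
    }

    std::size_t quad_count() const { return verts_.size() / 4; }
    std::size_t draw_calls() const { return draw_calls_; }
    const std::vector<std::uint16_t> &indices() const { return indices_; }

private:
    std::vector<Vertex> verts_;
    std::vector<std::uint16_t> indices_;
    std::size_t draw_calls_ = 0;
};
```

A caller would invoke `add_rect` once per rect command and `flush` once at the end of the frame; any extra flushes caused by the index limit happen transparently in between.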

What kind of speed increases can I expect?

It depends on a number of factors, including what you are rendering and what the hardware is. Benefit roughly scales with the number of quads. There will be less benefit in games that are fill rate limited already. In my tests with 2d games / tests, frame rate increases of 2-10x were typical. I was getting larger speed increases on desktop than on my tablet.

Note there are some cases (notably bunnymark v2) which can't be batched currently, see below. Further comparison figures from different hardware would be welcome.

Increases will be higher in release builds. With some more tweaks I have now clocked a 58x increase in frame rate in a vertex-throughput-limited scene ( #19917 ).
