From 71253231c46a8acd61c0fb7fa27e3a53d2b2fc83 Mon Sep 17 00:00:00 2001
From: Alice Cecile
Date: Sun, 31 May 2026 11:19:53 -0700
Subject: [PATCH 1/4] Revise partial_bindless release note

---
 .../release-notes/partial_bindless_metal.md | 35 ++++++++++---------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/_release-content/release-notes/partial_bindless_metal.md b/_release-content/release-notes/partial_bindless_metal.md
index 0e4507d40dbbc..a21e926d6fe26 100644
--- a/_release-content/release-notes/partial_bindless_metal.md
+++ b/_release-content/release-notes/partial_bindless_metal.md
@@ -4,26 +4,23 @@ authors: ["@holg"]
 pull_requests: [23436]
 ---
 
-Cross-platform game engines must constantly navigate real differences in platform APIs.
-Bevy's goal is to let users write a single application and ship it everywhere —
-Windows, Mac, Linux, mobile — with confidence that it will just work.
-That's a tough promise to live up to: rendering complex scenes on Mac and iOS was markedly slower.
+In an ideal world, Bevy users could write a single application and ship it everywhere, with every last one of the messy cross-platform differences beautifully abstracted away.
+That's a little hard though, and in this particular case, we found that rendering complex scenes on Mac and iOS was markedly slower.
 
 Bindless rendering is how modern engines handle scenes with many different materials efficiently:
 shaders index into shared pools of textures and buffers rather than rebinding them per draw call.
-Bindless is not just a performance optimization — it's how modern renderers are structured.
-Metal (Apple's GPU API) supports texture binding arrays but not buffer binding arrays.
-Bevy required both to enable bindless, which previously excluded Metal entirely — even for materials that never use buffer arrays.
-If you were shipping on Mac or iOS, your game was running on a slower, fundamentally different code path.
+Metal (Apple's GPU API) has partial bindless support:
+it permits texture binding arrays but not buffer binding arrays.
+Historically, Bevy required both to enable bindless, which excluded Metal entirely, even for materials that never use buffer arrays.
 
-Most materials, including `StandardMaterial`, only use `#[data(...)]`, textures, and samplers — they never needed buffer array support.
-Bevy now checks what each material actually needs;
-if it only needs texture arrays, it gets bindless on Metal.
-Materials using `#[uniform(..., binding_array(...))]` still fall back to non-bindless on Metal.
+Most materials, including `StandardMaterial`, do not need buffer array support.
+To ensure those materials take the fast path, Bevy now checks the actual needs of each material.
+If you only need texture arrays, your material can be rendered efficiently across Bevy's desktop platforms.
+If you use `#[uniform(..., binding_array(...))]`, expect unusually poor performance on Metal.
 
-Two correctness bugs were fixed in the process.
-The sampler limit check was testing the wrong metric: `max_samplers_per_shader_stage` counts binding slots, but the relevant limit is `max_binding_array_sampler_elements_per_shader_stage`, the array element count — a mismatch that could silently exceed hardware limits.
-Bevy now also skips creating binding array slots for resource types a material doesn't use, staying within Metal's hard 31 argument buffer slot limit and reducing overhead on all platforms.
+We've also fixed two important correctness bugs in the process.
+First, we discovered that the sampler limit check was testing the wrong metric: `max_samplers_per_shader_stage` counts binding slots, but the relevant limit is `max_binding_array_sampler_elements_per_shader_stage`, the array element count (a mismatch that could silently exceed hardware limits).
+Second, Bevy now also skips creating binding array slots for resource types a material doesn't use, staying within Metal's hard 31 argument buffer slot limit and reducing overhead on all platforms.
 
 Benchmarked on Bistro Exterior (698 materials), 5-minute runs:
 
@@ -31,6 +28,12 @@ Benchmarked on Bistro Exterior (698 materials), 5-minute runs:
 | ------------------------ | ------------------- | ------------------- | ----------- |
 | Apple M2 Max (Metal) | +18% | +77% | −57 MB RAM |
 | NVIDIA 5060 Ti | +84% | +174% | Same |
-| Intel i360P | +15% | Same | Same |
 | AMD Vega 8 / Ryzen 4800U | Same | Same | −88 MB VRAM |
+| Intel i360P | +15% | Same | Same |
 | Intel Iris XE | Same | Same | Same |
+
+[Bistro] is a demanding, fairly realistic scene.
+While Metal's bindless limitations remain frustrating,
+it's lovely to see those performance gains, and to know that Bevy is not artificially holding performance on iOS and macOS back.
+
+[Bistro]: https://developer.nvidia.com/orca/amazon-lumberyard-bistro
\ No newline at end of file

From 6c49ded4b85bc10cfa4774e3407e8cb52b147bd4 Mon Sep 17 00:00:00 2001
From: Alice Cecile
Date: Sun, 31 May 2026 11:36:43 -0700
Subject: [PATCH 2/4] Single trailing newline...

---
 _release-content/release-notes/partial_bindless_metal.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_release-content/release-notes/partial_bindless_metal.md b/_release-content/release-notes/partial_bindless_metal.md
index a21e926d6fe26..7649df73afcd1 100644
--- a/_release-content/release-notes/partial_bindless_metal.md
+++ b/_release-content/release-notes/partial_bindless_metal.md
@@ -36,4 +36,4 @@ Benchmarked on Bistro Exterior (698 materials), 5-minute runs:
 While Metal's bindless limitations remain frustrating,
 it's lovely to see those performance gains, and to know that Bevy is not artificially holding performance on iOS and macOS back.
 
-[Bistro]: https://developer.nvidia.com/orca/amazon-lumberyard-bistro
\ No newline at end of file
+[Bistro]: https://developer.nvidia.com/orca/amazon-lumberyard-bistro

From 3375c278aa77c80b0ce5ae32ffa401b61ed618a0 Mon Sep 17 00:00:00 2001
From: Alice Cecile
Date: Sun, 31 May 2026 13:44:37 -0700
Subject: [PATCH 3/4] Reframe to include the fact that this impacts Dx12 too

---
 .../release-notes/partial_bindless_metal.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/_release-content/release-notes/partial_bindless_metal.md b/_release-content/release-notes/partial_bindless_metal.md
index 7649df73afcd1..e93254768f67b 100644
--- a/_release-content/release-notes/partial_bindless_metal.md
+++ b/_release-content/release-notes/partial_bindless_metal.md
@@ -1,22 +1,24 @@
 ---
-title: Partial Bindless on Metal and Reduced Bind Group Overhead
+title: Partial Bindless and Reduced Bind Group Overhead
 authors: ["@holg"]
 pull_requests: [23436]
 ---
 
 In an ideal world, Bevy users could write a single application and ship it everywhere, with every last one of the messy cross-platform differences beautifully abstracted away.
-That's a little hard though, and in this particular case, we found that rendering complex scenes on Mac and iOS was markedly slower.
+That can be a bit hard though.
+In this particular case, we found that rendering complex scenes on Mac and iOS was markedly slower than it should have been.
+Looking into it, the lack of bindless rendering support was to blame.
 
 Bindless rendering is how modern engines handle scenes with many different materials efficiently:
 shaders index into shared pools of textures and buffers rather than rebinding them per draw call.
-Metal (Apple's GPU API) has partial bindless support:
-it permits texture binding arrays but not buffer binding arrays.
+Both Metal (Apple's GPU API) and DX12 (an older Windows API) have partial bindless support:
+they permit texture binding arrays but not buffer binding arrays.
 Historically, Bevy required both to enable bindless, which excluded Metal entirely, even for materials that never use buffer arrays.
 
 Most materials, including `StandardMaterial`, do not need buffer array support.
 To ensure those materials take the fast path, Bevy now checks the actual needs of each material.
 If you only need texture arrays, your material can be rendered efficiently across Bevy's desktop platforms.
-If you use `#[uniform(..., binding_array(...))]`, expect unusually poor performance on Metal.
+If you use `#[uniform(..., binding_array(...))]`, expect unusually poor performance when using Metal or DX12.
 
 We've also fixed two important correctness bugs in the process.
 First, we discovered that the sampler limit check was testing the wrong metric: `max_samplers_per_shader_stage` counts binding slots, but the relevant limit is `max_binding_array_sampler_elements_per_shader_stage`, the array element count (a mismatch that could silently exceed hardware limits).
@@ -33,7 +35,7 @@ Benchmarked on Bistro Exterior (698 materials), 5-minute runs:
 | Intel Iris XE | Same | Same | Same |
 
 [Bistro] is a demanding, fairly realistic scene.
-While Metal's bindless limitations remain frustrating,
-it's lovely to see those performance gains, and to know that Bevy is not artificially holding performance on iOS and macOS back.
+While bindless limitations remain frustrating, especially on Mac where Vulkan isn't an option,
+it's lovely to see those performance gains, and to know that Bevy itself is no longer artifically holding our users back.
 
 [Bistro]: https://developer.nvidia.com/orca/amazon-lumberyard-bistro

From 11b6c6c09b6e695d7186ab9ed615f2d80805d87b Mon Sep 17 00:00:00 2001
From: Alice Cecile
Date: Sun, 31 May 2026 13:49:31 -0700
Subject: [PATCH 4/4] Typos

Co-authored-by: Alice Cecile
---
 _release-content/release-notes/partial_bindless_metal.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_release-content/release-notes/partial_bindless_metal.md b/_release-content/release-notes/partial_bindless_metal.md
index e93254768f67b..25bf2d27ca6ac 100644
--- a/_release-content/release-notes/partial_bindless_metal.md
+++ b/_release-content/release-notes/partial_bindless_metal.md
@@ -36,6 +36,6 @@ Benchmarked on Bistro Exterior (698 materials), 5-minute runs:
 
 [Bistro] is a demanding, fairly realistic scene.
 While bindless limitations remain frustrating, especially on Mac where Vulkan isn't an option,
-it's lovely to see those performance gains, and to know that Bevy itself is no longer artifically holding our users back.
+it's lovely to see those performance gains, and to know that Bevy itself is no longer artificially holding our users back.
 
 [Bistro]: https://developer.nvidia.com/orca/amazon-lumberyard-bistro
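
The per-material check described in the release note can be sketched roughly as follows. This is an illustration under stated assumptions, not Bevy's actual implementation: `MaterialNeeds`, `DeviceCaps`, `RenderPath`, and `choose_render_path` are hypothetical names, and only the limit name `max_binding_array_sampler_elements_per_shader_stage` comes from the note itself.

```rust
/// Hypothetical summary of the binding-array resources one material uses.
#[derive(Clone, Copy, Debug)]
pub struct MaterialNeeds {
    pub uses_texture_arrays: bool,
    pub uses_buffer_arrays: bool,
    /// Number of sampler *elements* the material places in binding arrays.
    pub sampler_array_elements: u32,
}

/// Hypothetical view of what the GPU backend supports.
#[derive(Clone, Copy, Debug)]
pub struct DeviceCaps {
    pub supports_texture_binding_arrays: bool,
    pub supports_buffer_binding_arrays: bool,
    /// Element-count limit named in the release note (not the slot-count limit).
    pub max_binding_array_sampler_elements_per_shader_stage: u32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RenderPath {
    Bindless,
    NonBindless,
}

/// Pick a path from what the material actually needs, rather than
/// requiring every bindless feature up front.
pub fn choose_render_path(needs: &MaterialNeeds, caps: &DeviceCaps) -> RenderPath {
    // A feature the material never uses cannot disqualify it.
    if needs.uses_buffer_arrays && !caps.supports_buffer_binding_arrays {
        return RenderPath::NonBindless;
    }
    if needs.uses_texture_arrays && !caps.supports_texture_binding_arrays {
        return RenderPath::NonBindless;
    }
    // Compare array elements against the element limit, not the slot limit.
    if needs.sampler_array_elements
        > caps.max_binding_array_sampler_elements_per_shader_stage
    {
        return RenderPath::NonBindless;
    }
    RenderPath::Bindless
}
```

Under these assumptions, a texture-only material on a device without buffer binding arrays still gets `RenderPath::Bindless`, while a material that needs buffer arrays falls back to `RenderPath::NonBindless` on that same device. The numeric limits used to exercise the sketch are arbitrary placeholders, not real hardware values.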