Only for OpenGL: there a QRhiBuffer with UniformBuffer usage is not
backed by a native OpenGL buffer at all. The ominuously named
beginFullDynamicBufferUpdateForCurrentFrame() just returns a pointer
into the underlying QByteArray (or similar) storage then. (only on GL,
only for uniform buffers) Knowing this, the batches' uniform data
can be memcopied directly into this area, instead of enqueing resource
updates that ultimately achieve the same.
Could be applicable to D3D11, to save a memcpy. However, it is not given
that, thinking of unmerged batches, separately memcopying lots of small
blocks (Qt Quick does not actually build up the final, full uniform
buffer contents itself) into a buffer's mapped (host visible) storage is
a good thing. (given the potentially different memory caching behavior
e.g.) So it is not done there.
To recognize the special case, we use nativeBuffers().slotCount == 0
which is only every true with the OpenGL and Null backends (indicating
that the QRhiBuffer has 0 real buffer resources associated), which
conveniently avoids the need to introduce backend-specific conditions.
Pick-to: 6.8 6.7 6.6 6.5
Task-number: QTBUG-125087
Change-Id: I88a9ef5eff6c52a88b2605bdab8cada86a18e98e
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Copy the flag from Quick 3D, which used to set this for quite some time
now as a microoptimization. It is not great technically and may need to
be removed in the future once compute is used in some form in
combination with textures or buffers that are also used in the Qt Quick
render pass.
Task-number: QTBUG-125087
Pick-to: 6.8 6.7 6.6 6.5
Change-Id: Ie2d4d6a784580e669fe0fc0b843519b4826efa34
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Instead of attempting to handle this one layer down, when deciding about
merging, make the decision already when creating the batches.
Pick-to: 6.8 6.7 6.6 6.5
Fixes: QTBUG-97557
Task-number: QTBUG-125087
Change-Id: I83825c60d76124244979e009459d47f07a67fa77
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Not letting non-indexed geometry getting merged is wrong, at least from
a performance perspective, as it has to potential to cause huge
regressions with certain scenes when compared to runs without the patch.
Instead, a follow up patch will rather prevent putting geometry with
different index types into the same batch. Then no further actions will
be needed here when deciding about merging within the batch.
This reverts commit f9e95c9d44.
Fixes: QTBUG-123191
Task-number: QTBUG-125087
Pick-to: 6.8 6.7 6.6 6.5
Change-Id: I1d283ca78aad74561450f976be895f46dc739b73
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
[ChangeLog][QtQuick][SceneGraph] QSGMaterialShader subclasses can now
specify a blend operation different from Add using the new
GraphicsPipelineState members opColor and opAlpha.
Fixes: QTBUG-125214
Change-Id: I5a6f51ce9774e19085acf470cbff07afc916d0ff
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Every time type is compared and compare() is called, viewCount() must
also be compared.
More importantly, the correct view count (handled via internal flags
for SC/BC) has to be pushed to the QSGMaterial always, before checking
for an existing QSGMaterialShader - otherwise if material #2 has a hit
for a shader from material #1, material #2 will never have the view
count pushed so it continues to report viewCount() == 1, which
has...interesting consequences. Avoid this.
Change-Id: I0808cc57f53a8dab891d1b36b41790f58c978ef4
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
This will be used in RenderMode3D only in practice, where Qt Quick 3D
is going to pass in multiple matrices to the QSGRenderer.
Task-number: QTBUG-114871
Change-Id: Icae7f05958729d9e51948e1f38621ec4a541192d
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Meaning the toggling of visibility. Having a QSGRenderNode come and
go in the scenegraph leads to visual problems, in case the adding
and removal of the node toggles the m_forceNoUseDepthBuffer flag,
which in turn makes useDepthBuffer() return a different value
than before (so disables and then enables doing the opaque
pass in the renderer). Changing this value needs a full rebuild
of the render lists. When adding a node, this is done (regardless
of toggling the flag). When removing, it was not done at all.
Now we do it when resetting the no-Z flag back to false.
Add a button to the customrendernode example to toggle visibility
since this is useful for example's purposes anyways. However, this
on its own is not sufficient to reproduce the issue. For that,
the DepthAwareRendering flag needs to be removed from the
QSGRenderNode subclass.
Pick-to: 6.6 6.5
Fixes: QTBUG-119160
Change-Id: I232354d88f5a4fe5f9f2d6102d0d5439d92782fb
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
When we recycle batches, there is no logic to make sure we pick
a batch which is the appropriate size for the actual vertex buffer
or index buffer we will upload. In fact, when picking a batch
from the pool, we do not even know the size yet, since we have not
done any merging yet. In some cases, this can cause memory to grow
much larger than needed, in particular if we have a single large
batch and thousands of smaller ones which are updated regularly.
As we recycle "random" batches from the pool, we will gradually
resize all of them to fit the largest batch, and end up using
largest batch size * N for memory, when we could have just reused
the buffers for the largest batch in each update.
One way to fix this is to keep a separate buffer pool instead of
tying these to batches and search for the smallest buffer that can
fit the data. This means another search in a tight loop, but the
number of buffers will typically not be large, and we have such
searches in other places in the loop already, so complexity
remains the same at least.
Other ways to mitigate this might be to set a max size on the batch
pool, but this would probably require a lot of effort to find what
the sweet spots are, and in the end it's likely to be different
for different target systems.
Fixes: QTBUG-107214
Change-Id: Ie60d158348fdbda2fb79eb8bf3b61311cc67ce67
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
It's been QRhiShaderStage for ever now, but the old alias is still
available and so the earliest scenegraph renderer code was never
migrated away from the old name. Do it now, to allow eventually
dropping the alias from qrhi.h.
Pick-to: 6.6
Change-Id: I6e21410aa8f1b49b832412d6c7ee8a1b3dfd4c2c
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
...while expanding the "depth post-pass" comments since it certainly
needs an explanation.
Change-Id: Ic1ea0822a3bf4174dd92f2c2ac74f258b62e11a3
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
One of the code paths that call createShader() was not passing along
the render mode, causing it to default to 2D rendering. This could
cause issues when rendering a 2D item in 3D if the material has
special handling of perspective transforms, like e.g. with text
which needs a different shader to get correct antialiasing in 3D.
Pick-to: 6.2 6.5 6.6
Change-Id: I4a22642863df58351daeec63e99b54a986b4b45e
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
It already happens as part of Renderer::map. So, if we set size = 0
before we call uploadBatch then it will implicitly set the size needed.
One of the loops was also grabbing the sizes before the final size
was determined.
Pick-to: 6.5
Change-Id: I610d9850a66011c3d17eea131c063c42fe149962
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
resize(0) only sets the size and doesn't deallocate anything.
Conversely, shrink(0), which was used before it was changed
to resize(0) doesn't change the size and only deallocates.
Use a mix of both for now.
Pick-to: 6.5
Change-Id: I3e36a766ef7483158a5a300351b0d7864d24e184
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Previously QSGRenderNode::m_matrix would be a dangling pointer most of
the time. We document it as being valid during both the prepare, and
render methods, but it would only be valid during the prepare phase.
This was solved with the projectionMatrix by keeping a local copy
instead of a pointer. No code was actually using the dangling pointer,
but as a preventive measure since we document matrix as being valid also
during the rhiRender we now have a local copy in QSGRenderNodePrivate,
and manually set the pointer reference for the API, as well as a note to
fix this non-ideal situtation in Qt 7.
Fixes: QTBUG-97589
Change-Id: Idc0617de579d3d4ce5cc590534445f609adb9d61
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
If geometry without an index buffer becomes a batch root, it should not
be merged with other geometry since an index buffer is required for a
merged batch.
This should not be very disruptive, as this can only occur under
very specific circumstances anyway (custom QSGGeometry used in a scene
in a particular order).
Fixes: QTBUG-97557
Pick-to: 6.5 6.4 6.2
Change-Id: I6b3845fb988a4bfe2a6f989801741246b8f4fb37
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
On the Apple M1 chipset with clang 14 and above, the optimizer
will replace fmul/fsub with fmsub, which frequently produces near-zero
negative values. The viewport z clipping space is [0.0, 1.0], which can
result in the top-most layer being clipped.
Pick-to: 6.5 6.4 6.2
Fixes: QTBUG-109739
Change-Id: I22b6c49f066444a102d8bbc2237e4aa9e8c89d8c
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
The odd programRhi struct is a leftover from 5.14/15 times where there
were multiple code paths and so the data structures had to gather for
both. Drop that inner struct and use sensible names.
Pick-to: 6.5 6.4
Change-Id: Ic157325afdb11b362b009212ab45093e8b7472f8
Reviewed-by: Kaj Grönholm <kaj.gronholm@qt.io>
The suspicion is that in Qt 5 this was skipped because
1) releasing the generic allocation for any Element is done anyway due
to destroying the m_elementAllocator, and
2) the only other thing releaseElement() did there was to release the
extra data for QSGRenderNodes, which could thus be leaked, but
render nodes are a rarely used advanced feature, and
3) upon closing a window, the scenegraph is torn down and the
node-removed notification conveniently puts the Element data from
the shadow nodes on the m_elementsToDelete queue (which is handled
in the Renderer dtor); the same goes when dynamically removing a
node at run time.
So even if one cared about a few small leaks for apps with
QSGRenderNode usage, to trigger this one would need to exercise
some code path that created and released renderers, i.e. something
with QSGLayer, and in a way that it does not lead to manipulating
the scene graph. (so shader effects or Item.layer seem not to
trigger this)
In Qt 6 calling releaseElement() is important for the most common
QSGGeometryNodes as well since there are additional resources to take
care of. And therefore the test code's grabToImage() (which internally
creates, renders, and then destroys a QSGLayer, and thus fires up and
then drops a full Renderer, while not touching the scenegraph itself
in any way) exhibits the problem much more clearly than in Qt 5.
Pick-to: 6.4 6.2
Fixes: QTBUG-108026
Change-Id: Id60dfca7921efc2f3057c9cb8cedb238a1db2ee5
Reviewed-by: Christian Strømme <christian.stromme@qt.io>
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
This is a semantic patch using ClangTidyTransformator to convert
sequences of Q_UNREACHABLE() + return into Q_UNREACHABLE_RETURN(),
newly added to qtbase.
const std::string unr = "unr", val = "val", ret = "ret";
auto makeUnreachableReturn = cat("Q_UNREACHABLE_RETURN(",
ifBound(val, cat(node(val)), cat("")),
")");
auto ignoringSwitchCases = [](auto stmt) {
return anyOf(stmt, switchCase(subStmt(stmt)));
};
makeRule(stmt(ignoringSwitchCases(stmt(isExpandedFromMacro("Q_UNREACHABLE")).bind(unr)),
nextStmt(returnStmt(optionally(hasReturnValue(expr().bind(val)))).bind(ret))),
{changeTo(node(unr), cat(makeUnreachableReturn,
";")), // TODO: why is the ; lost w/o this?
changeTo(node(ret), cat(""))},
cat("use ", makeUnreachableReturn));
a.k.a qt-use-unreachable-return.
subStmt() and nextStmt() are non-standard matchers.
There was one false positive, suppressed it with NOLINTNEXTLINE.
It's not really a false positiive, it's just that Clang sees the world
in one way and if conditonal compilation (#if) differs for other
compilers, Clang doesn't know better. This is an artifact of matching
two consecutive statements.
Change-Id: I3855b2dc8523db1ea860f72ad9818738162495c6
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>
We've been requiring C++17 since Qt 6.0, and our qAsConst use finally
starts to bother us (QTBUG-99313), so time to port away from it
now.
Since qAsConst has exactly the same semantics as std::as_const (down
to rvalue treatment, constexpr'ness and noexcept'ness), there's really
nothing more to it than a global search-and-replace.
Task-number: QTBUG-99313
Change-Id: I601bf70f020f511019ed28731ba53b14b765dbf0
Reviewed-by: Shawn Rutledge <shawn.rutledge@qt.io>
This is a semantic patch using ClangTidyTransformator as in
qtbase/df9d882d41b741fef7c5beeddb0abe9d904443d8:
auto QtContainerClass = anyOf(
expr(hasType(cxxRecordDecl(isSameOrDerivedFrom(hasAnyName(classes))))).bind(o),
expr(hasType(namedDecl(hasAnyName(<classes>)))).bind(o));
makeRule(cxxMemberCallExpr(on(QtContainerClass),
callee(cxxMethodDecl(hasAnyName({"count", "length"),
parameterCountIs(0))))),
changeTo(cat(access(o, cat("size"), "()"))),
cat("use 'size()' instead of 'count()/length()'"))
a.k.a qt-port-to-std-compatible-api with config Scope: 'Container',
with the extended set of container classes recognized.
Change-Id: Idb1f75dfe2323bd1d9e8b4d58d54f1b4b80c7ed7
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>
Turns out that implementing "proper" QSGRenderNode subclasses is
not quite straightforward in Qt 6. If a rendernode wants to
query the projection and modelview matrices, and store something
calculated based on them somewhere (e.g. push to a uniform buffer),
this is not actually possible to do in prepare(), only in render(),
and that is is no good when working with QRhi or some modern API.
For rendernodes that do not query the scenegraph's projectionMatrix
this is not a problem but for those that want it, there needs to be a
way to access it outside of the RenderState interface, because a
RenderState is only passed to render(), not to prepare().
[ChangeLog][QtQuick] Added a projectionMatrix() accessor to
QSGRenderNode. This way a reimplemented prepare() can query both
matrices and can load the final modelview-projection matrix into a
uniform buffer for example. Previously this was only possible in
render() because the projection matrix was only exposed in the
RenderState. With QRhi, or modern APIs such as Vulkan, that is not
quite sufficient since it is more than likely that one will want to
calculate and store the matrix in prepare(), not in render().
Change-Id: Ic6ff734962381ee7b199329bf2a37b218c09dc2c
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
With the helper struct (QSGRenderTarget) in place these are neither
needed nor help following the code. We can now also drop the
compatibility ctor from the struct.
Change-Id: I3f18630ed670788bf24e0d41ae788a0ad4ea1db5
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
A QSGRenderNode in a subscene that is under a layer.enabled: true Item
is rendered to a texture, i.e. a render target different from the main
(the window/swapchain) one. It also involves using different a different
renderpass object. QSGRendererInterface allows to query window-level
stuff, so e.g. the main render target, but that is not useful here. The
QSGRenderNode subclass needs a way to access these things.
Start with just another private member var., this enables Quick 3D to
implement its Inline render mode correctly. This can be turned into an
API on QSGRenderNode later.
Pick-to: 6.4
Task-number: QTBUG-105354
Change-Id: Iba97ded98b08a23b671f3729d8bab35fad1fd0e8
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
When this was added in 2015, the default code path put vertex and
index data into the same buffer which also involved using the same
"pool" (not really a pool, just temporary backing storage) for
both. The guess is that the largest-vertex-buffer-size * 2 logic was
meant to accommodate for this, i.e. an attempt to reduce the chance of
further resize() calls later on by setting a hopefully-large-enough
size up front, while doubling the size in an attempt to end up with a
QDataBuffer large enough for largest vertex+index combination.
In Qt 6 (or in Qt 5 in some cases, such as with the WebGL platform
plugin), vertex and index buffers are always separate. Making both of
these "pools" to have a size of largest * 2 upfront is wasteful, they
will never need to store more data than the largest buffer.
Change-Id: Iaad9da81e6692fad2458a1f8414787239821ef59
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
int was used a bit too liberally, even in places where the QRhi API
already took a quint32. More importantly, eliminate some
signed-unsigned mismatches that occur now that the QRhi API is fixed
up to consistently use the correct types.
Change-Id: I02773d005c8fe6d9e6d1cfebb617d40447d7ef6c
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
This needs to also have an opt-in flag, because existing code must
continue to work as-is (i.e. when it does not touch the new
srcAlpha/dstAlpha fields at all, the color blending factors should be
used for alpha too). Therefore, by default there is no change in
behavior, no matter if one touches the alpha blending factors or not.
The renderer will continue to use the color factors for everything.
Only when the flag is flipped will srcAlpha and dstAlpha have an
effect.
Inspired by a recent text node bug which, while solved differently in
the end, shows that the ability to specify separate factors is useful
in practice occasionally.
[ChangeLog][QtQuick] When a custom material declares its own blending
factors, it is now possible to specify the source/destination alpha
blending factors separate from the color blending factors. This is
done by first setting the separateBlendFactors to true in
QSGMaterialShader::GraphicsPipelineState, and then writing the desired
values to srcAlpha and dstAlpha, in addition to srcColor and dstColor.
Change-Id: I83826a68b80d76500ee195db73b29d393a7d0381
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Replace the current license disclaimer in files by
a SPDX-License-Identifier.
Files that have to be modified by hand are modified.
License files are organized under LICENSES directory.
Pick-to: 6.4
Task-number: QTBUG-67283
Change-Id: I63563bbeb6f60f89d2c99660400dca7fab78a294
Reviewed-by: Shawn Rutledge <shawn.rutledge@qt.io>
There's only ever one of them, and accessing objectName() from a
different than the owning thread is UB (data race).
The default-constructed QString() is needed, cf. QTBUG-103986.
This is a hotfix for QTBUG-102403, which ran afoul of the guards added
in qtbase/3f32dcd1ddcbe04c77ccd83e2eaa566d7212e732 to fix QTBUG-99775.
Task-number: QTBUG-102403
Pick-to: 6.3 6.3.1 6.2
Change-Id: Idf07b1119ec393e7fd179e8b545cfeb70f5916ec
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
One set of functions is enough, no need for a separate set of "Rhi"
ones.
Change-Id: I58600c539734d5f8a767926eb9649d9968bab08a
Reviewed-by: Christian Strømme <christian.stromme@qt.io>
Add QQuickRenderTarget::fromPaintDevice, aollow to get a
QQuickRendererTarget from the QPaintDevice object.
[ChangeLog][QtQuick] Added QQuickRenderTarget::fromPaintDevice,
Allowed to set the render target of QQuickWindow on the software
renderer.
Task-number: QTBUG-94075
Change-Id: I4946c25d2a6315cd8f9c12a7ac7ac4cf71d95361
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
The qtbase RHI graphics pipeline recently gained support for Polygon
Mode (Triangle Fill Mode in Metal, Fill Mode in D3D). This patch
extends support to the scenegraph.
Note - glPolygonMode not supported on OpenGL ES
[ChangeLog][QtQuick][QSGMaterialShader] Added support for polygon
rasterization mode to graphics pipeline state.
Change-Id: I6fc05bc4d38446e573687ca651d6d13a76b3e667
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
The flag has been unconditionally set to true since Qt 6.0.
Change-Id: Ie6ad91f00b5c7ac7d6a7745035175f23c9641a42
Reviewed-by: Hatem ElKharashy <hatem.elkharashy@qt.io>
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
The current scenegraph implementation only allows a single texture per
combined image sampler binding. Shaders may specify arrays of combined
image samplers. This change allows an array of textures to be provided
per binding.
[ChangeLog][QtQuick][QSGMaterialShader] Added support for arrays of
combined image samplers per binding.
Change-Id: I37b6f38c60d23cfe5f04388a4f0bf42fb1a73837
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Be consistent in handling a failing create(). Do not try to enqueue
resource updates if we did not successfully built the buffer. Do not try
to render either.
With just a few checks we can now fully survive failing all vertex/index
buffer creation (i.e. warnings will be printed and nothing will be
rendered by Quick, but the application will continue to work). This
matches the behavior one gets when the uniform buffer creation fails,
that is already robust enough.
Pick-to: 6.3 6.2
Fixes: QTBUG-99477
Change-Id: I2ca0f3ada8a7449c5ac1346216a58b06658b5cd5
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Fix creator qml puppet crash at startup due to nullptr exception. The
puppet clears the cached resources, which calls shrink(0) to the vertex
and index upload pools. The shrink only sets the capacity to 0 and
deallocates the buffer, but leaves the size intact. The next time the
upload pools are used they are not reallocated, because only the size
gets checked. Use resize instead of shrink since it calls the shrink
and also sets the size to zero.
Fixes: QTBUG-98150
Pick-to: 6.2 5.15
Change-Id: Id852e71d4caa0f6c5ca560142024ca9d143df7f8
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Follow what we did to the srb: instead of storing an object reference,
store an opaque blob to test for compatibility. This works even if the
object is destroyed in the meantime. Thus, the annoying and error-prone
invalidatePipelineCacheDependency() goes away, finally.
All this works because creating a QRhiGraphicsPipeline with a given
QRhiRenderPassDescriptor only needs the renderpass object for building
(as in create()). Afterwards the object can be destroyed, even. The
pipeline is still usable with any render targets that have a
_compatible_ rp descriptor. Therefore, it is ideal if there is a way
to test for this compatibility without having to have the original
object around for ever.
Relies on enablers (most importantly,
QRhiRenderPassDescriptor::serializedFormat()) added to QRhi in qtbase.
While not necessarily obvious, the real benefit here is significant
performance gains in certain situations: consider rendering a scene to a
given render target, then to another one, then yet another one, for
example via QQuickRenderControl. We no longer have to do aggressive
purging of the pipelines just because a QRhiRenderPassDescriptor some of
the recently created pipelines may be associated with is going to
disappear. Rather, everything can stay as-is, there are no crashes due
to dangling pointers in the pipeline cache keys. The majority
of cases involve highly compatible render targets (e.g. rendering
into textures of the same format), so reuse is highly beneficial.
Pick-to: 6.2
Change-Id: I8ad4bfcb3c7e66c7364c91a749e2a2e6837b4f1a
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Remove the srbCache which is effectively a map of unbounded size.
Depending on scene structure and changes to the scene that trigger
rebuilding the batching structures of the scenegraph, it can grow
unexpectedly large, potentially growing continuously on every frame,
with animated scenes that manage to trigger the "right" kind of
changes to the scene.
Other similar structures, such as the graphics pipeline or the sampler
tables have a known maximum size, because we know that there can only
be a certain number of combinations in the values that form the keys
to the map. Whereas with QRhiShaderResourceBindings objects the key is
the full binding list, including the QRhiBuffer/Texture/Sampler
pointers.
This becomes complex pretty quickly when combined with the internal
mechanisms of the scenegraph's batching system and the lifecycle of
these QRhiResource objects. For instance, uniform buffers live in
Batch, which is a pooled, (semi-)transient object: changes to the
scene that lead to full or partial rebuild will generate a new set of
Batches, meaning a QSGGeometryNode can be associated with a different
QRhiBuffer afterwards (not that we create new buffers all the time,
they are reused, but they can be associated with different nodes over
their lifetime). Which then means that to render the node, we now need
a different srb as well (that references the correct
QRhiBuffer). Depending on the scene and how it changes over time, this
may lead to an explosive number of combinations, thus a "continuously"
growing srbCache map, inserting more and more new entries, while
leaving perhaps-unused entries in there for ever. Trimming is far from
trivial as the costs of managing the purging of unused entries is
likely more expensive than not having the "cache" in the first place.
Thus, drop the whole data structure and give ownership of the srb to
Element (which is the object in the shadow tree for each
QSGGeometryNode). To not completely degrade performance, this needs
some new enablers in QRhi, namely the ability to have a fast path for
replacing the QRhiResources (buffers, textures, samplers) in the
binding list of an already baked srb without having to go through full
rebaking. (e.g. with Vulkan, we want to reuse all the existing
descriptor set layouts and the descriptor sets themselves, it's just
the descriptors themselves that need new values written)
So now, instead of looking up in a QHash, we first compare the full
binding list (this is what one gets with the QHash when there's a hit
anyway: at minimum the binding list is hashed and then compared). If
this matches then fine (and we saved the time spent on hashing even).
Then, if there's a mismatch, we now do a layout-only comparison. If
this matches, then we use the new enablers in QRhi to do a "light
update" (different resources, but same binding points, types, etc.) to
the srb. Finally, if there was a mismatch here too, we go through
create() to do a full rebuild.
Therefore, the performance effects of this patch depend heavily on the
scene. Scenes that do not do changes that trigger re-batching often
may see improvements even. Others are assumed to have small or
negligible consequences with regards to performance (due to now
spending time on a layout-compatible comparison of binding lists which
was not there before). In return, we do not need to have the srbCache
QHash at all, leading to, with scenes exhibiting the relevant
patterns, significantly reduced memory usage.
Once srbCache is gone, the following issues need to be handled:
First, now that the srb ownership stays with the shadow node
(Element), instead of guaranteed to stay around for ever in an srb
cache data structure, throwing references into the keys used to look
up pipelines is problematic (and was never nice to begin with).
Get rid of this, building on the new QRhi API that gives us a list of
uints that can be used to test for layout compatibility. The
container is implicitly shared, which makes the switch quite simple
and efficient.
Second, all this does not mean that some sort of reuse of srb objects
is not needed. Consider scrolling a list view. Nodes (and so shadow
nodes, i.e. Element objects) come and go in this case. Creating full
new srb objects for the Elements for the list items becoming visible
is not ideal. (even if only really dubious with Vulkan, where there
are true Vulkan API objects underneath)
To facilitate reuse, reintroduce an srb pool in ShaderManager, but
this time with different semantics: this stores unused srb objects,
with the serialized layout description as the key. When a new Element
needs an srb, we try to pick a (layout compatible) srb from this
pool. If there is a match, the Element gets the srb with ownership,
and the srb is removed from the pool. When an Element is about to go
away, its srb may be "moved" to the pool when the pool is not too
large already. The threshold for being too large is currently simply
1024 elements (in the QMultiHash). This can be overridden with an
environment variable, although that should pretty much be never needed
in practice.
Pick-to: 6.2
Fixes: QTBUG-96130
Change-Id: Ie5a49beeb6dea0db6a81fdceebf39374ad192b0e
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Christian Strømme <christian.stromme@qt.io>
Clazy gives a warning that it might detach. qAsConst makes it go away.
Pick-to: 6.2
Change-Id: I293fd5337bc02b3c492464400651358459cf4ab4
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
A loop-local variable is often more efficient than a variable at
function scope; and avoiding it altogether is more compact.
Pick-to: 6.2
Change-Id: Iac88bc433ce640a52a6bd909c708711e294ecc59
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Qt 5's CustomCompileStep has limited use in Qt 6: as there are no
virtual compile() type of functions to override in the API anymore,
its only effect now is disabling batching for nodes using materials
that have the flag set.
Coincidentally it turns out that some views in Creator's QML Profiler
rely on some quite advanced manual alteration of gl_Position.z in the
materials' vertex shader.
This is pretty bad if batching is involved, although in Qt 5 they
could get away by setting the internal undocumented qt_zRange
uniform to a magic value (1.0) to alter the behavior of the final
gl_Position override statement inserted by the shader rewriter
to batched materials' vertex shaders.
In Qt 6 this approach is not available, so the Profiler's rendering
becomes incorrect due to the vertex shader's own gl_Position.z
alterations being lost when batching is active. Instead, Creator 5
(that uses Qt 6) will set CustomCompileStep on its QSGMaterial
subclasses where relevant in order to never batch nodes with
these materials.
All in all this implies that there is a need for a better-named flag.
Therefore, add a NoBatching flag in place of CustomCompileStep, while
still keeping the latter as an alias until Qt 7.
Change-Id: Ib806fbfa55132fdd02765adece366f0175129ae8
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Cannot possibly do reasonable overlap checks when we have no idea how
such lines are rasterized, meaning we do not know the real bounds of
the geometry.
Pick-to: 6.1 6.0 5.15
Fixes: QTBUG-91749
Change-Id: Ia444232330da2f1d29841589f0e65bb52822c4ae
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
If you have a list of render orders A, B, C, then A and C
changes, we also invalidated B. This was to avoid the situation
where B was ignored by the batching algorithm and A and C
would end up being batched together.
This has a performance impact for some scenes though, where
you might have top level and bottom level items that constantly
change, causing all geometry in the scene to be uploaded every
frame.
Instead of doing this, we skip invalidation of B, but include its
bounding rect when checking for overlaps, so that we still avoid
batching A and C together but without the need to re-upload B.
Fixes: QTBUG-90632
Pick-to: 6.1
Change-Id: I462cf1938360643ca9fa1425ae8c8e08b534fd76
Reviewed-by: Andy Nichols <andy.nichols@qt.io>
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
Reviewed-by: Michael Brasser <michael.brasser@live.com>
The QRhiBuffer does not shrink; thus we can end up with
buffer->buf->size > buffer->size. This would subsequently lead to an
out-of-bounds memory access, and a crash. Fix this by using the
uploadStaticBuffer overload which takes the size.
As a drive-by, remove pointless QByteArray::fromRawData call.
Pick-to: 6.0 6.1
Change-Id: I40058ada6a6a5eb745ae559e8c9ed474fd41f75c
Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>