From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Danilo Krummrich" <dakr@kernel.org>
Cc: "Eliot Courtney" <ecourtney@nvidia.com>,
"Alice Ryhl" <aliceryhl@google.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
<rust-for-linux@vger.kernel.org>, <nouveau@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
"dri-devel" <dri-devel-bounces@lists.freedesktop.org>,
"Gary Guo" <gary@garyguo.net>
Subject: Re: [PATCH 6/9] gpu: nova-core: generalize `flush_into_kvec` to `flush_into_vec`
Date: Tue, 17 Mar 2026 10:55:16 +0900 [thread overview]
Message-ID: <DH4OM17DJ47V.3NJG7G02O7P1L@nvidia.com> (raw)
In-Reply-To: <DH47AVPEKN06.3BERUSJIB4M1R@kernel.org>

On Mon Mar 16, 2026 at 9:21 PM JST, Danilo Krummrich wrote:
> (Cc: Gary)
>
> On Mon Mar 16, 2026 at 12:44 PM CET, Eliot Courtney wrote:
>> On Tue Mar 10, 2026 at 7:01 AM JST, Danilo Krummrich wrote:
>>> On Mon Mar 9, 2026 at 10:57 PM CET, Danilo Krummrich wrote:
>>>> On 2/27/2026 1:32 PM, Eliot Courtney wrote:
>>>>> Add general `flush_into_vec` function. Add `flush_into_kvvec`
>>>>> convenience wrapper alongside the existing `flush_into_kvec` function.
>>>>> This is generally useful but immediately used for e.g. holding RM
>>>>> control payloads, which can be large (~>=20 KiB).
>>>>
>>>> Why not just always use KVVec? It also seems that the KVec variant is not used?
>>>
>>> (Besides its single usage in GspSequence, which wouldn't hurt to be a KVVec.)
>>>
>>>> If there's no reason for having both, I'd also just call this into_vec().
>>
>> I think always using KVVec should be fine, thanks!
>>
>> For the naming, I think `read_to_vec` may be more conventional for this
>> -- `into_vec` implies consuming the object, but if we want to keep the
>> warning in `Cmdq::receive_msg` if not all the data is consumed we need
>> to take &mut self.
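As a side note on the signature question, here is a hypothetical sketch (a plain `Vec` standing in for `KVVec`, and a much-simplified stand-in for `SBuffer`) of why `read_to_vec(&mut self)` preserves the leftover-data check, while an `into_vec(self)` that consumes the reader would not:

```rust
// Hypothetical model, not the actual nova-core types.
struct SBuffer<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> SBuffer<'a> {
    /// Drains up to `len` bytes into a new vector, advancing the cursor.
    /// Takes `&mut self`, so the reader survives the call.
    fn read_to_vec(&mut self, len: usize) -> Vec<u8> {
        let end = (self.pos + len).min(self.data.len());
        let v = self.data[self.pos..end].to_vec();
        self.pos = end;
        v
    }

    /// Bytes not yet consumed; a caller like `Cmdq::receive_msg` can
    /// warn if this is nonzero after processing a message.
    fn remaining(&self) -> usize {
        self.data.len() - self.pos
    }
}

fn main() {
    let data = [1u8, 2, 3, 4, 5];
    let mut sb = SBuffer { data: &data, pos: 0 };
    let payload = sb.read_to_vec(3);
    assert_eq!(payload, vec![1, 2, 3]);
    // Because `sb` was only mutably borrowed, unconsumed data can
    // still be detected afterwards; `into_vec(self)` would have
    // destroyed the reader and made this check impossible.
    assert_eq!(sb.remaining(), 2);
}
```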
>
> I had another look at this and especially how the SBuffer you refer to is used.
> Unfortunately, the underlying code is broken.
>
> driver_read_area() creates a reference to the whole DMA object, including the
> area the GSP might concurrently write to. This is undefined behavior. See also
> commit 0073a17b4666 ("gpu: nova-core: gsp: fix UB in DmaGspMem pointer
> commit 0073a17b4666 ("gpu: nova-core: gsp: fix UB in DmaGspMem pointer
> accessors"), where I fixed something similar.
We shouldn't be doing that - I think we are limited by the current
CoherentAllocation API though. But IIUC this is something that I/O
projections will allow us to handle properly?
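To illustrate the distinction in the meantime (a generic sketch with hypothetical names, not the actual CoherentAllocation or nova-core code): forming a `&[u8]` over the whole buffer asserts to the compiler that none of it changes for the borrow's lifetime, including the region the device is still writing to, whereas going through a raw pointer with volatile reads makes no such claim:

```rust
use std::ptr;

/// Reads `len` bytes at `off` without ever forming a reference that
/// spans the whole buffer (which would assert immutability over the
/// region the device may still be writing to).
///
/// # Safety
/// The caller must guarantee that `[off, off + len)` is in bounds and
/// is not concurrently written by the device.
unsafe fn read_device_area(base: *const u8, off: usize, len: usize) -> Vec<u8> {
    (0..len)
        .map(|i| unsafe { ptr::read_volatile(base.add(off + i)) })
        .collect()
}

fn main() {
    // `buf` stands in for the CPU mapping of the coherent DMA object.
    let mut buf = vec![0u8; 8];
    buf[2..5].copy_from_slice(&[10, 20, 30]);
    // SAFETY: `buf[2..5]` is in bounds and nothing else writes to it.
    let got = unsafe { read_device_area(buf.as_ptr(), 2, 3) };
    assert_eq!(got, vec![10, 20, 30]);
}
```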
>
> Additionally, even if it would only create a reference to the part of the buffer
> that can be considered untouched by the GSP and hence suitable for creating a
> reference, driver_read_area() and all subsequent callers would still need to be
> unsafe as they would need to promise to not keep the reference alive beyond GSP
> accessing that memory region again.
This is guaranteed by the inability to update the CPU read pointer for
as long as the slices exist.
To expand a bit: `driver_read_area` returns a slice to the area of the
DMA object that the GSP is guaranteed *not* to write into until the
driver updates the CPU read pointer.
This area is between the CPU read pointer (which signals the next bytes
the CPU has to read, and which the GSP won't cross) and the GSP write
pointer (i.e. the next page to be written by the GSP).
Everything in this zone is data that the GSP has already written but the
driver hasn't read yet at the time of the call.
The CPU read pointer cannot be updated for as long as the returned
slices exist - the slices hold a reference to the `DmaGspMem`, and
updating the read pointer requires a mutable reference to the same
`DmaGspMem`.
Meanwhile, the GSP can keep writing data while the slices exist, but
that data will land past the area covered by the slices, since the GSP
never crosses the CPU read pointer.
So the data in the returned slices is guaranteed to be there at the time
of the call, and immutable for as long as the slices exist. Thus, they
can be provided by a safe method.
Unless we decide to not trust the GSP, but that would be opening a whole
new can of worms.
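To make the borrow argument concrete, here is a minimal model (hypothetical names, a plain `Vec` standing in for the DMA object, byte granularity instead of pages) of how holding the returned slices pins the CPU read pointer:

```rust
// Hypothetical model, not the actual nova-core types.
struct GspMsgQueue {
    buf: Vec<u8>,     // stands in for the CPU mapping of the DMA object
    cpu_read: usize,  // next byte the CPU has to read; the GSP won't cross it
    gsp_write: usize, // next byte to be written by the GSP
}

impl GspMsgQueue {
    /// Returns the area the GSP is guaranteed not to touch until the
    /// driver advances `cpu_read`. Two slices because the ring may wrap.
    fn driver_read_area(&self) -> (&[u8], &[u8]) {
        if self.cpu_read <= self.gsp_write {
            (&self.buf[self.cpu_read..self.gsp_write], &self.buf[0..0])
        } else {
            (&self.buf[self.cpu_read..], &self.buf[..self.gsp_write])
        }
    }

    /// Advancing the read pointer needs `&mut self`, so it cannot be
    /// called while slices from `driver_read_area()` are still in use.
    fn advance_read_ptr(&mut self, n: usize) {
        self.cpu_read = (self.cpu_read + n) % self.buf.len();
    }
}

fn main() {
    let mut q = GspMsgQueue {
        buf: vec![1, 2, 3, 4, 5, 6, 7, 8],
        cpu_read: 6,
        gsp_write: 2, // wrapped: readable area is bytes 6..8 and 0..2
    };
    let (a, b) = q.driver_read_area();
    assert_eq!((a, b), (&[7u8, 8][..], &[1u8, 2][..]));
    let n = a.len() + b.len();
    // Touching `a` or `b` after calling `advance_read_ptr()` would be
    // rejected by the borrow checker: `q` would still be shared-borrowed.
    q.advance_read_ptr(n);
    assert_eq!(q.cpu_read, 2);
}
```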
> I don't want to merge any code that builds on top of this before we have sorted
> this out.
If what I have written above is correct, then the fix should simply be
to use I/O projections to create properly-bounded references. Any more
immediate fix would need to be much more intrusive and require a
refactoring that is imho more risky than carrying on for a bit with the
current behavior.
Thread overview: 37+ messages
2026-02-27 12:32 [PATCH 0/9] gpu: nova-core: gsp: add RM control command infrastructure Eliot Courtney
2026-02-27 12:32 ` [PATCH 1/9] gpu: nova-core: gsp: add NV_STATUS error code bindings Eliot Courtney
2026-02-27 12:32 ` [PATCH 2/9] gpu: nova-core: gsp: add NvStatus enum for RM control errors Eliot Courtney
2026-02-27 12:32 ` [PATCH 3/9] gpu: nova-core: gsp: expose GSP-RM internal client and subdevice handles Eliot Courtney
2026-03-09 21:22 ` Joel Fernandes
2026-03-09 23:41 ` Joel Fernandes
2026-03-10 0:06 ` John Hubbard
2026-03-10 2:17 ` Joel Fernandes
2026-03-10 2:29 ` John Hubbard
2026-03-10 18:48 ` Joel Fernandes
2026-03-10 2:36 ` Alexandre Courbot
2026-03-10 4:02 ` Eliot Courtney
2026-03-10 10:35 ` Danilo Krummrich
2026-02-27 12:32 ` [PATCH 4/9] gpu: nova-core: gsp: add RM control RPC structure binding Eliot Courtney
2026-02-27 12:32 ` [PATCH 5/9] gpu: nova-core: gsp: add types for RM control RPCs Eliot Courtney
2026-03-09 21:45 ` Joel Fernandes
2026-03-16 11:42 ` Eliot Courtney
2026-02-27 12:32 ` [PATCH 6/9] gpu: nova-core: generalize `flush_into_kvec` to `flush_into_vec` Eliot Courtney
2026-03-09 21:53 ` Joel Fernandes
2026-03-09 21:57 ` Danilo Krummrich
2026-03-09 22:01 ` Danilo Krummrich
2026-03-16 11:44 ` Eliot Courtney
2026-03-16 12:21 ` Danilo Krummrich
2026-03-17 1:55 ` Alexandre Courbot [this message]
2026-03-17 10:49 ` Danilo Krummrich
2026-03-17 13:41 ` Alexandre Courbot
2026-03-17 14:12 ` Danilo Krummrich
2026-03-18 1:52 ` Alexandre Courbot
2026-02-27 12:32 ` [PATCH 7/9] gpu: nova-core: gsp: add RM control command infrastructure Eliot Courtney
2026-03-02 8:00 ` Zhi Wang
2026-03-09 22:08 ` Joel Fernandes
2026-03-13 15:40 ` Danilo Krummrich
2026-03-16 12:06 ` Eliot Courtney
2026-02-27 12:32 ` [PATCH 8/9] gpu: nova-core: gsp: add CE fault method buffer size bindings Eliot Courtney
2026-03-09 22:08 ` Joel Fernandes
2026-02-27 12:32 ` [PATCH 9/9] gpu: nova-core: gsp: add CeGetFaultMethodBufferSize RM control command Eliot Courtney
2026-03-09 22:23 ` Joel Fernandes