From: "Danilo Krummrich" <dakr@kernel.org>
To: "Alexandre Courbot" <acourbot@nvidia.com>
Cc: "Eliot Courtney" <ecourtney@nvidia.com>,
"Alice Ryhl" <aliceryhl@google.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
<rust-for-linux@vger.kernel.org>, <nouveau@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
"dri-devel" <dri-devel-bounces@lists.freedesktop.org>,
"Gary Guo" <gary@garyguo.net>
Subject: Re: [PATCH 6/9] gpu: nova-core: generalize `flush_into_kvec` to `flush_into_vec`
Date: Tue, 17 Mar 2026 15:12:32 +0100 [thread overview]
Message-ID: <DH54AITP8HBY.PPSYM4FA2DZI@kernel.org> (raw)
In-Reply-To: <DH53MXEAVB0Q.2SXHCQTW8WG2Y@nvidia.com>
On Tue Mar 17, 2026 at 2:41 PM CET, Alexandre Courbot wrote:
> On Tue Mar 17, 2026 at 7:49 PM JST, Danilo Krummrich wrote:
>> On Tue Mar 17, 2026 at 2:55 AM CET, Alexandre Courbot wrote:
>>> We shouldn't be doing that - I think we are limited by the current
>>> CoherentAllocation API though. But IIUC this is something that I/O
>>> projections will allow us to handle properly?
>>
>> Why do we need projections to avoid UB here? driver_read_area() already even
>> peeks into the firmware abstraction layer, which is where MsgqData technically
>> belongs into (despite being trivial).
>>
>> let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
>> let data = &gsp_mem.gspq.msgq.data;
>>
>> Why do we need I/O projections to do raw pointer arithmetic where creating a
>> reference is UB?
>>
>> (Eventually, we want to use IoView of course, as this is a textbook example of
>> what I proposed IoSlice for.)
>
> Limiting the amount of `unsafe`s, but I guess we can live with that as
> this is going to be short-term anyway.
Of course it is going to be better with IoSlice, but limiting the number of
unsafe calls regardless is a bit pointless if the "safe" ones can cause
undefined behavior. :)
>> Another option in the meantime would be / have been to use dma_read!() and
>> extract (copy) the data right away in driver_read_area(), which I'd probably
>> prefer over raw pointer arithmetic.
>
> I'd personally like to keep the current "no-copy" approach as it
> implements the right reference discipline (i.e. you need a mutable
> reference to update the read pointer, which cannot be done if the buffer
> is read by the driver) and moving to copy semantics would open a window
> of opportunity to mess with that balance further (on top of requiring
> bigger code changes that will be temporary).
I don't even know if we want them to be temporary, i.e. we can copy right away
and IoSlice would still be an improvement in order to make the copy in the first
place.
Also, you say "no-copy", but that's not true, we do copy eventually. In fact,
the whole point of this patch is to copy this buffer into a KVVec.
So, why not copy it right away with dma_read!() (later replaced with an IoSlice
copy) and then process it further?
I am also very sceptical of the "holding on to the reference prevents the read
pointer update" argument. Once we have a copy, there is no need not to update
the read pointer anymore in the first place, no?
>> But in any case, this can (and should) be fixed even without IoView.
>>
>> Besides that, nothing prevents us doing the same thing I did for gsp_write_ptr()
>> in the meantime to not break out of the firmware abstraction layer.
>>
>>> This is guaranteed by the inability to update the CPU read pointer for
>>> as long as the slices exists.
>>
>> Fair enough.
>>
>>> Unless we decide to not trust the GSP, but that would be opening a whole
>>> new can of worms.
>>
>> I thought about this as well, and I think it's fine. The safety comment within
>> the function has to justify why the device won't access the memory. If the
>> device does so regardless, it's simply a bug.
>>
>>>> I don't want to merge any code that builds on top of this before we have sorted
>>>> this out.
>>>
>>> If what I have written above is correct, then the fix should simply be
>>> to use I/O projections to create properly-bounded references.
>>
>> I still don't think we need I/O projections for a reasonable fix and I also
>> don't agree that we should keep UB until new features land.
>
> I have the following (modulo missing safety comments) to fix
> `driver_read_area` - does it look acceptable to you? If so I'll go
> ahead and fix `driver_write_area` as well.
Not pretty (which is of course not on you :), but looks correct.
I still feel like we should just copy right away, as mentioned above.
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index efa1aab1568f..3bddb5a2923f 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -296,24 +296,53 @@ fn driver_write_area_size(&self) -> usize {
> let tx = self.gsp_write_ptr() as usize;
> let rx = self.cpu_read_ptr() as usize;
>
> + // Pointer to the start of the GSP message queue.
> + //
> // SAFETY:
> - // - The `CoherentAllocation` contains exactly one object.
> - // - We will only access the driver-owned part of the shared memory.
> - // - Per the safety statement of the function, no concurrent access will be performed.
> - let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
> - let data = &gsp_mem.gspq.msgq.data;
> + // - `self.0` contains exactly one element.
> + // - `gspq.msgq.data[0]` is within the bounds of that element.
> + let data = unsafe { &raw const (*self.0.start_ptr()).gspq.msgq.data[0] };
> +
> + // Safety/Panic comments to be referenced by the code below.
> + //
> + // SAFETY[1]:
> + // - `data` contains `MSGQ_NUM_PAGES` elements.
> + // - The area starting at `rx` and ending at `tx - 1` modulo `MSGQ_NUM_PAGES`,
> + // inclusive, belongs to the driver for reading and is not accessed concurrently by
> + // the GSP.
> + //
> + // PANIC[1]:
> + // - Per the invariant of `cpu_read_ptr`, `rx < MSGQ_NUM_PAGES`.
> + // - Per the invariant of `gsp_write_ptr`, `tx < MSGQ_NUM_PAGES`.
>
> - // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
> - // belongs to the driver for reading.
> - // PANIC:
> - // - per the invariant of `cpu_read_ptr`, `rx < MSGQ_NUM_PAGES`
> - // - per the invariant of `gsp_write_ptr`, `tx < MSGQ_NUM_PAGES`
> if rx <= tx {
> // The area is contiguous.
> - (&data[rx..tx], &[])
> + (
> + // SAFETY: See SAFETY[1].
> + //
> + // PANIC:
> + // - See PANIC[1].
> + // - Per the branch test, `rx <= tx`.
> + unsafe { core::slice::from_raw_parts(data.add(rx), tx - rx) },
> + &[],
> + )
> } else {
> // The area is discontiguous.
> - (&data[rx..], &data[..tx])
> + (
> + // SAFETY: See SAFETY[1].
> + //
> + // PANIC: See PANIC[1].
> + unsafe {
> + core::slice::from_raw_parts(
> + data.add(rx),
> + num::u32_as_usize(MSGQ_NUM_PAGES) - rx,
> + )
> + },
> + // SAFETY: See SAFETY[1].
> + //
> + // PANIC: See PANIC[1].
> + unsafe { core::slice::from_raw_parts(data, tx) },
> + )
> }
> }
next prev parent reply other threads:[~2026-03-17 14:12 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-27 12:32 [PATCH 0/9] gpu: nova-core: gsp: add RM control command infrastructure Eliot Courtney
2026-02-27 12:32 ` [PATCH 1/9] gpu: nova-core: gsp: add NV_STATUS error code bindings Eliot Courtney
2026-02-27 12:32 ` [PATCH 2/9] gpu: nova-core: gsp: add NvStatus enum for RM control errors Eliot Courtney
2026-02-27 12:32 ` [PATCH 3/9] gpu: nova-core: gsp: expose GSP-RM internal client and subdevice handles Eliot Courtney
2026-03-09 21:22 ` Joel Fernandes
2026-03-09 23:41 ` Joel Fernandes
2026-03-10 0:06 ` John Hubbard
2026-03-10 2:17 ` Joel Fernandes
2026-03-10 2:29 ` John Hubbard
2026-03-10 18:48 ` Joel Fernandes
2026-03-10 2:36 ` Alexandre Courbot
2026-03-10 4:02 ` Eliot Courtney
2026-03-10 10:35 ` Danilo Krummrich
2026-02-27 12:32 ` [PATCH 4/9] gpu: nova-core: gsp: add RM control RPC structure binding Eliot Courtney
2026-02-27 12:32 ` [PATCH 5/9] gpu: nova-core: gsp: add types for RM control RPCs Eliot Courtney
2026-03-09 21:45 ` Joel Fernandes
2026-03-16 11:42 ` Eliot Courtney
2026-02-27 12:32 ` [PATCH 6/9] gpu: nova-core: generalize `flush_into_kvec` to `flush_into_vec` Eliot Courtney
2026-03-09 21:53 ` Joel Fernandes
2026-03-09 21:57 ` Danilo Krummrich
2026-03-09 22:01 ` Danilo Krummrich
2026-03-16 11:44 ` Eliot Courtney
2026-03-16 12:21 ` Danilo Krummrich
2026-03-17 1:55 ` Alexandre Courbot
2026-03-17 10:49 ` Danilo Krummrich
2026-03-17 13:41 ` Alexandre Courbot
2026-03-17 14:12 ` Danilo Krummrich [this message]
2026-03-18 1:52 ` Alexandre Courbot
2026-02-27 12:32 ` [PATCH 7/9] gpu: nova-core: gsp: add RM control command infrastructure Eliot Courtney
2026-03-02 8:00 ` Zhi Wang
2026-03-09 22:08 ` Joel Fernandes
2026-03-13 15:40 ` Danilo Krummrich
2026-03-16 12:06 ` Eliot Courtney
2026-02-27 12:32 ` [PATCH 8/9] gpu: nova-core: gsp: add CE fault method buffer size bindings Eliot Courtney
2026-03-09 22:08 ` Joel Fernandes
2026-02-27 12:32 ` [PATCH 9/9] gpu: nova-core: gsp: add CeGetFaultMethodBufferSize RM control command Eliot Courtney
2026-03-09 22:23 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DH54AITP8HBY.PPSYM4FA2DZI@kernel.org \
--to=dakr@kernel.org \
--cc=acourbot@nvidia.com \
--cc=airlied@gmail.com \
--cc=aliceryhl@google.com \
--cc=dri-devel-bounces@lists.freedesktop.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=ecourtney@nvidia.com \
--cc=gary@garyguo.net \
--cc=linux-kernel@vger.kernel.org \
--cc=nouveau@lists.freedesktop.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=simona@ffwll.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox