public inbox for rust-for-linux@vger.kernel.org
From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Gary Guo" <gary@garyguo.net>
Cc: "Danilo Krummrich" <dakr@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Alistair Popple" <apopple@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"Timur Tabi" <ttabi@nvidia.com>, "Zhi Wang" <zhiw@nvidia.com>,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	<rust-for-linux@vger.kernel.org>,
	<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] gpu: nova-core: gsp: fix undefined behavior in command queue code
Date: Sat, 28 Mar 2026 23:53:47 +0900
Message-ID: <DHEI23HSGJRB.15RJSGAFFTMKY@nvidia.com>
In-Reply-To: <DHEFUD64AORK.8FOCH0VJAM4B@garyguo.net>

On Sat Mar 28, 2026 at 10:09 PM JST, Gary Guo wrote:
> On Fri Mar 27, 2026 at 12:47 AM GMT, Alexandre Courbot wrote:
>> On Thu Mar 26, 2026 at 9:03 PM JST, Gary Guo wrote:
>>> On Thu Mar 26, 2026 at 4:51 AM GMT, Alexandre Courbot wrote:
>>>> On Thu Mar 26, 2026 at 1:30 PM JST, Alexandre Courbot wrote:
>>>>> On Wed Mar 25, 2026 at 12:15 AM JST, Gary Guo wrote:
>>>>>> On Tue Mar 24, 2026 at 2:44 PM GMT, Alexandre Courbot wrote:
>>>>>>> On Tue Mar 24, 2026 at 1:44 AM JST, Gary Guo wrote:
>>>>>>>> On Mon Mar 23, 2026 at 5:40 AM GMT, Alexandre Courbot wrote:
>>>>>>>>> `driver_read_area` and `driver_write_area` are internal methods that
>>>>>>>>> return slices containing the area of the command queue buffer to which
>>>>>>>>> the driver has exclusive read or write access, respectively.
>>>>>>>>>
>>>>>>>>> While their returned value is correct and safe to use, internally they
>>>>>>>>> temporarily create a reference to the whole command-buffer slice,
>>>>>>>>> including GSP-owned regions. These regions can change without notice,
>>>>>>>>> and thus creating a slice to them is undefined behavior.
>>>>>>>>>
>>>>>>>>> Fix this by replacing the slice logic with pointer arithmetic and
>>>>>>>>> creating slices to valid regions only. It adds unsafe code, but should
>>>>>>>>> be mostly replaced by `IoView` and `IoSlice` once they land.
>>>>>>>>>
>>>>>>>>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>>>>>>>>> Reported-by: Danilo Krummrich <dakr@kernel.org>
>>>>>>>>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
>>>>>>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>>>>>>> ---
>>>>>>>>> I didn't apply Eliot's Reviewed-by because the code has changed
>>>>>>>>> drastically. The logic should remain identical though.
>>>>>>>>> ---
>>>>>>>>> Changes in v2:
>>>>>>>>> - Use `u32_as_usize` consistently.
>>>>>>>>> - Reduce the number of `unsafe` blocks by computing the end offset of
>>>>>>>>>   the returned slices and creating them at the end, in one step.
>>>>>>>>> - Take advantage of the fact that both slices have the same start index
>>>>>>>>>   regardless of the branch chosen.
>>>>>>>>> - Improve safety comments.
>>>>>>>>> - Link to v1: https://patch.msgid.link/20260319-cmdq-ub-fix-v1-1-0f9f6e8f3ce3@nvidia.com
>>>>>>>>
>>>>>>>> Here's the diff that fixes the issue using I/O projection
>>>>>>>> https://lore.kernel.org/rust-for-linux/20260323153807.1360705-1-gary@kernel.org/
>>>>>>>
>>>>>>> Should we apply or drop this patch meanwhile? I/O projections are still
>>>>>>> undergoing review, but I'm fine with dropping it if Danilo thinks we can
>>>>>>> live a bit longer with that UB. It's not like the driver is actively
>>>>>>> doing anything useful yet anyway.
>>>>>>
>>>>>> I want to avoid big changes back and forth. We could use raw pointer projection
>>>>>> today, which could be fairly easy to convert to I/O projection:
>>>>>
>>>>> Thanks for the diff. I have adapted it to work on top of Danilo's
>>>>> suggestion to compute the end indices first as it works just as well and
>>>>> is cleaner. I have been running into a link error with this conversion
>>>>> applied though - let's discuss that on v3.
>>>>
>>>> Mmm, I guess this was because the optimizer could not prove that the
>>>> slices were within the bounds of the command queue as the expressions
>>>> passed to `ptr::project` were too complex with that version and this
>>>> makes the `ProjectIndex` check fail. I have better luck when doing
>>>> something closer to the diff you pasted.
>>>
>>> I'm considering switching the projection `[]` syntax to become panicking
>>> instead, given that the slicing use case quite often is indeed hard to prove
>>> (and also, we already have panicking comments).
>>>
>>> One option is to just change `[]` to do that, another option is adding a new
>>> `[]!` syntax to denote panicking projections. I'm more inclined toward the
>>> first one to keep consistency with Rust slicing syntax, but the second one is
>>> okay to me too.
>>>
>>> Thoughts?
>>
>> If the slice's validity is hard to prove, then the caller should
>> probably rework their code towards something simpler (like we did with
>> this patch). Allowing a potentially invalid slice to build is just
>> inserting a kernel panic mine, and as you might have noticed from LPC I
>> am not a huge fan of those. :)
>>
>> I think hammering the point about slice validity in the documentation
>> should be enough. We *want* build to fail if the slice can be invalid.
>
> Given the kernel test robot result showing build errors, I am going to add a
> panicking variant. For the use case here you don't really want to use fallible
> returns (panicking indexing + PANIC comments should be sufficient).
>
> I haven't decided on the syntax yet, I'll put this in the next RfL weekly
> meeting agenda to discuss.
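
To make the options in this sub-thread concrete, here is a minimal sketch
in plain userspace Rust, not kernel code. It only contrasts fallible and
panicking slice indexing. The function names are made up for illustration,
and the build-time-checked projection syntax under discussion is not shown
since its final form has not been decided.

```rust
/// Fallible variant: returns `None` when the range is out of bounds,
/// leaving the decision to the caller.
fn region_checked(buf: &[u8], start: usize, end: usize) -> Option<&[u8]> {
    buf.get(start..end)
}

/// Panicking variant: the program aborts at run time if the range is
/// invalid, so the invariant is documented with a PANIC comment.
fn region_panicking(buf: &[u8], start: usize, end: usize) -> &[u8] {
    // PANIC: callers guarantee `start <= end <= buf.len()`.
    &buf[start..end]
}

fn main() {
    let buf = [0u8, 1, 2, 3, 4, 5, 6, 7];

    // In-bounds range: both variants return the same bytes.
    assert_eq!(region_checked(&buf, 2, 5), Some(&buf[2..5]));
    assert_eq!(region_panicking(&buf, 2, 5), &[2u8, 3, 4][..]);

    // Out-of-bounds range: the fallible variant reports it instead of
    // panicking. Calling `region_panicking(&buf, 6, 9)` would panic.
    assert_eq!(region_checked(&buf, 6, 9), None);
}
```

The third option argued for above, failing the build when the bounds
cannot be proven, has no direct equivalent in this sketch.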

Meanwhile, it would be nice to patch that UB. I'll try to reproduce the
bot's errors locally to see if we can make it work. (It will have to
land after -rc6, unfortunately.)
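
For readers joining the thread here, the general shape of the fix
described in the patch (compute the end offsets first, then build slices
covering only the driver-owned regions, without ever forming a reference
to the whole buffer) can be sketched as below. This is not the nova-core
code: `Ring`, its fields, and the one-empty-slot convention are
assumptions made for illustration, and a plain array stands in for the
memory shared with the GSP.

```rust
use core::slice;

/// Hypothetical ring buffer: `base` points to `len` bytes, part of which
/// may be concurrently accessed by a device.
struct Ring {
    base: *mut u8,
    len: usize,
}

impl Ring {
    /// Returns the regions the driver may write: `[tx, end1)` and `[0, end2)`.
    /// `tx` is the driver write index, `rx` the device read index. One slot
    /// is kept empty so that a full ring differs from an empty one.
    ///
    /// # Safety
    ///
    /// `base` must be valid for `len` bytes, and nothing else may access the
    /// returned regions while the slices are alive.
    unsafe fn driver_write_area(&mut self, tx: usize, rx: usize) -> (&mut [u8], &mut [u8]) {
        assert!(tx < self.len && rx < self.len);

        // Both slices start at fixed indices (`tx` and 0), so only the end
        // offsets depend on the branch taken.
        let (end1, end2) = if rx > tx {
            (rx - 1, 0) // free space is `tx..rx - 1`
        } else if rx == 0 {
            (self.len - 1, 0) // free space is `tx..len - 1`
        } else {
            (self.len, rx - 1) // free space wraps: `tx..len` and `0..rx - 1`
        };

        // SAFETY: `tx <= end1 <= len` and `end2 <= tx`, so both ranges are in
        // bounds and disjoint. No reference to the device-owned part is made.
        unsafe {
            (
                slice::from_raw_parts_mut(self.base.add(tx), end1 - tx),
                slice::from_raw_parts_mut(self.base, end2),
            )
        }
    }
}

fn main() {
    let mut storage = [0u8; 8];
    let mut ring = Ring { base: storage.as_mut_ptr(), len: storage.len() };

    // SAFETY: `storage` outlives `ring` and nothing else touches it here.
    let (a, b) = unsafe { ring.driver_write_area(6, 3) };
    assert_eq!((a.len(), b.len()), (2, 2));
    a.fill(0xAA);
    b.fill(0xBB);
    assert_eq!(storage, [0xBB, 0xBB, 0, 0, 0, 0, 0xAA, 0xAA]);
}
```

The runtime `assert!` here is only a stand-in for the sketch. Whether
such bounds should be proven at build time or checked at run time is
exactly what is being debated above.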


Thread overview: 12+ messages
2026-03-23  5:40 [PATCH v2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-03-23 16:44 ` Gary Guo
2026-03-24 14:44   ` Alexandre Courbot
2026-03-24 14:45     ` Danilo Krummrich
2026-03-24 15:15     ` Gary Guo
2026-03-26  4:30       ` Alexandre Courbot
2026-03-26  4:51         ` Alexandre Courbot
2026-03-26 12:03           ` Gary Guo
2026-03-26 15:55             ` Alice Ryhl
2026-03-27  0:47             ` Alexandre Courbot
2026-03-28 13:09               ` Gary Guo
2026-03-28 14:53                 ` Alexandre Courbot [this message]
