From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Timur Tabi" <ttabi@nvidia.com>,
	"nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"dakr@kernel.org" <dakr@kernel.org>,
	"lyude@redhat.com" <lyude@redhat.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"rust-for-linux@vger.kernel.org" <rust-for-linux@vger.kernel.org>
Cc: "nouveau-bounces@lists.freedesktop.org"
	<nouveau-bounces@lists.freedesktop.org>
Subject: Re: [PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
Date: Thu, 04 Dec 2025 23:43:19 +0900
Message-ID: <DEPIFZFMH34K.31NETCKSPOIEL@nvidia.com>
In-Reply-To: <df975795b0dbe58214ad302d7182ce6fe92e5fd8.camel@nvidia.com>

On Thu Dec 4, 2025 at 3:31 AM JST, Timur Tabi wrote:
> On Wed, 2025-12-03 at 20:54 +0900, Alexandre Courbot wrote:
>> On Tue Dec 2, 2025 at 8:25 AM JST, Timur Tabi wrote:
>> > On Wed, 2025-11-19 at 12:36 +0900, Alexandre Courbot wrote:
>> > > You can use the `Alignment` type here, as the rest of the code does:
>> > > 
>> > >     let size = num::usize_as_u64(obj.size())
>> > >         .align_up(Alignment::new::<GSP_PAGE_SIZE>())?;
>> > > 
>> > > Now `align_up` returns an error in case of overflow, that we will need
>> > > to pass down to the caller by changing the return type of `new`. It is a
>> > > bit annoying, but better than the behavior of `next_multiple_of` in such
>> > > a case, which is to panic. :)
>> > 
>> > I see your point, but these are u64s that we're talking about.  The only way next_multiple_of()
>> > can panic is if obj.size() is greater than 0xFFFFFFFFFFFFF000, which is not possible.
>> > 
>> > I would say in this case, a panic is preferable to a convoluted error return that will never be
>> > exercised, because failure here indicates a coding error, not an input error.
>> 
>> The input data is a usize, so technically we could get an input that
>> triggers that error.
>
> Actually, I still say it's not possible.  
>
> Say I change the code to this, so that .next_multiple_of is called on a u64 instead of a usize:
>
> 	let size = num::usize_as_u64(obj.size()).next_multiple_of(GSP_PAGE_SIZE);
>
> Again, the only way this can fail is if the allocated object being passed in is almost 16 exabytes
> in size, which is physically impossible.
>
>> I know it's a very edge case, and clearly indicates a bug, but the
>> general rule is: don't panic the kernel. And in Rust, if possible, don't
>> even let the compiler insert panic-handling code. If you don't want to
>> change the return type of the method, then maybe use `unwrap_or` and
>> `inspect_err` to print an error before returning e.g. `0`.
>
> How about this: if .next_multiple_of(GSP_PAGE_SIZE) does return an error, I'll just assign size to
> obj.size() as-is?  After all, at about 16GB/second for DMA, it will take about 31 years to DMA all
> that memory, so I will have long since retired before that bug shows up.

Please allow me to commend you for doing the computation; that really
cracked me up. :D Maybe we need an `ECAREEROVERFLOW` error code for cases
like this.

And yeah, you are absolutely right, but my point was more about not
having the compiler generate code to handle panic conditions. Maybe I am
thinking too far ahead, but I dream of a future where we could make
guarantees like "this function never panics" and have the compiler
complain if it does. So as a matter of principle I like to avoid panic
paths, especially when they cannot happen in practice.

So something like using `pr_warn` looks reasonable to me as a last
resort.
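
A minimal sketch of that last-resort option, written as plain Rust rather
than kernel code. The function name `page_align_or_warn` is hypothetical,
`eprintln!` stands in for `pr_warn!`, and `GSP_PAGE_SIZE` is assumed to be
4096 based on the "4KB" in the commit log. It uses the standard library's
`checked_next_multiple_of`, which returns `None` on overflow instead of
panicking:

```rust
/// Assumed GSP page size in bytes (4 KiB, per the commit log).
const GSP_PAGE_SIZE: u64 = 4096;

/// Rounds `size` up to the next multiple of `GSP_PAGE_SIZE` without a
/// panic path. On overflow, logs a warning and returns `size` unchanged.
fn page_align_or_warn(size: u64) -> u64 {
    size.checked_next_multiple_of(GSP_PAGE_SIZE).unwrap_or_else(|| {
        // Stand-in for `pr_warn!` in the driver.
        eprintln!(
            "size {:#x} cannot be aligned up to {:#x}, using it as-is",
            size, GSP_PAGE_SIZE
        );
        size
    })
}
```

The fallback value here is the unaligned size, as Timur suggested; returning
`0` instead would only change the closure's last line.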

... or maybe we can address the problem differently. Reading your commit
log again:

  GSP-RM insists that the 'size' parameter of the
  LibosMemoryRegionInitArgument struct be aligned to 4KB.

sounds to me like "it is a bug if `size` is not aligned to 4KB to begin
with". Could that be a correct interpretation?

Because if we align up past the valid data of the object, then what are
we copying? Granted, `CoherentAllocation` will likely have an aligned
size, but that's a lucky implementation detail. So maybe we can simply
return an error if the size is not aligned, which would solve the panic
problem.
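
As a rough illustration of the "reject unaligned sizes" idea, again in plain
Rust: `require_page_aligned` and `UnalignedSize` are hypothetical names, and
in the driver the error would presumably be a kernel error code such as
`EINVAL` rather than a custom type. Nothing is rounded, so there is no
overflow case and no panic path:

```rust
/// Assumed GSP page size in bytes (4 KiB, per the commit log).
const GSP_PAGE_SIZE: u64 = 4096;

/// Hypothetical error type carrying the offending size.
#[derive(Debug, PartialEq, Eq)]
struct UnalignedSize(u64);

/// Accepts `size` only if it is already a multiple of `GSP_PAGE_SIZE`.
fn require_page_aligned(size: u64) -> Result<u64, UnalignedSize> {
    if size % GSP_PAGE_SIZE == 0 {
        Ok(size)
    } else {
        Err(UnalignedSize(size))
    }
}
```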

Or we fix the problem when allocating the `CoherentAllocation`, making
sure the filler data exists and is zeroes, and providing a valid `size`
from the beginning.
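
The allocation-side alternative could look roughly like this. It is a sketch
only: a `Vec<u8>` stands in for `CoherentAllocation`, `padded_copy` is a
hypothetical helper, and `GSP_PAGE_SIZE` is again assumed to be 4096. The
buffer is sized to a whole number of pages up front and zero-filled, so the
padding bytes exist and have a defined value:

```rust
/// Assumed GSP page size in bytes (4 KiB, per the commit log).
const GSP_PAGE_SIZE: usize = 4096;

/// Copies `data` into a new zero-filled buffer whose length is rounded up
/// to a whole number of pages. Returns `None` if rounding would overflow.
fn padded_copy(data: &[u8]) -> Option<Vec<u8>> {
    let padded_len = data.len().checked_next_multiple_of(GSP_PAGE_SIZE)?;
    // Every byte past `data.len()` stays zero.
    let mut buf = vec![0u8; padded_len];
    buf[..data.len()].copy_from_slice(data);
    Some(buf)
}
```

With this shape the size handed to GSP-RM is simply the buffer length, and no
alignment arithmetic is needed when building the init argument.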
