* Re: [PATCH v3 0/6] gpu: nova-core: run unload sequence upon unbinding
[not found] <20260422-nova-unload-v3-0-1d2c81bd3ced@nvidia.com>
@ 2026-05-13 20:59 ` John Hubbard
2026-05-15 2:20 ` Alexandre Courbot
0 siblings, 1 reply; 2+ messages in thread
From: John Hubbard @ 2026-05-13 20:59 UTC
To: Alexandre Courbot, Danilo Krummrich, Alice Ryhl, David Airlie,
Simona Vetter, Bjorn Helgaas, Krzysztof Wilczyński,
Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Trevor Gross, Boqun Feng
Cc: Alistair Popple, Joel Fernandes, Timur Tabi, Eliot Courtney,
dri-devel, linux-kernel, rust-for-linux
On 4/22/26 6:40 AM, Alexandre Courbot wrote:
> Currently the GSP is left running and the WPR2 memory region untouched
> when the driver is unbound. This is obviously not ideal for at least two
> reasons:
Hi,
Is this ready to merge, or are you looking for more reviews?
thanks,
--
John Hubbard
>
> - Probing requires setting up the WPR2 region, which cannot be done if
> there is already one in place. Hence the current requirement to reset
> the GPU (using e.g. `echo 1 >/sys/bus/pci/devices/.../reset`) before
> the driver can be probed again after removal.
> - The running GSP may still attempt to access shared memory regions
> which the kernel might recycle.
>
> On top of that, there is a nasty bug in the Blackwell VBIOS that
> sometimes borks the GPU upon PCI reset, requiring a reboot. So relying
> on the PCI reset to unload/reload Nova is really not practical here.
>
> This series does what is needed to leave the GPU in a clean state after
> unbind, for all currently supported GPUs. Blackwell support is trivial
> and will be added alongside the Blackwell series [1] if this can be
> merged first.
>
> The first patch adds a `warn_on_err` utility macro to the kernel crate
> as it is useful to warn on failures in the driver unbind path, but I can
> remove it if it is not deemed useful.
>
> This series applies cleanly on `master` as of today.
>
> [1] https://lore.kernel.org/all/20260411024953.473149-1-jhubbard@nvidia.com/
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
> Changes in v3:
> - Disambiguate doccomment for `warn_on_err`.
> - Test the correct bit instead of the whole register value to determine
> that the GSP has stopped.
> - Use an enum instead of a boolean to encode the power level when
> shutting down the GSP.
> - Add missing newline to `dev_err`.
> - Add missing doccomments for new types.
> - Use values from bindings instead of magic numbers.
> - Remove the redundant `get_gsp_info` function.
> - Better document Booter Unloader mailbox sentinel value, and check the
> value of mbox0 upon return.
> - Link to v2: https://patch.msgid.link/20260421-nova-unload-v2-0-2fe54963af8b@nvidia.com
>
> Changes in v2:
> - Rebase on top of `master` and remove unneeded/obsolete preparatory patches.
> - Tidy up the imports of commands from the `fw` module in the `gsp` module.
> - Link to v1: https://patch.msgid.link/20251216-nova-unload-v1-0-6a5d823be19d@nvidia.com
>
> ---
> Alexandre Courbot (6):
> rust: add warn_on_err macro
> gpu: nova-core: use warn_on_err macro
> gpu: nova-core: remove unneeded get_gsp_info proxy function
> gpu: nova-core: do not import firmware commands into GSP command module
> gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP command upon unloading
> gpu: nova-core: run Booter Unloader and FWSEC-SB upon unbinding
>
> drivers/gpu/nova-core/firmware/booter.rs | 1 -
> drivers/gpu/nova-core/firmware/fwsec.rs | 1 -
> drivers/gpu/nova-core/gpu.rs | 21 +++--
> drivers/gpu/nova-core/gsp/boot.rs | 100 +++++++++++++++++++++-
> drivers/gpu/nova-core/gsp/commands.rs | 69 +++++++++++----
> drivers/gpu/nova-core/gsp/fw.rs | 4 +
> drivers/gpu/nova-core/gsp/fw/commands.rs | 44 ++++++++++
> drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 11 +++
> drivers/gpu/nova-core/regs.rs | 5 ++
> rust/kernel/bug.rs | 10 +++
> 10 files changed, 241 insertions(+), 25 deletions(-)
> ---
> base-commit: b4e07588e743c989499ca24d49e752c074924a9a
> change-id: 20251216-nova-unload-4029b3b76950
>
> Best regards,
> --
> Alexandre Courbot <acourbot@nvidia.com>
>
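The cover letter quoted above mentions a `warn_on_err` macro for warning on failures in the driver unbind path. The idea can be sketched in plain userspace Rust as follows. This is only an illustration under stated assumptions: the macro body, `stop_gsp`, and `unbind` are hypothetical stand-ins, `eprintln!` replaces the kernel's warning machinery, and the macro actually proposed in `rust/kernel/bug.rs` may behave differently.

```rust
/// Hypothetical userspace stand-in for the `warn_on_err` idea: evaluate an
/// expression yielding a `Result`, print a warning if it is an `Err`, and
/// hand the result back unchanged so the caller can keep going.
macro_rules! warn_on_err {
    ($expr:expr) => {{
        let result = $expr;
        if let Err(ref e) = result {
            // Assumption: the kernel version would emit a kernel warning
            // here rather than writing to stderr.
            eprintln!(
                "WARNING: {}:{}: `{}` failed: {:?}",
                file!(),
                line!(),
                stringify!($expr),
                e
            );
        }
        result
    }};
}

/// Hypothetical unload step that may fail.
fn stop_gsp(simulate_failure: bool) -> Result<(), &'static str> {
    if simulate_failure {
        Err("GSP did not halt")
    } else {
        Ok(())
    }
}

/// Unbind-style teardown: every step runs even if an earlier one failed,
/// and failures are only warned about. Returns the number of failed steps.
fn unbind(fail_first_step: bool) -> usize {
    let mut failures = 0;
    if warn_on_err!(stop_gsp(fail_first_step)).is_err() {
        failures += 1;
    }
    // The second step still runs regardless of the first step's outcome.
    if warn_on_err!(stop_gsp(false)).is_err() {
        failures += 1;
    }
    failures
}
```

Calling `unbind(true)` prints one warning for the first step and still runs the second one. This is the property that matters on an unbind path, where there is no caller left to propagate an error to.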
* Re: [PATCH v3 0/6] gpu: nova-core: run unload sequence upon unbinding
From: Alexandre Courbot @ 2026-05-15 2:20 UTC
To: John Hubbard
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Bjorn Helgaas, Krzysztof Wilczyński, Miguel Ojeda, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Boqun Feng, Alistair Popple, Joel Fernandes,
Timur Tabi, Eliot Courtney, dri-devel, linux-kernel,
rust-for-linux
On Thu May 14, 2026 at 5:59 AM JST, John Hubbard wrote:
> On 4/22/26 6:40 AM, Alexandre Courbot wrote:
>> Currently the GSP is left running and the WPR2 memory region untouched
>> when the driver is unbound. This is obviously not ideal for at least two
>> reasons:
>
> Hi,
>
> Is this ready to merge, or are you looking for more reviews?
This needs to be rebased on top of (and applied after) the Device HRT
series. I will try to post a v5 today that does that.