* Re: [PATCH v3 0/6] gpu: nova-core: run unload sequence upon unbinding
[not found] <20260422-nova-unload-v3-0-1d2c81bd3ced@nvidia.com>
@ 2026-05-13 20:59 ` John Hubbard
2026-05-15 2:20 ` Alexandre Courbot
From: John Hubbard @ 2026-05-13 20:59 UTC
To: Alexandre Courbot, Danilo Krummrich, Alice Ryhl, David Airlie,
Simona Vetter, Bjorn Helgaas, Krzysztof Wilczyński,
Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Trevor Gross, Boqun Feng
Cc: Alistair Popple, Joel Fernandes, Timur Tabi, Eliot Courtney,
dri-devel, linux-kernel, rust-for-linux
On 4/22/26 6:40 AM, Alexandre Courbot wrote:
> Currently the GSP is left running and the WPR2 memory region untouched
> when the driver is unbound. This is obviously not ideal for at least two
> reasons:
Hi,
Is this ready to merge, or are you looking for more reviews?
thanks,
--
John Hubbard
>
> - Probing requires setting up the WPR2 region, which cannot be done if
> there is already one in place. Hence the current requirement to reset
> the GPU (using e.g. `echo 1 >/sys/bus/pci/devices/.../reset`) before
> the driver can be probed again after removal.
> - The running GSP may still attempt to access shared memory regions
> which the kernel might recycle.
>
> On top of that, there is a nasty bug in the Blackwell VBIOS that
> sometimes borks the GPU upon PCI reset, requiring a reboot. So relying
> on the PCI reset to unload/reload Nova is really not practical here.
>
> This series does what is needed to leave the GPU in a clean state after
> unbind, for all currently supported GPUs. Blackwell support is trivial
> and will be added alongside the Blackwell series [1] if this can be
> merged first.
>
> The first patch adds a `warn_on_err` utility macro to the kernel crate
> as it is useful to warn on failures in the driver unbind path, but I can
> remove it if it is not deemed useful.
>
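For readers who have not seen the first patch, the idea behind a `warn_on_err`-style helper can be sketched in plain userspace Rust. This is only an illustration under stated assumptions: it is not the macro from the series, it prints with `eprintln!` where kernel code would use a kernel warning facility, and `teardown_step` and `unbind` are hypothetical names invented for the example.

```rust
/// Hypothetical userspace stand-in for a `warn_on_err`-style macro.
/// It evaluates a `Result` expression once, logs a warning with the source
/// location if it is an `Err`, and hands the `Result` back to the caller.
macro_rules! warn_on_err {
    ($expr:expr) => {{
        let result = $expr;
        if let Err(ref e) = result {
            eprintln!(
                "WARNING: {}:{}: `{}` failed: {:?}",
                file!(),
                line!(),
                stringify!($expr),
                e
            );
        }
        result
    }};
}

/// Hypothetical teardown step that can be made to fail for demonstration.
fn teardown_step(name: &'static str, fail: bool) -> Result<(), String> {
    if fail {
        Err(format!("{} failed", name))
    } else {
        Ok(())
    }
}

/// Runs every teardown step even if an earlier one fails, since an unbind
/// path cannot bail out halfway. Returns the number of failed steps.
fn unbind(fail_first: bool) -> usize {
    let mut failures = 0;
    if warn_on_err!(teardown_step("stop GSP", fail_first)).is_err() {
        failures += 1;
    }
    if warn_on_err!(teardown_step("release WPR2", false)).is_err() {
        failures += 1;
    }
    failures
}
```

The point of returning the `Result` unchanged is that the caller still decides what to do with the error; the macro only makes sure a failure in a path that cannot propagate errors does not go unnoticed.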
> This series applies cleanly on `master` as of today.
>
> [1] https://lore.kernel.org/all/20260411024953.473149-1-jhubbard@nvidia.com/
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
> Changes in v3:
> - Disambiguate doccomment for `warn_on_err`.
> - Test the correct bit instead of the whole register value to determine
> that the GSP has stopped.
> - Use an enum instead of a boolean to encode the power level when
> shutting down the GSP.
> - Add missing newline to `dev_err`.
> - Add missing doccomments for new types.
> - Use values from bindings instead of magic numbers.
> - Remove the redundant `get_gsp_info` function.
> - Better document Booter Unloader mailbox sentinel value, and check the
> value of mbox0 upon return.
> - Link to v2: https://patch.msgid.link/20260421-nova-unload-v2-0-2fe54963af8b@nvidia.com
>
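Two of the v3 items above (testing a single bit rather than the whole register value, and replacing a boolean with an enum) can be illustrated with a small standalone sketch. Everything named here is an assumption made for the example: the `HALTED` bit position, the `PowerLevel` variants, and their numeric encodings are hypothetical and do not come from the nova-core sources.

```rust
/// Hypothetical position of a "halted" flag in a status register.
const HALTED: u32 = 1 << 4;

/// Fragile check: compares the whole register, so it reports "not halted"
/// as soon as any unrelated bit is also set.
fn halted_by_value(reg: u32) -> bool {
    reg == HALTED
}

/// Robust check: masks out everything except the bit of interest.
fn halted_by_bit(reg: u32) -> bool {
    (reg & HALTED) != 0
}

/// Hypothetical power levels. An enum documents intent at the call site,
/// whereas a bare `true`/`false` argument does not say what it selects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PowerLevel {
    Off,
    Suspend,
}

/// Maps the enum to a hypothetical numeric argument for a shutdown command.
fn shutdown_arg(level: PowerLevel) -> u32 {
    match level {
        PowerLevel::Off => 0,
        PowerLevel::Suspend => 3,
    }
}
```

With a register value of `0x11` (the hypothetical halted bit plus one unrelated bit), the whole-value comparison misses the halt while the masked test detects it, which is the class of bug the v3 change addresses.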
> Changes in v2:
> - Rebase on top of `master` and remove unneeded/obsolete preparatory patches.
> - Tidy up the imports of commands from the `fw` module in the `gsp` module.
> - Link to v1: https://patch.msgid.link/20251216-nova-unload-v1-0-6a5d823be19d@nvidia.com
>
> ---
> Alexandre Courbot (6):
> rust: add warn_on_err macro
> gpu: nova-core: use warn_on_err macro
> gpu: nova-core: remove unneeded get_gsp_info proxy function
> gpu: nova-core: do not import firmware commands into GSP command module
> gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP command upon unloading
> gpu: nova-core: run Booter Unloader and FWSEC-SB upon unbinding
>
> drivers/gpu/nova-core/firmware/booter.rs | 1 -
> drivers/gpu/nova-core/firmware/fwsec.rs | 1 -
> drivers/gpu/nova-core/gpu.rs | 21 +++--
> drivers/gpu/nova-core/gsp/boot.rs | 100 +++++++++++++++++++++-
> drivers/gpu/nova-core/gsp/commands.rs | 69 +++++++++++----
> drivers/gpu/nova-core/gsp/fw.rs | 4 +
> drivers/gpu/nova-core/gsp/fw/commands.rs | 44 ++++++++++
> drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 11 +++
> drivers/gpu/nova-core/regs.rs | 5 ++
> rust/kernel/bug.rs | 10 +++
> 10 files changed, 241 insertions(+), 25 deletions(-)
> ---
> base-commit: b4e07588e743c989499ca24d49e752c074924a9a
> change-id: 20251216-nova-unload-4029b3b76950
>
> Best regards,
> --
> Alexandre Courbot <acourbot@nvidia.com>
>