Re: [PATCH v6 11/11] gpu: nova-core: add PIO support for loading firmware images

public inbox for rust-for-linux@vger.kernel.org
 help / color / mirror / Atom feed

From: "Danilo Krummrich" <dakr@kernel.org>
To: "Timur Tabi" <ttabi@nvidia.com>
Cc: "Alexandre Courbot" <acourbot@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	<nouveau@lists.freedesktop.org>, <rust-for-linux@vger.kernel.org>
Subject: Re: [PATCH v6 11/11] gpu: nova-core: add PIO support for loading firmware images
Date: Fri, 16 Jan 2026 22:05:19 +0100	[thread overview]
Message-ID: <DFQBHVTTHZY8.13ASLCJ3FJP81@kernel.org> (raw)
In-Reply-To: <20260114192950.1143002-12-ttabi@nvidia.com>

On Wed Jan 14, 2026 at 8:29 PM CET, Timur Tabi wrote:
> +    /// Write a byte array to Falcon memory using programmed I/O (PIO).
> +    ///
> +    /// Writes `img` to the specified `target_mem` (IMEM or DMEM) starting at `mem_base`.
> +    /// For IMEM writes, tags are set for each 256-byte block starting from `tag`.
> +    ///
> +    /// Returns `EINVAL` if `img.len()` is not a multiple of 4.
> +    fn pio_wr_bytes(
> +        &self,
> +        bar: &Bar0,
> +        img: &[u8],
> +        mem_base: u16,
> +        target_mem: FalconMem,
> +        port: u8,

This looks like we should use the Bounded type instead.

> +        tag: u16,
> +    ) -> Result {
> +        // Rejecting misaligned images here allows us to avoid checking
> +        // inside the loops.
> +        if img.len() % 4 != 0 {
> +            return Err(EINVAL);
> +        }
> +
> +        let port = usize::from(port);
> +
> +        match target_mem {
> +            FalconMem::ImemSecure | FalconMem::ImemNonSecure => {
> +                regs::NV_PFALCON_FALCON_IMEMC::default()
> +                    .set_secure(target_mem == FalconMem::ImemSecure)
> +                    .set_aincw(true)
> +                    .set_offs(mem_base)
> +                    .write(bar, &E::ID, port);
> +
> +                let mut tag = tag;
> +                for block in img.chunks(256) {

	for (n, block) in img.chunks(256).iter().enumerate() {
	    regs::NV_PFALCON_FALCON_IMEMT::default()
	        .set_tag(tag + n)
	        .write(bar, &E::ID, port);
	}

This way you don't need the mutable shadow of tag. Besides that, how do we know
this doesn't overflow? Don't we need a checked_add()?

> +                    regs::NV_PFALCON_FALCON_IMEMT::default()
> +                        .set_tag(tag)
> +                        .write(bar, &E::ID, port);
> +                    for word in block.chunks_exact(4) {
> +                        let w = [word[0], word[1], word[2], word[3]];
> +                        regs::NV_PFALCON_FALCON_IMEMD::default()
> +                            .set_data(u32::from_le_bytes(w))
> +                            .write(bar, &E::ID, port);
> +                    }
> +                    tag += 1;
> +                }
> +            }
> +            FalconMem::Dmem => {
> +                regs::NV_PFALCON_FALCON_DMEMC::default()
> +                    .set_aincw(true)
> +                    .set_offs(mem_base)
> +                    .write(bar, &E::ID, port);
> +
> +                for block in img.chunks(256) {
> +                    for word in block.chunks_exact(4) {
> +                        let w = [word[0], word[1], word[2], word[3]];
> +                        regs::NV_PFALCON_FALCON_DMEMD::default()
> +                            .set_data(u32::from_le_bytes(w))
> +                            .write(bar, &E::ID, port);
> +                    }
> +                }
> +            }
> +        }
> +
> +        Ok(())
> +    }
> +
> +    fn pio_wr<F: FalconFirmware<Target = E>>(
> +        &self,
> +        bar: &Bar0,
> +        fw: &F,
> +        target_mem: FalconMem,
> +        load_offsets: &FalconLoadTarget,
> +        port: u8,
> +        tag: u16,
> +    ) -> Result {
> +        let start = usize::from_safe_cast(load_offsets.src_start);
> +        let len = usize::from_safe_cast(load_offsets.len);
> +        let mem_base = u16::try_from(load_offsets.dst_start)?;
> +
> +        // SAFETY: we are the only user of the firmware image at this stage
> +        let data = unsafe { fw.as_slice(start, len).map_err(|_| EINVAL)? };

Why do we need the firmware image to be backed by a DMA object at this point
when you load the firmware image through PIO anyways?

> diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
> index 49501aa6ff7f..3a882b7d8aa8 100644
> --- a/drivers/gpu/nova-core/falcon/hal.rs
> +++ b/drivers/gpu/nova-core/falcon/hal.rs
> @@ -15,6 +15,11 @@
>  mod ga102;
>  mod tu102;
>  
> +pub(crate) enum LoadMethod {
> +    Pio,
> +    Dma,
> +}

Seems obvious, but still, please add some documentation explaining that this is
the load method for falcon firmware images, etc.

> +pub(crate) struct GenericBootloader {
> +    pub desc: BootloaderDesc,
> +    pub ucode: Vec<u8, kernel::alloc::allocator::Kmalloc>,

	pub ucode: KVec<u8>,

Also, we may want to use KVVec<u8> or just VVec<u8> instead. What's the image
size?

> +}
> +
>  /// The FWSEC microcode, extracted from the BIOS and to be run on the GSP falcon.
>  ///
>  /// It is responsible for e.g. carving out the WPR2 region as the first step of the GSP bootflow.
> @@ -221,6 +286,8 @@ pub(crate) struct FwsecFirmware {
>      desc: FalconUCodeDesc,
>      /// GPU-accessible DMA object containing the firmware.
>      ucode: FirmwareDmaObject<Self, Signed>,
> +    /// Generic bootloader
> +    gen_bootloader: Option<GenericBootloader>,

I'm not convinced this is a good idea. We probably want a HAL here and have
different FwsecFirmware types:

One with a DMA object and one with a system memory object when the architecture
uses PIO. In the latter case the object can have a GenericBootloader field, i.e.
this also gets us rid of the Option and all the subsequent 'if chipset <
Chipset::GA102' checks and 'match gbl_fw' matches below.

>  }
>  
>  impl FalconLoadParams for FwsecFirmware {
> @@ -275,7 +342,19 @@ fn brom_params(&self) -> FalconBromParams {
>      }
>  
>      fn boot_addr(&self) -> u32 {
> -        0
> +        match &self.desc {
> +            FalconUCodeDesc::V2(_v2) => {
> +                // On V2 platforms, the boot address is extracted from the
> +                // generic bootloader, because the gbl is what actually copies
> +                // FWSEC into memory, so that is what needs to be booted.
> +                if let Some(ref gbl) = self.gen_bootloader {
> +                    gbl.desc.start_tag << 8
> +                } else {
> +                    0
> +                }
> +            }
> +            FalconUCodeDesc::V3(_v3) => 0,
> +        }
>      }
>  }
>  
> @@ -376,6 +455,7 @@ impl FwsecFirmware {
>      /// command.
>      pub(crate) fn new(
>          dev: &Device<device::Bound>,
> +        chipset: Chipset,
>          falcon: &Falcon<Gsp>,
>          bar: &Bar0,
>          bios: &Vbios,
> @@ -432,9 +512,54 @@ pub(crate) fn new(
>              ucode_dma.no_patch_signature()
>          };
>  
> +        // The Generic Bootloader exists only on Turing and GA100.  To avoid a bogus
> +        // console error message on other platforms, only try to load it if it's
> +        // supposed to be there.
> +        let gbl_fw = if chipset < Chipset::GA102 {
> +            Some(super::request_firmware(
> +                dev,
> +                chipset,
> +                "gen_bootloader",
> +                FIRMWARE_VERSION,
> +            )?)
> +        } else {
> +            None
> +        };
> +
> +        let gbl = match gbl_fw {
> +            Some(fw) => {
> +                let hdr = fw
> +                    .data()
> +                    .get(0..size_of::<BinHdr>())
> +                    .and_then(BinHdr::from_bytes_copy)
> +                    .ok_or(EINVAL)?;
> +
> +                let desc_offset = usize::from_safe_cast(hdr.header_offset);
> +                let desc = fw
> +                    .data()
> +                    .get(desc_offset..desc_offset + size_of::<BootloaderDesc>())
> +                    .and_then(BootloaderDesc::from_bytes_copy)
> +                    .ok_or(EINVAL)?;
> +
> +                let ucode_start = usize::from_safe_cast(hdr.data_offset);
> +                let ucode_size = usize::from_safe_cast(hdr.data_size);
> +                let ucode_data = fw
> +                    .data()
> +                    .get(ucode_start..ucode_start + ucode_size)
> +                    .ok_or(EINVAL)?;
> +
> +                let mut ucode = KVec::new();
> +                ucode.extend_from_slice(ucode_data, GFP_KERNEL)?;
> +
> +                Some(GenericBootloader { desc, ucode })
> +            }
> +            None => None,
> +        };
> +
>          Ok(FwsecFirmware {
>              desc,
>              ucode: ucode_signed,
> +            gen_bootloader: gbl,
>          })
>      }

next prev parent reply	other threads:[~2026-01-16 21:05 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 19:29 [PATCH v6 00/11] gpu: nova-core: add Turing support Timur Tabi
2026-01-14 19:29 ` [PATCH v6 01/11] gpu: nova-core: rename Imem to ImemSecure Timur Tabi
2026-01-14 19:29 ` [PATCH v6 02/11] gpu: nova-core: add ImemNonSecure section infrastructure Timur Tabi
2026-01-22 12:52   ` Gary Guo
2026-01-22 19:00     ` Timur Tabi
2026-01-14 19:29 ` [PATCH v6 03/11] gpu: nova-core: support header parsing on Turing/GA100 Timur Tabi
2026-01-14 19:29 ` [PATCH v6 04/11] gpu: nova-core: add support for Turing/GA100 fwsignature Timur Tabi
2026-01-14 19:29 ` [PATCH v6 05/11] gpu: nova-core: add NV_PFALCON_FALCON_DMATRFCMD::with_falcon_mem() Timur Tabi
2026-01-14 19:29 ` [PATCH v6 06/11] gpu: nova-core: move some functions into the HAL Timur Tabi
2026-01-16 19:55   ` Danilo Krummrich
2026-01-16 20:08     ` John Hubbard
2026-01-16 20:11       ` Danilo Krummrich
2026-01-16 20:15         ` John Hubbard
2026-01-16 20:21           ` Danilo Krummrich
2026-01-16 20:27             ` John Hubbard
2026-01-14 19:29 ` [PATCH v6 07/11] gpu: nova-core: Add basic Turing HAL Timur Tabi
2026-01-14 19:29 ` [PATCH v6 08/11] gpu: nova-core: add Falcon HAL method supports_dma() Timur Tabi
2026-01-16  1:53   ` Alexandre Courbot
2026-01-16  3:00     ` Timur Tabi
2026-01-14 19:29 ` [PATCH v6 09/11] gpu: nova-core: add FalconUCodeDescV2 support Timur Tabi
2026-01-16  3:11   ` Alexandre Courbot
2026-01-16  3:23   ` Alexandre Courbot
2026-01-16 20:09     ` Danilo Krummrich
2026-01-14 19:29 ` [PATCH v6 10/11] gpu: nova-core: align LibosMemoryRegionInitArgument size to page size Timur Tabi
2026-01-14 19:29 ` [PATCH v6 11/11] gpu: nova-core: add PIO support for loading firmware images Timur Tabi
2026-01-16 15:22   ` Alexandre Courbot
2026-01-16 21:05   ` Danilo Krummrich [this message]
2026-01-17  1:55     ` Alexandre Courbot
2026-01-21  0:25     ` Timur Tabi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DFQBHVTTHZY8.13ASLCJ3FJP81@kernel.org \
    --to=dakr@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=nouveau@lists.freedesktop.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=ttabi@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox