Re: [PATCH v5 1/9] drm/xe/uc_fw: Allow re-initializing firmware

public inbox for intel-xe@lists.freedesktop.org
 help / color / mirror / Atom feed

From: Raag Jadav <raag.jadav@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: intel-xe@lists.freedesktop.org, matthew.brost@intel.com,
	rodrigo.vivi@intel.com, thomas.hellstrom@linux.intel.com,
	riana.tauro@intel.com, michal.wajdeczko@intel.com,
	matthew.d.roper@intel.com, michal.winiarski@intel.com,
	matthew.auld@intel.com, maarten@lankhorst.se,
	jani.nikula@intel.com, lukasz.laguna@intel.com,
	zhanjun.dong@intel.com, lukas@wunner.de
Subject: Re: [PATCH v5 1/9] drm/xe/uc_fw: Allow re-initializing firmware
Date: Mon, 20 Apr 2026 13:20:35 +0200	[thread overview]
Message-ID: <aeYMA6LjQ40muCrH@black.igk.intel.com> (raw)
In-Reply-To: <10a458a4-8c28-48f0-847a-1213c74a7e8e@intel.com>

On Wed, Apr 15, 2026 at 09:06:40AM -0700, Daniele Ceraolo Spurio wrote:
> On 4/6/2026 7:07 AM, Raag Jadav wrote:
> > In preparation of usecases which require re-initializing firmware without
> > reloading the driver, introduce xe_uc_fw_reinit(). The uC firmware bo
> > already exists but since it's contents are on VRAM, they are lost on PCIe
> > FLR. Copy the firmware back to it's bo and mark it as loadable as part of
> > re-initialization.
> 
> Could we just keep the firmware in SMEM instead? It's only accessed during
> FW load (unlike other GuC objects like the CTB that are accessed regularly),
> so it shouldn't be a massive penalty even if the memory access takes longer.

The initial fw load indeed happens in SMEM but they're later moved to VRAM
using xe_managed_bo_reinit_in_vram() and uc_fw layer is probably not aware
of this.

Keeping them in SMEM will probably make this patch redundant but I'm not
informed enough about the history to really know the rationale behind
current state of things. Either way works for me.

> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> > v2: Add kernel doc (Matthew Brost)
> > ---
> >   drivers/gpu/drm/xe/xe_uc_fw.c | 39 +++++++++++++++++++++++++++++------
> >   drivers/gpu/drm/xe/xe_uc_fw.h |  1 +
> >   2 files changed, 34 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_uc_fw.c b/drivers/gpu/drm/xe/xe_uc_fw.c
> > index df2aa196f6f9..fb168238ee7d 100644
> > --- a/drivers/gpu/drm/xe/xe_uc_fw.c
> > +++ b/drivers/gpu/drm/xe/xe_uc_fw.c
> > @@ -715,13 +715,7 @@ static int uc_fw_request(struct xe_uc_fw *uc_fw, const struct firmware **firmwar
> >   	const struct firmware *fw = NULL;
> >   	int err;
> > -	/*
> > -	 * we use FIRMWARE_UNINITIALIZED to detect checks against uc_fw->status
> > -	 * before we're looked at the HW caps to see if we have uc support
> > -	 */
> >   	BUILD_BUG_ON(XE_UC_FIRMWARE_UNINITIALIZED);
> > -	xe_gt_assert(gt, !uc_fw->status);
> > -	xe_gt_assert(gt, !uc_fw->path);
> >   	uc_fw_auto_select(xe, uc_fw);
> > @@ -834,6 +828,14 @@ static int uc_fw_copy(struct xe_uc_fw *uc_fw, const void *data, size_t size, u32
> >   	return err;
> >   }
> > +static void uc_fw_reinit(struct xe_uc_fw *uc_fw, const void *data)
> > +{
> > +	struct xe_device *xe = uc_fw_to_xe(uc_fw);
> > +
> > +	xe_map_memcpy_to(xe, &uc_fw->bo->vmap, 0, data, uc_fw->size);
> > +	xe_uc_fw_change_status(uc_fw, XE_UC_FIRMWARE_LOADABLE);
> > +}
> > +
> >   int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
> >   {
> >   	const struct firmware *fw = NULL;
> > @@ -857,6 +859,31 @@ int xe_uc_fw_init(struct xe_uc_fw *uc_fw)
> >   }
> >   ALLOW_ERROR_INJECTION(xe_uc_fw_init, ERRNO); /* See xe_pci_probe() */
> > +/**
> > + * xe_uc_fw_reinit() - Re-initialize uC firmware into its bo
> > + * @uc_fw: uC firmware
> > + *
> > + * Returns: 0 on success, negative error code otherwise.
> > + */
> > +int xe_uc_fw_reinit(struct xe_uc_fw *uc_fw)
> > +{
> > +	const struct firmware *fw = NULL;
> > +	int err;
> 
> should probably bail early if the FW is not supported.
> 
> > +
> > +	err = uc_fw_request(uc_fw, &fw);
> 
> I don't really like the idea of going through FW selection again, as this
> could technically cause us to switch to a different FW path or image. IMO
> here for native/PF we should do something like:
> 
>     if (!intel_uc_fw_is_available(uc_fw))
>         return 0;
> 
>     xe_uc_fw_change_status(uc_fw, XE_UC_FIRMWARE_SELECTED);
>     old_fw_version = uc_fw->versions.found[XE_UC_FW_VER_RELEASE];
>     fw = firmware_request_nowarn(&fw, uc_fw->path, dev);
> 
>     parse_headers(uc_fw, fw);
>     new_fw_version = uc_fw->versions.found[XE_UC_FW_VER_RELEASE];
>     if (old_fw_version != new_fw_version)
>         return -ENOEXEC;
> 
>     uc_fw_reinit(uc_fw, fw->data);
>     release_firmware(fw);
> 
> The firmware_request_nowarn() and parse_headers() are the same ones used in
> uc_fw_request, so it might be worth moving them to a common function (in
> which case we can use uc_fw_release instead of release_firmware to match).

Makes sense.

Raag

> > +	if (err)
> > +		return err;
> > +
> > +	/* no error and no firmware means nothing to copy */
> > +	if (!fw)
> > +		return 0;
> > +
> > +	uc_fw_reinit(uc_fw, fw->data);
> > +	uc_fw_release(fw);
> > +
> > +	return err;
> > +}
> > +
> >   static u32 uc_fw_ggtt_offset(struct xe_uc_fw *uc_fw)
> >   {
> >   	return xe_bo_ggtt_addr(uc_fw->bo);
> > diff --git a/drivers/gpu/drm/xe/xe_uc_fw.h b/drivers/gpu/drm/xe/xe_uc_fw.h
> > index bb281b72a677..9f469bf07d66 100644
> > --- a/drivers/gpu/drm/xe/xe_uc_fw.h
> > +++ b/drivers/gpu/drm/xe/xe_uc_fw.h
> > @@ -15,6 +15,7 @@
> >   struct drm_printer;
> >   int xe_uc_fw_init(struct xe_uc_fw *uc_fw);
> > +int xe_uc_fw_reinit(struct xe_uc_fw *uc_fw);
> >   size_t xe_uc_fw_copy_rsa(struct xe_uc_fw *uc_fw, void *dst, u32 max_len);
> >   int xe_uc_fw_upload(struct xe_uc_fw *uc_fw, u32 offset, u32 dma_flags);
> >   int xe_uc_fw_check_version_requirements(struct xe_uc_fw *uc_fw);
>

next prev parent reply	other threads:[~2026-04-20 11:20 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-06 14:07 [PATCH v5 0/9] Introduce Xe PCIe FLR Raag Jadav
2026-04-06 14:07 ` [PATCH v5 1/9] drm/xe/uc_fw: Allow re-initializing firmware Raag Jadav
2026-04-15 16:06   ` Daniele Ceraolo Spurio
2026-04-20 11:20     ` Raag Jadav [this message]
2026-04-06 14:07 ` [PATCH v5 2/9] drm/xe/guc_submit: Introduce guc_exec_queue_reinit() Raag Jadav
2026-04-06 14:07 ` [PATCH v5 3/9] drm/xe/gt: Introduce FLR helpers Raag Jadav
2026-04-15 16:25   ` Daniele Ceraolo Spurio
2026-04-06 14:07 ` [PATCH v5 4/9] drm/xe/irq: Introduce xe_irq_disable() Raag Jadav
2026-04-06 14:07 ` [PATCH v5 5/9] drm/xe: Introduce xe_device_assert_lmem_ready() Raag Jadav
2026-04-06 14:07 ` [PATCH v5 6/9] drm/xe/bo_evict: Introduce xe_bo_restore_map() Raag Jadav
2026-04-06 14:07 ` [PATCH v5 7/9] drm/xe/exec_queue: Introduce xe_exec_queue_reinit() Raag Jadav
2026-04-15 16:10   ` Daniele Ceraolo Spurio
2026-04-15 16:48     ` Daniele Ceraolo Spurio
2026-04-15 17:02       ` Daniele Ceraolo Spurio
2026-04-06 14:07 ` [PATCH v5 8/9] drm/xe/migrate: Introduce xe_migrate_reinit() Raag Jadav
2026-04-06 14:07 ` [PATCH v5 9/9] drm/xe/pci: Introduce PCIe FLR Raag Jadav
2026-04-15  8:43   ` Laguna, Lukasz
2026-04-15  9:46     ` Raag Jadav
2026-04-15 10:33       ` Laguna, Lukasz
2026-04-15 10:54         ` Raag Jadav
2026-04-16  6:40           ` Raag Jadav
2026-04-17  7:10             ` Laguna, Lukasz
2026-04-15 16:45   ` Daniele Ceraolo Spurio
2026-04-06 14:18 ` ✗ CI.checkpatch: warning for Introduce Xe PCIe FLR (rev5) Patchwork
2026-04-06 14:19 ` ✓ CI.KUnit: success " Patchwork
2026-04-06 14:54 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-06 18:08 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-04-10 14:22 ` [PATCH v5 0/9] Introduce Xe PCIe FLR Raag Jadav
2026-04-10 18:22   ` Maarten Lankhorst
2026-04-11  8:11     ` Raag Jadav
2026-04-15 15:47 ` Daniele Ceraolo Spurio
2026-04-16  6:19   ` Raag Jadav
2026-04-16  6:35     ` Matthew Brost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aeYMA6LjQ40muCrH@black.igk.intel.com \
    --to=raag.jadav@intel.com \
    --cc=daniele.ceraolospurio@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=lukas@wunner.de \
    --cc=lukasz.laguna@intel.com \
    --cc=maarten@lankhorst.se \
    --cc=matthew.auld@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=michal.wajdeczko@intel.com \
    --cc=michal.winiarski@intel.com \
    --cc=riana.tauro@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=zhanjun.dong@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox