From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>,
<intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
<badal.nilawar@intel.com>, <riana.tauro@intel.com>
Subject: Re: [PATCH v1 1/2] drm/xe/debugfs: Expose PCIe Gen5 update telemetry
Date: Mon, 31 Mar 2025 11:23:39 -0400
Message-ID: <Z-qzew1aK4LvL-Ir@intel.com>
In-Reply-To: <yujx3wbw5du2pl3po7ahwrmefsroueqf7sa4gimobelbpdfqcz@ifw4g2ctn4vg>
On Mon, Mar 31, 2025 at 09:52:28AM -0500, Lucas De Marchi wrote:
> On Mon, Mar 31, 2025 at 07:53:35PM +0530, Raag Jadav wrote:
> > Expose debugfs telemetry required for PCIe Gen5 firmware update for
>
> telemetry?? it doesn't seem anything related to telemetry here.
telemetry is definitely not a good word here...
>
> > discrete GPUs.
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_debugfs.c | 93 +++++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_pcode_api.h | 4 ++
> > 2 files changed, 97 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index d0503959a8ed..67c941abf4fe 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -17,6 +17,9 @@
> > #include "xe_gt_debugfs.h"
> > #include "xe_gt_printk.h"
> > #include "xe_guc_ads.h"
> > +#include "xe_mmio.h"
> > +#include "xe_pcode_api.h"
> > +#include "xe_pcode.h"
> > #include "xe_pm.h"
> > #include "xe_pxp_debugfs.h"
> > #include "xe_sriov.h"
> > @@ -191,6 +194,89 @@ static const struct file_operations wedged_mode_fops = {
> > .write = wedged_mode_set,
> > };
> >
> > +/**
> > + * DOC: PCIe Gen5 Update Limitations
> > + *
> > + * Default link speed of discrete GPUs is determined by FIT parameters stored
> > + * in their flash memory, which are subject to override through user initiated
> > + * firmware updates. It has been observed that devices configured with PCIe
> > + * Gen5 as their default speed can come across link quality issues due to host
> > + * or motherboard limitations and may have to auto-downspeed to PCIe Gen4 when
> > + * faced with unstable link at Gen5. The users are required to ensure that the
> > + * device is capable of auto-downspeeding to PCIe Gen4 before pushing the image
> > + * with Gen5 as default configuration. This can be done by reading
> > + * ``pcie_gen4_downspeed_capable`` debugfs entry, which will denote PCIe Gen4
> > + * auto-downspeed capability of the device with boolean output value of ``0``
> > + * or ``1``, meaning `incapable` or `capable` respectively.
>
> It doesn't seem like something to have in debugfs. If this is for end
> users, they may not even have debugfs mounted or available at all.
I was the one pushing this towards debugfs, but I now believe that, as is,
it belongs in sysfs. The admin needs this information before upgrading the IFWI.
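
For illustration only, an admin-side check before flashing could look roughly
like the sketch below. It assumes the entry keeps the name, the debugfs
location and the ``0``/``1`` output proposed in this v1 patch, all of which
may change if it moves to sysfs. The helper names parse_capable() and
downspeed_capable() are hypothetical.

```python
from pathlib import Path

# Location as proposed in the v1 patch; an assumption that may not survive
# review (e.g. if the entry moves to sysfs).
DEBUGFS_ENTRY = "/sys/kernel/debug/dri/{card}/pcie_gen4_downspeed_capable"


def parse_capable(raw: str) -> bool:
    """Interpret the entry's output: '1' means capable, '0' incapable."""
    value = raw.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected value: {value!r}")
    return value == "1"


def downspeed_capable(card: int = 0) -> bool:
    """Read the entry for DRM device <N>; needs debugfs mounted and root."""
    return parse_capable(Path(DEBUGFS_ENTRY.format(card=card)).read_text())
```

A flashing tool would refuse (or at least warn) before pushing a Gen5-default
image when downspeed_capable() returns False.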
>
> Please clarify what's being used for this firmware upgrade.
The final goal is to use fwupdtool, building on some work that Tomas had started.
We are still clearing the path for that. But there are also some igsc tools
that can be used now to flash the fw...
And XPU manager will likely plug into that soon.
>
> > + *
> > + * .. code-block:: shell
> > + *
> > + * $ cat /sys/kernel/debug/dri/<N>/pcie_gen4_downspeed_capable
> > + *
> > + * Pushing PCIe Gen5 update on an auto-downspeed incapable device and facing
>
> Isn't the ability to downgrade the link to Gen4 something controlled by
> the firmware? Why would we push a Gen5 firmware that can't downgrade to
> Gen4?
There are combinations out there that work well and safely at Gen5, with
no need to downgrade. But in many cases it is safe to check the Gen4 downgrade
possibility and status...
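
As a rough sketch of the "status" side, the currently negotiated link speed
can be read from the generic PCI sysfs attribute current_link_speed. This is
not part of the patch; the device address passed in is a placeholder, the
function names are hypothetical, and the speed strings are assumed to follow
the "16.0 GT/s PCIe" form reported by recent kernels.

```python
from pathlib import Path

# Per-lane transfer rate (GT/s) mapped to the PCIe generation it belongs to.
GTS_TO_GEN = {"2.5": 1, "5.0": 2, "8.0": 3, "16.0": 4, "32.0": 5, "64.0": 6}


def link_gen(speed: str) -> int:
    """Map a string such as '16.0 GT/s PCIe' to its PCIe generation."""
    rate = speed.split()[0]
    return GTS_TO_GEN[rate]


def current_link_gen(bdf: str) -> int:
    """Read current_link_speed for a PCI device given its domain:bus:dev.fn."""
    path = Path("/sys/bus/pci/devices") / bdf / "current_link_speed"
    return link_gen(path.read_text())
```

Comparing the result against the same mapping applied to max_link_speed shows
whether the link is running below what the device advertises.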
>
> > + * link instability due to host or motherboard limitations can result in driver
> > + * not being able to successfully bind to the device, making further firmware
> > + * updates impossible with RMA being the only last resort.
>
> when starting survivability mode, can't we always force it to gen4 to
> avoid this kind of issues?
Well, we cannot choose that from software.
>
> Lucas De Marchi