* [PATCH] drm/i915: Include GuC fw version in error state
@ 2017-02-23 23:11 Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-23 23:11 UTC (permalink / raw)
To: intel-gfx
There was no way to check if the platform is running the latest firmware.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2b1d15668192..e022187916ee 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 			   CSR_VERSION_MINOR(csr->version));
 	}
 
+	if (HAS_GUC_UCODE(dev_priv)) {
+		struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
+
+		err_printf(m, "GuC loaded: %s\n",
+			   yesno(guc_fw->load_status ==
+				 INTEL_UC_FIRMWARE_SUCCESS));
+		err_printf(m, "GuC fw version: %d.%d\n",
+			   guc_fw->major_ver_found, guc_fw->minor_ver_found);
+	}
+
 	err_printf(m, "EIR: 0x%08x\n", error->eir);
 	err_printf(m, "IER: 0x%08x\n", error->ier);
 	for (i = 0; i < error->ngtier; i++)
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* ✓ Fi.CI.BAT: success for drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
@ 2017-02-24 0:22 ` Patchwork
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
2017-02-24 10:40 ` Michal Wajdeczko
2 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-02-24 0:22 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Include GuC fw version in error state
URL   : https://patchwork.freedesktop.org/series/20181/
State : success

== Summary ==

Series 20181v1 drm/i915: Include GuC fw version in error state
https://patchwork.freedesktop.org/api/1.0/series/20181/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19
fi-bxt-t5700     total:108  pass:95   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29

c6638f903295bbbd29957b878a42b83c5566250c drm-tip: 2017y-02m-23d-22h-50m-35s UTC integration manifest
b721bdd drm/i915: Include GuC fw version in error state

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3956/
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-02-24 3:43 ` Kamble, Sagar A
2017-02-24 9:13 ` Chris Wilson
2017-02-24 10:40 ` Michal Wajdeczko
2 siblings, 1 reply; 13+ messages in thread
From: Kamble, Sagar A @ 2017-02-24 3:43 UTC (permalink / raw)
To: Michel Thierry, intel-gfx

Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>

On 2/24/2017 4:41 AM, Michel Thierry wrote:
> There was no way to check if the platform is running the latest firmware.
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2b1d15668192..e022187916ee 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>  			   CSR_VERSION_MINOR(csr->version));
>  	}
>
> +	if (HAS_GUC_UCODE(dev_priv)) {
> +		struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> +
> +		err_printf(m, "GuC loaded: %s\n",
> +			   yesno(guc_fw->load_status ==
> +				 INTEL_UC_FIRMWARE_SUCCESS));
> +		err_printf(m, "GuC fw version: %d.%d\n",
> +			   guc_fw->major_ver_found, guc_fw->minor_ver_found);
> +	}
> +
>  	err_printf(m, "EIR: 0x%08x\n", error->eir);
>  	err_printf(m, "IER: 0x%08x\n", error->ier);
>  	for (i = 0; i < error->ngtier; i++)
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
@ 2017-02-24 9:13 ` Chris Wilson
2017-02-24 10:43 ` Michal Wajdeczko
0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 9:13 UTC (permalink / raw)
To: Kamble, Sagar A; +Cc: intel-gfx

On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>
> On 2/24/2017 4:41 AM, Michel Thierry wrote:
>
> [...]
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>  			   CSR_VERSION_MINOR(csr->version));
>  	}
>
> +	if (HAS_GUC_UCODE(dev_priv)) {
> +		struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> +
> +		err_printf(m, "GuC loaded: %s\n",
> +			   yesno(guc_fw->load_status ==
> +				 INTEL_UC_FIRMWARE_SUCCESS));
> +		err_printf(m, "GuC fw version: %d.%d\n",
> +			   guc_fw->major_ver_found, guc_fw->minor_ver_found);
> +	}
> +

Hmm. The firmware may change between the hang and cat
/sys/class/drm/card0/error (as it will be reloaded after the reset).
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 9:13 ` Chris Wilson
@ 2017-02-24 10:43 ` Michal Wajdeczko
2017-02-24 10:49 ` Chris Wilson
0 siblings, 1 reply; 13+ messages in thread
From: Michal Wajdeczko @ 2017-02-24 10:43 UTC (permalink / raw)
To: Chris Wilson, Kamble, Sagar A, Michel Thierry, intel-gfx

On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
> [...]
> Hmm. The firmware may change between the hang and cat
> /sys/class/drm/card0/error (as it will be reloaded after the reset).

Btw, maybe we should add counter that will be incremented on each fw reload
and reported here ?

-Michal
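The reload counter suggested above could look roughly like the following user-space sketch. Everything in it is an assumption made for illustration: `struct fw_info`, its `reload_count` field, and the helpers `fw_note_load()` and `fw_describe()` are hypothetical names, not part of the patch or of the i915 driver.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a firmware descriptor. The reload_count
 * field is an assumption; the patch under review has no such counter.
 */
struct fw_info {
	unsigned int major_ver_found;
	unsigned int minor_ver_found;
	unsigned int reload_count;	/* bumped on every (re)load */
};

/* Record one load of the given firmware version. */
static void fw_note_load(struct fw_info *fw, unsigned int major,
			 unsigned int minor)
{
	fw->major_ver_found = major;
	fw->minor_ver_found = minor;
	fw->reload_count++;
}

/* Format the lines a dump might carry; returns snprintf()'s result. */
static int fw_describe(const struct fw_info *fw, char *buf, size_t len)
{
	return snprintf(buf, len,
			"GuC fw version: %u.%u\nGuC fw reloads: %u\n",
			fw->major_ver_found, fw->minor_ver_found,
			fw->reload_count);
}
```

A reader comparing two dumps could then tell whether the firmware was reloaded in between, even when the version numbers are identical.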
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:43 ` Michal Wajdeczko
@ 2017-02-24 10:49 ` Chris Wilson
2017-02-24 15:45 ` Kamble, Sagar A
2017-02-24 16:30 ` Michel Thierry
0 siblings, 2 replies; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 10:49 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx

On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
> > [...]
> > Hmm. The firmware may change between the hang and cat
> > /sys/class/drm/card0/error (as it will be reloaded after the reset).
>
> Btw, maybe we should add counter that will be incremented on each fw reload
> and reported here ?

If it occurs to you that we need it for post-mortem debugging and having
it is worth more than any potential confusion....

I can see the need for knowing what guc/huc/dmc/etc was running at the
time of a hang - I just hope that what was previously running before an
earlier reset doesn't contribute. But that's why we focus on the first
error in a system...
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:49 ` Chris Wilson
@ 2017-02-24 15:45 ` Kamble, Sagar A
2017-02-24 16:30 ` Michel Thierry
1 sibling, 0 replies; 13+ messages in thread
From: Kamble, Sagar A @ 2017-02-24 15:45 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Michel Thierry, intel-gfx

On 2/24/2017 4:19 PM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
>> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
>>> [...]
>>> Hmm. The firmware may change between the hang and cat
>>> /sys/class/drm/card0/error (as it will be reloaded after the reset).
>> Btw, maybe we should add counter that will be incremented on each fw reload
>> and reported here ?
> If it occurs to you that we need it for post-mortem debugging and having
> it is worth more than any potential confusion....
>
> I can see the need for knowing what guc/huc/dmc/etc was running at the
> time of a hang - I just hope that what was previously running before an
> earlier reset doesn't contribute. But that's why we focus on the first
> error in a system...
> -Chris
>

GT reset count is present already in error state. GuC kernel parameters
are present and this change will help us identify which firmware issue
was encountered. So I feel printing ver_found should be enough.
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:49 ` Chris Wilson
2017-02-24 15:45 ` Kamble, Sagar A
@ 2017-02-24 16:30 ` Michel Thierry
2017-02-24 17:15 ` Chris Wilson
1 sibling, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:30 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Kamble, Sagar A, intel-gfx

On 2/24/2017 2:49 AM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
>> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
>>> [...]
>>> Hmm. The firmware may change between the hang and cat
>>> /sys/class/drm/card0/error (as it will be reloaded after the reset).
>>
>> Btw, maybe we should add counter that will be incremented on each fw reload
>> and reported here ?
>
> If it occurs to you that we need it for post-mortem debugging and having
> it is worth more than any potential confusion....
>
> I can see the need for knowing what guc/huc/dmc/etc was running at the
> time of a hang - I just hope that what was previously running before an
> earlier reset doesn't contribute. But that's why we focus on the first
> error in a system...

Can the firmware change?
Last time I checked the filename was hard-coded in the driver. It's
true that the load process could fail and then the information be
incorrect.
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 16:30 ` Michel Thierry
@ 2017-02-24 17:15 ` Chris Wilson
2017-02-24 17:32 ` Michel Thierry
0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 17:15 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 24, 2017 at 08:30:43AM -0800, Michel Thierry wrote:
> On 2/24/2017 2:49 AM, Chris Wilson wrote:
> >[...]
> >I can see the need for knowing what guc/huc/dmc/etc was running at the
> >time of a hang - I just hope that what was previously running before an
> >earlier reset doesn't contribute. But that's why we focus on the first
> >error in a system...
>
> Can the firmware change?
> Last time I checked the filename was hard-coded in the driver. It's
> true that the load process could fail and then the information be
> incorrect.

Assume it won't be hardcoded for ever (or at least no more than a week)...
And yes, the filesystem state may have changed since the previous load.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 17:15 ` Chris Wilson
@ 2017-02-24 17:32 ` Michel Thierry
0 siblings, 0 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 17:32 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Kamble, Sagar A, intel-gfx

On 2/24/2017 9:15 AM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 08:30:43AM -0800, Michel Thierry wrote:
>> [...]
>> Can the firmware change?
>> Last time I checked the filename was hard-coded in the driver. It's
>> true that the load process could fail and then the information be
>> incorrect.
>
> Assume it won't be hardcoded for ever (or at least no more than a week)...
> And yes, the filesystem state may have changed since the previous load.

ok, I'll add an i915_capture_fw_state to collect the information before
the reset (for dmc/guc/huc).
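The point the thread converges on is that firmware details should be copied into the error record when the hang is captured, not read from live driver state when the dump is printed. The following is only a user-space sketch of that idea under stated assumptions: `struct live_fw`, `struct error_record`, `capture_fw_state()` and `print_fw_state()` are hypothetical names chosen for illustration and are not the follow-up patch's actual code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical live firmware state; it may change on every reload. */
struct live_fw {
	bool loaded;
	unsigned int major, minor;
};

/* Hypothetical error record: holds a copy, not a pointer to live state. */
struct error_record {
	struct live_fw guc;	/* frozen at capture time */
};

/* Called when the hang is detected, before a reset reloads the firmware. */
static void capture_fw_state(struct error_record *err,
			     const struct live_fw *live)
{
	err->guc = *live;
}

/* Called later, when the dump is read; touches only the snapshot. */
static int print_fw_state(const struct error_record *err, char *buf,
			  size_t len)
{
	return snprintf(buf, len,
			"GuC loaded: %s\nGuC fw version: %u.%u\n",
			err->guc.loaded ? "yes" : "no",
			err->guc.major, err->guc.minor);
}
```

Because the record owns its copy, a reload that happens between the capture and the read leaves the printed version unchanged.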
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
@ 2017-02-24 10:40 ` Michal Wajdeczko
2017-02-24 16:15 ` Michel Thierry
2 siblings, 1 reply; 13+ messages in thread
From: Michal Wajdeczko @ 2017-02-24 10:40 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx

On Thu, Feb 23, 2017 at 03:11:37PM -0800, Michel Thierry wrote:
> There was no way to check if the platform is running the latest firmware.

Can we also add similar patch for the HuC ?

>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2b1d15668192..e022187916ee 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>  			   CSR_VERSION_MINOR(csr->version));
>  	}
>
> +	if (HAS_GUC_UCODE(dev_priv)) {
> +		struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;

I would prefer to use HAS_GUC and intel_guc* here.

> +
> +		err_printf(m, "GuC loaded: %s\n",
> +			   yesno(guc_fw->load_status ==
> +				 INTEL_UC_FIRMWARE_SUCCESS));

Hmm, as we do have more detailed load status, why limiting it to yes/no only?

-Michal

> +		err_printf(m, "GuC fw version: %d.%d\n",
> +			   guc_fw->major_ver_found, guc_fw->minor_ver_found);
> +	}
> +
>  	err_printf(m, "EIR: 0x%08x\n", error->eir);
>  	err_printf(m, "IER: 0x%08x\n", error->ier);
>  	for (i = 0; i < error->ngtier; i++)
> --
> 2.11.0
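The review comment about reporting a more detailed load status than yes/no could be approached with a small status-to-string helper, sketched below. The enum values and the helper name `fw_load_status_str()` are assumptions made for this illustration; the thread only shows `INTEL_UC_FIRMWARE_SUCCESS`, and the driver's real enum and any existing helper may differ.

```c
/* Hypothetical load states; the driver's own enum may differ. */
enum fw_load_status {
	FW_LOAD_FAIL = -1,
	FW_LOAD_NONE = 0,
	FW_LOAD_PENDING,
	FW_LOAD_SUCCESS,
};

/* Map a load state to a short label suitable for an error dump. */
static const char *fw_load_status_str(enum fw_load_status status)
{
	switch (status) {
	case FW_LOAD_FAIL:
		return "FAIL";
	case FW_LOAD_NONE:
		return "NONE";
	case FW_LOAD_PENDING:
		return "PENDING";
	case FW_LOAD_SUCCESS:
		return "SUCCESS";
	}
	return "UNKNOWN";
}
```

Printing the label instead of yes/no would distinguish, for example, a load that was never attempted from one that failed, at the cost of one extra helper.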
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:40 ` Michal Wajdeczko
@ 2017-02-24 16:15 ` Michel Thierry
2017-02-24 16:21 ` Michel Thierry
0 siblings, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:15 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx

On 2/24/2017 2:40 AM, Michal Wajdeczko wrote:
> On Thu, Feb 23, 2017 at 03:11:37PM -0800, Michel Thierry wrote:
>> There was no way to check if the platform is running the latest firmware.
>
> Can we also add similar patch for the HuC ?
>

Please don't tell me the HuC can hang the gpu too.

> [...]
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 16:15 ` Michel Thierry
@ 2017-02-24 16:21 ` Michel Thierry
0 siblings, 0 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:21 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx

On 2/24/2017 8:15 AM, Michel Thierry wrote:
> On 2/24/2017 2:40 AM, Michal Wajdeczko wrote:
>> [...]
>>> +		err_printf(m, "GuC loaded: %s\n",
>>> +			   yesno(guc_fw->load_status ==
>>> +				 INTEL_UC_FIRMWARE_SUCCESS));
>>
>> Hmm, as we do have more detailed load status, why limiting it to
>> yes/no only?

Will it help in the post-mortem debug?
My idea was, if the fw didn't load, we can take it completely out of the
picture.