* [PATCH] drm/i915: Include GuC fw version in error state
@ 2017-02-23 23:11 Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-23 23:11 UTC (permalink / raw)
To: intel-gfx
There was no way to check if the platform is running the latest firmware.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2b1d15668192..e022187916ee 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
CSR_VERSION_MINOR(csr->version));
}
+ if (HAS_GUC_UCODE(dev_priv)) {
+ struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
+
+ err_printf(m, "GuC loaded: %s\n",
+ yesno(guc_fw->load_status ==
+ INTEL_UC_FIRMWARE_SUCCESS));
+ err_printf(m, "GuC fw version: %d.%d\n",
+ guc_fw->major_ver_found, guc_fw->minor_ver_found);
+ }
+
err_printf(m, "EIR: 0x%08x\n", error->eir);
err_printf(m, "IER: 0x%08x\n", error->ier);
for (i = 0; i < error->ngtier; i++)
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 13+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
@ 2017-02-24 0:22 ` Patchwork
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
2017-02-24 10:40 ` Michal Wajdeczko
2 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-02-24 0:22 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Include GuC fw version in error state
URL : https://patchwork.freedesktop.org/series/20181/
State : success
== Summary ==
Series 20181v1 drm/i915: Include GuC fw version in error state
https://patchwork.freedesktop.org/api/1.0/series/20181/revisions/1/mbox/
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11
fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19
fi-bxt-t5700 total:108 pass:95 dwarn:0 dfail:0 fail:0 skip:12
fi-byt-j1900 total:278 pass:251 dwarn:0 dfail:0 fail:0 skip:27
fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16
fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10
fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28
fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29
c6638f903295bbbd29957b878a42b83c5566250c drm-tip: 2017y-02m-23d-22h-50m-35s UTC integration manifest
b721bdd drm/i915: Include GuC fw version in error state
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3956/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-02-24 3:43 ` Kamble, Sagar A
2017-02-24 9:13 ` Chris Wilson
2017-02-24 10:40 ` Michal Wajdeczko
2 siblings, 1 reply; 13+ messages in thread
From: Kamble, Sagar A @ 2017-02-24 3:43 UTC (permalink / raw)
To: Michel Thierry, intel-gfx
[-- Attachment #1.1: Type: text/plain, Size: 1273 bytes --]
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
On 2/24/2017 4:41 AM, Michel Thierry wrote:
> There was no way to check if the platform is running the latest firmware.
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2b1d15668192..e022187916ee 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> CSR_VERSION_MINOR(csr->version));
> }
>
> + if (HAS_GUC_UCODE(dev_priv)) {
> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> +
> + err_printf(m, "GuC loaded: %s\n",
> + yesno(guc_fw->load_status ==
> + INTEL_UC_FIRMWARE_SUCCESS));
> + err_printf(m, "GuC fw version: %d.%d\n",
> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> + }
> +
> err_printf(m, "EIR: 0x%08x\n", error->eir);
> err_printf(m, "IER: 0x%08x\n", error->ier);
> for (i = 0; i < error->ngtier; i++)
[-- Attachment #1.2: Type: text/html, Size: 1948 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
@ 2017-02-24 9:13 ` Chris Wilson
2017-02-24 10:43 ` Michal Wajdeczko
0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 9:13 UTC (permalink / raw)
To: Kamble, Sagar A; +Cc: intel-gfx
On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
> Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
>
> On 2/24/2017 4:41 AM, Michel Thierry wrote:
>
> There was no way to check if the platform is running the latest firmware.
>
> Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
> Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2b1d15668192..e022187916ee 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> CSR_VERSION_MINOR(csr->version));
> }
>
> + if (HAS_GUC_UCODE(dev_priv)) {
> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> +
> + err_printf(m, "GuC loaded: %s\n",
> + yesno(guc_fw->load_status ==
> + INTEL_UC_FIRMWARE_SUCCESS));
> + err_printf(m, "GuC fw version: %d.%d\n",
> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> + }
> +
Hmm. The firmware may change between the hang and cat
/sys/class/drm/card0/error (as it will be reloaded after the reset).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
@ 2017-02-24 10:40 ` Michal Wajdeczko
2017-02-24 16:15 ` Michel Thierry
2 siblings, 1 reply; 13+ messages in thread
From: Michal Wajdeczko @ 2017-02-24 10:40 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx
On Thu, Feb 23, 2017 at 03:11:37PM -0800, Michel Thierry wrote:
> There was no way to check if the platform is running the latest firmware.
Can we also add similar patch for the HuC ?
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2b1d15668192..e022187916ee 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> CSR_VERSION_MINOR(csr->version));
> }
>
> + if (HAS_GUC_UCODE(dev_priv)) {
> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
I would preffer to use HAS_GUC and intel_guc* here.
> +
> + err_printf(m, "GuC loaded: %s\n",
> + yesno(guc_fw->load_status ==
> + INTEL_UC_FIRMWARE_SUCCESS));
Hmm, as we do have more detailed load status, why limiting it to yes/no only?
-Michal
> + err_printf(m, "GuC fw version: %d.%d\n",
> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> + }
> +
> err_printf(m, "EIR: 0x%08x\n", error->eir);
> err_printf(m, "IER: 0x%08x\n", error->ier);
> for (i = 0; i < error->ngtier; i++)
> --
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 9:13 ` Chris Wilson
@ 2017-02-24 10:43 ` Michal Wajdeczko
2017-02-24 10:49 ` Chris Wilson
0 siblings, 1 reply; 13+ messages in thread
From: Michal Wajdeczko @ 2017-02-24 10:43 UTC (permalink / raw)
To: Chris Wilson, Kamble, Sagar A, Michel Thierry, intel-gfx
On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
> > Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
> >
> > On 2/24/2017 4:41 AM, Michel Thierry wrote:
> >
> > There was no way to check if the platform is running the latest firmware.
> >
> > Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
> > Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
> > Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 2b1d15668192..e022187916ee 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> > CSR_VERSION_MINOR(csr->version));
> > }
> >
> > + if (HAS_GUC_UCODE(dev_priv)) {
> > + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> > +
> > + err_printf(m, "GuC loaded: %s\n",
> > + yesno(guc_fw->load_status ==
> > + INTEL_UC_FIRMWARE_SUCCESS));
> > + err_printf(m, "GuC fw version: %d.%d\n",
> > + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> > + }
> > +
>
> Hmm. The firmware may change between the hang and cat
> /sys/class/drm/card0/error (as it will be reloaded after the reset).
Btw, maybe we should add counter that will be incremented on each fw reload
and reported here ?
-Michal
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:43 ` Michal Wajdeczko
@ 2017-02-24 10:49 ` Chris Wilson
2017-02-24 15:45 ` Kamble, Sagar A
2017-02-24 16:30 ` Michel Thierry
0 siblings, 2 replies; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 10:49 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx
On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
> > On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
> > > Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
> > >
> > > On 2/24/2017 4:41 AM, Michel Thierry wrote:
> > >
> > > There was no way to check if the platform is running the latest firmware.
> > >
> > > Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
> > > Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
> > > Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> > > 1 file changed, 10 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > index 2b1d15668192..e022187916ee 100644
> > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> > > CSR_VERSION_MINOR(csr->version));
> > > }
> > >
> > > + if (HAS_GUC_UCODE(dev_priv)) {
> > > + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> > > +
> > > + err_printf(m, "GuC loaded: %s\n",
> > > + yesno(guc_fw->load_status ==
> > > + INTEL_UC_FIRMWARE_SUCCESS));
> > > + err_printf(m, "GuC fw version: %d.%d\n",
> > > + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> > > + }
> > > +
> >
> > Hmm. The firmware may change between the hang and cat
> > /sys/class/drm/card0/error (as it will be reloaded after the reset).
>
> Btw, maybe we should add counter that will be incremented on each fw reload
> and reported here ?
If it occurs to you that we need it for post-mortem debugging and having
it is worth more than any potential confusion....
I can see the need for knowing what guc/huc/dmc/etc was running at the
time of a hang - I just hope that what was previously running before an
earlier reset doesn't contribute. But that's why we focus on the first
error in a system...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:49 ` Chris Wilson
@ 2017-02-24 15:45 ` Kamble, Sagar A
2017-02-24 16:30 ` Michel Thierry
1 sibling, 0 replies; 13+ messages in thread
From: Kamble, Sagar A @ 2017-02-24 15:45 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Michel Thierry, intel-gfx
On 2/24/2017 4:19 PM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
>> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
>>> On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
>>>> Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
>>>>
>>>> On 2/24/2017 4:41 AM, Michel Thierry wrote:
>>>>
>>>> There was no way to check if the platform is running the latest firmware.
>>>>
>>>> Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
>>>> Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
>>>> Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
>>>> 1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> index 2b1d15668192..e022187916ee 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>> CSR_VERSION_MINOR(csr->version));
>>>> }
>>>>
>>>> + if (HAS_GUC_UCODE(dev_priv)) {
>>>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
>>>> +
>>>> + err_printf(m, "GuC loaded: %s\n",
>>>> + yesno(guc_fw->load_status ==
>>>> + INTEL_UC_FIRMWARE_SUCCESS));
>>>> + err_printf(m, "GuC fw version: %d.%d\n",
>>>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
>>>> + }
>>>> +
>>> Hmm. The firmware may change between the hang and cat
>>> /sys/class/drm/card0/error (as it will be reloaded after the reset).
>> Btw, maybe we should add counter that will be incremented on each fw reload
>> and reported here ?
> If it occurs to you that we need it for post-mortem debugging and having
> it is worth more than any potential confusion....
>
> I can see the need for knowing what guc/huc/dmc/etc was running at the
> time of a hang - I just hope that what was previously running before an
> earlier reset doesn't contribute. But that's why we focus on the first
> error in a system...
> -Chris
>
GT reset count is present already in error state. GuC kernel parameters
are present and this
change will help us identify which firmware issue was encountered.
So I feel printing ver_found should be enough.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:40 ` Michal Wajdeczko
@ 2017-02-24 16:15 ` Michel Thierry
2017-02-24 16:21 ` Michel Thierry
0 siblings, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:15 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx
On 2/24/2017 2:40 AM, Michal Wajdeczko wrote:
> On Thu, Feb 23, 2017 at 03:11:37PM -0800, Michel Thierry wrote:
>> There was no way to check if the platform is running the latest firmware.
>
> Can we also add similar patch for the HuC ?
>
Please don't tell me the HuC can hang the gpu too.
>>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 2b1d15668192..e022187916ee 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>> CSR_VERSION_MINOR(csr->version));
>> }
>>
>> + if (HAS_GUC_UCODE(dev_priv)) {
>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
>
> I would preffer to use HAS_GUC and intel_guc* here.
>
>
>> +
>> + err_printf(m, "GuC loaded: %s\n",
>> + yesno(guc_fw->load_status ==
>> + INTEL_UC_FIRMWARE_SUCCESS));
>
> Hmm, as we do have more detailed load status, why limiting it to yes/no only?
>
>
> -Michal
>
>> + err_printf(m, "GuC fw version: %d.%d\n",
>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
>> + }
>> +
>> err_printf(m, "EIR: 0x%08x\n", error->eir);
>> err_printf(m, "IER: 0x%08x\n", error->ier);
>> for (i = 0; i < error->ngtier; i++)
>> --
>> 2.11.0
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 16:15 ` Michel Thierry
@ 2017-02-24 16:21 ` Michel Thierry
0 siblings, 0 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:21 UTC (permalink / raw)
To: Michal Wajdeczko; +Cc: intel-gfx
On 2/24/2017 8:15 AM, Michel Thierry wrote:
>
>
> On 2/24/2017 2:40 AM, Michal Wajdeczko wrote:
>> On Thu, Feb 23, 2017 at 03:11:37PM -0800, Michel Thierry wrote:
>>> There was no way to check if the platform is running the latest
>>> firmware.
>>
>> Can we also add similar patch for the HuC ?
>>
>
> Please don't tell me the HuC can hang the gpu too.
>
>>>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> index 2b1d15668192..e022187916ee 100644
>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct
>>> drm_i915_error_state_buf *m,
>>> CSR_VERSION_MINOR(csr->version));
>>> }
>>>
>>> + if (HAS_GUC_UCODE(dev_priv)) {
>>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
>>
>> I would preffer to use HAS_GUC and intel_guc* here.
>>
>>
>>> +
>>> + err_printf(m, "GuC loaded: %s\n",
>>> + yesno(guc_fw->load_status ==
>>> + INTEL_UC_FIRMWARE_SUCCESS));
>>
>> Hmm, as we do have more detailed load status, why limiting it to
>> yes/no only?
Will it help in the post-mortem debug?
My idea was, if the fw didn't load, we can take it completely out of the
picture.
>> -Michal
>>
>>> + err_printf(m, "GuC fw version: %d.%d\n",
>>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
>>> + }
>>> +
>>> err_printf(m, "EIR: 0x%08x\n", error->eir);
>>> err_printf(m, "IER: 0x%08x\n", error->ier);
>>> for (i = 0; i < error->ngtier; i++)
>>> --
>>> 2.11.0
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 10:49 ` Chris Wilson
2017-02-24 15:45 ` Kamble, Sagar A
@ 2017-02-24 16:30 ` Michel Thierry
2017-02-24 17:15 ` Chris Wilson
1 sibling, 1 reply; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 16:30 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Kamble, Sagar A, intel-gfx
On 2/24/2017 2:49 AM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
>> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
>>> On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
>>>> Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
>>>>
>>>> On 2/24/2017 4:41 AM, Michel Thierry wrote:
>>>>
>>>> There was no way to check if the platform is running the latest firmware.
>>>>
>>>> Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
>>>> Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
>>>> Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
>>>> 1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> index 2b1d15668192..e022187916ee 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>> CSR_VERSION_MINOR(csr->version));
>>>> }
>>>>
>>>> + if (HAS_GUC_UCODE(dev_priv)) {
>>>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
>>>> +
>>>> + err_printf(m, "GuC loaded: %s\n",
>>>> + yesno(guc_fw->load_status ==
>>>> + INTEL_UC_FIRMWARE_SUCCESS));
>>>> + err_printf(m, "GuC fw version: %d.%d\n",
>>>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
>>>> + }
>>>> +
>>>
>>> Hmm. The firmware may change between the hang and cat
>>> /sys/class/drm/card0/error (as it will be reloaded after the reset).
>>
>> Btw, maybe we should add counter that will be incremented on each fw reload
>> and reported here ?
>
> If it occurs to you that we need it for post-mortem debugging and having
> it is worth more than any potential confusion....
>
> I can see the need for knowing what guc/huc/dmc/etc was running at the
> time of a hang - I just hope that what was previously running before an
> earlier reset doesn't contribute. But that's why we focus on the first
> error in a system...
Can the firmware change?
Last time I checked the filename was hard-coded in the driver. It's true
that the load process could fail and then the information be incorrect.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 16:30 ` Michel Thierry
@ 2017-02-24 17:15 ` Chris Wilson
2017-02-24 17:32 ` Michel Thierry
0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2017-02-24 17:15 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx
On Fri, Feb 24, 2017 at 08:30:43AM -0800, Michel Thierry wrote:
> On 2/24/2017 2:49 AM, Chris Wilson wrote:
> >On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
> >>On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
> >>>On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
> >>>> Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
> >>>>
> >>>> On 2/24/2017 4:41 AM, Michel Thierry wrote:
> >>>>
> >>>> There was no way to check if the platform is running the latest firmware.
> >>>>
> >>>> Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
> >>>> Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
> >>>> Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
> >>>> ---
> >>>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
> >>>> 1 file changed, 10 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>> index 2b1d15668192..e022187916ee 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> >>>> CSR_VERSION_MINOR(csr->version));
> >>>> }
> >>>>
> >>>> + if (HAS_GUC_UCODE(dev_priv)) {
> >>>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
> >>>> +
> >>>> + err_printf(m, "GuC loaded: %s\n",
> >>>> + yesno(guc_fw->load_status ==
> >>>> + INTEL_UC_FIRMWARE_SUCCESS));
> >>>> + err_printf(m, "GuC fw version: %d.%d\n",
> >>>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
> >>>> + }
> >>>> +
> >>>
> >>>Hmm. The firmware may change between the hang and cat
> >>>/sys/class/drm/card0/error (as it will be reloaded after the reset).
> >>
> >>Btw, maybe we should add counter that will be incremented on each fw reload
> >>and reported here ?
> >
> >If it occurs to you that we need it for post-mortem debugging and having
> >it is worth more than any potential confusion....
> >
> >I can see the need for knowing what guc/huc/dmc/etc was running at the
> >time of a hang - I just hope that what was previously running before an
> >earlier reset doesn't contribute. But that's why we focus on the first
> >error in a system...
>
> Can the firmware change?
> Last time I checked the filename was hard-coded in the driver. It's
> true that the load process could fail and then the information be
> incorrect.
Assume it won't be hardcoded for ever (or at least no more than a week)...
And yes, the filesystem state may have changed since the previous load.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/i915: Include GuC fw version in error state
2017-02-24 17:15 ` Chris Wilson
@ 2017-02-24 17:32 ` Michel Thierry
0 siblings, 0 replies; 13+ messages in thread
From: Michel Thierry @ 2017-02-24 17:32 UTC (permalink / raw)
To: Chris Wilson, Michal Wajdeczko, Kamble, Sagar A, intel-gfx
On 2/24/2017 9:15 AM, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 08:30:43AM -0800, Michel Thierry wrote:
>> On 2/24/2017 2:49 AM, Chris Wilson wrote:
>>> On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
>>>> On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
>>>>> On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
>>>>>> Reviewed-by: Sagar Arun Kamble [1]<sagar.a.kamble@intel.com>
>>>>>>
>>>>>> On 2/24/2017 4:41 AM, Michel Thierry wrote:
>>>>>>
>>>>>> There was no way to check if the platform is running the latest firmware.
>>>>>>
>>>>>> Cc: Tvrtko Ursulin [2]<tvrtko.ursulin@intel.com>
>>>>>> Cc: Arkadiusz Hiler [3]<arkadiusz.hiler@intel.com>
>>>>>> Signed-off-by: Michel Thierry [4]<michel.thierry@intel.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
>>>>>> 1 file changed, 10 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> index 2b1d15668192..e022187916ee 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>>> @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>>>> CSR_VERSION_MINOR(csr->version));
>>>>>> }
>>>>>>
>>>>>> + if (HAS_GUC_UCODE(dev_priv)) {
>>>>>> + struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
>>>>>> +
>>>>>> + err_printf(m, "GuC loaded: %s\n",
>>>>>> + yesno(guc_fw->load_status ==
>>>>>> + INTEL_UC_FIRMWARE_SUCCESS));
>>>>>> + err_printf(m, "GuC fw version: %d.%d\n",
>>>>>> + guc_fw->major_ver_found, guc_fw->minor_ver_found);
>>>>>> + }
>>>>>> +
>>>>>
>>>>> Hmm. The firmware may change between the hang and cat
>>>>> /sys/class/drm/card0/error (as it will be reloaded after the reset).
>>>>
>>>> Btw, maybe we should add counter that will be incremented on each fw reload
>>>> and reported here ?
>>>
>>> If it occurs to you that we need it for post-mortem debugging and having
>>> it is worth more than any potential confusion....
>>>
>>> I can see the need for knowing what guc/huc/dmc/etc was running at the
>>> time of a hang - I just hope that what was previously running before an
>>> earlier reset doesn't contribute. But that's why we focus on the first
>>> error in a system...
>>
>> Can the firmware change?
>> Last time I checked the filename was hard-coded in the driver. It's
>> true that the load process could fail and then the information be
>> incorrect.
>
> Assume it won't be hardcoded for ever (or at least no more than a week)...
> And yes, the filesystem state may have changed since the previous load.
ok, I'll add an i915_capture_fw_state to collect the information before
the reset (for dmc/guc/huc).
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-02-24 17:32 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-02-23 23:11 [PATCH] drm/i915: Include GuC fw version in error state Michel Thierry
2017-02-24 0:22 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-02-24 3:43 ` [PATCH] " Kamble, Sagar A
2017-02-24 9:13 ` Chris Wilson
2017-02-24 10:43 ` Michal Wajdeczko
2017-02-24 10:49 ` Chris Wilson
2017-02-24 15:45 ` Kamble, Sagar A
2017-02-24 16:30 ` Michel Thierry
2017-02-24 17:15 ` Chris Wilson
2017-02-24 17:32 ` Michel Thierry
2017-02-24 10:40 ` Michal Wajdeczko
2017-02-24 16:15 ` Michel Thierry
2017-02-24 16:21 ` Michel Thierry
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox