* [PATCH] drm/i915/uc: Fini hw even if GuC is not running
@ 2019-08-13 16:26 Fernando Pacheco
2019-08-13 18:38 ` ✗ Fi.CI.BAT: failure for " Patchwork
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Fernando Pacheco @ 2019-08-13 16:26 UTC (permalink / raw)
To: intel-gfx
We should not be skipping uc_fini_hw on finding GuC
is no longer running. There is plenty of hw and internal
state that can be cleaned up without having to communicate
with GuC.
Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 0dc2b0cf4604..c698cddc14dc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -521,7 +521,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
{
struct intel_guc *guc = &uc->guc;
- if (!intel_guc_is_running(guc))
+ if (!intel_uc_supports_guc(uc))
return;
if (intel_uc_supports_guc_submission(uc))
--
2.22.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 6+ messages in thread* ✗ Fi.CI.BAT: failure for drm/i915/uc: Fini hw even if GuC is not running
2019-08-13 16:26 [PATCH] drm/i915/uc: Fini hw even if GuC is not running Fernando Pacheco
@ 2019-08-13 18:38 ` Patchwork
2019-08-13 20:16 ` [PATCH] " Daniele Ceraolo Spurio
2019-08-13 20:18 ` Michal Wajdeczko
2 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2019-08-13 18:38 UTC (permalink / raw)
To: Fernando Pacheco; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/uc: Fini hw even if GuC is not running
URL : https://patchwork.freedesktop.org/series/65140/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_6697 -> Patchwork_13999
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_13999 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_13999, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_13999:
### IGT changes ###
#### Possible regressions ####
* igt@i915_module_load@reload:
- fi-cfl-guc: [PASS][1] -> [INCOMPLETE][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6697/fi-cfl-guc/igt@i915_module_load@reload.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-cfl-guc/igt@i915_module_load@reload.html
Known issues
------------
Here are the changes found in Patchwork_13999 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_ctx_create@basic-files:
- fi-icl-u2: [PASS][3] -> [INCOMPLETE][4] ([fdo#107713] / [fdo#109100])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6697/fi-icl-u2/igt@gem_ctx_create@basic-files.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-icl-u2/igt@gem_ctx_create@basic-files.html
* igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u: [PASS][5] -> [DMESG-WARN][6] ([fdo#105602] / [fdo#106107])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6697/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
* igt@kms_chamelium@hdmi-hpd-fast:
- fi-kbl-7500u: [PASS][7] -> [FAIL][8] ([fdo#109485])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6697/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
* igt@vgem_basic@second-client:
- fi-icl-u3: [PASS][9] -> [DMESG-WARN][10] ([fdo#107724])
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6697/fi-icl-u3/igt@vgem_basic@second-client.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-icl-u3/igt@vgem_basic@second-client.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[fdo#105602]: https://bugs.freedesktop.org/show_bug.cgi?id=105602
[fdo#106107]: https://bugs.freedesktop.org/show_bug.cgi?id=106107
[fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
[fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
[fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
[fdo#109485]: https://bugs.freedesktop.org/show_bug.cgi?id=109485
Participating hosts (53 -> 43)
------------------------------
Additional (1): fi-byt-j1900
Missing (11): fi-kbl-soraka fi-icl-u4 fi-bxt-dsi fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-pnv-d510 fi-byt-clapper fi-bdw-samus fi-snb-2600
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_6697 -> Patchwork_13999
CI-20190529: 20190529
CI_DRM_6697: 1b6f7a0b130658be1f2b25a6d1f63115ade207d0 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5132: eced11670da92bc338a162af6eeda39edd49cd12 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_13999: 624e6ea7b2888b69b08f78889ff669523af28c9d @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
624e6ea7b288 drm/i915/uc: Fini hw even if GuC is not running
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] drm/i915/uc: Fini hw even if GuC is not running
2019-08-13 16:26 [PATCH] drm/i915/uc: Fini hw even if GuC is not running Fernando Pacheco
2019-08-13 18:38 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2019-08-13 20:16 ` Daniele Ceraolo Spurio
2019-08-14 14:20 ` Fernando Pacheco
2019-08-13 20:18 ` Michal Wajdeczko
2 siblings, 1 reply; 6+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-08-13 20:16 UTC (permalink / raw)
To: Fernando Pacheco, intel-gfx
On 8/13/19 9:26 AM, Fernando Pacheco wrote:
> We should not be skipping uc_fini_hw on finding GuC
> is no longer running. There is plenty of hw and internal
> state that can be cleaned up without having to communicate
> with GuC.
>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110943
> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index 0dc2b0cf4604..c698cddc14dc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -521,7 +521,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
> {
> struct intel_guc *guc = &uc->guc;
>
> - if (!intel_guc_is_running(guc))
> + if (!intel_uc_supports_guc(uc))
Both calls below handle the case where GuC is already not running so
we're safe, but now that we wedge when we hit a guc error we can also do
something like:
-EIO load error -> uc_fini_hw() -> wedge
and then
unload -> uc_fini_hw()
it should be all be handled safely (the fault injection test is passing
where we've run it), but we end up with "GuC communication disabled"
multiple times in the logs:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-skl-guc/igt@i915_module_load@reload-with-fault-injection.html
<6> [237.818905] [drm] GuC communication enabled
....
<6> [237.822739] i915 0000:00:02.0: [drm:__i915_inject_load_error
[i915]] Injecting failure -5 at checkpoint 44 [i915_gem_init:1503]
<6> [237.840808] [drm] GuC communication disabled
...
<6> [238.160935] [drm] GuC communication disabled
Maybe return early from guc_disable_communication if the communication
was already disabled?
Daniele
> return;
>
> if (intel_uc_supports_guc_submission(uc))
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] drm/i915/uc: Fini hw even if GuC is not running
2019-08-13 20:16 ` [PATCH] " Daniele Ceraolo Spurio
@ 2019-08-14 14:20 ` Fernando Pacheco
0 siblings, 0 replies; 6+ messages in thread
From: Fernando Pacheco @ 2019-08-14 14:20 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-gfx
On 8/13/19 1:16 PM, Daniele Ceraolo Spurio wrote:
>
>
> On 8/13/19 9:26 AM, Fernando Pacheco wrote:
>> We should not be skipping uc_fini_hw on finding GuC
>> is no longer running. There is plenty of hw and internal
>> state that can be cleaned up without having to communicate
>> with GuC.
>>
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110943
>
>> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> index 0dc2b0cf4604..c698cddc14dc 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> @@ -521,7 +521,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
>> {
>> struct intel_guc *guc = &uc->guc;
>> - if (!intel_guc_is_running(guc))
>> + if (!intel_uc_supports_guc(uc))
>
> Both calls below handle the case where GuC is already not running so we're safe, but now that we wedge when we hit a guc error we can also do something like:
>
> -EIO load error -> uc_fini_hw() -> wedge
> and then
> unload -> uc_fini_hw()
>
> it should be all be handled safely (the fault injection test is passing where we've run it), but we end up with "GuC communication disabled" multiple times in the logs:
>
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13999/fi-skl-guc/igt@i915_module_load@reload-with-fault-injection.html
>
> <6> [237.818905] [drm] GuC communication enabled
> ....
> <6> [237.822739] i915 0000:00:02.0: [drm:__i915_inject_load_error [i915]] Injecting failure -5 at checkpoint 44 [i915_gem_init:1503]
> <6> [237.840808] [drm] GuC communication disabled
> ...
> <6> [238.160935] [drm] GuC communication disabled
>
> Maybe return early from guc_disable_communication if the communication was already disabled?
As discussed offline, an early return might also require changes to the stop communication phase.
I'll work on this separately as for now the extra disable only results in the extra logging.
Thanks,
Fernando
>
>
> Daniele
>
>> return;
>> if (intel_uc_supports_guc_submission(uc))
>>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915/uc: Fini hw even if GuC is not running
2019-08-13 16:26 [PATCH] drm/i915/uc: Fini hw even if GuC is not running Fernando Pacheco
2019-08-13 18:38 ` ✗ Fi.CI.BAT: failure for " Patchwork
2019-08-13 20:16 ` [PATCH] " Daniele Ceraolo Spurio
@ 2019-08-13 20:18 ` Michal Wajdeczko
2019-08-14 14:33 ` Fernando Pacheco
2 siblings, 1 reply; 6+ messages in thread
From: Michal Wajdeczko @ 2019-08-13 20:18 UTC (permalink / raw)
To: intel-gfx, Fernando Pacheco
On Tue, 13 Aug 2019 18:26:28 +0200, Fernando Pacheco
<fernando.pacheco@intel.com> wrote:
> We should not be skipping uc_fini_hw on finding GuC
> is no longer running. There is plenty of hw and internal
> state that can be cleaned up without having to communicate
> with GuC.
>
> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index 0dc2b0cf4604..c698cddc14dc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -521,7 +521,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
> {
> struct intel_guc *guc = &uc->guc;
> - if (!intel_guc_is_running(guc))
> + if (!intel_uc_supports_guc(uc))
there is a huge difference between is_running vs supports_guc
and choosing supports_guc is optimist approach as we can fail
to fetch guc fw and abort early, so maybe
if (!intel_uc_fw_is_available(&guc->fw))
would be closer to reality (assuming we don't fail on wopcm
(hmm, maybe we should force fw state to FAIL in such case?)
> return;
> if (intel_uc_supports_guc_submission(uc))
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] drm/i915/uc: Fini hw even if GuC is not running
2019-08-13 20:18 ` Michal Wajdeczko
@ 2019-08-14 14:33 ` Fernando Pacheco
0 siblings, 0 replies; 6+ messages in thread
From: Fernando Pacheco @ 2019-08-14 14:33 UTC (permalink / raw)
To: Michal Wajdeczko, intel-gfx
On 8/13/19 1:18 PM, Michal Wajdeczko wrote:
> On Tue, 13 Aug 2019 18:26:28 +0200, Fernando Pacheco <fernando.pacheco@intel.com> wrote:
>
>> We should not be skipping uc_fini_hw on finding GuC
>> is no longer running. There is plenty of hw and internal
>> state that can be cleaned up without having to communicate
>> with GuC.
>>
>> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> index 0dc2b0cf4604..c698cddc14dc 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>> @@ -521,7 +521,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
>> {
>> struct intel_guc *guc = &uc->guc;
>> - if (!intel_guc_is_running(guc))
>> + if (!intel_uc_supports_guc(uc))
>
> there is a huge difference between is_running vs supports_guc
> and choosing supports_guc is optimist approach as we can fail
> to fetch guc fw and abort early, so maybe
>
> if (!intel_uc_fw_is_available(&guc->fw))
This is the better check. Thanks!
>
> would be closer to reality (assuming we don't fail on wopcm
> (hmm, maybe we should force fw state to FAIL in such case?)
That would make sense to me. The fail on wopcm directly
affects the state of the fw because we abort the upload
as a result.
Thanks,
Fernando
>
>
>> return;
>> if (intel_uc_supports_guc_submission(uc))
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-08-14 14:37 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-13 16:26 [PATCH] drm/i915/uc: Fini hw even if GuC is not running Fernando Pacheco
2019-08-13 18:38 ` ✗ Fi.CI.BAT: failure for " Patchwork
2019-08-13 20:16 ` [PATCH] " Daniele Ceraolo Spurio
2019-08-14 14:20 ` Fernando Pacheco
2019-08-13 20:18 ` Michal Wajdeczko
2019-08-14 14:33 ` Fernando Pacheco
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox