* [Intel-gfx] [PATCH] drm/i915/gt: Bump the reset-failure timeout to 60s
@ 2022-09-16 20:19 Ashutosh Dixit
0 siblings, 0 replies; 4+ messages in thread
From: Ashutosh Dixit @ 2022-09-16 20:19 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Matthew Auld
From: Chris Wilson <chris@chris-wilson.co.uk>
If attempting to perform a GT reset takes longer than 5 seconds (including
resetting the display for gen3/4), then we declare all hope lost and
discard all user work and wedge the device to prevent further
misbehaviour. 5 seconds is too short a time for such drastic action, as
we may be stuck on other timeouts and watchdogs. If we allow a little
bit longer before hitting the big red button, we should at the very
least capture other hung task indicators pointing towards the reason why
the reset was hanging; and allow more marginal cases the extra headroom
to complete the reset without further collateral damage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b36674356986..3159df6cdd49 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
 	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
 
 	/* Use a watchdog to ensure that our reset completes */
-	intel_wedge_on_timeout(&w, gt, 5 * HZ) {
+	intel_wedge_on_timeout(&w, gt, 60 * HZ) {
 		intel_display_prepare_reset(gt->i915);
 
 		intel_gt_reset(gt, engine_mask, reason);
--
2.34.1
* [Intel-gfx] [PATCH] drm/i915/gt: Bump the reset-failure timeout to 60s
@ 2022-09-16 20:48 Ashutosh Dixit
2022-09-19 23:53 ` Matt Roper
2022-09-27 20:44 ` Rodrigo Vivi
0 siblings, 2 replies; 4+ messages in thread
From: Ashutosh Dixit @ 2022-09-16 20:48 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Matthew Auld
From: Chris Wilson <chris@chris-wilson.co.uk>
If attempting to perform a GT reset takes longer than 5 seconds (including
resetting the display for gen3/4), then we declare all hope lost and
discard all user work and wedge the device to prevent further
misbehaviour. 5 seconds is too short a time for such drastic action, as
we may be stuck on other timeouts and watchdogs. If we allow a little
bit longer before hitting the big red button, we should at the very
least capture other hung task indicators pointing towards the reason why
the reset was hanging; and allow more marginal cases the extra headroom
to complete the reset without further collateral damage.
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b36674356986..3159df6cdd49 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
 	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
 
 	/* Use a watchdog to ensure that our reset completes */
-	intel_wedge_on_timeout(&w, gt, 5 * HZ) {
+	intel_wedge_on_timeout(&w, gt, 60 * HZ) {
 		intel_display_prepare_reset(gt->i915);
 
 		intel_gt_reset(gt, engine_mask, reason);
--
2.34.1
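
The effect of the one-line change can be illustrated with a small userspace sketch. It is not the driver's implementation of intel_wedge_on_timeout(); it only models the idea in the commit message: arm a deadline in ticks, and declare the device wedged if the reset is still running when the deadline passes. SKETCH_HZ and every sketch_* name are hypothetical, and the tick rate of 250 is an assumption made for the example (the kernel's HZ is a build-time setting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick rate, for illustration only. */
#define SKETCH_HZ 250u

/* Convert seconds to ticks, mirroring the "N * HZ" expression in the patch. */
static uint64_t sketch_timeout_ticks(unsigned int seconds)
{
	return (uint64_t)seconds * SKETCH_HZ;
}

/* Hypothetical watchdog state: a deadline and the resulting verdict. */
struct sketch_watchdog {
	uint64_t deadline;	/* tick at which we give up on the reset */
	bool wedged;		/* set once the deadline passed mid-reset */
};

/* Arm the watchdog relative to the current tick. */
static void sketch_watchdog_arm(struct sketch_watchdog *w, uint64_t now,
				uint64_t timeout_ticks)
{
	assert(w);
	w->deadline = now + timeout_ticks;
	w->wedged = false;
}

/*
 * Evaluate the watchdog at tick "now". The device is marked wedged only
 * if the reset has not completed and the deadline has been reached.
 */
static bool sketch_watchdog_check(struct sketch_watchdog *w, uint64_t now,
				  bool reset_done)
{
	assert(w);
	if (!reset_done && now >= w->deadline)
		w->wedged = true;
	return w->wedged;
}
```

Under these assumptions, a reset that is still in progress after 20 seconds trips a watchdog armed with sketch_timeout_ticks(5) but not one armed with sketch_timeout_ticks(60), which is the extra headroom the patch is after.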
* Re: [Intel-gfx] [PATCH] drm/i915/gt: Bump the reset-failure timeout to 60s
2022-09-16 20:48 Ashutosh Dixit
@ 2022-09-19 23:53 ` Matt Roper
2022-09-27 20:44 ` Rodrigo Vivi
1 sibling, 0 replies; 4+ messages in thread
From: Matt Roper @ 2022-09-19 23:53 UTC (permalink / raw)
To: Ashutosh Dixit; +Cc: intel-gfx, Matthew Auld, dri-devel
On Fri, Sep 16, 2022 at 01:48:23PM -0700, Ashutosh Dixit wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> If attempting to perform a GT reset takes longer than 5 seconds (including
> resetting the display for gen3/4), then we declare all hope lost and
> discard all user work and wedge the device to prevent further
> misbehaviour. 5 seconds is too short a time for such drastic action, as
> we may be stuck on other timeouts and watchdogs. If we allow a little
> bit longer before hitting the big red button, we should at the very
> least capture other hung task indicators pointing towards the reason why
> the reset was hanging; and allow more marginal cases the extra headroom
> to complete the reset without further collateral damage.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Seems reasonable.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b36674356986..3159df6cdd49 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
> kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
>
> /* Use a watchdog to ensure that our reset completes */
> - intel_wedge_on_timeout(&w, gt, 5 * HZ) {
> + intel_wedge_on_timeout(&w, gt, 60 * HZ) {
> intel_display_prepare_reset(gt->i915);
>
> intel_gt_reset(gt, engine_mask, reason);
> --
> 2.34.1
>
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
* Re: [Intel-gfx] [PATCH] drm/i915/gt: Bump the reset-failure timeout to 60s
2022-09-16 20:48 Ashutosh Dixit
2022-09-19 23:53 ` Matt Roper
@ 2022-09-27 20:44 ` Rodrigo Vivi
1 sibling, 0 replies; 4+ messages in thread
From: Rodrigo Vivi @ 2022-09-27 20:44 UTC (permalink / raw)
To: Ashutosh Dixit; +Cc: intel-gfx, Matthew Auld, dri-devel
On Fri, Sep 16, 2022 at 01:48:23PM -0700, Ashutosh Dixit wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> If attempting to perform a GT reset takes longer than 5 seconds (including
> resetting the display for gen3/4), then we declare all hope lost and
> discard all user work and wedge the device to prevent further
> misbehaviour. 5 seconds is too short a time for such drastic action, as
> we may be stuck on other timeouts and watchdogs. If we allow a little
> bit longer before hitting the big red button, we should at the very
> least capture other hung task indicators pointing towards the reason why
> the reset was hanging; and allow more marginal cases the extra headroom
> to complete the reset without further collateral damage.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When handling someone else's patch, please add your Signed-off-by here
as well.
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b36674356986..3159df6cdd49 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
> kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
>
> /* Use a watchdog to ensure that our reset completes */
> - intel_wedge_on_timeout(&w, gt, 5 * HZ) {
> + intel_wedge_on_timeout(&w, gt, 60 * HZ) {
> intel_display_prepare_reset(gt->i915);
>
> intel_gt_reset(gt, engine_mask, reason);
> --
> 2.34.1
>