From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Karthik Poosa <karthik.poosa@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [PATCH] drm/xe: Add wait for completion after gt force reset
Date: Fri, 15 Dec 2023 12:32:02 -0500
Message-ID: <ZXyNkit45DB5dTHr@intel.com>
In-Reply-To: <ZXyCGMbgOXZBnYYa@intel.com>
On Fri, Dec 15, 2023 at 11:43:04AM -0500, Rodrigo Vivi wrote:
> On Fri, Dec 15, 2023 at 10:45:41AM +0530, Karthik Poosa wrote:
> > Wait for gt reset to complete before returning from force_reset
> > sysfs call. Without this igt test freq_reset_multiple fails
> > sporadically in case xe_guc_pc is not started.
>
> \o/ I knew I was missing something there. Thanks for finding that out.
>
> >
> > Testcase: igt@xe_guc_pc@freq_reset_multiple
> > Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt.c | 3 +++
> > drivers/gpu/drm/xe/xe_gt_debugfs.c | 10 ++++++++++
> > drivers/gpu/drm/xe/xe_gt_types.h | 3 +++
> > 3 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index dfd9cf01a5d5..eb7552b6dfa5 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -65,6 +65,7 @@ struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> >
> > gt->tile = tile;
> > gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
> > + init_completion(&gt->reset_done);
> >
> > return gt;
> > }
> > @@ -647,6 +648,8 @@ static int gt_reset(struct xe_gt *gt)
> > xe_device_mem_access_put(gt_to_xe(gt));
> > XE_WARN_ON(err);
> >
> > + complete(&gt->reset_done);
> > +
> > xe_gt_info(gt, "reset done\n");
> >
> > return 0;
> > diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > index c4b67cf09f8f..49b30937a28b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> > @@ -23,6 +23,8 @@
> > #include "xe_uc_debugfs.h"
> > #include "xe_wa.h"
> >
> > +#define XE_GT_RESET_TIMEOUT_MS (msecs_to_jiffies(5*1000))
>
> 5s seems too much, but anyway, this is a debugfs function, so
> if 5s fails, then we do have bigger problems.
>
> perhaps we could get the timeout as input?
>
> anyway, it is better to have protected like this than the current
> racy one that we have.
>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Oops, let me tone down my over-excitement here.
We need to address the feedback that Badal gave on the intel-gfx mailing list.
>
> > +
> > static struct xe_gt *node_to_gt(struct drm_info_node *node)
> > {
> > return node->info_ent->data;
> > @@ -58,9 +60,17 @@ static int hw_engines(struct seq_file *m, void *data)
> > static int force_reset(struct seq_file *m, void *data)
> > {
> > struct xe_gt *gt = node_to_gt(m->private);
> > + struct xe_device *xe = gt_to_xe(gt);
> > + unsigned long timeout;
> >
> > xe_gt_reset_async(gt);
> >
> > + timeout = wait_for_completion_timeout(&gt->reset_done, XE_GT_RESET_TIMEOUT_MS);
> > + if (timeout == 0) {
> > + drm_err(&xe->drm, "gt reset timed out");
> > + return -ETIMEDOUT;
> > + }
> > +
> > return 0;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
> > index f74684660475..6f2fb9e3cfea 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_types.h
> > @@ -358,6 +358,9 @@ struct xe_gt {
> > /** @oob: bitmap with active OOB workaroudns */
> > unsigned long *oob;
> > } wa_active;
> > +
> > + /** @reset_done : Completion of GT reset */
> > + struct completion reset_done;
> > };
> >
> > #endif
> > --
> > 2.25.1
> >