From: John Harrison <john.c.harrison@intel.com>
To: Badal Nilawar <badal.nilawar@intel.com>,
<intel-xe@lists.freedesktop.org>
Cc: <anshuman.gupta@intel.com>, <rodrigo.vivi@intel.com>,
<matthew.brost@intel.com>, <himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH v2 2/2] drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout
Date: Wed, 16 Oct 2024 11:51:58 -0700 [thread overview]
Message-ID: <a6870973-20ab-4eff-a0d2-34dec13e52e0@intel.com> (raw)
In-Reply-To: <20241016115256.349791-3-badal.nilawar@intel.com>
On 10/16/2024 04:52, Badal Nilawar wrote:
> In case if g2h worker doesn't get opportunity to within specified
> timeout delay then flush the g2h worker explicitly.
>
> v2:
> - Describe change in comment and add TODO (Matt B/John H)
> - Add xe_gt_warn on fence done after G2H flush (John H)
>
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1620
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2902
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_ct.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index 3096baa4c9f4..c4e06d6722f0 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -1028,6 +1028,21 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>
> ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ * 3);
>
> + /*
> + * Explicitly flush g2h_worker if it hasn’t had the chance to run after being queued due
> + * to delays in workqueue scheduling.
> + *
> + * TODO: Drop this change once workqueue scheduling delay issue root caused and fixed.
> + */
> + if (!ret) {
> + flush_work(&ct->g2h_worker);
> + if (g2h_fence.done) {
> + xe_gt_warn(gt, "G2H fence %u, action %04x, done %s after G2H flush\n",
> + g2h_fence.seqno, action[0], str_yes_no(g2h_fence.done));
> + ret = 1;
> + }
> + }
Why bump the timeout and then do the flush? If the only issue is
believed to be the delayed worker thread then waiting longer before
doing the flush seems counter productive. You are just increasing the
time taken for no benefit. Flushing the worker thread should be all that
is required. If anything, we should be breaking the timeout up into
smaller chunks with a flush in each so that the completion happens
sooner not later.
Also, there was a big discussion about resets on the previous revision
of the patch set. What happened with that? I'm not seeing anything about
connecting with the reset paths here?
John.
> +
> /*
> * Ensure we serialize with completion side to prevent UAF with fence going out of scope on
> * the stack, since we have no clue if it will fire after the timeout before we can erase
next prev parent reply other threads:[~2024-10-16 18:52 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-16 11:52 [PATCH v2 0/2] Workaround to handle G2H timeout Badal Nilawar
2024-10-16 11:52 ` [PATCH v2 1/2] drm/xe/guc/ct: Increase wait timeout for g2h response Badal Nilawar
2024-10-16 11:52 ` [PATCH v2 2/2] drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout Badal Nilawar
2024-10-16 18:51 ` John Harrison [this message]
2024-10-16 18:55 ` John Harrison
2024-10-17 8:36 ` Nilawar, Badal
2024-10-17 8:34 ` Nilawar, Badal
2024-10-16 13:20 ` ✓ CI.Patch_applied: success for Workaround to handle G2H timeout Patchwork
2024-10-16 13:20 ` ✓ CI.checkpatch: " Patchwork
2024-10-16 13:22 ` ✓ CI.KUnit: " Patchwork
2024-10-16 13:33 ` ✓ CI.Build: " Patchwork
2024-10-16 13:35 ` ✓ CI.Hooks: " Patchwork
2024-10-16 13:37 ` ✓ CI.checksparse: " Patchwork
2024-10-16 14:01 ` ✓ CI.BAT: " Patchwork
2024-10-17 5:28 ` ✗ CI.FULL: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a6870973-20ab-4eff-a0d2-34dec13e52e0@intel.com \
--to=john.c.harrison@intel.com \
--cc=anshuman.gupta@intel.com \
--cc=badal.nilawar@intel.com \
--cc=himal.prasad.ghimiray@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.brost@intel.com \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox