Intel-GFX Archive on lore.kernel.org
From: "Dong, Zhanjun" <zhanjun.dong@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	<intel-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>
Cc: John Harrison <John.C.Harrison@Intel.com>
Subject: Re: [PATCH v1] drm/i915/guc: Flush ct receive tasklet during reset preparation
Date: Tue, 5 Nov 2024 10:38:03 -0500
Message-ID: <9339ed5a-bc5e-4329-bf2e-77bd53eae3c3@intel.com>
In-Reply-To: <d48da820-a9b6-4bf1-95c2-984d900a2700@intel.com>



On 2024-11-04 6:20 p.m., Daniele Ceraolo Spurio wrote:
> 
> 
> 
> On 10/30/2024 3:38 PM, Zhanjun Dong wrote:
>> GuC-to-host communication is interrupt driven, and the handling has 3
>> parts: interrupt context, tasklet and request queue worker.
>> During GuC reset prepare, interrupts are disabled before the context
>> destroy steps start. The IRQ and the worker are flushed to finish any
>> in-progress message handling. The tasklet flush is missing, which
>> might cause 2 race conditions:
>> 1. The tasklet runs after the IRQ is flushed and adds a request to the
>> queue after the worker flush has started. This causes unexpected G2H
>> message request processing while the reset prepare code has already
>> destroyed the context, and an error about bad context state is
>> reported.
>> 2. The tasklet runs after intel_guc_submission_reset_prepare, and
>> ct_try_receive_message starts to run while intel_uc_reset_prepare has
>> already finished the GuC sanitize and set ct->enable to false. This
>> causes a warning about incorrect ct->enable state.
>>
>> Add the missing tasklet flush to flush all 3 parts.
>>
>> Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
>> Cc: John Harrison <John.C.Harrison@Intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> index 9ede6f240d79..353a9167c9a4 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> @@ -1688,6 +1688,10 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>>       spin_lock_irq(guc_to_gt(guc)->irq_lock);
>>       spin_unlock_irq(guc_to_gt(guc)->irq_lock);
>> +    /* Flush tasklet */
>> +    tasklet_disable(&guc->ct.receive_tasklet);
>> +    tasklet_enable(&guc->ct.receive_tasklet);
>> +
> 
> It looks like we might have the same problem around suspend/resume,
> because AFAICS the tasklet is never stopped anywhere except driver
> unload. Maybe it's worth adding the tasklet disabling/enabling to the
> interrupt disabling/enabling functions, i.e.
> guc->interrupts.disable/enable(), so it's automatically called any time
> we want to disable GuC interrupts? Not a blocker.
> 
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Daniele
> 
Thanks, Daniele, for the review.

I like the idea of putting the tasklet disabling/enabling into the
interrupt disabling/enabling functions. Let me first investigate the
suspend/resume workflow and run some tests. It might take some time.
This patch might fix multiple issues, so I would like to get it merged
after we get a positive CI.Full result.

Regards,
Zhanjun Dong

>>       guc_flush_submissions(guc);
>>       guc_flush_destroyed_contexts(guc);
>>       flush_work(&guc->ct.requests.worker);
> 
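
The idea discussed above, pairing the tasklet disable/enable with the
interrupt disable/enable hooks, can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: every name below
(model_tasklet, model_guc_irq_disable and so on) is hypothetical and is not
i915 or kernel API. It imitates just the disable-count behaviour of
tasklet_disable()/tasklet_enable(), where a disabled tasklet that gets
scheduled stays pending instead of running. The real tasklet_disable()
additionally waits for a handler that is already running to finish, which
this single-threaded model does not show.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. None of these names exist in i915 or the kernel;
 * they imitate the disable-count behaviour of tasklet_disable() and
 * tasklet_enable() so the ordering can be checked without hardware.
 */
struct model_tasklet {
	int disable_count;	/* > 0: handler must not run */
	bool scheduled;		/* set by the modelled interrupt handler */
	int runs;		/* how many times the handler has run */
};

/* Modelled CT receive handler: just counts invocations. */
static inline void model_ct_receive(struct model_tasklet *t)
{
	t->runs++;
}

/* Modelled interrupt handler: requests a tasklet run. */
static inline void model_tasklet_schedule(struct model_tasklet *t)
{
	t->scheduled = true;
}

/*
 * Modelled softirq pass: runs the handler only if it is scheduled and
 * not disabled. A disabled tasklet stays scheduled for a later pass.
 * Returns true if the handler ran.
 */
static inline bool model_softirq_pass(struct model_tasklet *t)
{
	if (!t->scheduled || t->disable_count > 0)
		return false;
	t->scheduled = false;
	model_ct_receive(t);
	return true;
}

/* Hypothetical interrupt disable hook that also stops the tasklet. */
static inline void model_guc_irq_disable(struct model_tasklet *t)
{
	/* (masking the interrupt source would go here) */
	t->disable_count++;
}

/* Hypothetical interrupt enable hook that lets the tasklet run again. */
static inline void model_guc_irq_enable(struct model_tasklet *t)
{
	assert(t->disable_count > 0);
	t->disable_count--;
	/* (unmasking the interrupt source would go here) */
}
```

In this model, a tasklet scheduled between model_guc_irq_disable() and
model_guc_irq_enable() does not run during a softirq pass, so no receive
handling can overlap with whatever the caller does in that window. It runs
on the first pass after the enable hook is called.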

