From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
To: "Teres Alexis, Alan Previn" <alan.previn.teres.alexis@intel.com>,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "Ursulin, Tvrtko" <tvrtko.ursulin@intel.com>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"Jana, Mousumi" <mousumi.jana@intel.com>
Subject: Re: [Intel-gfx] [PATCH v7 2/2] drm/i915/guc: Close deregister-context race against CT-loss
Date: Wed, 6 Dec 2023 13:47:27 -0800	[thread overview]
Message-ID: <cf64cae8-eac2-4e3e-9fd1-aef79c4000f3@intel.com> (raw)
In-Reply-To: <ce76d74bdd99d328eca5689ea5815fbb3a689ee6.camel@intel.com>



On 11/30/2023 4:10 PM, Teres Alexis, Alan Previn wrote:
>> As far as i can tell, its only if we started resetting / wedging right after this
>> queued worker got started.
> alan: hope Daniele can proof read my tracing and confirm if got it right.

Yup, we don't flush the worker in reset prepare, so there is a chance
that it might run in parallel with the reset/wedge code, which we handle by
checking the submission status. The list manipulation is protected by a
spinlock, so we're safe on that side. The rest of the approach also LGTM:

Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele
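
For readers following the thread, the pattern described above (an unflushed
worker that may overlap with reset, a status check before talking to the
firmware, and a lock around the list) can be sketched in userspace C. This is
only an illustration under stated assumptions, not the i915 code: struct ctx,
struct destroy_queue, submission_ok, reset_prepare() and destroy_worker() are
hypothetical names, a C11 atomic_flag stands in for the kernel spinlock, and
the "release locally" branch is an assumption about what the worker does when
the status check fails.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real i915 structures. */
struct ctx {
    int id;
    bool deregister_sent;   /* a deregister request was handed to the firmware */
    bool released_locally;  /* cleaned up without contacting the firmware */
    struct ctx *next;
};

struct destroy_queue {
    atomic_flag lock;           /* minimal spinlock guarding 'head' */
    struct ctx *head;           /* contexts waiting to be destroyed */
    atomic_bool submission_ok;  /* cleared by the reset/wedge path */
};

static void queue_init(struct destroy_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->head = NULL;
    atomic_init(&q->submission_ok, true);
}

static void queue_lock(struct destroy_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin until the current holder clears the flag */
}

static void queue_unlock(struct destroy_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* List manipulation only ever happens with the lock held. */
static void queue_push(struct destroy_queue *q, struct ctx *c)
{
    queue_lock(q);
    c->next = q->head;
    q->head = c;
    queue_unlock(q);
}

static struct ctx *queue_pop(struct destroy_queue *q)
{
    queue_lock(q);
    struct ctx *c = q->head;
    if (c)
        q->head = c->next;
    queue_unlock(q);
    if (c)
        c->next = NULL;
    return c;
}

/* Reset/wedge path: does not wait for the worker, only flips the status. */
static void reset_prepare(struct destroy_queue *q)
{
    atomic_store(&q->submission_ok, false);
}

/* Worker: drains the list and returns how many deregister requests it sent. */
static int destroy_worker(struct destroy_queue *q)
{
    int sent = 0;
    struct ctx *c;

    while ((c = queue_pop(q)) != NULL) {
        if (atomic_load(&q->submission_ok)) {
            c->deregister_sent = true;   /* normal path: ask the firmware */
            sent++;
        } else {
            c->released_locally = true;  /* assumed fallback during reset */
        }
    }
    return sent;
}
```

A caller would run queue_init(), queue_push() a context and then
destroy_worker(); calling reset_prepare() first makes the worker take the
local-release branch instead. The sketch does not model a status change that
lands between the check and the send.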

Thread overview: 18+ messages
2023-11-30  0:20 [Intel-gfx] [PATCH v7 0/2] Resolve suspend-resume racing with GuC destroy-context-worker Alan Previn
2023-11-30  0:20 ` Alan Previn
2023-11-30  0:20 ` [Intel-gfx] [PATCH v7 1/2] drm/i915/guc: Flush context destruction worker at suspend Alan Previn
2023-11-30  0:20   ` Alan Previn
2023-11-30  0:20 ` [Intel-gfx] [PATCH v7 2/2] drm/i915/guc: Close deregister-context race against CT-loss Alan Previn
2023-11-30  0:20   ` Alan Previn
2023-11-30 21:18   ` [Intel-gfx] " Rodrigo Vivi
2023-11-30 21:18     ` Rodrigo Vivi
2023-12-01  0:09     ` [Intel-gfx] " Teres Alexis, Alan Previn
2023-12-01  0:09       ` Teres Alexis, Alan Previn
2023-12-01  0:10       ` [Intel-gfx] " Teres Alexis, Alan Previn
2023-12-01  0:10         ` Teres Alexis, Alan Previn
2023-12-06 21:47         ` Daniele Ceraolo Spurio [this message]
2023-12-06 21:47           ` Daniele Ceraolo Spurio
2023-11-30  7:50 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for Resolve suspend-resume racing with GuC destroy-context-worker (rev7) Patchwork
2023-12-01  1:33 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Resolve suspend-resume racing with GuC destroy-context-worker (rev8) Patchwork
2023-12-01  2:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-12-01  5:10   ` Teres Alexis, Alan Previn
