Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>,
	<intel-xe@lists.freedesktop.org>,
	John Harrison <John.C.Harrison@intel.com>
Subject: Re: [PATCH] drm/xe/pxp: Don't kill queues while holding the spinlock
Date: Thu, 13 Feb 2025 12:19:46 -0800	[thread overview]
Message-ID: <Z65T4j6Z81Hxl0so@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <1cf17877-18ae-4cd0-8be5-9a5abd46c58d@intel.com>

On Thu, Feb 13, 2025 at 09:23:26AM -0800, Daniele Ceraolo Spurio wrote:
> 
> 
> On 2/12/2025 10:42 PM, Dan Carpenter wrote:
> > On Wed, Feb 12, 2025 at 05:26:55PM -0800, Matthew Brost wrote:
> > > On Wed, Feb 12, 2025 at 04:40:32PM -0800, Daniele Ceraolo Spurio wrote:
> > > > xe_exec_queue_kill can sleep, so we can't call it from under the lock.
> > > > We can instead move the queues to a separate list and then kill them all
> > > > after we release the lock.
> > > > 
> > > > Since being in the list is used to track whether RPM cleanup is needed,
> > > > we can no longer defer that to queue_destroy, so we perform it
> > > > immediately instead.
> > > > 
> > > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > > > Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start")
> > > > Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > > Patch LGTM, but can this actually happen? i.e. can or do we enable
> > > PXP on LR queues?
> > > 
> > This isn't really an answer to your question, but when I reported this
> > bug I didn't notice the if (xe_vm_in_preempt_fence_mode()) check in
> > xe_vm_remove_compute_exec_queue().  So it's possible that this was a
> > false positive?
> 
> We currently don't have a use-case where we need a vm in preempt_fence_mode
> for a queue that uses PXP, but I didn't block the combination because there
> is a chance we might want to use it in the future (compute PXP is supported
> by the HW, even if we don't currently support it in Xe), so a user can still
> set things up that way.
> 
> > 
> > > Also, as a follow-up, should we add a might_sleep() to xe_exec_queue_kill
> > > to catch this type of bug immediately?
> > There is a might_sleep() in down_write().  If this is a real bug that
> > would have caught it.  The problem is that people don't generally test
> > with CONFIG_DEBUG_ATOMIC_SLEEP so the might_sleep() calls are turned off.
> 
> We do have CONFIG_DEBUG_ATOMIC_SLEEP enabled in CI (and I have it locally
> since I use the CI config), but since PXP + preempt_fence_mode is not an
> expected use-case we don't have any tests that cover that combination, so we
> return early from that xe_vm_remove_compute_exec_queue() and don't hit the
> down_write/might_sleep. I'll see if I can add a test to cover it, as there
> might be other issues I've missed.
> Also, I don't think it'd be right to add a might_sleep at the top of the
> exec_queue_kill() function either, because if a caller is sure that
> xe_vm_in_preempt_fence_mode() is false they should be allowed to
> call exec_queue_kill() from atomic context.

I see what you are saying here, but allowing something like 'we know this
is not a preempt queue, so it is safe to kill it in an atomic context'
seems risky and a very odd thing to support. IMO we should just make it
clear that this function can't be called in an atomic context.
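
e.g. an unconditional assert right at the top of the function (a rough
sketch only, with the rest of the body elided):

void xe_exec_queue_kill(struct xe_exec_queue *q)
{
	/*
	 * Assert the sleeping-context requirement unconditionally so
	 * CONFIG_DEBUG_ATOMIC_SLEEP trips for any atomic caller right
	 * away, not only when the preempt-fence path reaches down_write().
	 */
	might_sleep();

	/* ... existing kill logic unchanged ... */
}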

We likely have some upcoming TLB invalidation changes too, which I think
will move all queues to a per-VM list protected by a sleeping lock.
Removal from that list should likely be done in kill too. This is
speculation, however.

I agree some test cases for preempt queues and PXP would be a good idea
if this isn't explicitly disallowed at the IOCTL level.
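
FWIW the fix itself is the usual 'splice under the spinlock, kill outside
it' shape; roughly like below (a sketch only, with the list/lock member
names assumed from the commit message rather than copied from the diff):

struct xe_exec_queue *q, *next;
LIST_HEAD(to_kill);

/* Under the spinlock, only move the queues off the tracked list. */
spin_lock_irq(&pxp->queues.lock);
list_splice_init(&pxp->queues.list, &to_kill);
spin_unlock_irq(&pxp->queues.lock);

/*
 * Outside the spinlock it is safe to sleep, so kill here. Since the
 * queues are no longer on the tracked list, the RPM reference is
 * dropped now instead of being deferred to queue_destroy.
 */
list_for_each_entry_safe(q, next, &to_kill, pxp.link) {
	list_del_init(&q->pxp.link);
	xe_exec_queue_kill(q);
}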

Matt

> 
> Thanks,
> Daniele
> 
> > 
> > regards,
> > dan carpenter
> > 
> 

Thread overview: 12+ messages
2025-02-13  0:40 [PATCH] drm/xe/pxp: Don't kill queues while holding the spinlock Daniele Ceraolo Spurio
2025-02-13  0:47 ` ✓ CI.Patch_applied: success for " Patchwork
2025-02-13  0:47 ` ✗ CI.checkpatch: warning " Patchwork
2025-02-13  0:48 ` ✗ CI.KUnit: failure " Patchwork
2025-02-13  1:26 ` [PATCH] " Matthew Brost
2025-02-13  6:42   ` Dan Carpenter
2025-02-13 17:23     ` Daniele Ceraolo Spurio
2025-02-13 20:19       ` Matthew Brost [this message]
2025-02-19  0:38         ` Daniele Ceraolo Spurio
2025-02-19  3:18           ` Matthew Brost
2025-02-19  3:20             ` Matthew Brost
2025-02-19 21:33               ` Daniele Ceraolo Spurio
