Intel-XE Archive on lore.kernel.org
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>,
	Matthew Auld <matthew.auld@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [PATCH] drm/xe/vm: don't ignore error when in_kthread
Date: Thu, 08 Feb 2024 10:22:02 +0100
Message-ID: <d9fa7cf66c85f52e8c09fc0e3459a2c0d05abf78.camel@linux.intel.com>
In-Reply-To: <ZcEr9axBgGiIoFee@DUT025-TGLU.fm.intel.com>

On Mon, 2024-02-05 at 18:41 +0000, Matthew Brost wrote:
> On Fri, Feb 02, 2024 at 05:14:36PM +0000, Matthew Auld wrote:
> > If GUP fails and we are in_kthread, we can have pinned = 0 and ret = 0.
> > If that happens we call sg_alloc_append_table_from_pages() with
> > n_pages = 0, which is not well behaved and can trigger:
> > 
> > kernel BUG at include/linux/scatterlist.h:115!
> > 
> > depending on if the pages array happens to be zeroed or not. Even if we
> > don't hit that it crashes later when trying to dma_map the returned
> > table.
> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> 
> Someone from Habana pointed this out a while back and I forgot to
> follow up on fixing it. Thanks for fixing this; it looks correct.
> 
> Should we include a Fixes tag here? I am thinking so.
> 
> With a fixes tag:
> Reviewed-by: Matthew Brost <matthew.brost@intel.com>

Hi Matt + Matt,

I think this requires yet another fix. The reason for this odd
construct was that on process exit (CTRL-C), the userptr mappings are
torn down, leading to an -EFAULT (-14) here. With this patch the error
is propagated to the rebind worker and we get a printout like

[  188.922692] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922913] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922943] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922948] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922952] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922956] xe 0000:03:00.0: [drm] VM worker error: -14
[  188.922960] xe 0000:03:00.0: [drm] VM worker error: -14

(xe-exec-threads --r threads-cm-userptr-invalidate-race + CTRL-C)

And the idea was that the rebind worker would just be re-enabled without
setting up these bindings. If any job was then still accessing this
address (it shouldn't at this point, right?) we'd catch this with an
IOMMU pagefault or similar.
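
To make the two behaviours concrete, here is a minimal userspace sketch of
the pin loop, not the driver code. All model_* names are hypothetical; the
gup callback stands in for the GUP call, and the ignore_error flag stands in
for the in_kthread special case that the patch removes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GUP call: pages pinned, or -errno. */
typedef int (*model_gup_fn)(int nr_pages);

/* Mapping already torn down (process exit): every attempt faults. */
static int model_gup_fault(int nr_pages)
{
	(void)nr_pages;
	return -EFAULT;
}

/* Healthy mapping: everything requested gets pinned. */
static int model_gup_ok(int nr_pages)
{
	return nr_pages;
}

/*
 * Model of the loop touched by the patch. With ignore_error set (the old
 * in_kthread case) a failure is swallowed and the caller sees ret == 0
 * with *pinned_out == 0. With it clear (patched) the error is returned.
 */
static int model_pin_pages(model_gup_fn gup, int num_pages,
			   bool ignore_error, int *pinned_out)
{
	int pinned = 0;
	int ret = 0;

	assert(pinned_out);

	while (pinned < num_pages) {
		ret = gup(num_pages - pinned);
		if (ret < 0) {
			if (ignore_error)
				ret = 0; /* old behaviour: hide the failure */
			break;
		}

		pinned += ret;
		ret = 0;
	}

	*pinned_out = pinned;
	return ret;
}
```

Calling model_pin_pages(model_gup_fault, 4, true, &pinned) returns 0 with
pinned == 0, which is the state that ends up in
sg_alloc_append_table_from_pages() with n_pages = 0. With ignore_error
false the same call returns -EFAULT, which is what now reaches the rebind
worker.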

But in any case, we need to filter out the above log spamming.
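
One possible shape for such a filter, purely as a sketch: the policy below
(stay quiet for -EFAULT when the VM is already being torn down, report
everything else) is an assumption and not something decided in this thread.
The model_* names and the vm_closing flag are hypothetical and do not refer
to existing xe functions or fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical policy: -EFAULT while the owning process is exiting is
 * expected and not worth a log line; any other error is still reported.
 */
static bool model_should_log(int err, bool vm_closing)
{
	assert(err <= 0); /* kernel-style: 0 or a negative errno */

	if (!err)
		return false;

	return !(err == -EFAULT && vm_closing);
}

/* Returns 1 if a line was printed, 0 if the error was filtered out. */
static int model_report_worker_error(int err, bool vm_closing)
{
	if (!model_should_log(err, vm_closing))
		return 0;

	fprintf(stderr, "VM worker error: %d\n", err);
	return 1;
}
```

With this policy the -14 burst above would be dropped when it comes from an
exiting process, while an -EFAULT on a live VM, or any other error, would
still be printed.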

/Thomas

> 
> > ---
> >  drivers/gpu/drm/xe/xe_vm.c | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 9c1c68a2fff7..63aeb3aead04 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -114,11 +114,8 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
> >  					  num_pages - pinned,
> >  					  read_only ? 0 : FOLL_WRITE,
> >  					  &pages[pinned]);
> > -		if (ret < 0) {
> > -			if (in_kthread)
> > -				ret = 0;
> > +		if (ret < 0)
> >  			break;
> > -		}
> >  
> >  		pinned += ret;
> >  		ret = 0;
> > -- 
> > 2.43.0
> > 

