public inbox for stable@vger.kernel.org
* [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Ville Syrjala @ 2023-10-12 13:28 UTC
  To: intel-gfx; +Cc: stable

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

If we can't find a free fence register to handle a fault in the GMADR
range, just return VM_FAULT_NOPAGE without populating the PTE so that
userspace will retry the access and trigger another fault. Eventually
we should find a free fence and the fault will get properly handled.

A further improvement idea might be to reserve a fence (or one per CPU?)
for the express purpose of handling faults without having to retry. But
that would require some additional work.

Looks like this may have gotten broken originally by
commit 39965b376601 ("drm/i915: don't trash the gtt when running out of fences")
as that changed the errno to -EDEADLK which wasn't handled by the gtt
fault code either. But later in commit 2feeb52859fc ("drm/i915/gt: Fix
-EDEADLK handling regression") I changed it again to -ENOBUFS as -EDEADLK
was now getting used for the ww mutex dance. So this fix only makes
sense after that last commit.

Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9479
Fixes: 2feeb52859fc ("drm/i915/gt: Fix -EDEADLK handling regression")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index aa4d842d4c5a..310654542b42 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -235,6 +235,7 @@ static vm_fault_t i915_error_to_vmf_fault(int err)
 	case 0:
 	case -EAGAIN:
 	case -ENOSPC: /* transient failure to evict? */
+	case -ENOBUFS: /* temporarily out of fences? */
 	case -ERESTARTSYS:
 	case -EINTR:
 	case -EBUSY:
-- 
2.41.0
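
The one-line change above can be illustrated with a small userspace sketch. It is not the driver code: the enum, the `sketch_*` names, and the fake fence allocator are hypothetical stand-ins, and only the errno cases visible in the hunk are modelled. The `default` branch is an assumption that stands for any error the fault code does not treat as retryable.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* codes. */
enum sketch_vmf { SKETCH_VMF_NOPAGE, SKETCH_VMF_SIGBUS };

/* ERESTARTSYS is kernel-internal, so userspace <errno.h> does not define it. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Models only the cases shown in the hunk. Anything else falls into an
 * assumed non-retryable default.
 */
static enum sketch_vmf sketch_error_to_vmf(int err)
{
	switch (err) {
	case 0:
	case -EAGAIN:
	case -ENOSPC:      /* transient failure to evict? */
	case -ENOBUFS:     /* temporarily out of fences? (the added case) */
	case -ERESTARTSYS:
	case -EINTR:
	case -EBUSY:
		return SKETCH_VMF_NOPAGE;
	default:
		return SKETCH_VMF_SIGBUS;
	}
}

/* Hypothetical allocator: no fence is free for the first busy_attempts tries. */
static int sketch_get_fence(int attempt, int busy_attempts)
{
	return attempt < busy_attempts ? -ENOBUFS : 0;
}

/* Count the faults taken until the PTE would finally be populated. */
static int sketch_faults_until_mapped(int busy_attempts)
{
	int faults = 0;

	for (;;) {
		int err = sketch_get_fence(faults, busy_attempts);

		faults++;
		/* Every outcome of this loop must map to a retryable fault. */
		assert(sketch_error_to_vmf(err) == SKETCH_VMF_NOPAGE);
		if (err == 0)
			return faults; /* fence found, PTE populated */
		/* PTE left empty: the access is re-executed and faults again. */
	}
}
```

In this model, `sketch_faults_until_mapped(3)` takes three retried faults and succeeds on the fourth. Moving `-ENOBUFS` out of the retryable group would make the assertion fire on the first attempt, which is the sketch's analogue of the unhandled error described in the commit message.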


* Re: [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Greg KH @ 2023-10-12 13:40 UTC
  To: Ville Syrjala; +Cc: intel-gfx, stable

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

* Re: [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Ville Syrjälä @ 2023-10-12 13:53 UTC
  To: Greg KH; +Cc: intel-gfx, stable

On Thu, Oct 12, 2023 at 03:40:08PM +0200, Greg KH wrote:
> <formletter>
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
> 
> </formletter>

Say what now?

-- 
Ville Syrjälä
Intel

* Re: [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Greg KH @ 2023-10-12 16:12 UTC
  To: Ville Syrjälä; +Cc: intel-gfx, stable

On Thu, Oct 12, 2023 at 04:53:38PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 12, 2023 at 03:40:08PM +0200, Greg KH wrote:
> > <formletter>
> > 
> > This is not the correct way to submit patches for inclusion in the
> > stable kernel tree.  Please read:
> >     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > for how to do this properly.
> > 
> > </formletter>
> 
> Say what now?

Sorry, my bot thought this was a patch sent only to stable, I've kicked
it a bit and it shouldn't do that again...

* Re: [Intel-gfx] [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Andi Shyti @ 2023-10-13 10:53 UTC
  To: Ville Syrjala; +Cc: intel-gfx, stable

Hi Ville,

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Andi

* Re: [Intel-gfx] [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Ville Syrjälä @ 2023-10-16 15:49 UTC
  To: Andi Shyti; +Cc: intel-gfx, stable

On Fri, Oct 13, 2023 at 12:53:59PM +0200, Andi Shyti wrote:
> Hi Ville,
> 
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> 

Thanks. Pushed to gt-next.

-- 
Ville Syrjälä
Intel

* Re: [PATCH] drm/i915: Retry gtt fault when out of fence register
From: Ville Syrjälä @ 2023-10-16 15:52 UTC
  To: Greg KH; +Cc: intel-gfx, stable

On Thu, Oct 12, 2023 at 06:12:26PM +0200, Greg KH wrote:
> On Thu, Oct 12, 2023 at 04:53:38PM +0300, Ville Syrjälä wrote:
> > On Thu, Oct 12, 2023 at 03:40:08PM +0200, Greg KH wrote:
> > > <formletter>
> > > 
> > > This is not the correct way to submit patches for inclusion in the
> > > stable kernel tree.  Please read:
> > >     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > for how to do this properly.
> > > 
> > > </formletter>
> > 
> > Say what now?
> 
> Sorry, my bot thought this was a patch sent only to stable, I've kicked
> it a bit and it shouldn't do that again...

Ah OK, thanks.

I was a bit worried that my reading comprehension had deteriorated enough
that I couldn't figure out what new requirement in the process I had
violated :)

-- 
Ville Syrjälä
Intel
