* [PATCH] drm/i915: Avoiding recursing on ww_mutex inside shrinker
From: Chris Wilson @ 2017-02-27 22:39 UTC
  To: intel-gfx; +Cc: Matthew Auld

We have to avoid blocking on the reservation ww_mutex inside the shrinker.
We use it as a plain mutex, and allocations made while holding it can
enter reclaim, so taking it again from the shrinker risks a recursive
deadlock:

[  602.771969] =================================
[  602.771970] [ INFO: inconsistent lock state ]
[  602.771973] 4.10.0gpudebug+ #122 Not tainted
[  602.771974] ---------------------------------
[  602.771975] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[  602.771978] kswapd0/40 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  602.771979]  (reservation_ww_class_mutex){+.+.?.}, at: [<ffffffffa054680a>] i915_gem_object_wait+0x39a/0x410 [i915]
[  602.772020] {RECLAIM_FS-ON-W} state was registered at:
[  602.772024]   mark_held_locks+0x76/0x90
[  602.772026]   lockdep_trace_alloc+0xb8/0xc0
[  602.772028]   __kmalloc_track_caller+0x5d/0x130
[  602.772031]   krealloc+0x89/0xb0
[  602.772033]   reservation_object_reserve_shared+0xaf/0xd0
[  602.772055]   i915_gem_do_execbuffer.isra.35+0x1413/0x18b0 [i915]
[  602.772075]   i915_gem_execbuffer2+0x10e/0x1d0 [i915]
[  602.772078]   drm_ioctl+0x291/0x480
[  602.772079]   do_vfs_ioctl+0x695/0x6f0
[  602.772081]   SyS_ioctl+0x3c/0x70
[  602.772084]   entry_SYSCALL_64_fastpath+0x18/0xad
[  602.772085] irq event stamp: 5197423
[  602.772088] hardirqs last  enabled at (5197423): [<ffffffff8116751d>] kfree+0xdd/0x170
[  602.772091] hardirqs last disabled at (5197422): [<ffffffff811674f9>] kfree+0xb9/0x170
[  602.772095] softirqs last  enabled at (5190992): [<ffffffff8107bfe1>] __do_softirq+0x221/0x280
[  602.772097] softirqs last disabled at (5190575): [<ffffffff8107c294>] irq_exit+0x64/0xc0
[  602.772099]
               other info that might help us debug this:
[  602.772100]  Possible unsafe locking scenario:

[  602.772101]        CPU0
[  602.772101]        ----
[  602.772102]   lock(reservation_ww_class_mutex);
[  602.772104]   <Interrupt>
[  602.772105]     lock(reservation_ww_class_mutex);
[  602.772107]
                *** DEADLOCK ***

[  602.772109] 2 locks held by kswapd0/40:
[  602.772110]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff811337b5>] shrink_slab.constprop.62+0x35/0x280
[  602.772116]  #1:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0553957>] i915_gem_shrinker_lock+0x27/0x60 [i915]
[  602.772141]
               stack backtrace:
[  602.772144] CPU: 2 PID: 40 Comm: kswapd0 Not tainted 4.10.0gpudebug+ #122
[  602.772145] Hardware name: LENOVO 42433ZG/42433ZG, BIOS 8AET64WW (1.44 ) 07/26/2013
[  602.772147] Call Trace:
[  602.772151]  dump_stack+0x68/0xa1
[  602.772153]  print_usage_bug+0x1d4/0x1f0
[  602.772155]  mark_lock+0x390/0x530
[  602.772157]  ? print_irq_inversion_bug+0x200/0x200
[  602.772159]  __lock_acquire+0x405/0x1260
[  602.772181]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[  602.772183]  lock_acquire+0x60/0x80
[  602.772205]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[  602.772207]  mutex_lock_nested+0x69/0x760
[  602.772229]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[  602.772231]  ? kfree+0xdd/0x170
[  602.772253]  ? i915_gem_object_wait+0x163/0x410 [i915]
[  602.772255]  ? trace_hardirqs_on_caller+0x18d/0x1c0
[  602.772256]  ? trace_hardirqs_on+0xd/0x10
[  602.772278]  i915_gem_object_wait+0x39a/0x410 [i915]
[  602.772300]  i915_gem_object_unbind+0x5e/0x130 [i915]
[  602.772323]  i915_gem_shrink+0x22d/0x3d0 [i915]
[  602.772347]  i915_gem_shrinker_scan+0x3f/0x80 [i915]
[  602.772349]  shrink_slab.constprop.62+0x1ad/0x280
[  602.772352]  shrink_node+0x52/0x80
[  602.772355]  kswapd+0x427/0x5c0
[  602.772358]  kthread+0x122/0x130
[  602.772360]  ? try_to_free_pages+0x270/0x270
[  602.772362]  ? kthread_stop+0x70/0x70
[  602.772365]  ret_from_fork+0x2e/0x40

Reported-by: Jan Nordholz <jckn@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99977#c10
Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4c645f8ab05d..68d1a59ec4e6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -466,10 +466,11 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
 	dma_fence_put(excl);
 
 	if (prune_fences && !__read_seqcount_retry(&resv->seq, seq)) {
-		reservation_object_lock(resv, NULL);
-		if (!__read_seqcount_retry(&resv->seq, seq))
-			reservation_object_add_excl_fence(resv, NULL);
-		reservation_object_unlock(resv);
+		if (reservation_object_trylock(resv)) {
+			if (!__read_seqcount_retry(&resv->seq, seq))
+				reservation_object_add_excl_fence(resv, NULL);
+			reservation_object_unlock(resv);
+		}
 	}
 
 	return timeout;
-- 
2.11.0
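
The trylock pattern in the hunk above can be sketched in userspace. This is
only an illustration using POSIX threads, not the i915 code: struct resv,
its fields, and prune_if_uncontended() are hypothetical names, and a plain
counter stands in for the seqcount.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reservation object. */
struct resv {
	pthread_mutex_t lock;
	unsigned int seq;      /* bumped whenever the fence set changes */
	bool has_stale_fence;  /* stands in for completed fences */
};

/*
 * Opportunistic prune: never block on the lock. If it is contended
 * (for example, already held further up our own call chain), skip
 * the cleanup and let a later waiter do it.
 */
static bool prune_if_uncontended(struct resv *r, unsigned int seen_seq)
{
	bool pruned = false;

	if (pthread_mutex_trylock(&r->lock) != 0)
		return false;

	/* Recheck under the lock: only prune if nothing changed meanwhile. */
	if (r->seq == seen_seq) {
		r->has_stale_fence = false;
		r->seq++;
		pruned = true;
	}

	pthread_mutex_unlock(&r->lock);
	return pruned;
}
```

Calling prune_if_uncontended() while the mutex is already held returns
false instead of blocking, which is the property the shrinker path needs.
The cost is that stale fences may linger until a later uncontended wait.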

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* ✗ Fi.CI.BAT: failure for drm/i915: Avoiding recursing on ww_mutex inside shrinker
From: Patchwork @ 2017-02-27 23:22 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Avoiding recursing on ww_mutex inside shrinker
URL   : https://patchwork.freedesktop.org/series/20328/
State : failure

== Summary ==

Series 20328v1 drm/i915: Avoiding recursing on ww_mutex inside shrinker
https://patchwork.freedesktop.org/api/1.0/series/20328/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600)

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11 
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19 
fi-bxt-t5700     total:108  pass:95   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50 
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10 
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17 
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18 
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10 
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28 
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29 

1a8bd0fb40e5d02f827f925b7702ed6f64fadce2 drm-tip: 2017y-02m-27d-22h-04m-19s UTC integration manifest
56e84b1 drm/i915: Avoiding recursing on ww_mutex inside shrinker

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3989/


* Re: [PATCH] drm/i915: Avoiding recursing on ww_mutex inside shrinker
From: Joonas Lahtinen @ 2017-02-28 14:21 UTC
  To: Chris Wilson, intel-gfx; +Cc: Matthew Auld

On Mon, 2017-02-27 at 22:39 +0000, Chris Wilson wrote:
> We have to avoid blocking on the reservation ww_mutex inside the shrinker.
> We use it as a plain mutex, and allocations made while holding it can
> enter reclaim, so taking it again from the shrinker risks a recursive
> deadlock:
> 
> [  602.771969] =================================
> [  602.771970] [ INFO: inconsistent lock state ]
> [  602.771973] 4.10.0gpudebug+ #122 Not tainted
> [  602.771974] ---------------------------------
> [  602.771975] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> [  602.771978] kswapd0/40 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [  602.771979]  (reservation_ww_class_mutex){+.+.?.}, at: [<ffffffffa054680a>] i915_gem_object_wait+0x39a/0x410 [i915]
> [  602.772020] {RECLAIM_FS-ON-W} state was registered at:
> [  602.772024]   mark_held_locks+0x76/0x90
> [  602.772026]   lockdep_trace_alloc+0xb8/0xc0
> [  602.772028]   __kmalloc_track_caller+0x5d/0x130
> [  602.772031]   krealloc+0x89/0xb0
> [  602.772033]   reservation_object_reserve_shared+0xaf/0xd0
> [  602.772055]   i915_gem_do_execbuffer.isra.35+0x1413/0x18b0 [i915]
> [  602.772075]   i915_gem_execbuffer2+0x10e/0x1d0 [i915]
> [  602.772078]   drm_ioctl+0x291/0x480
> [  602.772079]   do_vfs_ioctl+0x695/0x6f0
> [  602.772081]   SyS_ioctl+0x3c/0x70
> [  602.772084]   entry_SYSCALL_64_fastpath+0x18/0xad
> [  602.772085] irq event stamp: 5197423
> [  602.772088] hardirqs last  enabled at (5197423): [<ffffffff8116751d>] kfree+0xdd/0x170
> [  602.772091] hardirqs last disabled at (5197422): [<ffffffff811674f9>] kfree+0xb9/0x170
> [  602.772095] softirqs last  enabled at (5190992): [<ffffffff8107bfe1>] __do_softirq+0x221/0x280
> [  602.772097] softirqs last disabled at (5190575): [<ffffffff8107c294>] irq_exit+0x64/0xc0
> [  602.772099]
>                other info that might help us debug this:
> [  602.772100]  Possible unsafe locking scenario:
> 
> [  602.772101]        CPU0
> [  602.772101]        ----
> [  602.772102]   lock(reservation_ww_class_mutex);
> [  602.772104]   <Interrupt>
> [  602.772105]     lock(reservation_ww_class_mutex);
> [  602.772107]
>                 *** DEADLOCK ***
> 
> [  602.772109] 2 locks held by kswapd0/40:
> [  602.772110]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff811337b5>] shrink_slab.constprop.62+0x35/0x280
> [  602.772116]  #1:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0553957>] i915_gem_shrinker_lock+0x27/0x60 [i915]
> [  602.772141]
>                stack backtrace:
> [  602.772144] CPU: 2 PID: 40 Comm: kswapd0 Not tainted 4.10.0gpudebug+ #122
> [  602.772145] Hardware name: LENOVO 42433ZG/42433ZG, BIOS 8AET64WW (1.44 ) 07/26/2013
> [  602.772147] Call Trace:
> [  602.772151]  dump_stack+0x68/0xa1
> [  602.772153]  print_usage_bug+0x1d4/0x1f0
> [  602.772155]  mark_lock+0x390/0x530
> [  602.772157]  ? print_irq_inversion_bug+0x200/0x200
> [  602.772159]  __lock_acquire+0x405/0x1260
> [  602.772181]  ? i915_gem_object_wait+0x39a/0x410 [i915]
> [  602.772183]  lock_acquire+0x60/0x80
> [  602.772205]  ? i915_gem_object_wait+0x39a/0x410 [i915]
> [  602.772207]  mutex_lock_nested+0x69/0x760
> [  602.772229]  ? i915_gem_object_wait+0x39a/0x410 [i915]
> [  602.772231]  ? kfree+0xdd/0x170
> [  602.772253]  ? i915_gem_object_wait+0x163/0x410 [i915]
> [  602.772255]  ? trace_hardirqs_on_caller+0x18d/0x1c0
> [  602.772256]  ? trace_hardirqs_on+0xd/0x10
> [  602.772278]  i915_gem_object_wait+0x39a/0x410 [i915]
> [  602.772300]  i915_gem_object_unbind+0x5e/0x130 [i915]
> [  602.772323]  i915_gem_shrink+0x22d/0x3d0 [i915]
> [  602.772347]  i915_gem_shrinker_scan+0x3f/0x80 [i915]
> [  602.772349]  shrink_slab.constprop.62+0x1ad/0x280
> [  602.772352]  shrink_node+0x52/0x80
> [  602.772355]  kswapd+0x427/0x5c0
> [  602.772358]  kthread+0x122/0x130
> [  602.772360]  ? try_to_free_pages+0x270/0x270
> [  602.772362]  ? kthread_stop+0x70/0x70
> [  602.772365]  ret_from_fork+0x2e/0x40
> 
> Reported-by: Jan Nordholz <jckn@gmx.net>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99977#c10
> Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>

<SNIP>

> @@ -466,10 +466,11 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
>  	dma_fence_put(excl);
> 

Please add a comment here noting that this pruning is only opportunistic.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas

>  	if (prune_fences && !__read_seqcount_retry(&resv->seq, seq)) {
> -		reservation_object_lock(resv, NULL);
> -		if (!__read_seqcount_retry(&resv->seq, seq))
> -			reservation_object_add_excl_fence(resv, NULL);
> -		reservation_object_unlock(resv);
> +		if (reservation_object_trylock(resv)) {
> +			if (!__read_seqcount_retry(&resv->seq, seq))
> +				reservation_object_add_excl_fence(resv, NULL);
> +			reservation_object_unlock(resv);
> +		}
>  	}
>  
>  	return timeout;
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
