All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Subject: [PATCH] dma-buf/fence-array: fix deadlock in fence-array
Date: Mon, 17 Oct 2016 14:40:12 -0400	[thread overview]
Message-ID: <1476729612-22602-1-git-send-email-robdclark@gmail.com> (raw)

Currently with fence-array, we have a potential deadlock situation.  If we
fence_add_callback() on an array-fence, the array-fence's lock is acquired
first, and in it's ->enable_signaling() callback, it will install cb's on
it's array-member fences, so the array-member's lock is acquired second.

But in the signal path, the array-member's lock is acquired first, and the
array-fence's lock acquired second.

To solve that, always enabling signaling up-front (in the fence_array
constructor) without the fence_array's lock held.

lockdep splat:

 ======================================================
[ INFO: possible circular locking dependency detected ]
4.7.0-rc7+ #489 Not tainted
-------------------------------------------------------
surfaceflinger/2034 is trying to acquire lock:
 (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8

but task is already holding lock:
 (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&obj->child_list_lock)->rlock){......}:
       [<ffff000008108924>] __lock_acquire+0x173c/0x18d8
       [<ffff000008108e0c>] lock_acquire+0x4c/0x68
       [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
       [<ffff00000858d05c>] fence_add_callback+0x3c/0x100
       [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170
       [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100
       [<ffff00000858f504>] sync_file_poll+0xd4/0xf0
       [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438
       [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8
       [<ffff000008084f30>] el0_svc_naked+0x24/0x28

-> #0 (&(&array->lock)->rlock){......}:
       [<ffff000008104768>] print_circular_bug+0x80/0x2e0
       [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8
       [<ffff000008108e0c>] lock_acquire+0x4c/0x68
       [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
       [<ffff00000858cddc>] fence_signal+0x5c/0xf8
       [<ffff00000858f268>] fence_array_cb_func+0x78/0x88
       [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0
       [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0
       [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790
       [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0
       [<ffff000008084f30>] el0_svc_naked+0x24/0x28

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&obj->child_list_lock)->rlock);
                               lock(&(&array->lock)->rlock);
                               lock(&(&obj->child_list_lock)->rlock);
  lock(&(&array->lock)->rlock);

 *** DEADLOCK ***

1 lock held by surfaceflinger/2034:
 #0:  (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0

Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
---
 drivers/dma-buf/fence-array.c |  8 ++++++++
 drivers/dma-buf/fence.c       | 21 +++++++++++++++++++++
 include/linux/fence.h         |  1 +
 3 files changed, 30 insertions(+)

diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
index f1989fc..b5821e9 100644
--- a/drivers/dma-buf/fence-array.c
+++ b/drivers/dma-buf/fence-array.c
@@ -140,6 +140,14 @@ struct fence_array *fence_array_create(int num_fences, struct fence **fences,
 	atomic_set(&array->num_pending, signal_on_any ? 1 : num_fences);
 	array->fences = fences;
 
+	/* enable signaling without our lock being held while callbacks
+	 * are installed on the array-member fences.  Otherwise there is
+	 * a potential deadlock between enabling signaling (our lock
+	 * acquired first) and the array-members getting signalled (their
+	 * lock aquired first).
+	 */
+	fence_enable_sw_signaling_nolock(&array->base);
+
 	return array;
 }
 EXPORT_SYMBOL(fence_array_create);
diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
index cc05ddd..ff6b410 100644
--- a/drivers/dma-buf/fence.c
+++ b/drivers/dma-buf/fence.c
@@ -219,6 +219,27 @@ void fence_enable_sw_signaling(struct fence *fence)
 EXPORT_SYMBOL(fence_enable_sw_signaling);
 
 /**
+ * fence_enable_sw_signaling_nolock - enable signaling on fence without taking lock
+ * @fence:	[in]	the fence to enable
+ *
+ * WARNING this is only safe to use when you know you have the only reference
+ * to the fence there is no possibility for race conditions as a result of
+ * calling ops->enable_signaling() without lock.  Ie. it is probably only safe
+ * to use from the fence's construction function.
+ */
+void fence_enable_sw_signaling_nolock(struct fence *fence)
+{
+	if (!test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags) &&
+	    !test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		trace_fence_enable_signal(fence);
+
+		if (!fence->ops->enable_signaling(fence))
+			fence_signal(fence);
+	}
+}
+EXPORT_SYMBOL(fence_enable_sw_signaling_nolock);
+
+/**
  * fence_add_callback - add a callback to be called when the fence
  * is signaled
  * @fence:	[in]	the fence to wait on
diff --git a/include/linux/fence.h b/include/linux/fence.h
index 0d76305..076ac29 100644
--- a/include/linux/fence.h
+++ b/include/linux/fence.h
@@ -226,6 +226,7 @@ int fence_add_callback(struct fence *fence, struct fence_cb *cb,
 		       fence_func_t func);
 bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
 void fence_enable_sw_signaling(struct fence *fence);
+void fence_enable_sw_signaling_nolock(struct fence *fence);
 
 /**
  * fence_is_signaled_locked - Return an indication if the fence is signaled yet.
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

             reply	other threads:[~2016-10-17 18:40 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-17 18:40 Rob Clark [this message]
2016-10-17 18:52 ` [PATCH] dma-buf/fence-array: fix deadlock in fence-array Gustavo Padovan
2016-10-17 18:59   ` Rob Clark
2016-10-17 19:26     ` Gustavo Padovan
2016-10-17 19:39     ` Chris Wilson
2016-10-17 19:44       ` Gustavo Padovan
2016-10-18  7:41         ` Daniel Vetter
2016-10-18 11:49           ` Rob Clark

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1476729612-22602-1-git-send-email-robdclark@gmail.com \
    --to=robdclark@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gustavo.padovan@collabora.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.