From: lauraa@codeaurora.org (Laura Abbott)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG] Circular locking dependency - DRM/CMA/MM/hotplug/...
Date: Fri, 21 Mar 2014 11:47:29 -0700 [thread overview]
Message-ID: <532C8941.4090104@codeaurora.org> (raw)
In-Reply-To: <xa1t8ut8wc2f.fsf@mina86.com>
On 2/18/2014 9:44 AM, Michal Nazarewicz wrote:
>> On 2014-02-12 17:33, Russell King - ARM Linux wrote:
>>> What if we did these changes:
>>>
>>> struct page *dma_alloc_from_contiguous(struct device *dev, int count,
>>> unsigned int align)
>>> {
>>> ...
>>> mutex_lock(&cma_mutex);
>>> ...
>>> for (;;) {
>>> pageno = bitmap_find_next_zero_area(cma->bitmap, cma->count,
>>> start, count, mask);
>>> if (pageno >= cma->count)
>>> break;
>>>
>>> pfn = cma->base_pfn + pageno;
>>> + bitmap_set(cma->bitmap, pageno, count);
>>> + mutex_unlock(&cma_mutex);
>>> ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
>>> + mutex_lock(&cma_mutex);
>>> if (ret == 0) {
>>> - bitmap_set(cma->bitmap, pageno, count);
>>> page = pfn_to_page(pfn);
>>> break;
>>> - } else if (ret != -EBUSY) {
>>> + }
>>> + bitmap_clear(cma->bitmap, pageno, count);
>>> + if (ret != -EBUSY) {
>>> break;
>>> }
>>> ...
>>> mutex_unlock(&cma_mutex);
>>> pr_debug("%s(): returned %p\n", __func__, page);
>>> return page;
>>> }
>
> Like Marek said, this will fail if two concurrent calls to
> alloc_contig_range are made such that they operate on the same pageblock
> (which is possible as the allocated regions do not need to be pageblock
> aligned).
>
> Another mutex could be added just for this one call, but as I understand
> this would not solve the problem.
>
>>> bool dma_release_from_contiguous(struct device *dev, struct page *pages,
>>> int count)
>>> {
>>> ...
>>> + free_contig_range(pfn, count);
>>> mutex_lock(&cma_mutex);
>>> bitmap_clear(cma->bitmap, pfn - cma->base_pfn, count);
>>> - free_contig_range(pfn, count);
>>> mutex_unlock(&cma_mutex);
>>> ...
>>> }
>
> This *should* be fine. Didn't test it.
>
I managed to hit a different deadlock that had a similar root cause.
I also managed to independently come up with a similar solution. This
has been tested somewhat but not in wide distribution.
Thanks,
Laura
----- 8< ------
>From 2aa000fbd4189d967c45c4f1ac5aee812ed83082 Mon Sep 17 00:00:00 2001
From: Laura Abbott <lauraa@codeaurora.org>
Date: Tue, 25 Feb 2014 11:01:19 -0800
Subject: [PATCH] cma: Remove potential deadlock situation
CMA locking is currently very coarse. The cma_mutex protects both
the bitmap and avoids concurrency with alloc_contig_range. There
are several situations which may result in a deadlock on the CMA
mutex currently, mostly involving AB/BA situations with alloc and
free. Fix this issue by protecting the bitmap with a mutex per CMA
region and use the existing mutex for protecting against concurrency
with alloc_contig_range.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
drivers/base/dma-contiguous.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index 165c2c2..dfb48ef 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -37,6 +37,7 @@ struct cma {
unsigned long base_pfn;
unsigned long count;
unsigned long *bitmap;
+ struct mutex lock;
};
struct cma *dma_contiguous_default_area;
@@ -161,6 +162,7 @@ static int __init cma_activate_area(struct cma *cma)
init_cma_reserved_pageblock(pfn_to_page(base_pfn));
} while (--i);
+ mutex_init(&cma->lock);
return 0;
}
@@ -261,6 +263,13 @@ err:
return ret;
}
+static void clear_cma_bitmap(struct cma *cma, unsigned long pfn, int count)
+{
+ mutex_lock(&cma->lock);
+ bitmap_clear(cma->bitmap, pfn - cma->base_pfn, count);
+ mutex_unlock(&cma->lock);
+}
+
/**
* dma_alloc_from_contiguous() - allocate pages from contiguous area
* @dev: Pointer to device for which the allocation is performed.
@@ -294,30 +303,41 @@ struct page *dma_alloc_from_contiguous(struct device *dev, int count,
mask = (1 << align) - 1;
- mutex_lock(&cma_mutex);
for (;;) {
+ mutex_lock(&cma->lock);
pageno = bitmap_find_next_zero_area(cma->bitmap, cma->count,
start, count, mask);
- if (pageno >= cma->count)
+ if (pageno >= cma->count) {
+ mutex_unlock(&cma_mutex);
break;
+ }
+ bitmap_set(cma->bitmap, pageno, count);
+ /*
+ * It's safe to drop the lock here. We've marked this region for
+ * our exclusive use. If the migration fails we will take the
+ * lock again and unmark it.
+ */
+ mutex_unlock(&cma->lock);
pfn = cma->base_pfn + pageno;
+ mutex_lock(&cma_mutex);
ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
+ mutex_unlock(&cma_mutex);
if (ret == 0) {
- bitmap_set(cma->bitmap, pageno, count);
page = pfn_to_page(pfn);
break;
} else if (ret != -EBUSY) {
+ clear_cma_bitmap(cma, pfn, count);
break;
}
+ clear_cma_bitmap(cma, pfn, count);
pr_debug("%s(): memory range at %p is busy, retrying\n",
__func__, pfn_to_page(pfn));
/* try again with a bit different memory target */
start = pageno + mask + 1;
}
- mutex_unlock(&cma_mutex);
pr_debug("%s(): returned %p\n", __func__, page);
return page;
}
@@ -350,10 +370,8 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
- mutex_lock(&cma_mutex);
- bitmap_clear(cma->bitmap, pfn - cma->base_pfn, count);
free_contig_range(pfn, count);
- mutex_unlock(&cma_mutex);
+ clear_cma_bitmap(cma, pfn, count);
return true;
}
--
Code Aurora Forum chooses to take this file under the GPL v 2 license only.
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
next prev parent reply other threads:[~2014-03-21 18:47 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-11 18:35 [BUG] Circular locking dependency - DRM/CMA/MM/hotplug/ Russell King - ARM Linux
2014-02-12 9:06 ` Daniel Vetter
2014-02-12 15:40 ` Marek Szyprowski
2014-02-12 16:33 ` Russell King - ARM Linux
2014-02-12 18:29 ` Daniel Vetter
2014-02-12 18:42 ` Russell King - ARM Linux
2014-02-18 14:25 ` Marek Szyprowski
2014-02-18 14:32 ` Russell King - ARM Linux
2014-02-18 17:44 ` Michal Nazarewicz
2014-03-21 18:47 ` Laura Abbott [this message]
2014-03-24 14:11 ` Michal Nazarewicz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=532C8941.4090104@codeaurora.org \
--to=lauraa@codeaurora.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).