From mboxrd@z Thu Jan 1 00:00:00 1970
From: mina86@mina86.com (Michal Nazarewicz)
Date: Mon, 24 Mar 2014 15:11:41 +0100
Subject: [BUG] Circular locking dependency - DRM/CMA/MM/hotplug/...
In-Reply-To: <532C8941.4090104@codeaurora.org>
References: <20140211183543.GK26684@n2100.arm.linux.org.uk>
 <52FB9602.1000805@samsung.com>
 <20140212163317.GQ26684@n2100.arm.linux.org.uk>
 <53036D51.2070502@samsung.com>
 <532C8941.4090104@codeaurora.org>
Message-ID:
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Fri, Mar 21 2014, Laura Abbott wrote:
> From: Laura Abbott
> Date: Tue, 25 Feb 2014 11:01:19 -0800
> Subject: [PATCH] cma: Remove potential deadlock situation
>
> CMA locking is currently very coarse. The cma_mutex protects both
> the bitmap and avoids concurrency with alloc_contig_range. There
> are several situations which may result in a deadlock on the CMA
> mutex currently, mostly involving AB/BA situations with alloc and
> free. Fix this issue by protecting the bitmap with a mutex per CMA
> region and use the existing mutex for protecting against concurrency
> with alloc_contig_range.
>
> Signed-off-by: Laura Abbott

Acked-by: Michal Nazarewicz

Furthermore, since CMA regions are always MAX_ORDER-page or pageblock
(whichever is bigger) aligned, we could use two mutexes per CMA region:
one protecting the bitmap and the other one protecting calls to
alloc_contig_range touching given region.

On the other hand, we could also go the other way and have two global
mutexes: one protecting all the bitmaps in all the regions and another
protecting calls to alloc_contig_range.

Either way, I think the below patch should work and fix the problem.

> ---
>  drivers/base/dma-contiguous.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
> index 165c2c2..dfb48ef 100644
> --- a/drivers/base/dma-contiguous.c
> +++ b/drivers/base/dma-contiguous.c
> @@ -37,6 +37,7 @@ struct cma {
>  	unsigned long	base_pfn;
>  	unsigned long	count;
>  	unsigned long	*bitmap;
> +	struct mutex	lock;
>  };
>
>  struct cma *dma_contiguous_default_area;
> @@ -161,6 +162,7 @@ static int __init cma_activate_area(struct cma *cma)
>  		init_cma_reserved_pageblock(pfn_to_page(base_pfn));
>  	} while (--i);
>
> +	mutex_init(&cma->lock);
>  	return 0;
>  }
>
> @@ -261,6 +263,13 @@ err:
>  	return ret;
>  }
>
> +static void clear_cma_bitmap(struct cma *cma, unsigned long pfn, int count)
> +{
> +	mutex_lock(&cma->lock);
> +	bitmap_clear(cma->bitmap, pfn - cma->base_pfn, count);
> +	mutex_unlock(&cma->lock);
> +}
> +
>  /**
>   * dma_alloc_from_contiguous() - allocate pages from contiguous area
>   * @dev:   Pointer to device for which the allocation is performed.
> @@ -294,30 +303,41 @@ struct page *dma_alloc_from_contiguous(struct device *dev, int count,
>
>  	mask = (1 << align) - 1;
>
> -	mutex_lock(&cma_mutex);
>
>  	for (;;) {
> +		mutex_lock(&cma->lock);
>  		pageno = bitmap_find_next_zero_area(cma->bitmap, cma->count,
>  						    start, count, mask);
> -		if (pageno >= cma->count)
> +		if (pageno >= cma->count) {
> +			mutex_unlock(&cma_mutex);
>  			break;
> +		}
> +		bitmap_set(cma->bitmap, pageno, count);
> +		/*
> +		 * It's safe to drop the lock here. We've marked this region for
> +		 * our exclusive use. If the migration fails we will take the
> +		 * lock again and unmark it.
> +		 */
> +		mutex_unlock(&cma->lock);
>
>  		pfn = cma->base_pfn + pageno;
> +		mutex_lock(&cma_mutex);
>  		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
> +		mutex_unlock(&cma_mutex);
>  		if (ret == 0) {
> -			bitmap_set(cma->bitmap, pageno, count);
>  			page = pfn_to_page(pfn);
>  			break;
>  		} else if (ret != -EBUSY) {
> +			clear_cma_bitmap(cma, pfn, count);
>  			break;
>  		}
> +		clear_cma_bitmap(cma, pfn, count);
>  		pr_debug("%s(): memory range at %p is busy, retrying\n",
>  			 __func__, pfn_to_page(pfn));
>  		/* try again with a bit different memory target */
>  		start = pageno + mask + 1;
>  	}
>
> -	mutex_unlock(&cma_mutex);
>  	pr_debug("%s(): returned %p\n", __func__, page);
>  	return page;
>  }
> @@ -350,10 +370,8 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
>
>  	VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
>
> -	mutex_lock(&cma_mutex);
> -	bitmap_clear(cma->bitmap, pfn - cma->base_pfn, count);
>  	free_contig_range(pfn, count);
> -	mutex_unlock(&cma_mutex);
> +	clear_cma_bitmap(cma, pfn, count);
>
>  	return true;
>  }
> --
> Code Aurora Forum chooses to take this file under the GPL v 2 license only.
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation

--
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +------ooO--(_)--Ooo--
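The locking split described in the commit message can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel code: pthread mutexes stand in for kernel mutexes, `struct region`, `region_alloc()`, `region_clear()`, `range_mutex`, `fake_alloc_range()` and `fail_below` are hypothetical names made up for the illustration, a byte array replaces the packed bitmap, alignment is ignored, and every failure of the stub is treated as "busy, retry".

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define REGION_PAGES 64	/* hypothetical region size for this model */

/* Userspace stand-in for struct cma: the per-region lock guards only "used". */
struct region {
	unsigned long base_pfn;		/* assumed non-zero in this model */
	unsigned long count;
	unsigned char used[REGION_PAGES];	/* one byte per page, not a real bitmap */
	pthread_mutex_t lock;
};

/* Stand-in for the global cma_mutex: serialises only the range allocation. */
static pthread_mutex_t range_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stub for alloc_contig_range(): fails below the pfn "fail_below". */
static unsigned long fail_below;

static int fake_alloc_range(unsigned long start, unsigned long end)
{
	(void)end;
	return start < fail_below ? -1 : 0;
}

static void region_init(struct region *r, unsigned long base_pfn)
{
	memset(r->used, 0, sizeof(r->used));
	r->base_pfn = base_pfn;
	r->count = REGION_PAGES;
	pthread_mutex_init(&r->lock, NULL);
}

/* Unmark n pages starting at pfn; takes only the per-region lock. */
static void region_clear(struct region *r, unsigned long pfn, unsigned long n)
{
	assert(pfn >= r->base_pfn && pfn - r->base_pfn + n <= r->count);
	pthread_mutex_lock(&r->lock);
	memset(&r->used[pfn - r->base_pfn], 0, n);
	pthread_mutex_unlock(&r->lock);
}

/* Returns the first pfn of the allocation, or 0 if no free run is left. */
static unsigned long region_alloc(struct region *r, unsigned long n)
{
	unsigned long start = 0;

	assert(n > 0 && n <= r->count);
	for (;;) {
		unsigned long i, run = 0, pageno = r->count;
		int ret;

		/* Step 1: find and reserve a free run under the region lock. */
		pthread_mutex_lock(&r->lock);
		for (i = start; i < r->count; i++) {
			run = r->used[i] ? 0 : run + 1;
			if (run == n) {
				pageno = i + 1 - n;
				break;
			}
		}
		if (pageno >= r->count) {
			/* Release the lock that was taken above. */
			pthread_mutex_unlock(&r->lock);
			return 0;
		}
		memset(&r->used[pageno], 1, n);
		pthread_mutex_unlock(&r->lock);

		/* Step 2: the range is reserved, so only the global lock is held here. */
		pthread_mutex_lock(&range_mutex);
		ret = fake_alloc_range(r->base_pfn + pageno,
				       r->base_pfn + pageno + n);
		pthread_mutex_unlock(&range_mutex);
		if (ret == 0)
			return r->base_pfn + pageno;

		/* Step 3: roll back the reservation and try a later offset. */
		region_clear(r, r->base_pfn + pageno, n);
		start = pageno + 1;
	}
}
```

In this model the two locks are never held at the same time: the region lock is dropped before the global one is taken, and the release path takes only the region lock, so no lock-order dependency between the two can form.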