From: Konrad Rzeszutek Wilk
Subject: Re: race condition in xen-gntdev
Date: Wed, 22 Jul 2015 09:58:50 -0400
Message-ID: <20150722135850.GE18628@l.oracle.com>
References: <20150527234508.GA14838@mail-itl> <20150617194211.GB11083@mail-itl> <20150622174626.GH5408@l.oracle.com> <20150622181335.GJ11083@mail-itl> <20150622183713.GD9631@l.oracle.com> <55885E88.2040805@tycho.nsa.gov> <20150626012824.GD967@mail-itl> <20150629143926.GA24629@l.oracle.com> <20150629145010.GT982@mail-itl> <20150722032155.GC5250@mail-itl>
In-Reply-To: <20150722032155.GC5250@mail-itl>
To: Marek Marczykowski-Górecki
Cc: Boris Ostrovsky, Daniel De Graaf, David Vrabel, xen-devel
List-Id: xen-devel@lists.xenproject.org

On Wed, Jul 22, 2015 at 05:21:55AM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Jun 29, 2015 at 04:50:10PM +0200, Marek Marczykowski-Górecki wrote:
> > On Mon, Jun 29, 2015 at 10:39:26AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Jun 26, 2015 at 03:28:24AM +0200, Marek Marczykowski-Górecki wrote:
> > > > On Mon, Jun 22, 2015 at 03:14:16PM -0400, Daniel De Graaf wrote:
> > > > > The reason that gntdev_release didn't have a lock is because there are not
> > > > > supposed to be any references to the areas pointed to by priv->maps when it
> > > > > is called.  However, since the MMU notifier has not yet been unregistered,
> > > > > it is apparently possible to race here; the comment on mmu_notifier_unregister
> > > > > seems to confirm this as a possibility (as do the backtraces).
> > > > >
> > > > > I think adding the lock will be sufficient.
> > > >
> > > > Ok, so here is the patch:
> > >
> > > Awesome!
> > >
> > > Since you are the one who has been seeing this particular fault - any chance
> > > you could give it some soak time? If I recall your emails correctly it takes
> > > about a week or so before you saw the crash?
> >
> > Sure. I've already installed patched kernel, will report back results
> > later.
>
> Ok, after few weeks I can surely confirm - this fixes the issue.

Fantastic! I think David has already committed it in.

Thanks again for doing such a detailed analysis of the issue.
>
> > > > -----------8<------------
> > > >
> > > > From b876e14888bdafa112c3265e6420543fa74aa709 Mon Sep 17 00:00:00 2001
> > > > From: Marek Marczykowski-Górecki
> > > > Date: Fri, 26 Jun 2015 02:16:49 +0200
> > > > Subject: [PATCH] xen/grant: fix race condition in gntdev_release
> > > >
> > > > While gntdev_release is called, MMU notifier is still registered and
> > > > can traverse priv->maps list even if no pages are mapped (which is the
> > > > case - gntdev_release is called after all). But gntdev_release will
> > > > clear that list, so make sure that only one of those things happens at
> > > > the same time.
> > > >
> > > > Signed-off-by: Marek Marczykowski-Górecki
> > > > ---
> > > >  drivers/xen/gntdev.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> > > > index 8927485..4bd23bb 100644
> > > > --- a/drivers/xen/gntdev.c
> > > > +++ b/drivers/xen/gntdev.c
> > > > @@ -568,12 +568,14 @@ static int gntdev_release(struct inode *inode, struct file *flip)
> > > >
> > > >  	pr_debug("priv %p\n", priv);
> > > >
> > > > +	mutex_lock(&priv->lock);
> > > >  	while (!list_empty(&priv->maps)) {
> > > >  		map = list_entry(priv->maps.next, struct grant_map, next);
> > > >  		list_del(&map->next);
> > > >  		gntdev_put_map(NULL /* already removed */, map);
> > > >  	}
> > > >  	WARN_ON(!list_empty(&priv->freeable_maps));
> > > > +	mutex_unlock(&priv->lock);
> > > >
> > > >  	if (use_ptemod)
> > > >  		mmu_notifier_unregister(&priv->mn, priv->mm);
> > > > --
> > > > 1.9.3
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Marek Marczykowski-Górecki
> > > > Invisible Things Lab
> > > > A: Because it messes up the order in which people normally read text.
> > > > Q: Why is top-posting such a bad thing?
> > >
> > >
> >
>
>
>
> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
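
For anyone reading this thread in the archive later, the locking pattern from the patch can be sketched in userspace roughly as below. This is only an analogy, not the gntdev code: a pthread mutex stands in for priv->lock, a hand-rolled singly linked list stands in for priv->maps, and every demo_* name is hypothetical. The point it tries to show is that the release path and the notifier-style walker both take the same lock, so the walker can never follow a pointer into an entry that the release path is freeing.

```c
/* Userspace analogy only; all demo_* names are made up for illustration. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct demo_map {
	int index;
	struct demo_map *next;
};

struct demo_priv {
	pthread_mutex_t lock;   /* stands in for priv->lock */
	struct demo_map *maps;  /* stands in for priv->maps */
};

static void demo_init(struct demo_priv *priv)
{
	pthread_mutex_init(&priv->lock, NULL);
	priv->maps = NULL;
}

/* Add one entry at the head of the list, under the lock. */
static int demo_add_map(struct demo_priv *priv, int index)
{
	struct demo_map *map = malloc(sizeof(*map));

	if (!map)
		return -1;
	map->index = index;

	pthread_mutex_lock(&priv->lock);
	map->next = priv->maps;
	priv->maps = map;
	pthread_mutex_unlock(&priv->lock);
	return 0;
}

/* Stand-in for a notifier callback: walks the list under the lock
 * and returns how many entries it saw. */
static int demo_notifier_walk(struct demo_priv *priv)
{
	int seen = 0;

	pthread_mutex_lock(&priv->lock);
	for (struct demo_map *map = priv->maps; map; map = map->next)
		seen++;
	pthread_mutex_unlock(&priv->lock);
	return seen;
}

/* Stand-in for the release path: drains the list under the same lock,
 * so a concurrent walker sees either the full list or an empty one. */
static int demo_release(struct demo_priv *priv)
{
	int freed = 0;

	pthread_mutex_lock(&priv->lock);
	while (priv->maps) {
		struct demo_map *map = priv->maps;

		priv->maps = map->next;
		free(map);
		freed++;
	}
	assert(priv->maps == NULL);
	pthread_mutex_unlock(&priv->lock);
	return freed;
}
```

Without the lock in demo_release(), a walker running on another thread could read map->next from an entry that has just been freed, which is roughly the shape of the crash discussed above. Build with -pthread.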