From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jonathan Davies Subject: Re: xenfs: race condition on xenstore watch Date: Fri, 31 May 2013 18:07:33 +0100 Message-ID: <51A8D8D5.8090308@citrix.com> References: <5193BCFA.3030604@citrix.com> <20130528125504.GA28949@phenom.dumpdata.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20130528125504.GA28949@phenom.dumpdata.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk Cc: Jan Beulich , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 28/05/13 13:55, Konrad Rzeszutek Wilk wrote: > On Wed, May 15, 2013 at 05:51:06PM +0100, Jonathan Davies wrote: >> Dear xen-devel, >> >> There's a race condition in xenfs (xenstore driver) that causes >> userspace utility xenstore-watch to crash. >> >> Normally, the userspace process gets an "OK" from xenfs and then the >> watch fires immediately after. Occasionally, this happens the other way >> around: the watch fires before the driver sends "OK", which confuses >> the xenstore-watch client. It seems to me that the client is within its >> rights to expect the "OK" first. >> >> Here's what is happening: >> >> The userspace process xenstore-watch writes to /proc/xen/xenbus with >> msg_type==XS_WATCH. This is handled by xenbus_write_watch which calls >> register_xenbus_watch with watch_fired as a callback *before* acquiring >> the reply_mutex and sending the synthesised "OK" reply. >> >> This gives a fast xenstore the opportunity to cause the watch_fired to >> run (and briefly grab the reply_mutex for itself) before the fake "OK" >> message is sent. >> >> Below, I've included a putative patch for pre-3.3 xenfs that fixes this >> problem. (It looks like the patch would also apply cleanly to >> 3.3-onwards xenbus_dev_frontend.c, but I haven't tried.) Any comments >> about whether this is a reasonable approach? > > It can't apply cleanly as the file moved :-( Well, it applies cleanly if you rename the affected file to drivers/xen/xenbus/xenbus_dev_frontend.c and can tolerate a 15-line offset... :-) >> A cursory glance at drivers/xen/xenbus/xenbus_dev.c suggests that it >> suffers from the same problem. Although I haven't haven't tested this, >> I'd expect that it requires a very similar solution. >> >> Jonathan >> >> >> Take the (non-reentrant) reply_mutex before calling >> register_xenbus_watch to prevent the watch_fired callback from writing >> anything until the "OK" has been sent. > > Should that perhaps be done inside the msg_type == XS_WATCH code (with a > bool that would determine whether an reply_mutex has been taken?) > > As in, is there no need to take this mutex if msg_type != XS_WATCH? When msg_type != XS_WATCH, we only need to take the mutex briefly when sending the "OK". There's no issue with XS_UNWATCH because there's never anything further to be written. v2 of the patch won't hold the lock while doing the work for XS_UNWATCH. >> Signed-off-by: Jonathan Davies >> >> diff -r 5ab1b4af1faf drivers/xen/xenfs/xenbus.c >> --- a/drivers/xen/xenfs/xenbus.c Thu Dec 03 06:00:06 2009 +0000 >> +++ b/drivers/xen/xenfs/xenbus.c Wed May 15 17:24:47 2013 +0100 >> @@ -359,6 +359,8 @@ static int xenbus_write_watch(unsigned m >> } >> token++; >> >> + mutex_lock(&u->reply_mutex); >> + > > Have you tested this with the kernel compiled with DEBUG_MUTEX and DEBUG_PROVE_LOCKING > to make sure there are no mutex/spinlock issues? No. I will give that a try. >> if (msg_type == XS_WATCH) { >> watch = alloc_watch_adapter(path, token); >> if (watch == NULL) { >> @@ -401,12 +403,11 @@ static int xenbus_write_watch(unsigned m >> "OK" >> }; >> >> - mutex_lock(&u->reply_mutex); >> rc = queue_reply(&u->read_buffers, &reply, sizeof(reply)); >> - mutex_unlock(&u->reply_mutex); >> } >> >> out: >> + mutex_unlock(&u->reply_mutex); >> return rc; >> } Thanks for the feedback, Jonathan