xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Davies <jonathan.davies@citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>, xen-devel@lists.xen.org
Subject: Re: xenfs: race condition on xenstore watch
Date: Fri, 31 May 2013 18:07:33 +0100	[thread overview]
Message-ID: <51A8D8D5.8090308@citrix.com> (raw)
In-Reply-To: <20130528125504.GA28949@phenom.dumpdata.com>

On 28/05/13 13:55, Konrad Rzeszutek Wilk wrote:
> On Wed, May 15, 2013 at 05:51:06PM +0100, Jonathan Davies wrote:
>> Dear xen-devel,
>>
>> There's a race condition in xenfs (xenstore driver) that causes
>> userspace utility xenstore-watch to crash.
>>
>> Normally, the userspace process gets an "OK" from xenfs and then the
>> watch fires immediately after. Occasionally, this happens the other way
>> around: the watch fires before the driver sends "OK", which confuses
>> the xenstore-watch client. It seems to me that the client is within its
>> rights to expect the "OK" first.
>>
>> Here's what is happening:
>>
>> The userspace process xenstore-watch writes to /proc/xen/xenbus with
>> msg_type==XS_WATCH. This is handled by xenbus_write_watch which calls
>> register_xenbus_watch with watch_fired as a callback *before* acquiring
>> the reply_mutex and sending the synthesised "OK" reply.
>>
>> This gives a fast xenstore the opportunity to cause the watch_fired to
>> run (and briefly grab the reply_mutex for itself) before the fake "OK"
>> message is sent.
>>
>> Below, I've included a putative patch for pre-3.3 xenfs that fixes this
>> problem. (It looks like the patch would also apply cleanly to
>> 3.3-onwards xenbus_dev_frontend.c, but I haven't tried.) Any comments
>> about whether this is a reasonable approach?
>
> It can't apply cleanly as the file moved :-(

Well, it applies cleanly if you rename the affected file to 
drivers/xen/xenbus/xenbus_dev_frontend.c and can tolerate a 15-line 
offset... :-)

>> A cursory glance at drivers/xen/xenbus/xenbus_dev.c suggests that it
>> suffers from the same problem. Although I haven't haven't tested this,
>> I'd expect that it requires a very similar solution.
>>
>> Jonathan
>>
>>
>> Take the (non-reentrant) reply_mutex before calling
>> register_xenbus_watch to prevent the watch_fired callback from writing
>> anything until the "OK" has been sent.
>
> Should that perhaps be done inside the msg_type == XS_WATCH code (with a
> bool that would determine whether an reply_mutex has been taken?)
>
> As in, is there no need to take this mutex if msg_type != XS_WATCH?

When msg_type != XS_WATCH, we only need to take the mutex briefly when 
sending the "OK". There's no issue with XS_UNWATCH because there's never 
anything further to be written. v2 of the patch won't hold the lock while 
doing the work for XS_UNWATCH.

>> Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
>>
>> diff -r 5ab1b4af1faf drivers/xen/xenfs/xenbus.c
>> --- a/drivers/xen/xenfs/xenbus.c	Thu Dec 03 06:00:06 2009 +0000
>> +++ b/drivers/xen/xenfs/xenbus.c	Wed May 15 17:24:47 2013 +0100
>> @@ -359,6 +359,8 @@ static int xenbus_write_watch(unsigned m
>>   	}
>>   	token++;
>>
>> +	mutex_lock(&u->reply_mutex);
>> +
>
> Have you tested this with the kernel compiled with DEBUG_MUTEX and DEBUG_PROVE_LOCKING
> to make sure there are no mutex/spinlock issues?

No. I will give that a try.

>>   	if (msg_type == XS_WATCH) {
>>   		watch = alloc_watch_adapter(path, token);
>>   		if (watch == NULL) {
>> @@ -401,12 +403,11 @@ static int xenbus_write_watch(unsigned m
>>   			"OK"
>>   		};
>>
>> -		mutex_lock(&u->reply_mutex);
>>   		rc = queue_reply(&u->read_buffers, &reply, sizeof(reply));
>> -		mutex_unlock(&u->reply_mutex);
>>   	}
>>
>>   out:
>> +	mutex_unlock(&u->reply_mutex);
>>   	return rc;
>>   }

Thanks for the feedback,
Jonathan

      parent reply	other threads:[~2013-05-31 17:07 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-15 16:51 xenfs: race condition on xenstore watch Jonathan Davies
2013-05-16  8:26 ` Jan Beulich
2013-05-28 12:55 ` Konrad Rzeszutek Wilk
2013-05-31 13:45   ` Jan Beulich
2013-05-31 17:07     ` Jonathan Davies
2013-06-03  7:57       ` Jan Beulich
2013-05-31 17:07   ` Jonathan Davies [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51A8D8D5.8090308@citrix.com \
    --to=jonathan.davies@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=konrad.wilk@oracle.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).