From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jonathan Davies Subject: Re: xenfs: race condition on xenstore watch Date: Fri, 31 May 2013 18:07:37 +0100 Message-ID: <51A8D8D9.7020106@citrix.com> References: <5193BCFA.3030604@citrix.com> <20130528125504.GA28949@phenom.dumpdata.com> <51A8C59F02000078000DA131@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <51A8C59F02000078000DA131@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel@lists.xen.org, Konrad Rzeszutek Wilk List-Id: xen-devel@lists.xenproject.org On 31/05/13 14:45, Jan Beulich wrote: >>>> On 28.05.13 at 14:55, Konrad Rzeszutek Wilk wrote: >> On Wed, May 15, 2013 at 05:51:06PM +0100, Jonathan Davies wrote: >>> Normally, the userspace process gets an "OK" from xenfs and then the >>> watch fires immediately after. Occasionally, this happens the other way >>> around: the watch fires before the driver sends "OK", which confuses >>> the xenstore-watch client. It seems to me that the client is within its >>> rights to expect the "OK" first. > > Now that I look at this another time, I wonder how this behavior > can be guaranteed even with your patch: xenbus_write_watch() > sends the reply by calling queue_reply() followed by wake_up(), > so the reply getting delivered and the watch firing appear to be > inherently asynchronous anyway, and your patch just reduces > the race window. Am I overlooking something here? There's no call to wake_up() in xenbus_write_watch(), as far as I can see... The key thing is to guarantee that xenbus_write_watch()'s call to queue_reply() (for the "OK" message) executes before the watch_fired() callback -- registered by the call to register_xenbus_watch() -- could be called-back. Given that watch_fired() also takes the reply_mutex before its own calls to queue_reply() and wake_up(), this behaviour can be guaranteed by taking the reply_mutex before registering the callback. >>> Take the (non-reentrant) reply_mutex before calling >>> register_xenbus_watch to prevent the watch_fired callback from writing >>> anything until the "OK" has been sent. >> >> Should that perhaps be done inside the msg_type == XS_WATCH code (with a >> bool that would determine whether an reply_mutex has been taken?) >> >> As in, is there no need to take this mutex if msg_type != XS_WATCH? > > As said in an earlier reply, I also think that it would be preferable > to shrink the locked region as much as possible. v2 of the patch will have a much tighter locked region. I'll post that once I've tested it. Thanks, Jonathan