From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: [PATCH 1/3] xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive
Date: Wed, 17 Aug 2011 14:30:12 +0100
Message-ID:
References: <20110817125127.GA3163@aepfle.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20110817125127.GA3163@aepfle.de>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Olaf Hering , Ian Campbell
Cc: Jeremy Fitzhardinge , "xen-devel@lists.xensource.com" , "linux-kernel@vger.kernel.org" , Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On 17/08/2011 13:51, "Olaf Hering" wrote:

> On Tue, Aug 16, Ian Campbell wrote:
>
>> On Tue, 2011-08-16 at 14:16 +0100, Olaf Hering wrote:
>>> During repeated kexec boots xenwatch_thread() can crash because
>>> xenbus_watch->callback is cleared by xenbus_watch_path() if a node/token
>>> combo for a new watch happens to match an already registered watch from
>>> an old kernel. In this case xs_watch returns -EEXISTS, then
>>> register_xenbus_watch() does not remove the to-be-registered watch from
>>> the list of active watches but returns the -EEXISTS to the caller
>>> anyway.
>>
>> Isn't this behaviour the root cause of the issue (which should be fixed)
>> rather than papering over it during watch processing. IOW shouldn't
>> register_xenbus_watch cleanup after itself if xs_watch fails.
>
> Keir, the EEXISTS case in register_xenbus_watch() was added by you 6
> years ago. Do you happen to know what it tried to solve, and do these
> conditions still apply today? Perhaps the EEXISTS can be removed now.
>
> http://xenbits.xen.org/hg/xen-unstable.hg/diff/8016551fde98/linux-2.6-xen-sparse/drivers/xen/xenbus/xenbus_xs.c

Bad me. Either remove the EEXIST check, or convert EEXIST to return code 0
in register_xenbus_watch(). You could do either, since I'm sure I added the
EEXIST check only as an attempt to theoretically robustify that function,
and looks like I got it wrong.

 K.

> Olaf
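
For illustration only, here is a rough userspace sketch of the second
option above (convert EEXIST to return code 0). It is not the kernel
code: watch_sketch, mock_xs_watch() and register_watch_sketch() are
hypothetical stand-ins, and the "list of active watches" is reduced to a
single on_list flag. The mock simply assumes that the store still holds a
watch with the same node/token from the previous kernel.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified watch: on_list models membership in the
 * list of active watches. */
struct watch_sketch {
	const char *node;
	int on_list;
};

/* Hypothetical stand-in for xs_watch(): always reports that a watch
 * with this node/token is already registered, as it might be after a
 * kexec boot. */
static int mock_xs_watch(const char *node, const char *token)
{
	(void)node;
	(void)token;
	return -EEXIST;
}

/* Sketch of "convert EEXIST to return code 0": an already existing
 * watch counts as success, so the caller never sees an error while
 * the watch is still on the active list. Any other error takes the
 * watch off the list again before it is returned. */
static int register_watch_sketch(struct watch_sketch *w, const char *token)
{
	int err;

	w->on_list = 1;
	err = mock_xs_watch(w->node, token);
	if (err == -EEXIST)
		err = 0;
	if (err)
		w->on_list = 0;
	return err;
}
```

With the first option (dropping the EEXIST check entirely) the same
function would lose the "err = 0" conversion, so -EEXIST would take the
cleanup path like any other error. Either way the return value and the
list state stay consistent with each other.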