From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: Re: [PATCH] Require that xenstored writes to a domain complete in a single chunk
Date: Mon, 26 Feb 2007 18:14:05 +0000
References: <87slcsx11y.fsf@apfelstrudel.hh.sledj.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
In-Reply-To: <87slcsx11y.fsf@apfelstrudel.hh.sledj.net>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: David Edmondson, xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 26/2/07 17:48, "David Edmondson" wrote:

>> For older guest compatibility perhaps we can take a variant of your
>> patch that only waits for enough space if the entire message fits in
>> the ring in one go. This would be 'best-effort' at compatibility
>> while not precluding use of larger messages in general.
>
> Is the implication that you think that this problem could occur with a
> Linux guest (I've never seen it, though have tested much less)?

The Linux suspend thread does not sync with the xenbus reader thread at
all. I'm not sure why we've never seen any problems on Linux, but I guess
it's rare that a message cannot be sent all in one go. Especially a watch
event, as those are usually fairly short.

Oh..... Wait a minute! On Linux we explicitly tear down watches before
suspend. Or at least we used to, before a patch of a few weeks ago (c/s
13519, Jan 19th 2007). This would save us because no watches registered ->
no watches fire. Do you not have anything similar to this in your xenbus
code (presumably you took the dual-licensed files as a basis for the Solaris
implementation)?

So Linux now needs fixing too, but the bug window here has been only a few
weeks and no supported kernel releases include the bug. This being the case
we should probably just fix this issue in current guest kernels.

-- Keir
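
For reference, the 'best-effort' variant quoted above (wait for space only when the whole message can fit in the ring in one go) could be sketched roughly as below. This is an illustration only, not the xenstored code or the patch under discussion: the ring layout, the 1024-byte RING_SIZE, and the names struct ring, ring_free, must_wait and ring_write are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 1024u   /* assumed ring size; must be a power of two */

/* Hypothetical ring: free-running indices, masked on access. */
struct ring {
    uint32_t cons;              /* advanced by the reader (guest) */
    uint32_t prod;              /* advanced by the writer (daemon) */
    char     buf[RING_SIZE];
};

/* Bytes currently free in the ring. */
static uint32_t ring_free(const struct ring *r)
{
    assert(r->prod - r->cons <= RING_SIZE);
    return RING_SIZE - (r->prod - r->cons);
}

/*
 * Best-effort rule: defer the write only if the message could fit in the
 * ring as a whole but does not fit right now. A message larger than the
 * ring can never go out in one chunk, so it is not held back.
 */
static bool must_wait(const struct ring *r, size_t len)
{
    if (len > RING_SIZE)
        return false;           /* oversize: fall back to chunked writes */
    return ring_free(r) < len;
}

/* Write as much as the rule allows; returns the number of bytes copied. */
static size_t ring_write(struct ring *r, const char *data, size_t len)
{
    if (must_wait(r, len))
        return 0;               /* caller retries once the reader drains */

    size_t avail = ring_free(r);
    size_t n = len < avail ? len : avail;

    for (size_t i = 0; i < n; i++)
        r->buf[(r->prod + i) & (RING_SIZE - 1)] = data[i];
    r->prod += (uint32_t)n;
    return n;
}
```

With this rule a short message such as a watch event is never split across two ring writes, while a message larger than the ring still goes out in pieces as before.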