From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: regression from c/s 22071:c5aed2e049bc (ept: Put locks around ept_get_entry) ?
Date: Thu, 16 Dec 2010 16:59:15 +0000
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich
Cc: George Dunlap, Christoph Egger, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 16/12/2010 16:42, "Keir Fraser" wrote:

> On 16/12/2010 16:22, "Jan Beulich" wrote:
>
>>> Probably a similar assumption to what we make in x86_64's pte_write_atomic()
>>> implementation? Possibly pte_{read,write}_atomic() should cast the pte
>>> pointer to volatile, and the EPT reads/writes should be similarly wrapped in
>>> macros which do casting. I'm sure we make various other assumptions about
>>> read/write atomicity in Xen, but aiming to fix them as we find them is maybe
>>> not a bad idea.
>>>
>>> If that sounds good, I can propose a patch?
>>
>> Oh, yes. I didn't even consider there might be more places.
>>
>> What I'm surprised about is you suggesting to take the "volatile"
>> route instead of the barrier() one...
>
> I don't think barrier() would solve the problem at hand. The idiom we are
> dealing with is something like:
>     x = *px;
>     [barrier()]
>     <code that messes with x>
>     [barrier()]
>     *px = x;
>
> I don't see that adding the bracketed barrier() calls above ensures that the
> accesses to *px are done in a single atomic instruction. There's nothing
> touching non-local variables between the two barrier()s, so for example the
> code that messes with x could be moved after the second barrier() and then
> the compiler could choose to mess with *px directly if it wishes.

Or in George's EPT changes, I think the issue was getting an atomic snapshot
of some P2M flags (populate-on-demand vs. valid vs. ...). Again, barrier()
would not help, since the code that messes with x could be moved before the
first barrier(), and again the compiler can then do a number of direct reads
on *px. So, again, the right fix is to make the memory read properly atomic
via use of volatile, and preferably a snippet of inline asm to guarantee we
emit the desired single mov instruction.

 -- Keir

> The issue is not one of serialisation or code ordering. It is one of
> memory-access atomicity. Thus it seems to me that volatile is the correct
> approach therefore. Perhaps *(volatile type *)px = x or, really, even better
> I should define some {read,write}_atomic{8,16,32,64} accessor functions
> which use inline asm to absolutely definitely emit a single atomic 'mov'
> instruction.
>
> Make sense?
>
>  -- Keir
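
For illustration, the accessors described above might look roughly like the
following. This is only a sketch, assuming GCC-style inline asm on x86 (with a
plain volatile access as the fallback elsewhere). read_atomic32() and
write_atomic32() are one expansion of the {read,write}_atomic{8,16,32,64}
naming above, not a committed patch; the FLAG_* bits and
entry_is_valid_not_pod() are hypothetical and are not Xen's actual P2M
encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor: read 32 bits from *p with exactly one load. */
static inline uint32_t read_atomic32(const volatile uint32_t *p)
{
    /* A single mov is only atomic if the access is naturally aligned. */
    assert(((uintptr_t)p & 3u) == 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t v;
    /* One 32-bit mov; the compiler cannot split it or re-read *p later. */
    __asm__ __volatile__("movl %1, %0" : "=r"(v) : "m"(*p));
    return v;
#else
    /* Fallback (assumption): rely on the volatile access being done once. */
    return *p;
#endif
}

/* Hypothetical accessor: write 32 bits to *p with exactly one store. */
static inline void write_atomic32(volatile uint32_t *p, uint32_t v)
{
    assert(((uintptr_t)p & 3u) == 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ __volatile__("movl %1, %0" : "=m"(*p) : "r"(v));
#else
    *p = v;
#endif
}

/* Illustrative flag bits only; not the real P2M entry layout. */
#define FLAG_VALID 0x1u
#define FLAG_POD   0x2u

/* Take one snapshot of the entry, then test flags on the local copy only. */
static int entry_is_valid_not_pod(const volatile uint32_t *entry)
{
    uint32_t e = read_atomic32(entry);
    return (e & FLAG_VALID) && !(e & FLAG_POD);
}
```

The point of the last function is that every flag test works on the local
snapshot e, so the compiler has no licence to go back and read *entry several
times, which is what barrier() alone would not prevent.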