* issues with movnti emulation
@ 2008-11-20 16:38 Jan Beulich
  2008-11-20 17:13 ` Keir Fraser
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2008-11-20 16:38 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

We've got reports of that change causing HVM data corruption issues. While
I can't see what's wrong with the patch, I'd suggest at least reverting it from
the 3.3 tree (which is what our code is based upon) for the time being.

Jan

* Re: issues with movnti emulation
  2008-11-20 16:38 issues with movnti emulation Jan Beulich
@ 2008-11-20 17:13 ` Keir Fraser
  2008-11-20 17:16   ` Tim Deegan
                     ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Keir Fraser @ 2008-11-20 17:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
should be under twobyte_special_insn rather than twobyte_insn, right? The
two separate paths got merged into one in xen-unstable.

Of course this data corruption ought only to happen in cases where we'd
previously have failed an mmio emulation (and hence probably killed the
guest kernel?).

 -- Keir

On 20/11/08 16:38, "Jan Beulich" <jbeulich@novell.com> wrote:

> We've got reports of that change causing HVM data corruption issues. While
> I can't see what's wrong with the patch, I'd suggest at least reverting it
> from the 3.3 tree (which is what our code is based upon) for the time being.
> 
> Jan
> 

* Re: Re: issues with movnti emulation
  2008-11-20 17:13 ` Keir Fraser
@ 2008-11-20 17:16   ` Tim Deegan
  2008-11-20 17:43     ` Keir Fraser
  2008-11-20 18:42   ` Kevin Wolf
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Tim Deegan @ 2008-11-20 17:16 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

At 17:13 +0000 on 20 Nov (1227201181), Keir Fraser wrote:
> I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
> should be under twobyte_special_insn rather than twobyte_insn, right? The
> two separate paths got merged into one in xen-unstable.
> 
> Of course this data corruption ought only to happen in cases where we'd
> previously have failed an mmio emulation (and hence probably killed the
> guest kernel?).

A more likely culprit is that some OSes use movnti to zero pages that
used to be pagetables; when we couldn't emulate it we just (correctly)
unshadowed those pages.
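
To make the fallback described above concrete, here is a minimal sketch in C. It is a toy model and not Xen's shadow code: the names shadow_page, try_emulate_write and handle_shadowed_write are hypothetical, and the can_emulate flag simply stands in for "the emulator supports this instruction". It only shows that a failed emulation ends with the page being unshadowed, while a successful one leaves it shadowed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is hypothetical, not a Xen identifier. */
enum emul_result { EMUL_OKAY, EMUL_UNHANDLEABLE };

struct shadow_page {
    bool shadowed;            /* true while the page is treated as a pagetable */
    unsigned emulated_writes; /* writes completed by the emulator */
};

/* Stand-in for the emulator: can_emulate models opcode support. */
enum emul_result try_emulate_write(bool can_emulate)
{
    return can_emulate ? EMUL_OKAY : EMUL_UNHANDLEABLE;
}

/*
 * Handle a guest write that faulted on a shadowed page.
 * Returns true if the write was emulated, false if the page was
 * unshadowed so that the guest can retry the write natively.
 */
bool handle_shadowed_write(struct shadow_page *pg, bool can_emulate)
{
    if (try_emulate_write(can_emulate) == EMUL_OKAY) {
        pg->emulated_writes++;
        return true;
    }
    pg->shadowed = false; /* emulation failed: stop shadowing this page */
    return false;
}
```

In this model, teaching the emulator a new instruction moves a write from the second branch to the first, so any bug in the new handler shows up exactly on pages that were previously just unshadowed.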

Cheers,

Tim.

> 
>  -- Keir
> 
> On 20/11/08 16:38, "Jan Beulich" <jbeulich@novell.com> wrote:
> 
> > We've got reports of that change causing HVM data corruption issues. While
> > I can't see what's wrong with the patch, I'd suggest at least reverting it
> > from the 3.3 tree (which is what our code is based upon) for the time being.
> > 
> > Jan
> > 

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

* Re: Re: issues with movnti emulation
  2008-11-20 17:16   ` Tim Deegan
@ 2008-11-20 17:43     ` Keir Fraser
  2008-11-21 16:01       ` Tim Deegan
  0 siblings, 1 reply; 11+ messages in thread
From: Keir Fraser @ 2008-11-20 17:43 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

On 20/11/08 17:16, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

> At 17:13 +0000 on 20 Nov (1227201181), Keir Fraser wrote:
>> I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
>> should be under twobyte_special_insn rather than twobyte_insn, right? The
>> two separate paths got merged into one in xen-unstable.
>> 
>> Of course this data corruption ought only to happen in cases where we'd
>> previously have failed an mmio emulation (and hence probably killed the
>> guest kernel?).
> 
> A more likely culprit is that some OSes use movnti to zero pages that
> used to be pagetables; when we couldn't emulate it we just (correctly)
> unshadowed those pages.

Yes, you're probably right. I wonder if we are relying on emulation failures
to inform unshadowing at all often? We might have to revisit constraining
x86_emulate() when called by shadow code, do you think?

 -- Keir

* Re: Re: issues with movnti emulation
  2008-11-20 17:13 ` Keir Fraser
  2008-11-20 17:16   ` Tim Deegan
@ 2008-11-20 18:42   ` Kevin Wolf
  2008-11-21 11:04   ` Jan Beulich
  2008-11-25 14:01   ` Gianluca Guida
  3 siblings, 0 replies; 11+ messages in thread
From: Kevin Wolf @ 2008-11-20 18:42 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser wrote:
> I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
> should be under twobyte_special_insn rather than twobyte_insn, right? The
> two separate paths got merged into one in xen-unstable.

The other way round, but yes, this seems to have caused the corruption.

Kevin

* Re: issues with movnti emulation
  2008-11-20 17:13 ` Keir Fraser
  2008-11-20 17:16   ` Tim Deegan
  2008-11-20 18:42   ` Kevin Wolf
@ 2008-11-21 11:04   ` Jan Beulich
  2008-11-21 11:19     ` Keir Fraser
  2008-11-25 14:01   ` Gianluca Guida
  3 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2008-11-21 11:04 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir.fraser@eu.citrix.com> 20.11.08 18:13 >>>
>I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
>should be under twobyte_special_insn rather than twobyte_insn, right? The
>two separate paths got merged into one in xen-unstable.

Oh, indeed - if you mean it the other way around.

>Of course this data corruption ought only to happen in cases where we'd
>previously have failed an mmio emulation (and hence probably killed the
>guest kernel?).

Yes, we previously saw emulation failure messages. The guest wasn't
killed because of that, however. I have to admit it's been a while since
I last looked at mmio emulation - is it eagerly trying to emulate successive
instructions, and returning to native execution when emulation fails? If
not, I could neither explain why only some data got corrupted here, nor
why the guest didn't get killed.

Jan

* Re: issues with movnti emulation
  2008-11-21 11:04   ` Jan Beulich
@ 2008-11-21 11:19     ` Keir Fraser
  0 siblings, 0 replies; 11+ messages in thread
From: Keir Fraser @ 2008-11-21 11:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 21/11/08 11:04, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Of course this data corruption ought only to happen in cases where we'd
>> previously have failed an mmio emulation (and hence probably killed the
>> guest kernel?).
> 
> Yes, we previously saw emulation failure messages. The guest wasn't
> killed because of that, however. I have to admit it's been a while since
> I last looked at mmio emulation - is it eagerly trying to emulate successive
> instructions, and returning to native execution when emulation fails? If
> not, I could neither explain why only some data got corrupted here, nor
> why the guest didn't get killed.

TimD had the correct explanation -- page-table pages getting recycled via
Windows' page scrubber.

 -- Keir

* Re: Re: issues with movnti emulation
  2008-11-20 17:43     ` Keir Fraser
@ 2008-11-21 16:01       ` Tim Deegan
  2008-11-24 16:18         ` Gianluca Guida
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Deegan @ 2008-11-21 16:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

At 17:43 +0000 on 20 Nov (1227202988), Keir Fraser wrote:
> Yes, you're probably right. I wonder if we are relying on emulation failures
> to inform unshadowing at all often? We might have to revisit constraining
> x86_emulate() when called by shadow code, do you think?

Yes, I think it would probably be worth looking at that.

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

* Re: Re: issues with movnti emulation
  2008-11-24 16:18         ` Gianluca Guida
@ 2008-11-24 15:34           ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2008-11-24 15:34 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel, Tim Deegan, Keir Fraser

>>> Gianluca Guida <gianluca.guida@eu.citrix.com> 24.11.08 17:18 >>>
>Tim Deegan wrote:
>> At 17:43 +0000 on 20 Nov (1227202988), Keir Fraser wrote:
>>> Yes, you're probably right. I wonder if we are relying on emulation failures
>>> to inform unshadowing at all often? We might have to revisit constraining
>>> x86_emulate() when called by shadow code, do you think?
>> 
>> Yes, I think it would probably be worth looking at that.
>
>In what kind of guest/workloads were we experiencing this corruption?

SLE11 installation.

Jan

* Re: Re: issues with movnti emulation
  2008-11-21 16:01       ` Tim Deegan
@ 2008-11-24 16:18         ` Gianluca Guida
  2008-11-24 15:34           ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Gianluca Guida @ 2008-11-24 16:18 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Keir Fraser

Tim Deegan wrote:
> At 17:43 +0000 on 20 Nov (1227202988), Keir Fraser wrote:
>> Yes, you're probably right. I wonder if we are relying on emulation failures
>> to inform unshadowing at all often? We might have to revisit constraining
>> x86_emulate() when called by shadow code, do you think?
> 
> Yes, I think it would probably be worth looking at that.

In what kind of guest/workloads were we experiencing this corruption?

Thanks,
Gianluca

* Re: Re: issues with movnti emulation
  2008-11-20 17:13 ` Keir Fraser
                     ` (2 preceding siblings ...)
  2008-11-21 11:04   ` Jan Beulich
@ 2008-11-25 14:01   ` Gianluca Guida
  3 siblings, 0 replies; 11+ messages in thread
From: Gianluca Guida @ 2008-11-25 14:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser wrote:
> I think the issue is that I did a bad backport to 3.3. The 'case 0xc3'
> should be under twobyte_special_insn rather than twobyte_insn, right? The
> two separate paths got merged into one in xen-unstable.

This actually seems to be the case. The move from src.val to dst.val
never happened with the current patch, so movnti wrote the original
dst.val value to memory, leading to memory corruption.

By moving the switch case into twobyte_insn the problem goes away.
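
As an illustration of the failure mode, here is a minimal sketch in C. It is a toy model, not the x86_emulate() source: struct operand, the two dispatch functions and emulate_movnti are hypothetical names, and it assumes that an opcode with no matching case simply falls through to writeback. Only the 0xc3 opcode byte and the src.val/dst.val move come from the discussion above.

```c
#include <stdint.h>

/* Toy model only; names are hypothetical and this is not Xen code. */
#define OPC_MOVNTI 0xc3u /* second opcode byte of movnti (0x0f 0xc3) */

struct operand {
    uint32_t val;
};

typedef void (*dispatch_fn)(uint8_t b, const struct operand *src,
                            struct operand *dst);

/* Fixed layout: 'case 0xc3' is on the path that decoded instructions take. */
void twobyte_insn_fixed(uint8_t b, const struct operand *src,
                        struct operand *dst)
{
    switch (b) {
    case OPC_MOVNTI:
        dst->val = src->val; /* the actual move */
        break;
    default:
        break;
    }
}

/* Backported layout: the handler sits on another path, so nothing matches. */
void twobyte_insn_backport(uint8_t b, const struct operand *src,
                           struct operand *dst)
{
    (void)src;
    (void)dst;
    switch (b) {
    default:
        break; /* assumption: unmatched opcodes fall through to writeback */
    }
}

/* Emulate "movnti reg, mem" and return the value written back to memory. */
uint32_t emulate_movnti(dispatch_fn dispatch, uint32_t reg_value,
                        uint32_t stale_dst)
{
    struct operand src = { reg_value };
    struct operand dst = { stale_dst }; /* whatever decode left in dst.val */

    dispatch(OPC_MOVNTI, &src, &dst);
    return dst.val; /* writeback stores dst.val unconditionally */
}
```

With the fixed dispatch a zeroing store writes 0; with the backported dispatch the stale dst.val is written instead, which is the kind of corruption reported for pages being scrubbed.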

Gianluca

end of thread, other threads:[~2008-11-25 14:01 UTC | newest]

Thread overview: 11+ messages
2008-11-20 16:38 issues with movnti emulation Jan Beulich
2008-11-20 17:13 ` Keir Fraser
2008-11-20 17:16   ` Tim Deegan
2008-11-20 17:43     ` Keir Fraser
2008-11-21 16:01       ` Tim Deegan
2008-11-24 16:18         ` Gianluca Guida
2008-11-24 15:34           ` Jan Beulich
2008-11-20 18:42   ` Kevin Wolf
2008-11-21 11:04   ` Jan Beulich
2008-11-21 11:19     ` Keir Fraser
2008-11-25 14:01   ` Gianluca Guida
