* the ptwr cmpxchg fix
From: John Levon @ 2007-10-18 19:58 UTC
  To: xen-devel


I didn't think this through well enough. It's an ABI change.

The reason Solaris worked at all before was that we didn't remove
_PAGE_USER for kernel PTEs when compatibility was broken last time
(3.0.3 I think). Thus the combination of our bug and the hypervisor's
bug conspired to work.

Now the hypervisor is fixed, we'll be getting _PAGE_GLOBAL on our kernel
pages - not a good idea. But we can't just fix Solaris, because other
hypervisors without the fix will then not be putting _PAGE_USER on
kernel PTEs - much worse!!

I think the right thing to do is:

- finally start the page listing incompatibilities on the Wiki
  (theoretical or otherwise) [1]

- fix Solaris to add _PAGE_USER (or PT_USER as we know it) iff we have a
  'broken' hypervisor. I'm not sure how to do that though, beyond a
  "if it's our hypervisor, or Xen 3.1.2 or higher" check. BTW it would
  be nice to see this in 3.1.2
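
A minimal sketch of the version check described above. The function names,
the is_our_hypervisor flag and the extraversion parsing are hypothetical and
only illustrate the decision; it also assumes the fix does land in 3.1.2,
which is a request here rather than a fact:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn an extraversion string such as ".2" or
 * ".2-rc1" into a point-release number. Strings that do not start
 * with '.' (for example "-unstable") count as point release 0.
 */
static int point_release(const char *extra)
{
    if (extra == NULL || extra[0] != '.')
        return 0;
    return (int)strtol(extra + 1, NULL, 10);
}

/*
 * Assumption: the cmpxchg fix is present in "our" hypervisor and in
 * Xen 3.1.2 or higher. Everything else is treated as 'broken'.
 */
static bool hv_has_cmpxchg_fix(unsigned major, unsigned minor,
                               const char *extra, bool is_our_hypervisor)
{
    if (is_our_hypervisor)
        return true;
    if (major != 3)
        return major > 3;
    if (minor != 1)
        return minor > 1;
    return point_release(extra) >= 2;
}

/* Add PT_USER to kernel PTEs iff the hypervisor lacks the fix. */
static bool need_pt_user_on_kernel_ptes(unsigned major, unsigned minor,
                                        const char *extra,
                                        bool is_our_hypervisor)
{
    return !hv_has_cmpxchg_fix(major, minor, extra, is_our_hypervisor);
}
```

A plain version comparison like this cannot tell a vendor build with the fix
backported from one without it, which is why the sketch carries the separate
is_our_hypervisor flag.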

Does that make sense, Keir?

thanks
john

[1]

This is the list I'm aware of that breaks Solaris domUs:

    * Xen 3.1.1 is broken for 64-bit in b75 and b76 (6616864).
      xen-unstable post 2007-10-15 is OK, as is Xen 3.1
    * Xen 3.0.4 upstream is broken for Solaris domU (doesn't
      save/restore trap interrupt settings)
    * pre-3.0.4 doesn't work on 64-bit (changes in PTE handling?)
    * pre-3.0.4 doesn't work with SMP guests (spurious page fault code)
    * To quote Jan Beulich:

        Subject: [Xen-devel] c/s 15147 change to struct vcpu_register_vcpu_info

        This changeset changed the layout of the structure, and 3.1 as well as
        2.6.23 use the old layout, while 3.1.1 uses the new one.

      We don't use this on Solaris yet, however.

    * the fix for cmpxchg and PT_GLOBAL means that newer Solaris
      versions (or anything cmpxchg'ing a PTE) that correctly don't set
      PT_USER will break on hypervisors without this changeset:
      16129:2173fe77dcd2 from xen-unstable


* Re: the ptwr cmpxchg fix
From: Keir Fraser @ 2007-10-19  7:38 UTC
  To: John Levon, xen-devel

Yes, deciding whether to add _PAGE_USER based on Xen version seems the best
way to go. Bear in mind that the ABI bug affects *only* cmpxchg of
pagetables. Any of the following methods of writing a pte with _PAGE_USER
set will also set _PAGE_GLOBAL (unless our software flag _PAGE_GUEST_KERNEL
is also set):
 * update_va_mapping()
 * MMU_NORMAL_PT_UPDATE
 * direct modification of a not-yet-pinned pagetable (the _PAGE_GLOBAL will
be added on each pte when the pagetable becomes pinned)

So, unless you *only* ever update kernel ptes with cmpxchg, you have quite a
nasty problem with older Xen: some update methods will adjust the l1e,
while direct cmpxchg won't.
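
A toy model of that mismatch, as a sketch only. This is not Xen's code: the
function names are made up, and the bit chosen for _PAGE_GUEST_KERNEL is a
stand-in, since it is a software flag whose real position is not given here.
The _PAGE_USER and _PAGE_GLOBAL values are the usual x86 ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT      0x001u
#define PTE_USER         0x004u  /* _PAGE_USER */
#define PTE_GLOBAL       0x100u  /* _PAGE_GLOBAL */
#define PTE_GUEST_KERNEL 0x800u  /* stand-in for _PAGE_GUEST_KERNEL (assumed bit) */

typedef uint64_t pte_t;

/* The rule above: _PAGE_USER without _PAGE_GUEST_KERNEL gains _PAGE_GLOBAL. */
static pte_t adjust_pte(pte_t pte)
{
    if ((pte & PTE_USER) && !(pte & PTE_GUEST_KERNEL))
        pte |= PTE_GLOBAL;
    return pte;
}

/* Models update_va_mapping(), MMU_NORMAL_PT_UPDATE and pinning: always adjusted. */
static void write_adjusted(pte_t *slot, pte_t val)
{
    *slot = adjust_pte(val);
}

/*
 * Models an emulated cmpxchg of a pagetable entry. With hv_fixed false
 * (older Xen) the new value is stored without the adjustment.
 */
static bool emulate_cmpxchg(pte_t *slot, pte_t old, pte_t val, bool hv_fixed)
{
    if (*slot != old)
        return false;
    *slot = hv_fixed ? adjust_pte(val) : val;
    return true;
}
```

Writing the same value through both paths on an older Xen leaves two
different l1e contents, differing in the _PAGE_GLOBAL bit; with the fix they
agree.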

Our Linux guests have not had problems because I'm pretty sure we basically
never cmpxchg a kernel pte.

 -- Keir


* Re: the ptwr cmpxchg fix
From: Keir Fraser @ 2007-10-19  7:45 UTC
  To: John Levon, xen-devel

On 18/10/07 20:58, "John Levon" <levon@movementarian.org> wrote:

>     * To quote Jan Beulich:
> 
> Subject: [Xen-devel] c/s 15147 change to struct vcpu_register_vcpu_info
> 
> This changeset changed the layout of the structure, and 3.1 as well as
> 2.6.23 use the old layout, while 3.1.1 uses the new one.
> 
> We don't use this on Solaris yet, however

Since the old structure layout was never present in a stable release of Xen
(that operation was not supported at all in 3.1.0) this one doesn't really
belong on the incompatibility list.

 -- Keir


* Re: the ptwr cmpxchg fix
From: Jeremy Fitzhardinge @ 2007-10-19 22:50 UTC
  To: Keir Fraser; +Cc: xen-devel, John Levon

Keir Fraser wrote:
> On 18/10/07 20:58, "John Levon" <levon@movementarian.org> wrote:
>
>   
>>     * To quote Jan Beulich:
>>
>> Subject: [Xen-devel] c/s 15147 change to struct vcpu_register_vcpu_info
>>
>> This changeset changed the layout of the structure, and 3.1 as well as
>> 2.6.23 use the old layout, while 3.1.1 uses the new one.
>>
>> We don't use this on Solaris yet, however
>>     
>
> Since the old structure layout was never present in a stable release of Xen
> (that operation was not supported at all in 3.1.0) this one doesn't really
> belong on the incompatibility list.
>   

I think the structure was present, but there was no implementation to
back it up.  It's remotely possible someone decided to implement the
vcpu_info placement without ever testing it, but it seems unlikely.

    J


* Re: the ptwr cmpxchg fix
From: John Levon @ 2007-10-22 12:09 UTC
  To: Keir Fraser; +Cc: xen-devel

On Fri, Oct 19, 2007 at 08:38:14AM +0100, Keir Fraser wrote:

> Yes, deciding whether to add _PAGE_USER based on Xen version seems the best
> way to go.

OK.

> Bear in mind that the ABI bug affects *only* cmpxchg of
> pagetables.

Yep. Unfortunately we use that rather heavily, although Linux doesn't :)

regards
john
