All of lore.kernel.org
 help / color / mirror / Atom feed
* PAE PV guest kernel regression
       [not found] <c30e80f1-120e-4b42-ab33-c53bbe714a94@zmail17.collab.prod.int.phx2.redhat.com>
@ 2012-06-06  8:54 ` Andrew Jones
  2012-06-06 10:41   ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2012-06-06  8:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel


https://bugzilla.redhat.com/show_bug.cgi?id=829016

was opened yesterday. Maybe the following commit
is the culprit?

commit cb8095bba6d24118135a5683a956f4f4fb5f17bb
Author: Jan Beulich <JBeulich@suse.com>
Date:   Fri Jan 20 16:22:04 2012 +0000

    x86: atomic64 assembly improvements

Drew

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-06  8:54 ` PAE PV guest kernel regression Andrew Jones
@ 2012-06-06 10:41   ` Jan Beulich
  2012-06-07 10:46     ` Andrew Jones
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2012-06-06 10:41 UTC (permalink / raw)
  To: Andrew Jones; +Cc: xen-devel

>>> On 06.06.12 at 10:54, Andrew Jones <drjones@redhat.com> wrote:

> https://bugzilla.redhat.com/show_bug.cgi?id=829016 
> 
> was opened yesterday. Maybe the following commit
> is the culprit?
> 
> commit cb8095bba6d24118135a5683a956f4f4fb5f17bb
> Author: Jan Beulich <JBeulich@suse.com>
> Date:   Fri Jan 20 16:22:04 2012 +0000
> 
>     x86: atomic64 assembly improvements

Hardly - that change didn't even touch atomic64_read_cx8()
or any of its constraints.

Further more, this

*pdpt = 0000000023be6027 *pde = 00000000037c2067 *pte = 8000000023bdb061 
Oops: 0003 [#1] SMP 

tells us that this was a write-protection fault, yet the hypervisor
failed to emulate this (hence the hypervisor log would likely help).
After quite some time spent looking through the upstream 3.0.4
sources (and my local object files) I can't, however, spot where
the call to atomic64_read() is actually located in the functions in
question.

Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-06 10:41   ` Jan Beulich
@ 2012-06-07 10:46     ` Andrew Jones
  2012-06-07 14:55       ` Jan Beulich
  2012-06-07 14:57       ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Jones @ 2012-06-07 10:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel



----- Original Message -----
> >>> On 06.06.12 at 10:54, Andrew Jones <drjones@redhat.com> wrote:
> 
> > https://bugzilla.redhat.com/show_bug.cgi?id=829016
> > 
> > was opened yesterday. Maybe the following commit
> > is the culprit?
> > 
> > commit cb8095bba6d24118135a5683a956f4f4fb5f17bb
> > Author: Jan Beulich <JBeulich@suse.com>
> > Date:   Fri Jan 20 16:22:04 2012 +0000
> > 
> >     x86: atomic64 assembly improvements
> 
> Hardly - that change didn't even touch atomic64_read_cx8()
> or any of its constraints.
> 
> Further more, this
> 
> *pdpt = 0000000023be6027 *pde = 00000000037c2067 *pte =
> 8000000023bdb061
> Oops: 0003 [#1] SMP
> 
> tells us that this was a write-protection fault, yet the hypervisor
> failed to emulate this (hence the hypervisor log would likely help).
> After quite some time spent looking through the upstream 3.0.4
> sources (and my local object files) I can't, however, spot where
> the call to atomic64_read() is actually located in the functions in
> question.
> 

Ah, I didn't check to see what Fedora pulled in on top of 3.4. Had I
done that I would have immediately suspected a different patch instead
(mm-pmd_read_atomic-fix-32bit-PAE-pmd-walk-vs-pmd_populate-SMP-race-condition.patch,
upstream commit 26c191788f18). We've already encountered one problem
with this patch for RHEL6 and fixed it. The patch F17 has, however, is
already the "fixed" version. Now the difference between RHEL6 and F17
though is that F17 has CONFIG_TRANSPARENT_HUGEPAGE=y for 32b guests,
but RHEL6 does not. So now with this patch F17 is calling
atomic64_read() from pmd_none_or_trans_huge_or_clear_bad().

Drew

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 10:46     ` Andrew Jones
@ 2012-06-07 14:55       ` Jan Beulich
  2012-06-07 15:19         ` Andrew Jones
  2012-06-07 14:57       ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2012-06-07 14:55 UTC (permalink / raw)
  To: drjones; +Cc: xen-devel

>>> Andrew Jones <drjones@redhat.com> 06/07/12 12:47 PM >>>
>Ah, I didn't check to see what Fedora pulled in on top of 3.4. Had I
>done that I would have immediately suspected a different patch instead
>(mm-pmd_read_atomic-fix-32bit-PAE-pmd-walk-vs-pmd_populate-SMP-race-condition.patch,
>upstream commit 26c191788f18). We've already encountered one problem
>with this patch for RHEL6 and fixed it. The patch F17 has, however, is
>already the "fixed" version. Now the difference between RHEL6 and F17
>though is that F17 has CONFIG_TRANSPARENT_HUGEPAGE=y for 32b guests,
>but RHEL6 does not. So now with this patch F17 is calling
>atomic64_read() from pmd_none_or_trans_huge_or_clear_bad().

By itself the use of atomic64_read() on page table entries shouldn't be
a problem though, particularly not if what might get written is a not
present entry (while the way atoimc64_read_cx8() works would also
allow for values with the low bit set to be pseudo-written, but neither
did that appear to be the case in at least one of the two cases where
register dumps were provided in your bugzilla, nor should an attempt
to write back a value that was already there cause any problem, even
if the low bit was set).

In any case, the hypervisor log would be interesting to see. Plus - is
this perhaps a rather old hypervisor running there (i.e. potentially
lacking some fix)?

Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 10:46     ` Andrew Jones
  2012-06-07 14:55       ` Jan Beulich
@ 2012-06-07 14:57       ` Konrad Rzeszutek Wilk
  2012-06-07 15:16         ` Andrew Jones
  1 sibling, 1 reply; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-06-07 14:57 UTC (permalink / raw)
  To: Andrew Jones; +Cc: Jan Beulich, xen-devel

> Ah, I didn't check to see what Fedora pulled in on top of 3.4. Had I
> done that I would have immediately suspected a different patch instead
> (mm-pmd_read_atomic-fix-32bit-PAE-pmd-walk-vs-pmd_populate-SMP-race-condition.patch,
> upstream commit 26c191788f18). We've already encountered one problem
> with this patch for RHEL6 and fixed it. The patch F17 has, however, is

Was the fix for the git commit 26c191788f18 (which I understand is in RHEL6)
posted?

> already the "fixed" version. Now the difference between RHEL6 and F17
> though is that F17 has CONFIG_TRANSPARENT_HUGEPAGE=y for 32b guests,
> but RHEL6 does not. So now with this patch F17 is calling
> atomic64_read() from pmd_none_or_trans_huge_or_clear_bad().
> 
> Drew
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 14:57       ` Konrad Rzeszutek Wilk
@ 2012-06-07 15:16         ` Andrew Jones
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Jones @ 2012-06-07 15:16 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Jan Beulich, xen-devel



----- Original Message ----- 
> Was the fix for the git commit 26c191788f18 (which I understand is in
> RHEL6)
> posted?

Nope, it got rolled into 26c191788f18 before 26c191788f18 was
committed. The fix was to not use __pmd, which is paravirtualized,
when doing the final creation of the pmd_t retval. Here it is
(the 2f88dfd2 referenced there is a rhel6 commit).

    Commit 2f88dfd2 introduces read_pmd_atomic which calls __pmd. __pmd is
    paravirtualized and for xen it calls xen_make_pmd(). xen_make_pmd may
    return a zero pmd in the case that the pfn's associated mfn is still
    invalid. However at the point when read_pmd_atomic is called the mfn can
    appear invalid when it isn't, if its creation was lazily deferred. If
    read_pmd_atomic returns zero in these cases then it can eventually lead to
    a crash. To resolve the issue we can avoid calling __pmd and just return
    the open coded compound literal (as native_make_pmd would do) instead.
    
diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtabl
index f2d473b..4f325a1 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -73,12 +73,13 @@ static inline pmd_t read_pmd_atomic(pmd_t *pmdp)
                ret |= ((pmdval_t)*(tmp + 1)) << 32;
        }
 
-       return __pmd(ret);
+       return (pmd_t) { ret };
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 static inline pmd_t read_pmd_atomic(pmd_t *pmdp)
 {
-       return __pmd(atomic64_read((atomic64_t *)pmdp));
+       pmdval_t val = (pmdval_t)atomic64_read((atomic64_t *)pmdp);
+       return (pmd_t) { val };
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 14:55       ` Jan Beulich
@ 2012-06-07 15:19         ` Andrew Jones
  2012-06-07 15:44           ` Andrew Jones
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2012-06-07 15:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel



----- Original Message -----
> >>> Andrew Jones <drjones@redhat.com> 06/07/12 12:47 PM >>>
> >Ah, I didn't check to see what Fedora pulled in on top of 3.4. Had I
> >done that I would have immediately suspected a different patch
> >instead
> >(mm-pmd_read_atomic-fix-32bit-PAE-pmd-walk-vs-pmd_populate-SMP-race-condition.patch,
> >upstream commit 26c191788f18). We've already encountered one problem
> >with this patch for RHEL6 and fixed it. The patch F17 has, however,
> >is
> >already the "fixed" version. Now the difference between RHEL6 and
> >F17
> >though is that F17 has CONFIG_TRANSPARENT_HUGEPAGE=y for 32b guests,
> >but RHEL6 does not. So now with this patch F17 is calling
> >atomic64_read() from pmd_none_or_trans_huge_or_clear_bad().
> 
> By itself the use of atomic64_read() on page table entries shouldn't
> be
> a problem though, particularly not if what might get written is a not
> present entry (while the way atoimc64_read_cx8() works would also
> allow for values with the low bit set to be pseudo-written, but
> neither
> did that appear to be the case in at least one of the two cases where
> register dumps were provided in your bugzilla, nor should an attempt
> to write back a value that was already there cause any problem, even
> if the low bit was set).
> 
> In any case, the hypervisor log would be interesting to see. Plus -
> is
> this perhaps a rather old hypervisor running there (i.e. potentially
> lacking some fix)?

I just reproduced this on my own test box, which is running an HV
based on older code (it's RHEL 5.8). Even with
'loglvl=all guest_loglvl=all' I didn't get anything logged in
'xm dmesg'.

Drew

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 15:19         ` Andrew Jones
@ 2012-06-07 15:44           ` Andrew Jones
  2012-06-07 18:48             ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2012-06-07 15:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel



----- Original Message -----
> > In any case, the hypervisor log would be interesting to see. Plus -
> > is
> > this perhaps a rather old hypervisor running there (i.e.
> > potentially
> > lacking some fix)?
> 
> I just reproduced this on my own test box, which is running an HV
> based on older code (it's RHEL 5.8). Even with
> 'loglvl=all guest_loglvl=all' I didn't get anything logged in
> 'xm dmesg'.

I don't have an later hypervisor setup for testing, but it'd
be nice to know if guests work on them. If so, then maybe RHEL5
and EC2 need to look at Xen c/s 17498 and/or others.

Drew

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 15:44           ` Andrew Jones
@ 2012-06-07 18:48             ` Jan Beulich
  2012-06-07 19:43               ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2012-06-07 18:48 UTC (permalink / raw)
  To: drjones; +Cc: xen-devel

>>> Andrew Jones <drjones@redhat.com> 06/07/12 5:44 PM >>>
>> > In any case, the hypervisor log would be interesting to see. Plus -
>> > is
>> > this perhaps a rather old hypervisor running there (i.e.
>> > potentially
>> > lacking some fix)?
>> 
>> I just reproduced this on my own test box, which is running an HV
>> based on older code (it's RHEL 5.8). Even with
>> 'loglvl=all guest_loglvl=all' I didn't get anything logged in
>> 'xm dmesg'.
>
>I don't have an later hypervisor setup for testing, but it'd
>be nice to know if guests work on them. If so, then maybe RHEL5
>and EC2 need to look at Xen c/s 17498 and/or others.

Is it perhaps the really old 3.0.4 based one, and lacking a backport of
14041:b010e556fe2c? In that case the comparison part of the
cmpxchg8b emulation failing would apparently be treated equally to
e.g. a fault having occurred during the emulation (i.e. the original
page fault would get forwarded rather than just EFLAGS.ZF getting
set).

Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PAE PV guest kernel regression
  2012-06-07 18:48             ` Jan Beulich
@ 2012-06-07 19:43               ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2012-06-07 19:43 UTC (permalink / raw)
  To: drjones; +Cc: xen-devel

>>> "Jan Beulich" <jbeulich@suse.com> 06/07/12 8:50 PM >>>
>Is it perhaps the really old 3.0.4 based one, and lacking a backport of
>14041:b010e556fe2c? In that case the comparison part of the
>cmpxchg8b emulation failing would apparently be treated equally to
>e.g. a fault having occurred during the emulation (i.e. the original
>page fault would get forwarded rather than just EFLAGS.ZF getting
>set).

Having read the other thread with Andrea's explanation, that's unlikely
the culprit. He says the cmpxchg8b is being done on a PMD entry,
which the ptwr code definitely doesn't support so far.

Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-06-07 19:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <c30e80f1-120e-4b42-ab33-c53bbe714a94@zmail17.collab.prod.int.phx2.redhat.com>
2012-06-06  8:54 ` PAE PV guest kernel regression Andrew Jones
2012-06-06 10:41   ` Jan Beulich
2012-06-07 10:46     ` Andrew Jones
2012-06-07 14:55       ` Jan Beulich
2012-06-07 15:19         ` Andrew Jones
2012-06-07 15:44           ` Andrew Jones
2012-06-07 18:48             ` Jan Beulich
2012-06-07 19:43               ` Jan Beulich
2012-06-07 14:57       ` Konrad Rzeszutek Wilk
2012-06-07 15:16         ` Andrew Jones

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.