xen-devel.lists.xenproject.org archive mirror
* L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
@ 2011-03-16 22:19 Konrad Rzeszutek Wilk
  2011-03-16 22:32 ` Keir Fraser
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-16 22:19 UTC (permalink / raw)
  To: xen-devel, gianni.tedesco, andrew.thomas, Jeremy Fitzhardinge,
	Ian Campbell, keir.xen
  Cc: swente

I am troubleshooting an issue where the Linux kernel tries
to dereference a not-present entry. I have a fix for this
in for-2.6.32/bug-fixes, but please read on.

Specifically, it tries to dereference the fixmapped value of
APIC_BASE. The fixmapped value of APIC_BASE is actually not set,
due to git commit a1d8e2fa8325064338b2da1bcf0d7a0473883c284,
which adds this in arch/x86/kernel/acpi/boot.c:

static void __init acpi_register_lapic_address(unsigned long address)
{
	/* Xen dom0 doesn't have usable lapics */
	if (xen_initial_domain())
		return;

	mp_lapic_addr = address;

	set_fixmap_nocache(FIX_APIC_BASE, address);

Later on we use 'native_apic_read', which uses APIC_BASE as the
address (it is presumed to be at slot FIX_APIC_BASE of the fixmap
area), and it fails (on some machines).

Since we don't call 'set_fixmap_nocache(FIX_APIC_BASE)', this is what
we get if we walk the pagetable:


[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] mapped APIC to ffffffffff5fb000 (00000000)
(XEN) d0:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffffffff5fb020:
(XEN)  L4[0x1ff] = 0000000221003067 0000000000001003
(XEN)  L3[0x1ff] = 0000000221004067 0000000000001004
(XEN)  L2[0x1fa] = 0000000221771067 0000000000001771 
(XEN)  L1[0x1fb] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1-110309  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8102b5d1>]
(XEN) RFLAGS: 0000000000000292   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff8164cf50   rbx: 000000026ec00000   rcx: 00000000ffffdd85
(XEN) rdx: 00000000ffffffff   rsi: 0000000000000000   rdi: 0000000000000020
(XEN) rbp: ffffffff81643ea8   rsp: ffffffff81643e50   r8:  0000000000000002
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff880013671800   r13: 00000000bff66000   r14: ffffffffffffffff
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000221001000   cr2: ffffffffff5fb020
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81643e50:

Which is to say that the L1 has this:
0000000115771fa0:  00000000 00000000 00000000 00000000
0000000115771fb0:  00000000 00000000 00000000 00000000
0000000115771fc0:  00000000 00000000 15770067 80100001
0000000115771fd0:  15770067 80100001 00000000 00000000
0000000115771fe0:  00000000 00000000 00000000 00000000
0000000115771ff0:  00000000 00000000 00000000 00000000

L1[0x1fb] is machine address 115771fd8, which has nothing in it.
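
As a cross-check of the walk above, here is a minimal sketch of how a
virtual address splits into table indices under standard x86-64 4-level
paging with 4 KiB pages. The helper names (pt_index, pt_entry_offset,
page_offset) are made up for illustration and are not kernel or Xen
functions.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 4-level paging: each level indexes 512 entries (9 bits),
 * and a 4 KiB page leaves a 12-bit offset. */
#define PT_INDEX_BITS	9
#define PT_INDEX_MASK	0x1ffu
#define PAGE_SHIFT_4K	12

/* Index into the page table at the given level (1 = L1 ... 4 = L4). */
static inline unsigned int pt_index(uint64_t vaddr, int level)
{
	assert(level >= 1 && level <= 4);
	return (unsigned int)((vaddr >> (PAGE_SHIFT_4K +
					 PT_INDEX_BITS * (level - 1))) &
			      PT_INDEX_MASK);
}

/* Byte offset of an entry inside its 4 KiB table (8 bytes per entry). */
static inline unsigned int pt_entry_offset(unsigned int index)
{
	return index * 8u;
}

/* Offset of the access inside the 4 KiB page. */
static inline unsigned int page_offset(uint64_t vaddr)
{
	return (unsigned int)(vaddr & 0xfffu);
}
```

For the faulting address ffffffffff5fb020 this gives L4 = 0x1ff,
L3 = 0x1ff, L2 = 0x1fa and L1 = 0x1fb, with the L1 entry at byte offset
0xfd8 inside its table (hence the ...fd8 address above) and a page offset
of 0x20, which matches the value in rdi.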

OK, so I've come up with a fix that is a back-port of how 2.6.38 does it:
it removes the check I mentioned above, and in xen_set_fixmap
we set FIX_APIC_BASE to actually point to a dummy ioapic_mapping.
It is 7cb068cf1ba90425e12f3a7b3caed9d018fa9b8c in for-2.6.32/bug-fixes.

Gianni, you might want to check this out in case it fixes the problem you
are experiencing.

But one thing I can't understand is why on one machine (IBM x3850)
I get this crash, while on another one with the same pagetable contents
(L1 has nothing for 0x1fb) it works just fine. I added a panic and used
the Xen hypervisor kdb to manually inspect the pagetable, and it has
the same contents as on the IBM x3850, but it boots fine with this invalid value.
Any ideas?


FYI, it seems another user (Sven Sübert) with an IBM x3650 hits the same
bug, and with this fix he is able to boot.


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-16 22:19 L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Konrad Rzeszutek Wilk
@ 2011-03-16 22:32 ` Keir Fraser
  2011-03-17 10:25 ` Jan Beulich
  2011-03-22 13:10 ` L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Gianni Tedesco
  2 siblings, 0 replies; 10+ messages in thread
From: Keir Fraser @ 2011-03-16 22:32 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, gianni.tedesco, andrew.thomas,
	Jeremy Fitzhardinge, Ian
  Cc: swente

On 16/03/2011 22:19, "Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com> wrote:

> OK, so I've come up with a fix that is a back-port of how 2.6.38 does it
> which is that it removes the check I mentioned above and in xen_set_fixmap
> we set the FIX_APIC_BASE to actually point to a dummy ioapic_mapping.
> It is 7cb068cf1ba90425e12f3a7b3caed9d018fa9b8c in for-2.6.32/bug-fixes
> 
> Gianni, you might want to check this out in case it fixes the problem you
> are experiencing.
> 
> But one thing I can't understand is why on one machine (IBM x3850)
> I get this crash, while another one with the same pagetable contents
> (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> the Xen hypervisor kdb to manually inspect the pagetable, and it has
> the same contents as the IBM x3850 -but it boots fine with this invalid value.
> Any ideas?

Could the native_apic_read() come from ACPI DSDT of that particular machine
type (x3850)?

 K.


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-16 22:19 L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Konrad Rzeszutek Wilk
  2011-03-16 22:32 ` Keir Fraser
@ 2011-03-17 10:25 ` Jan Beulich
  2011-03-17 15:52   ` Konrad Rzeszutek Wilk
  2011-03-22 13:10 ` L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Gianni Tedesco
  2 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2011-03-17 10:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel, andrew.thomas, Ian Campbell,
	keir.xen, swente, gianni.tedesco

>>> On 16.03.11 at 23:19, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> But one thing I can't understand is why on one machine (IBM x3850)
> I get this crash, while another one with the same pagetable contents
> (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> the Xen hypervisor kdb to manually inspect the pagetable, and it has
> the same contents as the IBM x3850 -but it boots fine with this invalid 
> value.
> Any ideas?

Without seeing the full stack trace it's hard to tell. To me, it looks
like a mistake for native_apic_read() to be called at all under Xen,
and perhaps there's one lurking somewhere that gets hit only on
those IBM (Summit?) machines.

Jan


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-17 10:25 ` Jan Beulich
@ 2011-03-17 15:52   ` Konrad Rzeszutek Wilk
  2011-03-17 16:12     ` Konrad Rzeszutek Wilk
  2011-03-17 16:12     ` Jan Beulich
  0 siblings, 2 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-17 15:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, xen-devel, andrew.thomas, Ian Campbell,
	keir.xen, swente, gianni.tedesco

On Thu, Mar 17, 2011 at 10:25:11AM +0000, Jan Beulich wrote:
> >>> On 16.03.11 at 23:19, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > But one thing I can't understand is why on one machine (IBM x3850)
> > I get this crash, while another one with the same pagetable contents
> > (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> > the Xen hypervisor kdb to manually inspect the pagetable, and it has
> > the same contents as the IBM x3850 -but it boots fine with this invalid 
> > value.
> > Any ideas?
> 
> Without seeing the full stack trace it's hard to tell. To me, it looks
> like a mistake for native_apic_read() to be called at all under Xen,
> and perhaps there's one lurking somewhere that gets hit only on
> those IBM (Summit?) machines.

That was it. When we boot up we call 'set_xen_basic_apic_ops', which
sets apic->read to xen_apic_read. The default 'apic' is set to
apic_flat, so in essence we change apic_flat->read from native_apic_read
to xen_apic_read.

During bootup, default_acpi_madt_oem_check is run, which
walks through the whole apic_probe[] array, in which the last
entry is apic_physflat. And apic_physflat->probe() returns true
on this IBM Summit box (and ES7000 boxes, and whatever has the FADT
flag ACPI_FADT_APIC_PHYSICAL set), so we now set apic to apic_physflat
and apic->read ends up being native_apic_read.

2.6.38 fixes this by allowing set_fixmap_nocache(FIX_APIC_BASE, address)
to be called in acpi_register_lapic_address, so we can provide it with
a dummy page and native_apic_read can happily read from that fake page.
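
The sequence described above can be sketched as a small stand-alone
model. This is not kernel code: every toy_* name is hypothetical, the
struct carries only the fields the model needs, and the probe list is
reduced to a single entry. It only shows why patching the default
driver's read op is lost once the probe loop selects a different driver.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's 'struct apic'; only what the model needs. */
typedef unsigned int (*toy_read_fn)(unsigned int reg);

struct toy_apic {
	const char *name;
	int (*oem_check)(int fadt_physical);	/* nonzero means "I match" */
	toy_read_fn read;
};

/* Stands in for native_apic_read, which would touch the fixmap page. */
static unsigned int toy_native_read(unsigned int reg) { (void)reg; return 0xdeadu; }
/* Stands in for xen_apic_read, which never touches the fixmap page. */
static unsigned int toy_xen_read(unsigned int reg) { (void)reg; return 0; }

/* Matches only when the FADT asks for physical destination mode. */
static int toy_physflat_check(int fadt_physical) { return fadt_physical; }

static struct toy_apic toy_flat = { "flat", NULL, toy_native_read };
static struct toy_apic toy_physflat = { "physflat", toy_physflat_check,
					toy_native_read };

static struct toy_apic *toy_probe_list[] = { &toy_physflat, NULL };
static struct toy_apic *toy_cur = &toy_flat;	/* plays the role of 'apic' */

/* Models set_xen_basic_apic_ops: patches whatever 'apic' points at now. */
static void toy_set_xen_ops(void)
{
	assert(toy_cur != NULL);
	toy_cur->read = toy_xen_read;
}

/* Models the MADT OEM check: the first matching driver replaces 'apic'. */
static int toy_madt_oem_check(int fadt_physical)
{
	size_t i;

	for (i = 0; toy_probe_list[i] != NULL; i++) {
		if (toy_probe_list[i]->oem_check(fadt_physical)) {
			toy_cur = toy_probe_list[i];
			return 1;
		}
	}
	return 0;
}

/* Boot in miniature; returns the read op that is live afterwards. */
static toy_read_fn toy_boot(int fadt_physical)
{
	toy_cur = &toy_flat;
	toy_flat.read = toy_native_read;
	toy_physflat.read = toy_native_read;

	toy_set_xen_ops();
	toy_madt_oem_check(fadt_physical);
	return toy_cur->read;
}
```

In this model toy_boot(0) leaves the Xen read op in place, while
toy_boot(1) ends up with the native one again, which corresponds to the
box that boots and the box that faults.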


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-17 15:52   ` Konrad Rzeszutek Wilk
@ 2011-03-17 16:12     ` Konrad Rzeszutek Wilk
  2011-03-17 16:12     ` Jan Beulich
  1 sibling, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-17 16:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, xen-devel, andrew.thomas, Ian Campbell,
	keir.xen, swente, gianni.tedesco

On Thu, Mar 17, 2011 at 11:52:12AM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Mar 17, 2011 at 10:25:11AM +0000, Jan Beulich wrote:
> > >>> On 16.03.11 at 23:19, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > But one thing I can't understand is why on one machine (IBM x3850)
> > > I get this crash, while another one with the same pagetable contents
> > > (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> > > the Xen hypervisor kdb to manually inspect the pagetable, and it has
> > > the same contents as the IBM x3850 -but it boots fine with this invalid 
> > > value.
> > > Any ideas?
> > 
> > Without seeing the full stack trace it's hard to tell. To me, it looks
> > like a mistake for native_apic_read() to be called at all under Xen,
> > and perhaps there's one lurking somewhere that gets hit only on
> > those IBM (Summit?) machines.
> 
> That was it. When we bootup we call 'set_xen_basic_apic_ops' which

Forgot to mention it but thank you for steering me in the right direction!

The patches are in
 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git for-2.6.32/bug-fixes


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-17 15:52   ` Konrad Rzeszutek Wilk
  2011-03-17 16:12     ` Konrad Rzeszutek Wilk
@ 2011-03-17 16:12     ` Jan Beulich
  2011-03-17 16:41       ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2011-03-17 16:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel, andrew.thomas, Ian Campbell,
	keir.xen, swente, gianni.tedesco

>>> On 17.03.11 at 16:52, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> 2.6.38 fixes this by allowing in acpi_register_lapic_address, the
> set_fixmap_nocache(FIX_APIC_BASE, address) to be called and we
> can provide it with a dummy page and native_apic_read can happily
> read from that fake page.

I wonder whether that's going to be appropriate in cases...

Jan


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-17 16:12     ` Jan Beulich
@ 2011-03-17 16:41       ` Konrad Rzeszutek Wilk
  2011-03-17 17:21         ` Jeremy Fitzhardinge
  2011-03-17 19:56         ` [PATCH] xen/apic: Provide an 'apic_xen' to set the override the apic->[read|write] for all cases Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-17 16:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, xen-devel, andrew.thomas, Ian Campbell,
	keir.xen, swente, gianni.tedesco

On Thu, Mar 17, 2011 at 04:12:48PM +0000, Jan Beulich wrote:
> >>> On 17.03.11 at 16:52, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > 2.6.38 fixes this by allowing in acpi_register_lapic_address, the
> > set_fixmap_nocache(FIX_APIC_BASE, address) to be called and we
> > can provide it with a dummy page and native_apic_read can happily
> > read from that fake page.
> 
> I wonder whether that's going to be appropriate in cases...

If you boot 2.6.38 it works, but it does provide these ugly and untrue values:

[    0.000000] ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 15, version 255, address 0xfec00000, GSI 0-255
[    0.000000] ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
[    0.000000] IOAPIC[1]: apic_id 14, version 255, address 0xfec01000, GSI 36-291
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 00, APIC ID f, APIC INT 02
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
[    0.000000] Int: type 0, pol 3, trig 1, bus 00, IRQ 08, APIC ID f, APIC INT 08
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
[    0.000000] Int: type 0, pol 3, trig 1, bus 00, IRQ 0e, APIC ID f, APIC INT 0e
[    0.000000] Int: type 0, pol 3, trig 3, bus 00, IRQ 09, APIC ID f, APIC INT 09
[    0.000000] ACPI: IRQ0 used by override.

I don't remember if it was suggested to hpa/ingo/tglx whether we could
provide another 'struct apic' that would be Xen specific, where either
the struct is mostly filled with dummy functions that return
nothing, or the Xen apic->probe() function overwrites the current
'apic->read,write, etc' with the Xen dummy functions.

However, we seem to achieve this already by providing a dummy page that
is read from and written to by native_apic_[read|write].
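
For what it is worth, the "version 255" and "GSI 0-255" / "GSI 36-291"
values in the log are what one would get if every read from the dummy
page returned all-ones. That the page is filled with 0xff bytes is an
assumption here; the thread does not say so. The sketch below decodes
the I/O APIC version register (bits 7:0 are the version, bits 23:16 the
maximum redirection entry) from such a read. The toy_* and ioapic_*
helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/* 32-bit read from a byte buffer standing in for the dummy page. */
static inline uint32_t toy_read32(const uint8_t *page, size_t off)
{
	uint32_t v;

	assert(off + sizeof(v) <= TOY_PAGE_SIZE);
	memcpy(&v, page + off, sizeof(v));
	return v;
}

/* I/O APIC version register: bits 7:0 hold the version. */
static inline unsigned int ioapic_version(uint32_t reg)
{
	return reg & 0xffu;
}

/* Bits 23:16 hold the index of the last redirection entry. */
static inline unsigned int ioapic_max_redir(uint32_t reg)
{
	return (reg >> 16) & 0xffu;
}

/* Last GSI served by an I/O APIC whose first GSI is gsi_base. */
static inline unsigned int ioapic_gsi_end(unsigned int gsi_base, uint32_t reg)
{
	return gsi_base + ioapic_max_redir(reg);
}
```

Reading the data window (offset 0x10) of a page filled with 0xff yields
0xffffffff, which decodes to version 255 and a last GSI of base + 255,
i.e. 255 for gsi_base 0 and 291 for gsi_base 36.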


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-17 16:41       ` Konrad Rzeszutek Wilk
@ 2011-03-17 17:21         ` Jeremy Fitzhardinge
  2011-03-17 19:56         ` [PATCH] xen/apic: Provide an 'apic_xen' to set the override the apic->[read|write] for all cases Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 10+ messages in thread
From: Jeremy Fitzhardinge @ 2011-03-17 17:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, andrew.thomas, Jan Beulich, Ian Campbell, keir.xen,
	swente, gianni.tedesco

On 03/17/2011 09:41 AM, Konrad Rzeszutek Wilk wrote:
> I don't remember if it was suggested to hpa/ingo/tglx whether we could
> provide another 'struct apic' that would be Xen specific and the apic->probe()
> would either provide a struct mostly filled with dummy functions that return
> nothing, or the Xen apic->probe() function would over-write the current
> 'apic->read,write, etc' with the xen dummy functions.

I still maintain the "proper fix" is to just turn off the APIC CPU
capability.  There is no local apic, and trying to pretend otherwise
just leads to a mass of hacks.

But of course, that's not particularly easy in practice...

    J


* [PATCH] xen/apic: Provide an 'apic_xen' to set the override the apic->[read|write] for all cases.
  2011-03-17 16:41       ` Konrad Rzeszutek Wilk
  2011-03-17 17:21         ` Jeremy Fitzhardinge
@ 2011-03-17 19:56         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-03-17 19:56 UTC (permalink / raw)
  To: Jan Beulich, Jeremy Fitzhardinge
  Cc: xen-devel, andrew.thomas, Ian Campbell, keir.xen, swente,
	gianni.tedesco

On Thu, Mar 17, 2011 at 12:41:43PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Mar 17, 2011 at 04:12:48PM +0000, Jan Beulich wrote:
> > >>> On 17.03.11 at 16:52, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > 2.6.38 fixes this by allowing in acpi_register_lapic_address, the
> > > set_fixmap_nocache(FIX_APIC_BASE, address) to be called and we
> > > can provide it with a dummy page and native_apic_read can happily
> > > read from that fake page.
> > 
> > I wonder whether that's going to be appropriate in cases...
> 
> If you boot the 2.6.38 it works, but it does provide these ugly and untrue values:
> 
> [    0.000000] ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
> [    0.000000] IOAPIC[0]: apic_id 15, version 255, address 0xfec00000, GSI 0-255
> [    0.000000] ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
> [    0.000000] IOAPIC[1]: apic_id 14, version 255, address 0xfec01000, GSI 36-291
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [    0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 00, APIC ID f, APIC INT 02
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
> [    0.000000] Int: type 0, pol 3, trig 1, bus 00, IRQ 08, APIC ID f, APIC INT 08
> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
> [    0.000000] Int: type 0, pol 3, trig 1, bus 00, IRQ 0e, APIC ID f, APIC INT 0e
> [    0.000000] Int: type 0, pol 3, trig 3, bus 00, IRQ 09, APIC ID f, APIC INT 09
> [    0.000000] ACPI: IRQ0 used by override.
> 
> I don't remember if it was suggested to hpa/ingo/tglx whether we could
> provide another 'struct apic' that would be Xen specific and the apic->probe()
> would either provide a struct mostly filled with dummy functions that return
> nothing, or the Xen apic->probe() function would over-write the current
> 'apic->read,write, etc' with the xen dummy functions.
> 
> However we seem to achieve this already by providing a dummy page that 
> is read/written to by the native_apic_[read|write].

Except that mechanism seems to require some other back-ports from 2.6.38 that
I am not so sure about. The patch worked great on the IBM box but broke all
the other ones. Stefano had sent me a couple of fixes where we remove some other
"if (xen_initial_domain())" checks and move the "memset(ioapic_dummy_.." to another
location, but it did not work completely right.

Instead of chasing the right combination, I went ahead with what
I suggested about introducing another 'struct apic'.

Here is the patch. If I revert the fix that I posted and apply this one
(already on for-2.6.32/bug-fixes), all my machines boot.

This is for 2.6.32 - don't know if we need to provide it for 2.6.38.
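
The idea of putting a Xen-specific entry at the head of the probe list
can be sketched as a stand-alone model. This is not the patch itself:
all toy_* names are hypothetical, the environment flags stand in for
xen_initial_domain() and the FADT physical-mode flag, and the real
patch additionally copies the already patched default ops into
apic_xen. The model only shows the "first match wins" ordering.

```c
#include <assert.h>
#include <stddef.h>

struct toy_env {
	int is_xen_dom0;	/* plays the role of xen_initial_domain() */
	int fadt_physical;	/* FADT requests physical destination mode */
};

struct toy_apic {
	const char *name;
	int (*match)(const struct toy_env *env);
};

static int toy_xen_match(const struct toy_env *env)
{
	return env->is_xen_dom0;
}

static int toy_physflat_match(const struct toy_env *env)
{
	return env->fadt_physical;
}

static const struct toy_apic toy_xen = { "xen", toy_xen_match };
static const struct toy_apic toy_physflat = { "physflat", toy_physflat_match };
static const struct toy_apic toy_flat = { "flat", NULL };	/* default, not probed */

/* Order matters: the Xen entry is consulted before anything else. */
static const struct toy_apic *const toy_probe_list[] = {
	&toy_xen, &toy_physflat, NULL
};

/* Return the first matching driver, or the default if none matches. */
static const struct toy_apic *toy_select(const struct toy_env *env)
{
	size_t i;

	assert(env != NULL);
	for (i = 0; toy_probe_list[i] != NULL; i++) {
		if (toy_probe_list[i]->match(env))
			return toy_probe_list[i];	/* first match wins */
	}
	return &toy_flat;
}
```

With is_xen_dom0 set, toy_select() returns the Xen entry regardless of
the FADT flag; without it, the selection falls through to physflat or
the default exactly as before.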

From a92e580fbb1ddae8aafed6360a105f274348d776 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Thu, 17 Mar 2011 14:17:52 -0400
Subject: [PATCH] xen/apic: Provide an 'apic_xen' to set the override the apic->[read|write] for all cases.

When we bootup we call 'set_xen_basic_apic_ops' which
sets apic->read to xen_apic_read. The default 'apic' is set to
apic_flat, so in essence we change apic_flat->read from native_apic_read
to xen_apic_read.

During bootup, default_acpi_madt_oem_check is run, which
walks through the whole apic_probe[] array, in which the last
entry is apic_physflat. And apic_physflat->probe() returns true
on this IBM Summit box (and ES7000 boxes, and whatever has the FADT
flag ACPI_FADT_APIC_PHYSICAL set), so we now set apic to apic_physflat
and apic->read ends up being native_apic_read.

2.6.38 fixes this by allowing set_fixmap_nocache(FIX_APIC_BASE, address)
to be called in acpi_register_lapic_address, so we can provide it with
a dummy page and native_apic_read can happily read from that fake page.

However, the 2.6.38 approach is not directly applicable here, as it
breaks non-IBM machines. The patch
"xen/ioapic: Allow set_fixmap to set FIX_APIC_BASE to dummy mapping."
(7cb068cf1ba90425e12f3a7b3caed9d018fa9b8c) tried this, and while it
worked for IBM Summit machines it broke all others. Moving the
memset to other areas of the code did not help either. The author
thinks that some extra back-ports must be needed to use that
mechanism.

This fix adds a 'struct apic' that is Xen specific. This 'apic_xen'
is the first item in apic_probe[] for both 32-bit and 64-bit systems.
As the first on the list, if it detects that it is running under Xen
it will short-circuit the iteration through apic_probe[], hence not
allowing us to set it to apic_flat (or bigsmp on 32-bit). We populate
'apic_xen' with the default values from 'apic' and set the members
to the Xen specific functions.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kernel/apic/probe_32.c |    4 ++++
 arch/x86/kernel/apic/probe_64.c |    4 ++++
 arch/x86/xen/enlighten.c        |   26 ++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c
index 88b9d22..798904d 100644
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -174,11 +174,15 @@ extern struct apic apic_summit;
 extern struct apic apic_bigsmp;
 extern struct apic apic_es7000;
 extern struct apic apic_es7000_cluster;
+extern struct apic apic_xen;
 
 struct apic *apic = &apic_default;
 EXPORT_SYMBOL_GPL(apic);
 
 static struct apic *apic_probe[] __initdata = {
+#ifdef CONFIG_XEN
+	&apic_xen,
+#endif
 #ifdef CONFIG_X86_NUMAQ
 	&apic_numaq,
 #endif
diff --git a/arch/x86/kernel/apic/probe_64.c b/arch/x86/kernel/apic/probe_64.c
index 4c56f54..5ab12a4 100644
--- a/arch/x86/kernel/apic/probe_64.c
+++ b/arch/x86/kernel/apic/probe_64.c
@@ -28,11 +28,15 @@ extern struct apic apic_physflat;
 extern struct apic apic_x2xpic_uv_x;
 extern struct apic apic_x2apic_phys;
 extern struct apic apic_x2apic_cluster;
+extern struct apic apic_xen;
 
 struct apic __read_mostly *apic = &apic_flat;
 EXPORT_SYMBOL_GPL(apic);
 
 static struct apic *apic_probe[] __initdata = {
+#ifdef CONFIG_XEN
+	&apic_xen,
+#endif
 #ifdef CONFIG_X86_UV
 	&apic_x2apic_uv_x,
 #endif
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 070f138..c809938 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -750,6 +750,27 @@ static u32 xen_safe_apic_wait_icr_idle(void)
         return 0;
 }
 
+static __init int xen_safe_flat_acpi_madt_oem_check(char *oem_id,
+						    char *oem_table_id)
+{
+	if (!xen_initial_domain())
+		return 0;
+
+	return 1;
+}
+
+static __init int xen_safe_probe(void) {
+
+	if (!xen_initial_domain())
+		return 0;
+
+	return 1;
+}
+
+struct apic apic_xen = {
+	.name	= "xen",
+};
+
 static __init void set_xen_basic_apic_ops(void)
 {
 	apic->read = xen_apic_read;
@@ -758,6 +779,11 @@ static __init void set_xen_basic_apic_ops(void)
 	apic->icr_write = xen_apic_icr_write;
 	apic->wait_icr_idle = xen_apic_wait_icr_idle;
 	apic->safe_wait_icr_idle = xen_safe_apic_wait_icr_idle;
+	apic->probe = xen_safe_probe;
+	apic->acpi_madt_oem_check  = xen_safe_flat_acpi_madt_oem_check;
+	/* Copy over the full contents of the newly modified apic into
+	 * our apic_xen, which is to be called first by apic_probe[]. */
+	memcpy(&apic_xen, apic, sizeof(struct apic));
 }
 
 #endif
-- 
1.7.1


* Re: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
  2011-03-16 22:19 L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Konrad Rzeszutek Wilk
  2011-03-16 22:32 ` Keir Fraser
  2011-03-17 10:25 ` Jan Beulich
@ 2011-03-22 13:10 ` Gianni Tedesco
  2 siblings, 0 replies; 10+ messages in thread
From: Gianni Tedesco @ 2011-03-22 13:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com,
	andrew.thomas@oracle.com, Ian Campbell, keir.xen@gmail.com,
	swente@infinitumb.de

On Wed, 2011-03-16 at 22:19 +0000, Konrad Rzeszutek Wilk wrote:
> I am troubleshooting an issue where the Linux kernel tries
> to dereference a not present entry. I have a fix for this
> in for-2.6.32/bug-fixes .. but please read on.

I'll give it a shot, I'll try anything at this point ;P

> Specifically it tries to dereference the fixmapped value of
> APIC_BASE. The fixmapped value of APIC_BASE is actually not set
> due to git commit a1d8e2fa8325064338b2da1bcf0d7a0473883c284
> which adds this in arch/x86/kernel/acpi/boot.c:
> 
> static void __init acpi_register_lapic_address(unsigned long address)
>  {
>         /* Xen dom0 doesn't have usable lapics */
>        if (xen_initial_domain())
>              return;
>  
>         mp_lapic_addr = address;
> 
> 	set_fixmap_nocache(FIX_APIC_BASE, address);
> 
> Later on we use 'native_apic_read' which tries to use the APIC_BASE as
> address (it is present to be @ slot FIX_APIC_BASE of the fixmap
> API) and it fails (on some machines).
> 
> Since we don't call 'set_fixmap_nocache(FIX_APIC_BASE)' and 
> if one were to go through the pagetable this is what we get:
> 
> 
> [    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
> [    0.000000] mapped APIC to ffffffffff5fb000 (00000000)
> (XEN) d0:v0: unhandled page fault (ec=0000)
> (XEN) Pagetable walk from ffffffffff5fb020:
> (XEN)  L4[0x1ff] = 0000000221003067 0000000000001003
> (XEN)  L3[0x1ff] = 0000000221004067 0000000000001004
> (XEN)  L2[0x1fa] = 0000000221771067 0000000000001771 
> (XEN)  L1[0x1fb] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.1-110309  x86_64  debug=y  Tainted:    C ]----
> (XEN) CPU:    0
> (XEN) RIP:    e033:[<ffffffff8102b5d1>]
> (XEN) RFLAGS: 0000000000000292   EM: 1   CONTEXT: pv guest
> (XEN) rax: ffffffff8164cf50   rbx: 000000026ec00000   rcx: 00000000ffffdd85
> (XEN) rdx: 00000000ffffffff   rsi: 0000000000000000   rdi: 0000000000000020
> (XEN) rbp: ffffffff81643ea8   rsp: ffffffff81643e50   r8:  0000000000000002
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: ffff880013671800   r13: 00000000bff66000   r14: ffffffffffffffff
> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
> (XEN) cr3: 0000000221001000   cr2: ffffffffff5fb020
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
> (XEN) Guest stack trace from rsp=ffffffff81643e50:
> 
> Which is to say that the L1 has this:
> 0000000115771fa0:  00000000 00000000 00000000 00000000
> 0000000115771fb0:  00000000 00000000 00000000 00000000
> 0000000115771fc0:  00000000 00000000 15770067 80100001
> 0000000115771fd0:  15770067 80100001 00000000 00000000
> 0000000115771fe0:  00000000 00000000 00000000 00000000
> 0000000115771ff0:  00000000 00000000 00000000 00000000
> 
> L1[0x1fb] is machine address 115771fd8, which has nothing in it.
> 
> OK, so I've come up with a fix that is a back-port of how 2.6.38 does it
> which is that it removes the check I mentioned above and in xen_set_fixmap
> we set the FIX_APIC_BASE to actually point to a dummy ioapic_mapping. 
> It is 7cb068cf1ba90425e12f3a7b3caed9d018fa9b8c in for-2.6.32/bug-fixes
> 
> Gianni, you might want to check this out in case it fixes the problem you
> are experiencing.

Not sure, mine happens a lot earlier, sort of just after the very early
memory initialisation. Also we're nowhere near trying to use APIC
anything as an address afaict - just trying to reach the xen info page.

The last thing I see is:
[    0.000000] kernel direct mapping tables up to 2f000000 @ 100000-27a000
[    0.000000] init_memory_mapping: 0000000100000000-00000002a7000000


> But one thing I can't understand is why on one machine (IBM x3850)
> I get this crash, while another one with the same pagetable contents
> (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> the Xen hypervisor kdb to manually inspect the pagetable, and it has
> the same contents as the IBM x3850 -but it boots fine with this invalid value.
> Any ideas?

A missing TLB flush? heh

> 
> FYI, seems another user (Sven Sübert) IBM x3650 hits the same bug. And with
> this fix he is able to boot.

Very odd, if this isn't the bug I'm seeing it might be tangentially
related.

I'll let you know

Gianni
