xen-devel.lists.xenproject.org archive mirror
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: xen-devel@lists.xensource.com, gianni.tedesco@citrix.com,
	andrew.thomas@oracle.com, Jeremy Fitzhardinge <jeremy@goop.org>,
	Ian Campbell <Ian.Campbell@eu.citrix.com>,
	keir.xen@gmail.com
Cc: swente@infinitumb.de
Subject: L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
Date: Wed, 16 Mar 2011 18:19:12 -0400
Message-ID: <20110316221912.GA13035@dumpdata.com>

I am troubleshooting an issue where the Linux kernel tries
to dereference a not-present pagetable entry. I have a fix for this
in for-2.6.32/bug-fixes, but please read on.

Specifically, it tries to dereference the fixmapped value of
APIC_BASE. The fixmapped value of APIC_BASE is actually not set
due to git commit a1d8e2fa8325064338b2da1bcf0d7a0473883c284
which adds this in arch/x86/kernel/acpi/boot.c:

static void __init acpi_register_lapic_address(unsigned long address)
{
	/* Xen dom0 doesn't have usable lapics */
	if (xen_initial_domain())
		return;

	mp_lapic_addr = address;

	set_fixmap_nocache(FIX_APIC_BASE, address);

Later on we use 'native_apic_read', which uses APIC_BASE as the
address (it is presumed to be mapped at slot FIX_APIC_BASE of the
fixmap), and it fails (on some machines).
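
To make the failing access concrete, here is a minimal sketch of how the
address of such a read is formed. It assumes the fixmap slot address printed
in the boot log further down ("mapped APIC to ffffffffff5fb000") and that the
register being read is the local APIC ID register at offset 0x20, which matches
rdi in the register dump. The helper name apic_read_address is hypothetical
and is not a kernel function.

```c
#include <stdint.h>

/* Assumption: virtual address of the FIX_APIC_BASE slot, from the boot log. */
#define FIXMAP_APIC_BASE_VA 0xffffffffff5fb000ULL

/* Local APIC ID register offset (APIC_ID in the kernel's apicdef.h). */
#define APIC_ID_REG 0x20u

/*
 * Hypothetical helper: the virtual address that a memory-mapped read of
 * APIC register 'reg' would touch, i.e. fixmap base plus register offset.
 */
static uint64_t apic_read_address(uint32_t reg)
{
	return FIXMAP_APIC_BASE_VA + reg;
}
```

With these assumptions, apic_read_address(APIC_ID_REG) gives ffffffffff5fb020,
which is the address Xen reports in cr2 and at the start of the pagetable walk.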

Since we never call 'set_fixmap_nocache(FIX_APIC_BASE)', this is
what we get if we walk the pagetable:


[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] mapped APIC to ffffffffff5fb000 (00000000)
(XEN) d0:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffffffff5fb020:
(XEN)  L4[0x1ff] = 0000000221003067 0000000000001003
(XEN)  L3[0x1ff] = 0000000221004067 0000000000001004
(XEN)  L2[0x1fa] = 0000000221771067 0000000000001771 
(XEN)  L1[0x1fb] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1-110309  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8102b5d1>]
(XEN) RFLAGS: 0000000000000292   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff8164cf50   rbx: 000000026ec00000   rcx: 00000000ffffdd85
(XEN) rdx: 00000000ffffffff   rsi: 0000000000000000   rdi: 0000000000000020
(XEN) rbp: ffffffff81643ea8   rsp: ffffffff81643e50   r8:  0000000000000002
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff880013671800   r13: 00000000bff66000   r14: ffffffffffffffff
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000221001000   cr2: ffffffffff5fb020
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81643e50:

Which is to say that the L1 has this:
0000000115771fa0:  00000000 00000000 00000000 00000000
0000000115771fb0:  00000000 00000000 00000000 00000000
0000000115771fc0:  00000000 00000000 15770067 80100001
0000000115771fd0:  15770067 80100001 00000000 00000000
0000000115771fe0:  00000000 00000000 00000000 00000000
0000000115771ff0:  00000000 00000000 00000000 00000000

L1[0x1fb] is machine address 115771fd8, which has nothing in it.
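
The indexes in the walk and the 115771fd8 figure can be re-derived by hand.
The sketch below assumes standard x86_64 4-level paging (9 index bits per
level, 4 KiB pages, 8-byte entries, bit 0 as the present bit) and takes the L1
table base 0x115771000 from the hex dump above. The helper names are
hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define PT_INDEX_MASK 0x1ffULL /* 512 entries per table level */
#define PTE_SIZE      8ULL     /* each pagetable entry is 8 bytes */
#define PTE_PRESENT   0x1ULL   /* bit 0: entry is present */

/* Index into the table at 'level' (1 = L1 ... 4 = L4) for address 'va'. */
static unsigned pt_index(uint64_t va, int level)
{
	return (unsigned)((va >> (12 + 9 * (level - 1))) & PT_INDEX_MASK);
}

/* Machine address of entry 'index' in the table page at 'table_base'. */
static uint64_t pte_address(uint64_t table_base, unsigned index)
{
	return table_base + index * PTE_SIZE;
}

/* Non-zero if the present bit is set in the entry. */
static int pte_present(uint64_t pte)
{
	return (pte & PTE_PRESENT) != 0;
}
```

For the faulting address ffffffffff5fb020, pt_index yields 0x1ff, 0x1ff, 0x1fa
and 0x1fb for levels 4 down to 1, matching the walk Xen printed.
pte_address(0x115771000, 0x1fb) is 115771fd8, and pte_present(0) is false,
so any access through that slot takes a page fault.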

OK, so I've come up with a fix that is a back-port of how 2.6.38 does it:
it removes the check I mentioned above, and in xen_set_fixmap
we set FIX_APIC_BASE to actually point to a dummy ioapic_mapping.
It is 7cb068cf1ba90425e12f3a7b3caed9d018fa9b8c in for-2.6.32/bug-fixes.

Gianni, you might want to check this out in case it fixes the problem you
are experiencing.

But one thing I can't understand is why I get this crash on one machine
(IBM x3850), while on another one with the same pagetable contents
(L1 has nothing for 0x1fb) it works just fine. I added a panic and used
the Xen hypervisor kdb to manually inspect the pagetable, and it has
the same contents as on the IBM x3850, but it boots fine with this
invalid value. Any ideas?


FYI, it seems another user (Sven Sübert) with an IBM x3650 hits the same
bug, and with this fix he is able to boot.


Thread overview: 10+ messages
2011-03-16 22:19 Konrad Rzeszutek Wilk [this message]
2011-03-16 22:32 ` L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Keir Fraser
2011-03-17 10:25 ` Jan Beulich
2011-03-17 15:52   ` Konrad Rzeszutek Wilk
2011-03-17 16:12     ` Konrad Rzeszutek Wilk
2011-03-17 16:12     ` Jan Beulich
2011-03-17 16:41       ` Konrad Rzeszutek Wilk
2011-03-17 17:21         ` Jeremy Fitzhardinge
2011-03-17 19:56         ` [PATCH] xen/apic: Provide an 'apic_xen' to set the override the apic->[read|write] for all cases Konrad Rzeszutek Wilk
2011-03-22 13:10 ` L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works? Gianni Tedesco
