xen-devel.lists.xenproject.org archive mirror
* ACPI-Tables corrupted?
@ 2010-07-28  9:38 Juergen Gross
  2010-07-28 10:03 ` Keir Fraser
  0 siblings, 1 reply; 23+ messages in thread
From: Juergen Gross @ 2010-07-28  9:38 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Hi,

On a Nehalem system with VT-d enabled we are seeing strange ACPI table
contents, in particular a corrupted DMAR entry.

The hypervisor shows the following data on boot:
(XEN) ACPI: RSDP 000F80E0, 0024 (r2 PTLTD )
(XEN) ACPI: XSDT BF7C469E, 00D4 (r1 PTLTD  	 XSDT      60000  LTP        0)
(XEN) ACPI: FACP BF7C9CC9, 00F4 (r3 FSC    TYLERBRG    60000 PTL     F4240)
(XEN) ACPI: DSDT BF7C4772, 54D3 (r1 FSC    D2619       60000 MSFT  3000001)
(XEN) ACPI: FACS BF7CBFC0, 0040
(XEN) ACPI: TCPA BF7C9DBD, 0032 (r1 Phoeni  x          60000  TL         0)
(XEN) ACPI: SLIT BF7C9DEF, 0030 (r1 FSC                60000            5A)
(XEN) ACPI: EINJ BF7C9E1F, 01B0 (r1 PTL    WHEAPTL     60000 PTL         1)
(XEN) ACPI: HEST BF7C9FCF, 0268 (r1 PTL    WHEAPTL     60000 PTL         1)
(XEN) ACPI: BERT BF7CA237, 0030 (r1 PTL    WHEAPTL     60000 PTL         1)
(XEN) ACPI: SSDT BF7CA267, 00E1 (r1 wheaos  wheaosc    60000 INTL 20050624)
(XEN) ACPI: ERST BF7CA348, 0270 (r1 PTL    WHEAPTL     60000 PTL         1)
(XEN) ACPI: SSDT BF7CA5B8, 009E (r1 FSC    CST_PR00    60000  CSF        1)
(XEN) ACPI: SSDT BF7CA656, 009E (r1 FSC    CST_PR02    60000  CSF        1)
(XEN) ACPI: SSDT BF7CA6F4, 009E (r1 FSC    CST_PR04    60000  CSF        1)
(XEN) ACPI: SSDT BF7CA792, 009E (r1 FSC    CST_PR06    60000  CSF        1)
(XEN) ACPI: SSDT BF7CA830, 015B (r1 FSC    PST_PR00    60000  CSF        1)
(XEN) ACPI: SSDT BF7CA98B, 015B (r1 FSC    PST_PR02    60000  CSF        1)
(XEN) ACPI: SSDT BF7CAAE6, 015B (r1 FSC    PST_PR04    60000  CSF        1)
(XEN) ACPI: SSDT BF7CAC41, 015B (r1 FSC    PST_PR06    60000  CSF        1)
(XEN) ACPI: SPCR BF7CAD9C, 0050 (r1 PTLTD  $UCRTBL$    60000 PTL         1)
(XEN) ACPI: DMAR BF7CADEC, 00E8 (r1 Intel  OEMDMAR     60000 LOHR        1)
(XEN) ACPI: MCFG BF7CAED4, 003C (r1 PTLTD    MCFG      60000  LTP        0)
(XEN) ACPI: HPET BF7CAF10, 0038 (r1 PTLTD  HPETTBL     60000  LTP        1)
(XEN) ACPI: APIC BF7CAF48, 0090 (r1 PTLTD  	 APIC      60000  LTP        0)
(XEN) ACPI: BOOT BF7CAFD8, 0028 (r1 PTLTD  $SBFTBL$    60000  LTP        1)

The dom0 kernel prints the following:
[    8.193909] ACPI: RSDP 00000000000f80e0 00024 (v02 PTLTD )
[    8.193921] ACPI: XSDT 00000000bf7c469e 000D4 (v01 PTLTD  ? XSDT   00060000  LTP 00000000)
[    8.193929] ACPI: FACP 00000000bf7c9cc9 000F4 (v03 FSC    TYLERBRG 00060000 PTL  000F4240)
[    8.193937] ACPI: DSDT 00000000bf7c4772 054D3 (v01 FSC    D2619    00060000 MSFT 03000001)
[    8.193941] ACPI: FACS 00000000bf7cbfc0 00040
[    8.193945] ACPI: TCPA 00000000bf7c9dbd 00032 (v01 Phoeni  x       00060000  TL  00000000)
[    8.193950] ACPI: SLIT 00000000bf7c9def 00030 (v01 FSC             00060000      0000005A)
[    8.193955] ACPI: EINJ 00000000bf7c9e1f 001B0 (v01 PTL    WHEAPTL  00060000 PTL  00000001)
[    8.193960] ACPI: HEST 00000000bf7c9fcf 00268 (v01 PTL    WHEAPTL  00060000 PTL  00000001)
[    8.193964] ACPI: BERT 00000000bf7ca237 00030 (v01 PTL    WHEAPTL  00060000 PTL  00000001)
[    8.193969] ACPI: SSDT 00000000bf7ca267 000E1 (v01 wheaos  wheaosc 00060000 INTL 20050624)
[    8.193974] ACPI: ERST 00000000bf7ca348 00270 (v01 PTL    WHEAPTL  00060000 PTL  00000001)
[    8.193979] ACPI: SSDT 00000000bf7ca5b8 0009E (v01 FSC    CST_PR00 00060000  CSF 00000001)
[    8.193983] ACPI: SSDT 00000000bf7ca656 0009E (v01 FSC    CST_PR02 00060000  CSF 00000001)
[    8.193988] ACPI: SSDT 00000000bf7ca6f4 0009E (v01 FSC    CST_PR04 00060000  CSF 00000001)
[    8.193993] ACPI: SSDT 00000000bf7ca792 0009E (v01 FSC    CST_PR06 00060000  CSF 00000001)
[    8.193997] ACPI: SSDT 00000000bf7ca830 0015B (v01 FSC    PST_PR00 00060000  CSF 00000001)
[    8.194002] ACPI: SSDT 00000000bf7ca98b 0015B (v01 FSC    PST_PR02 00060000  CSF 00000001)
[    8.194007] ACPI: SSDT 00000000bf7caae6 0015B (v01 FSC    PST_PR04 00060000  CSF 00000001)
[    8.194011] ACPI: SSDT 00000000bf7cac41 0015B (v01 FSC    PST_PR06 00060000  CSF 00000001)
[    8.194016] ACPI: SPCR 00000000bf7cad9c 00050 (v01 PTLTD  $UCRTBL$ 00060000 PTL  00000001)
[    8.194021] ACPI:      00000000bf7cadec 000E8 (v01 Intel  OEMDMAR  00060000 LOHR 00000001)
[    8.194025] ACPI: MCFG 00000000bf7caed4 0003C (v01 PTLTD    MCFG   00060000  LTP 00000000)
[    8.194030] ACPI: HPET 00000000bf7caf10 00038 (v01 PTLTD  HPETTBL  00060000  LTP 00000001)
[    8.194035] ACPI: APIC 00000000bf7caf48 00090 (v01 PTLTD  ? APIC   00060000  LTP 00000000)
[    8.194039] ACPI: BOOT 00000000bf7cafd8 00028 (v01 PTLTD  $SBFTBL$ 00060000  LTP 00000001)

As you can see, the DMAR eye-catcher is replaced by blanks!
This leads to a programmed panic in the crash kernel later in case of a
panic in dom0...

Any ideas?
BTW: seen in unstable AND 4.0


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html


* Re: ACPI-Tables corrupted?
  2010-07-28  9:38 ACPI-Tables corrupted? Juergen Gross
@ 2010-07-28 10:03 ` Keir Fraser
  2010-07-28 11:26   ` Juergen Gross
  0 siblings, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-28 10:03 UTC (permalink / raw)
  To: Juergen Gross, xen-devel@lists.xensource.com

On 28/07/2010 10:38, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> As you can see, the DMAR eye-catcher is replaced by blanks!
> This leads to a programmed panic in the crash kernel later in case of a
> panic in dom0...
> 
> Any ideas?
> BTW: seen in unstable AND 4.0

Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
*unconditionally* trashes the DMAR so that dom0 will not parse it.
Presumably bad stuff would happen if it did.

See for example xen-unstable:20181 changeset comment for something of an
explanation and the patch originator.

 -- Keir
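
The effect described above can be illustrated with a minimal sketch. It is
not the code from dmar.c: struct table_hdr, zap_signature() and find_table()
are hypothetical names, the header is reduced to two fields, and blanks are
written only because the dom0 log shows an empty name (the thread does not
say which byte value Xen really stores). The table lengths in the usage note
are taken from the boot log.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an ACPI table header:
 * only the 4-byte signature and the length are modelled. */
struct table_hdr {
    char signature[4];      /* e.g. 'D','M','A','R', not NUL-terminated */
    unsigned int length;
};

/* Overwrite the signature in place so that later lookups by name miss it.
 * Assumption: blanks are used here to match what dom0 prints. */
static void zap_signature(struct table_hdr *t)
{
    memset(t->signature, ' ', sizeof(t->signature));
}

/* Linear search by signature, roughly what an OS does when it walks
 * the list of tables looking for "DMAR". */
static struct table_hdr *find_table(struct table_hdr *tabs, size_t n,
                                    const char *sig)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(tabs[i].signature, sig, 4) == 0)
            return &tabs[i];
    return NULL;
}
```

With two entries (SPCR of length 0x50, DMAR of length 0xE8),
find_table(tabs, 2, "DMAR") finds the second entry before zap_signature()
and returns NULL afterwards, while the rest of the table, including its
length, is untouched. Any later consumer of the same memory, such as a crash
kernel, sees the zapped version too.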


* Re: ACPI-Tables corrupted?
  2010-07-28 10:03 ` Keir Fraser
@ 2010-07-28 11:26   ` Juergen Gross
  2010-07-28 11:39     ` Keir Fraser
  0 siblings, 1 reply; 23+ messages in thread
From: Juergen Gross @ 2010-07-28 11:26 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On 07/28/2010 12:03 PM, Keir Fraser wrote:
> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>
>> As you can see, the DMAR eye-catcher is replaced by blanks!
>> This leads to a programmed panic in the crash kernel later in case of a
>> panic in dom0...
>>
>> Any ideas?
>> BTW: seen in unstable AND 4.0
>
> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
> *unconditionally* trashes the DMAR so that dom0 will not parse it.
> Presumably bad stuff would happen if it did.

As Dom0 is a pv-kernel, it should be able to ignore this entry.
The crash kernel OTOH should not panic due to the trashed entry!
What is the correct solution here?

The crash kernel expects a valid DMAR entry, as the following code in
enable_IR_x2apic() suggests:

         /* IR is required if there is APIC ID > 255 even when running
          * under KVM
          */
         if (max_physical_apicid > 255 || !kvm_para_available()) {
             if (max_physical_apicid > 255) {
                 pr_warning("NTR enable_IR_x2apic max_physical_apicid > 255\n");
             }
             if (!kvm_para_available()) {
                 pr_warning("NTR enable_IR_x2apic !kvm_para_available()\n");
             }
             goto nox2apic;
         }

(kernel is 2.6.32.12 from Novell SLES11 SP1, the pr_warnings were added to
find the error path - it was the !kvm_para_available())

Looking closer at the code raises some doubts about its correctness. The
comment at the top does not seem to be reflected by the following if...


Juergen



* Re: ACPI-Tables corrupted?
  2010-07-28 11:26   ` Juergen Gross
@ 2010-07-28 11:39     ` Keir Fraser
  2010-07-28 12:13       ` Juergen Gross
  0 siblings, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-28 11:39 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com




On 28/07/2010 12:26, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> On 07/28/2010 12:03 PM, Keir Fraser wrote:
>> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>> 
>>> As you can see, the DMAR eye-catcher is replaced by blanks!
>>> This leads to a programmed panic in the crash kernel later in case of a
>>> panic in dom0...
>>> 
>>> Any ideas?
>>> BTW: seen in unstable AND 4.0
>> 
>> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
>> *unconditionally* trashes the DMAR so that dom0 will not parse it.
>> Presumably bad stuff would happen if it did.
> 
> As Dom0 is a pv-kernel, it should be able to ignore this entry.
> The crash kernel OTOH should not panic due to the trashed entry!
> What is the correct solution here?

Could provide a cmdline option to not nobble the DMAR?

> The crash kernel expects a valid DMAR entry, as following code in
> enable_IR_x2apic() suggests:

I don't know what that function does, nor how the error path below depends
on DMAR. DMAR isn't mentioned in the below code.

 K.

>          /* IR is required if there is APIC ID > 255 even when running
>           * under KVM
>           */
>          if (max_physical_apicid > 255 || !kvm_para_available()) {
>              if (max_physical_apicid > 255) {
>                  pr_warning("NTR enable_IR_x2apic max_physical_apicid > 255\n");
>              }
>              if (!kvm_para_available()) {
>                  pr_warning("NTR enable_IR_x2apic !kvm_para_available()\n");
>              }
>              goto nox2apic;
>          }
> 
> (kernel is 2.6.32.12 from Novell SLES11 SP1, the pr_warnings were added to
> find the error path - it was the !kvm_para_available())
> 
> Looking closer to the code rises some doubts about correctness. The comment
> at top seems not to be reflected by the following if...
> 
> 
> Juergen


* Re: ACPI-Tables corrupted?
  2010-07-28 11:39     ` Keir Fraser
@ 2010-07-28 12:13       ` Juergen Gross
  2010-07-28 12:45         ` Keir Fraser
  2010-07-29  6:31         ` Jiang, Yunhong
  0 siblings, 2 replies; 23+ messages in thread
From: Juergen Gross @ 2010-07-28 12:13 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On 07/28/2010 01:39 PM, Keir Fraser wrote:
>
>
>
> On 28/07/2010 12:26, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>
>> On 07/28/2010 12:03 PM, Keir Fraser wrote:
>>> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com>   wrote:
>>>
>>>> As you can see, the DMAR eye-catcher is replaced by blanks!
>>>> This leads to a programmed panic in the crash kernel later in case of a
>>>> panic in dom0...
>>>>
>>>> Any ideas?
>>>> BTW: seen in unstable AND 4.0
>>>
>>> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
>>> *unconditionally* trashes the DMAR so that dom0 will not parse it.
>>> Presumably bad stuff would happen if it did.
>>
>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>> The crash kernel OTOH should not panic due to the trashed entry!
>> What is the correct solution here?
>
> Could provide a cmdline option to not nobble the DMAR?

That's a possibility.
I wonder whether it wouldn't be better to let dom0 decide not to use it if
running under xen. This would remove the requirement for zapping the ACPI
table. IMO it's always a bad idea to change data of a deeper layer...

>
>> The crash kernel expects a valid DMAR entry, as following code in
>> enable_IR_x2apic() suggests:
>
> I don't know what that function does, nor how the error path below depends
> on DMAR. DMAR isn't mentioned in the below code.

Sorry, here is a larger fragment (source: arch/x86/kernel/apic/apic.c):

void __init enable_IR_x2apic(void)
{
         unsigned long flags;
         struct IO_APIC_route_entry **ioapic_entries = NULL;
         int ret, x2apic_enabled = 0;
         int dmar_table_init_ret = 0;

#ifdef CONFIG_INTR_REMAP
         dmar_table_init_ret = dmar_table_init();
         if (dmar_table_init_ret)
                 pr_debug("dmar_table_init() failed with %d:\n",
                                 dmar_table_init_ret);
#endif

         ioapic_entries = alloc_ioapic_entries();
         if (!ioapic_entries) {
                 pr_err("Allocate ioapic_entries failed\n");
                 goto out;
         }

         ret = save_IO_APIC_setup(ioapic_entries);
         if (ret) {
                 pr_info("Saving IO-APIC state failed: %d\n", ret);
                 goto out;
         }

         local_irq_save(flags);
         mask_8259A();
         mask_IO_APIC_setup(ioapic_entries);

         if (dmar_table_init_ret)
                 ret = 0;
         else
                 ret = enable_IR();

         if (!ret) {
                 /* IR is required if there is APIC ID > 255 even when running
                  * under KVM
                  */
                 if (max_physical_apicid > 255 || !kvm_para_available())
                         goto nox2apic;
                 /*
                  * without IR all CPUs can be addressed by IOAPIC/MSI
                  * only in physical mode
                  */
                 x2apic_force_phys();
         }

         x2apic_enabled = 1;

         if (x2apic_supported() && !x2apic_mode) {
                 x2apic_mode = 1;
                 enable_x2apic();
                 pr_info("Enabled x2apic\n");
         }

nox2apic:
         if (!ret) /* IR enabling failed */
                 restore_IO_APIC_setup(ioapic_entries);
         unmask_8259A();
         local_irq_restore(flags);

out:
         if (ioapic_entries)
                 free_ioapic_entries(ioapic_entries);

         if (x2apic_enabled)
                 return;

         if (x2apic_preenabled)
                 panic("x2apic: enabled by BIOS but kernel init failed.");
         else if (cpu_has_x2apic)
                 pr_info("Not enabling x2apic, Intr-remapping init failed.\n");
}


dmar_table_init() will return -ENODEV if no DMAR record is found.
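
The path to the panic can be traced with a small stand-alone model of the
decision logic in the fragment above. This is a sketch, not kernel code:
struct boot_state, enum outcome and model_enable_IR_x2apic() are hypothetical
names, the IO-APIC allocation and save steps are assumed to succeed, and the
cpu_has_x2apic message is folded into a plain "skipped" result.

```c
#include <stdbool.h>

enum outcome { X2APIC_ENABLED, X2APIC_SKIPPED, X2APIC_PANIC };

/* Hypothetical inputs standing in for the kernel state the function reads. */
struct boot_state {
    int          dmar_table_init_ret;  /* 0 on success, -ENODEV without DMAR */
    bool         enable_ir_ok;         /* would enable_IR() succeed? */
    unsigned int max_physical_apicid;
    bool         kvm_para_available;
    bool         x2apic_preenabled;    /* x2apic already on when we booted */
};

static enum outcome model_enable_IR_x2apic(const struct boot_state *s)
{
    /* Interrupt remapping is only attempted if the DMAR was parsed. */
    bool ir_on = (s->dmar_table_init_ret == 0) && s->enable_ir_ok;
    bool x2apic_enabled = true;

    /* Without IR, the quoted "if" takes the nox2apic exit unless the
     * kernel is a KVM guest with all APIC IDs <= 255. */
    if (!ir_on && (s->max_physical_apicid > 255 || !s->kvm_para_available))
        x2apic_enabled = false;

    if (x2apic_enabled)
        return X2APIC_ENABLED;
    if (s->x2apic_preenabled)
        return X2APIC_PANIC;   /* "enabled by BIOS but kernel init failed" */
    return X2APIC_SKIPPED;
}
```

Under these assumptions, a kernel that finds no DMAR (dmar_table_init_ret
non-zero), is not a KVM guest, and starts with x2apic already enabled ends in
the panic branch. The same inputs without pre-enabled x2apic merely skip
x2apic, which would explain why a DMAR-less machine normally boots.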


Juergen



* Re: ACPI-Tables corrupted?
  2010-07-28 12:13       ` Juergen Gross
@ 2010-07-28 12:45         ` Keir Fraser
  2010-07-28 13:27           ` Juergen Gross
  2010-08-06 13:39           ` Jan Beulich
  2010-07-29  6:31         ` Jiang, Yunhong
  1 sibling, 2 replies; 23+ messages in thread
From: Keir Fraser @ 2010-07-28 12:45 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 28/07/2010 13:13, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>> The crash kernel OTOH should not panic due to the trashed entry!
>>> What is the correct solution here?
>> 
>> Could provide a cmdline option to not nobble the DMAR?
> 
> That's a possibility.
> I wonder whether it wouldn't be better to let dom0 decide not to use it if
> running under xen. This would remove the requirement for zapping the ACPI
> table. IMO it's always a bad idea to change data of a deeper layer...

If we don't zap the DMAR then every existing dom0 kernel will fail with new
hypervisor. We could gate it on a new elfnote, or rename to XMAR and have
dom0 rename it back, or just have a flag day.
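
The rename idea could look roughly like the sketch below. It is only an
illustration under stated assumptions: rename_table() and table_sum() are
hypothetical helpers, and the table is treated as a raw byte buffer. One
detail not discussed in the thread is that the standard ACPI table header
carries a checksum byte (offset 9) chosen so that all bytes of the table sum
to zero modulo 256, so the sketch recomputes it after changing the signature.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_CHECKSUM_OFFSET 9  /* checksum byte in the standard table header */

/* Sum of all table bytes modulo 256; a valid ACPI table sums to 0. */
static uint8_t table_sum(const uint8_t *tab, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + tab[i]);
    return sum;
}

/* Replace the 4-byte signature at the start of the table and adjust the
 * checksum byte so that the table still sums to zero. */
static void rename_table(uint8_t *tab, size_t len, const char *new_sig)
{
    memcpy(tab, new_sig, 4);
    tab[ACPI_CHECKSUM_OFFSET] = 0;
    tab[ACPI_CHECKSUM_OFFSET] = (uint8_t)(0u - table_sum(tab, len));
}
```

In this model the hypervisor would call rename_table(tab, len, "XMAR")
before starting dom0, and a kernel that knows about the convention would call
rename_table(tab, len, "DMAR") to get the original name back. Kernels that
do not know the convention simply find no DMAR.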

>>> The crash kernel expects a valid DMAR entry, as following code in
>>> enable_IR_x2apic() suggests:
>> 
>> I don't know what that function does, nor how the error path below depends
>> on DMAR. DMAR isn't mentioned in the below code.
> 
> Sorry, here a larger fragment (source arch/x86/kernel/apic/apic.c):
>                  /* IR is required if there is APIC ID > 255 even when running
>                   * under KVM
>                   */
>                  if (max_physical_apicid > 255 || !kvm_para_available())
>                          goto nox2apic;

The if stmt is confusing. Also, what would happen if this kernel was booted
on a system without VT-d (and hence no DMAR)? Presumably it *can* boot in a
DMAR-less environment -- there must be something odd going on for it to end
on this path for us.

 -- Keir

>                  /*
>                   * without IR all CPUs can be addressed by IOAPIC/MSI
>                   * only in physical mode
>                   */
>                  x2apic_force_phys();
>          }
> 
>          x2apic_enabled = 1;
> 
>          if (x2apic_supported() && !x2apic_mode) {
>                  x2apic_mode = 1;
>                  enable_x2apic();
>                  pr_info("Enabled x2apic\n");
>          }
> 
> nox2apic:
>          if (!ret) /* IR enabling failed */
>                  restore_IO_APIC_setup(ioapic_entries);
>          unmask_8259A();
>          local_irq_restore(flags);
> 
> out:
>          if (ioapic_entries)
>                  free_ioapic_entries(ioapic_entries);
> 
>          if (x2apic_enabled)
>                  return;
> 
>          if (x2apic_preenabled)
>                  panic("x2apic: enabled by BIOS but kernel init failed.");
>          else if (cpu_has_x2apic)
>                  pr_info("Not enabling x2apic, Intr-remapping init failed.\n");
> }
> 
> 
> dmar_table_init() will return -ENODEV if no DMAR record is found.
> 
> 
> Juergen


* Re: ACPI-Tables corrupted?
  2010-07-28 12:45         ` Keir Fraser
@ 2010-07-28 13:27           ` Juergen Gross
  2010-07-28 13:36             ` Konrad Rzeszutek Wilk
  2010-08-06 13:39           ` Jan Beulich
  1 sibling, 1 reply; 23+ messages in thread
From: Juergen Gross @ 2010-07-28 13:27 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On 07/28/2010 02:45 PM, Keir Fraser wrote:
> On 28/07/2010 13:13, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>
>>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>>> The crash kernel OTOH should not panic due to the trashed entry!
>>>> What is the correct solution here?
>>>
>>> Could provide a cmdline option to not nobble the DMAR?
>>
>> That's a possibility.
>> I wonder whether it wouldn't be better to let dom0 decide not to use it if
>> running under xen. This would remove the requirement for zapping the ACPI
>> table. IMO it's always a bad idea to change data of a deeper layer...
>
> If we don't zap the DMAR then every existing dom0 kernel will fail with new
> hypervisor. We could gate it on a new elfnote, or rename to XMAR and have
> dom0 rename it back, or just have a flag day.

The really clean solution would be to virtualize the ACPI table for dom0 and
remove the DMAR entry in this version. This would require some major work, I
guess (clone at least the BIOS page containing the ACPI anchor and present
a modified version to dom0).
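
As a rough sketch of what "present a modified version" means, the table list
can be modelled as an array of pointers, and the dom0 view built as a
filtered copy that leaves the firmware data alone. struct sdt_hdr and
clone_without() are hypothetical names for illustration only; a real
implementation would also have to rebuild the RSDP/XSDT and map the copy
into dom0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table header (signature and length only). */
struct sdt_hdr {
    char signature[4];
    unsigned int length;
};

/* Build a second pointer list that omits every table whose signature
 * equals skip_sig. The originals are not written to. dst must have room
 * for n entries; the number of entries kept is returned. */
static size_t clone_without(struct sdt_hdr *const *src, size_t n,
                            struct sdt_hdr **dst, const char *skip_sig)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (memcmp(src[i]->signature, skip_sig, 4) != 0)
            dst[kept++] = src[i];
    return kept;
}
```

With a firmware list of SPCR, DMAR and MCFG, clone_without(firmware, 3,
dom0_view, "DMAR") yields a two-entry view for dom0, while the firmware list
and the DMAR signature stay intact for anything that reads the real tables
later.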

>
>>>> The crash kernel expects a valid DMAR entry, as following code in
>>>> enable_IR_x2apic() suggests:
>>>
>>> I don't know what that function does, nor how the error path below depends
>>> on DMAR. DMAR isn't mentioned in the below code.
>>
>> Sorry, here a larger fragment (source arch/x86/kernel/apic/apic.c):
>>                   /* IR is required if there is APIC ID > 255 even when running
>>                    * under KVM
>>                    */
>>                   if (max_physical_apicid > 255 || !kvm_para_available())
>>                           goto nox2apic;
>
> The if stmt is confusing. Also, what would happen if this kernel was booted
> on a system without VT-d (and hence no DMAR)? Presumably it *can* boot in a
> DMAR-less environment -- there must be something odd going on for it to end
> on this path for us.

Yeah, that puzzled me, too.


Juergen



* Re: ACPI-Tables corrupted?
  2010-07-28 13:27           ` Juergen Gross
@ 2010-07-28 13:36             ` Konrad Rzeszutek Wilk
  2010-07-29  6:19               ` Juergen Gross
  0 siblings, 1 reply; 23+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-28 13:36 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com, Keir Fraser

> The really clean solution would be to virtualize the ACPI table for dom0 and
> remove the DMAR entry in this version. This would require some major work, I
> guess (clone at least the BIOS page containing the ACPI anchor and present
> a modified version to dom0).

Well, that is what it does right now. It zeros it out so that the DMAR
entry is gone from the ACPI tables.

I am not really sure that having a DMAR accessible to Dom0 is good. You
would have two entities trying to write to the DMAR's to control the
IOMMU and the PCI devices. Does Xen enable the IOMMU? Do you see that in
the serial log?
> 
> >
> >>>>The crash kernel expects a valid DMAR entry, as following code in
> >>>>enable_IR_x2apic() suggests:
> >>>
> >>>I don't know what that function does, nor how the error path below depends
> >>>on DMAR. DMAR isn't mentioned in the below code.
> >>
> >>Sorry, here a larger fragment (source arch/x86/kernel/apic/apic.c):
> >>                  /* IR is required if there is APIC ID > 255 even when running
> >>                   * under KVM
> >>                   */
> >>                  if (max_physical_apicid > 255 || !kvm_para_available())
> >>                          goto nox2apic;
> >
> >The if stmt is confusing. Also, what would happen if this kernel was booted
> >on a system without VT-d (and hence no DMAR)? Presumably it *can* boot in a
> >DMAR-less environment -- there must be something odd going on for it to end
> >on this path for us.
> 
> Yeah, that puzzled me, too.

What is the crash? And do you see any indication that x2APIC is turned
on? Do provide a serial log please.


* Re: ACPI-Tables corrupted?
  2010-07-28 13:36             ` Konrad Rzeszutek Wilk
@ 2010-07-29  6:19               ` Juergen Gross
  2010-07-29  6:39                 ` Keir Fraser
  0 siblings, 1 reply; 23+ messages in thread
From: Juergen Gross @ 2010-07-29  6:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 3000 bytes --]

On 07/28/2010 03:36 PM, Konrad Rzeszutek Wilk wrote:
>> The really clean solution would be to virtualize the ACPI table for dom0 and
>> remove the DMAR entry in this version. This would require some major work, I
>> guess (clone at least the BIOS page containing the ACPI anchor and present
>> a modified version to dom0).
>
> Well, that is what it does right now. It zeros it out so that the DMAR
> entry is gone from the ACPI tables.

No. It changes the ORIGINAL ACPI table, not a copy of it.

>
> I am not really sure that having a DMAR accessible to Dom0 is good. You
> would have two entities trying to write to the DMAR's to control the
> IOMMU and the PCI devices. Does Xen enable the IOMMU? Do you see that in
> the serial log?

I don't want to let dom0 access DMAR. I want the crash kernel to be able to
access it.

And I think Xen does enable the IOMMU:

(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Intel machine check reporting enabled
(XEN) Intel VT-d Snoop Control supported.
(XEN) Intel VT-d DMA Passthrough not supported.
(XEN) Intel VT-d Queued Invalidation supported.
(XEN) Intel VT-d Interrupt Remapping supported.
(XEN) I/O virtualisation enabled
(XEN) I/O virtualisation for PV guests disabled
(XEN) x2APIC mode enabled.


>>
>>>
>>>>>> The crash kernel expects a valid DMAR entry, as following code in
>>>>>> enable_IR_x2apic() suggests:
>>>>>
>>>>> I don't know what that function does, nor how the error path below depends
>>>>> on DMAR. DMAR isn't mentioned in the below code.
>>>>
>>>> Sorry, here a larger fragment (source arch/x86/kernel/apic/apic.c):
>>>>                   /* IR is required if there is APIC ID > 255 even when running
>>>>                    * under KVM
>>>>                    */
>>>>                   if (max_physical_apicid > 255 || !kvm_para_available())
>>>>                           goto nox2apic;
>>>
>>> The if stmt is confusing. Also, what would happen if this kernel was booted
>>> on a system without VT-d (and hence no DMAR)? Presumably it *can* boot in a
>>> DMAR-less environment -- there must be something odd going on for it to end
>>> on this path for us.
>>
>> Yeah, that puzzled me, too.
>
> What is the crash? And do you see any indiciation that x2APIC is turned
> on? Do provide a serial log please.

Log is attached.

I did some more testing. The problem occurred on a Nehalem-EX system. I tried
the same on a Nehalem-EP system and all was okay. I suspect some further
problems in the ACPI tables of the EX system now. I'm not too familiar with
ACPI tables. Anything I can do for further analysis?


Juergen


[-- Attachment #2: crash.log --]
[-- Type: text/x-log, Size: 32574 bytes --]

(XEN) 'C' pressed -> triggering crashdump
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.32.12-0.7-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP 2010-05-20 11:14:20 +0200
[    0.000000] Command line: root=/dev/md11  swiotlb=128 reassigndev=all console=tty0 console=ttyS0,38400 xencons=ttyS0 elevator=deadline sysrq=1 reset_devices irqpoll maxcpus=1 kernelversion=2.6.32.12-0.7-xen noirqdebug memmap=exactmap memmap=640K@0K memmap=261484K@17024K elfcorehdr=278508K memmap=5992K#1968932K memmap=4624K#1974924K memmap=220K#1980024K memmap=80K#1980256K memmap=4K#1980352K memmap=136K#1980364K memmap=188K#1980632K memmap=3076K#1981080K memmap=2584K#1984348K memmap=2048K#1986932K memmap=1880K#1988980K memmap=88K#1990948K memmap=20K#1991092K memmap=32K#1991116K memmap=20K#1991156K memmap=4636K#1991180K memmap=32672K#1996252K memmap=1312K#2029884K memmap=224K#2031196K memmap=192K#2031420K
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Centaur CentaurHauls
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000100 - 000000000009b400 (usable)
[    0.000000]  BIOS-e820: 000000000009b400 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000782c9000 (usable)
[    0.000000]  BIOS-e820: 00000000782c9000 - 00000000788a3000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000788a3000 - 0000000078d27000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078d27000 - 0000000078d9e000 (reserved)
[    0.000000]  BIOS-e820: 0000000078d9e000 - 0000000078dd5000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078dd5000 - 0000000078dd8000 (reserved)
[    0.000000]  BIOS-e820: 0000000078dd8000 - 0000000078dec000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078dec000 - 0000000078df0000 (reserved)
[    0.000000]  BIOS-e820: 0000000078df0000 - 0000000078df1000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078df1000 - 0000000078df3000 (reserved)
[    0.000000]  BIOS-e820: 0000000078df3000 - 0000000078e15000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078e15000 - 0000000078e36000 (reserved)
[    0.000000]  BIOS-e820: 0000000078e36000 - 0000000078e65000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000078e65000 - 0000000078ea6000 (reserved)
[    0.000000]  BIOS-e820: 0000000078ea6000 - 00000000791a7000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000791a7000 - 00000000791d7000 (reserved)
[    0.000000]  BIOS-e820: 00000000791d7000 - 000000007945d000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007945d000 - 000000007965d000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000007965d000 - 0000000079833000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079833000 - 0000000079849000 (reserved)
[    0.000000]  BIOS-e820: 0000000079849000 - 000000007985f000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007985f000 - 000000007986d000 (reserved)
[    0.000000]  BIOS-e820: 000000007986d000 - 0000000079872000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079872000 - 0000000079873000 (reserved)
[    0.000000]  BIOS-e820: 0000000079873000 - 000000007987b000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007987b000 - 000000007987d000 (reserved)
[    0.000000]  BIOS-e820: 000000007987d000 - 0000000079882000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079882000 - 0000000079883000 (reserved)
[    0.000000]  BIOS-e820: 0000000079883000 - 0000000079d0a000 (ACPI data)
[    0.000000]  BIOS-e820: 0000000079d0a000 - 0000000079d77000 (reserved)
[    0.000000]  BIOS-e820: 0000000079d77000 - 000000007bd5f000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007bd5f000 - 000000007be4f000 (reserved)
[    0.000000]  BIOS-e820: 000000007be4f000 - 000000007bf97000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007bf97000 - 000000007bfcf000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000007bfcf000 - 000000007bfff000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007bfff000 - 0000000090000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fc000000 - 00000000fd000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec02000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec04000 - 00000000fec05000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed1c000 - 00000000fed20000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000004080000000 (usable)
[    0.000000] last_pfn = 0x4080000 max_arch_pfn = 0x400000000
[    0.000000] user-defined physical RAM map:
[    0.000000]  user: 0000000000000000 - 00000000000a0000 (usable)
[    0.000000]  user: 00000000010a0000 - 0000000010ffb000 (usable)
[    0.000000]  user: 00000000782c9000 - 0000000078d27000 (ACPI data)
[    0.000000]  user: 0000000078d9e000 - 0000000078dd5000 (ACPI data)
[    0.000000]  user: 0000000078dd8000 - 0000000078dec000 (ACPI data)
[    0.000000]  user: 0000000078df0000 - 0000000078df1000 (ACPI data)
[    0.000000]  user: 0000000078df3000 - 0000000078e15000 (ACPI data)
[    0.000000]  user: 0000000078e36000 - 0000000078e65000 (ACPI data)
[    0.000000]  user: 0000000078ea6000 - 00000000791a7000 (ACPI data)
[    0.000000]  user: 00000000791d7000 - 0000000079833000 (ACPI data)
[    0.000000]  user: 0000000079849000 - 000000007985f000 (ACPI data)
[    0.000000]  user: 000000007986d000 - 0000000079872000 (ACPI data)
[    0.000000]  user: 0000000079873000 - 000000007987b000 (ACPI data)
[    0.000000]  user: 000000007987d000 - 0000000079882000 (ACPI data)
[    0.000000]  user: 0000000079883000 - 0000000079d0a000 (ACPI data)
[    0.000000]  user: 0000000079d77000 - 000000007bd5f000 (ACPI data)
[    0.000000]  user: 000000007be4f000 - 000000007bfff000 (ACPI data)
[    0.000000] DMI 2.6 present.
[    0.000000] last_pfn = 0x10ffb max_arch_pfn = 0x400000000
[    0.000000] x86 PAT enabled: cpu 0, old 0x50100070406, new 0x7010600070106
[    0.000000] x2apic enabled by BIOS, switching to x2apic ops
[    0.000000] Scanning 1 areas for low memory corruption
[    0.000000] modified physical RAM map:
[    0.000000]  modified: 0000000000000000 - 0000000000001000 (usable)
[    0.000000]  modified: 0000000000001000 - 0000000000006000 (reserved)
[    0.000000]  modified: 0000000000006000 - 00000000000a0000 (usable)
[    0.000000]  modified: 00000000010a0000 - 0000000010ffb000 (usable)
[    0.000000]  modified: 00000000782c9000 - 0000000078d27000 (ACPI data)
[    0.000000]  modified: 0000000078d9e000 - 0000000078dd5000 (ACPI data)
[    0.000000]  modified: 0000000078dd8000 - 0000000078dec000 (ACPI data)
[    0.000000]  modified: 0000000078df0000 - 0000000078df1000 (ACPI data)
[    0.000000]  modified: 0000000078df3000 - 0000000078e15000 (ACPI data)
[    0.000000]  modified: 0000000078e36000 - 0000000078e65000 (ACPI data)
[    0.000000]  modified: 0000000078ea6000 - 00000000791a7000 (ACPI data)
[    0.000000]  modified: 00000000791d7000 - 0000000079833000 (ACPI data)
[    0.000000]  modified: 0000000079849000 - 000000007985f000 (ACPI data)
[    0.000000]  modified: 000000007986d000 - 0000000079872000 (ACPI data)
[    0.000000]  modified: 0000000079873000 - 000000007987b000 (ACPI data)
[    0.000000]  modified: 000000007987d000 - 0000000079882000 (ACPI data)
[    0.000000]  modified: 0000000079883000 - 0000000079d0a000 (ACPI data)
[    0.000000]  modified: 0000000079d77000 - 000000007bd5f000 (ACPI data)
[    0.000000]  modified: 000000007be4f000 - 000000007bfff000 (ACPI data)
[    0.000000] init_memory_mapping: 0000000000000000-0000000010ffb000
[    0.000000] RAMDISK: 10ab8000 - 10feea25
[    0.000000] ACPI: RSDP 00000000000f0410 00024 (v02 FSC   )
[    0.000000] ACPI: XSDT 000000007bffc120 000BC (v01 FSC    PC       00000000      01000013)
[    0.000000] ACPI: FACP 000000007bffb000 000F4 (v04 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: DSDT 000000007bfdf000 1B243 (v02 FSC    PC       00000003 MSFT 0100000D)
[    0.000000] ACPI: FACS 000000007bf97000 00040
[    0.000000] ACPI: APIC 000000007bfde000 003E4 (v02 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: MSCT 000000007bfdd000 00090 (v01 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] ACPI: MCFG 000000007bfdc000 0003C (v01 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] ACPI: HPET 000000007bfdb000 00038 (v01 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] ACPI: SLIT 000000007bfda000 0003C (v01 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] ACPI: SRAT 000000007bfd9000 00930 (v02 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] ACPI: SPCR 000000007bfd8000 00050 (v01 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: WDDT 000000007bfd7000 00040 (v01 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: SSDT 000000007bf5a000 3CFA4 (v02 FSC    PC       00004000 INTL 20090903)
[    0.000000] ACPI: SSDT 000000007bfd6000 00024 (v02 FSC    PC       00004000 INTL 20090903)
[    0.000000] ACPI: PMCT 000000007bfd5000 00060 (v01 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: MIGT 000000007bfd4000 00040 (v01 FSC    PC       00000000 MSFT 0100000D)
[    0.000000] ACPI: SLIC 000000007bfd3000 00176 (v01 FSC    PC       00000002      01000013)
[    0.000000] ACPI: HEST 000000007bfd2000 000A8 (v01 FSC    PC       00000001 INTL 00000001)
[    0.000000] ACPI: BERT 000000007bfd1000 00030 (v01 FSC    PC       00000001 INTL 00000001)
[    0.000000] ACPI: ERST 000000007bfd0000 00230 (v01 FSC    PC       00000001 INTL 00000001)
[    0.000000] ACPI: EINJ 000000007bfcf000 00130 (v01 FSC    PC       00000001 INTL 00000001)
[    0.000000] ACPI:      000000007bf59000 00308 (v01 FSC    PC       00000001 MSFT 0100000D)
[    0.000000] Setting APIC routing to cluster x2apic.
[    0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 64 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 32 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 96 -> Node 3
[    0.000000] SRAT: PXM 2 -> APIC 66 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 4 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 36 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 100 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 6 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 70 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 38 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 102 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 16 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 80 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 48 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 112 -> Node 3
[    0.000000] SRAT: PXM 0 -> APIC 18 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 50 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 114 -> Node 3
[    0.000000] SRAT: PXM 2 -> APIC 84 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 22 -> Node 0
[    0.000000] SRAT: PXM 2 -> APIC 86 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 54 -> Node 2
[    0.000000] SRAT: PXM 3 -> APIC 118 -> Node 3
[    0.000000] SRAT: Node 0 PXM 0 0-80000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-880000000
[    0.000000] SRAT: Node 0 PXM 0 880000000-1080000000
[    0.000000] SRAT: Node 2 PXM 1 1080000000-1880000000
[    0.000000] SRAT: Node 2 PXM 1 1880000000-2080000000
[    0.000000] SRAT: Node 1 PXM 2 2080000000-2880000000
[    0.000000] SRAT: Node 1 PXM 2 2880000000-3080000000
[    0.000000] SRAT: Node 3 PXM 3 3080000000-3880000000
[    0.000000] SRAT: Node 3 PXM 3 3880000000-4080000000
[    0.000000] SRAT: Node 0 PXM 0 4100000000-6100000000
[    0.000000] SRAT: hot plug zone found 4100000000 - 6100000000
[    0.000000] SRAT: Node 0 PXM 0 6100000000-8100000000
[    0.000000] SRAT: hot plug zone found 4100000000 - 8100000000
[    0.000000] SRAT: Node 2 PXM 1 8100000000-a100000000
[    0.000000] SRAT: hot plug zone found 8100000000 - a100000000
[    0.000000] SRAT: Node 2 PXM 1 a100000000-c100000000
[    0.000000] SRAT: hot plug zone found 8100000000 - c100000000
[    0.000000] SRAT: Node 1 PXM 2 c100000000-e100000000
[    0.000000] SRAT: hot plug zone found c100000000 - e100000000
[    0.000000] SRAT: Node 1 PXM 2 e100000000-10100000000
[    0.000000] SRAT: hot plug zone found c100000000 - 10100000000
[    0.000000] SRAT: Node 3 PXM 3 10100000000-12100000000
[    0.000000] SRAT: hot plug zone found 10100000000 - 12100000000
[    0.000000] SRAT: Node 3 PXM 3 12100000000-14100000000
[    0.000000] SRAT: hot plug zone found 10100000000 - 14100000000
[    0.000000] Bootmem setup node 0 0000000000000000-0000000010ffb000
[    0.000000]   NODE_DATA [0000000000009600 - 000000000003d5ff]
[    0.000000]   bootmap [000000000003e000 -  00000000000401ff] pages 3
[    0.000000] (9 early reservations) ==> bootmem [0000000000 - 0010ffb000]
[    0.000000]   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
[    0.000000]   #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
[    0.000000]   #2 [0002000000 - 0002cd7778]    TEXT DATA BSS ==> [0002000000 - 0002cd7778]
[    0.000000]   #3 [0010ab8000 - 0010feea25]          RAMDISK ==> [0010ab8000 - 0010feea25]
[    0.000000]   #4 [000009f000 - 0000100000]    BIOS reserved ==> [000009f000 - 0000100000]
[    0.000000]   #5 [0002cd8000 - 0002cd8521]              BRK ==> [0002cd8000 - 0002cd8521]
[    0.000000]   #6 [0000008000 - 0000009000]          PGTABLE ==> [0000008000 - 0000009000]
[    0.000000]   #7 [0000009000 - 000000903c]        ACPI SLIT ==> [0000009000 - 000000903c]
[    0.000000]   #8 [0000009080 - 0000009600]       MEMNODEMAP ==> [0000009080 - 0000009600]
[    0.000000] found SMP MP-table at [ffff8800000fde00] fde00
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000000 -> 0x00001000
[    0.000000]   DMA32    0x00001000 -> 0x00100000
[    0.000000]   Normal   0x00100000 -> 0x00100000
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[3] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00000001
[    0.000000]     0: 0x00000006 -> 0x000000a0
[    0.000000]     0: 0x000010a0 -> 0x00010ffb
[    0.000000] ACPI: PM-Timer IO Port: 0x408
[    0.000000] Setting APIC routing to cluster x2apic.
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x22] lapic_id[0x42] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x12] lapic_id[0x22] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x32] lapic_id[0x62] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x24] lapic_id[0x44] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x14] lapic_id[0x24] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x34] lapic_id[0x64] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x26] lapic_id[0x46] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x16] lapic_id[0x26] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x36] lapic_id[0x66] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x10] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x28] lapic_id[0x50] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x18] lapic_id[0x30] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x38] lapic_id[0x70] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x12] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2a] lapic_id[0x52] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1a] lapic_id[0x32] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3a] lapic_id[0x72] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x14] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2c] lapic_id[0x54] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x34] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3c] lapic_id[0x74] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x16] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2e] lapic_id[0x56] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1e] lapic_id[0x36] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3e] lapic_id[0x76] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x23] lapic_id[0x43] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x13] lapic_id[0x23] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x33] lapic_id[0x63] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x25] lapic_id[0x45] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x15] lapic_id[0x25] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x35] lapic_id[0x65] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x27] lapic_id[0x47] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x17] lapic_id[0x27] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x37] lapic_id[0x67] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x11] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x29] lapic_id[0x51] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x19] lapic_id[0x31] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x39] lapic_id[0x71] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x13] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2b] lapic_id[0x53] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x33] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3b] lapic_id[0x73] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x15] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2d] lapic_id[0x55] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x35] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3d] lapic_id[0x75] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x17] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x2f] lapic_id[0x57] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x37] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x3f] lapic_id[0x77] disabled)
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x03] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x04] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x05] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x06] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x07] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x08] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x09] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0a] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0b] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0c] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0d] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0e] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0f] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x10] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x11] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x12] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x13] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x14] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x15] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x16] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x17] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x18] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x19] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1a] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1b] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1c] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1d] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1e] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x1f] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x20] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x21] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x22] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x23] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x24] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x25] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x26] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x27] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x28] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x29] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2a] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2b] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2c] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2d] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2e] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x2f] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x30] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x31] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x32] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x33] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x34] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x35] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x36] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x37] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x38] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x39] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3a] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3b] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3c] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3d] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3e] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x3f] high level lint[0x1])
[    0.000000] ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x09] address[0xfec01000] gsi_base[24])
[    0.000000] IOAPIC[1]: apic_id 9, version 32, address 0xfec01000, GSI 24-47
[    0.000000] ACPI: IOAPIC (id[0x0a] address[0xfec04000] gsi_base[48])
[    0.000000] IOAPIC[2]: apic_id 10, version 32, address 0xfec04000, GSI 48-71
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a401 base: 0xfed00000
[    0.000000] SMP: Allowing 64 CPUs, 40 hotplug CPUs
[    0.000000] PM: Registered nosave memory: 0000000000001000 - 0000000000006000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000010a0000
[    0.000000] Allocating PCI resources starting at 7bfff000 (gap: 7bfff000:84001000)
[    0.000000] Booting paravirtualized kernel on bare hardware
[    0.000000] NR_CPUS:4096 nr_cpumask_bits:64 nr_cpu_ids:64 nr_node_ids:4
[    0.000000] PERCPU: Embedded 28 pages/cpu @ffff880002e00000 s82072 r8192 d24424 u131072
[    0.000000] pcpu-alloc: s82072 r8192 d24424 u131072 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 
[    0.000000] pcpu-alloc: [0] 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
[    0.000000] pcpu-alloc: [0] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
[    0.000000] pcpu-alloc: [0] 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 64574
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: root=/dev/md11  swiotlb=128 reassigndev=all console=tty0 console=ttyS0,38400 xencons=ttyS0 elevator=deadline sysrq=1 reset_devices irqpoll maxcpus=1 kernelversion=2.6.32.12-0.7-xen noirqdebug memmap=exactmap memmap=640K@0K memmap=261484K@17024K elfcorehdr=278508K memmap=5992K#1968932K memmap=4624K#1974924K memmap=220K#1980024K memmap=80K#1980256K memmap=4K#1980352K memmap=136K#1980364K memmap=188K#1980632K memmap=3076K#1981080K memmap=2584K#1984348K memmap=2048K#1986932K memmap=1880K#1988980K memmap=88K#1990948K memmap=20K#1991092K memmap=32K#1991116K memmap=20K#1991156K memmap=4636K#1991180K memmap=32672K#1996252K memmap=1312K#2029884K memmap=224K#2031196K memmap=192K#2031420K
[    0.000000] Misrouted IRQ fixup and polling support enabled
[    0.000000] This may significantly impact system performance
[    0.000000] IRQ lockup detection disabled
[    0.000000] PID hash table entries: 1024 (order: 1, 8192 bytes)
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Memory: 229800k/278508k available (3703k kernel code, 16404k absent, 32304k reserved, 5688k data, 964k init)
[    0.000000] Hierarchical RCU implementation.
[    0.000000] NR_IRQS:33024 nr_irqs:1736
[    0.000000] Extended CMOS year: 2000
[    0.000000] Console: colour VGA+ 80x25
[    0.000000] console [tty0] enabled
[    0.000000] console [ttyS0] enabled
[    0.000000] allocated 3932160 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] HPET: 4 timers in total, 0 timers will be used for per-cpu timer
[    0.004000] Fast TSC calibration using PIT
[    0.008000] Detected 2659.691 MHz processor.
[    0.000017] Calibrating delay loop (skipped), value calculated using timer frequency.. 5319.38 BogoMIPS (lpj=10638764)
[    0.035553] pid_max: default: 65536 minimum: 512
[    0.051265] kdb version 4.4 by Keith Owens, Scott Lurndal. Copyright SGI, All Rights Reserved
kdb_cmd[0]: defcmd archkdb "" "First line arch debugging"
kdb_cmd[8]: defcmd archkdbcpu "" "archkdb with only tasks on cpus"
kdb_cmd[15]: defcmd archkdbshort "" "archkdb with less detailed backtrace"
kdb_cmd[22]: defcmd archkdbcommon "" "Common arch debugging"
[    0.156408] Security Framework initialized
[    0.170050] AppArmor: AppArmor initialized
[    0.183754] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.206965] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
[    0.229857] Mount-cache hash table entries: 256
[    0.245241] Initializing cgroup subsys ns
[    0.258574] Initializing cgroup subsys cpuacct
[    0.273344] Initializing cgroup subsys memory
[    0.287847] Initializing cgroup subsys devices
[    0.302609] Initializing cgroup subsys freezer
[    0.317370] Initializing cgroup subsys net_cls
[    0.332226] CPU: Physical Processor ID: 0
[    0.345557] CPU: Processor Core ID: 0
[    0.357745] mce: CPU supports 22 MCE banks
[    0.371390] CPU0: Thermal monitoring enabled (TM1)
[    0.387306] using mwait in idle threads.
[    0.400346] Performance Events: Nehalem/Corei7 events, Intel PMU driver.
[    0.422717] ... version:                3
[    0.436047] ... bit width:              48
[    0.449660] ... generic registers:      4
[    0.462986] ... value mask:             0000ffffffffffff
[    0.480610] ... max period:             000000007fffffff
[    0.498232] ... fixed-purpose events:   3
[    0.511558] ... event mask:             000000070000000f
[    0.529194] SMP alternatives: switching to UP code
[    0.551422] ACPI: Core revision 20090903
[    0.723874] Kernel panic - not syncing: x2apic: enabled by BIOS but kernel init failed.
[    0.750460] Pid: 1, comm: swapper Not tainted 2.6.32.12-0.7-default #1
[    0.772093] Call Trace:
[    0.780289]  [<ffffffff810061dc>] dump_trace+0x6c/0x2d0
[    0.797640]  [<ffffffff81394288>] dump_stack+0x69/0x71
[    0.814698]  [<ffffffff81394308>] panic+0x78/0x199
[    0.830613]  [<ffffffff8195014f>] enable_IR_x2apic+0x171/0x1b7
[    0.849968]  [<ffffffff8194de41>] native_smp_prepare_cpus+0x139/0x281
[    0.871320]  [<ffffffff81943837>] kernel_init+0xb8/0x1b2
[    0.888949]  [<ffffffff81003fba>] child_rip+0xa/0x20
[    0.905419] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[    0.931563] IP: [<ffffffff810600ba>] queue_work_on+0x2a/0x60
[    0.950408] PGD 0 
[    0.957221] Oops: 0000 [#1] SMP 
[    0.968091] last sysfs file: 
[    0.977978] CPU 0 
[    0.984789] Modules linked in:
[    0.995036] Supported: Yes
[    1.004064] Pid: 1, comm: swapper Not tainted 2.6.32.12-0.7-default #1 PRIMERGY RX600 S5
[    1.030918] RIP: 0010:[<ffffffff810600ba>]  [<ffffffff810600ba>] queue_work_on+0x2a/0x60
[    1.057852] RSP: 0018:ffff88000f449e00  EFLAGS: 00010246
[    1.075472] RAX: ffffffff81852308 RBX: ffffffff81531f10 RCX: 0000000000000000
[    1.099102] RDX: ffffffff81852300 RSI: 0000000000000000 RDI: 0000000000000000
[    1.122732] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
[    1.146364] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[    1.169996] R13: 0000000000000246 R14: 0000000000013480 R15: 0000000000013480
[    1.193627] FS:  0000000000000000(0000) GS:ffff880002e00000(0000) knlGS:0000000000000000
[    1.220483] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[    1.239533] CR2: 0000000000000020 CR3: 0000000002804000 CR4: 00000000000006f0
[    1.263165] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.286798] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.310427] Process swapper (pid: 1, threadinfo ffff88000f448000, task ffff88000f446040)
[    1.337284] Stack:
[    1.344020]  0000000000013480 ffffffff81225dac ffff88000fe96580 ffffffff8139436d
[    1.368165] <0> 0000000000000008 ffff88000f449e90 ffff88000f449e40 0000000000000282
[    1.393965] <0> ffff88000fe96580 0000000000000082 000000000000001c ffff88000fe96580
[    1.420463] Call Trace:
[    1.428641]  [<ffffffff81225dac>] splash_verbose+0x2c/0x40
[    1.446835]  [<ffffffff8139436d>] panic+0xdd/0x199
[    1.464004]  [<ffffffff8195014f>] enable_IR_x2apic+0x171/0x1b7
[    1.483342]  [<ffffffff8194de41>] native_smp_prepare_cpus+0x139/0x281
[    1.504685]  [<ffffffff81943837>] kernel_init+0xb8/0x1b2
[    1.522306]  [<ffffffff81003fba>] child_rip+0xa/0x20
[    1.538779] Code: 00 48 83 ec 08 41 89 f8 3e 0f ba 2a 00 19 c0 31 c9 85 c0 74 0c 89 c8 48 83 c4 08 c3 0f 1f 44 00 00 48 8d 42 08 48 39 42 08 75 2f <8b> 7e 20 85 ff 44 0f 45 05 39 79 8c 00 48 8b 3e 48 89 d6 49 63 
[    1.605994] RIP  [<ffffffff810600ba>] queue_work_on+0x2a/0x60
[    1.625119]  RSP <ffff88000f449e00>
[    1.636723] CR2: 0000000000000020
[    1.647771] ---[ end trace 4eaa2a86a8e2da22 ]---
[    1.663116] Kernel panic - not syncing: Attempted to kill init!
[    1.682746] Pid: 1, comm: swapper Tainted: G      D      2.6.32.12-0.7-default #1
[    1.707601] Call Trace:
[    1.715784]  [<ffffffff810061dc>] dump_trace+0x6c/0x2d0
[    1.733133]  [<ffffffff81394288>] dump_stack+0x69/0x71
[    1.750189]  [<ffffffff81394308>] panic+0x78/0x199
[    1.766106]  [<ffffffff81050bf2>] forget_original_parent+0x332/0x340
[    1.787171]  [<ffffffff81050c10>] exit_notify+0x10/0x190
[    1.804803]  [<ffffffff81050f08>] do_exit+0x178/0x360
[    1.821572]  [<ffffffff813980a1>] oops_end+0xe1/0xf0
[    1.838058]  [<ffffffff8102d955>] __bad_area_nosemaphore+0x155/0x230
[    1.859125]  [<ffffffff813972ef>] page_fault+0x1f/0x30
[    1.876187]  [<ffffffff810600ba>] queue_work_on+0x2a/0x60
[    1.894104]  [<ffffffff81225dac>] splash_verbose+0x2c/0x40
[    1.912306]  [<ffffffff8139436d>] panic+0xdd/0x199
[    1.928218]  [<ffffffff8195014f>] enable_IR_x2apic+0x171/0x1b7
[    1.947542]  [<ffffffff8194de41>] native_smp_prepare_cpus+0x139/0x281
[    1.968893]  [<ffffffff81943837>] kernel_init+0xb8/0x1b2
[    1.986524]  [<ffffffff81003fba>] child_rip+0xa/0x20

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: ACPI-Tables corrupted?
  2010-07-28 12:13       ` Juergen Gross
  2010-07-28 12:45         ` Keir Fraser
@ 2010-07-29  6:31         ` Jiang, Yunhong
  2010-07-29  6:40           ` Keir Fraser
  1 sibling, 1 reply; 23+ messages in thread
From: Jiang, Yunhong @ 2010-07-29  6:31 UTC (permalink / raw)
  To: Juergen Gross, Keir Fraser
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Han, Weidong

A bit curious: why would enable_IR_x2apic() be called in dom0? IMO, dom0 does not control the interrupt controller, whether xapic or x2apic. Shouldn't this be commented out in pvops dom0?

Where does it panic in your environment? Is it the following code? If so, that means x2apic is enabled in the BIOS, and you can work around this issue by disabling x2apic in the BIOS.

        if (x2apic_preenabled)
                 panic("x2apic: enabled by BIOS but kernel init failed.");

Thanks
--jyh

>-----Original Message-----
>From: xen-devel-bounces@lists.xensource.com
>[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Juergen Gross
>Sent: Wednesday, July 28, 2010 8:14 PM
>To: Keir Fraser
>Cc: xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] ACPI-Tables corrupted?
>
>On 07/28/2010 01:39 PM, Keir Fraser wrote:
>>
>>
>>
>> On 28/07/2010 12:26, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>>
>>> On 07/28/2010 12:03 PM, Keir Fraser wrote:
>>>> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com>
>wrote:
>>>>
>>>>> As you can see, the DMAR eye-catcher is replaced by blanks!
>>>>> This leads to a programmed panic in the crash kernel later in case of a
>>>>> panic in dom0...
>>>>>
>>>>> Any ideas?
>>>>> BTW: seen in unstable AND 4.0
>>>>
>>>> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
>>>> *unconditionally* trashes the DMAR so that dom0 will not parse it.
>>>> Presumably bad stuff would happen if it did.
>>>
>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>> The crash kernel OTOH should not panic due to the trashed entry!
>>> What is the correct solution here?
>>
>> Could provide a cmdline option to not nobble the DMAR?
>
>That's a possibility.
>I wonder whether it wouldn't be better to let dom0 decide not to use it if
>running under xen. This would remove the requirement for zapping the ACPI
>table. IMO it's always a bad idea to change data of a deeper layer...
>
>>
>>> The crash kernel expects a valid DMAR entry, as following code in
>>> enable_IR_x2apic() suggests:
>>
>> I don't know what that function does, nor how the error path below depends
>> on DMAR. DMAR isn't mentioned in the below code.
>
>Sorry, here is a larger fragment (source: arch/x86/kernel/apic/apic.c):
>
>void __init enable_IR_x2apic(void)
>{
>         unsigned long flags;
>         struct IO_APIC_route_entry **ioapic_entries = NULL;
>         int ret, x2apic_enabled = 0;
>         int dmar_table_init_ret = 0;
>
>#ifdef CONFIG_INTR_REMAP
>         dmar_table_init_ret = dmar_table_init();
>         if (dmar_table_init_ret)
>                 pr_debug("dmar_table_init() failed with %d:\n",
>                                 dmar_table_init_ret);
>#endif
>
>         ioapic_entries = alloc_ioapic_entries();
>         if (!ioapic_entries) {
>                 pr_err("Allocate ioapic_entries failed\n");
>                 goto out;
>         }
>
>         ret = save_IO_APIC_setup(ioapic_entries);
>         if (ret) {
>                 pr_info("Saving IO-APIC state failed: %d\n", ret);
>                 goto out;
>         }
>
>         local_irq_save(flags);
>         mask_8259A();
>         mask_IO_APIC_setup(ioapic_entries);
>
>         if (dmar_table_init_ret)
>                 ret = 0;
>         else
>                 ret = enable_IR();
>
>         if (!ret) {
>                 /* IR is required if there is APIC ID > 255 even when running
>                  * under KVM
>                  */
>                 if (max_physical_apicid > 255 || !kvm_para_available())
>                         goto nox2apic;
>                 /*
>                  * without IR all CPUs can be addressed by IOAPIC/MSI
>                  * only in physical mode
>                  */
>                 x2apic_force_phys();
>         }
>
>         x2apic_enabled = 1;
>
>         if (x2apic_supported() && !x2apic_mode) {
>                 x2apic_mode = 1;
>                 enable_x2apic();
>                 pr_info("Enabled x2apic\n");
>         }
>
>nox2apic:
>         if (!ret) /* IR enabling failed */
>                 restore_IO_APIC_setup(ioapic_entries);
>         unmask_8259A();
>         local_irq_restore(flags);
>
>out:
>         if (ioapic_entries)
>                 free_ioapic_entries(ioapic_entries);
>
>         if (x2apic_enabled)
>                 return;
>
>         if (x2apic_preenabled)
>                 panic("x2apic: enabled by BIOS but kernel init failed.");
>         else if (cpu_has_x2apic)
>                 pr_info("Not enabling x2apic, Intr-remapping init failed.\n");
>}
>
>
>dmar_table_init() will return -ENODEV if no DMAR record is found.
>
>
>Juergen
>
>--
>Juergen Gross                 Principal Developer Operating Systems
>TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
>Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
>Domagkstr. 28                           Internet: ts.fujitsu.com
>D-80807 Muenchen                 Company details:
>ts.fujitsu.com/imprint.html
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@lists.xensource.com
>http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:19               ` Juergen Gross
@ 2010-07-29  6:39                 ` Keir Fraser
  2010-07-29  6:48                   ` Juergen Gross
  0 siblings, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-29  6:39 UTC (permalink / raw)
  To: Juergen Gross, Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com

On 29/07/2010 07:19, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> What is the crash? And do you see any indication that x2APIC is turned
>> on? Do provide a serial log please.
> 
> Log is attached.
> 
> I did some more testing. The problem occurred on a Nehalem-EX system. I tried
> the same on a Nehalem-EP system and all was okay. I suspect some further
> problems in the ACPI tables of the EX system now. I'm not too familiar with
> ACPI tables. Anything I can do for further analysis?

Okay I think now that the crash is legitimate. The easy fix is that we
should reinstate the DMAR before kexec'ing the crash kernel. I'll look into
doing a patch to the hypervisor later today.

 Thanks,
 Keir
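
The "zap, then reinstate before kexec" idea can be sketched in a few lines of
C. This is only a minimal userspace illustration, not the hypervisor patch:
the struct is a cut-down stand-in for an ACPI table header, and every demo_*
name is hypothetical. A real table also carries a checksum, which this sketch
ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for an ACPI table header. */
struct demo_acpi_header {
    char     signature[4];   /* eye-catcher, e.g. "DMAR" (not NUL-terminated) */
    uint32_t length;
};

/* Returns non-zero if the header still carries the "DMAR" eye-catcher. */
int demo_is_dmar(const struct demo_acpi_header *hdr)
{
    assert(hdr != NULL);
    return memcmp(hdr->signature, "DMAR", sizeof(hdr->signature)) == 0;
}

/* Blank the signature so a later scan for "DMAR" no longer finds the table. */
void demo_dmar_zap(struct demo_acpi_header *hdr)
{
    assert(hdr != NULL);
    memset(hdr->signature, ' ', sizeof(hdr->signature));
}

/* Put the signature back, e.g. right before jumping into a native kernel. */
void demo_dmar_reinstate(struct demo_acpi_header *hdr)
{
    assert(hdr != NULL);
    memcpy(hdr->signature, "DMAR", sizeof(hdr->signature));
}
```

The point of the pairing is that the table stays hidden for as long as dom0 is
running, while a kexec'ed native kernel sees it exactly as the firmware left it.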

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:31         ` Jiang, Yunhong
@ 2010-07-29  6:40           ` Keir Fraser
  2010-07-29  6:48             ` Keir Fraser
  2010-07-29 10:24             ` Keir Fraser
  0 siblings, 2 replies; 23+ messages in thread
From: Keir Fraser @ 2010-07-29  6:40 UTC (permalink / raw)
  To: Jiang, Yunhong, Juergen Gross
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Han, Weidong

It's not a dom0, it's a kexec'ed crash kernel. We should be reinstating DMAR
before jumping into a native kernel. I will sort out a fix.

 -- Keir

On 29/07/2010 07:31, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> A bit curious: why would enable_IR_x2apic() be called in dom0? IMO, dom0 does
> not control the interrupt controller, either xapic or x2apic. Shouldn't this be
> commented out in pvops dom0?
> 
> What's the panic point in your environment? Is it the following code? If yes,
> that means x2apic is enabled in the BIOS, and you can work around this issue by
> disabling x2apic in the BIOS.
> 
>         if (x2apic_preenabled)
>                  panic("x2apic: enabled by BIOS but kernel init failed.");
> 
> Thanks
> --jyh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:39                 ` Keir Fraser
@ 2010-07-29  6:48                   ` Juergen Gross
  2010-07-29  6:48                     ` Keir Fraser
  0 siblings, 1 reply; 23+ messages in thread
From: Juergen Gross @ 2010-07-29  6:48 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk

On 07/29/2010 08:39 AM, Keir Fraser wrote:
> On 29/07/2010 07:19, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>
>>> What is the crash? And do you see any indication that x2APIC is turned
>>> on? Do provide a serial log please.
>>
>> Log is attached.
>>
>> I did some more testing. The problem occurred on a Nehalem-EX system. I tried
>> the same on a Nehalem-EP system and all was okay. I suspect some further
>> problems in the ACPI tables of the EX system now. I'm not too familiar with
>> ACPI tables. Anything I can do for further analysis?
>
> Okay I think now that the crash is legitimate. The easy fix is that we
> should reinstate the DMAR before kexec'ing the crash kernel. I'll look into
> doing a patch to the hypervisor later today.

Thanks!
BTW: Do you understand why the EP system has no problems with the patched DMAR
entry?

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:40           ` Keir Fraser
@ 2010-07-29  6:48             ` Keir Fraser
  2010-07-29  7:37               ` Jiang, Yunhong
  2010-07-29 10:24             ` Keir Fraser
  1 sibling, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-29  6:48 UTC (permalink / raw)
  To: Jiang, Yunhong, Juergen Gross
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Han, Weidong

Strictly speaking we should have a stab at disabling x2apic as well, but
that's harder. And not necessary for newer Linux kernels, which is what we
will usually be kexec'ing to anyway.

 -- Keir

On 29/07/2010 07:40, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> It's not a dom0, it's a kexec'ed crash kernel. We should be reinstating DMAR
> before jumping into a native kernel. I will sort out a fix.
> 
>  -- Keir
> 
> On 29/07/2010 07:31, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> 
>> A bit curious: why would enable_IR_x2apic() be called in dom0? IMO, dom0 does
>> not control the interrupt controller, either xapic or x2apic. Shouldn't this be
>> commented out in pvops dom0?
>> 
>> What's the panic point in your environment? Is it the following code? If yes,
>> that means x2apic is enabled in the BIOS, and you can work around this issue by
>> disabling x2apic in the BIOS.
>> 
>>         if (x2apic_preenabled)
>>                  panic("x2apic: enabled by BIOS but kernel init failed.");
>> 
>> Thanks
>> --jyh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:48                   ` Juergen Gross
@ 2010-07-29  6:48                     ` Keir Fraser
  2010-07-29  6:53                       ` Juergen Gross
  0 siblings, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-29  6:48 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk

On 29/07/2010 07:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>> Okay I think now that the crash is legitimate. The easy fix is that we
>> should reinstate the DMAR before kexec'ing the crash kernel. I'll look into
>> doing a patch to the hypervisor later today.
> 
> Thanks!
> BTW: Do you understand why the EP system has no problems with the patched DMAR
> entry?

It probably doesn't have x2apic support. Any hint of x2apic in its boot
logs?

 -- Keir
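
Why only the x2apic-capable box panics can be read off the tail of the quoted
enable_IR_x2apic(). The following is a simplified model of that decision, not
kernel code: all demo_* names are hypothetical, allocation failures are left
out, and enable_IR() is assumed to return non-zero on success, as the
"IR enabling failed" comment in the fragment suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_outcome {
    DEMO_X2APIC_ENABLED,  /* function returns normally with x2apic usable  */
    DEMO_PANIC,           /* "x2apic: enabled by BIOS but kernel init failed." */
    DEMO_INFO_ONLY,       /* "Not enabling x2apic, Intr-remapping init failed." */
    DEMO_SILENT           /* nothing to do, nothing printed                 */
};

struct demo_inputs {
    bool dmar_found;         /* dmar_table_init() returned 0                 */
    bool enable_ir_ok;       /* enable_IR() would succeed (assumption)       */
    bool kvm_small_guest;    /* kvm_para_available() && max APIC ID <= 255   */
    bool x2apic_preenabled;  /* firmware handed over with x2apic already on  */
    bool cpu_has_x2apic;     /* CPU advertises x2apic                        */
};

enum demo_outcome demo_enable_ir_x2apic(const struct demo_inputs *in)
{
    assert(in != NULL);

    /* Without a DMAR table the code never even tries enable_IR(). */
    bool ir_enabled = in->dmar_found && in->enable_ir_ok;

    /* No IR: only the small-KVM-guest case may still continue to x2apic. */
    bool x2apic_enabled = ir_enabled || in->kvm_small_guest;

    if (x2apic_enabled)
        return DEMO_X2APIC_ENABLED;
    if (in->x2apic_preenabled)
        return DEMO_PANIC;
    if (in->cpu_has_x2apic)
        return DEMO_INFO_ONLY;
    return DEMO_SILENT;
}
```

With dmar_found = false, a machine that comes up with x2apic pre-enabled lands
in DEMO_PANIC, while a machine without x2apic falls through to DEMO_SILENT.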

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  6:48                     ` Keir Fraser
@ 2010-07-29  6:53                       ` Juergen Gross
  0 siblings, 0 replies; 23+ messages in thread
From: Juergen Gross @ 2010-07-29  6:53 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk

On 07/29/2010 08:48 AM, Keir Fraser wrote:
> On 29/07/2010 07:48, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>
>>> Okay I think now that the crash is legitimate. The easy fix is that we
>>> should reinstate the DMAR before kexec'ing the crash kernel. I'll look into
>>> doing a patch to the hypervisor later today.
>>
>> Thanks!
>> BTW: Do you understand why the EP system has no problems with the patched DMAR
>> entry?
>
> It probably doesn't have x2apic support. Any hint of x2apic in its boot
> logs?

Ahh, I see. No, it seems not to have x2apic.


Thanks, Juergen
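
Besides the boot log, the two conditions can be checked from raw register
values. A small sketch, assuming the caller has already obtained ECX of CPUID
leaf 1 and the IA32_APIC_BASE MSR (0x1B) by other means; reading them is
privileged or platform-specific and not shown. The demo_* helpers are
hypothetical names; the bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 21 advertises x2APIC support. */
#define DEMO_CPUID1_ECX_X2APIC  (UINT32_C(1) << 21)

/* IA32_APIC_BASE: bit 11 = APIC global enable, bit 10 = x2APIC mode (EXTD). */
#define DEMO_APIC_BASE_EN       (UINT64_C(1) << 11)
#define DEMO_APIC_BASE_EXTD     (UINT64_C(1) << 10)

/* Non-zero if the CPU reports x2APIC capability in CPUID leaf 1. */
int demo_cpu_has_x2apic(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & DEMO_CPUID1_ECX_X2APIC) != 0;
}

/* Non-zero if the local APIC is enabled and already running in x2APIC mode. */
int demo_x2apic_preenabled(uint64_t apic_base_msr)
{
    const uint64_t both = DEMO_APIC_BASE_EN | DEMO_APIC_BASE_EXTD;
    return (apic_base_msr & both) == both;
}
```

Capability alone is harmless here; it is the pre-enabled state that turns a
failed interrupt-remapping setup into a panic in the quoted code.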

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: ACPI-Tables corrupted?
  2010-07-29  6:48             ` Keir Fraser
@ 2010-07-29  7:37               ` Jiang, Yunhong
  2010-07-29  9:04                 ` Juergen Gross
  2010-07-29  9:33                 ` Keir Fraser
  0 siblings, 2 replies; 23+ messages in thread
From: Jiang, Yunhong @ 2010-07-29  7:37 UTC (permalink / raw)
  To: Keir Fraser, Juergen Gross
  Cc: Kay, Allen M, Jeremy Fitzhardinge, xen-devel@lists.xensource.com,
	Han, Weidong

Sorry, I didn't notice that this is about the crash kernel. In fact, I tried kexec before and never succeeded in bringing it up.

What do you mean by "stab at disabling x2apic"? Do you mean we need to disable x2apic before transferring control to the crash kernel?

As I understand it, with kexec, when the system crashes it jumps directly to the new kernel's entry point, with no guest destruction (i.e. clean-up) and no reset signal to the CPU/chipset, right?
If so, another issue to consider is VT-d. I didn't find any VT-d disable code in Xen's kexec_crash code. If the new kernel has no idea of VT-d (and thus does not reset the VT-d engine), it may have trouble. Or will the kexec'ed kernel not use devices assigned to guests?

Of course it is OK if the crash kernel supports VT-d too.

Thanks
--jyh


>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Thursday, July 29, 2010 2:48 PM
>To: Jiang, Yunhong; Juergen Gross
>Cc: Jeremy Fitzhardinge; xen-devel@lists.xensource.com; Han, Weidong
>Subject: Re: [Xen-devel] ACPI-Tables corrupted?
>
>Strictly speaking we should have a stab at disabling x2apic as well, but
>that's harder. And not necessary for newer Linux kernels, which is what we
>will usually be kexec'ing to anyway.
>
> -- Keir
>
>On 29/07/2010 07:40, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>> It's not a dom0, it's a kexec'ed crash kernel. We should be reinstating DMAR
>> before jumping into a native kernel. I will sort out a fix.
>>
>>  -- Keir
>>
>> On 29/07/2010 07:31, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>
>>> A bit curious: why would enable_IR_x2apic() be called in dom0? IMO, dom0 does
>>> not control the interrupt controller, either xapic or x2apic. Shouldn't this be
>>> commented out in pvops dom0?
>>>
>>> What's the panic point in your environment? Is it the following code? If yes,
>>> that means x2apic is enabled in the BIOS, and you can work around this issue by
>>> disabling x2apic in the BIOS.
>>>
>>>         if (x2apic_preenabled)
>>>                  panic("x2apic: enabled by BIOS but kernel init failed.");
>>>
>>> Thanks
>>> --jyh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  7:37               ` Jiang, Yunhong
@ 2010-07-29  9:04                 ` Juergen Gross
  2010-07-29  9:33                 ` Keir Fraser
  1 sibling, 0 replies; 23+ messages in thread
From: Juergen Gross @ 2010-07-29  9:04 UTC (permalink / raw)
  To: Jiang, Yunhong
  Cc: Han, Weidong, Jeremy Fitzhardinge, xen-devel@lists.xensource.com,
	Kay, Allen M, Keir Fraser

On 07/29/2010 09:37 AM, Jiang, Yunhong wrote:
> Sorry, I didn't notice that this is about the crash kernel. In fact, I tried kexec before and never succeeded in bringing it up.
>
> What do you mean by "stab at disabling x2apic"? Do you mean we need to disable x2apic before transferring control to the crash kernel?
>
> As I understand it, with kexec, when the system crashes it jumps directly to the new kernel's entry point, with no guest destruction (i.e. clean-up) and no reset signal to the CPU/chipset, right?
> If so, another issue to consider is VT-d. I didn't find any VT-d disable code in Xen's kexec_crash code. If the new kernel has no idea of VT-d (and thus does not reset the VT-d engine), it may have trouble. Or will the kexec'ed kernel not use devices assigned to guests?
>
> Of course it is OK if the crash kernel supports VT-d too.

Seems to be the case here.
A patched crash kernel which just took the zapped DMAR entry as a valid one
succeeded in writing a vmcore.

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: ACPI-Tables corrupted?
  2010-07-29  7:37               ` Jiang, Yunhong
  2010-07-29  9:04                 ` Juergen Gross
@ 2010-07-29  9:33                 ` Keir Fraser
  1 sibling, 0 replies; 23+ messages in thread
From: Keir Fraser @ 2010-07-29  9:33 UTC (permalink / raw)
  To: Jiang, Yunhong, Juergen Gross
  Cc: Kay, Allen M, Jeremy Fitzhardinge, xen-devel@lists.xensource.com,
	Han, Weidong

On 29/07/2010 08:37, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> Sorry, I didn't notice that this is about the crash kernel. In fact, I tried
> kexec before and never succeeded in bringing it up.
> 
> What do you mean by "stab at disabling x2apic"? Do you mean we need to disable
> x2apic before transferring control to the crash kernel?
> 
> As I understand it, with kexec, when the system crashes it jumps directly to
> the new kernel's entry point, with no guest destruction (i.e. clean-up) and no
> reset signal to the CPU/chipset, right?
> If so, another issue to consider is VT-d. I didn't find any VT-d disable code
> in Xen's kexec_crash code. If the new kernel has no idea of VT-d (and thus does
> not reset the VT-d engine), it may have trouble. Or will the kexec'ed kernel
> not use devices assigned to guests?
> 
> Of course it is OK if the crash kernel supports VT-d too.

A crash kernel needs to be carefully configured anyway. For example it may
not be started up on the boot processor: it gets started on whichever cpu
initiated the crash.

 K.

> Thanks
> --jyh
> 
> 
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: Thursday, July 29, 2010 2:48 PM
>> To: Jiang, Yunhong; Juergen Gross
>> Cc: Jeremy Fitzhardinge; xen-devel@lists.xensource.com; Han, Weidong
>> Subject: Re: [Xen-devel] ACPI-Tables corrupted?
>> 
>> Strictly speaking we should have a stab at disabling x2apic as well, but
>> that's harder. And not necessary for newer Linux kernels, which is what we
>> will usually be kexec'ing to anyway.
>> 
>> -- Keir
>> 
>> On 29/07/2010 07:40, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>> 
>>> It's not a dom0, it's a kexec'ed crash kernel. We should be reinstating DMAR
>>> before jumping into a native kernel. I will sort out a fix.
>>> 
>>>  -- Keir
>>> 
>>> On 29/07/2010 07:31, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>> 
>>>> A bit curious: why would enable_IR_x2apic() be called in dom0? IMO, dom0 does
>>>> not control the interrupt controller, either xapic or x2apic. Shouldn't this
>>>> be commented out in pvops dom0?
>>>> 
>>>> What's the panic point in your environment? Is it the following code? If
>>>> yes, that means x2apic is enabled in the BIOS, and you can work around this
>>>> issue by disabling x2apic in the BIOS.
>>>> 
>>>>         if (x2apic_preenabled)
>>>>                  panic("x2apic: enabled by BIOS but kernel init failed.");
>>>> 
>>>> Thanks
>>>> --jyh
>>>> 
>>>>> -----Original Message-----
>>>>> From: xen-devel-bounces@lists.xensource.com
>>>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Juergen Gross
>>>>> Sent: Wednesday, July 28, 2010 8:14 PM
>>>>> To: Keir Fraser
>>>>> Cc: xen-devel@lists.xensource.com
>>>>> Subject: Re: [Xen-devel] ACPI-Tables corrupted?
>>>>> 
>>>>> On 07/28/2010 01:39 PM, Keir Fraser wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 28/07/2010 12:26, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
>>>>>> 
>>>>>>> On 07/28/2010 12:03 PM, Keir Fraser wrote:
>>>>>>>> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
>>>>>>>> 
>>>>>>>>> As you can see, the DMAR eye-catcher is replaced by blanks!
>>>>>>>>> This leads to a programmed panic in the crash kernel later in case of a
>>>>>>>>> panic in dom0...
>>>>>>>>> 
>>>>>>>>> Any ideas?
>>>>>>>>> BTW: seen in unstable AND 4.0
>>>>>>>> 
>>>>>>>> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
>>>>>>>> *unconditionally* trashes the DMAR so that dom0 will not parse it.
>>>>>>>> Presumably bad stuff would happen if it did.
>>>>>>> 
>>>>>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>>>>>> The crash kernel OTOH should not panic due to the trashed entry!
>>>>>>> What is the correct solution here?
>>>>>> 
>>>>>> Could provide a cmdline option to not nobble the DMAR?
>>>>> 
>>>>> That's a possibility.
>>>>> I wonder whether it wouldn't be better to let dom0 decide not to use it if
>>>>> running under xen. This would remove the requirement for zapping the ACPI
>>>>> table. IMO it's always a bad idea to change data of a deeper layer...
>>>>> 
>>>>>> 
>>>>>>> The crash kernel expects a valid DMAR entry, as the following code in
>>>>>>> enable_IR_x2apic() suggests:
>>>>>> 
>>>>>> I don't know what that function does, nor how the error path below depends
>>>>>> on DMAR. DMAR isn't mentioned in the below code.
>>>>> 
>>>>> Sorry, here is a larger fragment (source arch/x86/kernel/apic/apic.c):
>>>>> 
>>>>> void __init enable_IR_x2apic(void)
>>>>> {
>>>>>         unsigned long flags;
>>>>>         struct IO_APIC_route_entry **ioapic_entries = NULL;
>>>>>         int ret, x2apic_enabled = 0;
>>>>>         int dmar_table_init_ret = 0;
>>>>> 
>>>>> #ifdef CONFIG_INTR_REMAP
>>>>>         dmar_table_init_ret = dmar_table_init();
>>>>>         if (dmar_table_init_ret)
>>>>>                 pr_debug("dmar_table_init() failed with %d:\n",
>>>>>                                 dmar_table_init_ret);
>>>>> #endif
>>>>> 
>>>>>         ioapic_entries = alloc_ioapic_entries();
>>>>>         if (!ioapic_entries) {
>>>>>                 pr_err("Allocate ioapic_entries failed\n");
>>>>>                 goto out;
>>>>>         }
>>>>> 
>>>>>         ret = save_IO_APIC_setup(ioapic_entries);
>>>>>         if (ret) {
>>>>>                 pr_info("Saving IO-APIC state failed: %d\n", ret);
>>>>>                 goto out;
>>>>>         }
>>>>> 
>>>>>         local_irq_save(flags);
>>>>>         mask_8259A();
>>>>>         mask_IO_APIC_setup(ioapic_entries);
>>>>> 
>>>>>         if (dmar_table_init_ret)
>>>>>                 ret = 0;
>>>>>         else
>>>>>                 ret = enable_IR();
>>>>> 
>>>>>         if (!ret) {
>>>>>                 /* IR is required if there is APIC ID > 255 even when
>>>>>                  * running under KVM
>>>>>                  */
>>>>>                 if (max_physical_apicid > 255 || !kvm_para_available())
>>>>>                         goto nox2apic;
>>>>>                 /*
>>>>>                  * without IR all CPUs can be addressed by IOAPIC/MSI
>>>>>                  * only in physical mode
>>>>>                  */
>>>>>                 x2apic_force_phys();
>>>>>         }
>>>>> 
>>>>>         x2apic_enabled = 1;
>>>>> 
>>>>>         if (x2apic_supported() && !x2apic_mode) {
>>>>>                 x2apic_mode = 1;
>>>>>                 enable_x2apic();
>>>>>                 pr_info("Enabled x2apic\n");
>>>>>         }
>>>>> 
>>>>> nox2apic:
>>>>>         if (!ret) /* IR enabling failed */
>>>>>                 restore_IO_APIC_setup(ioapic_entries);
>>>>>         unmask_8259A();
>>>>>         local_irq_restore(flags);
>>>>> 
>>>>> out:
>>>>>         if (ioapic_entries)
>>>>>                 free_ioapic_entries(ioapic_entries);
>>>>> 
>>>>>         if (x2apic_enabled)
>>>>>                 return;
>>>>> 
>>>>>         if (x2apic_preenabled)
>>>>>                 panic("x2apic: enabled by BIOS but kernel init failed.");
>>>>>         else if (cpu_has_x2apic)
>>>>>                 pr_info("Not enabling x2apic, Intr-remapping init failed.\n");
>>>>> }
>>>>> 
>>>>> 
>>>>> dmar_table_init() will return -ENODEV if no DMAR record is found.
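The failure path through the function quoted above can be condensed into a small model. This is only a sketch for following the control flow: the struct, enum, and function names are hypothetical, the allocation and IO-APIC save failure paths are left out, and the x2apic_supported() check is omitted. It assumes, as the "IR enabling failed" comment in the quoted code implies, that a zero result from enable_IR() means interrupt remapping could not be enabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Possible outcomes of the modelled decision (names are hypothetical). */
enum x2apic_outcome { X2APIC_ENABLED, X2APIC_SKIPPED, X2APIC_PANIC };

/* Inputs the quoted function reads; all fields are illustrative stand-ins. */
struct boot_state {
    int  dmar_table_init_ret; /* 0 on success, -ENODEV if no DMAR table found */
    int  enable_ir_ret;       /* stand-in for enable_IR(): nonzero on success */
    bool apicid_above_255;    /* max_physical_apicid > 255 */
    bool kvm_para;            /* kvm_para_available() */
    bool x2apic_preenabled;   /* x2apic already switched on before this kernel */
};

static enum x2apic_outcome model_enable_IR_x2apic(const struct boot_state *s)
{
    assert(s != NULL);

    /* A missing DMAR table means interrupt remapping is never attempted. */
    int ret = s->dmar_table_init_ret ? 0 : s->enable_ir_ret;

    /* Without IR, x2apic is only kept for a KVM guest with small APIC IDs. */
    bool x2apic_enabled = true;
    if (!ret && (s->apicid_above_255 || !s->kvm_para))
        x2apic_enabled = false;            /* the "goto nox2apic" path */

    if (x2apic_enabled)
        return X2APIC_ENABLED;

    /* x2apic was left on by earlier firmware or hypervisor: give up. */
    return s->x2apic_preenabled ? X2APIC_PANIC : X2APIC_SKIPPED;
}
```

With dmar_table_init_ret set to -ENODEV and x2apic_preenabled set, the model ends in the panic branch, which matches the behaviour reported for the crash kernel in this thread.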
>>>>> 
>>>>> 
>>>>> Juergen
>>>>> 
>>>>> --
>>>>> Juergen Gross                 Principal Developer Operating Systems
>>>>> TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
>>>>> Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
>>>>> Domagkstr. 28                           Internet: ts.fujitsu.com
>>>>> D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
>>>>> 
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@lists.xensource.com
>>>>> http://lists.xensource.com/xen-devel
>>> 
>>> 
>>> 
>> 
> 

* Re: ACPI-Tables corrupted?
  2010-07-29  6:40           ` Keir Fraser
  2010-07-29  6:48             ` Keir Fraser
@ 2010-07-29 10:24             ` Keir Fraser
  2010-07-30  4:47               ` Juergen Gross
  1 sibling, 1 reply; 23+ messages in thread
From: Keir Fraser @ 2010-07-29 10:24 UTC (permalink / raw)
  To: Jiang, Yunhong, Juergen Gross
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Han, Weidong

Juergen,

Please try xen-unstable:21886, from our staging tree at
http://xenbits.xen.org/staging/xen-unstable.hg

It does a bunch of kexec cleanup, and should also correctly reinstate the
DMAR table before calling the crash kernel.

If it works okay we can consider it for backport to 4.0.1.
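
As a rough illustration of the zap-and-reinstate idea (a minimal sketch, not the code from the changeset mentioned above): the struct and function names below are hypothetical, and overwriting the signature with blanks is an assumption based on the symptom reported at the start of this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIG_LEN 4

/* Only the signature field of an ACPI table header matters here. */
struct acpi_header_sketch {
    char signature[SIG_LEN];   /* "DMAR" for the VT-d remapping table */
};

static char saved_signature[SIG_LEN];  /* original bytes, kept for kexec */
static int  signature_zapped;

/* Hide the table from dom0 by blanking its signature. */
static void dmar_zap_sketch(struct acpi_header_sketch *hdr)
{
    assert(hdr != NULL);
    if (signature_zapped)
        return;                        /* keep the first saved copy intact */
    memcpy(saved_signature, hdr->signature, SIG_LEN);
    memset(hdr->signature, ' ', SIG_LEN);
    signature_zapped = 1;
}

/* Restore the signature so a kexec'ed native kernel finds the table. */
static void dmar_reinstate_sketch(struct acpi_header_sketch *hdr)
{
    assert(hdr != NULL);
    if (!signature_zapped)
        return;                        /* nothing was changed */
    memcpy(hdr->signature, saved_signature, SIG_LEN);
    signature_zapped = 0;
}
```

In this sketch the zap would run once after the hypervisor has parsed the table, and the reinstate would run on the kexec path just before jumping to the new kernel.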

 Thanks,
 Keir

On 29/07/2010 07:40, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> It's not a dom0, it's a kexec'ed crash kernel. We should be reinstating DMAR
> before jumping into a native kernel. I will sort out a fix.
> 
>  -- Keir
> 
> On 29/07/2010 07:31, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> 
>> A bit curious: why is enable_IR_x2apic() called in dom0? IMO, dom0 does not
>> control the interrupt controller, either xapic or x2apic. Shouldn't this be
>> commented out in pvops dom0?
>> 
>> What's the panic point in your environment? Is it the following code? If yes,
>> that means x2apic is enabled in the BIOS, and you can work around this issue
>> by disabling x2apic in the BIOS.
>> 
>>         if (x2apic_preenabled)
>>                  panic("x2apic: enabled by BIOS but kernel init failed.");
>> 
>> Thanks
>> --jyh
>> 
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xensource.com
>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Juergen Gross
>>> Sent: Wednesday, July 28, 2010 8:14 PM
>>> To: Keir Fraser
>>> Cc: xen-devel@lists.xensource.com
>>> Subject: Re: [Xen-devel] ACPI-Tables corrupted?
>>> 
>>> On 07/28/2010 01:39 PM, Keir Fraser wrote:
>>>> 
>>>> 
>>>> 
>>>> On 28/07/2010 12:26, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>>>> 
>>>>> On 07/28/2010 12:03 PM, Keir Fraser wrote:
>>>>>> On 28/07/2010 10:38, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
>>>>>> 
>>>>>>> As you can see, the DMAR eye-catcher is replaced by blanks!
>>>>>>> This leads to a programmed panic in the crash kernel later in case of a
>>>>>>> panic in dom0...
>>>>>>> 
>>>>>>> Any ideas?
>>>>>>> BTW: seen in unstable AND 4.0
>>>>>> 
>>>>>> Look at the tail of xen/drivers/passthrough/vtd/dmar.c: Xen *always*
>>>>>> *unconditionally* trashes the DMAR so that dom0 will not parse it.
>>>>>> Presumably bad stuff would happen if it did.
>>>>> 
>>>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>>>> The crash kernel OTOH should not panic due to the trashed entry!
>>>>> What is the correct solution here?
>>>> 
>>>> Could provide a cmdline option to not nobble the DMAR?
>>> 
>>> That's a possibility.
>>> I wonder whether it wouldn't be better to let dom0 decide not to use it if
>>> running under xen. This would remove the requirement for zapping the ACPI
>>> table. IMO it's always a bad idea to change data of a deeper layer...
>>> 
>>>> 
>>>>> The crash kernel expects a valid DMAR entry, as the following code in
>>>>> enable_IR_x2apic() suggests:
>>>> 
>>>> I don't know what that function does, nor how the error path below depends
>>>> on DMAR. DMAR isn't mentioned in the below code.
>>> 
>>> Sorry, here is a larger fragment (source arch/x86/kernel/apic/apic.c):
>>> 
>>> void __init enable_IR_x2apic(void)
>>> {
>>>         unsigned long flags;
>>>         struct IO_APIC_route_entry **ioapic_entries = NULL;
>>>         int ret, x2apic_enabled = 0;
>>>         int dmar_table_init_ret = 0;
>>> 
>>> #ifdef CONFIG_INTR_REMAP
>>>         dmar_table_init_ret = dmar_table_init();
>>>         if (dmar_table_init_ret)
>>>                 pr_debug("dmar_table_init() failed with %d:\n",
>>>                                 dmar_table_init_ret);
>>> #endif
>>> 
>>>         ioapic_entries = alloc_ioapic_entries();
>>>         if (!ioapic_entries) {
>>>                 pr_err("Allocate ioapic_entries failed\n");
>>>                 goto out;
>>>         }
>>> 
>>>         ret = save_IO_APIC_setup(ioapic_entries);
>>>         if (ret) {
>>>                 pr_info("Saving IO-APIC state failed: %d\n", ret);
>>>                 goto out;
>>>         }
>>> 
>>>         local_irq_save(flags);
>>>         mask_8259A();
>>>         mask_IO_APIC_setup(ioapic_entries);
>>> 
>>>         if (dmar_table_init_ret)
>>>                 ret = 0;
>>>         else
>>>                 ret = enable_IR();
>>> 
>>>         if (!ret) {
>>>                 /* IR is required if there is APIC ID > 255 even when
>>>                  * running under KVM
>>>                  */
>>>                 if (max_physical_apicid > 255 || !kvm_para_available())
>>>                         goto nox2apic;
>>>                 /*
>>>                  * without IR all CPUs can be addressed by IOAPIC/MSI
>>>                  * only in physical mode
>>>                  */
>>>                 x2apic_force_phys();
>>>         }
>>> 
>>>         x2apic_enabled = 1;
>>> 
>>>         if (x2apic_supported() && !x2apic_mode) {
>>>                 x2apic_mode = 1;
>>>                 enable_x2apic();
>>>                 pr_info("Enabled x2apic\n");
>>>         }
>>> 
>>> nox2apic:
>>>         if (!ret) /* IR enabling failed */
>>>                 restore_IO_APIC_setup(ioapic_entries);
>>>         unmask_8259A();
>>>         local_irq_restore(flags);
>>> 
>>> out:
>>>         if (ioapic_entries)
>>>                 free_ioapic_entries(ioapic_entries);
>>> 
>>>         if (x2apic_enabled)
>>>                 return;
>>> 
>>>         if (x2apic_preenabled)
>>>                 panic("x2apic: enabled by BIOS but kernel init failed.");
>>>         else if (cpu_has_x2apic)
>>>                 pr_info("Not enabling x2apic, Intr-remapping init failed.\n");
>>> }
>>> 
>>> 
>>> dmar_table_init() will return -ENODEV if no DMAR record is found.
>>> 
>>> 
>>> Juergen
>>> 
>>> --
>>> Juergen Gross                 Principal Developer Operating Systems
>>> TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
>>> Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
>>> Domagkstr. 28                           Internet: ts.fujitsu.com
>>> D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
>>> 
> 
> 
> 

* Re: ACPI-Tables corrupted?
  2010-07-29 10:24             ` Keir Fraser
@ 2010-07-30  4:47               ` Juergen Gross
  0 siblings, 0 replies; 23+ messages in thread
From: Juergen Gross @ 2010-07-30  4:47 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Jeremy Fitzhardinge, Jiang, Yunhong, Han, Weidong,
	xen-devel@lists.xensource.com

On 07/29/2010 12:24 PM, Keir Fraser wrote:
> Juergen,
>
> Please try xen-unstable:21886, from our staging tree at
> http://xenbits.xen.org/staging/xen-unstable.hg
>
> It does a bunch of kexec cleanup, and should also correctly reinstate the
> DMAR table before calling the crash kernel.
>
> If it works okay we can consider it for backport to 4.0.1.

The crash kernel comes up now.
And yes, please do a backport to 4.0.1!


Thanks, Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

* Re: ACPI-Tables corrupted?
  2010-07-28 12:45         ` Keir Fraser
  2010-07-28 13:27           ` Juergen Gross
@ 2010-08-06 13:39           ` Jan Beulich
  2010-08-06 14:04             ` Keir Fraser
  1 sibling, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2010-08-06 13:39 UTC (permalink / raw)
  To: Keir Fraser, Juergen Gross; +Cc: xen-devel@lists.xensource.com

>>> On 28.07.10 at 14:45, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 28/07/2010 13:13, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>>>> As Dom0 is a pv-kernel, it should be able to ignore this entry.
>>>> The crash kernel OTOH should not panic due to the trashed entry!
>>>> What is the correct solution here?
>>> 
>>> Could provide a cmdline option to not nobble the DMAR?
>> 
>> That's a possibility.
>> I wonder whether it wouldn't be better to let dom0 decide not to use it if
>> running under xen. This would remove the requirement for zapping the ACPI
>> table. IMO it's always a bad idea to change data of a deeper layer...
> 
> If we don't zap the DMAR then every existing dom0 kernel will fail with a new
> hypervisor.

Who decided that this zapping is going to work on all systems in the
future? E.g. I can't see why a BIOS shouldn't decide to put most of
the ACPI tables in chipset write protected memory, if a chipset offers
such? In such an environment, Dom0 would still see the original
signature - shouldn't we thus correct the original mistake of fiddling
with the ACPI tables as we now have to touch that code anyway?

Jan

* Re: ACPI-Tables corrupted?
  2010-08-06 13:39           ` Jan Beulich
@ 2010-08-06 14:04             ` Keir Fraser
  0 siblings, 0 replies; 23+ messages in thread
From: Keir Fraser @ 2010-08-06 14:04 UTC (permalink / raw)
  To: Jan Beulich, Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 06/08/2010 14:39, "Jan Beulich" <JBeulich@novell.com> wrote:

>> If we don't zap the DMAR then every existing dom0 kernel will fail with a new
>> hypervisor.
> 
> Who decided that this zapping is going to work on all systems in the
> future?

Someone at Intel.

> E.g. I can't see why a BIOS shouldn't decide to put most of
> the ACPI tables in chipset write protected memory, if a chipset offers
> such? In such an environment, Dom0 would still see the original
> signature - shouldn't we thus correct the original mistake of fiddling
> with the ACPI tables as we now have to touch that code anyway?

Well, it's harmless to try zapping it, I think, now that we reinstate it on kexec.
Trying to be smarter in dom0 *as well* is plausible, I guess.
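
A dom0-side check of that kind might look roughly like the following. This is a sketch under stated assumptions: the function name and the running_under_xen flag are hypothetical, and how a real dom0 kernel would detect Xen is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether this kernel should parse the DMAR table.
 * Under Xen the hypervisor owns the IOMMU, so dom0 skips the table even
 * if the signature is still intact, for example because the zap could not
 * modify write-protected memory.
 */
static bool dom0_should_parse_dmar(bool running_under_xen,
                                   const char signature[4])
{
    assert(signature != NULL);
    if (running_under_xen)
        return false;
    return memcmp(signature, "DMAR", 4) == 0;
}
```

The point of doing both is that the dom0 check does not depend on the zap having succeeded, while the zap still protects older dom0 kernels that lack the check.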

 -- Keir

end of thread, other threads:[~2010-08-06 14:04 UTC | newest]

Thread overview: 23+ messages
2010-07-28  9:38 ACPI-Tables corrupted? Juergen Gross
2010-07-28 10:03 ` Keir Fraser
2010-07-28 11:26   ` Juergen Gross
2010-07-28 11:39     ` Keir Fraser
2010-07-28 12:13       ` Juergen Gross
2010-07-28 12:45         ` Keir Fraser
2010-07-28 13:27           ` Juergen Gross
2010-07-28 13:36             ` Konrad Rzeszutek Wilk
2010-07-29  6:19               ` Juergen Gross
2010-07-29  6:39                 ` Keir Fraser
2010-07-29  6:48                   ` Juergen Gross
2010-07-29  6:48                     ` Keir Fraser
2010-07-29  6:53                       ` Juergen Gross
2010-08-06 13:39           ` Jan Beulich
2010-08-06 14:04             ` Keir Fraser
2010-07-29  6:31         ` Jiang, Yunhong
2010-07-29  6:40           ` Keir Fraser
2010-07-29  6:48             ` Keir Fraser
2010-07-29  7:37               ` Jiang, Yunhong
2010-07-29  9:04                 ` Juergen Gross
2010-07-29  9:33                 ` Keir Fraser
2010-07-29 10:24             ` Keir Fraser
2010-07-30  4:47               ` Juergen Gross