* [PATCH 0/2] PVH Dom0 on QEMU @ 2023-05-13 1:16 Stefano Stabellini 2023-05-13 1:17 ` [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation Stefano Stabellini 2023-05-13 1:17 ` [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping Stefano Stabellini 0 siblings, 2 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-13 1:16 UTC (permalink / raw) To: roger.pau, andrew.cooper3, jbeulich Cc: sstabellini, xen-devel, Xenia.Ragiadakou Hi all, These 2 patches are necessary to boot Xen and Dom0 PVH on QEMU (with or without KVM acceleration.) The first one is a genuine fix. The second one is a workaround: I don't know the underlying root cause of the problem. Cheers, Stefano ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-13 1:16 [PATCH 0/2] PVH Dom0 on QEMU Stefano Stabellini @ 2023-05-13 1:17 ` Stefano Stabellini 2023-05-15 8:48 ` Roger Pau Monné 2023-05-13 1:17 ` [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping Stefano Stabellini 1 sibling, 1 reply; 29+ messages in thread From: Stefano Stabellini @ 2023-05-13 1:17 UTC (permalink / raw) To: roger.pau, andrew.cooper3, jbeulich Cc: sstabellini, xen-devel, Xenia.Ragiadakou, Stefano Stabellini From: Stefano Stabellini <stefano.stabellini@amd.com> Xen always generates a XSDT table even if the firmware provided a RSDT table. Instead of copying the XSDT header from the firmware table (that might be missing), generate the XSDT header from a preset. Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> --- xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- 1 file changed, 9 insertions(+), 23 deletions(-) diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 307edc6a8c..5fde769863 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, paddr_t *addr) { struct acpi_table_xsdt *xsdt; - struct acpi_table_header *table; - struct acpi_table_rsdp *rsdp; const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; unsigned long size = sizeof(*xsdt); unsigned int i, j, num_tables = 0; - paddr_t xsdt_paddr; int rc; + struct acpi_table_header header = { + .signature = "XSDT", + .length = sizeof(struct acpi_table_header), + .revision = 0x1, + .oem_id = "Xen", + .oem_table_id = "HVM", + .oem_revision = 0, + }; /* * Restore original DMAR table signature, we are going to filter it from @@ -1001,26 +1006,7 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, goto out; } - /* Copy the native XSDT table header. */ - rsdp = acpi_os_map_memory(acpi_os_get_root_pointer(), sizeof(*rsdp)); - if ( !rsdp ) - { - printk("Unable to map RSDP\n"); - rc = -EINVAL; - goto out; - } - xsdt_paddr = rsdp->xsdt_physical_address; - acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); - table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); - if ( !table ) - { - printk("Unable to map XSDT\n"); - rc = -EINVAL; - goto out; - } - xsdt->header = *table; - acpi_os_unmap_memory(table, sizeof(*table)); - + xsdt->header = header; /* Add the custom MADT. */ xsdt->table_offset_entry[0] = madt_addr; -- 2.25.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-13 1:17 ` [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation Stefano Stabellini @ 2023-05-15 8:48 ` Roger Pau Monné 2023-05-15 14:14 ` Jan Beulich 0 siblings, 1 reply; 29+ messages in thread From: Roger Pau Monné @ 2023-05-15 8:48 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > From: Stefano Stabellini <stefano.stabellini@amd.com> > > Xen always generates a XSDT table even if the firmware provided a RSDT > table. Instead of copying the XSDT header from the firmware table (that > might be missing), generate the XSDT header from a preset. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > --- > xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > 1 file changed, 9 insertions(+), 23 deletions(-) > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > index 307edc6a8c..5fde769863 100644 > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > paddr_t *addr) > { > struct acpi_table_xsdt *xsdt; > - struct acpi_table_header *table; > - struct acpi_table_rsdp *rsdp; > const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > unsigned long size = sizeof(*xsdt); > unsigned int i, j, num_tables = 0; > - paddr_t xsdt_paddr; > int rc; > + struct acpi_table_header header = { > + .signature = "XSDT", > + .length = sizeof(struct acpi_table_header), > + .revision = 0x1, > + .oem_id = "Xen", > + .oem_table_id = "HVM", I think this is wrong, as according to the spec the OEM Table ID must match the OEM Table ID in the FADT. We likely want to copy the OEM ID and OEM Table ID from the RSDP, and possibly also the other OEM related fields. Alternatively we might want to copy and use the RSDT on systems that lack an XSDT, or even just copy the header from the RSDT into Xen's crafted XSDT, since the format of the RSDP and the XSDT headers are exactly the same (the difference is in the size of the description headers that come after). > + .oem_revision = 0, > + }; This wants to be initdata static const if we go down this route. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-15 8:48 ` Roger Pau Monné @ 2023-05-15 14:14 ` Jan Beulich 2023-05-16 0:16 ` Stefano Stabellini 0 siblings, 1 reply; 29+ messages in thread From: Jan Beulich @ 2023-05-15 14:14 UTC (permalink / raw) To: Roger Pau Monné, Stefano Stabellini Cc: andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On 15.05.2023 10:48, Roger Pau Monné wrote: > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: >> From: Stefano Stabellini <stefano.stabellini@amd.com> >> >> Xen always generates a XSDT table even if the firmware provided a RSDT >> table. Instead of copying the XSDT header from the firmware table (that >> might be missing), generate the XSDT header from a preset. >> >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> >> --- >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- >> 1 file changed, 9 insertions(+), 23 deletions(-) >> >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c >> index 307edc6a8c..5fde769863 100644 >> --- a/xen/arch/x86/hvm/dom0_build.c >> +++ b/xen/arch/x86/hvm/dom0_build.c >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, >> paddr_t *addr) >> { >> struct acpi_table_xsdt *xsdt; >> - struct acpi_table_header *table; >> - struct acpi_table_rsdp *rsdp; >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; >> unsigned long size = sizeof(*xsdt); >> unsigned int i, j, num_tables = 0; >> - paddr_t xsdt_paddr; >> int rc; >> + struct acpi_table_header header = { >> + .signature = "XSDT", >> + .length = sizeof(struct acpi_table_header), >> + .revision = 0x1, >> + .oem_id = "Xen", >> + .oem_table_id = "HVM", > > I think this is wrong, as according to the spec the OEM Table ID must > match the OEM Table ID in the FADT. > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > possibly also the other OEM related fields. > > Alternatively we might want to copy and use the RSDT on systems that > lack an XSDT, or even just copy the header from the RSDT into Xen's > crafted XSDT, since the format of the RSDP and the XSDT headers are > exactly the same (the difference is in the size of the description > headers that come after). I guess I'd prefer that last variant. ACPI specifically says "Platforms provide the RSDT to enable compatibility with ACPI 1.0 operating systems." IOW any halfway modern system (including qemu, that is) ought to supply an XSDT anyway. Jan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-15 14:14 ` Jan Beulich @ 2023-05-16 0:16 ` Stefano Stabellini 2023-05-16 6:13 ` Jan Beulich 2023-05-16 8:10 ` Roger Pau Monné 0 siblings, 2 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 0:16 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monné, Stefano Stabellini, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 2895 bytes --] On Mon, 15 May 2023, Jan Beulich wrote: > On 15.05.2023 10:48, Roger Pau Monné wrote: > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > >> > >> Xen always generates a XSDT table even if the firmware provided a RSDT > >> table. Instead of copying the XSDT header from the firmware table (that > >> might be missing), generate the XSDT header from a preset. > >> > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > >> --- > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > >> 1 file changed, 9 insertions(+), 23 deletions(-) > >> > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > >> index 307edc6a8c..5fde769863 100644 > >> --- a/xen/arch/x86/hvm/dom0_build.c > >> +++ b/xen/arch/x86/hvm/dom0_build.c > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > >> paddr_t *addr) > >> { > >> struct acpi_table_xsdt *xsdt; > >> - struct acpi_table_header *table; > >> - struct acpi_table_rsdp *rsdp; > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > >> unsigned long size = sizeof(*xsdt); > >> unsigned int i, j, num_tables = 0; > >> - paddr_t xsdt_paddr; > >> int rc; > >> + struct acpi_table_header header = { > >> + .signature = "XSDT", > >> + .length = sizeof(struct acpi_table_header), > >> + .revision = 0x1, > >> + .oem_id = "Xen", > >> + .oem_table_id = "HVM", > > > > I think this is wrong, as according to the spec the OEM Table ID must > > match the OEM Table ID in the FADT. > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > possibly also the other OEM related fields. > > > > Alternatively we might want to copy and use the RSDT on systems that > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > exactly the same (the difference is in the size of the description > > headers that come after). > > I guess I'd prefer that last variant. I tried this approach (together with the second patch for necessity) and it worked. diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index fd2cbf68bc..11d6d1bc23 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, goto out; } xsdt_paddr = rsdp->xsdt_physical_address; + if ( !xsdt_paddr ) + { + xsdt_paddr = rsdp->rsdt_physical_address; + } acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); if ( !table ) ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 0:16 ` Stefano Stabellini @ 2023-05-16 6:13 ` Jan Beulich 2023-05-16 8:10 ` Roger Pau Monné 1 sibling, 0 replies; 29+ messages in thread From: Jan Beulich @ 2023-05-16 6:13 UTC (permalink / raw) To: Stefano Stabellini Cc: Roger Pau Monné, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On 16.05.2023 02:16, Stefano Stabellini wrote: > On Mon, 15 May 2023, Jan Beulich wrote: >> On 15.05.2023 10:48, Roger Pau Monné wrote: >>> Alternatively we might want to copy and use the RSDT on systems that >>> lack an XSDT, or even just copy the header from the RSDT into Xen's >>> crafted XSDT, since the format of the RSDP and the XSDT headers are >>> exactly the same (the difference is in the size of the description >>> headers that come after). >> >> I guess I'd prefer that last variant. > > I tried this approach (together with the second patch for necessity) and > it worked. Which I find slightly surprising - a fully conforming consumer of the tables may expect an XSDT when xsdt_physical_address is set, and reject RSDT. We may still want to limit ourselves to this less involved approach, but then with a code comment clearly stating the limitation. Jan > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > goto out; > } > xsdt_paddr = rsdp->xsdt_physical_address; > + if ( !xsdt_paddr ) > + { > + xsdt_paddr = rsdp->rsdt_physical_address; > + } > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > if ( !table ) ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 0:16 ` Stefano Stabellini 2023-05-16 6:13 ` Jan Beulich @ 2023-05-16 8:10 ` Roger Pau Monné 2023-05-16 8:13 ` Roger Pau Monné 2023-05-16 8:24 ` Roger Pau Monné 1 sibling, 2 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-16 8:10 UTC (permalink / raw) To: Stefano Stabellini Cc: Jan Beulich, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > On Mon, 15 May 2023, Jan Beulich wrote: > > On 15.05.2023 10:48, Roger Pau Monné wrote: > > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > > >> > > >> Xen always generates a XSDT table even if the firmware provided a RSDT > > >> table. Instead of copying the XSDT header from the firmware table (that > > >> might be missing), generate the XSDT header from a preset. > > >> > > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > >> --- > > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > > >> 1 file changed, 9 insertions(+), 23 deletions(-) > > >> > > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > >> index 307edc6a8c..5fde769863 100644 > > >> --- a/xen/arch/x86/hvm/dom0_build.c > > >> +++ b/xen/arch/x86/hvm/dom0_build.c > > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > >> paddr_t *addr) > > >> { > > >> struct acpi_table_xsdt *xsdt; > > >> - struct acpi_table_header *table; > > >> - struct acpi_table_rsdp *rsdp; > > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > > >> unsigned long size = sizeof(*xsdt); > > >> unsigned int i, j, num_tables = 0; > > >> - paddr_t xsdt_paddr; > > >> int rc; > > >> + struct acpi_table_header header = { > > >> + .signature = "XSDT", > > >> + .length = sizeof(struct acpi_table_header), > > >> + .revision = 0x1, > > >> + .oem_id = "Xen", > > >> + .oem_table_id = "HVM", > > > > > > I think this is wrong, as according to the spec the OEM Table ID must > > > match the OEM Table ID in the FADT. > > > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > > possibly also the other OEM related fields. > > > > > > Alternatively we might want to copy and use the RSDT on systems that > > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > > exactly the same (the difference is in the size of the description > > > headers that come after). > > > > I guess I'd prefer that last variant. > > I tried this approach (together with the second patch for necessity) and > it worked. > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > index fd2cbf68bc..11d6d1bc23 100644 > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > goto out; > } > xsdt_paddr = rsdp->xsdt_physical_address; > + if ( !xsdt_paddr ) > + { > + xsdt_paddr = rsdp->rsdt_physical_address; > + } > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > if ( !table ) To be slightly more consistent, could you use: /* * Note the header is the same for both RSDT and XSDT, so it's fine to * copy the native RSDT header to the Xen crafted XSDT if no native * XSDT is available. */ if (rsdp->revision > 1 && rsdp->xsdt_physical_address) sdt_paddr = rsdp->xsdt_physical_address; else sdt_paddr = rsdp->rsdt_physical_address; It was an oversight of mine to not check for the RSDP revision, as RSDP < 2 will never have an XSDT. Also add: Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') Thanks. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 8:10 ` Roger Pau Monné @ 2023-05-16 8:13 ` Roger Pau Monné 2023-05-16 8:24 ` Roger Pau Monné 1 sibling, 0 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-16 8:13 UTC (permalink / raw) To: Stefano Stabellini Cc: Jan Beulich, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: > On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > > On Mon, 15 May 2023, Jan Beulich wrote: > > > On 15.05.2023 10:48, Roger Pau Monné wrote: > > > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > > > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > > > >> > > > >> Xen always generates a XSDT table even if the firmware provided a RSDT > > > >> table. Instead of copying the XSDT header from the firmware table (that > > > >> might be missing), generate the XSDT header from a preset. > > > >> > > > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > >> --- > > > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > > > >> 1 file changed, 9 insertions(+), 23 deletions(-) > > > >> > > > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > >> index 307edc6a8c..5fde769863 100644 > > > >> --- a/xen/arch/x86/hvm/dom0_build.c > > > >> +++ b/xen/arch/x86/hvm/dom0_build.c > > > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > >> paddr_t *addr) > > > >> { > > > >> struct acpi_table_xsdt *xsdt; > > > >> - struct acpi_table_header *table; > > > >> - struct acpi_table_rsdp *rsdp; > > > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > > > >> unsigned long size = sizeof(*xsdt); > > > >> unsigned int i, j, num_tables = 0; > > > >> - paddr_t xsdt_paddr; > > > >> int rc; > > > >> + struct acpi_table_header header = { > > > >> + .signature = "XSDT", > > > >> + .length = sizeof(struct acpi_table_header), > > > >> + .revision = 0x1, > > > >> + .oem_id = "Xen", > > > >> + .oem_table_id = "HVM", > > > > > > > > I think this is wrong, as according to the spec the OEM Table ID must > > > > match the OEM Table ID in the FADT. > > > > > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > > > possibly also the other OEM related fields. > > > > > > > > Alternatively we might want to copy and use the RSDT on systems that > > > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > > > exactly the same (the difference is in the size of the description > > > > headers that come after). > > > > > > I guess I'd prefer that last variant. > > > > I tried this approach (together with the second patch for necessity) and > > it worked. > > > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > index fd2cbf68bc..11d6d1bc23 100644 > > --- a/xen/arch/x86/hvm/dom0_build.c > > +++ b/xen/arch/x86/hvm/dom0_build.c > > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > goto out; > > } > > xsdt_paddr = rsdp->xsdt_physical_address; > > + if ( !xsdt_paddr ) > > + { > > + xsdt_paddr = rsdp->rsdt_physical_address; > > + } > > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > > if ( !table ) > > To be slightly more consistent, could you use: > > /* > * Note the header is the same for both RSDT and XSDT, so it's fine to > * copy the native RSDT header to the Xen crafted XSDT if no native > * XSDT is available. > */ > if (rsdp->revision > 1 && rsdp->xsdt_physical_address) > sdt_paddr = rsdp->xsdt_physical_address; > else > sdt_paddr = rsdp->rsdt_physical_address; (please add the missing spaces in the chunk above) ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 8:10 ` Roger Pau Monné 2023-05-16 8:13 ` Roger Pau Monné @ 2023-05-16 8:24 ` Roger Pau Monné 2023-05-16 9:13 ` Jan Beulich 2023-05-16 22:11 ` Stefano Stabellini 1 sibling, 2 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-16 8:24 UTC (permalink / raw) To: Stefano Stabellini Cc: Jan Beulich, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: > On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > > On Mon, 15 May 2023, Jan Beulich wrote: > > > On 15.05.2023 10:48, Roger Pau Monné wrote: > > > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > > > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > > > >> > > > >> Xen always generates a XSDT table even if the firmware provided a RSDT > > > >> table. Instead of copying the XSDT header from the firmware table (that > > > >> might be missing), generate the XSDT header from a preset. > > > >> > > > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > >> --- > > > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > > > >> 1 file changed, 9 insertions(+), 23 deletions(-) > > > >> > > > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > >> index 307edc6a8c..5fde769863 100644 > > > >> --- a/xen/arch/x86/hvm/dom0_build.c > > > >> +++ b/xen/arch/x86/hvm/dom0_build.c > > > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > >> paddr_t *addr) > > > >> { > > > >> struct acpi_table_xsdt *xsdt; > > > >> - struct acpi_table_header *table; > > > >> - struct acpi_table_rsdp *rsdp; > > > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > > > >> unsigned long size = sizeof(*xsdt); > > > >> unsigned int i, j, num_tables = 0; > > > >> - paddr_t xsdt_paddr; > > > >> int rc; > > > >> + struct acpi_table_header header = { > > > >> + .signature = "XSDT", > > > >> + .length = sizeof(struct acpi_table_header), > > > >> + .revision = 0x1, > > > >> + .oem_id = "Xen", > > > >> + .oem_table_id = "HVM", > > > > > > > > I think this is wrong, as according to the spec the OEM Table ID must > > > > match the OEM Table ID in the FADT. > > > > > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > > > possibly also the other OEM related fields. > > > > > > > > Alternatively we might want to copy and use the RSDT on systems that > > > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > > > exactly the same (the difference is in the size of the description > > > > headers that come after). > > > > > > I guess I'd prefer that last variant. > > > > I tried this approach (together with the second patch for necessity) and > > it worked. > > > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > index fd2cbf68bc..11d6d1bc23 100644 > > --- a/xen/arch/x86/hvm/dom0_build.c > > +++ b/xen/arch/x86/hvm/dom0_build.c > > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > goto out; > > } > > xsdt_paddr = rsdp->xsdt_physical_address; > > + if ( !xsdt_paddr ) > > + { > > + xsdt_paddr = rsdp->rsdt_physical_address; > > + } > > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > > if ( !table ) > > To be slightly more consistent, could you use: > > /* > * Note the header is the same for both RSDT and XSDT, so it's fine to > * copy the native RSDT header to the Xen crafted XSDT if no native > * XSDT is available. > */ > if (rsdp->revision > 1 && rsdp->xsdt_physical_address) > sdt_paddr = rsdp->xsdt_physical_address; > else > sdt_paddr = rsdp->rsdt_physical_address; > > It was an oversight of mine to not check for the RSDP revision, as > RSDP < 2 will never have an XSDT. Also add: > > Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') Just realized this will require some more work so that the guest (dom0) provided RSDP is at least revision 2. You will need to adjust the field and recalculate the checksum if needed. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 8:24 ` Roger Pau Monné @ 2023-05-16 9:13 ` Jan Beulich 2023-05-16 9:23 ` Roger Pau Monné 2023-05-16 22:11 ` Stefano Stabellini 1 sibling, 1 reply; 29+ messages in thread From: Jan Beulich @ 2023-05-16 9:13 UTC (permalink / raw) To: Roger Pau Monné, Stefano Stabellini Cc: andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On 16.05.2023 10:24, Roger Pau Monné wrote: > On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: >> On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: >>> On Mon, 15 May 2023, Jan Beulich wrote: >>>> On 15.05.2023 10:48, Roger Pau Monné wrote: >>>>> On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: >>>>>> From: Stefano Stabellini <stefano.stabellini@amd.com> >>>>>> >>>>>> Xen always generates a XSDT table even if the firmware provided a RSDT >>>>>> table. Instead of copying the XSDT header from the firmware table (that >>>>>> might be missing), generate the XSDT header from a preset. >>>>>> >>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> >>>>>> --- >>>>>> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- >>>>>> 1 file changed, 9 insertions(+), 23 deletions(-) >>>>>> >>>>>> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c >>>>>> index 307edc6a8c..5fde769863 100644 >>>>>> --- a/xen/arch/x86/hvm/dom0_build.c >>>>>> +++ b/xen/arch/x86/hvm/dom0_build.c >>>>>> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, >>>>>> paddr_t *addr) >>>>>> { >>>>>> struct acpi_table_xsdt *xsdt; >>>>>> - struct acpi_table_header *table; >>>>>> - struct acpi_table_rsdp *rsdp; >>>>>> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; >>>>>> unsigned long size = sizeof(*xsdt); >>>>>> unsigned int i, j, num_tables = 0; >>>>>> - paddr_t xsdt_paddr; >>>>>> int rc; >>>>>> + struct acpi_table_header header = { >>>>>> + .signature = "XSDT", >>>>>> + .length = sizeof(struct acpi_table_header), >>>>>> + .revision = 0x1, >>>>>> + .oem_id = "Xen", >>>>>> + .oem_table_id = "HVM", >>>>> >>>>> I think this is wrong, as according to the spec the OEM Table ID must >>>>> match the OEM Table ID in the FADT. >>>>> >>>>> We likely want to copy the OEM ID and OEM Table ID from the RSDP, and >>>>> possibly also the other OEM related fields. >>>>> >>>>> Alternatively we might want to copy and use the RSDT on systems that >>>>> lack an XSDT, or even just copy the header from the RSDT into Xen's >>>>> crafted XSDT, since the format of the RSDP and the XSDT headers are >>>>> exactly the same (the difference is in the size of the description >>>>> headers that come after). >>>> >>>> I guess I'd prefer that last variant. >>> >>> I tried this approach (together with the second patch for necessity) and >>> it worked. >>> >>> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c >>> index fd2cbf68bc..11d6d1bc23 100644 >>> --- a/xen/arch/x86/hvm/dom0_build.c >>> +++ b/xen/arch/x86/hvm/dom0_build.c >>> @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, >>> goto out; >>> } >>> xsdt_paddr = rsdp->xsdt_physical_address; >>> + if ( !xsdt_paddr ) >>> + { >>> + xsdt_paddr = rsdp->rsdt_physical_address; >>> + } >>> acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); >>> table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); >>> if ( !table ) >> >> To be slightly more consistent, could you use: >> >> /* >> * Note the header is the same for both RSDT and XSDT, so it's fine to >> * copy the native RSDT header to the Xen crafted XSDT if no native >> * XSDT is available. >> */ >> if (rsdp->revision > 1 && rsdp->xsdt_physical_address) >> sdt_paddr = rsdp->xsdt_physical_address; >> else >> sdt_paddr = rsdp->rsdt_physical_address; >> >> It was an oversight of mine to not check for the RSDP revision, as >> RSDP < 2 will never have an XSDT. Also add: >> >> Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') > > Just realized this will require some more work so that the guest > (dom0) provided RSDP is at least revision 2. You will need to adjust > the field and recalculate the checksum if needed. We could also mandate ACPI version >= 2 for PVH Dom0. Jan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 9:13 ` Jan Beulich @ 2023-05-16 9:23 ` Roger Pau Monné 0 siblings, 0 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-16 9:23 UTC (permalink / raw) To: Jan Beulich Cc: Stefano Stabellini, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Tue, May 16, 2023 at 11:13:17AM +0200, Jan Beulich wrote: > On 16.05.2023 10:24, Roger Pau Monné wrote: > > On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: > >> On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > >>> On Mon, 15 May 2023, Jan Beulich wrote: > >>>> On 15.05.2023 10:48, Roger Pau Monné wrote: > >>>>> On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > >>>>>> From: Stefano Stabellini <stefano.stabellini@amd.com> > >>>>>> > >>>>>> Xen always generates a XSDT table even if the firmware provided a RSDT > >>>>>> table. Instead of copying the XSDT header from the firmware table (that > >>>>>> might be missing), generate the XSDT header from a preset. > >>>>>> > >>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > >>>>>> --- > >>>>>> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > >>>>>> 1 file changed, 9 insertions(+), 23 deletions(-) > >>>>>> > >>>>>> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > >>>>>> index 307edc6a8c..5fde769863 100644 > >>>>>> --- a/xen/arch/x86/hvm/dom0_build.c > >>>>>> +++ b/xen/arch/x86/hvm/dom0_build.c > >>>>>> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > >>>>>> paddr_t *addr) > >>>>>> { > >>>>>> struct acpi_table_xsdt *xsdt; > >>>>>> - struct acpi_table_header *table; > >>>>>> - struct acpi_table_rsdp *rsdp; > >>>>>> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > >>>>>> unsigned long size = sizeof(*xsdt); > >>>>>> unsigned int i, j, num_tables = 0; > >>>>>> - paddr_t xsdt_paddr; > >>>>>> int rc; > >>>>>> + struct acpi_table_header header = { > >>>>>> + .signature = "XSDT", > >>>>>> + .length = sizeof(struct acpi_table_header), > >>>>>> + .revision = 0x1, > >>>>>> + .oem_id = "Xen", > >>>>>> + .oem_table_id = "HVM", > >>>>> > >>>>> I think this is wrong, as according to the spec the OEM Table ID must > >>>>> match the OEM Table ID in the FADT. > >>>>> > >>>>> We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > >>>>> possibly also the other OEM related fields. > >>>>> > >>>>> Alternatively we might want to copy and use the RSDT on systems that > >>>>> lack an XSDT, or even just copy the header from the RSDT into Xen's > >>>>> crafted XSDT, since the format of the RSDP and the XSDT headers are > >>>>> exactly the same (the difference is in the size of the description > >>>>> headers that come after). > >>>> > >>>> I guess I'd prefer that last variant. > >>> > >>> I tried this approach (together with the second patch for necessity) and > >>> it worked. > >>> > >>> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > >>> index fd2cbf68bc..11d6d1bc23 100644 > >>> --- a/xen/arch/x86/hvm/dom0_build.c > >>> +++ b/xen/arch/x86/hvm/dom0_build.c > >>> @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > >>> goto out; > >>> } > >>> xsdt_paddr = rsdp->xsdt_physical_address; > >>> + if ( !xsdt_paddr ) > >>> + { > >>> + xsdt_paddr = rsdp->rsdt_physical_address; > >>> + } > >>> acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > >>> table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > >>> if ( !table ) > >> > >> To be slightly more consistent, could you use: > >> > >> /* > >> * Note the header is the same for both RSDT and XSDT, so it's fine to > >> * copy the native RSDT header to the Xen crafted XSDT if no native > >> * XSDT is available. > >> */ > >> if (rsdp->revision > 1 && rsdp->xsdt_physical_address) > >> sdt_paddr = rsdp->xsdt_physical_address; > >> else > >> sdt_paddr = rsdp->rsdt_physical_address; > >> > >> It was an oversight of mine to not check for the RSDP revision, as > >> RSDP < 2 will never have an XSDT. Also add: > >> > >> Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') > > > > Just realized this will require some more work so that the guest > > (dom0) provided RSDP is at least revision 2. You will need to adjust > > the field and recalculate the checksum if needed. > > We could also mandate ACPI version >= 2 for PVH Dom0. Sorry, mentioned on IRC, the above is not required because the RSDP provided to dom0 is already crafted by Xen and unconditionally set to version == 2. There's no need to adjust the RSDP at all. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 8:24 ` Roger Pau Monné 2023-05-16 9:13 ` Jan Beulich @ 2023-05-16 22:11 ` Stefano Stabellini 2023-05-17 8:42 ` Roger Pau Monné 1 sibling, 1 reply; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 22:11 UTC (permalink / raw) To: Roger Pau Monné Cc: Stefano Stabellini, Jan Beulich, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 4359 bytes --] On Tue, 16 May 2023, Roger Pau Monné wrote: > On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: > > On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > > > On Mon, 15 May 2023, Jan Beulich wrote: > > > > On 15.05.2023 10:48, Roger Pau Monné wrote: > > > > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > > > > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > >> > > > > >> Xen always generates a XSDT table even if the firmware provided a RSDT > > > > >> table. Instead of copying the XSDT header from the firmware table (that > > > > >> might be missing), generate the XSDT header from a preset. > > > > >> > > > > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > >> --- > > > > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > > > > >> 1 file changed, 9 insertions(+), 23 deletions(-) > > > > >> > > > > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > > >> index 307edc6a8c..5fde769863 100644 > > > > >> --- a/xen/arch/x86/hvm/dom0_build.c > > > > >> +++ b/xen/arch/x86/hvm/dom0_build.c > > > > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > > >> paddr_t *addr) > > > > >> { > > > > >> struct acpi_table_xsdt *xsdt; > > > > >> - struct acpi_table_header *table; > > > > >> - struct acpi_table_rsdp *rsdp; > > > > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > > > > >> unsigned long size = sizeof(*xsdt); > > > > >> unsigned int i, j, num_tables = 0; > > > > >> - paddr_t xsdt_paddr; > > > > >> int rc; > > > > >> + struct acpi_table_header header = { > > > > >> + .signature = "XSDT", > > > > >> + .length = sizeof(struct acpi_table_header), > > > > >> + .revision = 0x1, > > > > >> + .oem_id = "Xen", > > > > >> + .oem_table_id = "HVM", > > > > > > > > > > I think this is wrong, as according to the spec the OEM Table ID must > > > > > match the OEM Table ID in the FADT. > > > > > > > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > > > > possibly also the other OEM related fields. > > > > > > > > > > Alternatively we might want to copy and use the RSDT on systems that > > > > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > > > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > > > > exactly the same (the difference is in the size of the description > > > > > headers that come after). > > > > > > > > I guess I'd prefer that last variant. > > > > > > I tried this approach (together with the second patch for necessity) and > > > it worked. > > > > > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > index fd2cbf68bc..11d6d1bc23 100644 > > > --- a/xen/arch/x86/hvm/dom0_build.c > > > +++ b/xen/arch/x86/hvm/dom0_build.c > > > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > goto out; > > > } > > > xsdt_paddr = rsdp->xsdt_physical_address; > > > + if ( !xsdt_paddr ) > > > + { > > > + xsdt_paddr = rsdp->rsdt_physical_address; > > > + } > > > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > > > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > > > if ( !table ) > > > > To be slightly more consistent, could you use: > > > > /* > > * Note the header is the same for both RSDT and XSDT, so it's fine to > > * copy the native RSDT header to the Xen crafted XSDT if no native > > * XSDT is available. > > */ > > if (rsdp->revision > 1 && rsdp->xsdt_physical_address) > > sdt_paddr = rsdp->xsdt_physical_address; > > else > > sdt_paddr = rsdp->rsdt_physical_address; > > > > It was an oversight of mine to not check for the RSDP revision, as > > RSDP < 2 will never have an XSDT. Also add: > > > > Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') > > Just realized this will require some more work so that the guest > (dom0) provided RSDP is at least revision 2. You will need to adjust > the field and recalculate the checksum if needed. But we are always providing RSDP version 2 in pvh_setup_acpi, right? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation 2023-05-16 22:11 ` Stefano Stabellini @ 2023-05-17 8:42 ` Roger Pau Monné 0 siblings, 0 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-17 8:42 UTC (permalink / raw) To: Stefano Stabellini Cc: Jan Beulich, andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Tue, May 16, 2023 at 03:11:49PM -0700, Stefano Stabellini wrote: > On Tue, 16 May 2023, Roger Pau Monné wrote: > > On Tue, May 16, 2023 at 10:10:07AM +0200, Roger Pau Monné wrote: > > > On Mon, May 15, 2023 at 05:16:27PM -0700, Stefano Stabellini wrote: > > > > On Mon, 15 May 2023, Jan Beulich wrote: > > > > > On 15.05.2023 10:48, Roger Pau Monné wrote: > > > > > > On Fri, May 12, 2023 at 06:17:19PM -0700, Stefano Stabellini wrote: > > > > > >> From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > >> > > > > > >> Xen always generates a XSDT table even if the firmware provided a RSDT > > > > > >> table. Instead of copying the XSDT header from the firmware table (that > > > > > >> might be missing), generate the XSDT header from a preset. > > > > > >> > > > > > >> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > >> --- > > > > > >> xen/arch/x86/hvm/dom0_build.c | 32 +++++++++----------------------- > > > > > >> 1 file changed, 9 insertions(+), 23 deletions(-) > > > > > >> > > > > > >> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > > > >> index 307edc6a8c..5fde769863 100644 > > > > > >> --- a/xen/arch/x86/hvm/dom0_build.c > > > > > >> +++ b/xen/arch/x86/hvm/dom0_build.c > > > > > >> @@ -963,13 +963,18 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > > > >> paddr_t *addr) > > > > > >> { > > > > > >> struct acpi_table_xsdt *xsdt; > > > > > >> - struct acpi_table_header *table; > > > > > >> - struct acpi_table_rsdp *rsdp; > > > > > >> const struct acpi_table_desc *tables = acpi_gbl_root_table_list.tables; > > > > > >> unsigned long size = sizeof(*xsdt); > > > > > >> unsigned int i, j, num_tables = 0; > > > > > >> - paddr_t xsdt_paddr; > > > > > >> int rc; > > > > > >> + struct acpi_table_header header = { > > > > > >> + .signature = "XSDT", > > > > > >> + .length = sizeof(struct acpi_table_header), > > > > > >> + .revision = 0x1, > > > > > >> + .oem_id = "Xen", > > > > > >> + .oem_table_id = "HVM", > > > > > > > > > > > > I think this is wrong, as according to the spec the OEM Table ID must > > > > > > match the OEM Table ID in the FADT. > > > > > > > > > > > > We likely want to copy the OEM ID and OEM Table ID from the RSDP, and > > > > > > possibly also the other OEM related fields. > > > > > > > > > > > > Alternatively we might want to copy and use the RSDT on systems that > > > > > > lack an XSDT, or even just copy the header from the RSDT into Xen's > > > > > > crafted XSDT, since the format of the RSDP and the XSDT headers are > > > > > > exactly the same (the difference is in the size of the description > > > > > > headers that come after). > > > > > > > > > > I guess I'd prefer that last variant. > > > > > > > > I tried this approach (together with the second patch for necessity) and > > > > it worked. > > > > > > > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > > > > index fd2cbf68bc..11d6d1bc23 100644 > > > > --- a/xen/arch/x86/hvm/dom0_build.c > > > > +++ b/xen/arch/x86/hvm/dom0_build.c > > > > @@ -967,6 +967,10 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > > > > goto out; > > > > } > > > > xsdt_paddr = rsdp->xsdt_physical_address; > > > > + if ( !xsdt_paddr ) > > > > + { > > > > + xsdt_paddr = rsdp->rsdt_physical_address; > > > > + } > > > > acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); > > > > table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); > > > > if ( !table ) > > > > > > To be slightly more consistent, could you use: > > > > > > /* > > > * Note the header is the same for both RSDT and XSDT, so it's fine to > > > * copy the native RSDT header to the Xen crafted XSDT if no native > > > * XSDT is available. > > > */ > > > if (rsdp->revision > 1 && rsdp->xsdt_physical_address) > > > sdt_paddr = rsdp->xsdt_physical_address; > > > else > > > sdt_paddr = rsdp->rsdt_physical_address; > > > > > > It was an oversight of mine to not check for the RSDP revision, as > > > RSDP < 2 will never have an XSDT. Also add: > > > > > > Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') > > > > Just realized this will require some more work so that the guest > > (dom0) provided RSDP is at least revision 2. You will need to adjust > > the field and recalculate the checksum if needed. > > But we are always providing RSDP version 2 in pvh_setup_acpi, right? Yes, as said in the reply to Jan, just ignore this. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-13 1:16 [PATCH 0/2] PVH Dom0 on QEMU Stefano Stabellini 2023-05-13 1:17 ` [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation Stefano Stabellini @ 2023-05-13 1:17 ` Stefano Stabellini 2023-05-15 9:44 ` Roger Pau Monné 2023-05-15 14:17 ` Jan Beulich 1 sibling, 2 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-13 1:17 UTC (permalink / raw) To: roger.pau, andrew.cooper3, jbeulich Cc: sstabellini, xen-devel, Xenia.Ragiadakou, Stefano Stabellini From: Stefano Stabellini <stefano.stabellini@amd.com> Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of the tables in the guest. Instead, copy the tables to Dom0. This is a workaround. Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> --- As mentioned in the cover letter, this is a RFC workaround as I don't know the cause of the underlying problem. I do know that this patch solves what would be otherwise a hang at boot when Dom0 PVH attempts to parse ACPI tables. --- xen/arch/x86/hvm/dom0_build.c | 107 +++++++++------------------------- 1 file changed, 27 insertions(+), 80 deletions(-) diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 5fde769863..a6037fc6ed 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -73,32 +73,6 @@ static void __init print_order_stats(const struct domain *d) printk("order %2u allocations: %u\n", i, order_stats[i]); } -static int __init modify_identity_mmio(struct domain *d, unsigned long pfn, - unsigned long nr_pages, const bool map) -{ - int rc; - - for ( ; ; ) - { - rc = map ? map_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)) - : unmap_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)); - if ( rc == 0 ) - break; - if ( rc < 0 ) - { - printk(XENLOG_WARNING - "Failed to identity %smap [%#lx,%#lx) for d%d: %d\n", - map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, rc); - break; - } - nr_pages -= rc; - pfn += rc; - process_pending_softirqs(); - } - - return rc; -} - /* Populate a HVM memory range using the biggest possible order. */ static int __init pvh_populate_memory_range(struct domain *d, unsigned long start, @@ -967,6 +941,8 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, unsigned long size = sizeof(*xsdt); unsigned int i, j, num_tables = 0; int rc; + struct acpi_table_fadt fadt; + unsigned long fadt_addr = 0, dsdt_addr = 0, facs_addr = 0, fadt_size = 0; struct acpi_table_header header = { .signature = "XSDT", .length = sizeof(struct acpi_table_header), @@ -1013,10 +989,33 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, /* Copy the addresses of the rest of the allowed tables. */ for( i = 0, j = 1; i < acpi_gbl_root_table_list.count; i++ ) { + void *table; + + pvh_steal_ram(d, tables[i].length, 0, GB(4), addr); + table = acpi_os_map_memory(tables[i].address, tables[i].length); + hvm_copy_to_guest_phys(*addr, table, tables[i].length, d->vcpu[0]); + pvh_add_mem_range(d, *addr, *addr + tables[i].length, E820_ACPI); + + if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_FADT, ACPI_NAME_SIZE) ) + { + memcpy(&fadt, table, tables[i].length); + fadt_addr = *addr; + fadt_size = tables[i].length; + } + else if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_DSDT, ACPI_NAME_SIZE) ) + dsdt_addr = *addr; + else if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_FACS, ACPI_NAME_SIZE) ) + facs_addr = *addr; + if ( pvh_acpi_xsdt_table_allowed(tables[i].signature.ascii, - tables[i].address, tables[i].length) ) - xsdt->table_offset_entry[j++] = tables[i].address; + tables[i].address, tables[i].length) ) + xsdt->table_offset_entry[j++] = *addr; + + acpi_os_unmap_memory(table, tables[i].length); } + fadt.dsdt = dsdt_addr; + fadt.facs = facs_addr; + hvm_copy_to_guest_phys(fadt_addr, &fadt, fadt_size, d->vcpu[0]); xsdt->header.revision = 1; xsdt->header.length = size; @@ -1055,9 +1054,7 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, static int __init pvh_setup_acpi(struct domain *d, paddr_t start_info) { - unsigned long pfn, nr_pages; paddr_t madt_paddr, xsdt_paddr, rsdp_paddr; - unsigned int i; int rc; struct acpi_table_rsdp *native_rsdp, rsdp = { .signature = ACPI_SIG_RSDP, @@ -1065,56 +1062,6 @@ static int __init pvh_setup_acpi(struct domain *d, paddr_t start_info) .length = sizeof(rsdp), }; - - /* Scan top-level tables and add their regions to the guest memory map. */ - for( i = 0; i < acpi_gbl_root_table_list.count; i++ ) - { - const char *sig = acpi_gbl_root_table_list.tables[i].signature.ascii; - unsigned long addr = acpi_gbl_root_table_list.tables[i].address; - unsigned long size = acpi_gbl_root_table_list.tables[i].length; - - /* - * Make sure the original MADT is also mapped, so that Dom0 can - * properly access the data returned by _MAT methods in case it's - * re-using MADT memory. - */ - if ( strncmp(sig, ACPI_SIG_MADT, ACPI_NAME_SIZE) - ? pvh_acpi_table_allowed(sig, addr, size) - : !acpi_memory_banned(addr, size) ) - pvh_add_mem_range(d, addr, addr + size, E820_ACPI); - } - - /* Identity map ACPI e820 regions. */ - for ( i = 0; i < d->arch.nr_e820; i++ ) - { - if ( d->arch.e820[i].type != E820_ACPI && - d->arch.e820[i].type != E820_NVS ) - continue; - - pfn = PFN_DOWN(d->arch.e820[i].addr); - nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) + - d->arch.e820[i].size); - - /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */ - if ( pfn < PFN_DOWN(MB(1)) ) - { - if ( pfn + nr_pages <= PFN_DOWN(MB(1)) ) - continue; - - /* This shouldn't happen, but is easy to deal with. */ - nr_pages -= PFN_DOWN(MB(1)) - pfn; - pfn = PFN_DOWN(MB(1)); - } - - rc = modify_identity_mmio(d, pfn, nr_pages, true); - if ( rc ) - { - printk("Failed to map ACPI region [%#lx, %#lx) into Dom0 memory map\n", - pfn, pfn + nr_pages); - return rc; - } - } - rc = pvh_setup_acpi_madt(d, &madt_paddr); if ( rc ) return rc; -- 2.25.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-13 1:17 ` [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping Stefano Stabellini @ 2023-05-15 9:44 ` Roger Pau Monné 2023-05-16 0:11 ` Stefano Stabellini 2023-05-15 14:17 ` Jan Beulich 1 sibling, 1 reply; 29+ messages in thread From: Roger Pau Monné @ 2023-05-15 9:44 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > From: Stefano Stabellini <stefano.stabellini@amd.com> > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > the tables in the guest. Instead, copy the tables to Dom0. > > This is a workaround. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > --- > As mentioned in the cover letter, this is a RFC workaround as I don't > know the cause of the underlying problem. I do know that this patch > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > parse ACPI tables. I'm unsure how safe this is for native systems, as it's possible for firmware to modify the data in the tables, so copying them would break that functionality. I think we need to get to the root cause that triggers this behavior on QEMU. Is it the table checksum that fail, or something else? Is there an error from Linux you could reference? I've got some feedback below, but I'm unsure copying is the correct approach. > --- > xen/arch/x86/hvm/dom0_build.c | 107 +++++++++------------------------- > 1 file changed, 27 insertions(+), 80 deletions(-) > > diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c > index 5fde769863..a6037fc6ed 100644 > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -73,32 +73,6 @@ static void __init print_order_stats(const struct domain *d) > printk("order %2u allocations: %u\n", i, order_stats[i]); > } > > -static int __init modify_identity_mmio(struct domain *d, unsigned long pfn, > - unsigned long nr_pages, const bool map) > -{ > - int rc; > - > - for ( ; ; ) > - { > - rc = map ? map_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)) > - : unmap_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn)); > - if ( rc == 0 ) > - break; > - if ( rc < 0 ) > - { > - printk(XENLOG_WARNING > - "Failed to identity %smap [%#lx,%#lx) for d%d: %d\n", > - map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, rc); > - break; > - } > - nr_pages -= rc; > - pfn += rc; > - process_pending_softirqs(); > - } > - > - return rc; > -} > - > /* Populate a HVM memory range using the biggest possible order. */ > static int __init pvh_populate_memory_range(struct domain *d, > unsigned long start, > @@ -967,6 +941,8 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > unsigned long size = sizeof(*xsdt); > unsigned int i, j, num_tables = 0; > int rc; > + struct acpi_table_fadt fadt; > + unsigned long fadt_addr = 0, dsdt_addr = 0, facs_addr = 0, fadt_size = 0; paddr_t and size_t would be better. > struct acpi_table_header header = { > .signature = "XSDT", > .length = sizeof(struct acpi_table_header), > @@ -1013,10 +989,33 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, > /* Copy the addresses of the rest of the allowed tables. */ > for( i = 0, j = 1; i < acpi_gbl_root_table_list.count; i++ ) > { > + void *table; const __iomem. > + > + pvh_steal_ram(d, tables[i].length, 0, GB(4), addr); > + table = acpi_os_map_memory(tables[i].address, tables[i].length); > + hvm_copy_to_guest_phys(*addr, table, tables[i].length, d->vcpu[0]); > + pvh_add_mem_range(d, *addr, *addr + tables[i].length, E820_ACPI); Need to check for errors in the calls above. > + > + if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_FADT, ACPI_NAME_SIZE) ) > + { > + memcpy(&fadt, table, tables[i].length); > + fadt_addr = *addr; > + fadt_size = tables[i].length; > + } > + else if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_DSDT, ACPI_NAME_SIZE) ) > + dsdt_addr = *addr; > + else if ( !strncmp(tables[i].signature.ascii, ACPI_SIG_FACS, ACPI_NAME_SIZE) ) > + facs_addr = *addr; Wrong indentation. > + > if ( pvh_acpi_xsdt_table_allowed(tables[i].signature.ascii, > - tables[i].address, tables[i].length) ) > - xsdt->table_offset_entry[j++] = tables[i].address; > + tables[i].address, tables[i].length) ) Unrelated withe space adjustment? Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-15 9:44 ` Roger Pau Monné @ 2023-05-16 0:11 ` Stefano Stabellini 2023-05-16 0:38 ` Stefano Stabellini ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 0:11 UTC (permalink / raw) To: Roger Pau Monné Cc: Stefano Stabellini, andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 4778 bytes --] On Mon, 15 May 2023, Roger Pau Monné wrote: > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > the tables in the guest. Instead, copy the tables to Dom0. > > > > This is a workaround. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > --- > > As mentioned in the cover letter, this is a RFC workaround as I don't > > know the cause of the underlying problem. I do know that this patch > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > parse ACPI tables. > > I'm unsure how safe this is for native systems, as it's possible for > firmware to modify the data in the tables, so copying them would > break that functionality. > > I think we need to get to the root cause that triggers this behavior > on QEMU. Is it the table checksum that fail, or something else? Is > there an error from Linux you could reference? I agree with you but so far I haven't managed to find a way to the root of the issue. Here is what I know. These are the logs of a successful boot using this patch: [ 10.437488] ACPI: Early table checksum verification disabled [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] And these are the logs of the same boot (unsuccessful) without this patch: [ 10.516015] ACPI: Early table checksum verification disabled [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] It is clearly a memory corruption around FACS but I couldn't find the reason for it. The mapping code looks correct. I hope you can suggest a way to narrow down the problem. If I could, I would suggest to apply this patch just for the QEMU PVH tests but we don't have the infrastructure for that in gitlab-ci as there is a single Xen build for all tests. If it helps to repro on your side, you can just do the following, assuming your Xen repo is in /local/repos/xen: cd /local/repos/xen mkdir binaries cd binaries mkdir -p dist/install/ docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12 cp /initrd* /local/repos/xen/binaries exit docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19 cp /bzImage /local/repos/xen/binaries exit That's it. Now you have enough pre-built binaries to repro the issue. Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add dom0=pvh dom0_mem=1G dom0-iommu=none on the Xen command line. I also removed "timeout" and pipe "tee" at the end for my own convenience: # Run the test -rm -f smoke.serial -set +e -timeout -k 1 720 \ qemu-system-x86_64 \ -cpu qemu64,+svm \ -m 2G -smp 2 \ -monitor none -serial stdio \ -nographic \ -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial + -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 make sure to build the Xen hypervisor binary and place the binary under /local/repos/xen/binaries/ You can finally run the test with the below: cd .. docker run -it -v /local/repos/xen:/local/repos/xen registry.gitlab.com/xen-project/xen/debian:unstable cd /local/repos/xen bash automation/scripts/qemu-alpine-x86_64.sh It usually gets stuck halfway through the boot without this patch. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-16 0:11 ` Stefano Stabellini @ 2023-05-16 0:38 ` Stefano Stabellini 2023-05-16 6:27 ` Jan Beulich 2023-05-16 9:21 ` Roger Pau Monné 2 siblings, 0 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 0:38 UTC (permalink / raw) To: Stefano Stabellini Cc: Roger Pau Monné, andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 3153 bytes --] On Mon, 15 May 2023, Stefano Stabellini wrote: > On Mon, 15 May 2023, Roger Pau Monné wrote: > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > This is a workaround. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > --- > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > know the cause of the underlying problem. I do know that this patch > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > parse ACPI tables. > > > > I'm unsure how safe this is for native systems, as it's possible for > > firmware to modify the data in the tables, so copying them would > > break that functionality. > > > > I think we need to get to the root cause that triggers this behavior > > on QEMU. Is it the table checksum that fail, or something else? Is > > there an error from Linux you could reference? > > I agree with you but so far I haven't managed to find a way to the root > of the issue. Here is what I know. These are the logs of a successful > boot using this patch: > > [ 10.437488] ACPI: Early table checksum verification disabled > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > And these are the logs of the same boot (unsuccessful) without this > patch: > > [ 10.516015] ACPI: Early table checksum verification disabled > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > It is clearly a memory corruption around FACS Sorry I meant FACP/FADT ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-16 0:11 ` Stefano Stabellini 2023-05-16 0:38 ` Stefano Stabellini @ 2023-05-16 6:27 ` Jan Beulich 2023-05-16 9:21 ` Roger Pau Monné 2 siblings, 0 replies; 29+ messages in thread From: Jan Beulich @ 2023-05-16 6:27 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, xen-devel, Xenia.Ragiadakou, Stefano Stabellini, Roger Pau Monné On 16.05.2023 02:11, Stefano Stabellini wrote: > On Mon, 15 May 2023, Roger Pau Monné wrote: >> On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: >>> From: Stefano Stabellini <stefano.stabellini@amd.com> >>> >>> Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of >>> the tables in the guest. Instead, copy the tables to Dom0. >>> >>> This is a workaround. >>> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> >>> --- >>> As mentioned in the cover letter, this is a RFC workaround as I don't >>> know the cause of the underlying problem. I do know that this patch >>> solves what would be otherwise a hang at boot when Dom0 PVH attempts to >>> parse ACPI tables. >> >> I'm unsure how safe this is for native systems, as it's possible for >> firmware to modify the data in the tables, so copying them would >> break that functionality. >> >> I think we need to get to the root cause that triggers this behavior >> on QEMU. Is it the table checksum that fail, or something else? Is >> there an error from Linux you could reference? > > I agree with you but so far I haven't managed to find a way to the root > of the issue. Here is what I know. These are the logs of a successful > boot using this patch: > > [ 10.437488] ACPI: Early table checksum verification disabled > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) With this line I wouldn't really call it a "successful boot". > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > And these are the logs of the same boot (unsuccessful) without this > patch: > > [ 10.516015] ACPI: Early table checksum verification disabled > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) By instrumenting the kernel a little, it shouldn't be difficult to spot where this bogus address is coming from. Inspecting the relevant tables right before Dom0 actually starts and then again slightly later (perhaps triggered by Dom0 itself, again via slight instrumentation) from Xen should also be possible. I kind of expect they're going to be okay right after construction, and become corrupted subsequently. Did you check that the E820 table properly covers the ACPI table range(s)? Jan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-16 0:11 ` Stefano Stabellini 2023-05-16 0:38 ` Stefano Stabellini 2023-05-16 6:27 ` Jan Beulich @ 2023-05-16 9:21 ` Roger Pau Monné 2023-05-16 23:34 ` Stefano Stabellini 2 siblings, 1 reply; 29+ messages in thread From: Roger Pau Monné @ 2023-05-16 9:21 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote: > On Mon, 15 May 2023, Roger Pau Monné wrote: > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > This is a workaround. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > --- > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > know the cause of the underlying problem. I do know that this patch > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > parse ACPI tables. > > > > I'm unsure how safe this is for native systems, as it's possible for > > firmware to modify the data in the tables, so copying them would > > break that functionality. > > > > I think we need to get to the root cause that triggers this behavior > > on QEMU. Is it the table checksum that fail, or something else? Is > > there an error from Linux you could reference? > > I agree with you but so far I haven't managed to find a way to the root > of the issue. Here is what I know. These are the logs of a successful > boot using this patch: > > [ 10.437488] ACPI: Early table checksum verification disabled > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > And these are the logs of the same boot (unsuccessful) without this > patch: > > [ 10.516015] ACPI: Early table checksum verification disabled > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > It is clearly a memory corruption around FACS but I couldn't find the > reason for it. The mapping code looks correct. I hope you can suggest a > way to narrow down the problem. If I could, I would suggest to apply > this patch just for the QEMU PVH tests but we don't have the > infrastructure for that in gitlab-ci as there is a single Xen build for > all tests. Would be helpful to see the memory map provided to Linux, just in case we messed up and there's some overlap. It seems like some of the XSDT entries (the FADT one) is corrupt? Could you maybe add some debug to the Xen-crafted XSDT placement. > > If it helps to repro on your side, you can just do the following, > assuming your Xen repo is in /local/repos/xen: > > > cd /local/repos/xen > mkdir binaries > cd binaries > mkdir -p dist/install/ > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12 > cp /initrd* /local/repos/xen/binaries > exit > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19 > cp /bzImage /local/repos/xen/binaries > exit > > That's it. Now you have enough pre-built binaries to repro the issue. > Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add > > dom0=pvh dom0_mem=1G dom0-iommu=none Do you get to boot with dom0-iommu=none? Is there also some trick here in order to identity map dom0? I would expect things to not work because addresses used for IO with QEMU emulated devices won't be correct. > > on the Xen command line. I also removed "timeout" and pipe "tee" at the > end for my own convenience: > > # Run the test > -rm -f smoke.serial > -set +e > -timeout -k 1 720 \ > qemu-system-x86_64 \ > -cpu qemu64,+svm \ > -m 2G -smp 2 \ > -monitor none -serial stdio \ > -nographic \ > -device virtio-net-pci,netdev=n0 \ > - -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial > + -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 > > > make sure to build the Xen hypervisor binary and place the binary under > /local/repos/xen/binaries/ > > You can finally run the test with the below: > > cd .. > docker run -it -v /local/repos/xen:/local/repos/xen registry.gitlab.com/xen-project/xen/debian:unstable > cd /local/repos/xen > bash automation/scripts/qemu-alpine-x86_64.sh > > It usually gets stuck halfway through the boot without this patch. Thanks for the instructions, will give it a try if I can find some time. Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-16 9:21 ` Roger Pau Monné @ 2023-05-16 23:34 ` Stefano Stabellini 2023-05-17 8:40 ` Roger Pau Monné 0 siblings, 1 reply; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 23:34 UTC (permalink / raw) To: Roger Pau Monné Cc: Stefano Stabellini, andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 10354 bytes --] On Tue, 16 May 2023, Roger Pau Monné wrote: > On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote: > > On Mon, 15 May 2023, Roger Pau Monné wrote: > > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > > > This is a workaround. > > > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > --- > > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > > know the cause of the underlying problem. I do know that this patch > > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > > parse ACPI tables. > > > > > > I'm unsure how safe this is for native systems, as it's possible for > > > firmware to modify the data in the tables, so copying them would > > > break that functionality. > > > > > > I think we need to get to the root cause that triggers this behavior > > > on QEMU. Is it the table checksum that fail, or something else? Is > > > there an error from Linux you could reference? > > > > I agree with you but so far I haven't managed to find a way to the root > > of the issue. Here is what I know. These are the logs of a successful > > boot using this patch: > > > > [ 10.437488] ACPI: Early table checksum verification disabled > > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > > > > And these are the logs of the same boot (unsuccessful) without this > > patch: > > > > [ 10.516015] ACPI: Early table checksum verification disabled > > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > > > It is clearly a memory corruption around FACS but I couldn't find the > > reason for it. The mapping code looks correct. I hope you can suggest a > > way to narrow down the problem. If I could, I would suggest to apply > > this patch just for the QEMU PVH tests but we don't have the > > infrastructure for that in gitlab-ci as there is a single Xen build for > > all tests. > > Would be helpful to see the memory map provided to Linux, just in case > we messed up and there's some overlap. Everything looks correct. Here are some more logs: (XEN) Xen-e820 RAM map: (XEN) [0000000000000000, 000000000009fbff] (usable) (XEN) [000000000009fc00, 000000000009ffff] (reserved) (XEN) [00000000000f0000, 00000000000fffff] (reserved) (XEN) [0000000000100000, 000000007ffdffff] (usable) (XEN) [000000007ffe0000, 000000007fffffff] (reserved) (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) (XEN) [000000fd00000000, 000000ffffffffff] (reserved) (XEN) Microcode loading not available (XEN) New Xen image base address: 0x7f600000 (XEN) System RAM: 2047MB (2096636kB) (XEN) ACPI: RSDP 000F58D0, 0014 (r0 BOCHS ) (XEN) ACPI: RSDT 7FFE1B21, 0034 (r1 BOCHS BXPC 1 BXPC 1) (XEN) ACPI: FACP 7FFE19CD, 0074 (r1 BOCHS BXPC 1 BXPC 1) (XEN) ACPI: DSDT 7FFE0040, 198D (r1 BOCHS BXPC 1 BXPC 1) (XEN) ACPI: FACS 7FFE0000, 0040 (XEN) ACPI: APIC 7FFE1A41, 0080 (r1 BOCHS BXPC 1 BXPC 1) (XEN) ACPI: HPET 7FFE1AC1, 0038 (r1 BOCHS BXPC 1 BXPC 1) (XEN) ACPI: WAET 7FFE1AF9, 0028 (r1 BOCHS BXPC 1 BXPC 1) [...] (XEN) Dom0 memory map: (XEN) [0000000000000000, 000000000009efff] (usable) (XEN) [000000000009fc00, 000000000009ffff] (reserved) (XEN) [00000000000f0000, 00000000000fffff] (reserved) (XEN) [0000000000100000, 0000000040060f1d] (usable) (XEN) [0000000040060f1e, 0000000040060fa7] (ACPI data) (XEN) [0000000040061000, 000000007ffdffff] (unusable) (XEN) [000000007ffe0000, 000000007fffffff] (reserved) (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) (XEN) [000000fd00000000, 000000ffffffffff] (reserved) [...] [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000040060f1d] usable [ 0.000000] BIOS-e820: [mem 0x0000000040060f1e-0x0000000040060fa7] ACPI data [ 0.000000] BIOS-e820: [mem 0x0000000040061000-0x000000007ffdffff] unusable [ 0.000000] BIOS-e820: [mem 0x000000007ffe0000-0x000000007fffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved [...] [ 10.102427] ACPI: Early table checksum verification disabled [ 10.104455] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) [ 10.106250] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 10.109549] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 10.115173] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) [ 10.116054] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] [ 10.116182] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > It seems like some of the XSDT entries (the FADT one) is corrupt? > > Could you maybe add some debug to the Xen-crafted XSDT placement. I added a printk just after: xsdt->table_offset_entry[j++] = tables[i].address; And it printed only once: (XEN) DEBUG pvh_setup_acpi_xsdt 1000 name=FACP address=7ffe19cd That actually matches the address read by Linux: [ 10.175448] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) So the address seems correct. It is the content of the FADT/FACP table that is corrupted. I wrote the following function in Xen: static void check(void) { unsigned long addr = 0x7ffe19cd; struct acpi_table_fadt *fadt; fadt = acpi_os_map_memory(addr, sizeof(*fadt)); printk("DEBUG %s %d s=%s\n",__func__,__LINE__,fadt->header.signature); acpi_os_unmap_memory(fadt, sizeof(*fadt)); } It prints the right table signature at the end of pvh_setup_acpi. I also added a call at the top of xenmem_add_to_physmap_one, and the signature is still correct. Then I added a call at the beginning of __update_vcpu_system_time. Here is the surprise: from Xen point of view the table never gets corrupted. Here are the logs: [...] (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) d0v0: upcall vector f3 [ 0.000000] Linux version 6.1.19 (root@124de7fbba7f) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_3 [ 0.000000] Command line: console=hvc0 [...] [ 10.371610] ACPI: Early table checksum verification disabled [ 10.373633] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) [ 10.375548] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 10.378732] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 10.384188] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) [ 10.385374] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] [ 10.385519] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] [...] (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt (XEN) DEBUG fadt_check 1551 s=FACPt So it looks like it is a problem with the mapping itself? Xen sees the data correctly and Linux sees it corrupted? > > If it helps to repro on your side, you can just do the following, > > assuming your Xen repo is in /local/repos/xen: > > > > > > cd /local/repos/xen > > mkdir binaries > > cd binaries > > mkdir -p dist/install/ > > > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12 > > cp /initrd* /local/repos/xen/binaries > > exit > > > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19 > > cp /bzImage /local/repos/xen/binaries > > exit > > > > That's it. Now you have enough pre-built binaries to repro the issue. > > Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add > > > > dom0=pvh dom0_mem=1G dom0-iommu=none > > Do you get to boot with dom0-iommu=none? Is there also some trick > here in order to identity map dom0? I would expect things to not work > because addresses used for IO with QEMU emulated devices won't be > correct. That's easy: just don't use any devices to boot. Put everything needed in the dom0 ramdisk. That's the configuration provided in the gitlab-ci script I pointed you in the previous email which uses an Alpine Linux ramdisk. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-16 23:34 ` Stefano Stabellini @ 2023-05-17 8:40 ` Roger Pau Monné 2023-05-17 21:00 ` Stefano Stabellini 0 siblings, 1 reply; 29+ messages in thread From: Roger Pau Monné @ 2023-05-17 8:40 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Tue, May 16, 2023 at 04:34:09PM -0700, Stefano Stabellini wrote: > On Tue, 16 May 2023, Roger Pau Monné wrote: > > On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote: > > > On Mon, 15 May 2023, Roger Pau Monné wrote: > > > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > > > > > This is a workaround. > > > > > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > --- > > > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > > > know the cause of the underlying problem. I do know that this patch > > > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > > > parse ACPI tables. > > > > > > > > I'm unsure how safe this is for native systems, as it's possible for > > > > firmware to modify the data in the tables, so copying them would > > > > break that functionality. > > > > > > > > I think we need to get to the root cause that triggers this behavior > > > > on QEMU. Is it the table checksum that fail, or something else? Is > > > > there an error from Linux you could reference? > > > > > > I agree with you but so far I haven't managed to find a way to the root > > > of the issue. Here is what I know. These are the logs of a successful > > > boot using this patch: > > > > > > [ 10.437488] ACPI: Early table checksum verification disabled > > > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > > > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > > > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > > > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > > > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > > > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > > > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > > > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > > > > > > > And these are the logs of the same boot (unsuccessful) without this > > > patch: > > > > > > [ 10.516015] ACPI: Early table checksum verification disabled > > > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > > > > > It is clearly a memory corruption around FACS but I couldn't find the > > > reason for it. The mapping code looks correct. I hope you can suggest a > > > way to narrow down the problem. If I could, I would suggest to apply > > > this patch just for the QEMU PVH tests but we don't have the > > > infrastructure for that in gitlab-ci as there is a single Xen build for > > > all tests. > > > > Would be helpful to see the memory map provided to Linux, just in case > > we messed up and there's some overlap. > > Everything looks correct. Here are some more logs: > > (XEN) Xen-e820 RAM map: > (XEN) [0000000000000000, 000000000009fbff] (usable) > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > (XEN) [0000000000100000, 000000007ffdffff] (usable) > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > (XEN) Microcode loading not available > (XEN) New Xen image base address: 0x7f600000 > (XEN) System RAM: 2047MB (2096636kB) > (XEN) ACPI: RSDP 000F58D0, 0014 (r0 BOCHS ) > (XEN) ACPI: RSDT 7FFE1B21, 0034 (r1 BOCHS BXPC 1 BXPC 1) > (XEN) ACPI: FACP 7FFE19CD, 0074 (r1 BOCHS BXPC 1 BXPC 1) > (XEN) ACPI: DSDT 7FFE0040, 198D (r1 BOCHS BXPC 1 BXPC 1) > (XEN) ACPI: FACS 7FFE0000, 0040 > (XEN) ACPI: APIC 7FFE1A41, 0080 (r1 BOCHS BXPC 1 BXPC 1) > (XEN) ACPI: HPET 7FFE1AC1, 0038 (r1 BOCHS BXPC 1 BXPC 1) > (XEN) ACPI: WAET 7FFE1AF9, 0028 (r1 BOCHS BXPC 1 BXPC 1) > [...] > (XEN) Dom0 memory map: > (XEN) [0000000000000000, 000000000009efff] (usable) > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > (XEN) [0000000000100000, 0000000040060f1d] (usable) > (XEN) [0000000040060f1e, 0000000040060fa7] (ACPI data) > (XEN) [0000000040061000, 000000007ffdffff] (unusable) > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > [...] > [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable > [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x00000000000fffff] reserved > [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000040060f1d] usable > [ 0.000000] BIOS-e820: [mem 0x0000000040060f1e-0x0000000040060fa7] ACPI data > [ 0.000000] BIOS-e820: [mem 0x0000000040061000-0x000000007ffdffff] unusable > [ 0.000000] BIOS-e820: [mem 0x000000007ffe0000-0x000000007fffffff] reserved > [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved > [ 0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved > [...] > [ 10.102427] ACPI: Early table checksum verification disabled > [ 10.104455] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > [ 10.106250] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > [ 10.109549] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > [ 10.115173] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > [ 10.116054] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.116182] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > > > > > It seems like some of the XSDT entries (the FADT one) is corrupt? > > > > Could you maybe add some debug to the Xen-crafted XSDT placement. > > I added a printk just after: > > xsdt->table_offset_entry[j++] = tables[i].address; > > And it printed only once: > > (XEN) DEBUG pvh_setup_acpi_xsdt 1000 name=FACP address=7ffe19cd > > That actually matches the address read by Linux: > > [ 10.175448] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > So the address seems correct. It is the content of the FADT/FACP table > that is corrupted. > > I wrote the following function in Xen: > > static void check(void) > { > unsigned long addr = 0x7ffe19cd; > struct acpi_table_fadt *fadt; > fadt = acpi_os_map_memory(addr, sizeof(*fadt)); > printk("DEBUG %s %d s=%s\n",__func__,__LINE__,fadt->header.signature); > acpi_os_unmap_memory(fadt, sizeof(*fadt)); > } > > It prints the right table signature at the end of pvh_setup_acpi. > I also added a call at the top of xenmem_add_to_physmap_one, and the > signature is still correct. Then I added a call at the beginning of > __update_vcpu_system_time. Here is the surprise: from Xen point of view > the table never gets corrupted. Here are the logs: > > [...] > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) d0v0: upcall vector f3 > [ 0.000000] Linux version 6.1.19 (root@124de7fbba7f) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_3 > [ 0.000000] Command line: console=hvc0 > [...] > [ 10.371610] ACPI: Early table checksum verification disabled > [ 10.373633] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > [ 10.375548] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > [ 10.378732] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > [ 10.384188] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > [ 10.385374] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > [ 10.385519] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > [...] > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > (XEN) DEBUG fadt_check 1551 s=FACPt > > > So it looks like it is a problem with the mapping itself? Xen sees the > data correctly and Linux sees it corrupted? It seems to me like the page is not correctly mapped, and so 1s are returned? (same behavior as a hole). IOW: would seem to me like MMIO areas are not correctly handled by nested NPT? (I assume you are running this on AMD). Does it make a difference if you try to boot with dom0=pvh,shadow? A couple of wild ideas. Maybe the nested virt support that you are using doesn't handle the UC bit in second stage page table entries? You could to remove this in p2m_type_to_flags() (see the p2m_mmio_direct case). Another wild idea I have is that the emulated NPT code doesn't like having the bits 63:52 from the PTE set to anything different than 0, and hence only p2m_ram_rw works (p2m_mmio_direct is 5). > > > > > If it helps to repro on your side, you can just do the following, > > > assuming your Xen repo is in /local/repos/xen: > > > > > > > > > cd /local/repos/xen > > > mkdir binaries > > > cd binaries > > > mkdir -p dist/install/ > > > > > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12 > > > cp /initrd* /local/repos/xen/binaries > > > exit > > > > > > docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19 > > > cp /bzImage /local/repos/xen/binaries > > > exit > > > > > > That's it. Now you have enough pre-built binaries to repro the issue. > > > Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add > > > > > > dom0=pvh dom0_mem=1G dom0-iommu=none > > > > Do you get to boot with dom0-iommu=none? Is there also some trick > > here in order to identity map dom0? I would expect things to not work > > because addresses used for IO with QEMU emulated devices won't be > > correct. > > That's easy: just don't use any devices to boot. Put everything needed > in the dom0 ramdisk. That's the configuration provided in the gitlab-ci > script I pointed you in the previous email which uses an Alpine Linux > ramdisk. Doh, yes :). Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-17 8:40 ` Roger Pau Monné @ 2023-05-17 21:00 ` Stefano Stabellini 2023-05-18 8:34 ` Roger Pau Monné 0 siblings, 1 reply; 29+ messages in thread From: Stefano Stabellini @ 2023-05-17 21:00 UTC (permalink / raw) To: Roger Pau Monné Cc: Stefano Stabellini, andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini [-- Attachment #1: Type: text/plain, Size: 12240 bytes --] On Wed, 17 May 2023, Roger Pau Monné wrote: > On Tue, May 16, 2023 at 04:34:09PM -0700, Stefano Stabellini wrote: > > On Tue, 16 May 2023, Roger Pau Monné wrote: > > > On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote: > > > > On Mon, 15 May 2023, Roger Pau Monné wrote: > > > > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > > > > > > > This is a workaround. > > > > > > > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > --- > > > > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > > > > know the cause of the underlying problem. I do know that this patch > > > > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > > > > parse ACPI tables. > > > > > > > > > > I'm unsure how safe this is for native systems, as it's possible for > > > > > firmware to modify the data in the tables, so copying them would > > > > > break that functionality. > > > > > > > > > > I think we need to get to the root cause that triggers this behavior > > > > > on QEMU. Is it the table checksum that fail, or something else? Is > > > > > there an error from Linux you could reference? > > > > > > > > I agree with you but so far I haven't managed to find a way to the root > > > > of the issue. Here is what I know. These are the logs of a successful > > > > boot using this patch: > > > > > > > > [ 10.437488] ACPI: Early table checksum verification disabled > > > > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > > > > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > > > > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > > > > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > > > > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > > > > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > > > > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > > > > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > > > > > > > > > > And these are the logs of the same boot (unsuccessful) without this > > > > patch: > > > > > > > > [ 10.516015] ACPI: Early table checksum verification disabled > > > > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > > > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > > > > > > > It is clearly a memory corruption around FACS but I couldn't find the > > > > reason for it. The mapping code looks correct. I hope you can suggest a > > > > way to narrow down the problem. If I could, I would suggest to apply > > > > this patch just for the QEMU PVH tests but we don't have the > > > > infrastructure for that in gitlab-ci as there is a single Xen build for > > > > all tests. > > > > > > Would be helpful to see the memory map provided to Linux, just in case > > > we messed up and there's some overlap. > > > > Everything looks correct. Here are some more logs: > > > > (XEN) Xen-e820 RAM map: > > (XEN) [0000000000000000, 000000000009fbff] (usable) > > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > > (XEN) [0000000000100000, 000000007ffdffff] (usable) > > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > > (XEN) Microcode loading not available > > (XEN) New Xen image base address: 0x7f600000 > > (XEN) System RAM: 2047MB (2096636kB) > > (XEN) ACPI: RSDP 000F58D0, 0014 (r0 BOCHS ) > > (XEN) ACPI: RSDT 7FFE1B21, 0034 (r1 BOCHS BXPC 1 BXPC 1) > > (XEN) ACPI: FACP 7FFE19CD, 0074 (r1 BOCHS BXPC 1 BXPC 1) > > (XEN) ACPI: DSDT 7FFE0040, 198D (r1 BOCHS BXPC 1 BXPC 1) > > (XEN) ACPI: FACS 7FFE0000, 0040 > > (XEN) ACPI: APIC 7FFE1A41, 0080 (r1 BOCHS BXPC 1 BXPC 1) > > (XEN) ACPI: HPET 7FFE1AC1, 0038 (r1 BOCHS BXPC 1 BXPC 1) > > (XEN) ACPI: WAET 7FFE1AF9, 0028 (r1 BOCHS BXPC 1 BXPC 1) > > [...] > > (XEN) Dom0 memory map: > > (XEN) [0000000000000000, 000000000009efff] (usable) > > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > > (XEN) [0000000000100000, 0000000040060f1d] (usable) > > (XEN) [0000000040060f1e, 0000000040060fa7] (ACPI data) > > (XEN) [0000000040061000, 000000007ffdffff] (unusable) > > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > > [...] > > [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable > > [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x00000000000fffff] reserved > > [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000040060f1d] usable > > [ 0.000000] BIOS-e820: [mem 0x0000000040060f1e-0x0000000040060fa7] ACPI data > > [ 0.000000] BIOS-e820: [mem 0x0000000040061000-0x000000007ffdffff] unusable > > [ 0.000000] BIOS-e820: [mem 0x000000007ffe0000-0x000000007fffffff] reserved > > [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved > > [ 0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved > > [...] > > [ 10.102427] ACPI: Early table checksum verification disabled > > [ 10.104455] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > [ 10.106250] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > > [ 10.109549] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > > [ 10.115173] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > [ 10.116054] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > [ 10.116182] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > > > > > > > > > It seems like some of the XSDT entries (the FADT one) is corrupt? > > > > > > Could you maybe add some debug to the Xen-crafted XSDT placement. > > > > I added a printk just after: > > > > xsdt->table_offset_entry[j++] = tables[i].address; > > > > And it printed only once: > > > > (XEN) DEBUG pvh_setup_acpi_xsdt 1000 name=FACP address=7ffe19cd > > > > That actually matches the address read by Linux: > > > > [ 10.175448] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > > So the address seems correct. It is the content of the FADT/FACP table > > that is corrupted. > > > > I wrote the following function in Xen: > > > > static void check(void) > > { > > unsigned long addr = 0x7ffe19cd; > > struct acpi_table_fadt *fadt; > > fadt = acpi_os_map_memory(addr, sizeof(*fadt)); > > printk("DEBUG %s %d s=%s\n",__func__,__LINE__,fadt->header.signature); > > acpi_os_unmap_memory(fadt, sizeof(*fadt)); > > } > > > > It prints the right table signature at the end of pvh_setup_acpi. > > I also added a call at the top of xenmem_add_to_physmap_one, and the > > signature is still correct. Then I added a call at the beginning of > > __update_vcpu_system_time. Here is the surprise: from Xen point of view > > the table never gets corrupted. Here are the logs: > > > > [...] > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) d0v0: upcall vector f3 > > [ 0.000000] Linux version 6.1.19 (root@124de7fbba7f) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_3 > > [ 0.000000] Command line: console=hvc0 > > [...] > > [ 10.371610] ACPI: Early table checksum verification disabled > > [ 10.373633] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > [ 10.375548] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > > [ 10.378732] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > > [ 10.384188] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > [ 10.385374] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > [ 10.385519] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > > [...] > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > > > > So it looks like it is a problem with the mapping itself? Xen sees the > > data correctly and Linux sees it corrupted? > > It seems to me like the page is not correctly mapped, and so 1s are > returned? (same behavior as a hole). IOW: would seem to me like MMIO > areas are not correctly handled by nested NPT? (I assume you are > running this on AMD). > > Does it make a difference if you try to boot with dom0=pvh,shadow? > > A couple of wild ideas. Maybe the nested virt support that you are > using doesn't handle the UC bit in second stage page table entries? > You could to remove this in p2m_type_to_flags() (see the > p2m_mmio_direct case). > > Another wild idea I have is that the emulated NPT code doesn't like > having the bits 63:52 from the PTE set to anything different than 0, > and hence only p2m_ram_rw works (p2m_mmio_direct is 5). Many thanks to Xenia for figuring out the root cause of the bug. The underlying memory region is already added as E820_RESERVED to the guest (instead of E820_ACPI). When pvh_add_mem_range is called with E820_ACPI as parameter for the interesting table, pvh_add_mem_range returns with -EEXIST without doing anything. The original fix by Xenia was to carve out the relevant subset of the reserved region and mark it as E820_ACPI. Instead, I rewrote it as changing the type of the entire region to E820_ACPI because it is simpler and doesn't have to deal with the edge cases (partially overlapping, overlapping 2 existing regions, etc.) What do you think? diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index e1043e40d2..6c1c73d853 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -241,6 +241,20 @@ static int __init pvh_add_mem_range(struct domain *d, uint64_t s, uint64_t e, if ( rs >= e ) break; + if ( re >= e && rs <= s ) + { + /* + * An existing overlapping memory range exists and it is + * marked as reserved. This happens on QEMU. Change the type + * to E820_ACPI. + */ + if ( d->arch.e820[i].type == E820_RESERVED && type == E820_ACPI ) + { + d->arch.e820[i].type = E820_ACPI; + break; + } + } + if ( re > s ) return -EEXIST; } ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-17 21:00 ` Stefano Stabellini @ 2023-05-18 8:34 ` Roger Pau Monné 0 siblings, 0 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-18 8:34 UTC (permalink / raw) To: Stefano Stabellini Cc: andrew.cooper3, jbeulich, xen-devel, Xenia.Ragiadakou, Stefano Stabellini On Wed, May 17, 2023 at 02:00:01PM -0700, Stefano Stabellini wrote: > On Wed, 17 May 2023, Roger Pau Monné wrote: > > On Tue, May 16, 2023 at 04:34:09PM -0700, Stefano Stabellini wrote: > > > On Tue, 16 May 2023, Roger Pau Monné wrote: > > > > On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote: > > > > > On Mon, 15 May 2023, Roger Pau Monné wrote: > > > > > > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote: > > > > > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > > > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > > > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > > > > > > > > > > > This is a workaround. > > > > > > > > > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > --- > > > > > > > As mentioned in the cover letter, this is a RFC workaround as I don't > > > > > > > know the cause of the underlying problem. I do know that this patch > > > > > > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to > > > > > > > parse ACPI tables. > > > > > > > > > > > > I'm unsure how safe this is for native systems, as it's possible for > > > > > > firmware to modify the data in the tables, so copying them would > > > > > > break that functionality. > > > > > > > > > > > > I think we need to get to the root cause that triggers this behavior > > > > > > on QEMU. Is it the table checksum that fail, or something else? Is > > > > > > there an error from Linux you could reference? > > > > > > > > > > I agree with you but so far I haven't managed to find a way to the root > > > > > of the issue. Here is what I know. These are the logs of a successful > > > > > boot using this patch: > > > > > > > > > > [ 10.437488] ACPI: Early table checksum verification disabled > > > > > [ 10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS ) > > > > > [ 10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > > > [ 10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > > > [ 10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001) > > > > > [ 10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174) > > > > > [ 10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001) > > > > > [ 10.451258] ACPI: FACS 0x000000004005FAD9 000040 > > > > > [ 10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > > > [ 10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8] > > > > > [ 10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75] > > > > > [ 10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18] > > > > > > > > > > > > > > > And these are the logs of the same boot (unsuccessful) without this > > > > > patch: > > > > > > > > > > [ 10.516015] ACPI: Early table checksum verification disabled > > > > > [ 10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > > > > [ 10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001) > > > > > [ 10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001) > > > > > [ 10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > > > [ 10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > > > [ 10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b] > > > > > > > > > > It is clearly a memory corruption around FACS but I couldn't find the > > > > > reason for it. The mapping code looks correct. I hope you can suggest a > > > > > way to narrow down the problem. If I could, I would suggest to apply > > > > > this patch just for the QEMU PVH tests but we don't have the > > > > > infrastructure for that in gitlab-ci as there is a single Xen build for > > > > > all tests. > > > > > > > > Would be helpful to see the memory map provided to Linux, just in case > > > > we messed up and there's some overlap. > > > > > > Everything looks correct. Here are some more logs: > > > > > > (XEN) Xen-e820 RAM map: > > > (XEN) [0000000000000000, 000000000009fbff] (usable) > > > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > > > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > > > (XEN) [0000000000100000, 000000007ffdffff] (usable) > > > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > > > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > > > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > > > (XEN) Microcode loading not available > > > (XEN) New Xen image base address: 0x7f600000 > > > (XEN) System RAM: 2047MB (2096636kB) > > > (XEN) ACPI: RSDP 000F58D0, 0014 (r0 BOCHS ) > > > (XEN) ACPI: RSDT 7FFE1B21, 0034 (r1 BOCHS BXPC 1 BXPC 1) > > > (XEN) ACPI: FACP 7FFE19CD, 0074 (r1 BOCHS BXPC 1 BXPC 1) > > > (XEN) ACPI: DSDT 7FFE0040, 198D (r1 BOCHS BXPC 1 BXPC 1) > > > (XEN) ACPI: FACS 7FFE0000, 0040 > > > (XEN) ACPI: APIC 7FFE1A41, 0080 (r1 BOCHS BXPC 1 BXPC 1) > > > (XEN) ACPI: HPET 7FFE1AC1, 0038 (r1 BOCHS BXPC 1 BXPC 1) > > > (XEN) ACPI: WAET 7FFE1AF9, 0028 (r1 BOCHS BXPC 1 BXPC 1) > > > [...] > > > (XEN) Dom0 memory map: > > > (XEN) [0000000000000000, 000000000009efff] (usable) > > > (XEN) [000000000009fc00, 000000000009ffff] (reserved) > > > (XEN) [00000000000f0000, 00000000000fffff] (reserved) > > > (XEN) [0000000000100000, 0000000040060f1d] (usable) > > > (XEN) [0000000040060f1e, 0000000040060fa7] (ACPI data) > > > (XEN) [0000000040061000, 000000007ffdffff] (unusable) > > > (XEN) [000000007ffe0000, 000000007fffffff] (reserved) > > > (XEN) [00000000fffc0000, 00000000ffffffff] (reserved) > > > (XEN) [000000fd00000000, 000000ffffffffff] (reserved) > > > [...] > > > [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable > > > [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x00000000000fffff] reserved > > > [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000040060f1d] usable > > > [ 0.000000] BIOS-e820: [mem 0x0000000040060f1e-0x0000000040060fa7] ACPI data > > > [ 0.000000] BIOS-e820: [mem 0x0000000040061000-0x000000007ffdffff] unusable > > > [ 0.000000] BIOS-e820: [mem 0x000000007ffe0000-0x000000007fffffff] reserved > > > [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved > > > [ 0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved > > > [...] > > > [ 10.102427] ACPI: Early table checksum verification disabled > > > [ 10.104455] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > > [ 10.106250] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > > > [ 10.109549] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > > > [ 10.115173] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > [ 10.116054] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > [ 10.116182] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > > > > > > > > > > > > > It seems like some of the XSDT entries (the FADT one) is corrupt? > > > > > > > > Could you maybe add some debug to the Xen-crafted XSDT placement. > > > > > > I added a printk just after: > > > > > > xsdt->table_offset_entry[j++] = tables[i].address; > > > > > > And it printed only once: > > > > > > (XEN) DEBUG pvh_setup_acpi_xsdt 1000 name=FACP address=7ffe19cd > > > > > > That actually matches the address read by Linux: > > > > > > [ 10.175448] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > > > > So the address seems correct. It is the content of the FADT/FACP table > > > that is corrupted. > > > > > > I wrote the following function in Xen: > > > > > > static void check(void) > > > { > > > unsigned long addr = 0x7ffe19cd; > > > struct acpi_table_fadt *fadt; > > > fadt = acpi_os_map_memory(addr, sizeof(*fadt)); > > > printk("DEBUG %s %d s=%s\n",__func__,__LINE__,fadt->header.signature); > > > acpi_os_unmap_memory(fadt, sizeof(*fadt)); > > > } > > > > > > It prints the right table signature at the end of pvh_setup_acpi. > > > I also added a call at the top of xenmem_add_to_physmap_one, and the > > > signature is still correct. Then I added a call at the beginning of > > > __update_vcpu_system_time. Here is the surprise: from Xen point of view > > > the table never gets corrupted. Here are the logs: > > > > > > [...] > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) d0v0: upcall vector f3 > > > [ 0.000000] Linux version 6.1.19 (root@124de7fbba7f) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_3 > > > [ 0.000000] Command line: console=hvc0 > > > [...] > > > [ 10.371610] ACPI: Early table checksum verification disabled > > > [ 10.373633] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS ) > > > [ 10.375548] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) > > > [ 10.378732] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS BXPC 00000001 BXPC 00000001) > > > [ 10.384188] ACPI: ���� 0x000000007FFE19CD FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF) > > > [ 10.385374] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff] > > > [ 10.385519] ACPI: Reserving ���� table memory at [mem 0x7ffe19cd-0x17ffe19cb] > > > [...] > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > (XEN) DEBUG fadt_check 1551 s=FACPt > > > > > > > > > So it looks like it is a problem with the mapping itself? Xen sees the > > > data correctly and Linux sees it corrupted? > > > > It seems to me like the page is not correctly mapped, and so 1s are > > returned? (same behavior as a hole). IOW: would seem to me like MMIO > > areas are not correctly handled by nested NPT? (I assume you are > > running this on AMD). > > > > Does it make a difference if you try to boot with dom0=pvh,shadow? > > > > A couple of wild ideas. Maybe the nested virt support that you are > > using doesn't handle the UC bit in second stage page table entries? > > You could to remove this in p2m_type_to_flags() (see the > > p2m_mmio_direct case). > > > > Another wild idea I have is that the emulated NPT code doesn't like > > having the bits 63:52 from the PTE set to anything different than 0, > > and hence only p2m_ram_rw works (p2m_mmio_direct is 5). > > Many thanks to Xenia for figuring out the root cause of the bug. The > underlying memory region is already added as E820_RESERVED to the guest > (instead of E820_ACPI). When pvh_add_mem_range is called with E820_ACPI > as parameter for the interesting table, pvh_add_mem_range returns with > -EEXIST without doing anything. > > The original fix by Xenia was to carve out the relevant subset of the > reserved region and mark it as E820_ACPI. Instead, I rewrote it as > changing the type of the entire region to E820_ACPI because it is > simpler and doesn't have to deal with the edge cases (partially > overlapping, overlapping 2 existing regions, etc.) > > What do you think? Hm, I'm unsure whether wholesale converting reserved regions into ACPI ones is correct, for example Xen will handle reserved regions specially when creating the IOMMU mappings, and RMRRs are also expected to live in reserved regions. The issue is IMO with the usage of dom0-iommu=none, which leads to reserved regions not mapped in the dom0 physmap (see arch_iommu_hwdom_init()). One option might be to move the mapping of reserved regions from arch_iommu_hwdom_init() into PVH dom0 build (as part of pvh_populate_p2m()) and thus render arch_iommu_hwdom_init() PV only. That would also imply that for PVH dom0 `dom0-iommu=map-reserved` is fixed cannot be modified (iow: reserved regions are always added to the PVH dom0 physmap). Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-13 1:17 ` [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping Stefano Stabellini 2023-05-15 9:44 ` Roger Pau Monné @ 2023-05-15 14:17 ` Jan Beulich 2023-05-16 0:12 ` Stefano Stabellini 2023-05-18 7:24 ` Xenia Ragiadakou 1 sibling, 2 replies; 29+ messages in thread From: Jan Beulich @ 2023-05-15 14:17 UTC (permalink / raw) To: Stefano Stabellini Cc: xen-devel, Xenia.Ragiadakou, Stefano Stabellini, roger.pau, andrew.cooper3 On 13.05.2023 03:17, Stefano Stabellini wrote: > From: Stefano Stabellini <stefano.stabellini@amd.com> > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > the tables in the guest. Instead, copy the tables to Dom0. Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. ignoring that when running on qemu it is kind of a guest itself)? I also consider the statement too broad anyway: Various people have run PVH Dom0 without running into such an issue, so it's clearly not just "leads to". Jan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-15 14:17 ` Jan Beulich @ 2023-05-16 0:12 ` Stefano Stabellini 2023-05-18 7:24 ` Xenia Ragiadakou 1 sibling, 0 replies; 29+ messages in thread From: Stefano Stabellini @ 2023-05-16 0:12 UTC (permalink / raw) To: Jan Beulich Cc: Stefano Stabellini, xen-devel, Xenia.Ragiadakou, Stefano Stabellini, roger.pau, andrew.cooper3 On Mon, 15 May 2023, Jan Beulich wrote: > On 13.05.2023 03:17, Stefano Stabellini wrote: > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > the tables in the guest. Instead, copy the tables to Dom0. > > Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. > ignoring that when running on qemu it is kind of a guest itself)? Yes, I posted the memory corruption info I have on a separate email > I also consider the statement too broad anyway: Various people have > run PVH Dom0 without running into such an issue, so it's clearly not > just "leads to". Fair enough ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-15 14:17 ` Jan Beulich 2023-05-16 0:12 ` Stefano Stabellini @ 2023-05-18 7:24 ` Xenia Ragiadakou 2023-05-18 9:31 ` Roger Pau Monné 1 sibling, 1 reply; 29+ messages in thread From: Xenia Ragiadakou @ 2023-05-18 7:24 UTC (permalink / raw) To: Jan Beulich, Stefano Stabellini Cc: xen-devel, Stefano Stabellini, roger.pau, andrew.cooper3 On 15/5/23 17:17, Jan Beulich wrote: > On 13.05.2023 03:17, Stefano Stabellini wrote: >> From: Stefano Stabellini <stefano.stabellini@amd.com> >> >> Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of >> the tables in the guest. Instead, copy the tables to Dom0. > Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. > ignoring that when running on qemu it is kind of a guest itself)? > > I also consider the statement too broad anyway: Various people have > run PVH Dom0 without running into such an issue, so it's clearly not > just "leads to". In my opinion the issue is broader. In pvh_setup_acpi(), the code adding the ACPI tables to dom0 memory map does not check the return value of pvh_add_mem_range(). If there is an overlap and the overlapping region is marked as E820_ACPI, it maps not just the allowed tables but the entire overlapping range , while if the overlapping range is marked as E820_RESERVED, it does not map the tables at all (the issue that Stefano saw with qemu). Since dom0 memory map is initialized based on the native one, the code adding the ACPI table memory ranges will naturally fall into one of the two cases above. So even when not running into this issue, pvh_add_mem_range() still fails and the memory range mapped is wider than the allowed one. Xenia ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-18 7:24 ` Xenia Ragiadakou @ 2023-05-18 9:31 ` Roger Pau Monné 2023-05-18 11:36 ` Xenia Ragiadakou 0 siblings, 1 reply; 29+ messages in thread From: Roger Pau Monné @ 2023-05-18 9:31 UTC (permalink / raw) To: Xenia Ragiadakou Cc: Jan Beulich, Stefano Stabellini, xen-devel, Stefano Stabellini, andrew.cooper3 On Thu, May 18, 2023 at 10:24:10AM +0300, Xenia Ragiadakou wrote: > > On 15/5/23 17:17, Jan Beulich wrote: > > On 13.05.2023 03:17, Stefano Stabellini wrote: > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > the tables in the guest. Instead, copy the tables to Dom0. > > Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. > > ignoring that when running on qemu it is kind of a guest itself)? > > > > I also consider the statement too broad anyway: Various people have > > run PVH Dom0 without running into such an issue, so it's clearly not > > just "leads to". > > In my opinion the issue is broader. > > In pvh_setup_acpi(), the code adding the ACPI tables to dom0 memory map does > not check the return value of pvh_add_mem_range(). If there is an overlap > and the overlapping region is marked as E820_ACPI, it maps not just the > allowed tables but the entire overlapping range , But that's the indented behavior: all ACPI regions will be mapped into the dom0 physmap, the filtering of the tables exposed to dom0 is done in the XSDT, but not in by filtering the mapped regions. Note this won't be effective anyway, as the minimal granularity of physmap entries is 4k, so multiple tables could live in the same 4K region. Also Xen cannot parse dynamic tables (SSDT) or execute methods, and hence doesn't know exactly which memory will be used. Xen relies on the firmware to have the ACPI tables in ACPI, NVS or RESERVED regions in order for them to be mapped into the gust physmap. The call to pvh_add_mem_range() in pvh_setup_acpi() is just an attempt to workaround broken systems that have tables placed in memory map holes, and hence ignoring the return value is fine. > while if the overlapping > range is marked as E820_RESERVED, it does not map the tables at all (the > issue that Stefano saw with qemu). Since dom0 memory map is initialized > based on the native one, the code adding the ACPI table memory ranges will > naturally fall into one of the two cases above. Xen does map them, but that's done in arch_iommu_hwdom_init() which get short-circuited by the usage of dom0-iommu=none in your example. See my reply to Stefano about moving such mappings into pvh_populate_p2m(). > So even when not running into this issue, pvh_add_mem_range() still fails > and the memory range mapped is wider than the allowed one. The intention of that call to pvh_add_mem_range() is not to limit what gets mapped into dom0 physmap, but rather to workaround bugs in the firmware if ACPI tables are placed in memory map holes. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-18 9:31 ` Roger Pau Monné @ 2023-05-18 11:36 ` Xenia Ragiadakou 2023-05-18 11:44 ` Roger Pau Monné 0 siblings, 1 reply; 29+ messages in thread From: Xenia Ragiadakou @ 2023-05-18 11:36 UTC (permalink / raw) To: Roger Pau Monné Cc: Jan Beulich, Stefano Stabellini, xen-devel, Stefano Stabellini, andrew.cooper3 On 18/5/23 12:31, Roger Pau Monné wrote: > On Thu, May 18, 2023 at 10:24:10AM +0300, Xenia Ragiadakou wrote: >> On 15/5/23 17:17, Jan Beulich wrote: >>> On 13.05.2023 03:17, Stefano Stabellini wrote: >>>> From: Stefano Stabellini <stefano.stabellini@amd.com> >>>> >>>> Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of >>>> the tables in the guest. Instead, copy the tables to Dom0. >>> Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. >>> ignoring that when running on qemu it is kind of a guest itself)? >>> >>> I also consider the statement too broad anyway: Various people have >>> run PVH Dom0 without running into such an issue, so it's clearly not >>> just "leads to". >> In my opinion the issue is broader. >> >> In pvh_setup_acpi(), the code adding the ACPI tables to dom0 memory map does >> not check the return value of pvh_add_mem_range(). If there is an overlap >> and the overlapping region is marked as E820_ACPI, it maps not just the >> allowed tables but the entire overlapping range , > But that's the indented behavior: all ACPI regions will be mapped into > the dom0 physmap, the filtering of the tables exposed to dom0 is done > in the XSDT, but not in by filtering the mapped regions. Note this > won't be effective anyway, as the minimal granularity of physmap > entries is 4k, so multiple tables could live in the same 4K region. > Also Xen cannot parse dynamic tables (SSDT) or execute methods, and > hence doesn't know exactly which memory will be used. Thanks a lot for the explanation. I checked more carefully the code and it's true that xen does not aim to restrict dom0 access to ACPI tables. I got confused by the name of the function pvh_acpi_table_allowed. > > Xen relies on the firmware to have the ACPI tables in ACPI, NVS or > RESERVED regions in order for them to be mapped into the gust physmap. > The call to pvh_add_mem_range() in pvh_setup_acpi() is just an attempt > to workaround broken systems that have tables placed in memory map > holes, and hence ignoring the return value is fine. In pvh_setup_acpi(), xen identity maps E820_ACPI and E820_NVS ranges to dom0. Why it does not do the same for E820_RESERVED, since ACPI tables may also lie there and since it does not know which memory will be used? >> while if the overlapping >> range is marked as E820_RESERVED, it does not map the tables at all (the >> issue that Stefano saw with qemu). Since dom0 memory map is initialized >> based on the native one, the code adding the ACPI table memory ranges will >> naturally fall into one of the two cases above. > Xen does map them, but that's done in arch_iommu_hwdom_init() which get > short-circuited by the usage of dom0-iommu=none in your example. See > my reply to Stefano about moving such mappings into pvh_populate_p2m(). Indeed, if dom0-iommu=none is removed from the xen cmdline and qemu is configured with an iommu, the issue is not triggered. Because arch_iommu_hwdom_init() identity maps to dom0 at least the first 4G, right? >> So even when not running into this issue, pvh_add_mem_range() still fails >> and the memory range mapped is wider than the allowed one. > The intention of that call to pvh_add_mem_range() is not to limit what > gets mapped into dom0 physmap, but rather to workaround bugs in the > firmware if ACPI tables are placed in memory map holes. > > Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping 2023-05-18 11:36 ` Xenia Ragiadakou @ 2023-05-18 11:44 ` Roger Pau Monné 0 siblings, 0 replies; 29+ messages in thread From: Roger Pau Monné @ 2023-05-18 11:44 UTC (permalink / raw) To: Xenia Ragiadakou Cc: Jan Beulich, Stefano Stabellini, xen-devel, Stefano Stabellini, andrew.cooper3 On Thu, May 18, 2023 at 02:36:41PM +0300, Xenia Ragiadakou wrote: > > On 18/5/23 12:31, Roger Pau Monné wrote: > > On Thu, May 18, 2023 at 10:24:10AM +0300, Xenia Ragiadakou wrote: > > > On 15/5/23 17:17, Jan Beulich wrote: > > > > On 13.05.2023 03:17, Stefano Stabellini wrote: > > > > > From: Stefano Stabellini <stefano.stabellini@amd.com> > > > > > > > > > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of > > > > > the tables in the guest. Instead, copy the tables to Dom0. > > > > Do you really mean "in the guest" (i.e. from Xen's perspective, i.e. > > > > ignoring that when running on qemu it is kind of a guest itself)? > > > > > > > > I also consider the statement too broad anyway: Various people have > > > > run PVH Dom0 without running into such an issue, so it's clearly not > > > > just "leads to". > > > In my opinion the issue is broader. > > > > > > In pvh_setup_acpi(), the code adding the ACPI tables to dom0 memory map does > > > not check the return value of pvh_add_mem_range(). If there is an overlap > > > and the overlapping region is marked as E820_ACPI, it maps not just the > > > allowed tables but the entire overlapping range , > > But that's the indented behavior: all ACPI regions will be mapped into > > the dom0 physmap, the filtering of the tables exposed to dom0 is done > > in the XSDT, but not in by filtering the mapped regions. Note this > > won't be effective anyway, as the minimal granularity of physmap > > entries is 4k, so multiple tables could live in the same 4K region. > > Also Xen cannot parse dynamic tables (SSDT) or execute methods, and > > hence doesn't know exactly which memory will be used. > Thanks a lot for the explanation. I checked more carefully the code and it's > true that xen does not aim to restrict dom0 access to ACPI tables. I got > confused by the name of the function pvh_acpi_table_allowed. > > > > Xen relies on the firmware to have the ACPI tables in ACPI, NVS or > > RESERVED regions in order for them to be mapped into the gust physmap. > > The call to pvh_add_mem_range() in pvh_setup_acpi() is just an attempt > > to workaround broken systems that have tables placed in memory map > > holes, and hence ignoring the return value is fine. > In pvh_setup_acpi(), xen identity maps E820_ACPI and E820_NVS ranges to > dom0. Why it does not do the same for E820_RESERVED, since ACPI tables may > also lie there and since it does not know which memory will be used? So far I at least wasn't considering that ACPI tables could reside in RESERVED regions. Given the behavior exposed by QEMU I think we need to move the mapping of RESERVED regions from arch_iommu_hwdom_init() into pvh_populate_p2m() for PVH dom0, thus rendering arch_iommu_hwdom_init() PV-only. > > > while if the overlapping > > > range is marked as E820_RESERVED, it does not map the tables at all (the > > > issue that Stefano saw with qemu). Since dom0 memory map is initialized > > > based on the native one, the code adding the ACPI table memory ranges will > > > naturally fall into one of the two cases above. > > Xen does map them, but that's done in arch_iommu_hwdom_init() which get > > short-circuited by the usage of dom0-iommu=none in your example. See > > my reply to Stefano about moving such mappings into pvh_populate_p2m(). > Indeed, if dom0-iommu=none is removed from the xen cmdline and qemu is > configured with an iommu, the issue is not triggered. Because > arch_iommu_hwdom_init() identity maps to dom0 at least the first 4G, right? For PVH dom0 only reserved regions are identity mapped into the physmap. Thanks, Roger. ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2023-05-18 11:44 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-05-13 1:16 [PATCH 0/2] PVH Dom0 on QEMU Stefano Stabellini 2023-05-13 1:17 ` [PATCH 1/2] xen/x86/pvh: use preset XSDT header for XSDT generation Stefano Stabellini 2023-05-15 8:48 ` Roger Pau Monné 2023-05-15 14:14 ` Jan Beulich 2023-05-16 0:16 ` Stefano Stabellini 2023-05-16 6:13 ` Jan Beulich 2023-05-16 8:10 ` Roger Pau Monné 2023-05-16 8:13 ` Roger Pau Monné 2023-05-16 8:24 ` Roger Pau Monné 2023-05-16 9:13 ` Jan Beulich 2023-05-16 9:23 ` Roger Pau Monné 2023-05-16 22:11 ` Stefano Stabellini 2023-05-17 8:42 ` Roger Pau Monné 2023-05-13 1:17 ` [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping Stefano Stabellini 2023-05-15 9:44 ` Roger Pau Monné 2023-05-16 0:11 ` Stefano Stabellini 2023-05-16 0:38 ` Stefano Stabellini 2023-05-16 6:27 ` Jan Beulich 2023-05-16 9:21 ` Roger Pau Monné 2023-05-16 23:34 ` Stefano Stabellini 2023-05-17 8:40 ` Roger Pau Monné 2023-05-17 21:00 ` Stefano Stabellini 2023-05-18 8:34 ` Roger Pau Monné 2023-05-15 14:17 ` Jan Beulich 2023-05-16 0:12 ` Stefano Stabellini 2023-05-18 7:24 ` Xenia Ragiadakou 2023-05-18 9:31 ` Roger Pau Monné 2023-05-18 11:36 ` Xenia Ragiadakou 2023-05-18 11:44 ` Roger Pau Monné
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.