* [PATCH v7 01/10] lib/x86: Bump max basic leaf in {pv,hvm}_max_policy
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-29 17:57 ` Andrew Cooper
2024-10-21 15:45 ` [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area Alejandro Vallejo
` (8 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
Bump it to ARRAY_SIZE() - 1 so the toolstack is able to extend a policy past
host limits (i.e. to emulate a feature not present in the host).
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Replaces v6/patch1("Relax checks about policy compatibility")
* Bumps basic.max_leaf to ARRAY_SIZE(basic.raw) to pass the
compatibility checks rather than tweaking the checker.
---
xen/arch/x86/cpu-policy.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index b6d9fad56773..715a66d2a978 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -585,6 +585,9 @@ static void __init calculate_pv_max_policy(void)
*/
p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+ /* Toolstack may populate leaves not present in the basic host leaves */
+ p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
for ( i = 0; i < ARRAY_SIZE(fs); ++i )
@@ -672,6 +675,9 @@ static void __init calculate_hvm_max_policy(void)
*/
p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+ /* Toolstack may populate leaves not present in the basic host leaves */
+ p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
mask = hvm_hap_supported() ?
--
2.47.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v7 01/10] lib/x86: Bump max basic leaf in {pv,hvm}_max_policy
2024-10-21 15:45 ` [PATCH v7 01/10] lib/x86: Bump max basic leaf in {pv,hvm}_max_policy Alejandro Vallejo
@ 2024-10-29 17:57 ` Andrew Cooper
0 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2024-10-29 17:57 UTC (permalink / raw)
To: Alejandro Vallejo, xen-devel; +Cc: Jan Beulich, Roger Pau Monné
On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> Bump it to ARRAY_SIZE() - 1 so the toolstack is able to extend a policy past
> host limits (i.e. to emulate a feature not present in the host).
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> v7:
> * Replaces v6/patch1("Relax checks about policy compatibility")
> * Bumps basic.max_leaf to ARRAY_SIZE(basic.raw) to pass the
> compatibility checks rather than tweaking the checker.
> ---
> xen/arch/x86/cpu-policy.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
> index b6d9fad56773..715a66d2a978 100644
> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -585,6 +585,9 @@ static void __init calculate_pv_max_policy(void)
> */
> p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
>
> + /* Toolstack may populate leaves not present in the basic host leaves */
> + p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
> +
> x86_cpu_policy_to_featureset(p, fs);
>
> for ( i = 0; i < ARRAY_SIZE(fs); ++i )
> @@ -672,6 +675,9 @@ static void __init calculate_hvm_max_policy(void)
> */
> p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
>
> + /* Toolstack may populate leaves not present in the basic host leaves */
> + p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1;
> +
> x86_cpu_policy_to_featureset(p, fs);
>
> mask = hvm_hap_supported() ?
This sadly doesn't do what you want. It leaves the default policy with
extended limits too.
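A toy model of the problem, assuming (this is not shown in the thread) that
the default policy starts out as a copy of the max policy. The struct, the
helper names and NR_BASIC_LEAVES are all hypothetical, and the clamp is only
one possible remedy, not necessarily what the patch linked below does:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ARRAY_SIZE(p->basic.raw). */
#define NR_BASIC_LEAVES 14u

/* Toy policy carrying only the field under discussion. */
struct toy_policy {
    uint32_t basic_max_leaf;
};

/* Max policy: widen the limit so the toolstack may populate extra leaves. */
static struct toy_policy make_max(const struct toy_policy *host)
{
    struct toy_policy p = *host;

    p.basic_max_leaf = NR_BASIC_LEAVES - 1;

    return p;
}

/*
 * Default policy, assumed to be derived from the max policy.  Without the
 * clamp, the widened limit carries over unchanged.
 */
static struct toy_policy make_default(const struct toy_policy *max,
                                      const struct toy_policy *host,
                                      bool clamp_to_host)
{
    struct toy_policy p = *max;

    if ( clamp_to_host )
        p.basic_max_leaf = host->basic_max_leaf;

    return p;
}
```

With a host limit of 0xb, make_default() without the clamp reports 13, the
same as the max policy; with the clamp it reports 0xb again.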
To unblock the work (which is long overdue), here's one I prepared
earlie^W just now.
https://lore.kernel.org/xen-devel/20241029175505.2698661-1-andrew.cooper3@citrix.com/T/#u
~Andrew
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 01/10] lib/x86: Bump max basic leaf in {pv,hvm}_max_policy Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-29 20:30 ` Andrew Cooper
2024-10-21 15:45 ` [PATCH v7 03/10] xen/x86: Add supporting code for uploading LAPIC contexts during domain create Alejandro Vallejo
` (7 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
This allows the initial x2APIC ID to be sent on the migration stream.
This allows further changes to topology and APIC ID assignment without
breaking existing hosts. Given the vlapic data is zero-extended on
restore, fix up migrations from hosts without the field by setting it to
the old convention if zero.
The hardcoded mapping x2apic_id=2*vcpu_id is kept for the time being,
but it's meant to be overridden by the toolstack in a later patch with
appropriate values.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Preserve output for CPUID[0xb].edx on PV rather than nullify it.
* s/vlapic->hw.x2apic_id/vlapic_x2apic_id(vlapic)/ in vlapic.c
---
xen/arch/x86/cpuid.c | 18 +++++++-----------
xen/arch/x86/hvm/vlapic.c | 22 ++++++++++++++++++++--
xen/arch/x86/include/asm/hvm/vlapic.h | 1 +
xen/include/public/arch-x86/hvm/save.h | 2 ++
4 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 2a777436ee27..e2489ff8e346 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -138,10 +138,9 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
const struct cpu_user_regs *regs;
case 0x1:
- /* TODO: Rework topology logic. */
res->b &= 0x00ffffffu;
if ( is_hvm_domain(d) )
- res->b |= (v->vcpu_id * 2) << 24;
+ res->b |= vlapic_x2apic_id(vcpu_vlapic(v)) << 24;
/* TODO: Rework vPMU control in terms of toolstack choices. */
if ( vpmu_available(v) &&
@@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
break;
case 0xb:
- /*
- * In principle, this leaf is Intel-only. In practice, it is tightly
- * coupled with x2apic, and we offer an x2apic-capable APIC emulation
- * to guests on AMD hardware as well.
- *
- * TODO: Rework topology logic.
- */
if ( p->basic.x2apic )
{
*(uint8_t *)&res->c = subleaf;
- /* Fix the x2APIC identifier. */
- res->d = v->vcpu_id * 2;
+ /*
+ * Fix the x2APIC identifier. The PV side is nonsensical, but
+ * we've always shown it like this so it's kept for compat.
+ */
+ res->d = is_hvm_domain(d) ? vlapic_x2apic_id(vcpu_vlapic(v))
+ : 2 * v->vcpu_id;
}
break;
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 3363926b487b..33b463925f4e 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1090,7 +1090,7 @@ static uint32_t x2apic_ldr_from_id(uint32_t id)
static void set_x2apic_id(struct vlapic *vlapic)
{
const struct vcpu *v = vlapic_vcpu(vlapic);
- uint32_t apic_id = v->vcpu_id * 2;
+ uint32_t apic_id = vlapic_x2apic_id(vlapic);
uint32_t apic_ldr = x2apic_ldr_from_id(apic_id);
/*
@@ -1470,7 +1470,7 @@ void vlapic_reset(struct vlapic *vlapic)
if ( v->vcpu_id == 0 )
vlapic->hw.apic_base_msr |= APIC_BASE_BSP;
- vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
+ vlapic_set_reg(vlapic, APIC_ID, SET_xAPIC_ID(vlapic_x2apic_id(vlapic)));
vlapic_do_init(vlapic);
}
@@ -1538,6 +1538,16 @@ static void lapic_load_fixup(struct vlapic *vlapic)
const struct vcpu *v = vlapic_vcpu(vlapic);
uint32_t good_ldr = x2apic_ldr_from_id(vlapic->loaded.id);
+ /*
+ * Loading record without hw.x2apic_id in the save stream, calculate using
+ * the traditional "vcpu_id * 2" relation. There's an implicit assumption
+ * that vCPU0 always has x2APIC0, which is true for the old relation, and
+ * still holds under the new x2APIC generation algorithm. While that case
+ * goes through the conditional it's benign because it still maps to zero.
+ */
+ if ( !vlapic->hw.x2apic_id )
+ vlapic->hw.x2apic_id = v->vcpu_id * 2;
+
/* Skip fixups on xAPIC mode, or if the x2APIC LDR is already correct */
if ( !vlapic_x2apic_mode(vlapic) ||
(vlapic->loaded.ldr == good_ldr) )
@@ -1606,6 +1616,13 @@ static int cf_check lapic_check_hidden(const struct domain *d,
APIC_BASE_EXTD )
return -EINVAL;
+ /*
+ * Fail migrations from newer versions of Xen where
+ * rsvd_zero is interpreted as something else.
+ */
+ if ( s.rsvd_zero )
+ return -EINVAL;
+
return 0;
}
@@ -1687,6 +1704,7 @@ int vlapic_init(struct vcpu *v)
}
vlapic->pt.source = PTSRC_lapic;
+ vlapic->hw.x2apic_id = 2 * v->vcpu_id;
vlapic->regs_page = alloc_domheap_page(v->domain, MEMF_no_owner);
if ( !vlapic->regs_page )
diff --git a/xen/arch/x86/include/asm/hvm/vlapic.h b/xen/arch/x86/include/asm/hvm/vlapic.h
index 2c4ff94ae7a8..85c4a236b9f6 100644
--- a/xen/arch/x86/include/asm/hvm/vlapic.h
+++ b/xen/arch/x86/include/asm/hvm/vlapic.h
@@ -44,6 +44,7 @@
#define vlapic_xapic_mode(vlapic) \
(!vlapic_hw_disabled(vlapic) && \
!((vlapic)->hw.apic_base_msr & APIC_BASE_EXTD))
+#define vlapic_x2apic_id(vlapic) ((vlapic)->hw.x2apic_id)
/*
* Generic APIC bitmap vector update & search routines.
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 7ecacadde165..1c2ec669ffc9 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -394,6 +394,8 @@ struct hvm_hw_lapic {
uint32_t disabled; /* VLAPIC_xx_DISABLED */
uint32_t timer_divisor;
uint64_t tdt_msr;
+ uint32_t x2apic_id;
+ uint32_t rsvd_zero;
};
DECLARE_HVM_SAVE_TYPE(LAPIC, 5, struct hvm_hw_lapic);
--
2.47.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-21 15:45 ` [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area Alejandro Vallejo
@ 2024-10-29 20:30 ` Andrew Cooper
2024-10-30 6:37 ` Jan Beulich
2024-10-30 12:00 ` Alejandro Vallejo
0 siblings, 2 replies; 27+ messages in thread
From: Andrew Cooper @ 2024-10-29 20:30 UTC (permalink / raw)
To: Alejandro Vallejo, xen-devel; +Cc: Jan Beulich, Roger Pau Monné
On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> This allows the initial x2APIC ID to be sent on the migration stream.
> This allows further changes to topology and APIC ID assignment without
> breaking existing hosts. Given the vlapic data is zero-extended on
> restore, fix up migrations from hosts without the field by setting it to
> the old convention if zero.
>
> The hardcoded mapping x2apic_id=2*vcpu_id is kept for the time being,
> but it's meant to be overridden by the toolstack in a later patch with
> appropriate values.
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
I'm going to request some changes, but I think they're only comment
changes. [edit, no sadly, one non-comment change.]
It's unfortunate that Xen uses an instance of hvm_hw_lapic for its
internal state, but one swamp at a time.
In the subject, there's no such thing as the "initial" x2APIC ID.
There's just "the x2APIC ID" and it's not mutable state as far as the
guest is concerned (This is different to the xAPIC id, where there is
an architectural concept of the initial xAPIC ID, from the days when
OSes were permitted to edit it). Also, it's x86/hvm, seeing as this is
an HVM specific change you're making.
Next, while it's true that this allows the value to move in the
migration stream, the more important point is that this allows the
toolstack to configure the x2APIC ID for each vCPU.
So, for the commit message, I recommend:
---%<---
Today, Xen hard-codes x2APIC_ID = vcpu_id * 2, but this is unwise and
interferes with providing accurate topology information to the guest.
Introduce a new x2apic_id field into hvm_hw_lapic. This is immutable
state from the guest's point of view, but it allows the toolstack to
configure the value, and for the value to move on migrate.
For backwards compatibility, we treat incoming zeroes as if they were
the old hardcoded scheme.
---%<---
> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> index 2a777436ee27..e2489ff8e346 100644
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -138,10 +138,9 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> const struct cpu_user_regs *regs;
>
> case 0x1:
> - /* TODO: Rework topology logic. */
> res->b &= 0x00ffffffu;
> if ( is_hvm_domain(d) )
> - res->b |= (v->vcpu_id * 2) << 24;
> + res->b |= vlapic_x2apic_id(vcpu_vlapic(v)) << 24;
There wants to be some kind of note here, especially as you're feeding
vlapic_x2apic_id() into a field called xAPIC ID. Perhaps
/* Large systems do wrap around 255 in the xAPIC_ID field. */
?
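A minimal sketch of that wrap-around, using hypothetical helper names rather
than anything from cpuid.c. It only assumes what the hunk above shows: the
APIC ID is shifted into EBX[31:24], so any bits above the low 8 are lost.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the CPUID leaf 1 hunk: the APIC ID lives in
 * EBX[31:24], so only the low 8 bits of a wider x2APIC ID survive the shift.
 */
static uint32_t cpuid1_ebx_set_apic_id(uint32_t ebx, uint32_t x2apic_id)
{
    ebx &= 0x00ffffffu;      /* clear the 8-bit xAPIC ID field */
    ebx |= x2apic_id << 24;  /* bits 8 and above fall off the top */

    return ebx;
}

/* Extract the 8-bit field a guest would read back. */
static uint8_t cpuid1_ebx_get_apic_id(uint32_t ebx)
{
    return ebx >> 24;
}
```

An ID of 254 reads back as 254, while 256 and 258 read back as 0 and 2
respectively; the low 24 bits of EBX are untouched in every case.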
>
> /* TODO: Rework vPMU control in terms of toolstack choices. */
> if ( vpmu_available(v) &&
> @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> break;
>
> case 0xb:
> - /*
> - * In principle, this leaf is Intel-only. In practice, it is tightly
> - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
> - * to guests on AMD hardware as well.
> - *
> - * TODO: Rework topology logic.
> - */
> if ( p->basic.x2apic )
> {
> *(uint8_t *)&res->c = subleaf;
>
> - /* Fix the x2APIC identifier. */
> - res->d = v->vcpu_id * 2;
> + /*
> + * Fix the x2APIC identifier. The PV side is nonsensical, but
> + * we've always shown it like this so it's kept for compat.
> + */
In hindsight I should have changed "Fix the x2APIC identifier." when I
reworked this logic, but oh well - better late than never.
/* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
subleaf. */
I'd also put a little more context in the PV side:
/* Xen 4.18 and earlier leaked x2APIC into PV guests. The value shown
is nonsensical but kept as-was for compatibility. */
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 3363926b487b..33b463925f4e 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -1538,6 +1538,16 @@ static void lapic_load_fixup(struct vlapic *vlapic)
> const struct vcpu *v = vlapic_vcpu(vlapic);
> uint32_t good_ldr = x2apic_ldr_from_id(vlapic->loaded.id);
>
> + /*
> + * Loading record without hw.x2apic_id in the save stream, calculate using
> + * the traditional "vcpu_id * 2" relation. There's an implicit assumption
> + * that vCPU0 always has x2APIC0, which is true for the old relation, and
> + * still holds under the new x2APIC generation algorithm. While that case
> + * goes through the conditional it's benign because it still maps to zero.
> + */
It's not an implicit assumption; it's very explicit.
/* Xen 4.19 and earlier had no x2APIC_ID in the migration stream, and
hard-coded "vcpu_id * 2". Default back to this if we have a
zero-extended record. */
But this will malfunction if the toolstack tries to set v!0's
x2APIC_ID to 0.
What you need to know is whether lapic_load_hidden() had to zero-extend
the record or not (more specifically, over this field), so you want
h->size <= offsetof(struct hvm_hw_lapic, x2apic_id) as the gating condition.
This should be safe for the toolstack, I think. Hypercalls prior to
this patch will get a shorter record, and hypercalls from this patch
onwards will get a longer record with the default x2APIC_ID = vcpu_id *
2 filled in.
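A rough sketch of that gating, not the real lapic_load_hidden(). The struct
is a simplified stand-in for hvm_hw_lapic with the field order taken from the
save.h hunk below, and load_lapic() is a hypothetical loader that receives the
incoming record length directly:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct hvm_hw_lapic after this patch. */
struct lapic_rec {
    uint64_t apic_base_msr;
    uint32_t disabled;
    uint32_t timer_divisor;
    uint64_t tdt_msr;
    uint32_t x2apic_id;
    uint32_t rsvd_zero;
};

/*
 * Hypothetical loader: copy 'size' bytes of the incoming record and
 * zero-extend the rest.  The legacy "vcpu_id * 2" default is applied only
 * when the record was too short to carry x2apic_id, so an explicit zero
 * sent by a newer sender is left alone.
 */
static bool load_lapic(struct lapic_rec *dst, const void *src, size_t size,
                       unsigned int vcpu_id)
{
    if ( size > sizeof(*dst) )
        return false;

    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, size);

    if ( size <= offsetof(struct lapic_rec, x2apic_id) )
        dst->x2apic_id = vcpu_id * 2;

    return true;
}
```

Keying on the record length rather than on the field's value is what lets a
full-length record with x2apic_id == 0 survive the load unmodified.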
> + if ( !vlapic->hw.x2apic_id )
> + vlapic->hw.x2apic_id = v->vcpu_id * 2;
> +
> /* Skip fixups on xAPIC mode, or if the x2APIC LDR is already correct */
> if ( !vlapic_x2apic_mode(vlapic) ||
> (vlapic->loaded.ldr == good_ldr) )
> @@ -1606,6 +1616,13 @@ static int cf_check lapic_check_hidden(const struct domain *d,
> APIC_BASE_EXTD )
> return -EINVAL;
>
> + /*
> + * Fail migrations from newer versions of Xen where
> + * rsvd_zero is interpreted as something else.
> + */
This comment isn't necessary. We've got no shortage of reserved
checks. However ...
> diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
> index 7ecacadde165..1c2ec669ffc9 100644
> --- a/xen/include/public/arch-x86/hvm/save.h
> +++ b/xen/include/public/arch-x86/hvm/save.h
> @@ -394,6 +394,8 @@ struct hvm_hw_lapic {
> uint32_t disabled; /* VLAPIC_xx_DISABLED */
> uint32_t timer_divisor;
> uint64_t tdt_msr;
> + uint32_t x2apic_id;
> + uint32_t rsvd_zero;
... we do normally spell it _rsvd; to make it extra extra clear that
people shouldn't be doing anything with it.
~Andrew
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-29 20:30 ` Andrew Cooper
@ 2024-10-30 6:37 ` Jan Beulich
2024-10-30 12:03 ` Alejandro Vallejo
2024-10-30 12:25 ` Andrew Cooper
2024-10-30 12:00 ` Alejandro Vallejo
1 sibling, 2 replies; 27+ messages in thread
From: Jan Beulich @ 2024-10-30 6:37 UTC (permalink / raw)
To: Andrew Cooper, Alejandro Vallejo; +Cc: Roger Pau Monné, xen-devel
On 29.10.2024 21:30, Andrew Cooper wrote:
> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
>> @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>> break;
>>
>> case 0xb:
>> - /*
>> - * In principle, this leaf is Intel-only. In practice, it is tightly
>> - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
>> - * to guests on AMD hardware as well.
>> - *
>> - * TODO: Rework topology logic.
>> - */
>> if ( p->basic.x2apic )
>> {
>> *(uint8_t *)&res->c = subleaf;
>>
>> - /* Fix the x2APIC identifier. */
>> - res->d = v->vcpu_id * 2;
>> + /*
>> + * Fix the x2APIC identifier. The PV side is nonsensical, but
>> + * we've always shown it like this so it's kept for compat.
>> + */
>
> In hindsight I should have changed "Fix the x2APIC identifier." when I
> reworked this logic, but oh well - better late than never.
>
> /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
> subleaf. */
Can we perhaps avoid "fix" in this comment? "Adjusted", "overwritten", or
some such ought to do, without carrying a hint towards some bug somewhere.
>> --- a/xen/include/public/arch-x86/hvm/save.h
>> +++ b/xen/include/public/arch-x86/hvm/save.h
>> @@ -394,6 +394,8 @@ struct hvm_hw_lapic {
>> uint32_t disabled; /* VLAPIC_xx_DISABLED */
>> uint32_t timer_divisor;
>> uint64_t tdt_msr;
>> + uint32_t x2apic_id;
>> + uint32_t rsvd_zero;
>
> ... we do normally spell it _rsvd; to make it extra extra clear that
> people shouldn't be doing anything with it.
Alternatively, to carry the "zero" in the name, how about _mbz?
Jan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-30 6:37 ` Jan Beulich
@ 2024-10-30 12:03 ` Alejandro Vallejo
2024-10-30 12:05 ` Jan Beulich
2024-10-30 12:25 ` Andrew Cooper
1 sibling, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-30 12:03 UTC (permalink / raw)
To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, xen-devel
Hi,
On Wed Oct 30, 2024 at 6:37 AM GMT, Jan Beulich wrote:
> On 29.10.2024 21:30, Andrew Cooper wrote:
> > On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> >> @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> >> break;
> >>
> >> case 0xb:
> >> - /*
> >> - * In principle, this leaf is Intel-only. In practice, it is tightly
> >> - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
> >> - * to guests on AMD hardware as well.
> >> - *
> >> - * TODO: Rework topology logic.
> >> - */
> >> if ( p->basic.x2apic )
> >> {
> >> *(uint8_t *)&res->c = subleaf;
> >>
> >> - /* Fix the x2APIC identifier. */
> >> - res->d = v->vcpu_id * 2;
> >> + /*
> >> + * Fix the x2APIC identifier. The PV side is nonsensical, but
> >> + * we've always shown it like this so it's kept for compat.
> >> + */
> >
> > > In hindsight I should have changed "Fix the x2APIC identifier." when I
> > reworked this logic, but oh well - better late than never.
> >
> > /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
> > subleaf. */
>
> Can we perhaps avoid "fix" in this comment? "Adjusted", "overwritten", or
> some such ought to do, without carrying a hint towards some bug somewhere.
I understood "fix" there as "pin" rather than "unbreak". Regardless I can also
rewrite it as "The x2APIC ID is per-vCPU and shown on all subleafs"
>
> >> --- a/xen/include/public/arch-x86/hvm/save.h
> >> +++ b/xen/include/public/arch-x86/hvm/save.h
> >> @@ -394,6 +394,8 @@ struct hvm_hw_lapic {
> >> uint32_t disabled; /* VLAPIC_xx_DISABLED */
> >> uint32_t timer_divisor;
> >> uint64_t tdt_msr;
> >> + uint32_t x2apic_id;
> >> + uint32_t rsvd_zero;
> >
> > ... we do normally spell it _rsvd; to make it extra extra clear that
> > people shouldn't be doing anything with it.
>
> Alternatively, to carry the "zero" in the name, how about _mbz?
>
> Jan
I'd prefer that to _rsvd, if anything to make it patently clear that leaving
rubble is not ok.
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-30 12:03 ` Alejandro Vallejo
@ 2024-10-30 12:05 ` Jan Beulich
0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-10-30 12:05 UTC (permalink / raw)
To: Alejandro Vallejo; +Cc: Roger Pau Monné, xen-devel, Andrew Cooper
On 30.10.2024 13:03, Alejandro Vallejo wrote:
> On Wed Oct 30, 2024 at 6:37 AM GMT, Jan Beulich wrote:
>> On 29.10.2024 21:30, Andrew Cooper wrote:
>>> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
>>>> @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>>> break;
>>>>
>>>> case 0xb:
>>>> - /*
>>>> - * In principle, this leaf is Intel-only. In practice, it is tightly
>>>> - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
>>>> - * to guests on AMD hardware as well.
>>>> - *
>>>> - * TODO: Rework topology logic.
>>>> - */
>>>> if ( p->basic.x2apic )
>>>> {
>>>> *(uint8_t *)&res->c = subleaf;
>>>>
>>>> - /* Fix the x2APIC identifier. */
>>>> - res->d = v->vcpu_id * 2;
>>>> + /*
>>>> + * Fix the x2APIC identifier. The PV side is nonsensical, but
>>>> + * we've always shown it like this so it's kept for compat.
>>>> + */
>>>
>>> In hindsight I should have changed "Fix the x2APIC identifier." when I
>>> reworked this logic, but oh well - better late than never.
>>>
>>> /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
>>> subleaf. */
>>
>> Can we perhaps avoid "fix" in this comment? "Adjusted", "overwritten", or
>> some such ought to do, without carrying a hint towards some bug somewhere.
>
> I understood "fix" there as "pin" rather than "unbreak".
Oh, right - that possible meaning escaped me.
Jan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-30 6:37 ` Jan Beulich
2024-10-30 12:03 ` Alejandro Vallejo
@ 2024-10-30 12:25 ` Andrew Cooper
1 sibling, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2024-10-30 12:25 UTC (permalink / raw)
To: Jan Beulich, Alejandro Vallejo; +Cc: Roger Pau Monné, xen-devel
On 30/10/2024 6:37 am, Jan Beulich wrote:
> On 29.10.2024 21:30, Andrew Cooper wrote:
>> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
>>> @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>>> break;
>>>
>>> case 0xb:
>>> - /*
>>> - * In principle, this leaf is Intel-only. In practice, it is tightly
>>> - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
>>> - * to guests on AMD hardware as well.
>>> - *
>>> - * TODO: Rework topology logic.
>>> - */
>>> if ( p->basic.x2apic )
>>> {
>>> *(uint8_t *)&res->c = subleaf;
>>>
>>> - /* Fix the x2APIC identifier. */
>>> - res->d = v->vcpu_id * 2;
>>> + /*
>>> + * Fix the x2APIC identifier. The PV side is nonsensical, but
>>> + * we've always shown it like this so it's kept for compat.
>>> + */
>> In hindsight I should have changed "Fix the x2APIC identifier." when I
>> reworked this logic, but oh well - better late than never.
>>
>> /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
>> subleaf. */
> Can we perhaps avoid "fix" in this comment? "Adjusted", "overwritten", or
> some such ought to do, without carrying a hint towards some bug somewhere.
Not really. This is actually a good example of why "fix" as in bugfix
is a weird corner of English, despite it being common in coding circles.
"Fix" means to attach one thing to another, along with a strong
implication that the other thing doesn't move. This comes from Latin,
and the collective term for nails/screws/bolts/etc is "fixings".
Fix as in bugfix derives from "to repair" or "to mend", which in turn
comes from the fact that even today (and moreso several hundred years
ago), many repairs/mends involve affixing one thing back to something
else that doesn't move.
In this case, it is res->d which is fixed (as in unmoving) with
respect to the subleaf index. It is weird even for CPUID; it's the only
example I'm aware of where the content of the word is logically the same
piece of information in any subleaf.
~Andrew
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
2024-10-29 20:30 ` Andrew Cooper
2024-10-30 6:37 ` Jan Beulich
@ 2024-10-30 12:00 ` Alejandro Vallejo
1 sibling, 0 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-30 12:00 UTC (permalink / raw)
To: Andrew Cooper, xen-devel; +Cc: Jan Beulich, Roger Pau Monné
I'm fine with all suggestions, with one exception that needs a bit more
explanation...
On Tue Oct 29, 2024 at 8:30 PM GMT, Andrew Cooper wrote:
> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> > This allows the initial x2APIC ID to be sent on the migration stream.
> > This allows further changes to topology and APIC ID assignment without
> > breaking existing hosts. Given the vlapic data is zero-extended on
> > restore, fix up migrations from hosts without the field by setting it to
> > the old convention if zero.
> >
> > The hardcoded mapping x2apic_id=2*vcpu_id is kept for the time being,
> > but it's meant to be overridden by the toolstack in a later patch with
> > appropriate values.
> >
> > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>
> I'm going to request some changes, but I think they're only comment
> changes. [edit, no sadly, one non-comment change.]
>
> It's unfortunate that Xen uses an instance of hvm_hw_lapic for its
> internal state, but one swamp at a time.
>
>
> In the subject, there's no such thing as the "initial" x2APIC ID.
> There's just "the x2APIC ID" and it's not mutable state as far as the
> guest is concerned (This is different to the xAPIC id, where there is
> an architectural concept of the initial xAPIC ID, from the days when
> OSes were permitted to edit it). Also, it's x86/hvm, seeing as this is
> an HVM specific change you're making.
>
> Next, while it's true that this allows the value to move in the
> migration stream, the more important point is that this allows the
> toolstack to configure the x2APIC ID for each vCPU.
>
> So, for the commit message, I recommend:
>
> ---%<---
> Today, Xen hard-codes x2APIC_ID = vcpu_id * 2, but this is unwise and
> interferes with providing accurate topology information to the guest.
>
> Introduce a new x2apic_id field into hvm_hw_lapic. This is immutable
> state from the guest's point of view, but it allows the toolstack to
> configure the value, and for the value to move on migrate.
>
> For backwards compatibility, we treat incoming zeroes as if they were
> the old hardcoded scheme.
> ---%<---
>
> > diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> > index 2a777436ee27..e2489ff8e346 100644
> > --- a/xen/arch/x86/cpuid.c
> > +++ b/xen/arch/x86/cpuid.c
> > @@ -138,10 +138,9 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> > const struct cpu_user_regs *regs;
> >
> > case 0x1:
> > - /* TODO: Rework topology logic. */
> > res->b &= 0x00ffffffu;
> > if ( is_hvm_domain(d) )
> > - res->b |= (v->vcpu_id * 2) << 24;
> > + res->b |= vlapic_x2apic_id(vcpu_vlapic(v)) << 24;
>
> There wants to be some kind of note here, especially as you're feeding
> vlapic_x2apic_id() into a field called xAPIC ID. Perhaps
>
> /* Large systems do wrap around 255 in the xAPIC_ID field. */
>
> ?
>
>
> >
> > /* TODO: Rework vPMU control in terms of toolstack choices. */
> > if ( vpmu_available(v) &&
> > @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> > break;
> >
> > case 0xb:
> > - /*
> > - * In principle, this leaf is Intel-only. In practice, it is tightly
> > - * coupled with x2apic, and we offer an x2apic-capable APIC emulation
> > - * to guests on AMD hardware as well.
> > - *
> > - * TODO: Rework topology logic.
> > - */
> > if ( p->basic.x2apic )
> > {
> > *(uint8_t *)&res->c = subleaf;
> >
> > - /* Fix the x2APIC identifier. */
> > - res->d = v->vcpu_id * 2;
> > + /*
> > + * Fix the x2APIC identifier. The PV side is nonsensical, but
> > + * we've always shown it like this so it's kept for compat.
> > + */
>
> In hindsight I should have changed "Fix the x2APIC identifier." when I
> reworked this logic, but oh well - better late than never.
>
> /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
> subleaf. */
>
> I'd also put a little more context in the PV side:
>
> /* Xen 4.18 and earlier leaked x2APIC into PV guests. The value shown
> is nonsensical but kept as-was for compatibility. */
>
> > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> > index 3363926b487b..33b463925f4e 100644
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -1538,6 +1538,16 @@ static void lapic_load_fixup(struct vlapic *vlapic)
> > const struct vcpu *v = vlapic_vcpu(vlapic);
> > uint32_t good_ldr = x2apic_ldr_from_id(vlapic->loaded.id);
> >
> > + /*
> > + * Loading record without hw.x2apic_id in the save stream, calculate using
> > + * the traditional "vcpu_id * 2" relation. There's an implicit assumption
> > + * that vCPU0 always has x2APIC0, which is true for the old relation, and
> > + * still holds under the new x2APIC generation algorithm. While that case
> > + * goes through the conditional it's benign because it still maps to zero.
> > + */
>
> It's not an implicit assumption; it's very explicit.
It's implicit because it's not mentioned anywhere else and parts of the Xen
ecosystem live under the pretense that such a thing can indeed happen.
>
> /* Xen 4.19 and earlier had no x2APIC_ID in the migration stream, and
> hard-coded "vcpu_id * 2". Default back to this if we have a
> zero-extended record. */
>
> But this will malfunction if the toolstack tries to set v!0's
> x2APIC_ID to 0.
I assume you mean vcpuN with N != 0. I maintain that allowing non-monotonically
increasing APIC IDs on vCPUs is technical debt disguised as a misfeature. For
one, it would prevent hvmloader from asserting some sanity on its own reads of
APIC IDs, but it would be a mess to debug in general. I started making real
progress on the toolstack after asserting all APs had non-zero APIC IDs.
So, while...
>
> What you need to know is whether lapic_load_hidden() had to zero-extend
> the record or not (more specifically, over this field), so you want
> h->size <= offsetof(struct hvm_hw_lapic, x2apic_id) as the gating condition.
... this is true and a more adequate gating condition (that I'm happy to
replace the current one with), I'd still like to keep the invariant that APIC
IDs must be monotonically increasing with the vCPU id, which has the side
effect of banning zero outside the BSP.
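To make that invariant concrete, a small sketch of the kind of check I have
in mind. The function name and the flat array indexed by vCPU id are
hypothetical, not existing toolstack or hvmloader code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sanity check: ids[i] is the APIC ID of vCPU i.  Requiring
 * each ID to be strictly greater than the previous one also guarantees
 * that no vCPU other than vCPU0 can end up with APIC ID 0.
 */
static bool apic_ids_strictly_increasing(const uint32_t *ids, unsigned int nr)
{
    for ( unsigned int i = 1; i < nr; i++ )
        if ( ids[i] <= ids[i - 1] )
            return false;

    return true;
}
```

The old "vcpu_id * 2" layout { 0, 2, 4, 6 } passes, a layout with gaps such
as { 0, 1, 2, 3, 8 } passes too, and { 0, 2, 0 } is rejected.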
>
> This should be safe for the toolstack, I think. Hypercalls prior to
> this patch will get a shorter record, and hypercalls from this patch
> onwards will get a longer record with the default x2APIC_ID = vcpu_id *
> 2 filled in.
>
> > + if ( !vlapic->hw.x2apic_id )
> > + vlapic->hw.x2apic_id = v->vcpu_id * 2;
> > +
> > /* Skip fixups on xAPIC mode, or if the x2APIC LDR is already correct */
> > if ( !vlapic_x2apic_mode(vlapic) ||
> > (vlapic->loaded.ldr == good_ldr) )
> > @@ -1606,6 +1616,13 @@ static int cf_check lapic_check_hidden(const struct domain *d,
> > APIC_BASE_EXTD )
> > return -EINVAL;
> >
> > + /*
> > + * Fail migrations from newer versions of Xen where
> > + * rsvd_zero is interpreted as something else.
> > + */
>
> This comment isn't necessary. We've got no shortage of reserved
> checks. However ...
>
> > diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
> > index 7ecacadde165..1c2ec669ffc9 100644
> > --- a/xen/include/public/arch-x86/hvm/save.h
> > +++ b/xen/include/public/arch-x86/hvm/save.h
> > @@ -394,6 +394,8 @@ struct hvm_hw_lapic {
> > uint32_t disabled; /* VLAPIC_xx_DISABLED */
> > uint32_t timer_divisor;
> > uint64_t tdt_msr;
> > + uint32_t x2apic_id;
> > + uint32_t rsvd_zero;
>
> ... we do normally spell it _rsvd; to make it extra extra clear that
> people shouldn't be doing anything with it.
>
> ~Andrew
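
A minimal sketch of the gating condition discussed above; it is not code
from the series. The struct demo_lapic_rec and the helper
demo_fixup_x2apic_id() are hypothetical stand-ins for struct hvm_hw_lapic
and the fixup in lapic_load_fixup(), and rec_size stands in for the size of
the record as found in the stream. The legacy fallback is keyed on whether
the record was too short to carry the field, mirroring the suggested
"h->size <= offsetof(...)" test, rather than on the field's value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct hvm_hw_lapic. */
struct demo_lapic_rec {
    uint64_t apic_base_msr;
    uint32_t disabled;
    uint32_t timer_divisor;
    uint64_t tdt_msr;
    uint32_t x2apic_id;   /* new field, absent from older streams */
    uint32_t _rsvd;       /* reserved, expected to be zero */
};

/*
 * rec_size is the number of bytes actually present in the stream; anything
 * beyond it is assumed to have been zero-filled by the loader.  Fall back to
 * the legacy "vcpu_id * 2" relation only when the record was too short to
 * carry the field, so an explicitly provided value is never overwritten.
 */
static void demo_fixup_x2apic_id(struct demo_lapic_rec *rec, size_t rec_size,
                                 unsigned int vcpu_id)
{
    assert(rec != NULL);

    if ( rec_size <= offsetof(struct demo_lapic_rec, x2apic_id) )
        rec->x2apic_id = vcpu_id * 2;
}
```

With this shape, whether a zero ID on a vCPU other than vCPU0 is acceptable
becomes a separate policy check instead of being folded into the
zero-extension test.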
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v7 03/10] xen/x86: Add supporting code for uploading LAPIC contexts during domain create
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 01/10] lib/x86: Bump max basic leaf in {pv,hvm}_max_policy Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-12-02 9:27 ` Jan Beulich
2024-10-21 15:45 ` [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves Alejandro Vallejo
` (6 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné
A later patch will upload LAPIC contexts as part of domain creation. In
order for it not to encounter a problem where the architectural state
does not reflect the APIC ID in the hidden state, this patch ensures
updates to the hidden state trigger an update in the architectural
registers so the APIC ID in both is consistent.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Rework the commit message so it explains a follow-up patch rather
than hypothetical behaviour.
---
xen/arch/x86/hvm/vlapic.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 33b463925f4e..03581eb33812 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1640,7 +1640,27 @@ static int cf_check lapic_load_hidden(struct domain *d, hvm_domain_context_t *h)
s->loaded.hw = 1;
if ( s->loaded.regs )
+ {
+ /*
+ * We already processed architectural regs in lapic_load_regs(), so
+ * this must be a migration. Fix up inconsistencies from any older Xen.
+ */
lapic_load_fixup(s);
+ }
+ else
+ {
+ /*
+ * We haven't seen architectural regs so this could be a migration or a
+ * plain domain create. In the domain create case it's fine to modify
+ * the architectural state to align it to the APIC ID that was just
+ * uploaded and in the migrate case it doesn't matter because the
+ * architectural state will be replaced by the LAPIC_REGS ctx later on.
+ */
+ if ( vlapic_x2apic_mode(s) )
+ set_x2apic_id(s);
+ else
+ vlapic_set_reg(s, APIC_ID, SET_xAPIC_ID(s->hw.x2apic_id));
+ }
hvm_update_vlapic_mode(v);
--
2.47.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (2 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 03/10] xen/x86: Add supporting code for uploading LAPIC contexts during domain create Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-30 11:31 ` Andrew Cooper
2024-12-02 9:36 ` Jan Beulich
2024-10-21 15:45 ` [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer Alejandro Vallejo
` (5 subsequent siblings)
9 siblings, 2 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD
Make it so the APs expose their own APIC IDs in a LUT. We can use that
LUT to populate the MADT, decoupling the algorithm that relates CPU IDs
and APIC IDs from hvmloader.
Moved smp_initialise() ahead of apic_setup() in order to initialise
cpu_to_x2apicid ASAP and avoid using it uninitialised. Note that
bringing up the APs doesn't need the APIC in hvmloader because it always
runs virtualized and uses the PV interface.
While at it, exploit the assumption that CPU0 always has APICID0 to
remove ap_callin, as writing the APIC ID may serve the same purpose.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* CPU_TO_X2APICID to lowercase
* Spell out the CPU0<-->APICID0 relationship in the commit message as
the rationale to remove ap_callin.
* Explain the motion of smp_initialise() ahead of apic_setup() in the
commit message.
---
tools/firmware/hvmloader/config.h | 5 ++-
tools/firmware/hvmloader/hvmloader.c | 6 +--
tools/firmware/hvmloader/mp_tables.c | 4 +-
tools/firmware/hvmloader/smp.c | 57 ++++++++++++++++++++-----
tools/firmware/hvmloader/util.c | 2 +-
tools/include/xen-tools/common-macros.h | 5 +++
6 files changed, 63 insertions(+), 16 deletions(-)
diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
index cd716bf39245..04cab1e59f08 100644
--- a/tools/firmware/hvmloader/config.h
+++ b/tools/firmware/hvmloader/config.h
@@ -4,6 +4,8 @@
#include <stdint.h>
#include <stdbool.h>
+#include <xen/hvm/hvm_info_table.h>
+
enum virtual_vga { VGA_none, VGA_std, VGA_cirrus, VGA_pt };
extern enum virtual_vga virtual_vga;
@@ -48,8 +50,9 @@ extern uint8_t ioapic_version;
#define IOAPIC_ID 0x01
+extern uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
+
#define LAPIC_BASE_ADDRESS 0xfee00000
-#define LAPIC_ID(vcpu_id) ((vcpu_id) * 2)
#define PCI_ISA_DEVFN 0x08 /* dev 1, fn 0 */
#define PCI_ISA_IRQ_MASK 0x0c20U /* ISA IRQs 5,10,11 are PCI connected */
diff --git a/tools/firmware/hvmloader/hvmloader.c b/tools/firmware/hvmloader/hvmloader.c
index f8af88fabf24..bebdfa923880 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -224,7 +224,7 @@ static void apic_setup(void)
/* 8259A ExtInts are delivered through IOAPIC pin 0 (Virtual Wire Mode). */
ioapic_write(0x10, APIC_DM_EXTINT);
- ioapic_write(0x11, SET_APIC_ID(LAPIC_ID(0)));
+ ioapic_write(0x11, SET_APIC_ID(cpu_to_x2apicid[0]));
}
struct bios_info {
@@ -341,11 +341,11 @@ int main(void)
printf("CPU speed is %u MHz\n", get_cpu_mhz());
+ smp_initialise();
+
apic_setup();
pci_setup();
- smp_initialise();
-
perform_tests();
if ( bios->bios_info_setup )
diff --git a/tools/firmware/hvmloader/mp_tables.c b/tools/firmware/hvmloader/mp_tables.c
index 77d3010406d0..539260365e1e 100644
--- a/tools/firmware/hvmloader/mp_tables.c
+++ b/tools/firmware/hvmloader/mp_tables.c
@@ -198,8 +198,10 @@ static void fill_mp_config_table(struct mp_config_table *mpct, int length)
/* fills in an MP processor entry for VCPU 'vcpu_id' */
static void fill_mp_proc_entry(struct mp_proc_entry *mppe, int vcpu_id)
{
+ ASSERT(cpu_to_x2apicid[vcpu_id] < 0xFF );
+
mppe->type = ENTRY_TYPE_PROCESSOR;
- mppe->lapic_id = LAPIC_ID(vcpu_id);
+ mppe->lapic_id = cpu_to_x2apicid[vcpu_id];
mppe->lapic_version = 0x11;
mppe->cpu_flags = CPU_FLAG_ENABLED;
if ( vcpu_id == 0 )
diff --git a/tools/firmware/hvmloader/smp.c b/tools/firmware/hvmloader/smp.c
index 1b940cefd071..d63536f14f00 100644
--- a/tools/firmware/hvmloader/smp.c
+++ b/tools/firmware/hvmloader/smp.c
@@ -29,7 +29,37 @@
#include <xen/vcpu.h>
-static int ap_callin;
+/**
+ * Lookup table of (x2)APIC IDs.
+ *
+ * Each entry is populated by its respective CPU as it comes online. This is required
+ * for generating the MADT with minimal assumptions about ID relationships.
+ *
+ * While the name makes "x2" explicit, these may actually be xAPIC IDs if no
+ * x2APIC is present. "x2" merely highlights that each entry is 32 bits wide.
+ */
+uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
+
+/** Tristate about x2apic being supported. -1=unknown */
+static int has_x2apic = -1;
+
+static uint32_t read_apic_id(void)
+{
+ uint32_t apic_id;
+
+ if ( has_x2apic )
+ cpuid(0xb, NULL, NULL, NULL, &apic_id);
+ else
+ {
+ cpuid(1, NULL, &apic_id, NULL, NULL);
+ apic_id >>= 24;
+ }
+
+ /* Never called by cpu0, so should never return 0 */
+ ASSERT(apic_id);
+
+ return apic_id;
+}
static void cpu_setup(unsigned int cpu)
{
@@ -37,13 +67,17 @@ static void cpu_setup(unsigned int cpu)
cacheattr_init();
printf("done.\n");
- if ( !cpu ) /* Used on the BSP too */
+ /* The BSP exits early because its APIC ID is known to be zero */
+ if ( !cpu )
return;
wmb();
- ap_callin = 1;
+ ACCESS_ONCE(cpu_to_x2apicid[cpu]) = read_apic_id();
- /* After this point, the BSP will shut us down. */
+ /*
+ * After this point the BSP will shut us down. A write to
+ * cpu_to_x2apicid[cpu] signals the BSP to bring down `cpu`.
+ */
for ( ;; )
asm volatile ( "hlt" );
@@ -54,10 +88,6 @@ static void boot_cpu(unsigned int cpu)
static uint8_t ap_stack[PAGE_SIZE] __attribute__ ((aligned (16)));
static struct vcpu_hvm_context ap;
- /* Initialise shared variables. */
- ap_callin = 0;
- wmb();
-
/* Wake up the secondary processor */
ap = (struct vcpu_hvm_context) {
.mode = VCPU_HVM_MODE_32B,
@@ -90,10 +120,11 @@ static void boot_cpu(unsigned int cpu)
BUG();
/*
- * Wait for the secondary processor to complete initialisation.
+ * Wait for the secondary processor to complete initialisation,
+ * which is signaled by its x2APIC ID being written to the LUT.
* Do not touch shared resources meanwhile.
*/
- while ( !ap_callin )
+ while ( !ACCESS_ONCE(cpu_to_x2apicid[cpu]) )
cpu_relax();
/* Take the secondary processor offline. */
@@ -104,6 +135,12 @@ static void boot_cpu(unsigned int cpu)
void smp_initialise(void)
{
unsigned int i, nr_cpus = hvm_info->nr_vcpus;
+ uint32_t ecx;
+
+ cpuid(1, NULL, NULL, &ecx, NULL);
+ has_x2apic = (ecx >> 21) & 1;
+ if ( has_x2apic )
+ printf("x2APIC supported\n");
printf("Multiprocessor initialisation:\n");
cpu_setup(0);
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index d3b3f9038e64..821b3086a87d 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -827,7 +827,7 @@ static void acpi_mem_free(struct acpi_ctxt *ctxt,
static uint32_t acpi_lapic_id(unsigned cpu)
{
- return LAPIC_ID(cpu);
+ return cpu_to_x2apicid[cpu];
}
void hvmloader_acpi_build_tables(struct acpi_config *config,
diff --git a/tools/include/xen-tools/common-macros.h b/tools/include/xen-tools/common-macros.h
index 60912225cb7a..336c6309d96e 100644
--- a/tools/include/xen-tools/common-macros.h
+++ b/tools/include/xen-tools/common-macros.h
@@ -108,4 +108,9 @@
#define get_unaligned(ptr) get_unaligned_t(typeof(*(ptr)), ptr)
#define put_unaligned(val, ptr) put_unaligned_t(typeof(*(ptr)), val, ptr)
+#define __ACCESS_ONCE(x) ({ \
+ (void)(typeof(x))0; /* Scalar typecheck. */ \
+ (volatile typeof(x) *)&(x); })
+#define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
+
#endif /* __XEN_TOOLS_COMMON_MACROS__ */
--
2.47.0
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-10-21 15:45 ` [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves Alejandro Vallejo
@ 2024-10-30 11:31 ` Andrew Cooper
2024-10-30 12:04 ` Jan Beulich
2024-11-11 11:20 ` Alejandro Vallejo
2024-12-02 9:36 ` Jan Beulich
1 sibling, 2 replies; 27+ messages in thread
From: Andrew Cooper @ 2024-10-30 11:31 UTC (permalink / raw)
To: Alejandro Vallejo, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Anthony PERARD
On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
> index cd716bf39245..04cab1e59f08 100644
> --- a/tools/firmware/hvmloader/config.h
> +++ b/tools/firmware/hvmloader/config.h
> @@ -4,6 +4,8 @@
> #include <stdint.h>
> #include <stdbool.h>
>
> +#include <xen/hvm/hvm_info_table.h>
> +
> enum virtual_vga { VGA_none, VGA_std, VGA_cirrus, VGA_pt };
> extern enum virtual_vga virtual_vga;
>
> @@ -48,8 +50,9 @@ extern uint8_t ioapic_version;
>
> #define IOAPIC_ID 0x01
>
> +extern uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
Just cpu_to_apic_id[] please. The distinction between x or x2 isn't
interesting here.
HVM_MAX_VCPUS is a constant that should never have existed in the first
place, *and* it's the limit we're looking to finally break when this
series is accepted.
This array needs to be hvm_info->nr_vcpus entries long, and will want to
be more than 128 entries very soon. Just scratch_alloc() the array.
Then you can avoid the include.
> diff --git a/tools/firmware/hvmloader/mp_tables.c b/tools/firmware/hvmloader/mp_tables.c
> index 77d3010406d0..539260365e1e 100644
> --- a/tools/firmware/hvmloader/mp_tables.c
> +++ b/tools/firmware/hvmloader/mp_tables.c
> @@ -198,8 +198,10 @@ static void fill_mp_config_table(struct mp_config_table *mpct, int length)
> /* fills in an MP processor entry for VCPU 'vcpu_id' */
> static void fill_mp_proc_entry(struct mp_proc_entry *mppe, int vcpu_id)
> {
> + ASSERT(cpu_to_x2apicid[vcpu_id] < 0xFF );
This is just going to break when we hit 256 vCPUs in a VM.
What do real systems do?
They'll either wrap around 255 like the CPUID xAPIC_ID does, or they'll
not write out MP tables at all.
> diff --git a/tools/firmware/hvmloader/smp.c b/tools/firmware/hvmloader/smp.c
> index 1b940cefd071..d63536f14f00 100644
> --- a/tools/firmware/hvmloader/smp.c
> +++ b/tools/firmware/hvmloader/smp.c
> @@ -90,10 +120,11 @@ static void boot_cpu(unsigned int cpu)
> BUG();
>
> /*
> - * Wait for the secondary processor to complete initialisation.
> + * Wait for the secondary processor to complete initialisation,
> + * which is signaled by its x2APIC ID being written to the LUT.
Technically all arrays are a lookup table, but I'm not sure LUT is a
common enough term to be used unqualified like this.
Just say "... signalled by writing its APIC_ID out." The where is very
apparent by the code.
> @@ -104,6 +135,12 @@ static void boot_cpu(unsigned int cpu)
> void smp_initialise(void)
> {
> unsigned int i, nr_cpus = hvm_info->nr_vcpus;
> + uint32_t ecx;
> +
> + cpuid(1, NULL, NULL, &ecx, NULL);
> + has_x2apic = (ecx >> 21) & 1;
> + if ( has_x2apic )
> + printf("x2APIC supported\n");
You need to check max_leaf >= 0xb too. Remember Xen might not give you
leaf 0xb yet, and then you'll hit the assert for finding 0.
And has_x2apic wants to be a simple boolean. Nothing good can come from
confusing -1 with "x2apic available".
I recommend splitting this patch into three. Several aspects are quite
subtle.
1) Collect the APIC_IDs on APs
2) Change how callin is signalled.
3) Replace LAPIC_ID() with the collected apic_id.
but AFAICT, it can be done as a standalone series, independently of the
other Xen/toolstack work.
~Andrew
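
A rough sketch of the check requested above, written as pure functions over
a snapshot of CPUID values so it can be read in isolation; it is not the
hvmloader code. The struct demo_cpuid and both helpers are hypothetical
names. It assumes the usual CPUID layout: the maximum basic leaf in
CPUID.0:EAX, the x2APIC feature bit in CPUID.1:ECX bit 21, the xAPIC ID in
CPUID.1:EBX bits 31:24, and the x2APIC ID in CPUID.0xb:EDX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_X2APIC_BIT 21u  /* CPUID.1:ECX bit tested by the patch */

/* Hypothetical snapshot of the CPUID values the decision depends on. */
struct demo_cpuid {
    uint32_t max_leaf;    /* CPUID.0:EAX */
    uint32_t leaf1_ebx;   /* CPUID.1:EBX, xAPIC ID in bits 31:24 */
    uint32_t leaf1_ecx;   /* CPUID.1:ECX */
    uint32_t leafb_edx;   /* CPUID.0xb:EDX, only meaningful if the leaf exists */
};

/* Plain boolean: leaf 0xb is usable only if it exists and x2APIC is advertised. */
static bool demo_use_leaf_b(const struct demo_cpuid *c)
{
    assert(c != NULL);

    return c->max_leaf >= 0xb && ((c->leaf1_ecx >> DEMO_X2APIC_BIT) & 1);
}

/* Pick the ID source accordingly, falling back to the 8-bit xAPIC ID. */
static uint32_t demo_read_apic_id(const struct demo_cpuid *c)
{
    return demo_use_leaf_b(c) ? c->leafb_edx : c->leaf1_ebx >> 24;
}
```

Folding both conditions into one boolean computed once also removes the need
for a tristate, since there is no "unknown" value left to confuse with
"available".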
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-10-30 11:31 ` Andrew Cooper
@ 2024-10-30 12:04 ` Jan Beulich
2024-11-11 11:20 ` Alejandro Vallejo
1 sibling, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-10-30 12:04 UTC (permalink / raw)
To: Andrew Cooper
Cc: Roger Pau Monné, Alejandro Vallejo, Anthony PERARD,
xen-devel
On 30.10.2024 12:31, Andrew Cooper wrote:
> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
>> --- a/tools/firmware/hvmloader/mp_tables.c
>> +++ b/tools/firmware/hvmloader/mp_tables.c
>> @@ -198,8 +198,10 @@ static void fill_mp_config_table(struct mp_config_table *mpct, int length)
>> /* fills in an MP processor entry for VCPU 'vcpu_id' */
>> static void fill_mp_proc_entry(struct mp_proc_entry *mppe, int vcpu_id)
>> {
>> + ASSERT(cpu_to_x2apicid[vcpu_id] < 0xFF );
>
> This is just going to break when we hit 256 vCPUs in a VM.
>
> What do real systems do?
>
> They'll either wrap around 255 like the CPUID xAPIC_ID does, or they'll
> not write out MP tables at all.
"at all" may be going a little far. They may simply not advertise CPUs with
too wide APIC IDs there, while still allowing others to be discovered this
legacy way.
Jan
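
One way to picture the "don't advertise CPUs with too wide APIC IDs" option
is the sketch below. It is an illustration only, not what hvmloader does:
struct demo_mp_proc_entry is a hypothetical, trimmed-down entry holding just
the 8-bit ID, and demo_fill_mp_procs() is an invented helper. The 0xFF bound
is taken from the ASSERT in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down MP processor entry: only the 8-bit ID field. */
struct demo_mp_proc_entry {
    uint8_t lapic_id;
};

/*
 * Emit entries only for CPUs whose APIC ID is below 0xFF (the bound the
 * patch asserts on).  CPUs with wider IDs are skipped rather than wrapped,
 * and would have to be discovered through other means.
 * Returns the number of entries written to 'out'.
 */
static size_t demo_fill_mp_procs(struct demo_mp_proc_entry *out,
                                 const uint32_t *cpu_to_apic_id,
                                 size_t nr_cpus)
{
    size_t n = 0;

    assert(out != NULL && cpu_to_apic_id != NULL);

    for ( size_t i = 0; i < nr_cpus; i++ )
    {
        if ( cpu_to_apic_id[i] >= 0xFF )
            continue;               /* too wide for the legacy table */

        out[n++].lapic_id = (uint8_t)cpu_to_apic_id[i];
    }

    return n;
}
```

A caller would size 'out' for nr_cpus entries and use the return value as
the processor entry count when filling in the table header.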
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-10-30 11:31 ` Andrew Cooper
2024-10-30 12:04 ` Jan Beulich
@ 2024-11-11 11:20 ` Alejandro Vallejo
2024-11-11 12:07 ` Jan Beulich
1 sibling, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-11-11 11:20 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Anthony PERARD
On Wed Oct 30, 2024 at 11:31 AM GMT, Andrew Cooper wrote:
> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> > diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
> > index cd716bf39245..04cab1e59f08 100644
> > --- a/tools/firmware/hvmloader/config.h
> > +++ b/tools/firmware/hvmloader/config.h
> > @@ -4,6 +4,8 @@
> > #include <stdint.h>
> > #include <stdbool.h>
> >
> > +#include <xen/hvm/hvm_info_table.h>
> > +
> > enum virtual_vga { VGA_none, VGA_std, VGA_cirrus, VGA_pt };
> > extern enum virtual_vga virtual_vga;
> >
> > @@ -48,8 +50,9 @@ extern uint8_t ioapic_version;
> >
> > #define IOAPIC_ID 0x01
> >
> > +extern uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
>
> Just cpu_to_apic_id[] please. The distinction between x or x2 isn't
> interesting here.
I disagree.
While "x" says nothing of interest "x2" does state the width. cpu_to_apic_id is
ambiguous and I've seen no shortage of code in which it's impossible to assess
its correctness without going to check what the original author meant; and
guesswork is bad for robustness. cpu_to_x2apicid has an unambiguous width at
the meager cost of 2 chars. If you have very strong feelings about it I can
change it, but my preference is to keep it as-is.
>
> HVM_MAX_VCPUS is a constant that should never have existed in the first
> place, *and* it's the limit we're looking to finally break when this
> series is accepted.
>
> This array needs to be hvm_info->nr_vcpus entries long, and will want to
> be more than 128 entries very soon. Just scratch_alloc() the array.
> Then you can avoid the include.
That's a major PITA on the libxl side. I'll have a go to see how long it takes
me before I weep :_)
>
> > diff --git a/tools/firmware/hvmloader/mp_tables.c b/tools/firmware/hvmloader/mp_tables.c
> > index 77d3010406d0..539260365e1e 100644
> > --- a/tools/firmware/hvmloader/mp_tables.c
> > +++ b/tools/firmware/hvmloader/mp_tables.c
> > @@ -198,8 +198,10 @@ static void fill_mp_config_table(struct mp_config_table *mpct, int length)
> > /* fills in an MP processor entry for VCPU 'vcpu_id' */
> > static void fill_mp_proc_entry(struct mp_proc_entry *mppe, int vcpu_id)
> > {
> > + ASSERT(cpu_to_x2apicid[vcpu_id] < 0xFF );
>
> This is just going to break when we hit 256 vCPUs in a VM.
>
> What do real systems do?
>
> They'll either wrap around 255 like the CPUID xAPIC_ID does, or they'll
> not write out MP tables at all.
Definitely not wrapping around, that makes no sense.
It could also show the first 255 APs only. The reality is that if we're
exposing 1000 vCPUs it's because we expect the guest to use them. While it's
likely we want to avoid writing the MP tables, that's not a puddle I want to
play with ATM.
Note that this is not a new breakage. It was already broken if we were to hit
such an APIC ID (which we can't because HVM_MAX_VCPUS is lower). I just made
sure we never write out corrupted tables.
>
> > diff --git a/tools/firmware/hvmloader/smp.c b/tools/firmware/hvmloader/smp.c
> > index 1b940cefd071..d63536f14f00 100644
> > --- a/tools/firmware/hvmloader/smp.c
> > +++ b/tools/firmware/hvmloader/smp.c
> > @@ -90,10 +120,11 @@ static void boot_cpu(unsigned int cpu)
> > BUG();
> >
> > /*
> > - * Wait for the secondary processor to complete initialisation.
> > + * Wait for the secondary processor to complete initialisation,
> > + * which is signaled by its x2APIC ID being written to the LUT.
>
> Technically all arrays are a lookup table, but I'm not sure LUT is a
No. A look-up table is a very specific implementation of a relation (in the
mathematical sense) between an unsigned integer and some other type,
implemented by means of an array indexed by said integer.
> common enough term to be used unqualified like this.
Happy to change the name if it's uncommon enough in this codebase, but it is
fairly common outside of it, and it's common enough to have its own wikipedia
page with that very acronym.
https://en.wikipedia.org/wiki/Lookup_table
>
> Just say "... signalled by writing its APIC_ID out." The where is very
> apparent by the code.
>
> > @@ -104,6 +135,12 @@ static void boot_cpu(unsigned int cpu)
> > void smp_initialise(void)
> > {
> > unsigned int i, nr_cpus = hvm_info->nr_vcpus;
> > + uint32_t ecx;
> > +
> > + cpuid(1, NULL, NULL, &ecx, NULL);
> > + has_x2apic = (ecx >> 21) & 1;
> > + if ( has_x2apic )
> > + printf("x2APIC supported\n");
>
> You need to check max_leaf >= 0xb too. Remember Xen might not give you
> leaf 0xb yet, and then you'll hit the assert for finding 0.
True.
>
> And has_x2apic wants to be a simple boolean. Nothing good can come from
> confusing -1 with "x2apic available".
Sure
>
>
> I recommend splitting this patch into three. Several aspects are quite
> subtle.
>
> 1) Collect the APIC_IDs on APs
> 2) Change how callin is signalled.
> 3) Replace LAPIC_ID() with the collected apic_id.
>
> but AFAICT, it can be done as a standalone series, independently of the
> other Xen/toolstack work.
Ack
>
> ~Andrew
Cheers,
Alejandro
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-11-11 11:20 ` Alejandro Vallejo
@ 2024-11-11 12:07 ` Jan Beulich
0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-11-11 12:07 UTC (permalink / raw)
To: Alejandro Vallejo
Cc: Roger Pau Monné, Anthony PERARD, Andrew Cooper, xen-devel
On 11.11.2024 12:20, Alejandro Vallejo wrote:
> On Wed Oct 30, 2024 at 11:31 AM GMT, Andrew Cooper wrote:
>> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
>>> diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
>>> index cd716bf39245..04cab1e59f08 100644
>>> --- a/tools/firmware/hvmloader/config.h
>>> +++ b/tools/firmware/hvmloader/config.h
>>> @@ -4,6 +4,8 @@
>>> #include <stdint.h>
>>> #include <stdbool.h>
>>>
>>> +#include <xen/hvm/hvm_info_table.h>
>>> +
>>> enum virtual_vga { VGA_none, VGA_std, VGA_cirrus, VGA_pt };
>>> extern enum virtual_vga virtual_vga;
>>>
>>> @@ -48,8 +50,9 @@ extern uint8_t ioapic_version;
>>>
>>> #define IOAPIC_ID 0x01
>>>
>>> +extern uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
>>
>> Just cpu_to_apic_id[] please. The distinction between x or x2 isn't
>> interesting here.
>
> I disagree.
>
> While "x" says nothing of interest "x2" does state the width. cpu_to_apic_id is
> ambiguous and I've seen no shortage of code in which it's impossible to assess
> its correctness without going to check what the original author meant; and
> guesswork is bad for robustness. cpu_to_x2apicid has an unambiguous width at
> the meager cost of 2 chars. If you have very strong feelings about it I can
> change it, but my preference is to keep it as-is.
Just to mention it: I'm with Andrew here, and iirc I even had commented to this
effect on an earlier version as well.
Jan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
2024-10-21 15:45 ` [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves Alejandro Vallejo
2024-10-30 11:31 ` Andrew Cooper
@ 2024-12-02 9:36 ` Jan Beulich
1 sibling, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-12-02 9:36 UTC (permalink / raw)
To: Alejandro Vallejo
Cc: Andrew Cooper, Roger Pau Monné, Anthony PERARD, xen-devel
On 21.10.2024 17:45, Alejandro Vallejo wrote:
> --- a/tools/firmware/hvmloader/hvmloader.c
> +++ b/tools/firmware/hvmloader/hvmloader.c
> @@ -224,7 +224,7 @@ static void apic_setup(void)
>
> /* 8259A ExtInts are delivered through IOAPIC pin 0 (Virtual Wire Mode). */
> ioapic_write(0x10, APIC_DM_EXTINT);
> - ioapic_write(0x11, SET_APIC_ID(LAPIC_ID(0)));
> + ioapic_write(0x11, SET_APIC_ID(cpu_to_x2apicid[0]));
> }
In uses like this or ...
> --- a/tools/firmware/hvmloader/mp_tables.c
> +++ b/tools/firmware/hvmloader/mp_tables.c
> @@ -198,8 +198,10 @@ static void fill_mp_config_table(struct mp_config_table *mpct, int length)
> /* fills in an MP processor entry for VCPU 'vcpu_id' */
> static void fill_mp_proc_entry(struct mp_proc_entry *mppe, int vcpu_id)
> {
> + ASSERT(cpu_to_x2apicid[vcpu_id] < 0xFF );
> +
> mppe->type = ENTRY_TYPE_PROCESSOR;
> - mppe->lapic_id = LAPIC_ID(vcpu_id);
> + mppe->lapic_id = cpu_to_x2apicid[vcpu_id];
... this one (and also in acpi_lapic_id()), I consider the "x2" in the name
actively confusing, despite ...
> --- a/tools/firmware/hvmloader/smp.c
> +++ b/tools/firmware/hvmloader/smp.c
> @@ -29,7 +29,37 @@
>
> #include <xen/vcpu.h>
>
> -static int ap_callin;
> +/**
> + * Lookup table of (x2)APIC IDs.
> + *
> + * Each entry is populated by its respective CPU as it comes online. This is required
> + * for generating the MADT with minimal assumptions about ID relationships.
> + *
> + * While the name makes "x2" explicit, these may actually be xAPIC IDs if no
> + * x2APIC is present. "x2" merely highlights that each entry is 32 bits wide.
> + */
> +uint32_t cpu_to_x2apicid[HVM_MAX_VCPUS];
... the commentary here.
> +/** Tristate about x2apic being supported. -1=unknown */
> +static int has_x2apic = -1;
Why is this a tristate? Prior to the variable having been set, ...
> +static uint32_t read_apic_id(void)
> +{
> + uint32_t apic_id;
> +
> + if ( has_x2apic )
> + cpuid(0xb, NULL, NULL, NULL, &apic_id);
... this is bogus anyway.
Jan
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (3 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 04/10] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-30 14:56 ` Andrew Cooper
2024-12-02 9:40 ` Jan Beulich
2024-10-21 15:45 ` [PATCH v7 06/10] tools/libguest: Always set vCPU context in vcpu_hvm() Alejandro Vallejo
` (4 subsequent siblings)
9 siblings, 2 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Juergen Gross
Refactors libacpi so that a single LUT is the authoritative source of
truth for the CPU to APIC ID mappings. This has a knock-on effect in
reducing complexity on future patches, as the same LUT can be used for
configuring the APICs and configuring the ACPI tables for PVH.
No functional change intended, because the same mappings are preserved.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* NOTE: didn't add assert to libacpi as initially accepted in order to
protect libvirt from an assert failure.
* s/uint32_t/unsigned int/ in for loop of libxl.
* turned Xen-style loop in libxl to libxl-style.
---
tools/firmware/hvmloader/util.c | 7 +------
tools/include/xenguest.h | 5 +++++
tools/libacpi/build.c | 6 +++---
tools/libacpi/libacpi.h | 2 +-
tools/libs/light/libxl_dom.c | 5 +++++
tools/libs/light/libxl_x86_acpi.c | 7 +------
6 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 821b3086a87d..afa3eb9d5775 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -825,11 +825,6 @@ static void acpi_mem_free(struct acpi_ctxt *ctxt,
/* ACPI builder currently doesn't free memory so this is just a stub */
}
-static uint32_t acpi_lapic_id(unsigned cpu)
-{
- return cpu_to_x2apicid[cpu];
-}
-
void hvmloader_acpi_build_tables(struct acpi_config *config,
unsigned int physical)
{
@@ -859,7 +854,7 @@ void hvmloader_acpi_build_tables(struct acpi_config *config,
}
config->lapic_base_address = LAPIC_BASE_ADDRESS;
- config->lapic_id = acpi_lapic_id;
+ config->cpu_to_apicid = cpu_to_x2apicid;
config->ioapic_base_address = IOAPIC_BASE_ADDRESS;
config->ioapic_id = IOAPIC_ID;
config->pci_isa_irq_mask = PCI_ISA_IRQ_MASK;
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index e01f494b772a..aa50b78dfb89 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -22,6 +22,8 @@
#ifndef XENGUEST_H
#define XENGUEST_H
+#include "xen/hvm/hvm_info_table.h"
+
#define XC_NUMA_NO_NODE (~0U)
#define XCFLAGS_LIVE (1 << 0)
@@ -236,6 +238,9 @@ struct xc_dom_image {
#if defined(__i386__) || defined(__x86_64__)
struct e820entry *e820;
unsigned int e820_entries;
+
+ /* LUT mapping cpu id to (x2)APIC ID */
+ uint32_t cpu_to_apicid[HVM_MAX_VCPUS];
#endif
xen_pfn_t vuart_gfn;
diff --git a/tools/libacpi/build.c b/tools/libacpi/build.c
index 2f29863db154..2ad1d461a2ec 100644
--- a/tools/libacpi/build.c
+++ b/tools/libacpi/build.c
@@ -74,7 +74,7 @@ static struct acpi_20_madt *construct_madt(struct acpi_ctxt *ctxt,
const struct hvm_info_table *hvminfo = config->hvminfo;
int i, sz;
- if ( config->lapic_id == NULL )
+ if ( config->cpu_to_apicid == NULL )
return NULL;
sz = sizeof(struct acpi_20_madt);
@@ -148,7 +148,7 @@ static struct acpi_20_madt *construct_madt(struct acpi_ctxt *ctxt,
lapic->length = sizeof(*lapic);
/* Processor ID must match processor-object IDs in the DSDT. */
lapic->acpi_processor_id = i;
- lapic->apic_id = config->lapic_id(i);
+ lapic->apic_id = config->cpu_to_apicid[i];
lapic->flags = (test_bit(i, hvminfo->vcpu_online)
? ACPI_LOCAL_APIC_ENABLED : 0);
lapic++;
@@ -236,7 +236,7 @@ static struct acpi_20_srat *construct_srat(struct acpi_ctxt *ctxt,
processor->type = ACPI_PROCESSOR_AFFINITY;
processor->length = sizeof(*processor);
processor->domain = config->numa.vcpu_to_vnode[i];
- processor->apic_id = config->lapic_id(i);
+ processor->apic_id = config->cpu_to_apicid[i];
processor->flags = ACPI_LOCAL_APIC_AFFIN_ENABLED;
processor++;
}
diff --git a/tools/libacpi/libacpi.h b/tools/libacpi/libacpi.h
index deda39e5dbc4..e8f603ee18ee 100644
--- a/tools/libacpi/libacpi.h
+++ b/tools/libacpi/libacpi.h
@@ -84,7 +84,7 @@ struct acpi_config {
unsigned long rsdp;
/* x86-specific parameters */
- uint32_t (*lapic_id)(unsigned cpu);
+ const uint32_t *cpu_to_apicid; /* LUT mapping cpu id to (x2)APIC ID */
uint32_t lapic_base_address;
uint32_t ioapic_base_address;
uint16_t pci_isa_irq_mask;
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 94fef374014e..5f4f6830e850 100644
--- a/tools/libs/light/libxl_dom.c
+++ b/tools/libs/light/libxl_dom.c
@@ -1082,6 +1082,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
dom->container_type = XC_DOM_HVM_CONTAINER;
+#if defined(__i386__) || defined(__x86_64__)
+ for (unsigned int i = 0; i < info->max_vcpus; i++)
+ dom->cpu_to_apicid[i] = 2 * i; /* TODO: Replace by topo calculation */
+#endif
+
/* The params from the configuration file are in Mb, which are then
* multiplied by 1 Kb. This was then divided off when calling
* the old xc_hvm_build_target_mem() which then turned them to bytes.
diff --git a/tools/libs/light/libxl_x86_acpi.c b/tools/libs/light/libxl_x86_acpi.c
index 5cf261bd6794..585d4c8755cb 100644
--- a/tools/libs/light/libxl_x86_acpi.c
+++ b/tools/libs/light/libxl_x86_acpi.c
@@ -75,11 +75,6 @@ static void acpi_mem_free(struct acpi_ctxt *ctxt,
{
}
-static uint32_t acpi_lapic_id(unsigned cpu)
-{
- return cpu * 2;
-}
-
static int init_acpi_config(libxl__gc *gc,
struct xc_dom_image *dom,
const libxl_domain_build_info *b_info,
@@ -144,7 +139,7 @@ static int init_acpi_config(libxl__gc *gc,
config->hvminfo = hvminfo;
config->lapic_base_address = LAPIC_BASE_ADDRESS;
- config->lapic_id = acpi_lapic_id;
+ config->cpu_to_apicid = dom->cpu_to_apicid;
config->acpi_revision = 5;
rc = 0;
--
2.47.0
* Re: [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer
2024-10-21 15:45 ` [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer Alejandro Vallejo
@ 2024-10-30 14:56 ` Andrew Cooper
2024-12-02 9:40 ` Jan Beulich
1 sibling, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2024-10-30 14:56 UTC (permalink / raw)
To: Alejandro Vallejo, xen-devel
Cc: Jan Beulich, Roger Pau Monné, Anthony PERARD, Juergen Gross
On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
> index e01f494b772a..aa50b78dfb89 100644
> --- a/tools/include/xenguest.h
> +++ b/tools/include/xenguest.h
> @@ -22,6 +22,8 @@
> #ifndef XENGUEST_H
> #define XENGUEST_H
>
> +#include "xen/hvm/hvm_info_table.h"
> +
> #define XC_NUMA_NO_NODE (~0U)
>
> #define XCFLAGS_LIVE (1 << 0)
> @@ -236,6 +238,9 @@ struct xc_dom_image {
> #if defined(__i386__) || defined(__x86_64__)
> struct e820entry *e820;
> unsigned int e820_entries;
> +
> + /* LUT mapping cpu id to (x2)APIC ID */
> + uint32_t cpu_to_apicid[HVM_MAX_VCPUS];
Same note as the previous patch.
This needs to be a plain dynamically allocated array, because it mustn't
use HVM_MAX_VCPUS.
~Andrew
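
For illustration only, a dynamically allocated LUT sized by the domain's actual vCPU count could look roughly like the sketch below. It is not the eventual patch: `alloc_apicid_lut()` and its `nr_vcpus` parameter are hypothetical names, and the `2 * i` mapping is just the placeholder this series uses at this point.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libxenguest): allocate a cpu -> APIC ID
 * table sized by the real number of vCPUs instead of HVM_MAX_VCPUS.
 * Returns NULL if nr_vcpus is 0 or the allocation fails; the caller frees.
 */
static uint32_t *alloc_apicid_lut(unsigned int nr_vcpus)
{
    uint32_t *lut;

    if ( !nr_vcpus )
        return NULL;

    lut = calloc(nr_vcpus, sizeof(*lut));
    if ( !lut )
        return NULL;

    /* Placeholder mapping, as in this patch: APIC ID = 2 * vcpu_id. */
    for ( unsigned int i = 0; i < nr_vcpus; i++ )
        lut[i] = 2 * i;

    return lut;
}
```

With this shape the struct would only carry a `uint32_t *` (plus, presumably, the element count), so its layout no longer depends on a compile-time vCPU limit.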
* Re: [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer
2024-10-21 15:45 ` [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer Alejandro Vallejo
2024-10-30 14:56 ` Andrew Cooper
@ 2024-12-02 9:40 ` Jan Beulich
1 sibling, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-12-02 9:40 UTC (permalink / raw)
To: Alejandro Vallejo
Cc: Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Juergen Gross, xen-devel
On 21.10.2024 17:45, Alejandro Vallejo wrote:
> Refactors libacpi so that a single LUT is the authoritative source of
> truth for the CPU to APIC ID mappings. This has a knock-on effect of
> reducing complexity in future patches, as the same LUT can be used for
> configuring the APICs and configuring the ACPI tables for PVH.
>
> No functional change intended, because the same mappings are preserved.
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> v7:
> * NOTE: didn't add assert to libacpi as initially accepted in order to
> protect libvirt from an assert failure.
If such an assertion can trigger, doesn't that suggest a problem with the
corresponding caller? I.e. isn't omitting the assertion merely trading a
noisy failure for a silent (and possibly hard to understand) one?
Jan
* [PATCH v7 06/10] tools/libguest: Always set vCPU context in vcpu_hvm()
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (4 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 05/10] tools/libacpi: Use LUT of APIC IDs rather than function pointer Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 07/10] xen/lib: Add topology generator for x86 Alejandro Vallejo
` (3 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel; +Cc: Alejandro Vallejo, Anthony PERARD, Juergen Gross
This is currently used by PVH to set the MTRRs, and will be used by a
later patch to set APIC state. Unconditionally send the hypercall, and
gate overriding the MTRR so it remains functionally equivalent.
While at it, add a missing "goto out" to what was the error condition
in the loop.
In principle this patch shouldn't affect functionality. An extra record
(the MTRR) is sent to the hypervisor per vCPU on HVM, but these records
are identical to those retrieved in the first place so there's no
expected functional change.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Unchanged
---
tools/libs/guest/xg_dom_x86.c | 84 ++++++++++++++++++-----------------
1 file changed, 44 insertions(+), 40 deletions(-)
diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index cba01384ae75..c98229317db7 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -989,6 +989,7 @@ const static void *hvm_get_save_record(const void *ctx, unsigned int type,
static int vcpu_hvm(struct xc_dom_image *dom)
{
+ /* Initialises the BSP */
struct {
struct hvm_save_descriptor header_d;
HVM_SAVE_TYPE(HEADER) header;
@@ -997,6 +998,18 @@ static int vcpu_hvm(struct xc_dom_image *dom)
struct hvm_save_descriptor end_d;
HVM_SAVE_TYPE(END) end;
} bsp_ctx;
+ /* Initialises APICs and MTRRs of every vCPU */
+ struct {
+ struct hvm_save_descriptor header_d;
+ HVM_SAVE_TYPE(HEADER) header;
+ struct hvm_save_descriptor mtrr_d;
+ HVM_SAVE_TYPE(MTRR) mtrr;
+ struct hvm_save_descriptor end_d;
+ HVM_SAVE_TYPE(END) end;
+ } vcpu_ctx;
+ /* Context from full_ctx */
+ const HVM_SAVE_TYPE(MTRR) *mtrr_record;
+ /* Raw context as taken from Xen */
uint8_t *full_ctx = NULL;
int rc;
@@ -1083,51 +1096,42 @@ static int vcpu_hvm(struct xc_dom_image *dom)
bsp_ctx.end_d.instance = 0;
bsp_ctx.end_d.length = HVM_SAVE_LENGTH(END);
- /* TODO: maybe this should be a firmware option instead? */
- if ( !dom->device_model )
+ /* TODO: maybe setting MTRRs should be a firmware option instead? */
+ mtrr_record = hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0);
+
+ if ( !mtrr_record)
{
- struct {
- struct hvm_save_descriptor header_d;
- HVM_SAVE_TYPE(HEADER) header;
- struct hvm_save_descriptor mtrr_d;
- HVM_SAVE_TYPE(MTRR) mtrr;
- struct hvm_save_descriptor end_d;
- HVM_SAVE_TYPE(END) end;
- } mtrr = {
- .header_d = bsp_ctx.header_d,
- .header = bsp_ctx.header,
- .mtrr_d.typecode = HVM_SAVE_CODE(MTRR),
- .mtrr_d.length = HVM_SAVE_LENGTH(MTRR),
- .end_d = bsp_ctx.end_d,
- .end = bsp_ctx.end,
- };
- const HVM_SAVE_TYPE(MTRR) *mtrr_record =
- hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0);
- unsigned int i;
-
- if ( !mtrr_record )
- {
- xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
- "%s: unable to get MTRR save record", __func__);
- goto out;
- }
+ xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+ "%s: unable to get MTRR save record", __func__);
+ goto out;
+ }
- memcpy(&mtrr.mtrr, mtrr_record, sizeof(mtrr.mtrr));
+ vcpu_ctx.header_d = bsp_ctx.header_d;
+ vcpu_ctx.header = bsp_ctx.header;
+ vcpu_ctx.mtrr_d.typecode = HVM_SAVE_CODE(MTRR);
+ vcpu_ctx.mtrr_d.length = HVM_SAVE_LENGTH(MTRR);
+ vcpu_ctx.mtrr = *mtrr_record;
+ vcpu_ctx.end_d = bsp_ctx.end_d;
+ vcpu_ctx.end = bsp_ctx.end;
- /*
- * Enable MTRR, set default type to WB.
- * TODO: add MMIO areas as UC when passthrough is supported.
- */
- mtrr.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | MTRR_DEF_TYPE_ENABLE;
+ /*
+ * Enable MTRR, set default type to WB.
+ * TODO: add MMIO areas as UC when passthrough is supported in PVH
+ */
+ if ( !dom->device_model )
+ vcpu_ctx.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | MTRR_DEF_TYPE_ENABLE;
+
+ for ( unsigned int i = 0; i < dom->max_vcpus; i++ )
+ {
+ vcpu_ctx.mtrr_d.instance = i;
- for ( i = 0; i < dom->max_vcpus; i++ )
+ rc = xc_domain_hvm_setcontext(dom->xch, dom->guest_domid,
+ (uint8_t *)&vcpu_ctx, sizeof(vcpu_ctx));
+ if ( rc != 0 )
{
- mtrr.mtrr_d.instance = i;
- rc = xc_domain_hvm_setcontext(dom->xch, dom->guest_domid,
- (uint8_t *)&mtrr, sizeof(mtrr));
- if ( rc != 0 )
- xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
- "%s: SETHVMCONTEXT failed (rc=%d)", __func__, rc);
+ xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+ "%s: SETHVMCONTEXT failed (rc=%d)", __func__, rc);
+ goto out;
}
}
--
2.47.0
* [PATCH v7 07/10] xen/lib: Add topology generator for x86
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (5 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 06/10] tools/libguest: Always set vCPU context in vcpu_hvm() Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 08/10] xen/x86: Derive topologically correct x2APIC IDs from the policy Alejandro Vallejo
` (2 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD
Add a helper to populate topology leaves in the cpu policy from
threads/core and cores/package counts. It's unit-tested in
test-cpu-policy.c, but it's not connected to the rest of the code yet.
Intel's cache leaves (CPUID[4]) have limited width for core counts, so
(in the absence of real world data for how it might behave) this
implementation takes the view that those counts should clip to their
maximum values on overflow, just like lppp and NC.
Adds the ASSERT() macro to xen/lib/x86/private.h, as it was missing.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* MAX/MIN -> max/min; adding U suffixes to literals for type-matching
and uppercases for MISRA compliance.
* Clip core counts in cache leaves to their maximum values
* Remove unified cache conditional. Less code, and less likely for the
threads_per_cache field to clip.
* Add extra check to ensure threads_per_pkg fit in 16 bits (which is
the space they have in leaf 0xb).
* Add extra check to detect overflow in threads_per_pkg calculation.
* Reworked the comment for the topo generator, expressing more clearly
what are inputs and what are outputs.
---
tools/tests/cpu-policy/test-cpu-policy.c | 133 +++++++++++++++++++++++
xen/include/xen/lib/x86/cpu-policy.h | 16 +++
xen/lib/x86/policy.c | 93 ++++++++++++++++
xen/lib/x86/private.h | 4 +
4 files changed, 246 insertions(+)
diff --git a/tools/tests/cpu-policy/test-cpu-policy.c b/tools/tests/cpu-policy/test-cpu-policy.c
index 301df2c00285..849d7cebaa7c 100644
--- a/tools/tests/cpu-policy/test-cpu-policy.c
+++ b/tools/tests/cpu-policy/test-cpu-policy.c
@@ -650,6 +650,137 @@ static void test_is_compatible_failure(void)
}
}
+static void test_topo_from_parts(void)
+{
+ static const struct test {
+ unsigned int threads_per_core;
+ unsigned int cores_per_pkg;
+ struct cpu_policy policy;
+ } tests[] = {
+ {
+ .threads_per_core = 3, .cores_per_pkg = 1,
+ .policy = {
+ .x86_vendor = X86_VENDOR_AMD,
+ .topo.subleaf = {
+ { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+ { .nr_logical = 1, .level = 1, .type = 2, .id_shift = 2, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 1, .cores_per_pkg = 3,
+ .policy = {
+ .x86_vendor = X86_VENDOR_AMD,
+ .topo.subleaf = {
+ { .nr_logical = 1, .level = 0, .type = 1, .id_shift = 0, },
+ { .nr_logical = 3, .level = 1, .type = 2, .id_shift = 2, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 7, .cores_per_pkg = 5,
+ .policy = {
+ .x86_vendor = X86_VENDOR_AMD,
+ .topo.subleaf = {
+ { .nr_logical = 7, .level = 0, .type = 1, .id_shift = 3, },
+ { .nr_logical = 5, .level = 1, .type = 2, .id_shift = 6, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 2, .cores_per_pkg = 128,
+ .policy = {
+ .x86_vendor = X86_VENDOR_AMD,
+ .topo.subleaf = {
+ { .nr_logical = 2, .level = 0, .type = 1, .id_shift = 1, },
+ { .nr_logical = 128, .level = 1, .type = 2,
+ .id_shift = 8, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 3, .cores_per_pkg = 1,
+ .policy = {
+ .x86_vendor = X86_VENDOR_INTEL,
+ .topo.subleaf = {
+ { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+ { .nr_logical = 3, .level = 1, .type = 2, .id_shift = 2, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 1, .cores_per_pkg = 3,
+ .policy = {
+ .x86_vendor = X86_VENDOR_INTEL,
+ .topo.subleaf = {
+ { .nr_logical = 1, .level = 0, .type = 1, .id_shift = 0, },
+ { .nr_logical = 3, .level = 1, .type = 2, .id_shift = 2, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 7, .cores_per_pkg = 5,
+ .policy = {
+ .x86_vendor = X86_VENDOR_INTEL,
+ .topo.subleaf = {
+ { .nr_logical = 7, .level = 0, .type = 1, .id_shift = 3, },
+ { .nr_logical = 35, .level = 1, .type = 2, .id_shift = 6, },
+ },
+ },
+ },
+ {
+ .threads_per_core = 2, .cores_per_pkg = 128,
+ .policy = {
+ .x86_vendor = X86_VENDOR_INTEL,
+ .topo.subleaf = {
+ { .nr_logical = 2, .level = 0, .type = 1, .id_shift = 1, },
+ { .nr_logical = 256, .level = 1, .type = 2,
+ .id_shift = 8, },
+ },
+ },
+ },
+ };
+
+ printf("Testing topology synthesis from parts:\n");
+
+ for ( size_t i = 0; i < ARRAY_SIZE(tests); ++i )
+ {
+ const struct test *t = &tests[i];
+ struct cpu_policy actual = { .x86_vendor = t->policy.x86_vendor };
+ int rc = x86_topo_from_parts(&actual, t->threads_per_core,
+ t->cores_per_pkg);
+
+ if ( rc || memcmp(&actual.topo, &t->policy.topo, sizeof(actual.topo)) )
+ {
+#define TOPO(n, f) t->policy.topo.subleaf[(n)].f, actual.topo.subleaf[(n)].f
+ fail("FAIL[%d] - '%s %u t/c, %u c/p'\n",
+ rc,
+ x86_cpuid_vendor_to_str(t->policy.x86_vendor),
+ t->threads_per_core, t->cores_per_pkg);
+ printf(" subleaf=%u expected_n=%u actual_n=%u\n"
+ " expected_lvl=%u actual_lvl=%u\n"
+ " expected_type=%u actual_type=%u\n"
+ " expected_shift=%u actual_shift=%u\n",
+ 0,
+ TOPO(0, nr_logical),
+ TOPO(0, level),
+ TOPO(0, type),
+ TOPO(0, id_shift));
+
+ printf(" subleaf=%u expected_n=%u actual_n=%u\n"
+ " expected_lvl=%u actual_lvl=%u\n"
+ " expected_type=%u actual_type=%u\n"
+ " expected_shift=%u actual_shift=%u\n",
+ 1,
+ TOPO(1, nr_logical),
+ TOPO(1, level),
+ TOPO(1, type),
+ TOPO(1, id_shift));
+#undef TOPO
+ }
+ }
+}
+
int main(int argc, char **argv)
{
printf("CPU Policy unit tests\n");
@@ -667,6 +798,8 @@ int main(int argc, char **argv)
test_is_compatible_success();
test_is_compatible_failure();
+ test_topo_from_parts();
+
if ( nr_failures )
printf("Done: %u failures\n", nr_failures);
else
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index f43e1a3b21e9..67d16fda933d 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -542,6 +542,22 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
const struct cpu_policy *guest,
struct cpu_policy_errors *err);
+/**
+ * Synthesise topology information in `p` given high-level constraints
+ *
+ * Topology is expressed in various fields across several leaves, some of
+ * which are vendor-specific. This function populates such fields given
+ * threads/core, cores/package and other existing policy fields.
+ *
+ * @param p CPU policy of the domain.
+ * @param threads_per_core threads/core. Doesn't need to be a power of 2.
+ * @param cores_per_pkg cores/package. Doesn't need to be a power of 2.
+ * @return 0 on success; -errno on failure
+ */
+int x86_topo_from_parts(struct cpu_policy *p,
+ unsigned int threads_per_core,
+ unsigned int cores_per_pkg);
+
#endif /* !XEN_LIB_X86_POLICIES_H */
/*
diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
index f033d22785be..5ff89022e901 100644
--- a/xen/lib/x86/policy.c
+++ b/xen/lib/x86/policy.c
@@ -2,6 +2,99 @@
#include <xen/lib/x86/cpu-policy.h>
+static unsigned int order(unsigned int n)
+{
+ ASSERT(n); /* clz(0) is UB */
+
+ return 8 * sizeof(n) - __builtin_clz(n);
+}
+
+int x86_topo_from_parts(struct cpu_policy *p,
+ unsigned int threads_per_core,
+ unsigned int cores_per_pkg)
+{
+ unsigned int threads_per_pkg = threads_per_core * cores_per_pkg;
+ unsigned int apic_id_size;
+
+ /*
+ * threads_per_pkg must fit in 16bits to avoid overflowing
+ * nr_logical in leaf 0xb on Intel systems.
+ */
+ if ( !p || !threads_per_core || !cores_per_pkg ||
+ threads_per_pkg > UINT16_MAX ||
+ threads_per_pkg / cores_per_pkg != threads_per_core )
+ return -EINVAL;
+
+ p->basic.max_leaf = max(0xBU, p->basic.max_leaf);
+
+ memset(p->topo.raw, 0, sizeof(p->topo.raw));
+
+ /* thread level */
+ p->topo.subleaf[0].nr_logical = threads_per_core;
+ p->topo.subleaf[0].id_shift = 0;
+ p->topo.subleaf[0].level = 0;
+ p->topo.subleaf[0].type = 1;
+ if ( threads_per_core > 1 )
+ p->topo.subleaf[0].id_shift = order(threads_per_core - 1);
+
+ /* core level */
+ p->topo.subleaf[1].nr_logical = cores_per_pkg;
+ if ( p->x86_vendor == X86_VENDOR_INTEL )
+ p->topo.subleaf[1].nr_logical = threads_per_pkg;
+ p->topo.subleaf[1].id_shift = p->topo.subleaf[0].id_shift;
+ p->topo.subleaf[1].level = 1;
+ p->topo.subleaf[1].type = 2;
+ if ( cores_per_pkg > 1 )
+ p->topo.subleaf[1].id_shift += order(cores_per_pkg - 1);
+
+ apic_id_size = p->topo.subleaf[1].id_shift;
+
+ /*
+ * Contrary to what the name might seem to imply, HTT is an enabler for
+ * SMP and there's no harm in setting it even with a single vCPU.
+ */
+ p->basic.htt = true;
+ p->basic.lppp = min(0xFFU, threads_per_pkg);
+
+ switch ( p->x86_vendor )
+ {
+ case X86_VENDOR_INTEL: {
+ struct cpuid_cache_leaf *sl = p->cache.subleaf;
+
+ for ( size_t i = 0; sl->type &&
+ i < ARRAY_SIZE(p->cache.raw); i++, sl++ )
+ {
+ /* Clip these values to their max if they overflow */
+ sl->cores_per_package = min(63U, cores_per_pkg - 1);
+ sl->threads_per_cache = min(4095U, threads_per_core - 1);
+ }
+ break;
+ }
+
+ case X86_VENDOR_AMD:
+ case X86_VENDOR_HYGON:
+ /* Expose p->basic.lppp */
+ p->extd.cmp_legacy = true;
+
+ /* Clip NC to the maximum value it can hold */
+ p->extd.nc = min(0xFFU, threads_per_pkg - 1);
+
+ /* TODO: Expose leaf e1E */
+ p->extd.topoext = false;
+
+ /*
+ * Clip APIC ID to 8 bits, as that's what high core-count machines do.
+ *
+ * That's what AMD EPYC 9654 does with >256 CPUs.
+ */
+ p->extd.apic_id_size = min(8U, apic_id_size);
+
+ break;
+ }
+
+ return 0;
+}
+
int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
const struct cpu_policy *guest,
struct cpu_policy_errors *err)
diff --git a/xen/lib/x86/private.h b/xen/lib/x86/private.h
index 60bb82a400b7..2ec9dbee33c2 100644
--- a/xen/lib/x86/private.h
+++ b/xen/lib/x86/private.h
@@ -4,6 +4,7 @@
#ifdef __XEN__
#include <xen/bitops.h>
+#include <xen/bug.h>
#include <xen/guest_access.h>
#include <xen/kernel.h>
#include <xen/lib.h>
@@ -17,6 +18,7 @@
#else
+#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
@@ -28,6 +30,8 @@
#include <xen-tools/common-macros.h>
+#define ASSERT(x) assert(x)
+
static inline bool test_bit(unsigned int bit, const void *vaddr)
{
const char *addr = vaddr;
--
2.47.0
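
As a rough, standalone illustration of the arithmetic in x86_topo_from_parts() above, the sketch below reproduces the id_shift calculation and the clip-on-overflow behaviour outside of Xen. The helper names `id_bits()`, `core_level_shift()` and `clip()` are hypothetical and do not appear in the patch; `__builtin_clz()` assumes a GCC- or Clang-compatible compiler, as the patch itself does.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits needed to represent n. n must be non-zero. */
static unsigned int order(unsigned int n)
{
    assert(n); /* __builtin_clz(0) is undefined */

    return CHAR_BIT * sizeof(n) - __builtin_clz(n);
}

/*
 * Hypothetical helper: APIC ID bits needed to index `count` siblings.
 * A single sibling needs no bits; otherwise it is order(count - 1), so
 * counts that are not powers of two round up to the next bit.
 */
static unsigned int id_bits(unsigned int count)
{
    assert(count);

    return count > 1 ? order(count - 1) : 0;
}

/*
 * Hypothetical helper: id_shift of the core-level subleaf, i.e. the
 * number of APIC ID bits below the package ID.
 */
static unsigned int core_level_shift(unsigned int threads_per_core,
                                     unsigned int cores_per_pkg)
{
    return id_bits(threads_per_core) + id_bits(cores_per_pkg);
}

/* Clip a count to the largest value its CPUID field can hold. */
static unsigned int clip(unsigned int val, unsigned int field_max)
{
    return val < field_max ? val : field_max;
}
```

For example, 7 threads/core and 5 cores/package give a thread-level shift of 3 and a core-level shift of 6, matching the expectations in test_topo_from_parts(); 256 threads/package clips to 0xFF when stored in an 8-bit field such as lppp.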
* [PATCH v7 08/10] xen/x86: Derive topologically correct x2APIC IDs from the policy
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (6 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 07/10] xen/lib: Add topology generator for x86 Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-21 15:45 ` [PATCH v7 09/10] tools/libguest: Set distinct x2APIC IDs for each vCPU Alejandro Vallejo
2024-10-21 15:46 ` [PATCH v7 10/10] tools/x86: Synthesise domain topologies Alejandro Vallejo
9 siblings, 0 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD
Implements the helper for mapping vcpu_id to x2apic_id given a valid
topology in a policy. The algo is written with the intention of
extending it to leaves 0x1f and extended 0x26 in the future.
The helper returns the legacy mapping when leaf 0xb is not implemented
(as is the case at the moment).
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Changes to commit message
---
tools/tests/cpu-policy/test-cpu-policy.c | 68 +++++++++++++++++++++
xen/include/xen/lib/x86/cpu-policy.h | 11 ++++
xen/lib/x86/policy.c | 76 ++++++++++++++++++++++++
3 files changed, 155 insertions(+)
diff --git a/tools/tests/cpu-policy/test-cpu-policy.c b/tools/tests/cpu-policy/test-cpu-policy.c
index 849d7cebaa7c..e5f9b8f7ee39 100644
--- a/tools/tests/cpu-policy/test-cpu-policy.c
+++ b/tools/tests/cpu-policy/test-cpu-policy.c
@@ -781,6 +781,73 @@ static void test_topo_from_parts(void)
}
}
+static void test_x2apic_id_from_vcpu_id_success(void)
+{
+ static const struct test {
+ unsigned int vcpu_id;
+ unsigned int threads_per_core;
+ unsigned int cores_per_pkg;
+ uint32_t x2apic_id;
+ uint8_t x86_vendor;
+ } tests[] = {
+ {
+ .vcpu_id = 3, .threads_per_core = 3, .cores_per_pkg = 8,
+ .x2apic_id = 1 << 2,
+ },
+ {
+ .vcpu_id = 6, .threads_per_core = 3, .cores_per_pkg = 8,
+ .x2apic_id = 2 << 2,
+ },
+ {
+ .vcpu_id = 24, .threads_per_core = 3, .cores_per_pkg = 8,
+ .x2apic_id = 1 << 5,
+ },
+ {
+ .vcpu_id = 35, .threads_per_core = 3, .cores_per_pkg = 8,
+ .x2apic_id = (35 % 3) | (((35 / 3) % 8) << 2) | ((35 / 24) << 5),
+ },
+ {
+ .vcpu_id = 96, .threads_per_core = 7, .cores_per_pkg = 3,
+ .x2apic_id = (96 % 7) | (((96 / 7) % 3) << 3) | ((96 / 21) << 5),
+ },
+ };
+
+ const uint8_t vendors[] = {
+ X86_VENDOR_INTEL,
+ X86_VENDOR_AMD,
+ X86_VENDOR_CENTAUR,
+ X86_VENDOR_SHANGHAI,
+ X86_VENDOR_HYGON,
+ };
+
+ printf("Testing x2apic id from vcpu id success:\n");
+
+ /* Perform the test run on every vendor we know about */
+ for ( size_t i = 0; i < ARRAY_SIZE(vendors); ++i )
+ {
+ for ( size_t j = 0; j < ARRAY_SIZE(tests); ++j )
+ {
+ struct cpu_policy policy = { .x86_vendor = vendors[i] };
+ const struct test *t = &tests[j];
+ uint32_t x2apic_id;
+ int rc = x86_topo_from_parts(&policy, t->threads_per_core,
+ t->cores_per_pkg);
+
+ if ( rc ) {
+ fail("FAIL[%d] - 'x86_topo_from_parts() failed", rc);
+ continue;
+ }
+
+ x2apic_id = x86_x2apic_id_from_vcpu_id(&policy, t->vcpu_id);
+ if ( x2apic_id != t->x2apic_id )
+ fail("FAIL - '%s cpu%u %u t/c %u c/p'. bad x2apic_id: expected=%u actual=%u\n",
+ x86_cpuid_vendor_to_str(policy.x86_vendor),
+ t->vcpu_id, t->threads_per_core, t->cores_per_pkg,
+ t->x2apic_id, x2apic_id);
+ }
+ }
+}
+
int main(int argc, char **argv)
{
printf("CPU Policy unit tests\n");
@@ -799,6 +866,7 @@ int main(int argc, char **argv)
test_is_compatible_failure();
test_topo_from_parts();
+ test_x2apic_id_from_vcpu_id_success();
if ( nr_failures )
printf("Done: %u failures\n", nr_failures);
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index 67d16fda933d..61d5cf3c7f12 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -542,6 +542,17 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
const struct cpu_policy *guest,
struct cpu_policy_errors *err);
+/**
+ * Calculates the x2APIC ID of a vCPU given a CPU policy
+ *
+ * If the policy lacks leaf 0xb, falls back to the legacy mapping of apic_id=cpu*2
+ *
+ * @param p CPU policy of the domain.
+ * @param id vCPU ID of the vCPU.
+ * @returns x2APIC ID of the vCPU.
+ */
+uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t id);
+
/**
* Synthesise topology information in `p` given high-level constraints
*
diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
index 5ff89022e901..427a90f907a2 100644
--- a/xen/lib/x86/policy.c
+++ b/xen/lib/x86/policy.c
@@ -2,6 +2,82 @@
#include <xen/lib/x86/cpu-policy.h>
+static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p,
+ size_t lvl)
+{
+ /*
+ * `nr_logical` reported by Intel is the number of THREADS contained in
+ * the next topological scope. For example, assuming a system with 2
+ * threads/core and 3 cores/module in a fully symmetric topology,
+ * `nr_logical` at the core level will report 6. Because it's reporting
+ * the number of threads in a module.
+ *
+ * On AMD/Hygon, nr_logical is already normalized by the higher scoped
+ * level (cores/complex, etc) so we can return it as-is.
+ */
+ if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
+ return p->topo.subleaf[lvl].nr_logical;
+
+ return p->topo.subleaf[lvl].nr_logical /
+ p->topo.subleaf[lvl - 1].nr_logical;
+}
+
+uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t id)
+{
+ uint32_t shift = 0, x2apic_id = 0;
+
+ /* In the absence of topology leaves, fall back to traditional mapping */
+ if ( !p->topo.subleaf[0].type )
+ return id * 2;
+
+ /*
+ * `id` means different things at different points of the algo
+ *
+ * At lvl=0: global thread_id (same as vcpu_id)
+ * At lvl=1: global core_id
+ * At lvl=2: global socket_id (actually complex_id in AMD, module_id
+ * in Intel, but the name is inconsequential)
+ *
+ * +--+
+ * ____ |#0| ______ <= 1 socket
+ * / +--+ \+--+
+ * __#0__ __|#1|__ <= 2 cores/socket
+ * / | \ +--+/ +-|+ \
+ * #0 #1 #2 |#3| #4 #5 <= 3 threads/core
+ * +--+
+ *
+ * ... and so on. Global in this context means that it's a unique
+ * identifier for the whole topology, and not relative to the level
+ * it's in. For example, in the diagram shown above, we're looking at
+ * thread #3 in the global sense, though it's #0 within its core.
+ *
+ * Note that dividing a global thread_id by the number of threads per
+ * core returns the global core id that contains it. e.g: 0, 1 or 2
+ * divided by 3 returns core_id=0. 3, 4 or 5 divided by 3 returns core
+ * 1, and so on. An analogous argument holds for higher levels. This is
+ * the property we exploit to derive x2apic_id from vcpu_id.
+ *
+ * NOTE: `topo` is currently derived from leaf 0xb, which is bound to two
+ * levels, but once we track leaves 0x1f (or extended 0x26) there will be a
+ * few more. The algorithm is written to cope with that case.
+ */
+ for ( uint32_t i = 0; i < ARRAY_SIZE(p->topo.raw); i++ )
+ {
+ uint32_t nr_parts;
+
+ if ( !p->topo.subleaf[i].type )
+ /* sentinel subleaf */
+ break;
+
+ nr_parts = parts_per_higher_scoped_level(p, i);
+ x2apic_id |= (id % nr_parts) << shift;
+ id /= nr_parts;
+ shift = p->topo.subleaf[i].id_shift;
+ }
+
+ return (id << shift) | x2apic_id;
+}
+
static unsigned int order(unsigned int n)
{
ASSERT(n); /* clz(0) is UB */
--
2.47.0
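
The divide-and-shift idea behind x86_x2apic_id_from_vcpu_id() can be tried in isolation with the simplified sketch below. It is not the patch's code: `struct topo_level` and `x2apic_id_sketch()` are hypothetical stand-ins, and `nr_parts` is assumed to be already normalised per parent (threads/core, cores/package), i.e. the AMD-style count rather than Intel's cumulative nr_logical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one topology level. */
struct topo_level {
    uint32_t nr_parts; /* parts per parent: threads/core, cores/package... */
    uint32_t id_shift; /* APIC ID bits used up to and including this level */
};

/* Map a flat vCPU ID onto the APIC ID bit-fields described by `lvls`. */
static uint32_t x2apic_id_sketch(const struct topo_level *lvls,
                                 unsigned int nr_lvls, uint32_t vcpu_id)
{
    uint32_t id = vcpu_id, shift = 0, x2apic_id = 0;

    for ( unsigned int i = 0; i < nr_lvls; i++ )
    {
        assert(lvls[i].nr_parts); /* avoid dividing by zero */

        /* The index within the parent goes into this level's bit-field... */
        x2apic_id |= (id % lvls[i].nr_parts) << shift;
        /* ...and the quotient is the global ID at the next level up. */
        id /= lvls[i].nr_parts;
        shift = lvls[i].id_shift;
    }

    /* Whatever is left over is the package ID. */
    return (id << shift) | x2apic_id;
}
```

With 3 threads/core (shift 2) and 8 cores/package (shift 5), vCPU 35 is thread 2 of core 3 in package 1, giving 2 | (3 << 2) | (1 << 5) = 46, in line with the corresponding entry in test_x2apic_id_from_vcpu_id_success().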
* [PATCH v7 09/10] tools/libguest: Set distinct x2APIC IDs for each vCPU
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (7 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 08/10] xen/x86: Derive topologically correct x2APIC IDs from the policy Alejandro Vallejo
@ 2024-10-21 15:45 ` Alejandro Vallejo
2024-10-21 15:46 ` [PATCH v7 10/10] tools/x86: Synthesise domain topologies Alejandro Vallejo
9 siblings, 0 replies; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:45 UTC (permalink / raw)
To: xen-devel; +Cc: Alejandro Vallejo, Anthony PERARD, Juergen Gross
Have toolstack populate the new x2APIC ID in the LAPIC save record with
the proper IDs intended for each vCPU.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* Unchanged
---
tools/libs/guest/xg_dom_x86.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index c98229317db7..38486140ed15 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -1004,11 +1004,14 @@ static int vcpu_hvm(struct xc_dom_image *dom)
HVM_SAVE_TYPE(HEADER) header;
struct hvm_save_descriptor mtrr_d;
HVM_SAVE_TYPE(MTRR) mtrr;
+ struct hvm_save_descriptor lapic_d;
+ HVM_SAVE_TYPE(LAPIC) lapic;
struct hvm_save_descriptor end_d;
HVM_SAVE_TYPE(END) end;
} vcpu_ctx;
- /* Context from full_ctx */
+ /* Contexts from full_ctx */
const HVM_SAVE_TYPE(MTRR) *mtrr_record;
+ const HVM_SAVE_TYPE(LAPIC) *lapic_record;
/* Raw context as taken from Xen */
uint8_t *full_ctx = NULL;
int rc;
@@ -1111,6 +1114,8 @@ static int vcpu_hvm(struct xc_dom_image *dom)
vcpu_ctx.mtrr_d.typecode = HVM_SAVE_CODE(MTRR);
vcpu_ctx.mtrr_d.length = HVM_SAVE_LENGTH(MTRR);
vcpu_ctx.mtrr = *mtrr_record;
+ vcpu_ctx.lapic_d.typecode = HVM_SAVE_CODE(LAPIC);
+ vcpu_ctx.lapic_d.length = HVM_SAVE_LENGTH(LAPIC);
vcpu_ctx.end_d = bsp_ctx.end_d;
vcpu_ctx.end = bsp_ctx.end;
@@ -1125,6 +1130,18 @@ static int vcpu_hvm(struct xc_dom_image *dom)
{
vcpu_ctx.mtrr_d.instance = i;
+ lapic_record = hvm_get_save_record(full_ctx, HVM_SAVE_CODE(LAPIC), i);
+ if ( !lapic_record )
+ {
+ xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+ "%s: unable to get LAPIC[%d] save record", __func__, i);
+ goto out;
+ }
+
+ vcpu_ctx.lapic = *lapic_record;
+ vcpu_ctx.lapic.x2apic_id = dom->cpu_to_apicid[i];
+ vcpu_ctx.lapic_d.instance = i;
+
rc = xc_domain_hvm_setcontext(dom->xch, dom->guest_domid,
(uint8_t *)&vcpu_ctx, sizeof(vcpu_ctx));
if ( rc != 0 )
--
2.47.0
* [PATCH v7 10/10] tools/x86: Synthesise domain topologies
2024-10-21 15:45 [PATCH v7 00/10] x86: Expose consistent topology to guests Alejandro Vallejo
` (8 preceding siblings ...)
2024-10-21 15:45 ` [PATCH v7 09/10] tools/libguest: Set distinct x2APIC IDs for each vCPU Alejandro Vallejo
@ 2024-10-21 15:46 ` Alejandro Vallejo
2024-12-02 9:18 ` Jan Beulich
9 siblings, 1 reply; 27+ messages in thread
From: Alejandro Vallejo @ 2024-10-21 15:46 UTC (permalink / raw)
To: xen-devel
Cc: Alejandro Vallejo, Anthony PERARD, Juergen Gross, Jan Beulich,
Andrew Cooper, Roger Pau Monné
Expose sensible topologies in leaf 0xb. At the moment it synthesises
non-HT systems, in line with the previous code intent.
Leaf 0xb in the host policy is no longer zapped and the guest {max,def}
policies have their topology leaves zapped instead. The intent is for
toolstack to populate them. There's no current use for the topology
information in the host policy, but it does no harm.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v7:
* No changes
---
tools/include/xenguest.h | 3 +++
tools/libs/guest/xg_cpuid_x86.c | 29 ++++++++++++++++++++++++++++-
tools/libs/light/libxl_dom.c | 22 +++++++++++++++++++++-
xen/arch/x86/cpu-policy.c | 9 ++++++---
4 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index aa50b78dfb89..dcabf219b9cb 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -831,6 +831,9 @@ int xc_set_domain_cpu_policy(xc_interface *xch, uint32_t domid,
uint32_t xc_get_cpu_featureset_size(void);
+/* Returns the APIC ID of the `cpu`-th CPU according to `policy` */
+uint32_t xc_cpu_to_apicid(const xc_cpu_policy_t *policy, unsigned int cpu);
+
enum xc_static_cpu_featuremask {
XC_FEATUREMASK_KNOWN,
XC_FEATUREMASK_SPECIAL,
diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c
index 4453178100ad..c591f8732a1a 100644
--- a/tools/libs/guest/xg_cpuid_x86.c
+++ b/tools/libs/guest/xg_cpuid_x86.c
@@ -725,8 +725,16 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid, bool restore,
p->policy.basic.htt = test_bit(X86_FEATURE_HTT, host_featureset);
p->policy.extd.cmp_legacy = test_bit(X86_FEATURE_CMP_LEGACY, host_featureset);
}
- else
+ else if ( restore )
{
+ /*
+ * Reconstruct the topology exposed on Xen <= 4.13. It makes very little
+ * sense, but it's what those guests saw so it's set in stone now.
+ *
+ * Guests from Xen 4.14 onwards carry their own CPUID leaves in the
+ * migration stream so they don't need special treatment.
+ */
+
/*
* Topology for HVM guests is entirely controlled by Xen. For now, we
* hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
@@ -782,6 +790,20 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid, bool restore,
break;
}
}
+ else
+ {
+ /* TODO: Expose the ability to choose a custom topology for HVM/PVH */
+ unsigned int threads_per_core = 1;
+ unsigned int cores_per_pkg = di.max_vcpu_id + 1;
+
+ rc = x86_topo_from_parts(&p->policy, threads_per_core, cores_per_pkg);
+ if ( rc )
+ {
+ ERROR("Failed to generate topology: rc=%d t/c=%u c/p=%u",
+ rc, threads_per_core, cores_per_pkg);
+ goto out;
+ }
+ }
nr_leaves = ARRAY_SIZE(p->leaves);
rc = x86_cpuid_copy_to_buffer(&p->policy, p->leaves, &nr_leaves);
@@ -1028,3 +1050,8 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, xc_cpu_policy_t *host,
return false;
}
+
+uint32_t xc_cpu_to_apicid(const xc_cpu_policy_t *policy, unsigned int cpu)
+{
+ return x86_x2apic_id_from_vcpu_id(&policy->policy, cpu);
+}
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 5f4f6830e850..1d7c34820d8f 100644
--- a/tools/libs/light/libxl_dom.c
+++ b/tools/libs/light/libxl_dom.c
@@ -1063,6 +1063,9 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
libxl_domain_build_info *const info = &d_config->b_info;
struct xc_dom_image *dom = NULL;
bool device_model = info->type == LIBXL_DOMAIN_TYPE_HVM ? true : false;
+#if defined(__i386__) || defined(__x86_64__)
+ struct xc_cpu_policy *policy = NULL;
+#endif
xc_dom_loginit(ctx->xch);
@@ -1083,8 +1086,22 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
dom->container_type = XC_DOM_HVM_CONTAINER;
#if defined(__i386__) || defined(__x86_64__)
+ policy = xc_cpu_policy_init();
+ if (!policy) {
+ LOGE(ERROR, "xc_cpu_policy_init failed d%u", domid);
+ rc = ERROR_NOMEM;
+ goto out;
+ }
+
+ rc = xc_cpu_policy_get_domain(ctx->xch, domid, policy);
+ if (rc != 0) {
+ LOGE(ERROR, "xc_cpu_policy_get_domain failed d%u", domid);
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
for (unsigned int i = 0; i < info->max_vcpus; i++)
- dom->cpu_to_apicid[i] = 2 * i; /* TODO: Replace by topo calculation */
+ dom->cpu_to_apicid[i] = xc_cpu_to_apicid(policy, i);
#endif
/* The params from the configuration file are in Mb, which are then
@@ -1214,6 +1231,9 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
out:
assert(rc != 0);
if (dom != NULL) xc_dom_release(dom);
+#if defined(__i386__) || defined(__x86_64__)
+ xc_cpu_policy_destroy(policy);
+#endif
return rc;
}
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 715a66d2a978..a7e0d44cce78 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -266,9 +266,6 @@ static void recalculate_misc(struct cpu_policy *p)
p->basic.raw[0x8] = EMPTY_LEAF;
- /* TODO: Rework topology logic. */
- memset(p->topo.raw, 0, sizeof(p->topo.raw));
-
p->basic.raw[0xc] = EMPTY_LEAF;
p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
@@ -619,6 +616,9 @@ static void __init calculate_pv_max_policy(void)
recalculate_xstate(p);
p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
+
+ /* Wipe host topology. Populated by toolstack */
+ memset(p->topo.raw, 0, sizeof(p->topo.raw));
}
static void __init calculate_pv_def_policy(void)
@@ -785,6 +785,9 @@ static void __init calculate_hvm_max_policy(void)
/* It's always possible to emulate CPUID faulting for HVM guests */
p->platform_info.cpuid_faulting = true;
+
+ /* Wipe host topology. Populated by toolstack */
+ memset(p->topo.raw, 0, sizeof(p->topo.raw));
}
static void __init calculate_hvm_def_policy(void)
--
2.47.0

* Re: [PATCH v7 10/10] tools/x86: Synthesise domain topologies
2024-10-21 15:46 ` [PATCH v7 10/10] tools/x86: Synthesise domain topologies Alejandro Vallejo
@ 2024-12-02 9:18 ` Jan Beulich
0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2024-12-02 9:18 UTC (permalink / raw)
To: Alejandro Vallejo
Cc: Anthony PERARD, Juergen Gross, Andrew Cooper,
Roger Pau Monné, xen-devel
On 21.10.2024 17:46, Alejandro Vallejo wrote:
> Expose sensible topologies in leaf 0xb. At the moment it synthesises
> non-HT systems, in line with the previous code intent.
>
> Leaf 0xb in the host policy is no longer zapped and the guest {max,def}
> policies have their topology leaves zapped instead. The intent is for
> toolstack to populate them. There's no current use for the topology
> information in the host policy, but it does no harm.
How does this (and hence ...
> @@ -619,6 +616,9 @@ static void __init calculate_pv_max_policy(void)
> recalculate_xstate(p);
>
> p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
> +
> + /* Wipe host topology. Populated by toolstack */
> + memset(p->topo.raw, 0, sizeof(p->topo.raw));
> }
>
> static void __init calculate_pv_def_policy(void)
> @@ -785,6 +785,9 @@ static void __init calculate_hvm_max_policy(void)
>
> /* It's always possible to emulate CPUID faulting for HVM guests */
> p->platform_info.cpuid_faulting = true;
> +
> + /* Wipe host topology. Populated by toolstack */
> + memset(p->topo.raw, 0, sizeof(p->topo.raw));
> }
... these, at least comment-wise) fit with Dom0 also needing some data
there?
Also nit: Multi-sentence comments want full stops after every sentence.
Jan
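
For readers following the thread, the difference between the old hardcoded
"APIC_ID = vcpu_id * 2" scheme and a synthesised non-HT topology can be
sketched as below. This is only an illustration under assumptions: the helper
names (order_of, sketch_apicid, legacy_apicid) are hypothetical and are not
the x86_topo_from_parts() / x86_x2apic_id_from_vcpu_id() implementations from
this series. It assumes each topology level occupies just enough bits of the
APIC ID to hold its count, as is conventional for leaf 0xb.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bits needed to encode `n` distinct IDs (0 for n <= 1). */
static unsigned int order_of(unsigned int n)
{
    unsigned int bits = 0;

    while ( bits < 31 && (1u << bits) < n )
        bits++;

    return bits;
}

/*
 * Hypothetical sketch of composing an x2APIC ID from a vCPU ID.
 * Layout assumed here: [ package | core | thread ], low bits last.
 */
static uint32_t sketch_apicid(unsigned int vcpu_id,
                              unsigned int threads_per_core,
                              unsigned int cores_per_pkg)
{
    unsigned int thread_bits, core_bits, thread, core, pkg;

    /* Both counts must be non-zero, otherwise the divisions are undefined. */
    assert(threads_per_core && cores_per_pkg);

    thread_bits = order_of(threads_per_core);
    core_bits   = order_of(cores_per_pkg);

    thread = vcpu_id % threads_per_core;
    core   = (vcpu_id / threads_per_core) % cores_per_pkg;
    pkg    = vcpu_id / (threads_per_core * cores_per_pkg);

    return (pkg << (thread_bits + core_bits)) | (core << thread_bits) | thread;
}

/* The scheme the patch removes from libxl__build_hvm(): APIC_ID = vcpu_id * 2. */
static uint32_t legacy_apicid(unsigned int vcpu_id)
{
    return vcpu_id * 2;
}
```

With threads_per_core = 1 and cores_per_pkg = max_vcpu_id + 1, as the new
else-branch in xc_cpuid_apply_policy() passes, the thread field has zero width
in this sketch, so the APIC ID equals the vCPU ID rather than twice the vCPU
ID.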