From: "Alejandro Vallejo" <alejandro.vallejo@cloud.com>
To: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	<xen-devel@lists.xenproject.org>
Cc: "Jan Beulich" <jbeulich@suse.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [PATCH v7 02/10] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
Date: Wed, 30 Oct 2024 12:00:09 +0000
Message-ID: <D594H5BKU18G.20YVS360FNF71@cloud.com>
In-Reply-To: <974538f8-10b5-4fa3-9069-df6655a5d86d@citrix.com>

I'm fine with all suggestions, with one exception that needs a bit more
explanation...

On Tue Oct 29, 2024 at 8:30 PM GMT, Andrew Cooper wrote:
> On 21/10/2024 4:45 pm, Alejandro Vallejo wrote:
> > This allows the initial x2APIC ID to be sent on the migration stream.
> > This allows further changes to topology and APIC ID assignment without
> > breaking existing hosts. Given the vlapic data is zero-extended on
> > restore, fix up migrations from hosts without the field by setting it to
> > the old convention if zero.
> >
> > The hardcoded mapping x2apic_id=2*vcpu_id is kept for the time being,
> > but it's meant to be overridden by toolstack on a later patch with
> > appropriate values.
> >
> > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>
> I'm going to request some changes, but I think they're only comment
> changes. [edit, no sadly, one non-comment change.]
>
> It's unfortunate that Xen uses an instance of hvm_hw_lapic for its
> internal state, but one swamp at a time.
>
>
> In the subject, there's no such thing as the "initial" x2APIC ID. 
> There's just "the x2APIC ID" and it's not mutable state as far as the
> guest is concerned  (This is different to the xAPIC id, where there is
> an architectural concept of the initial xAPIC ID, from the days when
> OSes were permitted to edit it).  Also, it's x86/hvm, seeing as this is
> an HVM specific change you're making.
>
> Next, while it's true that this allows the value to move in the
> migration stream, the more important point is that this allows the
> toolstack to configure the x2APIC ID for each vCPU.
>
> So, for the commit message, I recommend:
>
> ---%<---
> Today, Xen hard-codes x2APIC_ID = vcpu_id * 2, but this is unwise and
> interferes with providing accurate topology information to the guest.
>
> Introduce a new x2apic_id field into hvm_hw_lapic.  This is immutable
> state from the guest's point of view, but it allows the toolstack to
> configure the value, and for the value to move on migrate.
>
> For backwards compatibility, we treat incoming zeroes as if they were
> the old hardcoded scheme.
> ---%<---
>
> > diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> > index 2a777436ee27..e2489ff8e346 100644
> > --- a/xen/arch/x86/cpuid.c
> > +++ b/xen/arch/x86/cpuid.c
> > @@ -138,10 +138,9 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> >          const struct cpu_user_regs *regs;
> >  
> >      case 0x1:
> > -        /* TODO: Rework topology logic. */
> >          res->b &= 0x00ffffffu;
> >          if ( is_hvm_domain(d) )
> > -            res->b |= (v->vcpu_id * 2) << 24;
> > +            res->b |= vlapic_x2apic_id(vcpu_vlapic(v)) << 24;
>
> There wants to be some kind of note here, especially as you're feeding
> vlapic_x2apic_id() into a field called xAPIC ID.  Perhaps
>
> /* Large systems do wrap around 255 in the xAPIC_ID field. */
>
> ?
>
>
> >  
> >          /* TODO: Rework vPMU control in terms of toolstack choices. */
> >          if ( vpmu_available(v) &&
> > @@ -310,19 +309,16 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
> >          break;
> >  
> >      case 0xb:
> > -        /*
> > -         * In principle, this leaf is Intel-only.  In practice, it is tightly
> > -         * coupled with x2apic, and we offer an x2apic-capable APIC emulation
> > -         * to guests on AMD hardware as well.
> > -         *
> > -         * TODO: Rework topology logic.
> > -         */
> >          if ( p->basic.x2apic )
> >          {
> >              *(uint8_t *)&res->c = subleaf;
> >  
> > -            /* Fix the x2APIC identifier. */
> > -            res->d = v->vcpu_id * 2;
> > +            /*
> > +             * Fix the x2APIC identifier. The PV side is nonsensical, but
> > +             * we've always shown it like this so it's kept for compat.
> > +             */
>
> In hindsight I should have changed "Fix the x2APIC identifier." when I
> reworked this logic, but oh well - better late than never.
>
> /* The x2APIC_ID is per-vCPU, and fixed irrespective of the requested
> subleaf. */
>
> I'd also put a little more context in the PV side:
>
> /* Xen 4.18 and earlier leaked x2APIC into PV guests.  The value shown
> is nonsensical but kept as-was for compatibility. */
>
> > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> > index 3363926b487b..33b463925f4e 100644
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -1538,6 +1538,16 @@ static void lapic_load_fixup(struct vlapic *vlapic)
> >      const struct vcpu *v = vlapic_vcpu(vlapic);
> >      uint32_t good_ldr = x2apic_ldr_from_id(vlapic->loaded.id);
> >  
> > +    /*
> > +     * Loading record without hw.x2apic_id in the save stream, calculate using
> > +     * the traditional "vcpu_id * 2" relation. There's an implicit assumption
> > +     * that vCPU0 always has x2APIC0, which is true for the old relation, and
> > +     * still holds under the new x2APIC generation algorithm. While that case
> > +     * goes through the conditional it's benign because it still maps to zero.
> > +     */
>
> It's not an implicit assumption; it's very explicit.

It's implicit because it's not mentioned anywhere else, and parts of the Xen
ecosystem operate as if vCPU0 could indeed end up with a non-zero APIC ID.

>
> /* Xen 4.19 and earlier had no x2APIC_ID in the migration stream, and
> hard-coded "vcpu_id * 2".  Default back to this if we have a
> zero-extended record.  */
>
> But, this will malfunction if the toolstack tries to set v!0's
> x2APIC_ID to 0.

I assume you mean vcpuN with N != 0. I maintain that allowing non-monotonically
increasing APIC IDs on vCPUs is technical debt disguised as a misfeature. For
one, it would prevent hvmloader from asserting some sanity on its own reads of
APIC IDs, and more generally it would be a mess to debug. I started making real
progress on the toolstack after asserting all APs had non-zero APIC IDs.
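
To make the kind of sanity check I have in mind concrete, here is a minimal
sketch. It is not code from this series: the helper name is hypothetical, and
it assumes "monotonically increasing" means strictly increasing, since APIC
IDs also have to be unique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Xen or hvmloader.
 *
 * ids[i] is the APIC ID read back for vCPU i. Returns true when the IDs
 * strictly increase with the vCPU index. Because the IDs are unsigned, a
 * strictly increasing sequence can only contain zero at index 0 (the BSP).
 */
static bool demo_apic_ids_increasing(const uint32_t *ids, size_t nr_vcpus)
{
    size_t i;

    assert(ids != NULL || nr_vcpus == 0);

    for ( i = 1; i < nr_vcpus; i++ )
        if ( ids[i] <= ids[i - 1] )
            return false;

    return true;
}
```

The old "vcpu_id * 2" layout (0, 2, 4, ...) passes this check, while any
layout that hands APIC ID 0 to an AP fails it.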

So, while...

>
> What you need to know is whether lapic_load_hidden() had to zero-extend
> the record or not (more specifically, over this field), so you want
> h->size <= offsetof(x2apic_id) as the gating condition.

... this is true and a more adequate gating condition (that I'm happy to
replace the current one with), I'd still like to keep the invariant that APIC
IDs must be monotonically increasing with the vCPU id, which has the side
effect of banning zero outside the BSP.
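
For reference, a rough sketch of what the size-based gating could look like.
This is an illustration under assumptions rather than the actual patch: the
struct is a trimmed-down stand-in for hvm_hw_lapic (the leading apic_base_msr
field is assumed, the rest follows the hunk quoted further down), and the
function name and the rec_size parameter are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct hvm_hw_lapic; the layout here is an assumption. */
struct demo_hw_lapic {
    uint64_t apic_base_msr;
    uint32_t disabled;
    uint32_t timer_divisor;
    uint64_t tdt_msr;
    uint32_t x2apic_id;
    uint32_t rsvd_zero;
};

/*
 * Hypothetical fixup: rec_size is the length of the record found in the
 * stream. If the record ends at or before x2apic_id, the field was
 * zero-extended on load, so fall back to the legacy "vcpu_id * 2" value.
 * Otherwise keep whatever was loaded, including an explicit zero.
 */
static uint32_t demo_fixup_x2apic_id(size_t rec_size, uint32_t loaded_id,
                                     uint32_t vcpu_id)
{
    if ( rec_size <= offsetof(struct demo_hw_lapic, x2apic_id) )
        return vcpu_id * 2;

    return loaded_id;
}
```

With this shape the fallback no longer depends on the loaded value being zero,
so rejecting a zero ID on an AP becomes a separate, explicit check.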

>
> This should be safe for the toolstack, I think.  Hypercalls prior to
> this patch will get a shorter record, and hypercalls from this patch
> onwards will get a longer record with the default x2APIC_ID = vcpu_id *
> 2 filled in.
>
> > +    if ( !vlapic->hw.x2apic_id )
> > +        vlapic->hw.x2apic_id = v->vcpu_id * 2;
> > +
> >      /* Skip fixups on xAPIC mode, or if the x2APIC LDR is already correct */
> >      if ( !vlapic_x2apic_mode(vlapic) ||
> >           (vlapic->loaded.ldr == good_ldr) )
> > @@ -1606,6 +1616,13 @@ static int cf_check lapic_check_hidden(const struct domain *d,
> >           APIC_BASE_EXTD )
> >          return -EINVAL;
> >  
> > +    /*
> > +     * Fail migrations from newer versions of Xen where
> > +     * rsvd_zero is interpreted as something else.
> > +     */
>
> This comment isn't necessary.  We've got no shortage of reserved
> checks.  However ...
>
> > diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
> > index 7ecacadde165..1c2ec669ffc9 100644
> > --- a/xen/include/public/arch-x86/hvm/save.h
> > +++ b/xen/include/public/arch-x86/hvm/save.h
> > @@ -394,6 +394,8 @@ struct hvm_hw_lapic {
> >      uint32_t             disabled; /* VLAPIC_xx_DISABLED */
> >      uint32_t             timer_divisor;
> >      uint64_t             tdt_msr;
> > +    uint32_t             x2apic_id;
> > +    uint32_t             rsvd_zero;
>
> ... we do normally spell it _rsvd; to make it extra extra clear that
> people shouldn't be doing anything with it.
>
> ~Andrew


