From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Cc: Xen-devel <xen-devel@lists.xenproject.org>,
Jan Beulich <jbeulich@suse.com>,
Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>
Subject: Re: [PATCH v3] xen/x86: On x2APIC mode, derive LDR from APIC ID
Date: Thu, 23 Nov 2023 15:03:36 +0100 [thread overview]
Message-ID: <ZV9buKlgta4Gbn-a@macbook> (raw)
In-Reply-To: <655f43d2.5d0a0220.afd7c.48b9@mx.google.com>
On Thu, Nov 23, 2023 at 12:21:36PM +0000, Alejandro Vallejo wrote:
> On Thu, Nov 23, 2023 at 10:03:12AM +0100, Roger Pau Monné wrote:
> > On Wed, Nov 22, 2023 at 04:08:17PM +0000, Alejandro Vallejo wrote:
> > > +static uint32_t x2apic_ldr_from_id(uint32_t id)
> > > +{
> > > + return ((id & ~0xf) << 12) | (1 << (id & 0xf));
> > > +}
> > > +
> > > static void set_x2apic_id(struct vlapic *vlapic)
> > > {
> > > - u32 id = vlapic_vcpu(vlapic)->vcpu_id;
> > > - u32 ldr = ((id & ~0xf) << 12) | (1 << (id & 0xf));
> > > + uint32_t vcpu_id = vlapic_vcpu(vlapic)->vcpu_id;
> > > + uint32_t apic_id = vcpu_id * 2;
> > > + uint32_t apic_ldr = x2apic_ldr_from_id(apic_id);
> > > +
> > > + /* This is a migration bug workaround. See wall of text in lapic_load_fixup() */
> >
> > Line length > 80 cols.
> >
> > I try to avoid referencing function names in comments, as they tend to
> > get out of sync without noticing. It's much easier to use cscope to
> > grep for x2apic_ldr_bug_with_vcpu_id and find the comment itself.
> In my experience that's less of a problem than it's usually made out to be,
> and helps new readers know about the real context something is in the place
> it is.
>
> But I do hold the atypical belief that an out of date pointer to context is
> preferrable to no context.
It's a question of taste TBH, I'm certainly not going to insist.
Since you have to wrap the line to fit in 80 cols anyway, I think I
would rather write: "This is a workaround for migrated domains. ...".
Current text reads to me as it's a migration bug, but that's not the
case, the bug is in the previous Xen versions. I'm not a native
speaker anyway, so maybe it's just me reading it wrong.
> >
> > > + if ( vlapic_domain(vlapic)->arch.hvm.x2apic_ldr_bug_with_vcpu_id )
> > > + apic_ldr = x2apic_ldr_from_id(vcpu_id);
> > >
> > > - vlapic_set_reg(vlapic, APIC_ID, id * 2);
> > > - vlapic_set_reg(vlapic, APIC_LDR, ldr);
> > > + vlapic_set_reg(vlapic, APIC_ID, apic_id);
> > > + vlapic_set_reg(vlapic, APIC_LDR, apic_ldr);
> > > }
> > >
> > > int guest_wrmsr_apic_base(struct vcpu *v, uint64_t val)
> > > @@ -1498,27 +1508,35 @@ static int cf_check lapic_save_regs(struct vcpu *v, hvm_domain_context_t *h)
> > > */
> > > static void lapic_load_fixup(struct vlapic *vlapic)
> > > {
> > > - uint32_t id = vlapic->loaded.id;
> > > + /* Skip fixups on xAPIC mode, or if the x2APIC LDR is already correct */
> > > + if ( !vlapic_x2apic_mode(vlapic) ||
> > > + (vlapic->loaded.ldr == x2apic_ldr_from_id(vlapic->loaded.id)) )
> > > + return;
> > >
> > > - if ( vlapic_x2apic_mode(vlapic) && id && vlapic->loaded.ldr == 1 )
> > > - {
> > > + if ( vlapic->loaded.ldr == 1 )
> > > + /*
> > > + * Xen <= 4.4 had a bug by which all the APICs configured in x2APIC
> > > + * mode got LDR = 1. We can't leave it as-is because it assigned the
> > > + * same LDR to every CPU. We'll fix fix the bug now and assign an
> > > + * LDR value consistent with the APIC ID.
> > > + */
> > > + set_x2apic_id(vlapic);
> > > + else if ( vlapic->loaded.ldr ==
> > > + x2apic_ldr_from_id(vlapic_vcpu(vlapic)->vcpu_id) )
> > > /*
> > > - * This is optional: ID != 0 contradicts LDR == 1. It's being added
> > > - * to aid in eventual debugging of issues arising from the fixup done
> > > - * here, but can be dropped as soon as it is found to conflict with
> > > - * other (future) changes.
> > > + * This is a migration from a broken Xen between 4.4 and 4.18 and
> > > + * we must _PRESERVE_ LDRs so new vCPUs use consistent derivations.
> >
> > Not sure if we should try to avoid mentioning specific versions in the
> > comments, as I this fix will be backported to stable branches (I hope),
> > and hence those will no longer be affected.
> Hence the "broken Xen" part of the paragraphs. Not every 4.18 will have the
> problem, but it shouldn't be seen in 4.19 onwards. I think there's value in
> stating the versions that "may" exhibit problems, but this is all
> subjective
FE.
> >
> > > + * This is so existing running guests that may have already read
> > > + * the LDR at the source host aren't surprised when IPIs stop
> > > + * working as they did at the other end. To address this, we set
> > > + * this domain boolean so future CPU hotplugs derive an LDR
> > > + * consistent with the older Xen's broken idea of consistency.
> >
> > I think this is possibly too verbose, I would be fine with just the
> > first sentence TBH. If we want the full comment however, the wording
> > should be slightly addressed: it's not just IPIs that would possibly
> > fail to be delivered, but any interrupt attempting to target the APIC
> > using the previous LDR addressing (either an IPI or an external
> > interrupt).
> I can s/IPIs/targetted interrupts/ and remove the second sentence.
I would just use interrupts FWIW.
> >
> > > */
> > > - if ( GET_xAPIC_ID(id) != vlapic_vcpu(vlapic)->vcpu_id * 2 ||
> > > - id != SET_xAPIC_ID(GET_xAPIC_ID(id)) )
> > > - printk(XENLOG_G_WARNING "%pv: bogus APIC ID %#x loaded\n",
> > > - vlapic_vcpu(vlapic), id);
> > > - set_x2apic_id(vlapic);
> > > - }
> > > - else /* Undo an eventual earlier fixup. */
> > > - {
> > > - vlapic_set_reg(vlapic, APIC_ID, id);
> > > - vlapic_set_reg(vlapic, APIC_LDR, vlapic->loaded.ldr);
> > > - }
> > > + vlapic_domain(vlapic)->arch.hvm.x2apic_ldr_bug_with_vcpu_id = true;
> > > + else
> > > + printk(XENLOG_G_WARNING
> > > + "%pv: bogus x2APIC loaded id=%#x ldr=%#x\n",
> > > + vlapic_vcpu(vlapic), vlapic->loaded.id, vlapic->loaded.ldr);
> >
> > Could you write the expected values while at it:
> >
> > "%pv: bogus x2APIC loaded id=%#x ldr=%#x (expected id=%#x ldr=%#x)\n"
> x2APIC ID is current strictly related to the vcpu ID, but it won't be after
> I'm done with topology. I can print the expected LDR though.
Oh, I see. Just printing the expected LDR then would be fine.
Thanks, Roger.
next prev parent reply other threads:[~2023-11-23 14:04 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-22 16:08 [PATCH v3] xen/x86: On x2APIC mode, derive LDR from APIC ID Alejandro Vallejo
2023-11-22 17:36 ` Alejandro Vallejo
2023-11-23 9:03 ` Roger Pau Monné
2023-11-23 12:21 ` Alejandro Vallejo
2023-11-23 14:03 ` Roger Pau Monné [this message]
2023-11-23 14:16 ` Andrew Cooper
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZV9buKlgta4Gbn-a@macbook \
--to=roger.pau@citrix.com \
--cc=alejandro.vallejo@cloud.com \
--cc=andrew.cooper3@citrix.com \
--cc=jbeulich@suse.com \
--cc=wl@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.