From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: xen-devel@lists.xenproject.org, Jan Beulich <jbeulich@suse.com>,
Willi Junga <xenproject@ymy.be>,
David Woodhouse <dwmw@amazon.co.uk>
Subject: Re: [PATCH] x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping
Date: Mon, 21 Oct 2024 13:57:21 +0200
Message-ID: <ZxZBocLV7eJUxK50@macbook.local>
In-Reply-To: <9270ef0c-9dfa-4fbf-8060-3c507c0c6684@citrix.com>

On Mon, Oct 21, 2024 at 12:10:14PM +0100, Andrew Cooper wrote:
> On 18/10/2024 9:08 am, Roger Pau Monne wrote:
> > When using AMD-Vi interrupt remapping the vector field in the IO-APIC RTE is
> > repurposed to contain part of the offset into the remapping table. Previous to
> > 2ca9fbd739b8 Xen had logic so that the offset into the interrupt remapping
> > table would match the vector. Such logic was mandatory for end of interrupt to
> > work, since the vector field (even when not containing a vector) is used by the
> > IO-APIC to find for which pin the EOI must be performed.
> >
> > Introduce a table to store the EOI handlers when using interrupt remapping, so
> > that the IO-APIC driver can translate pins into EOI handlers without having to
> > read the IO-APIC RTE entry. Note that to simplify the logic such table is used
> > unconditionally when interrupt remapping is enabled, even if strictly it would
> > only be required for AMD-Vi.
> >
> > Reported-by: Willi Junga <xenproject@ymy.be>
> > Suggested-by: David Woodhouse <dwmw@amazon.co.uk>
> > Fixes: 2ca9fbd739b8 ('AMD IOMMU: allocate IRTE entries instead of using a static mapping')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Yet more fallout from the multi-MSI work. That really has been a giant
> source of bugs.
>
> > ---
> > xen/arch/x86/io_apic.c | 47 ++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 47 insertions(+)
> >
> > diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> > index e40d2f7dbd75..8856eb29d275 100644
> > --- a/xen/arch/x86/io_apic.c
> > +++ b/xen/arch/x86/io_apic.c
> > @@ -71,6 +71,22 @@ static int apic_pin_2_gsi_irq(int apic, int pin);
> >
> > static vmask_t *__read_mostly vector_map[MAX_IO_APICS];
> >
> > +/*
> > + * Store the EOI handle when using interrupt remapping.
> > + *
> > + * If using AMD-Vi interrupt remapping the IO-APIC redirection entry remapped
> > + * format repurposes the vector field to store the offset into the Interrupt
> > + * Remap table. This causes directed EOI to no longer work, as the CPU vector
> > + * no longer matches the contents of the RTE vector field. Add a translation
> > + * table so that directed EOI uses the value in the RTE vector field when
> > + * interrupt remapping is enabled.
> > + *
> > + * Note Intel VT-d Xen code still stores the CPU vector in the RTE vector field
> > + * when using the remapped format, but use the translation table uniformly in
> > + * order to avoid extra logic to differentiate between VT-d and AMD-Vi.
> > + */
> > +static unsigned int **apic_pin_eoi;
>
> I think we can get away with this being uint8_t rather than unsigned
> int, especially as we're allocating memory when not strictly necessary.
>
> The only sentinel value we use is IRQ_VECTOR_UNASSIGNED which is -1.
>
> Vector 0xff is strictly SPIV and not allocated for anything else, so can
> be reused as a suitable sentinel here.

The coding style explicitly discourages using fixed-width types unless
strictly necessary. I assume the usage here would be covered, since
Xen is caching the value of a hardware register field that has a
fixed width.

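
For illustration, a minimal sketch of the narrower cache suggested
above. It assumes one byte array per IO-APIC and reuses 0xff as the
"unknown" marker. EOI_VECTOR_UNKNOWN and alloc_pin_cache() are
hypothetical names, and malloc() stands in for Xen's allocator:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sentinel: per the review comment, vector 0xff is the
 * spurious interrupt vector and is never assigned to an IO-APIC pin.
 */
#define EOI_VECTOR_UNKNOWN 0xffu

/*
 * Allocate a per-pin cache of RTE vector field values, with every
 * entry marked as unknown. Returns NULL on allocation failure.
 */
static uint8_t *alloc_pin_cache(unsigned int nr_pins)
{
    uint8_t *cache = malloc(nr_pins);

    if ( cache )
        /* One byte per entry, so a plain memset() sets the sentinel. */
        memset(cache, EOI_VECTOR_UNKNOWN, nr_pins);

    return cache;
}
```

With one byte per pin the sentinel fill is a single memset() and the
table shrinks to a quarter of the unsigned int version.
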
> > +
> > static void share_vector_maps(unsigned int src, unsigned int dst)
> > {
> > unsigned int pin;
> > @@ -273,6 +289,13 @@ void __ioapic_write_entry(
> > {
> > __io_apic_write(apic, 0x11 + 2 * pin, eu.w2);
> > __io_apic_write(apic, 0x10 + 2 * pin, eu.w1);
> > + /*
> > + * Might be called before apic_pin_eoi is allocated. Entry will be
> > + * updated once the array is allocated and there's an EOI or write
> > + * against the pin.
> > + */
>
> Is this for the xAPIC path where we turn on interrupts before the IOMMU ?

It's for iommu_setup() -> iommu_hardware_setup() saving and restoring
the IO-APIC entries around enabling of interrupt remapping. This is
done just ahead of smp_prepare_cpus() which is where
setup_IO_APIC_irqs() gets called.

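
To make the ordering concrete, here is a simplified, self-contained
model of the cache behaviour (a sketch, not the Xen code). The array
hw_rte_vector[] simulates the RTE vector fields of a single IO-APIC,
and rte_write(), eoi_value(), cache_alloc() and hw_reads are
hypothetical names introduced only for this example:

```c
#include <assert.h>

#define NR_PINS 4
#define VECTOR_UNASSIGNED (~0u)  /* stand-in for IRQ_VECTOR_UNASSIGNED */

static unsigned int hw_rte_vector[NR_PINS];   /* simulated RTE vector fields */
static unsigned int hw_reads;                 /* counts simulated RTE reads */
static unsigned int pin_eoi_storage[NR_PINS];
static unsigned int *pin_eoi;                 /* NULL until cache_alloc() */

/* Models the late allocation: all entries start out unknown. */
static void cache_alloc(void)
{
    unsigned int pin;

    for ( pin = 0; pin < NR_PINS; pin++ )
        pin_eoi_storage[pin] = VECTOR_UNASSIGNED;
    pin_eoi = pin_eoi_storage;
}

/* Write path: update the cache only if it already exists. */
static void rte_write(unsigned int pin, unsigned int vector)
{
    hw_rte_vector[pin] = vector;
    if ( pin_eoi )
        pin_eoi[pin] = vector;
}

/* EOI path: prefer the cached value, fall back to reading the RTE. */
static unsigned int eoi_value(unsigned int pin, unsigned int vector)
{
    if ( pin_eoi )
        vector = pin_eoi[pin];

    if ( vector == VECTOR_UNASSIGNED )
    {
        vector = hw_rte_vector[pin];
        hw_reads++;
        if ( pin_eoi )
            pin_eoi[pin] = vector;  /* later EOIs skip the RTE read */
    }

    return vector;
}
```

A write done before cache_alloc() is not recorded, so the first EOI on
that pin afterwards reads the RTE once and fills the entry; writes made
after allocation keep the cache current without any read.
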
> > + if ( apic_pin_eoi )
> > + apic_pin_eoi[apic][pin] = e.vector;
> > }
> > else
> > iommu_update_ire_from_apic(apic, pin, e.raw);
> > @@ -298,9 +321,17 @@ static void __io_apic_eoi(unsigned int apic, unsigned int vector, unsigned int p
> > /* Prefer the use of the EOI register if available */
> > if ( ioapic_has_eoi_reg(apic) )
> > {
> > + if ( apic_pin_eoi )
> > + vector = apic_pin_eoi[apic][pin];
> > +
> > /* If vector is unknown, read it from the IO-APIC */
> > if ( vector == IRQ_VECTOR_UNASSIGNED )
> > + {
> > vector = __ioapic_read_entry(apic, pin, true).vector;
> > + if ( apic_pin_eoi )
> > + /* Update cached value so further EOI don't need to fetch it. */
> > + apic_pin_eoi[apic][pin] = vector;
> > + }
> >
> > *(IO_APIC_BASE(apic)+16) = vector;
> > }
> > @@ -1022,7 +1053,23 @@ static void __init setup_IO_APIC_irqs(void)
> >
> > apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
> >
> > + if ( iommu_intremap )
>
> MISRA requires this to be iommu_intremap != iommu_intremap_off.
>
> But, is this safe on older hardware?  iommu_intremap defaults to on
> (full), and is then turned off later on boot for various reasons.

I think it's fine because setup_IO_APIC_irqs() is strictly called
after iommu_setup(), so the value of iommu_intremap by that point
should reflect whether IR is enabled.

> We do all memory allocations in setup_IO_APIC_irqs() so at least we get
> to see a consistent view of iommu_intremap.
>
> I suppose there's nothing wrong with having an extra cache of the vector
> in the way when not using interrupt remapping, so maybe it's fine?
>
> > + {
> > + apic_pin_eoi = xzalloc_array(typeof(*apic_pin_eoi), nr_ioapics);
> > + BUG_ON(!apic_pin_eoi);
> > + }
> > +
> > for (apic = 0; apic < nr_ioapics; apic++) {
> > + if ( iommu_intremap )
> > + {
> > + apic_pin_eoi[apic] = xmalloc_array(typeof(**apic_pin_eoi),
> > + nr_ioapic_entries[apic]);
> > + BUG_ON(!apic_pin_eoi[apic]);
> > +
> > + for ( pin = 0; pin < nr_ioapic_entries[apic]; pin++ )
> > + apic_pin_eoi[apic][pin] = IRQ_VECTOR_UNASSIGNED;
> > + }
>
> This logic will be better if you pull nr_ioapic_entries[apic] out into a
> loop-local variable.
>
> It should also allow the optimiser to turn the for loop into a memset(),
> which it can't now because of possible pointer aliasing with the
> induction variable.

Oh, OK, I can send v2 with that adjusted.

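
A rough sketch of the adjusted loop, for reference. It is standalone
C rather than the Xen code: alloc_eoi_table() and VECTOR_UNASSIGNED
are hypothetical names, calloc()/malloc() stand in for
xzalloc_array()/xmalloc_array(), and it assumes every IO-APIC has at
least one pin:

```c
#include <assert.h>
#include <stdlib.h>

#define VECTOR_UNASSIGNED (~0u)  /* stand-in for IRQ_VECTOR_UNASSIGNED */

/*
 * Allocate one array of cached vector values per IO-APIC and mark
 * every pin as unassigned. Returns NULL on allocation failure.
 */
static unsigned int **alloc_eoi_table(unsigned int nr_apics,
                                      const unsigned int *nr_entries)
{
    unsigned int **table = calloc(nr_apics, sizeof(*table));
    unsigned int apic;

    if ( !table )
        return NULL;

    for ( apic = 0; apic < nr_apics; apic++ )
    {
        /* Read the pin count once; the inner loop bound is a local. */
        const unsigned int nr_pins = nr_entries[apic];
        unsigned int *pins = malloc(nr_pins * sizeof(*pins));
        unsigned int pin;

        if ( !pins )
        {
            /* Undo the partial allocation before reporting failure. */
            while ( apic-- )
                free(table[apic]);
            free(table);
            return NULL;
        }

        for ( pin = 0; pin < nr_pins; pin++ )
            pins[pin] = VECTOR_UNASSIGNED;

        table[apic] = pins;
    }

    return table;
}
```

The fill loop writes through a local pointer and compares against a
local bound, which is the shape the review comment asks for.
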
> But overall, the patch looks broadly ok to me.

Thanks, Roger.