From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>,
Oleksii Kurochko <oleksii.kurochko@gmail.com>
Cc: xen-devel@lists.xenproject.org,
Andrew Cooper <andrew.cooper3@citrix.com>,
Anthony PERARD <anthony.perard@vates.tech>,
Michal Orzel <michal.orzel@amd.com>,
Julien Grall <julien@xen.org>,
Stefano Stabellini <sstabellini@kernel.org>
Subject: Re: [PATCH for-4.20 v3 0/5] xen/x86: prevent local APIC errors at shutdown
Date: Wed, 12 Feb 2025 10:25:01 +0100
Message-ID: <Z6xo7Us0LiJqiEi1@macbook.local>
In-Reply-To: <6191ed5b-ec66-4054-a6bc-173ab578aa54@suse.com>
On Wed, Feb 12, 2025 at 09:51:16AM +0100, Jan Beulich wrote:
> On 12.02.2025 09:33, Oleksii Kurochko wrote:
> >
> > On 2/11/25 7:39 PM, Roger Pau Monné wrote:
> >> On Tue, Feb 11, 2025 at 12:02:04PM +0100, Roger Pau Monne wrote:
> >>> Hello,
> >>>
> >>> The following series aims to prevent local APIC errors from stalling the
> >>> shutdown process. In XenServer testing we have seen reports of AMD
> >>> boxes sporadically getting stuck in a spam of:
> >>>
> >>> APIC error on CPU0: 00(08), Receive accept error
> >>>
> >>> messages during shutdown, as a result of device interrupts targeting
> >>> CPUs that are offline (and have the local APIC disabled).
> >>>
> >>> The first patch strictly solves the issue of shutdown getting stuck; further
> >>> patches aim to quiesce interrupts from all devices (known by Xen) in an
> >>> attempt to prevent a spurious "APIC error on CPU0: 00(00)" and also to
> >>> make kexec more reliable.
> >>>
> >>> Thanks, Roger.
> >>>
> >>> Roger Pau Monne (5):
> >>> x86/shutdown: offline APs with interrupts disabled on all CPUs
> >>> x86/irq: drop fixup_irqs() parameters
> >>> x86/smp: perform disabling on interrupts ahead of AP shutdown
> >>> x86/pci: disable MSI(-X) on all devices at shutdown
> >>> x86/iommu: disable interrupts at shutdown
> >> This is now fully reviewed, can I get your opinion (and
> >> release-acked-by) on which patches we should take for 4.20?
> >
> > > If my understanding is correct, merging only the first patch is enough to
> > > unblock the shutdown process, correct? So the first patch should be merged.
> >
> > > As the second patch has no functional changes, IMO it could be merged too,
> > > despite the fact that we are in the hard code freeze period.
> >
> > > For all the other patches, I would like to ask for an additional opinion (as I am
> > > not an expert in x86). At first glance it looks like the absence of these patches
> > > in the staging branch will lead only to triggering "Receive accept error", which I
> > > believe won't block the shutdown process, so these patches could be postponed
> > > until 4.21. On the other hand, if they are low-risk fixes then we could consider
> > > merging them now.
I expect the following patches might make kexec'ing from Xen a bit
more reliable, as the kexec'ed kernel should find an environment with
interrupts from all Xen known devices quiesced.
> I'm not Roger, but as a data point: While I'm uncertain about patch 2, all
> others in this series will very likely be backported anyway.
I also plan to backport the series to the XenServer patch queue when it
goes in.
Thanks, Roger.
Thread overview: 18+ messages
2025-02-11 11:02 [PATCH for-4.20 v3 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
2025-02-11 11:02 ` [PATCH for-4.20 v3 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs Roger Pau Monne
2025-02-11 11:23 ` Jan Beulich
2025-02-11 14:12 ` Roger Pau Monné
2025-02-11 11:02 ` [PATCH for-4.20 v3 2/5] x86/irq: drop fixup_irqs() parameters Roger Pau Monne
2025-02-11 11:02 ` [PATCH for-4.20 v3 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown Roger Pau Monne
2025-02-11 11:02 ` [PATCH for-4.20 v3 4/5] x86/pci: disable MSI(-X) on all devices at shutdown Roger Pau Monne
2025-02-11 11:34 ` Jan Beulich
2025-02-11 14:19 ` Roger Pau Monné
2025-02-11 14:48 ` [PATCH for-4.20 v4 " Roger Pau Monne
2025-02-11 17:00 ` Jan Beulich
2025-02-11 11:02 ` [PATCH for-4.20 v3 5/5] x86/iommu: disable interrupts " Roger Pau Monne
2025-02-11 11:37 ` Jan Beulich
2025-02-11 18:39 ` [PATCH for-4.20 v3 0/5] xen/x86: prevent local APIC errors " Roger Pau Monné
2025-02-12 8:33 ` Oleksii Kurochko
2025-02-12 8:51 ` Jan Beulich
2025-02-12 9:25 ` Roger Pau Monné [this message]
2025-02-12 14:25 ` Oleksii Kurochko