From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Jan Beulich <jbeulich@suse.com>,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: S3 regression related to XSA-471 patches
Date: Fri, 22 Aug 2025 02:53:33 +0200 [thread overview]
Message-ID: <aKe_jgWBiwqPqt7i@mail-itl> (raw)
In-Reply-To: <aJw98srxJKZ2msct@macbook.local>
[-- Attachment #1: Type: text/plain, Size: 4200 bytes --]
On Wed, Aug 13, 2025 at 09:26:09AM +0200, Roger Pau Monné wrote:
> On Wed, Aug 13, 2025 at 04:53:53AM +0200, Marek Marczykowski-Górecki wrote:
> >
> >
> > On August 11, 2025 3:16:46 PM GMT+02:00, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > >On 07/08/2025 6:29 pm, Marek Marczykowski-Górecki wrote:
> > >> On Wed, Aug 06, 2025 at 12:46:36PM +0200, Marek Marczykowski-Górecki wrote:
> > >>> On Wed, Aug 06, 2025 at 12:36:56PM +0200, Jan Beulich wrote:
> > >>>> On 06.08.2025 12:23, Marek Marczykowski-Górecki wrote:
> > >>>>> We've got several reports that S3 reliability recently regressed. We
> > >>>>> identified it's definitely related to XSA-471 patches, and bisection
> > >>>>> points at "x86/idle: Remove broken MWAIT implementation". I don't have
> > >>>>> reliable reproduction steps, so I'm not 100% sure if it's really this
> > >>>>> patch, or maybe an earlier one - but it's definitely already broken at
> > >>>>> this point in the series. Most reports are about Xen 4.17 (as that's
> > >>>>> what stable Qubes OS version currently use), but I think I've seen
> > >>>>> somebody reporting the issue on 4.19 too (but I don't have clear
> > >>>>> evidence, especially if it's the same issue).
> > >>>> At the time we've been discussing the explicit raising of TIMER_SOFTIRQ
> > >>>> in mwait_idle_with_hints() a lot. If it was now truly missing, that imo
> > >>>> shouldn't cause problems only after resume, but then it may have covered
> > >>>> for some omission during resume. As a far-fetched experiment, could you
> > >>>> try putting that back (including the calculation of the "expires" local
> > >>>> variable)?
> > >>> Sure, I'll try.
> > >> It appears this fixes the issue, at least in ~10 attempts so far
> > >> (usually I could reproduce the issue after 2-3 attempts).
> > >>
> > >
> > >Can you show the exact code which seems to have made this stable?
> >
> > This patch: https://github.com/marmarek/qubes-vmm-xen/blob/7c9e9e312948c772d9a68090109964121c1d16fe/0001-DEBUG-S3.patch
>
> Hello,
>
> Can you test if the patch below in isolation makes any difference?
Seems to help too. At least a test similar as before did not hit the
issue anymore.
> Thanks, Roger.
> ---
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 2ac162c997fe..27d672ad5dbb 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -19,6 +19,7 @@
> #include <xen/iommu.h>
> #include <xen/param.h>
> #include <xen/sched.h>
> +#include <xen/softirq.h>
> #include <xen/spinlock.h>
> #include <xen/watchdog.h>
>
> @@ -310,6 +311,7 @@ static int enter_state(u32 state)
> thaw_domains();
> system_state = SYS_STATE_active;
> spin_unlock(&pm_lock);
> + raise_softirq(TIMER_SOFTIRQ);
> return error;
> }
>
> diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
> index 0fd8bdba7067..9d66db861b74 100644
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -65,7 +65,6 @@ static struct {
> unsigned int apic_lvt0;
> unsigned int apic_lvt1;
> unsigned int apic_lvterr;
> - unsigned int apic_tmict;
> unsigned int apic_tdcr;
> unsigned int apic_thmr;
> } apic_pm_state;
> @@ -658,7 +657,6 @@ int lapic_suspend(void)
> apic_pm_state.apic_lvt0 = apic_read(APIC_LVT0);
> apic_pm_state.apic_lvt1 = apic_read(APIC_LVT1);
> apic_pm_state.apic_lvterr = apic_read(APIC_LVTERR);
> - apic_pm_state.apic_tmict = apic_read(APIC_TMICT);
> apic_pm_state.apic_tdcr = apic_read(APIC_TDCR);
> if (maxlvt >= 5)
> apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
> @@ -718,7 +716,7 @@ int lapic_resume(void)
> apic_write(APIC_LVTPC, apic_pm_state.apic_lvtpc);
> apic_write(APIC_LVTT, apic_pm_state.apic_lvtt);
> apic_write(APIC_TDCR, apic_pm_state.apic_tdcr);
> - apic_write(APIC_TMICT, apic_pm_state.apic_tmict);
> + apic_write(APIC_TMICT, 0);
> apic_write(APIC_ESR, 0);
> apic_read(APIC_ESR);
> apic_write(APIC_LVTERR, apic_pm_state.apic_lvterr);
>
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
prev parent reply other threads:[~2025-08-22 0:54 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-06 10:23 S3 regression related to XSA-471 patches Marek Marczykowski-Górecki
2025-08-06 10:36 ` Jan Beulich
2025-08-06 10:46 ` Marek Marczykowski-Górecki
2025-08-07 17:29 ` Marek Marczykowski-Górecki
2025-08-11 13:16 ` Andrew Cooper
2025-08-13 2:53 ` Marek Marczykowski-Górecki
2025-08-13 7:26 ` Roger Pau Monné
2025-08-22 0:53 ` Marek Marczykowski-Górecki [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aKe_jgWBiwqPqt7i@mail-itl \
--to=marmarek@invisiblethingslab.com \
--cc=andrew.cooper3@citrix.com \
--cc=jbeulich@suse.com \
--cc=roger.pau@citrix.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.