From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: IOMMU faults after S3
Date: Thu, 2 Apr 2026 01:17:29 +0200
Message-ID: <ac2nibFfvGm_7elv@mail-itl>
In-Reply-To: <090b8b8f-141b-4a24-92eb-879c0a0c73e1@suse.com>
[-- Attachment #1.1: Type: text/plain, Size: 4830 bytes --]
On Wed, Apr 01, 2026 at 10:52:37AM +0200, Jan Beulich wrote:
> On 01.04.2026 09:14, Jan Beulich wrote:
> > On 27.03.2026 11:19, Marek Marczykowski-Górecki wrote:
> >> I noticed that on some systems, there are a lot of IOMMU faults after
> >> S3. I can see it also on a laptop with MTL, but it affects also the ADL
> >> gitlab runner:
> >>
> >> https://gitlab.com/xen-project/hardware/xen/-/jobs/13661033722
> >> (XEN) [ 37.201160] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> >> (XEN) [ 37.201164] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >> (XEN) [ 37.202332] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> >> (XEN) [ 37.202339] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >>
> >> Interestingly, the 0000:00:1e.6 device is not even listed by lspci.
> >>
> >> The issue is present only on staging, not staging-4.21.
> >>
> >> Bisect says:
> >>
> >> 5ec93b2f19ff8873fca65d38c1164b0a56d3898b is the first bad commit
> >> commit 5ec93b2f19ff8873fca65d38c1164b0a56d3898b
> >> Author: Jan Beulich <jbeulich@suse.com>
> >> Date: Thu Jan 22 14:13:35 2026 +0100
> >>
> >> x86/HPET: drop .set_affinity hook
> >
> > Looking into this, I find several things I can't quite understand (yet).
> > First there is
> >
> > (XEN) [000000456c0fe39f] Disabling HPET for being unreliable
> >
> > which looks to only affect clocksource selection, but not use as
> > broadcast source for CPU-idle management. (This may be an independent
> > issue.)
> >
> > Then there is
> >
> > (XEN) [ 2.760248] HPET: 8 timers usable for broadcast (8 total)
> >
> > which should only occur on ARAT-incapable systems. That should only be
> > older hardware. (On my much older Skylake I don't see this line, for
> > example.) What does CPUID leaf 6 have on this system? Sadly xen-cpuid
> > is purely featureset based, and hence doesn't expose info about that
> > leaf. The leaf also isn't exposed to domains, so CPUID output in Dom0
> > isn't useful to look at either. It would need to be CPUID output on a
> > bare metal kernel.
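
For reference, ARAT is reported in CPUID.06H:EAX bit 2, so a check run on a bare metal kernel could look roughly like the sketch below. It assumes GCC or clang on x86 (`__get_cpuid()` comes from the compiler's `<cpuid.h>`); the helper names are made up for illustration and are not Xen code.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.06H:EAX bit 2 is ARAT (Always Running APIC Timer). */
#define CPUID6_EAX_ARAT (UINT32_C(1) << 2)

/* Hypothetical helper: decode the ARAT bit from a raw leaf 6 EAX value. */
static int leaf6_eax_has_arat(uint32_t eax)
{
    return (eax & CPUID6_EAX_ARAT) != 0;
}

/*
 * Hypothetical helper: query the running CPU.
 * Returns 1 if ARAT is reported, 0 if not, -1 if leaf 6 cannot be read.
 */
static int cpu_has_arat(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if ( !__get_cpuid(6, &eax, &ebx, &ecx, &edx) )
        return -1;

    return leaf6_eax_has_arat(eax);
#else
    return -1;
#endif
}
```

As noted above, running this in Dom0 would not be meaningful, since the leaf is not exposed to domains.
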
> >
> > Further I suspect the fingered commit may only have uncovered an issue
> > elsewhere. I don't think we clear any context table entries during
> > suspend or resume. Hence in
> >
> > (XEN) [ 20.554813] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> > (XEN) [ 20.554819] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >
> > the latter message is confusing me.
> >
> > The fault address being zero may, otoh, be a hint of hpet_msi_write()
> > never having run post-resume. Which may be the connection to the
> > dropping of hpet_msi_set_affinity(), as that did call that function.
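
To illustrate why an address of 0 stands out: on x86 an MSI write targets the 0xFEExxxxx window, so a timer whose route register held a valid MSI address would not be expected to fault at address 0. A rough sketch of that check follows; the helper name is made up, and reading a zero address as "route never programmed" is only the hypothesis stated above, not an established fact.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 MSI writes target the 1MiB window starting at 0xFEE00000. */
#define MSI_ADDR_BASE UINT64_C(0xFEE00000)
#define MSI_ADDR_MASK UINT64_C(0xFFFFFFFFFFF00000)

/* Hypothetical helper: does a faulting address look like an MSI write? */
static bool addr_in_msi_window(uint64_t addr)
{
    return (addr & MSI_ADDR_MASK) == MSI_ADDR_BASE;
}
```

For the faults in the log, `addr_in_msi_window(0)` is false, which is what makes the reported address suspicious.
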
>
> There clearly is an issue with the handling of the max_cstate variable,
> but I expect you don't use xenpm to limit usable C-states (there clearly
> is no respective command line option in the log you referenced)?

No, I don't think so.

> From what the log has, I conclude hpet_broadcast_resume() is called.

I don't think so... I applied changes as attached and got this on
resume:

(XEN) [ 69.486120] Enabling non-boot CPUs ...
(XEN) [ 69.486404] mwait-idle: state C1 is disabled
(XEN) [ 69.587869] mwait-idle: state C1 is disabled
(XEN) [ 69.588008] mwait-idle: state C1 is disabled
(XEN) [ 69.689438] mwait-idle: state C1 is disabled
(XEN) [ 69.689608] mwait-idle: state C1 is disabled
(XEN) [ 69.791066] mwait-idle: state C1 is disabled
(XEN) [ 69.791334] mwait-idle: state C1 is disabled
(XEN) [ 69.892938] mwait-idle: state C1 is disabled
(XEN) [ 69.893209] mwait-idle: state C1 is disabled
(XEN) [ 69.994890] mwait-idle: state C1 is disabled
(XEN) [ 69.995096] mwait-idle: state C1 is disabled
(XEN) [ 70.096638] mwait-idle: state C1 is disabled
(XEN) [ 70.096915] mwait-idle: state C1 is disabled
(XEN) [ 70.097093] mwait-idle: state C1 is disabled
(XEN) [ 70.097272] mwait-idle: state C1 is disabled
(XEN) [ 70.203357] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
(XEN) [ 70.203363] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
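
For anyone decoding the fault line: the requester is identified by a 16-bit source ID, with the bus selecting the root table entry and devfn selecting the context entry that the "Present bit" message refers to. A small sketch of that decoding, with made-up helper names (this is not the Xen parser):

```c
#include <stdint.h>
#include <stdio.h>

struct sbdf {
    unsigned int seg, bus, dev, fn;
};

/* Parse "ssss:bb:dd.f" as printed in the fault message. Returns 0 on success. */
static int parse_sbdf(const char *s, struct sbdf *out)
{
    if ( sscanf(s, "%x:%x:%x.%x",
                &out->seg, &out->bus, &out->dev, &out->fn) != 4 )
        return -1;

    if ( out->seg > 0xffff || out->bus > 0xff || out->dev > 0x1f || out->fn > 7 )
        return -1;

    return 0;
}

/* Context table index: device in bits 7:3, function in bits 2:0. */
static unsigned int sbdf_to_devfn(const struct sbdf *d)
{
    return (d->dev << 3) | d->fn;
}

/* 16-bit source ID: bus in bits 15:8, devfn in bits 7:0. */
static uint16_t sbdf_to_source_id(const struct sbdf *d)
{
    return (uint16_t)((d->bus << 8) | sbdf_to_devfn(d));
}
```

For 0000:00:1e.6 this gives bus 0 and devfn 0xf6, i.e. source ID 0x00f6.
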
> Question is whether it does what we want it to. Could you instrument it
> some, so we have confirmation that it is called, and we also know whether
> __hpet_setup_msi_irq() is not only called on all 8 channels, but also
> succeeds there? (If it failed, I suppose we better wouldn't set
> HPET_TN_FSB and/or HPET_TN_ENABLE.) If, however, it succeeds, I couldn't
> explain why the fault address would be reported as 0, as then we
> definitely must have written HPET_Tn_ROUTE.
>
> Jan
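
The suggestion about not setting HPET_TN_FSB and/or HPET_TN_ENABLE on failure could be sketched as below. This is a simplified model, not the Xen code: `resume_channel_cfg()` and its `setup_rc` parameter are hypothetical, and it assumes the MSI setup step reports a status. The bit values follow the HPET specification's timer configuration register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Timer N configuration bits, per the HPET specification. */
#define HPET_TN_LEVEL    0x0002u
#define HPET_TN_ENABLE   0x0004u
#define HPET_TN_PERIODIC 0x0008u
#define HPET_TN_FSB      0x4000u

/*
 * Hypothetical helper: compute the channel config to write on resume.
 * setup_rc stands in for a status returned by the MSI setup step.
 */
static uint32_t resume_channel_cfg(uint32_t cfg, int setup_rc, bool legacy)
{
    /* One-shot, edge triggered. */
    cfg &= ~(HPET_TN_LEVEL | HPET_TN_PERIODIC);

    /* Legacy routing does not depend on the MSI route. */
    if ( legacy )
        return cfg | HPET_TN_ENABLE;

    /* Route not programmed: leave the channel unable to fire. */
    if ( setup_rc != 0 )
        return cfg & ~(HPET_TN_ENABLE | HPET_TN_FSB);

    return cfg | HPET_TN_ENABLE | HPET_TN_FSB;
}
```

With this shape, a channel whose MSI setup failed would stay disabled rather than issue a write to whatever stale address its route register holds.
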
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
[-- Attachment #1.2: xen-debug.diff --]
[-- Type: text/plain, Size: 2875 bytes --]
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 1ea8ae457424..4c5bf079b728 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -658,6 +658,8 @@ void hpet_broadcast_resume(void)
u32 cfg;
unsigned int i, n;
+ printk("%s:%d: hpet_events: %p\n", __func__, __LINE__, hpet_events);
+
if ( !hpet_events )
return;
@@ -667,23 +669,30 @@ void hpet_broadcast_resume(void)
if ( num_hpets_used > 0 )
{
+ printk("%s:%d: num_hpets_used: %d\n", __func__, __LINE__, num_hpets_used);
/* Stop HPET legacy interrupts */
cfg &= ~HPET_CFG_LEGACY;
n = num_hpets_used;
}
else if ( hpet_events->flags & HPET_EVT_DISABLE )
+ {
+ printk("%s:%d: hpet_events->flags: %#x\n", __func__, __LINE__, hpet_events->flags);
return;
+ }
else
{
/* Start HPET legacy interrupts */
+ printk("%s:%d\n", __func__, __LINE__);
cfg |= HPET_CFG_LEGACY;
n = 1;
}
+ printk("%s:%d: cfg: %#x\n", __func__, __LINE__, cfg);
hpet_write32(cfg, HPET_CFG);
for ( i = 0; i < n; i++ )
{
+ printk("%s:%d: i:%d, hpet_events[i].msi.irq: %d, hpet_events[i].flags: %#x\n", __func__, __LINE__, i, hpet_events[i].msi.irq, hpet_events[i].flags);
if ( hpet_events[i].msi.irq >= 0 )
__hpet_setup_msi_irq(irq_to_desc(hpet_events[i].msi.irq));
@@ -694,6 +703,7 @@ void hpet_broadcast_resume(void)
if ( !(hpet_events[i].flags & HPET_EVT_LEGACY) )
cfg |= HPET_TN_FSB;
hpet_write32(cfg, HPET_Tn_CFG(hpet_events[i].idx));
+ printk("%s:%d: i:%d, cfg: %#x\n", __func__, __LINE__, i, cfg);
hpet_events[i].next_event = STIME_MAX;
}
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index fed30a919d2c..15113ebdfb6c 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2646,6 +2646,7 @@ static int _disable_pit_irq(bool init)
{
int ret = 1;
+ printk("%s:%d: using_pit: %d, cpu_has_apic: %d\n", __func__, __LINE__, using_pit, cpu_has_apic);
if ( using_pit || !cpu_has_apic )
return -1;
@@ -2655,8 +2656,10 @@ static int _disable_pit_irq(bool init)
* XXX dom0 may rely on RTC interrupt delivery, so only enable
* hpet_broadcast if FSB mode available or if force_hpet_broadcast.
*/
+ printk("%s:%d: cpuidle_using_deep_cstate: %d, boot_cpu_has(X86_FEATURE_XEN_ARAT): %d\n", __func__, __LINE__, cpuidle_using_deep_cstate(), boot_cpu_has(X86_FEATURE_XEN_ARAT));
if ( cpuidle_using_deep_cstate() && !boot_cpu_has(X86_FEATURE_XEN_ARAT) )
{
+ printk("%s:%d: init: %d\n", __func__, __LINE__, init);
init ? hpet_broadcast_init() : hpet_broadcast_resume();
if ( !hpet_broadcast_is_available() )
{
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]