From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: IOMMU faults after S3
Date: Thu, 2 Apr 2026 10:08:42 +0200
Message-ID: <ac4kCq87SQSc6ddV@mail-itl>
In-Reply-To: <933a3e95-33d2-4e20-a4d5-2d8b20c2da7f@suse.com>

On Thu, Apr 02, 2026 at 09:01:12AM +0200, Jan Beulich wrote:
> On 02.04.2026 01:17, Marek Marczykowski-Górecki wrote:
> > On Wed, Apr 01, 2026 at 10:52:37AM +0200, Jan Beulich wrote:
> >> On 01.04.2026 09:14, Jan Beulich wrote:
> >>> On 27.03.2026 11:19, Marek Marczykowski-Górecki wrote:
> >>>> I noticed that on some systems, there are a lot of IOMMU faults after
> >>>> S3. I can see it also on a laptop with MTL, but it affects also the ADL
> >>>> gitlab runner:
> >>>>
> >>>>     https://gitlab.com/xen-project/hardware/xen/-/jobs/13661033722
> >>>>     (XEN) [   37.201160] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> >>>>     (XEN) [   37.201164] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >>>>     (XEN) [   37.202332] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> >>>>     (XEN) [   37.202339] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >>>>
> >>>> Interestingly, the 0000:00:1e.6 device is not even listed by lspci.
> >>>>
> >>>> The issue is present only on staging, not staging-4.21.
> >>>>
> >>>> Bisect says:
> >>>>
> >>>> 5ec93b2f19ff8873fca65d38c1164b0a56d3898b is the first bad commit
> >>>> commit 5ec93b2f19ff8873fca65d38c1164b0a56d3898b
> >>>> Author: Jan Beulich <jbeulich@suse.com>
> >>>> Date:   Thu Jan 22 14:13:35 2026 +0100
> >>>>
> >>>>     x86/HPET: drop .set_affinity hook
> >>>
> >>> Looking into this, I find several things I can't quite understand (yet).
> >>> First there is
> >>>
> >>> (XEN) [000000456c0fe39f] Disabling HPET for being unreliable
> >>>
> >>> which looks to only affect clocksource selection, but not use as
> >>> broadcast source for CPU-idle management. (This may be an independent
> >>> issue.)
> >>>
> >>> Then there is
> >>>
> >>> (XEN) [    2.760248] HPET: 8 timers usable for broadcast (8 total)
> >>>
> >>> which should only occur on ARAT-incapable systems. That should only be
> >>> older hardware. (On my much older Skylake I don't see this line, for
> >>> example.) What does CPUID leaf 6 have on this system? Sadly xen-cpuid
> >>> is purely featureset based, and hence doesn't expose info about that
> >>> leaf. The leaf also isn't exposed to domains, so CPUID output in Dom0
> >>> isn't useful to look at either. It would need to be CPUID output on a
> >>> bare metal kernel.
> >>>
> >>> Further I suspect the fingered commit may only have uncovered an issue
> >>> elsewhere. I don't think we clear any context table entries during
> >>> suspend or resume. Hence in
> >>>
> >>> (XEN) [   20.554813] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> >>> (XEN) [   20.554819] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> >>>
> >>> the latter message is confusing me.
> >>>
> >>> The fault address being zero may, otoh, be a hint of hpet_msi_write()
> >>> never having run post-resume. Which may be the connection to the
> >>> dropping of hpet_msi_set_affinity(), as that did call that function.
> >>
> >> There clearly is an issue with the handling of the max_cstate variable,
> >> but I expect you don't use xenpm to limit usable C-states (there clearly
> >> is no respective command line option in the log you referenced)?
> > 
> > No, I don't think so.
> > 
> >> From what the log has, I conclude hpet_broadcast_resume() is called.
> > 
> > I don't think so... I applied changes as attached and got this on
> > resume:
> > 
> > (XEN) [   69.486120] Enabling non-boot CPUs  ...
> > (XEN) [   69.486404] mwait-idle: state C1 is disabled
> > (XEN) [   69.587869] mwait-idle: state C1 is disabled
> > (XEN) [   69.588008] mwait-idle: state C1 is disabled
> > (XEN) [   69.689438] mwait-idle: state C1 is disabled
> > (XEN) [   69.689608] mwait-idle: state C1 is disabled
> > (XEN) [   69.791066] mwait-idle: state C1 is disabled
> > (XEN) [   69.791334] mwait-idle: state C1 is disabled
> > (XEN) [   69.892938] mwait-idle: state C1 is disabled
> > (XEN) [   69.893209] mwait-idle: state C1 is disabled
> > (XEN) [   69.994890] mwait-idle: state C1 is disabled
> > (XEN) [   69.995096] mwait-idle: state C1 is disabled
> > (XEN) [   70.096638] mwait-idle: state C1 is disabled
> > (XEN) [   70.096915] mwait-idle: state C1 is disabled
> > (XEN) [   70.097093] mwait-idle: state C1 is disabled
> > (XEN) [   70.097272] mwait-idle: state C1 is disabled
> > (XEN) [   70.203357] [VT-D]DMAR:[DMA Write] Request device [0000:00:1e.6] fault addr 0
> > (XEN) [   70.203363] [VT-D]DMAR: reason 02 - Present bit in context entry is clear
> 
> That was on the serial console or from xl dmesg? I ask because console_resume()
> runs after time_resume(), so nothing appearing on the serial console would be
> expected (I think).

Ah, right, that's why I don't see my messages on the serial console.
Here is the xl dmesg output instead (from the MTL laptop this time):

    (XEN) [  123.477511] Entering ACPI S3 state.
    (XEN) [18446743903.571842] _disable_pit_irq:2649: using_pit: 0, cpu_has_apic: 1
    (XEN) [18446743903.571856] _disable_pit_irq:2659: cpuidle_using_deep_cstate: 1, boot_cpu_has(X86_FEATURE_XEN_ARAT): 0
    (XEN) [18446743903.571866] _disable_pit_irq:2662: init: 0
    (XEN) [18446743903.571877] hpet_broadcast_resume:661: hpet_events: ffff83046bc1f080
    (XEN) [18446743903.572020] hpet_broadcast_resume:672: num_hpets_used: 8
    (XEN) [18446743903.572029] hpet_broadcast_resume:690: cfg: 0x1
    (XEN) [18446743903.572040] hpet_broadcast_resume:695: i:0, hpet_events[i].msi.irq: 122, hpet_events[i].flags: 0
    (XEN) [18446743903.572081] hpet_broadcast_resume:706: i:0, cfg: 0xc134
    (XEN) [18446743903.572089] hpet_broadcast_resume:695: i:1, hpet_events[i].msi.irq: 123, hpet_events[i].flags: 0
    (XEN) [18446743903.572123] hpet_broadcast_resume:706: i:1, cfg: 0xc104
    (XEN) [18446743903.572132] hpet_broadcast_resume:695: i:2, hpet_events[i].msi.irq: 124, hpet_events[i].flags: 0
    (XEN) [18446743903.572167] hpet_broadcast_resume:706: i:2, cfg: 0xc104
    (XEN) [18446743903.572175] hpet_broadcast_resume:695: i:3, hpet_events[i].msi.irq: 125, hpet_events[i].flags: 0
    (XEN) [18446743903.572210] hpet_broadcast_resume:706: i:3, cfg: 0xc104
    (XEN) [18446743903.572218] hpet_broadcast_resume:695: i:4, hpet_events[i].msi.irq: 126, hpet_events[i].flags: 0
    (XEN) [18446743903.572252] hpet_broadcast_resume:706: i:4, cfg: 0xc104
    (XEN) [18446743903.572261] hpet_broadcast_resume:695: i:5, hpet_events[i].msi.irq: 127, hpet_events[i].flags: 0
    (XEN) [18446743903.572294] hpet_broadcast_resume:706: i:5, cfg: 0xc104
    (XEN) [18446743903.572303] hpet_broadcast_resume:695: i:6, hpet_events[i].msi.irq: 128, hpet_events[i].flags: 0
    (XEN) [18446743903.572338] hpet_broadcast_resume:706: i:6, cfg: 0xc104
    (XEN) [18446743903.572347] hpet_broadcast_resume:695: i:7, hpet_events[i].msi.irq: 129, hpet_events[i].flags: 0
    (XEN) [18446743903.572382] hpet_broadcast_resume:706: i:7, cfg: 0xc104
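
As a reading aid, here is a minimal sketch that decodes the per-channel cfg values printed by hpet_broadcast_resume above. The bit positions follow the timer N configuration register layout in the HPET specification; the flag names and the decode_tn_cfg() helper are shorthand invented for this sketch, not taken from the debug patch.

```python
# Bit layout of the HPET "Timer N Configuration and Capability" register
# (low 16 bits). The names are illustrative labels chosen for this sketch.
HPET_TN_BITS = {
    0x0002: "LEVEL",         # level-triggered interrupt
    0x0004: "ENABLE",        # timer interrupt enabled
    0x0008: "PERIODIC",      # periodic mode selected
    0x0010: "PERIODIC_CAP",  # periodic mode supported
    0x0020: "64BIT_CAP",     # 64-bit comparator supported
    0x0040: "SETVAL",        # accumulator set-value bit
    0x0100: "32BIT",         # forced 32-bit mode
    0x4000: "FSB",           # FSB (MSI-style) delivery enabled
    0x8000: "FSB_CAP",       # FSB delivery supported
}
HPET_TN_ROUTE_MASK = 0x3E00  # I/O APIC routing field, bits 9..13
HPET_TN_ROUTE_SHIFT = 9


def decode_tn_cfg(cfg):
    """Return (list of set flag names, I/O APIC route) for a channel cfg value."""
    flags = [name for bit, name in HPET_TN_BITS.items() if cfg & bit]
    route = (cfg & HPET_TN_ROUTE_MASK) >> HPET_TN_ROUTE_SHIFT
    return flags, route


if __name__ == "__main__":
    # Values copied from the log: channel 0 reads 0xc134, channels 1-7 read 0xc104.
    for cfg in (0xC134, 0xC104):
        flags, route = decode_tn_cfg(cfg)
        print(f"{cfg:#06x}: route={route} flags={'|'.join(flags)}")
```

Under that layout, every channel has both ENABLE and FSB set after resume. Channel 0 differs from the others only in the two capability bits (PERIODIC_CAP and 64BIT_CAP).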

And the xen-cpuid -p output from this system:

    Xen reports there are maximum 120 leaves and 2 MSRs
    Raw policy: 48 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 00000023:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:20800800:77fafbff:bfebfbff
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000005:ffffffff -> 00000040:00000040:00000003:11112020
      00000006:ffffffff -> 00dfcff7:00000002:00000409:00040003
      00000007:00000000 -> 00000002:239c27eb:994007ac:fc18c410
      00000007:00000001 -> 40400910:00000001:00000000:00040000
      00000007:00000002 -> 00000000:00000000:00000000:0000003f
      0000000a:ffffffff -> 07300805:00000000:00000007:00008603
      0000000b:00000000 -> 00000001:00000002:00000100:00000020
      0000000b:00000001 -> 00000007:00000016:00000201:00000020
      0000000d:00000000 -> 00000207:00000000:00000a88:00000000
      0000000d:00000001 -> 0000000f:00000000:00019900:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      0000000d:00000008 -> 00000080:00000000:00000001:00000000
      0000000d:00000009 -> 00000008:00000a80:00000000:00000000
      0000000d:0000000b -> 00000010:00000000:00000001:00000000
      0000000d:0000000c -> 00000018:00000000:00000001:00000000
      0000000d:0000000f -> 00000328:00000000:00000001:00000000
      0000000d:00000010 -> 00000008:00000000:00000001:00000000
      80000000:ffffffff -> 80000008:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000121:2c100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000007:ffffffff -> 00000000:00000000:00000000:00000100
      80000008:ffffffff -> 0000302e:00000000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 000000000d89fd6b
    Host policy: 41 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:20800800:77fafbff:bfebfbff
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000005:ffffffff -> 00000040:00000040:00000003:11112020
      00000006:ffffffff -> 00dfcff7:00000002:00000409:00040003
      00000007:00000000 -> 00000002:239c27eb:994007ac:fc18c410
      00000007:00000001 -> 40000910:00000001:00000000:00040000
      00000007:00000002 -> 00000000:00000000:00000000:0000003f
      0000000b:00000000 -> 00000001:00000002:00000100:00000020
      0000000b:00000001 -> 00000007:00000016:00000201:00000020
      0000000d:00000000 -> 00000207:00000000:00000a88:00000000
      0000000d:00000001 -> 0000000f:00000000:00000000:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      0000000d:00000009 -> 00000008:00000a80:00000000:00000000
      80000000:ffffffff -> 80000008:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000121:2c100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000007:ffffffff -> 00000000:00000000:00000000:00000100
      80000008:ffffffff -> 0000302e:00000000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 400000000d89fd6b
    PV Max policy: 58 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:00800800:f6f83203:1fc9cbf5
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000007:00000000 -> 00000002:218c0329:18400700:ac004410
      00000007:00000001 -> 00000810:00000000:00000000:00000000
      00000007:00000002 -> 00000000:00000000:00000000:00000021
      0000000d:00000000 -> 00000007:00000000:00000340:00000000
      0000000d:00000001 -> 00000007:00000000:00000000:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      80000000:ffffffff -> 80000021:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000123:28100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000007:ffffffff -> 00000000:00000000:00000000:00000100
      80000008:ffffffff -> 0000302e:00001000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 400000001d0ae167
    HVM Max policy: 65 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:00800800:f7fa3223:1fcbfbff
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000007:00000000 -> 00000002:219c07ab:9840070c:bc004410
      00000007:00000001 -> 00000810:00000000:00000000:00000000
      00000007:00000002 -> 00000000:00000000:00000000:00000037
      0000000d:00000000 -> 00000207:00000000:00000a88:00000000
      0000000d:00000001 -> 0000000f:00000000:00000000:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      0000000d:00000009 -> 00000008:00000a80:00000000:00000000
      80000000:ffffffff -> 80000021:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000123:2c100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000007:ffffffff -> 00000000:00000000:00000000:00000100
      80000008:ffffffff -> 0000302e:00101000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 400000001d0ae167
    PV Default policy: 33 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:00800800:f6d83203:1fc9cbf5
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000007:00000000 -> 00000002:218c0329:00400700:ac004410
      00000007:00000001 -> 00000810:00000000:00000000:00000000
      00000007:00000002 -> 00000000:00000000:00000000:00000021
      0000000d:00000000 -> 00000007:00000000:00000340:00000000
      0000000d:00000001 -> 00000007:00000000:00000000:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      80000000:ffffffff -> 80000008:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000121:28100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000008:ffffffff -> 0000302e:00001000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 400000000d08e163
    HVM Default policy: 40 leaves, 2 MSRs
     CPUID:
      leaf     subleaf  -> eax      ebx      ecx      edx     
      00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
      00000001:ffffffff -> 000a06a4:00800800:f7fa3203:1fcbfbff
      00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
      00000004:00000000 -> fc004121:02c0003f:0000003f:00000000
      00000004:00000001 -> fc004122:03c0003f:0000003f:00000000
      00000004:00000002 -> fc01c143:03c0003f:000007ff:00000000
      00000004:00000003 -> fc0fc163:02c0003f:00007fff:00000004
      00000007:00000000 -> 00000002:219c07ab:8040070c:bc004410
      00000007:00000001 -> 00000810:00000000:00000000:00000000
      00000007:00000002 -> 00000000:00000000:00000000:00000037
      0000000d:00000000 -> 00000207:00000000:00000a88:00000000
      0000000d:00000001 -> 0000000f:00000000:00000000:00000000
      0000000d:00000002 -> 00000100:00000240:00000000:00000000
      0000000d:00000009 -> 00000008:00000a80:00000000:00000000
      80000000:ffffffff -> 80000008:00000000:00000000:00000000
      80000001:ffffffff -> 00000000:00000000:00000121:2c100800
      80000002:ffffffff -> 65746e49:2952286c:726f4320:4d542865
      80000003:ffffffff -> 6c552029:20617274:35312037:00004835
      80000006:ffffffff -> 00000000:00000000:08007040:00000000
      80000008:ffffffff -> 0000302e:00101000:00000000:00000000
     MSRs:
      index    -> value           
      000000ce -> 0000000080000000
      0000010a -> 400000000d08e163
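
On the CPUID leaf 6 question: the raw and host policies above both show EAX = 00dfcff7 for leaf 6. The sketch below only checks the ARAT bit, which the Intel SDM defines as CPUID.06H:EAX bit 2. The has_arat() helper is a name made up for this illustration.

```python
# EAX value copied from the "00000006:ffffffff" row of the raw policy above.
LEAF6_EAX = 0x00DFCFF7

# CPUID.06H:EAX bit 2 is ARAT (Always Running APIC Timer) per the Intel SDM.
ARAT_BIT = 2


def has_arat(eax):
    """Return True if the ARAT bit is set in a CPUID leaf 6 EAX value."""
    return bool((eax >> ARAT_BIT) & 1)


if __name__ == "__main__":
    print(f"leaf 6 EAX = {LEAF6_EAX:#010x}, ARAT advertised: {has_arat(LEAF6_EAX)}")
```

So the hardware does advertise ARAT, while the debug output above shows boot_cpu_has(X86_FEATURE_XEN_ARAT) as 0 at that point.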


> Without hpet_broadcast_resume() running, I don't think I could explain how the
> channels (and their FSB interrupts) would get enabled.
> 
> Jan

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
