From: Marc Zyngier <maz@kernel.org>
To: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>,
	kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oupton@kernel.org>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
	Yao Yuan <yaoyuan@linux.alibaba.com>
Subject: Re: [PATCH v2 29/45] KVM: arm64: GICv3: Set ICH_HCR_EL2.TDIR when interrupts overflow LR capacity
Date: Mon, 24 Nov 2025 13:40:35 +0000	[thread overview]
Message-ID: <86ldjvr1kc.wl-maz@kernel.org> (raw)
In-Reply-To: <342302ba-5678-408a-ab63-1a854099d4a1@sirena.org.uk>

On Mon, 24 Nov 2025 13:23:08 +0000,
Mark Brown <broonie@kernel.org> wrote:
> 
> On Mon, Nov 24, 2025 at 01:06:29PM +0000, Marc Zyngier wrote:
> > Mark Brown <broonie@kernel.org> wrote:
> 
> > > FWIW I am seeing this on i.MX8MP (4xA53+GICv3):
> 
> > >   https://lava.sirena.org.uk/scheduler/job/2118713#L1044
> 
> > There are worrying errors way before that, in the VMID allocator init,
> > and I can't see what the GIC has to do with it. The issue Fuad
> > reported was at run time, not boot time, so this really doesn't align
> > with what you are seeing.
> 
> Yeah, I was just looking further and realising it was probably
> different - sorry about that.  I was checking what else was failing
> after seeing the same qemu issue he was; all the platforms are failing
> to boot one way or another.  FWIW, with earlycon the AM625 is showing
> similar issues to the i.MX8MP.

That's the initial warning:

	WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());

The register state:

[  224.378174] pc : kvm_arm_vmid_alloc_init+0xa0/0xc0
[  224.382954] lr : kvm_arm_vmid_alloc_init+0x24/0xc0
[  224.387734] sp : ffff80008009bd40
[  224.391035] x29: ffff80008009bd40 x28: ffff0020209bd3c0 x27: ffffce5349159068
[  224.398162] x26: ffffce5349070118 x25: ffffce5348fb8eb8 x24: ffffce5349059128
[  224.405287] x23: 0000000000000109 x22: ffff0020208ea6c0 x21: 0000000000000004
[  224.412413] x20: ffffce5349c20b78 x19: 0000000000000000 x18: 00000000ffffffff
[  224.419538] x17: 00000000e9a61a0d x16: 00000000b1c06f2c x15: 00000000ffffffff
[  224.426663] x14: 0000000000000000 x13: 7374696220343420 x12: 3a74696d694c2065
[  224.433789] x11: ffffffffffe00000 x10: ffff00275c260000 x9 : ffffce5348048be0
[  224.440914] x8 : 00000000fffeffff x7 : ffff00275c260000 x6 : 80000000ffff0000
[  224.448039] x5 : 0000000000000048 x4 : 0000000000000110 x3 : ffffce5348fc1000
[  224.455164] x2 : 0000000000000100 x1 : 0000000000000100 x0 : 00000000000000ff

The disassembly:

ffff8000816ff220 <kvm_arm_vmid_alloc_init>:
ffff8000816ff220:       d503201f        nop
ffff8000816ff224:       d503201f        nop
ffff8000816ff228:       d503233f        paciasp
ffff8000816ff22c:       a9be7bfd        stp     x29, x30, [sp, #-32]!
ffff8000816ff230:       5280e400        mov     w0, #0x720                      // #1824
ffff8000816ff234:       910003fd        mov     x29, sp
ffff8000816ff238:       72a00300        movk    w0, #0x18, lsl #16
ffff8000816ff23c:       f9000bf3        str     x19, [sp, #16]
ffff8000816ff240:       97a4a61c        bl      ffff800080028ab0 <read_sanitised_ftr_reg>
ffff8000816ff244:       d3441c00        ubfx    x0, x0, #4, #4
ffff8000816ff248:       d0fffa02        adrp    x2, ffff800081641000 <rodata_full>
ffff8000816ff24c:       d0fffa03        adrp    x3, ffff800081641000 <rodata_full>
ffff8000816ff250:       f100081f        cmp     x0, #0x2
ffff8000816ff254:       52800201        mov     w1, #0x10                       // #16
ffff8000816ff258:       b940f044        ldr     w4, [x2, #240]
ffff8000816ff25c:       52800102        mov     w2, #0x8                        // #8
ffff8000816ff260:       d29fffe0        mov     x0, #0xffff                     // #65535
ffff8000816ff264:       1a820021        csel    w1, w1, w2, eq  // eq = none
ffff8000816ff268:       d2801fe2        mov     x2, #0xff                       // #255
ffff8000816ff26c:       b9005061        str     w1, [x3, #80]
ffff8000816ff270:       9a820000        csel    x0, x0, x2, eq  // eq = none
ffff8000816ff274:       d2802001        mov     x1, #0x100                      // #256
ffff8000816ff278:       d2a00022        mov     x2, #0x10000                    // #65536
ffff8000816ff27c:       9a810042        csel    x2, x2, x1, eq  // eq = none
ffff8000816ff280:       eb00009f        cmp     x4, x0
ffff8000816ff284:       540001e2        b.cs    ffff8000816ff2c0 <kvm_arm_vmid_alloc_init+0xa0>  // b.hs, b.nlast

That's the branch to the...

[...]

ffff8000816ff2c0:       d4210000        brk     #0x800

... BRK instruction.

So x0=255 and x4=272. 272 possible CPUs on a machine with only 16?
Bollocks.
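
For anyone following along, the check that trips can be modelled in
user space. This is only a sketch reconstructed from the WARN_ON() and
the disassembly above, not the kernel source: the function names are
made up for illustration, and the reading of the ubfx as the VMIDBits
field of ID_AA64MMFR1_EL1 (bits [7:4], value 2 meaning 16-bit VMIDs)
is inferred from the 0x180720 register encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the VMID width from a raw
 * ID_AA64MMFR1_EL1 value. VMIDBits is assumed to sit in bits [7:4].
 */
static unsigned int vmid_bits_from_mmfr1(uint64_t mmfr1)
{
	unsigned int field = (mmfr1 >> 4) & 0xf;	/* ubfx x0, x0, #4, #4 */

	return field == 2 ? 16 : 8;			/* cmp x0, #0x2 + csel */
}

/* Number of VMIDs for a given width (what NUM_USER_VMIDS evaluates to). */
static uint64_t num_user_vmids(unsigned int bits)
{
	assert(bits == 8 || bits == 16);
	return UINT64_C(1) << bits;
}

/* True when the WARN_ON() condition holds, i.e. the b.cs is taken. */
static bool vmid_check_warns(unsigned int bits, uint64_t possible_cpus)
{
	return num_user_vmids(bits) - 1 <= possible_cpus;
}
```

With 8-bit VMIDs the left-hand side is 255 (x0), so a possible-CPU
count of 272 (x4) fires the warning, while a count of 16 would not.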

Something is badly screwed in -next, and I'm not convinced it is KVM.

	d0f23ccf6ba9e cpumask: Cache num_possible_cpus()

is my current suspect.

	M.

-- 
Without deviation from the norm, progress is not possible.
