Date: Mon, 24 Nov 2025 13:40:35 +0000
Message-ID: <86ldjvr1kc.wl-maz@kernel.org>
From: Marc Zyngier
To: Mark Brown
Cc: Fuad Tabba, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
	Zenghui Yu, Christoffer Dall, Volodymyr Babchuk, Yao Yuan
Subject: Re: [PATCH v2 29/45] KVM: arm64: GICv3: Set ICH_HCR_EL2.TDIR when interrupts overflow LR capacity
In-Reply-To: <342302ba-5678-408a-ab63-1a854099d4a1@sirena.org.uk>
References: <20251109171619.1507205-1-maz@kernel.org>
	<20251109171619.1507205-30-maz@kernel.org>
	<86cy5ku06v.wl-maz@kernel.org>
	<51f5b5d7-9e98-40b8-8f8b-f50254573f3d@sirena.org.uk>
	<86o6orr356.wl-maz@kernel.org>
	<342302ba-5678-408a-ab63-1a854099d4a1@sirena.org.uk>

On Mon, 24 Nov 2025 13:23:08 +0000,
Mark Brown wrote:
>
> On Mon, Nov 24, 2025 at 01:06:29PM +0000, Marc Zyngier wrote:
> > Mark Brown wrote:
>
> > > FWIW I am seeing this on i.MX8MP (4xA53+GICv3):
>
> > >   https://lava.sirena.org.uk/scheduler/job/2118713#L1044
>
> > There are worrying errors way before that, in the VMID allocator init,
> > and I can't see what the GIC has to do with it. The issue Fuad
> > reported was at run time, not boot time. so this really doesn't align
> > with what you are seeing.
>
> Yeah, I was just looking further and realising it was probably
> different - sorry about that. I was checking what else was failing
> after seeing the qemu issue he was, all the platforms aren't booting one
> way or another. FWIW with earlycon on the AM625 is showing similar
> issues to the i.MX8MP.

That's the initial warning:

	WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());

The register state:

[  224.378174] pc : kvm_arm_vmid_alloc_init+0xa0/0xc0
[  224.382954] lr : kvm_arm_vmid_alloc_init+0x24/0xc0
[  224.387734] sp : ffff80008009bd40
[  224.391035] x29: ffff80008009bd40 x28: ffff0020209bd3c0 x27: ffffce5349159068
[  224.398162] x26: ffffce5349070118 x25: ffffce5348fb8eb8 x24: ffffce5349059128
[  224.405287] x23: 0000000000000109 x22: ffff0020208ea6c0 x21: 0000000000000004
[  224.412413] x20: ffffce5349c20b78 x19: 0000000000000000 x18: 00000000ffffffff
[  224.419538] x17: 00000000e9a61a0d x16: 00000000b1c06f2c x15: 00000000ffffffff
[  224.426663] x14: 0000000000000000 x13: 7374696220343420 x12: 3a74696d694c2065
[  224.433789] x11: ffffffffffe00000 x10: ffff00275c260000 x9 : ffffce5348048be0
[  224.440914] x8 : 00000000fffeffff x7 : ffff00275c260000 x6 : 80000000ffff0000
[  224.448039] x5 : 0000000000000048 x4 : 0000000000000110 x3 : ffffce5348fc1000
[  224.455164] x2 : 0000000000000100 x1 : 0000000000000100 x0 : 00000000000000ff

The disassembly:

ffff8000816ff220 <kvm_arm_vmid_alloc_init>:
ffff8000816ff220:	d503201f 	nop
ffff8000816ff224:	d503201f 	nop
ffff8000816ff228:	d503233f 	paciasp
ffff8000816ff22c:	a9be7bfd 	stp	x29, x30, [sp, #-32]!
ffff8000816ff230:	5280e400 	mov	w0, #0x720                 	// #1824
ffff8000816ff234:	910003fd 	mov	x29, sp
ffff8000816ff238:	72a00300 	movk	w0, #0x18, lsl #16
ffff8000816ff23c:	f9000bf3 	str	x19, [sp, #16]
ffff8000816ff240:	97a4a61c 	bl	ffff800080028ab0
ffff8000816ff244:	d3441c00 	ubfx	x0, x0, #4, #4
ffff8000816ff248:	d0fffa02 	adrp	x2, ffff800081641000
ffff8000816ff24c:	d0fffa03 	adrp	x3, ffff800081641000
ffff8000816ff250:	f100081f 	cmp	x0, #0x2
ffff8000816ff254:	52800201 	mov	w1, #0x10                  	// #16
ffff8000816ff258:	b940f044 	ldr	w4, [x2, #240]
ffff8000816ff25c:	52800102 	mov	w2, #0x8                   	// #8
ffff8000816ff260:	d29fffe0 	mov	x0, #0xffff                	// #65535
ffff8000816ff264:	1a820021 	csel	w1, w1, w2, eq	// eq = none
ffff8000816ff268:	d2801fe2 	mov	x2, #0xff                  	// #255
ffff8000816ff26c:	b9005061 	str	w1, [x3, #80]
ffff8000816ff270:	9a820000 	csel	x0, x0, x2, eq	// eq = none
ffff8000816ff274:	d2802001 	mov	x1, #0x100                 	// #256
ffff8000816ff278:	d2a00022 	mov	x2, #0x10000               	// #65536
ffff8000816ff27c:	9a810042 	csel	x2, x2, x1, eq	// eq = none
ffff8000816ff280:	eb00009f 	cmp	x4, x0
ffff8000816ff284:	540001e2 	b.cs	ffff8000816ff2c0	// b.hs, b.nlast

That's the branch to the...

[...]

ffff8000816ff2c0:	d4210000 	brk	#0x800

... BRK instruction. So x0=255 and x4=272. 272 possible CPUs on a
machine with only 16? Bollocks.

Something is badly screwed in -next, and I'm not convinced it is KVM.

d0f23ccf6ba9e cpumask: Cache num_possible_cpus()

is my current suspect.

	M.

-- 
Without deviation from the norm, progress is not possible.
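For anyone following along, the comparison that trips the warning can be
replayed in plain userspace C. This is only a sketch: num_user_vmids() and
vmid_warn_fires() are hypothetical helpers invented for the illustration,
not kernel functions, and it assumes NUM_USER_VMIDS is 1 << VMID bits (8 or
16), which matches the 0xff/0xffff constants selected in the disassembly.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for NUM_USER_VMIDS, assumed here to be
 * 1 << vmid_bits with vmid_bits being either 8 or 16.
 */
static unsigned long num_user_vmids(unsigned int vmid_bits)
{
	assert(vmid_bits == 8 || vmid_bits == 16);
	return 1UL << vmid_bits;
}

/*
 * Mirrors the condition quoted in the WARN_ON():
 *   NUM_USER_VMIDS - 1 <= num_possible_cpus()
 * possible_cpus plays the role of the value loaded into x4.
 */
static bool vmid_warn_fires(unsigned int vmid_bits, unsigned long possible_cpus)
{
	return num_user_vmids(vmid_bits) - 1 <= possible_cpus;
}
```

With 8-bit VMIDs the left-hand side is 255 (x0), so a possible-CPU count of
272 (x4) trips the check, whereas a count of 16 would not.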