Date: Thu, 28 Aug 2025 08:56:01 +0100
Message-ID: <86cy8fev72.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Koichiro Den <den@valinux.co.jp>
Cc: linux-arm-kernel@lists.infradead.org, tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix invalid wait context lockdep report
References: <20250827073848.1410315-1-den@valinux.co.jp> <86h5xtdj6m.wl-maz@kernel.org>

On Thu, 28 Aug 2025 04:09:00 +0100,
Koichiro Den wrote:
> 
> On Wed, Aug 27, 2025 at 01:48:33PM +0100, Marc Zyngier wrote:
> > On Wed, 27 Aug 2025 08:38:48 +0100,
> > Koichiro Den wrote:
> > > 
> > > its_irq_set_vcpu_affinity() always runs under a raw_spin_lock wait
> > > context, so calling kcalloc there is not permitted and is RT-unsafe,
> > > since ___slab_alloc() may acquire a local lock. Below is the actual
> > > lockdep report observed:
> > > 
> > > =============================
> > > [ BUG: Invalid wait context ]
> > > 6.16.0-rc3-irqchip-next-7e28bba92c5c+ #1 Tainted: G S
> > > -----------------------------
> > > qemu-system-aar/2129 is trying to lock:
> > > ffff0085b74f2178 (batched_entropy_u32.lock){..-.}-{3:3}, at: get_random_u32+0x9c/0x708
> > > other info that might help us debug this:
> > > context-{5:5}
> > > 6 locks held by qemu-system-aar/2129:
> > >  #0: ffff0000b84a0738 (&vdev->igate){+.+.}-{4:4}, at: vfio_pci_core_ioctl+0x40c/0x748 [vfio_pci_core]
> > >  #1: ffff8000883cef68 (lock#6){+.+.}-{4:4}, at: irq_bypass_register_producer+0x64/0x2f0
> > >  #2: ffff0000ac0df960 (&its->its_lock){+.+.}-{4:4}, at: kvm_vgic_v4_set_forwarding+0x224/0x6f0
> > >  #3: ffff000086dc4718 (&irq->irq_lock#3){....}-{2:2}, at: kvm_vgic_v4_set_forwarding+0x288/0x6f0
> > >  #4: ffff0001356200c8 (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xc8/0x158
> > >  #5: ffff00009eae4850 (&dev->event_map.vlpi_lock){....}-{2:2}, at: its_irq_set_vcpu_affinity+0x8c/0x528
> > > ...
> > > Call trace:
> > >  show_stack+0x30/0x98 (C)
> > >  dump_stack_lvl+0x9c/0xd0
> > >  dump_stack+0x1c/0x34
> > >  __lock_acquire+0x814/0xb40
> > >  lock_acquire.part.0+0x16c/0x2a8
> > >  lock_acquire+0x8c/0x178
> > >  get_random_u32+0xd4/0x708
> > >  __get_random_u32_below+0x20/0x80
> > >  shuffle_freelist+0x5c/0x1b0
> > >  allocate_slab+0x15c/0x348
> > >  new_slab+0x48/0x80
> > >  ___slab_alloc+0x590/0x8b8
> > >  __slab_alloc.isra.0+0x3c/0x80
> > >  __kmalloc_noprof+0x174/0x520
> > >  its_vlpi_map+0x834/0xce0
> > >  its_irq_set_vcpu_affinity+0x21c/0x528
> > >  irq_set_vcpu_affinity+0x160/0x1b0
> > >  its_map_vlpi+0x90/0x100
> > >  kvm_vgic_v4_set_forwarding+0x3c4/0x6f0
> > >  kvm_arch_irq_bypass_add_producer+0xac/0x108
> > >  __connect+0x138/0x1b0
> > >  irq_bypass_register_producer+0x16c/0x2f0
> > >  vfio_msi_set_vector_signal+0x2c0/0x5a8 [vfio_pci_core]
> > >  vfio_msi_set_block+0x8c/0x120 [vfio_pci_core]
> > >  vfio_pci_set_msi_trigger+0x120/0x3d8 [vfio_pci_core]
> > 
> > Huh. I guess this is due to RT not being completely compatible with
> > GFP_ATOMIC... Why you'd want RT and KVM at the same time is beyond
> > me, but hey.
> 
> For the record, I didn't run KVM on RT, though I still believe it's better
> to conform to the wait context rule and avoid triggering the lockdep
> splat.

Then I don't understand how you get this, because I have not seen it
so far.

> I don't know if there are any plans which make kmalloc with GFP_ATOMIC
> workable under a stricter wait context (getting rid of the local lock
> in some way?), but I think it would be nicer.

GFP_ATOMIC is documented as being compatible with raw spinlocks in
the absence of RT, making the above trace pretty odd.

> > > ...
> > > 
> > > To avoid this, simply pre-allocate vlpi_maps when creating an ITS v4
> > > device with LPI allocation. The trade-off is some wasted memory
> > > depending on nr_lpis, if none of those LPIs are ever upgraded to VLPIs.
> > > 
> > > An alternative would be to move the vlpi_maps allocation out of
> > > its_map_vlpi() and introduce a two-stage prepare/commit flow, allowing a
> > > caller (KVM in the lockdep splat shown above) to do the allocation
> > > outside irq_set_vcpu_affinity(). However, this would unnecessarily add
> > > complexity.
> > 
> > That's debatable. It is probably fine for now, but if this was to
> > grow, we'd need to revisit this.
> 
> Just curious but do you have any plans to replace the current
> irq_set_vcpu_affinity() approach with something else?

Who knows. This is the Linux kernel, everything changes all the time
without the need for a good reason. More significantly, the amount of
*data* being associated with a VLPI could become much higher in the
future, adding more unnecessary allocation.

	M.

-- 
Without deviation from the norm, progress is not possible.
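To make the locking argument above concrete, here is a minimal sketch of the
two shapes being discussed. It is illustrative only: the demo_* types and
functions are made up for this example and are not the actual GICv3 ITS driver
code or the patch under review; it merely assumes, as the quoted commit
message states, that the mapping path runs under a raw_spinlock_t and that
___slab_alloc() may acquire a local lock.

	/*
	 * Illustrative sketch only: simplified, hypothetical names (demo_*),
	 * not the actual drivers/irqchip/irq-gic-v3-its.c code.
	 */
	#include <linux/slab.h>
	#include <linux/spinlock.h>
	#include <linux/types.h>

	struct demo_vlpi_map {
		u32 vintid;
	};

	struct demo_its_dev {
		raw_spinlock_t vlpi_lock;
		struct demo_vlpi_map *vlpi_maps;
		int nr_lpis;
	};

	/*
	 * Problematic shape: vlpi_lock is a raw_spinlock_t, so lockdep treats
	 * this critical section as raw-spinlock wait context. kcalloc() can
	 * reach ___slab_alloc(), which (per the commit message above) may
	 * take a local lock - a sleeping lock on PREEMPT_RT - hence the
	 * "[ BUG: Invalid wait context ]" splat even with GFP_ATOMIC.
	 */
	static int demo_map_vlpi_lazy(struct demo_its_dev *dev)
	{
		int ret = 0;

		raw_spin_lock(&dev->vlpi_lock);
		if (!dev->vlpi_maps) {
			/* BAD: allocating inside a raw spinlock critical section */
			dev->vlpi_maps = kcalloc(dev->nr_lpis,
						 sizeof(*dev->vlpi_maps),
						 GFP_ATOMIC);
			if (!dev->vlpi_maps)
				ret = -ENOMEM;
		}
		raw_spin_unlock(&dev->vlpi_lock);
		return ret;
	}

	/*
	 * Shape the patch description argues for: allocate the array once,
	 * from a sleepable context when the device's LPIs are set up, so the
	 * mapping path only fills in entries under the raw lock and never
	 * allocates.
	 */
	static int demo_its_dev_setup(struct demo_its_dev *dev, int nr_lpis)
	{
		raw_spin_lock_init(&dev->vlpi_lock);
		dev->nr_lpis = nr_lpis;
		dev->vlpi_maps = kcalloc(nr_lpis, sizeof(*dev->vlpi_maps),
					 GFP_KERNEL);
		if (!dev->vlpi_maps)
			return -ENOMEM;

		return 0;
	}

The second function is where the trade-off quoted above comes from: the array
is sized by nr_lpis up front, so the memory is spent even if none of those
LPIs is ever promoted to a VLPI. The prepare/commit alternative mentioned in
the commit message would instead push the allocation out to the caller, before
irq_set_vcpu_affinity() is invoked.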