Date: Thu, 28 Aug 2025 08:56:01 +0100
Message-ID: <86cy8fev72.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Koichiro Den <den@valinux.co.jp>
Cc: linux-arm-kernel@lists.infradead.org, tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix invalid wait context lockdep report
References: <20250827073848.1410315-1-den@valinux.co.jp> <86h5xtdj6m.wl-maz@kernel.org>

On Thu, 28 Aug 2025 04:09:00 +0100,
Koichiro Den wrote:
> 
> On Wed, Aug 27, 2025 at 01:48:33PM +0100, Marc Zyngier wrote:
> > On Wed, 27 Aug 2025 08:38:48 +0100,
> > Koichiro Den wrote:
> > > 
> > > its_irq_set_vcpu_affinity() always runs under a raw_spin_lock wait
> > > context, so calling kcalloc there is not permitted and is RT-unsafe,
> > > since ___slab_alloc() may acquire a local lock. Below is the actual
> > > lockdep report observed:
> > > 
> > > =============================
> > > [ BUG: Invalid wait context ]
> > > 6.16.0-rc3-irqchip-next-7e28bba92c5c+ #1 Tainted: G S
> > > -----------------------------
> > > qemu-system-aar/2129 is trying to lock:
> > > ffff0085b74f2178 (batched_entropy_u32.lock){..-.}-{3:3}, at: get_random_u32+0x9c/0x708
> > > other info that might help us debug this:
> > > context-{5:5}
> > > 6 locks held by qemu-system-aar/2129:
> > >  #0: ffff0000b84a0738 (&vdev->igate){+.+.}-{4:4}, at: vfio_pci_core_ioctl+0x40c/0x748 [vfio_pci_core]
> > >  #1: ffff8000883cef68 (lock#6){+.+.}-{4:4}, at: irq_bypass_register_producer+0x64/0x2f0
> > >  #2: ffff0000ac0df960 (&its->its_lock){+.+.}-{4:4}, at: kvm_vgic_v4_set_forwarding+0x224/0x6f0
> > >  #3: ffff000086dc4718 (&irq->irq_lock#3){....}-{2:2}, at: kvm_vgic_v4_set_forwarding+0x288/0x6f0
> > >  #4: ffff0001356200c8 (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xc8/0x158
> > >  #5: ffff00009eae4850 (&dev->event_map.vlpi_lock){....}-{2:2}, at: its_irq_set_vcpu_affinity+0x8c/0x528
> > > ...
> > > Call trace:
> > >  show_stack+0x30/0x98 (C)
> > >  dump_stack_lvl+0x9c/0xd0
> > >  dump_stack+0x1c/0x34
> > >  __lock_acquire+0x814/0xb40
> > >  lock_acquire.part.0+0x16c/0x2a8
> > >  lock_acquire+0x8c/0x178
> > >  get_random_u32+0xd4/0x708
> > >  __get_random_u32_below+0x20/0x80
> > >  shuffle_freelist+0x5c/0x1b0
> > >  allocate_slab+0x15c/0x348
> > >  new_slab+0x48/0x80
> > >  ___slab_alloc+0x590/0x8b8
> > >  __slab_alloc.isra.0+0x3c/0x80
> > >  __kmalloc_noprof+0x174/0x520
> > >  its_vlpi_map+0x834/0xce0
> > >  its_irq_set_vcpu_affinity+0x21c/0x528
> > >  irq_set_vcpu_affinity+0x160/0x1b0
> > >  its_map_vlpi+0x90/0x100
> > >  kvm_vgic_v4_set_forwarding+0x3c4/0x6f0
> > >  kvm_arch_irq_bypass_add_producer+0xac/0x108
> > >  __connect+0x138/0x1b0
> > >  irq_bypass_register_producer+0x16c/0x2f0
> > >  vfio_msi_set_vector_signal+0x2c0/0x5a8 [vfio_pci_core]
> > >  vfio_msi_set_block+0x8c/0x120 [vfio_pci_core]
> > >  vfio_pci_set_msi_trigger+0x120/0x3d8 [vfio_pci_core]
> > 
> > Huh. I guess this is due to RT not being completely compatible with
> > GFP_ATOMIC... Why you'd want RT and KVM at the same time is beyond
> > me, but hey.
> 
> For the record, I didn't run KVM on RT, though I still believe it's better
> to conform to the wait context rule and avoid triggering the lockdep
> splat.

Then I don't understand how you get this, because I have not seen it
so far.

> I don't know if there are any plans which make kmalloc with GFP_ATOMIC
> workable under a stricter wait context (getting rid of the local lock
> in some way?), but I think it would be nicer.

GFP_ATOMIC is documented as being compatible with raw spinlocks in
the absence of RT, making the above trace pretty odd.

> > > ...
> > > 
> > > To avoid this, simply pre-allocate vlpi_maps when creating an ITS v4
> > > device with LPI allocation. The trade-off is some wasted memory
> > > depending on nr_lpis, if none of those LPIs are ever upgraded to VLPIs.
> > > 
> > > An alternative would be to move the vlpi_maps allocation out of
> > > its_map_vlpi() and introduce a two-stage prepare/commit flow, allowing a
> > > caller (KVM in the lockdep splat shown above) to do the allocation
> > > outside irq_set_vcpu_affinity(). However, this would unnecessarily add
> > > complexity.
> > 
> > That's debatable. It is probably fine for now, but if this was to
> > grow, we'd need to revisit this.
> 
> Just curious but do you have any plans to replace the current
> irq_set_vcpu_affinity() approach with something else?

Who knows. This is the Linux kernel, everything changes all the time
without the need for a good reason. More significantly, the amount of
*data* being associated with a VLPI could become much higher in the
future, adding more unnecessary allocation.

	M.

-- 
Without deviation from the norm, progress is not possible.
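To make the locking argument above concrete, here is a minimal sketch of the
two shapes being discussed. It is illustrative only: the demo_* types and
functions are made up for this example and are not the actual GICv3 ITS driver
code or the patch under review; it merely assumes, as the quoted commit
message states, that the mapping path runs under a raw_spinlock_t and that
___slab_alloc() may acquire a local lock.

	/*
	 * Illustrative sketch only: simplified, hypothetical names (demo_*),
	 * not the actual drivers/irqchip/irq-gic-v3-its.c code.
	 */
	#include <linux/slab.h>
	#include <linux/spinlock.h>
	#include <linux/types.h>

	struct demo_vlpi_map {
		u32 vintid;
	};

	struct demo_its_dev {
		raw_spinlock_t vlpi_lock;
		struct demo_vlpi_map *vlpi_maps;
		int nr_lpis;
	};

	/*
	 * Problematic shape: vlpi_lock is a raw_spinlock_t, so lockdep treats
	 * this critical section as raw-spinlock wait context. kcalloc() can
	 * reach ___slab_alloc(), which (per the commit message above) may
	 * take a local lock - a sleeping lock on PREEMPT_RT - hence the
	 * "[ BUG: Invalid wait context ]" splat even with GFP_ATOMIC.
	 */
	static int demo_map_vlpi_lazy(struct demo_its_dev *dev)
	{
		int ret = 0;

		raw_spin_lock(&dev->vlpi_lock);
		if (!dev->vlpi_maps) {
			/* BAD: allocating inside a raw spinlock critical section */
			dev->vlpi_maps = kcalloc(dev->nr_lpis,
						 sizeof(*dev->vlpi_maps),
						 GFP_ATOMIC);
			if (!dev->vlpi_maps)
				ret = -ENOMEM;
		}
		raw_spin_unlock(&dev->vlpi_lock);
		return ret;
	}

	/*
	 * Shape the patch description argues for: allocate the array once,
	 * from a sleepable context when the device's LPIs are set up, so the
	 * mapping path only fills in entries under the raw lock and never
	 * allocates.
	 */
	static int demo_its_dev_setup(struct demo_its_dev *dev, int nr_lpis)
	{
		raw_spin_lock_init(&dev->vlpi_lock);
		dev->nr_lpis = nr_lpis;
		dev->vlpi_maps = kcalloc(nr_lpis, sizeof(*dev->vlpi_maps),
					 GFP_KERNEL);
		if (!dev->vlpi_maps)
			return -ENOMEM;

		return 0;
	}

The second function is where the trade-off quoted above comes from: the array
is sized by nr_lpis up front, so the memory is spent even if none of those
LPIs is ever promoted to a VLPI. The prepare/commit alternative mentioned in
the commit message would instead push the allocation out to the caller, before
irq_set_vcpu_affinity() is invoked.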