Date: Wed, 27 Aug 2025 13:48:33 +0100
Message-ID: <86h5xtdj6m.wl-maz@kernel.org>
From: Marc Zyngier
To: Koichiro Den
Cc: linux-arm-kernel@lists.infradead.org, tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix invalid wait context lockdep report
In-Reply-To: <20250827073848.1410315-1-den@valinux.co.jp>
References: <20250827073848.1410315-1-den@valinux.co.jp>

On Wed, 27 Aug 2025 08:38:48 +0100, Koichiro Den wrote:
>
> its_irq_set_vcpu_affinity() always runs under a raw_spin_lock wait
> context, so calling kcalloc there is not permitted and RT-unsafe since
> ___slab_alloc() may acquire a local lock. The below is the actual
> lockdep report observed:
>
> =============================
> [ BUG: Invalid wait context ]
> 6.16.0-rc3-irqchip-next-7e28bba92c5c+ #1 Tainted: G S
> -----------------------------
> qemu-system-aar/2129 is trying to lock:
> ffff0085b74f2178 (batched_entropy_u32.lock){..-.}-{3:3}, at: get_random_u32+0x9c/0x708
> other info that might help us debug this:
> context-{5:5}
> 6 locks held by qemu-system-aar/2129:
> #0: ffff0000b84a0738 (&vdev->igate){+.+.}-{4:4}, at: vfio_pci_core_ioctl+0x40c/0x748 [vfio_pci_core]
> #1: ffff8000883cef68 (lock#6){+.+.}-{4:4}, at: irq_bypass_register_producer+0x64/0x2f0
> #2: ffff0000ac0df960 (&its->its_lock){+.+.}-{4:4}, at: kvm_vgic_v4_set_forwarding+0x224/0x6f0
> #3: ffff000086dc4718 (&irq->irq_lock#3){....}-{2:2}, at: kvm_vgic_v4_set_forwarding+0x288/0x6f0
> #4: ffff0001356200c8 (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xc8/0x158
> #5: ffff00009eae4850 (&dev->event_map.vlpi_lock){....}-{2:2}, at: its_irq_set_vcpu_affinity+0x8c/0x528
> ...
> Call trace:
> show_stack+0x30/0x98 (C)
> dump_stack_lvl+0x9c/0xd0
> dump_stack+0x1c/0x34
> __lock_acquire+0x814/0xb40
> lock_acquire.part.0+0x16c/0x2a8
> lock_acquire+0x8c/0x178
> get_random_u32+0xd4/0x708
> __get_random_u32_below+0x20/0x80
> shuffle_freelist+0x5c/0x1b0
> allocate_slab+0x15c/0x348
> new_slab+0x48/0x80
> ___slab_alloc+0x590/0x8b8
> __slab_alloc.isra.0+0x3c/0x80
> __kmalloc_noprof+0x174/0x520
> its_vlpi_map+0x834/0xce0
> its_irq_set_vcpu_affinity+0x21c/0x528
> irq_set_vcpu_affinity+0x160/0x1b0
> its_map_vlpi+0x90/0x100
> kvm_vgic_v4_set_forwarding+0x3c4/0x6f0
> kvm_arch_irq_bypass_add_producer+0xac/0x108
> __connect+0x138/0x1b0
> irq_bypass_register_producer+0x16c/0x2f0
> vfio_msi_set_vector_signal+0x2c0/0x5a8 [vfio_pci_core]
> vfio_msi_set_block+0x8c/0x120 [vfio_pci_core]
> vfio_pci_set_msi_trigger+0x120/0x3d8 [vfio_pci_core]

Huh. I guess this is due to RT not being completely compatible with
GFP_ATOMIC... Why you'd want RT and KVM at the same time is beyond me,
but hey.

> ...
>
> To avoid this, simply pre-allocate vlpi_maps when creating an ITS v4
> device with LPIs allcation. The trade-off is some wasted memory
> depending on nr_lpis, if none of those LPIs are never upgraded to VLPIs.
>
> An alternative would be to move the vlpi_maps allocation out of
> its_map_vlpi() and introduce a two-stage prepare/commit flow, allowing a
> caller (KVM in the lockdep splat shown above) to do the allocation
> outside irq_set_vcpu_affinity(). However, this would unnecessarily add
> complexity.

That's debatable. It is probably fine for now, but if this were to
grow, we'd need to revisit this.

> Fixes: d011e4e654d7 ("irqchip/gic-v3-its: Add VLPI map/unmap operations")

No. This code predates RT being merged, and this problem cannot occur
before RT.

> Signed-off-by: Koichiro Den
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 36 ++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 467cb78435a9..b933be8ddc51 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1923,19 +1923,10 @@ static int its_vlpi_map(struct irq_data *d, struct its_cmd_info *info)
>  	if (!info->map)
>  		return -EINVAL;
>
> -	if (!its_dev->event_map.vm) {
> -		struct its_vlpi_map *maps;
> -
> -		maps = kcalloc(its_dev->event_map.nr_lpis, sizeof(*maps),
> -			       GFP_ATOMIC);
> -		if (!maps)
> -			return -ENOMEM;
> -
> +	if (!its_dev->event_map.vm)
>  		its_dev->event_map.vm = info->map->vm;
> -		its_dev->event_map.vlpi_maps = maps;
> -	} else if (its_dev->event_map.vm != info->map->vm) {
> +	else if (its_dev->event_map.vm != info->map->vm)
>  		return -EINVAL;
> -	}
>
>  	/* Get our private copy of the mapping information */
>  	its_dev->event_map.vlpi_maps[event] = *info->map;
> @@ -2010,10 +2001,8 @@ static int its_vlpi_unmap(struct irq_data *d)
>  	 * Drop the refcount and make the device available again if
>  	 * this was the last VLPI.
>  	 */
> -	if (!--its_dev->event_map.nr_vlpis) {
> +	if (!--its_dev->event_map.nr_vlpis)
>  		its_dev->event_map.vm = NULL;
> -		kfree(its_dev->event_map.vlpi_maps);
> -	}
>
>  	return 0;
>  }
> @@ -3469,6 +3458,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>  {
>  	struct its_device *dev;
>  	unsigned long *lpi_map = NULL;
> +	struct its_vlpi_map *vlpi_maps;
>  	unsigned long flags;
>  	u16 *col_map = NULL;
>  	void *itt;
> @@ -3497,16 +3487,28 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>
>  	if (alloc_lpis) {
>  		lpi_map = its_lpi_alloc(nvecs, &lpi_base, &nr_lpis);
> -		if (lpi_map)
> +		if (lpi_map) {
>  			col_map = kcalloc(nr_lpis, sizeof(*col_map),
>  					  GFP_KERNEL);
> +
> +			/*
> +			 * Pre-allocate vlpi_maps to avoid slab allocation
> +			 * under the strict raw spinlock wait context of
> +			 * irq_set_vcpu_affinity. This could waste memory
> +			 * if no vlpi map is ever created.
> +			 */
> +			if (is_v4(its) && nr_lpis > 0)
> +				vlpi_maps = kcalloc(nr_lpis, sizeof(*vlpi_maps),
> +						    GFP_KERNEL);
> +		}
>  	} else {
>  		col_map = kcalloc(nr_ites, sizeof(*col_map), GFP_KERNEL);
>  		nr_lpis = 0;
>  		lpi_base = 0;
>  	}
>
> -	if (!dev || !itt || !col_map || (!lpi_map && alloc_lpis)) {
> +	if (!dev || !itt || !col_map ||
> +	    (alloc_lpis && (!lpi_map || (is_v4(its) && !vlpi_maps)))) {

This needs to be collapsed into a single boolean evaluated with the
pointer being NULL.

>  		kfree(dev);
>  		itt_free_pool(itt, sz);
>  		bitmap_free(lpi_map);

Where are you freeing vlpi_maps on the failure path?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
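
For illustration only, the two comments on its_create_device() could be addressed roughly as sketched below. This is a userspace analogy, not the driver code, and it reflects just one possible reading of the "single boolean" remark. `struct device_maps`, `create_device()`, `destroy_device()`, `example_usage()`, the `want_vlpi` flag and the `vlpi_failed` flag are all hypothetical names; `calloc()`/`free()` stand in for `kcalloc()`/`kfree()`, and `want_vlpi` stands in for the `is_v4(its)` check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fields of struct its_device used here. */
struct device_maps {
	unsigned short *col_map;
	int *vlpi_maps;		/* NULL when VLPI maps were not requested */
	size_t nr_lpis;
};

/* Release a device; free(NULL) is a no-op, so vlpi_maps may be NULL. */
static void destroy_device(struct device_maps *dev)
{
	if (!dev)
		return;
	free(dev->vlpi_maps);
	free(dev->col_map);
	free(dev);
}

static struct device_maps *create_device(size_t nr_lpis, bool want_vlpi)
{
	struct device_maps *dev;
	unsigned short *col_map;
	int *vlpi_maps = NULL;	/* initialised so the error path can free it */
	bool vlpi_failed = false;

	if (nr_lpis == 0)
		return NULL;

	dev = calloc(1, sizeof(*dev));
	col_map = calloc(nr_lpis, sizeof(*col_map));
	if (want_vlpi) {
		vlpi_maps = calloc(nr_lpis, sizeof(*vlpi_maps));
		/* One flag, set at the point where the pointer is known. */
		vlpi_failed = !vlpi_maps;
	}

	if (!dev || !col_map || vlpi_failed) {
		free(dev);
		free(col_map);
		free(vlpi_maps);	/* also released on the failure path */
		return NULL;
	}

	dev->col_map = col_map;
	dev->vlpi_maps = vlpi_maps;
	dev->nr_lpis = nr_lpis;
	return dev;
}

/* Minimal usage: create a device with VLPI maps, then tear it down. */
static void example_usage(void)
{
	struct device_maps *dev = create_device(4, true);

	assert(dev && dev->vlpi_maps && dev->nr_lpis == 4);
	destroy_device(dev);
}
```

Initialising the pointer to NULL is what makes the unconditional free on the error path safe, and recording the failure next to the allocation keeps the final check flat instead of repeating the version test.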