Date: Mon, 09 Jan 2023 16:41:05 +0000
Message-ID: <86fscjoe3i.wl-maz@kernel.org>
From: Marc Zyngier
To: Shanker Donthineni
Cc: Catalin Marinas, Will Deacon, James Morse,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] arm64: gic: increase the number of IRQ descriptors
In-Reply-To: <4cc4114d-7fa5-1c23-3504-0ca4dbdd0f62@nvidia.com>

On Thu, 05 Jan 2023 14:47:44 +0000,
Shanker Donthineni wrote:
>
> On 1/5/23 04:59, Marc Zyngier wrote:
> > On Wed, 04 Jan 2023 13:47:03 +0000,
> > Shanker Donthineni wrote:
> >>
> >> Hi Marc,
> >>
> >> On 1/4/23 03:14, Marc Zyngier wrote:
> >>> On Wed, 04 Jan 2023 02:37:38 +0000,
> >>> Shanker Donthineni wrote:
> >>>>
> >>>> The default value of NR_IRQS is not sufficient to support GICv4.1
> >>>> features and ~56K LPIs.
> >>>> This parameter would be too small for certain
> >>>> server platforms where it has many IO devices and is capable of
> >>>> direct injection of vSGI and vLPI features.
> >>>>
> >>>> Currently, maximum of 64 + 8192 (IRQ_BITMAP_BITS) IRQ descriptors
> >>>> are allowed. The vCPU creation fails after reaching count ~400 with
> >>>> kvm-arm.vgic_v4_enable=1.
> >>>>
> >>>> This patch increases NR_IRQS to 2^19 to cover 56K LPIs and 262144
> >>>> vSGIs (16K vPEs x 16).
> >>>>
> >>>> Signed-off-by: Shanker Donthineni
> >>>> ---
> >>>> Changes since v1:
> >>>>  -create from v6.2-rc1 and edit commit text
> >>>>
> >>>>  arch/arm64/include/asm/irq.h | 4 ++++
> >>>>  1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> >>>> index fac08e18bcd5..3fffc0b8b704 100644
> >>>> --- a/arch/arm64/include/asm/irq.h
> >>>> +++ b/arch/arm64/include/asm/irq.h
> >>>> @@ -4,6 +4,10 @@
> >>>>
> >>>>  #ifndef __ASSEMBLER__
> >>>>
> >>>> +#if defined(CONFIG_ARM_GIC_V3_ITS)
> >>>> +#define NR_IRQS (1 << 19)
> >>>> +#endif
> >>>> +
> >>>>  #include
> >>>>
> >>>>  struct pt_regs;
> >>>
> >>> Sorry, but I don't think this is an acceptable change. This is a large
> >>> overhead that affects *everyone*, and that will eventually be too
> >>> small anyway with larger systems and larger interrupt spaces.
> >>>
> >>> A better way to address this would be to move to a more dynamic
> >>> allocation, converting the irqdesc rb-tree into an xarray, getting rid
> >>> of the bitmaps (the allocation bitmap and the resend one), and track
> >>> everything in the xarray.
> >>
> >> The actual memory allocation for IRQ descriptors is still dynamic for ARM64.
> >> This change increases static memory for variable 'allocated_irqs' by 64KB,
> >> which does not seem like a noticeable overhead.
> >
> > 64kB for each bitmap, so that's already 128kB (you missed the
> > irqs_resend bitmap).
> > And that's for a number of IRQs that is still way
> > below what the GIC architecture supports today.
> >
> > The architecture supports 32bit INTIDs, and that's 1GB worth of
> > bitmaps, only for the physical side. Add the virtual stuff for which
> > we create host-side descriptors, and we can go way beyond that.
> >
> > So what happens next, once you exceed the arbitrary limit that only
> > satisfies your own use case? We will bump it up again, and again,
> > bloating the kernel with useless static data that *nobody* needs.
> > Especially not the VMs that you plan to run.
> >
> > So I'm putting my foot down right now, and saying that it needs to be
> > fixed once and for all. The current scheme was OK for small interrupt
> > spaces, but it isn't fit for purpose anymore, certainly not with
> > things like the GICv4 architecture.
> >
> > I'm happy to help with it, but I'm certainly not willing to accept any
> > sort of new compile-time limit.
>
> Thanks for helping with a scalable solution instead of static
> allocation. Please include me whenever patches are posted to LKML. I'm
> happy to verify on NVIDIA server platforms and provide test
> feedback.
>

I offered to help you. I didn't offer to do the work for you! ;-)

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.