From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2871EC433FE for ; Sat, 23 Apr 2022 05:43:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233097AbiDWFqg (ORCPT ); Sat, 23 Apr 2022 01:46:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55346 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230025AbiDWFqd (ORCPT ); Sat, 23 Apr 2022 01:46:33 -0400 Received: from verein.lst.de (verein.lst.de [213.95.11.211]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF6D864716; Fri, 22 Apr 2022 22:43:36 -0700 (PDT) Received: by verein.lst.de (Postfix, from userid 2407) id 7331568AFE; Sat, 23 Apr 2022 07:43:32 +0200 (CEST) Date: Sat, 23 Apr 2022 07:43:31 +0200 From: Christoph Hellwig To: "brookxu.cn" Cc: kbusch@kernel.org, axboe@fb.com, hch@lst.de, sagi@grimberg.me, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, tglx@linutronix.de, frederic@kernel.org Subject: Re: [RFC PATCH] nvme-pci: allowed to modify IRQ affinity in latency sensitive scenarios Message-ID: <20220423054331.GA17823@lst.de> References: <1650625106-30272-1-git-send-email-brookxu.cn@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1650625106-30272-1-git-send-email-brookxu.cn@gmail.com> User-Agent: Mutt/1.5.17 (2007-11-01) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 22, 2022 at 06:58:26PM +0800, brookxu.cn wrote: > From: Chunguang Xu > > In most cases, setting the affinity through managed IRQ is a better > choice. But in some scenarios that use isolcpus, such as DPDK, because > managed IRQ does not distinguish between housekeeping CPU and isolated > CPU when selecting CPU, this will cause IO interrupts triggered by > housekeeping CPU to be routed to isolated CPU, which will affect the > tasks running on isolated CPU. commit 11ea68f553e2 ("genirq, > sched/isolation: Isolate from handling managed interrupts") tries to > fix this in a best effort way. However, in a real production environment, > latency-sensitive business needs more of a deterministic result. So, > similar to the mpt3sas driver, we might can add a module parameter > smp_affinity_enable to the Nvme driver. This kind of boilerplate code in random drivers is not sustainable. I really think we need to handle this whole housekeeping CPU case in common code. That is designed CPUs as housekeeping vs non-housekeeping and let the generic affinity assignment code deal with it and solve it for all drivers using the proper affinity masks instead of having random slighty overrides in all drivers anyone ever wants to use in such a system.