Date: Fri, 31 Oct 2025 08:59:51 -0400
From: Phil Auld
To: Frederic Weisbecker
Cc: LKML, Michal Koutný, Andrew Morton, Bjorn Helgaas, Catalin Marinas,
	Danilo Krummrich, "David S . Miller", Eric Dumazet, Gabriele Monaco,
	Greg Kroah-Hartman, Ingo Molnar, Jakub Kicinski, Jens Axboe,
	Johannes Weiner, Lai Jiangshan, Marco Crivellari, Michal Hocko,
	Muchun Song, Paolo Abeni, Peter Zijlstra, "Rafael J . Wysocki",
	Roman Gushchin, Shakeel Butt, Simon Horman, Tejun Heo,
	Thomas Gleixner, Vlastimil Babka, Waiman Long, Will Deacon,
	cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-block@vger.kernel.org, linux-mm@kvack.org,
	linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH 13/33] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
Message-ID: <20251031125951.GA430420@pauld.westford.csb>
References: <20251013203146.10162-1-frederic@kernel.org>
	<20251013203146.10162-14-frederic@kernel.org>
In-Reply-To: <20251013203146.10162-14-frederic@kernel.org>

Hi Frederic,

On Mon, Oct 13, 2025 at 10:31:26PM +0200 Frederic Weisbecker wrote:
> Until now, HK_TYPE_DOMAIN used to only include boot defined isolated
> CPUs passed through isolcpus= boot option. Users interested in also
> knowing the runtime defined isolated CPUs through cpuset must use
> different APIs: cpuset_cpu_is_isolated(), cpu_is_isolated(), etc...
>
> There are many drawbacks to that approach:
>
> 1) Most interested subsystems want to know about all isolated CPUs, not
>    just those defined on boot time.
>
> 2) cpuset_cpu_is_isolated() / cpu_is_isolated() are not synchronized with
>    concurrent cpuset changes.
>
> 3) Further cpuset modifications are not propagated to subsystems
>
> Solve 1) and 2) and centralize all isolated CPUs within the
> HK_TYPE_DOMAIN housekeeping cpumask.
>
> Subsystems can rely on RCU to synchronize against concurrent changes.
>
> The propagation mentioned in 3) will be handled in further patches.
>
> Signed-off-by: Frederic Weisbecker
> ---
>  include/linux/sched/isolation.h |  2 +
>  kernel/cgroup/cpuset.c          |  2 +
>  kernel/sched/isolation.c        | 75 ++++++++++++++++++++++++++++++---
>  kernel/sched/sched.h            |  1 +
>  4 files changed, 74 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
> index da22b038942a..94d5c835121b 100644
> --- a/include/linux/sched/isolation.h
> +++ b/include/linux/sched/isolation.h
> @@ -32,6 +32,7 @@ extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
>  extern bool housekeeping_enabled(enum hk_type type);
>  extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
>  extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
> +extern int housekeeping_update(struct cpumask *mask, enum hk_type type);
>  extern void __init housekeeping_init(void);
>
>  #else
> @@ -59,6 +60,7 @@ static inline bool housekeeping_test_cpu(int cpu, enum hk_type type)
>  	return true;
>  }
>
> +static inline int housekeeping_update(struct cpumask *mask, enum hk_type type) { return 0; }
>  static inline void housekeeping_init(void) { }
>  #endif /* CONFIG_CPU_ISOLATION */
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index aa1ac7bcf2ea..b04a4242f2fa 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1403,6 +1403,8 @@ static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
>
>  	ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
>  	WARN_ON_ONCE(ret < 0);
> +	ret = housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN);
> +	WARN_ON_ONCE(ret < 0);
>  }
>
>  /**
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index b46c20b5437f..95d69c2102f6 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -29,18 +29,48 @@ static struct housekeeping housekeeping;
>
>  bool housekeeping_enabled(enum hk_type type)
>  {
> -	return !!(housekeeping.flags & BIT(type));
> +	return !!(READ_ONCE(housekeeping.flags) & BIT(type));
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>
> +static bool housekeeping_dereference_check(enum hk_type type)
> +{
> +	if (IS_ENABLED(CONFIG_LOCKDEP) && type == HK_TYPE_DOMAIN) {
> +		/* Cpuset isn't even writable yet? */
> +		if (system_state <= SYSTEM_SCHEDULING)
> +			return true;
> +
> +		/* CPU hotplug write locked, so cpuset partition can't be overwritten */
> +		if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held())
> +			return true;
> +
> +		/* Cpuset lock held, partitions not writable */
> +		if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held())
> +			return true;
> +
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static inline struct cpumask *housekeeping_cpumask_dereference(enum hk_type type)
> +{
> +	return rcu_dereference_check(housekeeping.cpumasks[type],
> +				     housekeeping_dereference_check(type));
> +}
> +
>  const struct cpumask *housekeeping_cpumask(enum hk_type type)
>  {
> +	const struct cpumask *mask = NULL;
> +
>  	if (static_branch_unlikely(&housekeeping_overridden)) {
> -		if (housekeeping.flags & BIT(type)) {
> -			return rcu_dereference_check(housekeeping.cpumasks[type], 1);
> -		}
> +		if (READ_ONCE(housekeeping.flags) & BIT(type))
> +			mask = housekeeping_cpumask_dereference(type);
>  	}
> -	return cpu_possible_mask;
> +	if (!mask)
> +		mask = cpu_possible_mask;
> +	return mask;
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_cpumask);
>
> @@ -80,12 +110,45 @@ EXPORT_SYMBOL_GPL(housekeeping_affine);
>
>  bool housekeeping_test_cpu(int cpu, enum hk_type type)
>  {
> -	if (housekeeping.flags & BIT(type))
> +	if (READ_ONCE(housekeeping.flags) & BIT(type))
>  		return cpumask_test_cpu(cpu, housekeeping_cpumask(type));
>  	return true;
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
>
> +int housekeeping_update(struct cpumask *mask, enum hk_type type)
> +{
> +	struct cpumask *trial, *old = NULL;
> +
> +	if (type != HK_TYPE_DOMAIN)
> +		return -ENOTSUPP;
> +
> +	trial = kmalloc(sizeof(*trial), GFP_KERNEL);
> +	if (!trial)
> +		return -ENOMEM;
> +
> +	cpumask_andnot(trial, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT), mask);
> +	if (!cpumask_intersects(trial, cpu_online_mask)) {
> +		kfree(trial);
> +		return -EINVAL;
> +	}
> +
> +	if (!housekeeping.flags)
> +		static_branch_enable(&housekeeping_overridden);
> +
> +	if (!(housekeeping.flags & BIT(type)))
> +		old = housekeeping_cpumask_dereference(type);
> +	else
> +		WRITE_ONCE(housekeeping.flags, housekeeping.flags | BIT(type));

Isn't this backwards? If the bit is not set you save old to free it and
if the bit is set you set it again.

Cheers,
Phil

> +	rcu_assign_pointer(housekeeping.cpumasks[type], trial);
> +
> +	synchronize_rcu();
> +
> +	kfree(old);
> +
> +	return 0;
> +}
> +
>  void __init housekeeping_init(void)
>  {
>  	enum hk_type type;
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 0c0ef8999fd6..8fac8aa451c6 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -30,6 +30,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> --
> 2.51.0
>

--
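
To make the comment about the inverted test concrete, here is a minimal
userspace sketch of the ordering I would have expected: save the old
mask only when the type's bit is already set, otherwise set the bit.
Everything named sketch_* or SKETCH_* is hypothetical and not kernel
API. The sketch leaves out RCU and READ_ONCE/WRITE_ONCE entirely, and it
assumes any previously published mask was heap allocated, which may not
hold for the boot-time mask.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BIT() and enum hk_type; not kernel API. */
#define SKETCH_BIT(n) (1UL << (n))

enum sketch_hk_type { SKETCH_HK_DOMAIN = 0, SKETCH_HK_MAX };

/* Hypothetical model of the housekeeping state: one mask per type. */
struct sketch_housekeeping {
	unsigned long *masks[SKETCH_HK_MAX];
	unsigned long flags;
};

/*
 * Publish @trial as the mask for @type.
 *
 * Returns the mask that was published before (NULL on the first
 * update) so that the caller can free it once readers are done.
 */
static unsigned long *sketch_publish(struct sketch_housekeeping *hk,
				     enum sketch_hk_type type,
				     unsigned long *trial)
{
	unsigned long *old = NULL;

	assert(trial != NULL);

	if (hk->flags & SKETCH_BIT(type))
		/* Bit already set: a mask exists and is being replaced. */
		old = hk->masks[type];
	else
		/* First update for this type: nothing to free yet. */
		hk->flags |= SKETCH_BIT(type);

	hk->masks[type] = trial;
	return old;
}
```

With the test as posted, the first update would try to dereference a
mask that was never published, and later updates would never pick up
the previous mask, so it would not be freed.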