From: Waiman Long
Message-ID: <364e084a-ef37-42ab-a2ae-5f103f1eb212@redhat.com>
Date: Tue, 21 Oct 2025 15:16:45 -0400
Subject: Re: [PATCH 14/33] sched/isolation: Flush memcg workqueues on cpuset isolated partition change
To: Frederic Weisbecker, LKML
Cc: Michal Koutný, Andrew Morton, Bjorn Helgaas, Catalin Marinas, Danilo Krummrich, David S. Miller, Eric Dumazet, Gabriele Monaco, Greg Kroah-Hartman, Ingo Molnar, Jakub Kicinski, Jens Axboe, Johannes Weiner, Lai Jiangshan, Marco Crivellari, Michal Hocko, Muchun Song, Paolo Abeni, Peter Zijlstra, Phil Auld, Rafael J. Wysocki, Roman Gushchin, Shakeel Butt, Simon Horman, Tejun Heo, Thomas Gleixner, Vlastimil Babka, Will Deacon, cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
References: <20251013203146.10162-1-frederic@kernel.org> <20251013203146.10162-15-frederic@kernel.org>
In-Reply-To: <20251013203146.10162-15-frederic@kernel.org>

On 10/13/25 4:31 PM, Frederic Weisbecker wrote:
> The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. In
> order to synchronize against the memcg workqueue and make sure that no
> asynchronous draining is still pending or executing on a newly made
> isolated CPU, the housekeeping subsystem must flush the memcg
> workqueues.
>
> However the memcg workqueues can't be flushed easily since they are
> queued to the main per-CPU workqueue pool.
>
> Solve this by creating a memcg-specific pool, and provide and use the
> appropriate flushing API.
>
> Acked-by: Shakeel Butt
> Signed-off-by: Frederic Weisbecker
> ---
>  include/linux/memcontrol.h | 4 ++++
>  kernel/sched/isolation.c   | 2 ++
>  kernel/sched/sched.h       | 1 +
>  mm/memcontrol.c            | 12 +++++++++++-
>  4 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 873e510d6f8d..001200df63cf 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1074,6 +1074,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
>  	return id;
>  }
>
> +void mem_cgroup_flush_workqueue(void);
> +
>  extern int mem_cgroup_init(void);
>  #else /* CONFIG_MEMCG */
>
> @@ -1481,6 +1483,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
>  	return 0;
>  }
>
> +static inline void mem_cgroup_flush_workqueue(void) { }
> +
>  static inline int mem_cgroup_init(void) { return 0; }
>  #endif /* CONFIG_MEMCG */
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 95d69c2102f6..9ec365dea921 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -144,6 +144,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_type type)
>
>  	synchronize_rcu();
>
> +	mem_cgroup_flush_workqueue();
> +
>  	kfree(old);
>
>  	return 0;
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 8fac8aa451c6..8bfc0b4b133f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -44,6 +44,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1033e52ab6cf..1aa14e543f35 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -95,6 +95,8 @@ static bool cgroup_memory_nokmem __ro_after_init;
>  /* BPF memory accounting disabled? */
>  static bool cgroup_memory_nobpf __ro_after_init;
>
> +static struct workqueue_struct *memcg_wq __ro_after_init;
> +
>  static struct kmem_cache *memcg_cachep;
>  static struct kmem_cache *memcg_pn_cachep;
>
> @@ -1975,7 +1977,7 @@ static void schedule_drain_work(int cpu, struct work_struct *work)
>  {
>  	guard(rcu)();
>  	if (!cpu_is_isolated(cpu))
> -		schedule_work_on(cpu, work);
> +		queue_work_on(cpu, memcg_wq, work);
>  }
>
>  /*
> @@ -5092,6 +5094,11 @@ void mem_cgroup_sk_uncharge(const struct sock *sk, unsigned int nr_pages)
>  	refill_stock(memcg, nr_pages);
>  }
>
> +void mem_cgroup_flush_workqueue(void)
> +{
> +	flush_workqueue(memcg_wq);
> +}
> +
>  static int __init cgroup_memory(char *s)
>  {
>  	char *token;
> @@ -5134,6 +5141,9 @@ int __init mem_cgroup_init(void)
>  	cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL,
>  				  memcg_hotplug_cpu_dead);
>
> +	memcg_wq = alloc_workqueue("memcg", 0, 0);

Should we explicitly mark memcg_wq as WQ_PERCPU, even though I think per-CPU is the default? schedule_work_on() queues its work on the system_percpu_wq.

Cheers,
Longman

> +	WARN_ON(!memcg_wq);
> +
>  	for_each_possible_cpu(cpu) {
>  		INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
>  			  drain_local_memcg_stock);
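P.S. For clarity, the change I'm suggesting would look something like the fragment below. This is only a sketch of the idea, not a tested patch; whether the explicit flag is actually needed depends on whether unflagged alloc_workqueue() workqueues remain per-CPU by default in the target tree:

```c
/*
 * Sketch only: pass WQ_PERCPU explicitly so the per-CPU behavior that
 * queue_work_on() relies on is documented at the allocation site rather
 * than depending on the default. Placement in mem_cgroup_init() mirrors
 * the hunk quoted above.
 */
memcg_wq = alloc_workqueue("memcg", WQ_PERCPU, 0);
WARN_ON(!memcg_wq);
```

If WQ_PERCPU is (or becomes) the implied default, the flag would be purely documentary, but it would keep the intent visible if the workqueue defaults ever change.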