From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton
Cc: Thomas Gleixner, Ingo Molnar, Vlastimil Babka, Hugh Dickins,
	Linux-MM, Linux-RT-Users, LKML, Mel Gorman
Subject: [PATCH 1/1] mm/vmstat: Protect per cpu variables with preempt disable on RT
Date: Thu, 5 Aug 2021 17:00:19 +0100
Message-Id: <20210805160019.1137-2-mgorman@techsingularity.net>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210805160019.1137-1-mgorman@techsingularity.net>
References: <20210805160019.1137-1-mgorman@techsingularity.net>
MIME-Version: 1.0

From: Ingo Molnar

Disable preemption on -RT for the vmstat code. On vanilla kernels the
code runs in IRQ-off regions, while on -RT it may not when stats are
updated under a local_lock. "preempt_disable" ensures that the same
resource is not updated in parallel due to preemption.

This patch differs from the preempt-rt version, where __count_vm_event
and __count_vm_events are also protected. Those counters are explicitly
"allowed to be racy", so there is no need to protect them from
preemption. Only the accurate page stats that are updated by a
read-modify-write need protection.

This patch also differs in that a preempt_[en|dis]able_rt helper is not
used. As vmstat is the only user of the helper, it was suggested that it
be open-coded in vmstat.c instead of risking the helper being used in
unnecessary contexts.
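As an illustration only (not part of the patch; the helper name below is
invented for this sketch), the race being closed is the classic lost
update on a per-cpu counter:

  /*
   * On PREEMPT_RT, local_lock_irq leaves preemption enabled, so two
   * tasks on the same CPU can interleave the read-modify-write:
   *
   *   Task A                            Task B
   *   x = delta + __this_cpu_read(*p);
   *     ...preempted...
   *                                     x = delta + __this_cpu_read(*p);
   *                                     __this_cpu_write(*p, x);
   *   __this_cpu_write(*p, x);          <- Task B's update is lost
   */
  static inline void vmstat_rmw_sketch(s8 __percpu *p, long delta)
  {
  	long x;

  	/*
  	 * Compiled out on !PREEMPT_RT: there, callers run with IRQs
  	 * disabled, which already serialises the RMW on this CPU.
  	 */
  	if (IS_ENABLED(CONFIG_PREEMPT_RT))
  		preempt_disable();

  	x = delta + __this_cpu_read(*p);
  	__this_cpu_write(*p, x);

  	if (IS_ENABLED(CONFIG_PREEMPT_RT))
  		preempt_enable();
  }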
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
Signed-off-by: Mel Gorman
---
 mm/vmstat.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index b0534e068166..2c7e7569a453 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -319,6 +319,16 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 	long x;
 	long t;
 
+	/*
+	 * Accurate vmstat updates require a RMW. On !PREEMPT_RT kernels,
+	 * atomicity is provided by IRQs being disabled -- either explicitly
+	 * or via local_lock_irq. On PREEMPT_RT, local_lock_irq only disables
+	 * CPU migrations and preemption potentially corrupts a counter so
+	 * disable preemption.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	x = delta + __this_cpu_read(*p);
 
 	t = __this_cpu_read(pcp->stat_threshold);
@@ -328,6 +338,9 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 		x = 0;
 	}
 	__this_cpu_write(*p, x);
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 EXPORT_SYMBOL(__mod_zone_page_state);
 
@@ -350,6 +363,10 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
 		delta >>= PAGE_SHIFT;
 	}
 
+	/* See __mod_zone_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	x = delta + __this_cpu_read(*p);
 
 	t = __this_cpu_read(pcp->stat_threshold);
@@ -359,6 +376,9 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
 		x = 0;
 	}
 	__this_cpu_write(*p, x);
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 EXPORT_SYMBOL(__mod_node_page_state);
 
@@ -391,6 +411,10 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
 	s8 __percpu *p = pcp->vm_stat_diff + item;
 	s8 v, t;
 
+	/* See __mod_zone_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_inc_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v > t)) {
@@ -399,6 +423,9 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
 		zone_page_state_add(v + overstep, zone, item);
 		__this_cpu_write(*p, -overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
@@ -409,6 +436,10 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 
 	VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
 
+	/* See __mod_zone_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_inc_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v > t)) {
@@ -417,6 +448,9 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 		node_page_state_add(v + overstep, pgdat, item);
 		__this_cpu_write(*p, -overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
@@ -437,6 +471,10 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
 	s8 __percpu *p = pcp->vm_stat_diff + item;
 	s8 v, t;
 
+	/* See __mod_zone_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_dec_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v < - t)) {
@@ -445,6 +483,9 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
 		zone_page_state_add(v - overstep, zone, item);
 		__this_cpu_write(*p, overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
@@ -455,6 +496,10 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 
 	VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
 
+	/* See __mod_zone_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_dec_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v < - t)) {
@@ -463,6 +508,9 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 		node_page_state_add(v - overstep, pgdat, item);
 		__this_cpu_write(*p, overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
-- 
2.31.1