Subject: Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess
From: Tim Chen
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, Dave Hansen, Ying Huang, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <06f1f92f1f7d4e57c4e20c97f435252c16c60a27.1613584277.git.tim.c.chen@linux.intel.com> <884d7559-e118-3773-351d-84c02642ca96@linux.intel.com>
Date: Thu, 25 Feb 2021 14:25:47 -0800

On 2/22/21 9:41 AM, Tim Chen wrote:
>
>
> On 2/22/21 12:40 AM, Michal Hocko wrote:
>> On Fri 19-02-21 10:59:05, Tim Chen wrote:
> occurrence.
>>>>
>>>> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET.
>>>> If all events correspond with a newly charged memory and the last event
>>>> was just about the soft limit boundary then we should be bound by 128k
>>>> pages (512M and much more if this were huge pages) which is a lot!
>>>> I haven't realized this was that much. Now I see the problem. This would
>>>> be useful information for the changelog.
>>>>
>>>> Your fix is focusing on the over-the-limit boundary which will solve the
>>>> problem but wouldn't that lead to updates happening too often in a
>>>> pathological situation when a memcg would get reclaimed immediately?
>>>
>>> Not really immediately. The memcg that has the most soft limit excess will
>>> be chosen for page reclaim, which is the way it should be.
>>> It is less likely that a memcg that just exceeded
>>> the soft limit becomes the worst offender immediately.
>>
>> Well this all depends on when the soft limit reclaim triggers. In
>> other words how often you see the global memory reclaim. If we have a
>> memcg with a sufficient excess then this will work mostly fine. I was more
>> worried about a case when you have memcgs just slightly over the limit
>> and the global memory pressure is a regular event. You can easily end up
>> bouncing memcgs off and on the tree in a rapid fashion.
>>
>
> If you are concerned about such a case, we can add an excess threshold,
> say 4 MB (or 1024 4K pages), before we trigger a forced update. You think
> that will cover this concern?
>

Michal,

How about modifying this patch with a threshold, like the following?

Tim

---
From 5a78ab56e2e654290cacab2f5a1631e1da1d90d2 Mon Sep 17 00:00:00 2001
From: Tim Chen
Date: Wed, 3 Feb 2021 14:08:48 -0800
Subject: [PATCH] mm: Force update of mem cgroup soft limit tree on usage excess

To rate limit updates to the mem cgroup soft limit tree, we only perform
updates every SOFTLIMIT_EVENTS_TARGET (defined as 1024) memory events.

However, these sampling-based updates may miss a critical update: i.e. when
the mem cgroup first exceeded its limit but it was not on the soft limit tree.
It should be on the tree at that point so it could be subjected to soft
limit page reclaim. If the mem cgroup had few memory events compared with
other mem cgroups, we may not update it and place it on the mem cgroup
soft limit tree for many memory events. And this mem cgroup excess
usage could creep up and the mem cgroup could be hidden from the soft
limit page reclaim for a long time.

Fix this issue by forcing an update to the mem cgroup soft limit tree if a
mem cgroup has exceeded its memory soft limit but it is not on the mem
cgroup soft limit tree.

---
 mm/memcontrol.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a51bf90732cb..e0f6948f8ea5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -104,6 +104,7 @@ static bool do_memsw_account(void)
 
 #define THRESHOLDS_EVENTS_TARGET 128
 #define SOFTLIMIT_EVENTS_TARGET 1024
+#define SOFTLIMIT_EXCESS_THRESHOLD 1024
 
 /*
  * Cgroups above their limits are maintained in a RB-Tree, independent of
@@ -985,15 +986,29 @@ static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
  */
 static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
 {
+	struct mem_cgroup_per_node *mz;
+	bool force_update = false;
+
+	mz = mem_cgroup_nodeinfo(memcg, page_to_nid(page));
+	/*
+	 * mem_cgroup_update_tree may not be called for a memcg exceeding
+	 * soft limit due to the sampling nature of update. Don't allow
+	 * a memcg to be left out of the tree if it has too much usage
+	 * excess.
+	 */
+	if (mz && !mz->on_tree &&
+	    soft_limit_excess(mz->memcg) > SOFTLIMIT_EXCESS_THRESHOLD)
+		force_update = true;
+
 	/* threshold event is triggered in finer grain than soft limit */
-	if (unlikely(mem_cgroup_event_ratelimit(memcg,
+	if (unlikely((force_update) || mem_cgroup_event_ratelimit(memcg,
 						MEM_CGROUP_TARGET_THRESH))) {
 		bool do_softlimit;
 
 		do_softlimit = mem_cgroup_event_ratelimit(memcg,
 						MEM_CGROUP_TARGET_SOFTLIMIT);
 		mem_cgroup_threshold(memcg);
-		if (unlikely(do_softlimit))
+		if (unlikely((force_update) || do_softlimit))
 			mem_cgroup_update_tree(memcg, page);
 	}
 }
-- 
2.20.1
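
For anyone checking the numbers in this thread, the following is a small userspace model (not kernel code) of the two quantities under discussion: the worst-case sampling window Michal estimates, THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET pages, and the forced-update condition the patch adds. The three constants are taken from the patch. struct memcg_model, the model_* helpers, and the 4K page size are assumptions made purely for illustration and do not exist in mm/memcontrol.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Values as they appear in the patch. */
#define THRESHOLDS_EVENTS_TARGET   128UL
#define SOFTLIMIT_EVENTS_TARGET    1024UL
#define SOFTLIMIT_EXCESS_THRESHOLD 1024UL	/* in pages */

/* Assumption for this model only: 4K base pages. */
#define MODEL_PAGE_SIZE            4096UL

/* Hypothetical stand-in for the per-node memcg state used by the patch. */
struct memcg_model {
	unsigned long usage;		/* pages currently charged */
	unsigned long soft_limit;	/* soft limit, in pages */
	bool on_tree;			/* already on the soft limit tree? */
};

/* Pages by which usage exceeds the soft limit, or 0 if under it. */
static inline unsigned long model_soft_limit_excess(const struct memcg_model *m)
{
	return m->usage > m->soft_limit ? m->usage - m->soft_limit : 0;
}

/*
 * Upper bound (in pages) on how much can be charged between two sampled
 * soft limit checks, following the estimate quoted in the thread.
 */
static inline unsigned long model_sampling_window_pages(void)
{
	return THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET;
}

/* Convert a page count to bytes, guarding against overflow. */
static inline unsigned long model_pages_to_bytes(unsigned long pages)
{
	assert(pages <= ULONG_MAX / MODEL_PAGE_SIZE);
	return pages * MODEL_PAGE_SIZE;
}

/*
 * Condition the patch uses to bypass the sampling: the group is not on
 * the tree and its excess is strictly above the threshold.
 */
static inline bool model_force_update(const struct memcg_model *m)
{
	return !m->on_tree &&
	       model_soft_limit_excess(m) > SOFTLIMIT_EXCESS_THRESHOLD;
}
```

In this model the sampling window comes to 131072 pages (512 MB with 4K pages), while the threshold corresponds to 4 MB. A group that is 1024 pages or less over its soft limit never takes the forced path, which is what is meant to keep groups hovering just above the limit from being bounced on and off the tree. Once on_tree is set, the forced path is skipped and only the sampled update applies.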