Date: Wed, 24 Jan 2024 08:50:43 +0100
From: Michal Hocko
To: Johannes Weiner
Cc: "T.J. Mercier", Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm@google.com, yuzhao@google.com, yangyifei03@kuaishou.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
References: <20240121214413.833776-1-tjmercier@google.com> <20240123164819.GB1745986@cmpxchg.org>
In-Reply-To: <20240123164819.GB1745986@cmpxchg.org>

On Tue 23-01-24 11:48:19, Johannes Weiner wrote:
> The revert isn't a straight-forward solution.
>
> The patch you're reverting fixed conventional reclaim and broke
> MGLRU. Your revert fixes MGLRU and breaks conventional reclaim.
>
> On Tue, Jan 23, 2024 at 05:58:05AM -0800, T.J. Mercier wrote:
> > They both are able to make progress.
> > The main difference is that a
> > single iteration of try_to_free_mem_cgroup_pages with MGLRU ends soon
> > after it reclaims nr_to_reclaim, and before it touches all memcgs. So
> > a single iteration really will reclaim only about SWAP_CLUSTER_MAX-ish
> > pages with MGLRU. Without MGLRU the memcg walk is not aborted
> > immediately after nr_to_reclaim is reached, so a single call to
> > try_to_free_mem_cgroup_pages can actually reclaim thousands of pages
> > even when sc->nr_to_reclaim is 32. (i.e. MGLRU overreclaims less.)
> > https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@google.com/
>
> Is that a feature or a bug?
>
>  * 1. Memcg LRU only applies to global reclaim, and the round-robin incrementing
>  *    of their max_seq counters ensures the eventual fairness to all eligible
>  *    memcgs. For memcg reclaim, it still relies on mem_cgroup_iter().
>
> If it bails out exactly after nr_to_reclaim, it'll overreclaim
> less. But with steady reclaim in a complex subtree, it will always hit
> the first cgroup returned by mem_cgroup_iter() and then bail. This
> seems like a fairness issue.

Agreed. We would need to re-introduce something like we used to have
before 1ba6fc9af35bf.

> We should figure out what the right method for balancing fairness with
> overreclaim is, regardless of reclaim implementation. Because having
> two different approaches and reverting dependent things back and forth
> doesn't make sense.

Absolutely agreed!

> Using an LRU to rotate through memcgs over multiple reclaim cycles
> seems like a good idea. Why is this specific to MGLRU? Shouldn't this
> be a generic piece of memcg infrastructure?
>
> Then there is the question of why there is an LRU for global reclaim,
> but not for subtree reclaim.
> Reclaiming a container with multiple
> subtrees would benefit from the fairness provided by a container-level
> LRU order just as much; having fairness for root but not for subtrees
> would produce different reclaim and pressure behavior, and can cause
> regressions when moving a service from bare-metal into a container.
>
> Figuring out these differences and converging on a method for cgroup
> fairness would be the better way of fixing this. Because of the
> regression risk to the default reclaim implementation, I'm inclined to
> NAK this revert.

I do agree that a simple revert doesn't seem to be the way to go.

-- 
Michal Hocko
SUSE Labs
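
To make the trade-off discussed above concrete, here is a toy userspace model in Python. It is only a sketch and not kernel code: ToyCgroup, walk_all, bail_early, RotatingWalk and simulate are hypothetical names, the batch of 32 merely stands in for a SWAP_CLUSTER_MAX-sized step, and the remembered walk position is just one assumed way of spreading reclaim across a subtree, not a description of what any commit did.

```python
from dataclasses import dataclass

BATCH = 32  # stand-in for a SWAP_CLUSTER_MAX-sized reclaim step (assumption)


@dataclass
class ToyCgroup:
    """Hypothetical stand-in for a memcg: just a page counter."""
    name: str
    pages: int
    reclaimed: int = 0


def reclaim_from(cg, want):
    """Take up to `want` pages from one toy cgroup and return how many."""
    got = min(cg.pages, want)
    cg.pages -= got
    cg.reclaimed += got
    return got


def walk_all(cgroups, nr_to_reclaim, batch=BATCH):
    """Visit every cgroup once per call; the target is ignored mid-walk,
    so the call overreclaims but treats all cgroups equally."""
    return sum(reclaim_from(cg, batch) for cg in cgroups)


def bail_early(cgroups, nr_to_reclaim, batch=BATCH):
    """Always start at the first cgroup and stop once the target is met."""
    total = 0
    for cg in cgroups:
        total += reclaim_from(cg, batch)
        if total >= nr_to_reclaim:
            break
    return total


class RotatingWalk:
    """Stop at the target, but remember where the previous call stopped."""

    def __init__(self):
        self.pos = 0

    def __call__(self, cgroups, nr_to_reclaim, batch=BATCH):
        total = 0
        for _ in range(len(cgroups)):
            cg = cgroups[self.pos]
            self.pos = (self.pos + 1) % len(cgroups)
            total += reclaim_from(cg, batch)
            if total >= nr_to_reclaim:
                break
        return total


def simulate(strategy, n_cgroups=4, pages=10_000, calls=8, nr_to_reclaim=32):
    """Run `calls` reclaim requests; return (total, per-cgroup reclaimed)."""
    cgroups = [ToyCgroup(f"cg{i}", pages) for i in range(n_cgroups)]
    total = sum(strategy(cgroups, nr_to_reclaim) for _ in range(calls))
    return total, [cg.reclaimed for cg in cgroups]


if __name__ == "__main__":
    for label, strategy in (("walk_all", walk_all),
                            ("bail_early", bail_early),
                            ("rotating", RotatingWalk())):
        print(label, simulate(strategy))
```

In this model the full walk reclaims four times the requested amount but evenly, the early bail hits the target while taking everything from the first cgroup, and the rotating walk hits the target and spreads it. That is the fairness versus overreclaim balance in miniature, nothing more.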