From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 99DCA79C0 for ; Wed, 30 Nov 2022 18:52:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 20657C433D6; Wed, 30 Nov 2022 18:52:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1669834367; bh=Ffrg4MH1kKEzpy9NhJ2ONABLjMus7mc83yJw2bCzBfM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XAx9dxK6YAzb8L7/9nfap4XZI7R5nOQOU6vW/pbhiuujvSd1wXux7p5gsRaoE0WoW eSHRWnOUQtWMnCJzrRchTQ4XCz8h6PSn32+s7DuddwsmOXODMUtBuW1miIDoVihI4t Vq3J7K0dvgMr2iv3+w+4uhCUgHYbj6rXSp+nZfdI= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, "Aneesh Kumar K.V" , Johannes Weiner , Tejun Heo , zefan li , Andrew Morton Subject: [PATCH 6.0 200/289] mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1 Date: Wed, 30 Nov 2022 19:23:05 +0100 Message-Id: <20221130180548.654898585@linuxfoundation.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221130180544.105550592@linuxfoundation.org> References: <20221130180544.105550592@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Aneesh Kumar K.V commit 81a70c21d9170de67a45843bdd627f4cce9c4215 upstream. balance_dirty_pages doesn't do the required dirty throttling on cgroupv1. See commit 9badce000e2c ("cgroup, writeback: don't enable cgroup writeback on traditional hierarchies"). Instead, the kernel depends on writeback throttling in shrink_folio_list to achieve the same goal. With large memory systems, the flusher may not be able to writeback quickly enough such that we will start finding pages in the shrink_folio_list already in writeback. Hence for cgroupv1 let's do a reclaim throttle after waking up the flusher. The below test which used to fail on a 256GB system completes till the the file system is full with this change. root@lp2:/sys/fs/cgroup/memory# mkdir test root@lp2:/sys/fs/cgroup/memory# cd test/ root@lp2:/sys/fs/cgroup/memory/test# echo 120M > memory.limit_in_bytes root@lp2:/sys/fs/cgroup/memory/test# echo $$ > tasks root@lp2:/sys/fs/cgroup/memory/test# dd if=/dev/zero of=/home/kvaneesh/test bs=1M Killed Link: https://lkml.kernel.org/r/20221118070603.84081-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Suggested-by: Johannes Weiner Acked-by: Johannes Weiner Cc: Tejun Heo Cc: zefan li Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman --- mm/vmscan.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2472,8 +2472,20 @@ shrink_inactive_list(unsigned long nr_to * the flushers simply cannot keep up with the allocation * rate. Nudge the flusher threads in case they are asleep. */ - if (stat.nr_unqueued_dirty == nr_taken) + if (stat.nr_unqueued_dirty == nr_taken) { wakeup_flusher_threads(WB_REASON_VMSCAN); + /* + * For cgroupv1 dirty throttling is achieved by waking up + * the kernel flusher here and later waiting on folios + * which are in writeback to finish (see shrink_folio_list()). + * + * Flusher may not be able to issue writeback quickly + * enough for cgroupv1 writeback throttling to work + * on a large system. + */ + if (!writeback_throttling_sane(sc)) + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); + } sc->nr.dirty += stat.nr_dirty; sc->nr.congested += stat.nr_congested;