* [PATCH] mm/cgroup/reclaim: Fix dirty pages throttling on cgroup v1
@ 2022-11-18 7:06 ` Aneesh Kumar K.V
0 siblings, 0 replies; 4+ messages in thread
From: Aneesh Kumar K.V @ 2022-11-18 7:06 UTC (permalink / raw)
To: linux-mm, akpm, Tejun Heo, Zefan Li, Johannes Weiner
Cc: cgroups, Aneesh Kumar K.V, stable
balance_dirty_pages() doesn't do the required dirty throttling on cgroup v1. See
commit 9badce000e2c ("cgroup, writeback: don't enable cgroup writeback on
traditional hierarchies"). Instead, the kernel depends on writeback throttling
in shrink_folio_list() to achieve the same goal. On large-memory systems, the
flusher may not issue writeback quickly enough for reclaim to start finding
pages in shrink_folio_list() that are already under writeback, so that
throttling does not take effect. Hence, for cgroup v1, let's do a reclaim
throttle after waking up the flusher.
The test below, which used to fail on a 256GB system, now runs until the
file system is full with this change.
root@lp2:/sys/fs/cgroup/memory# mkdir test
root@lp2:/sys/fs/cgroup/memory# cd test/
root@lp2:/sys/fs/cgroup/memory/test# echo 120M > memory.limit_in_bytes
root@lp2:/sys/fs/cgroup/memory/test# echo $$ > tasks
root@lp2:/sys/fs/cgroup/memory/test# dd if=/dev/zero of=/home/kvaneesh/test bs=1M
Killed
Cc: <stable@kernel.org>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/vmscan.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 04d8b88e5216..388022c5ef2b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2514,8 +2514,20 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	 * the flushers simply cannot keep up with the allocation
 	 * rate. Nudge the flusher threads in case they are asleep.
 	 */
-	if (stat.nr_unqueued_dirty == nr_taken)
+	if (stat.nr_unqueued_dirty == nr_taken) {
 		wakeup_flusher_threads(WB_REASON_VMSCAN);
+		/*
+		 * For cgroupv1 dirty throttling is achieved by waking up
+		 * the kernel flusher here and later waiting on folios
+		 * which are in writeback to finish (see shrink_folio_list()).
+		 *
+		 * Flusher may not be able to issue writeback quickly
+		 * enough for cgroupv1 writeback throttling to work
+		 * on a large system.
+		 */
+		if (!writeback_throttling_sane(sc))
+			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+	}
 
 	sc->nr.dirty += stat.nr_dirty;
 	sc->nr.congested += stat.nr_congested;
--
2.38.1
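For readers following the hunk above, here is a minimal user-space sketch of the decision it adds. It is not kernel code. The names struct scan_model, model_writeback_throttling_sane() and model_after_shrink() are hypothetical, and the body of the "sane" check is an assumption about what writeback_throttling_sane() tests (reclaim not driven by a memcg limit, or the memory controller on the cgroup v2 hierarchy with cgroup writeback available); the patch itself only shows the call site.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct scan_control. */
struct scan_model {
	bool cgroup_reclaim;   /* reclaim was triggered by a memcg limit */
	bool memcg_on_v2;      /* memory controller is on the v2 hierarchy */
	bool cgroup_writeback; /* models cgroup writeback support being built in */
};

enum reclaim_action {
	ACTION_NONE,                      /* not every taken folio was unqueued dirty */
	ACTION_WAKE_FLUSHER,              /* behaviour before and after the patch */
	ACTION_WAKE_FLUSHER_AND_THROTTLE, /* extra step the patch adds for cgroup v1 */
};

/*
 * Assumed semantics: throttling via balance_dirty_pages() can be relied on
 * for global reclaim, and for memcg reclaim on cgroup v2 with cgroup
 * writeback. Everything else (cgroup v1 memcg reclaim) is "not sane".
 */
static bool model_writeback_throttling_sane(const struct scan_model *sc)
{
	if (!sc->cgroup_reclaim)
		return true;
	if (sc->cgroup_writeback && sc->memcg_on_v2)
		return true;
	return false;
}

/* Mirrors the patched branch: wake the flusher, then throttle if needed. */
static enum reclaim_action model_after_shrink(const struct scan_model *sc,
					      unsigned long nr_unqueued_dirty,
					      unsigned long nr_taken)
{
	if (nr_unqueued_dirty != nr_taken)
		return ACTION_NONE;
	if (!model_writeback_throttling_sane(sc))
		return ACTION_WAKE_FLUSHER_AND_THROTTLE;
	return ACTION_WAKE_FLUSHER;
}

/* Self-check covering cgroup v1 reclaim, cgroup v2 reclaim and global reclaim. */
static void model_self_check(void)
{
	struct scan_model v1 = { .cgroup_reclaim = true, .cgroup_writeback = true };
	struct scan_model v2 = { .cgroup_reclaim = true, .memcg_on_v2 = true,
				 .cgroup_writeback = true };
	struct scan_model global = { .cgroup_reclaim = false };

	assert(model_after_shrink(&v1, 32, 32) == ACTION_WAKE_FLUSHER_AND_THROTTLE);
	assert(model_after_shrink(&v2, 32, 32) == ACTION_WAKE_FLUSHER);
	assert(model_after_shrink(&global, 32, 32) == ACTION_WAKE_FLUSHER);
	assert(model_after_shrink(&v1, 10, 32) == ACTION_NONE);
}
```

In this model only the cgroup v1 case takes the extra throttle step, and only when every folio taken off the list was dirty and not yet queued for writeback, which matches the condition guarding the new code in the patch.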
* Re: [PATCH] mm/cgroup/reclaim: Fix dirty pages throttling on cgroup v1
2022-11-18 7:06 ` Aneesh Kumar K.V
@ 2022-11-18 23:43 ` Johannes Weiner
-1 siblings, 0 replies; 4+ messages in thread
From: Johannes Weiner @ 2022-11-18 23:43 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: linux-mm, akpm, Tejun Heo, Zefan Li, cgroups, stable
On Fri, Nov 18, 2022 at 12:36:03PM +0530, Aneesh Kumar K.V wrote:
> balance_dirty_pages doesn't do the required dirty throttling on cgroupv1. See
> commit 9badce000e2c ("cgroup, writeback: don't enable cgroup writeback on
> traditional hierarchies"). Instead, the kernel depends on writeback throttling
> in shrink_folio_list to achieve the same goal. With large memory systems, the
> flusher may not be able to writeback quickly enough such that we will start
> finding pages in the shrink_folio_list already in writeback. Hence for cgroupv1
> let's do a reclaim throttle after waking up the flusher.
>
> The test below, which used to fail on a 256GB system, now runs until the
> file system is full with this change.
>
> root@lp2:/sys/fs/cgroup/memory# mkdir test
> root@lp2:/sys/fs/cgroup/memory# cd test/
> root@lp2:/sys/fs/cgroup/memory/test# echo 120M > memory.limit_in_bytes
> root@lp2:/sys/fs/cgroup/memory/test# echo $$ > tasks
> root@lp2:/sys/fs/cgroup/memory/test# dd if=/dev/zero of=/home/kvaneesh/test bs=1M
> Killed
>
> Cc: <stable@kernel.org>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Thanks Aneesh