From: Greg Thelen <gthelen@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
containers@lists.osdl.org, linux-fsdevel@vger.kernel.org,
Andrea Righi <arighi@develer.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Minchan Kim <minchan.kim@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Ciju Rajan K <ciju@linux.vnet.ibm.com>,
David Rientjes <rientjes@google.com>,
Wu Fengguang <fengguang.wu@intel.com>,
Chad Talbott <ctalbott@google.com>,
Justin TerAvest <teravest@google.com>,
Vivek Goyal <vgoyal@redhat.com>, Greg Thelen <gthelen@google.com>
Subject: [PATCH v6 8/9] memcg: check memcg dirty limits in page writeback
Date: Fri, 11 Mar 2011 10:43:30 -0800
Message-ID: <1299869011-26152-9-git-send-email-gthelen@google.com>
In-Reply-To: <1299869011-26152-1-git-send-email-gthelen@google.com>
If the current process is in a non-root memcg, then
balance_dirty_pages() will consider the memcg dirty limits as well as
the system-wide limits. This allows different cgroups to have distinct
dirty limits which trigger direct and background writeback at different
levels.
If called with a mem_cgroup, then throttle_vm_writeout() queries the
given cgroup for its dirty memory usage limits.
Signed-off-by: Andrea Righi <arighi@develer.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
---
Changelog since v5:
- Simplified this change by using mem_cgroup_balance_dirty_pages() rather than
cramming the somewhat different logic into balance_dirty_pages(). This means
the global (non-memcg) dirty limits are not passed around in the
struct dirty_info, so there's less change to existing code.
Changelog since v4:
- Added missing 'struct mem_cgroup' forward declaration in writeback.h.
- Made throttle_vm_writeout() memcg aware.
- Removed previously added dirty_writeback_pages() which is no longer needed.
- Added logic to balance_dirty_pages() to throttle if over foreground memcg
limit.
Changelog since v3:
- Leave determine_dirtyable_memory() static. v3 made it non-static.
- balance_dirty_pages() now considers both system and memcg dirty limits and
usage data. This data is retrieved with global_dirty_info() and
memcg_dirty_info().
include/linux/writeback.h | 3 ++-
mm/page-writeback.c | 34 ++++++++++++++++++++++++++++------
mm/vmscan.c | 2 +-
3 files changed, 31 insertions(+), 8 deletions(-)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 0ead399..a45d895 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -8,6 +8,7 @@
#include <linux/fs.h>
struct backing_dev_info;
+struct mem_cgroup;
extern spinlock_t inode_lock;
@@ -92,7 +93,7 @@ void laptop_mode_timer_fn(unsigned long data);
#else
static inline void laptop_sync_completion(void) { }
#endif
-void throttle_vm_writeout(gfp_t gfp_mask);
+void throttle_vm_writeout(gfp_t gfp_mask, struct mem_cgroup *mem_cgroup);
/* These are exported to sysctl. */
extern int dirty_background_ratio;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index d8005b0..f6a8dd6 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -473,7 +473,8 @@ unsigned long bdi_dirty_limit(struct backing_dev_info *bdi, unsigned long dirty)
* data. It looks at the number of dirty pages in the machine and will force
* the caller to perform writeback if the system is over `vm_dirty_ratio'.
* If we're over `background_thresh' then the writeback threads are woken to
- * perform some writeout.
+ * perform some writeout. The current task may have per-memcg dirty
+ * limits, which are also checked.
*/
static void balance_dirty_pages(struct address_space *mapping,
unsigned long write_chunk)
@@ -488,6 +489,8 @@ static void balance_dirty_pages(struct address_space *mapping,
bool dirty_exceeded = false;
struct backing_dev_info *bdi = mapping->backing_dev_info;
+ mem_cgroup_balance_dirty_pages(mapping, write_chunk);
+
for (;;) {
struct writeback_control wbc = {
.sync_mode = WB_SYNC_NONE,
@@ -651,23 +654,42 @@ void balance_dirty_pages_ratelimited_nr(struct address_space *mapping,
}
EXPORT_SYMBOL(balance_dirty_pages_ratelimited_nr);
-void throttle_vm_writeout(gfp_t gfp_mask)
+/*
+ * Throttle the current task if it is near dirty memory usage limits. Both
+ * global dirty memory limits and (if @mem_cgroup is given) per-cgroup dirty
+ * memory limits are checked.
+ *
+ * If near limits, then wait for usage to drop. Dirty usage should drop because
+ * dirty producers should have used balance_dirty_pages(), which would have
+ * scheduled writeback.
+ */
+void throttle_vm_writeout(gfp_t gfp_mask, struct mem_cgroup *mem_cgroup)
{
unsigned long background_thresh;
unsigned long dirty_thresh;
+ struct dirty_info memcg_info;
+ bool do_memcg;
for ( ; ; ) {
global_dirty_limits(&background_thresh, &dirty_thresh);
+ do_memcg = mem_cgroup && mem_cgroup_hierarchical_dirty_info(
+ determine_dirtyable_memory(), true, mem_cgroup,
+ &memcg_info);
/*
* Boost the allowable dirty threshold a bit for page
* allocators so they don't get DoS'ed by heavy writers
*/
dirty_thresh += dirty_thresh / 10; /* wheeee... */
-
- if (global_page_state(NR_UNSTABLE_NFS) +
- global_page_state(NR_WRITEBACK) <= dirty_thresh)
- break;
+ if (do_memcg)
+ memcg_info.dirty_thresh += memcg_info.dirty_thresh / 10;
+
+ if ((global_page_state(NR_UNSTABLE_NFS) +
+ global_page_state(NR_WRITEBACK) <= dirty_thresh) &&
+ (!do_memcg ||
+ (memcg_info.nr_unstable_nfs +
+ memcg_info.nr_writeback <= memcg_info.dirty_thresh)))
+ break;
congestion_wait(BLK_RW_ASYNC, HZ/10);
/*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 060e4c1..035d2ea 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1939,7 +1939,7 @@ restart:
sc->nr_scanned - nr_scanned, sc))
goto restart;
- throttle_vm_writeout(sc->gfp_mask);
+ throttle_vm_writeout(sc->gfp_mask, sc->mem_cgroup);
}
/*
--
1.7.3.1
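
For readers following the throttle_vm_writeout() hunk, the exit test of its loop can be modelled in user space. This is a minimal sketch, not kernel code: struct dirty_sample, boosted(), under_limit() and may_stop_throttling() are hypothetical names invented for illustration, and a NULL memcg pointer stands in for the patch's do_memcg being false (which in the patch also covers the case where the memcg lookup reports nothing to check).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the counters the patch reads.
 * This is NOT the kernel's struct dirty_info.
 */
struct dirty_sample {
	unsigned long dirty_thresh;	/* dirty limit, in pages */
	unsigned long nr_unstable_nfs;	/* unstable NFS pages */
	unsigned long nr_writeback;	/* pages under writeback */
};

/* The patch lets page allocators run 10% over the limit. */
static unsigned long boosted(unsigned long thresh)
{
	return thresh + thresh / 10;
}

/* One domain (global or memcg) is within its boosted limit. */
static bool under_limit(const struct dirty_sample *s)
{
	return s->nr_unstable_nfs + s->nr_writeback <=
	       boosted(s->dirty_thresh);
}

/*
 * True when the throttle loop may stop waiting: the global domain must
 * be under its limit and, if a memcg sample is supplied, so must the
 * memcg. Passing NULL skips the per-cgroup check.
 */
static bool may_stop_throttling(const struct dirty_sample *global,
				const struct dirty_sample *memcg)
{
	return under_limit(global) && (memcg == NULL || under_limit(memcg));
}
```

With a global sample of { 1000, 100, 900 } the global test passes (1000 <= 1100), so a call with memcg == NULL returns true. Adding a memcg sample of { 100, 0, 120 } makes the call return false, because 120 exceeds the boosted memcg limit of 110; in the patch that case leads to another congestion_wait().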