From: Greg Thelen <gthelen@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
containers@lists.osdl.org, linux-fsdevel@vger.kernel.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Balbir Singh <bsingharora@gmail.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Minchan Kim <minchan.kim@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Wu Fengguang <fengguang.wu@intel.com>,
Dave Chinner <david@fromorbit.com>,
Vivek Goyal <vgoyal@redhat.com>,
Andrea Righi <andrea@betterlinux.com>,
Ciju Rajan K <ciju@linux.vnet.ibm.com>,
David Rientjes <rientjes@google.com>,
Greg Thelen <gthelen@google.com>
Subject: [PATCH v9 01/13] memcg: document cgroup dirty memory interfaces
Date: Wed, 17 Aug 2011 09:14:53 -0700
Message-ID: <1313597705-6093-2-git-send-email-gthelen@google.com>
In-Reply-To: <1313597705-6093-1-git-send-email-gthelen@google.com>
Document cgroup dirty memory interfaces and statistics.
The implementation of these new interfaces comes in the following patches
of this series.
Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
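Illustrative note, not part of the patch: the documentation added below
describes ratio/bytes counterpart pairs in which writing one file makes the
other read back as 0. A minimal Python sketch of that behaviour follows. The
names DirtyKnob, parse_size, and threshold are hypothetical and do not come
from the kernel, and treating the k/m/g suffixes as binary multiples is an
assumption based on how memparse() behaves elsewhere in the kernel.

```python
from dataclasses import dataclass

# Assumed binary multiples, mirroring the kernel's memparse() convention.
_SUFFIX = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_size(text: str) -> int:
    """Parse a byte count with an optional k/K, m/M, or g/G suffix."""
    text = text.strip()
    mult = _SUFFIX.get(text[-1:].lower(), 1)
    digits = text[:-1] if mult != 1 else text
    return int(digits) * mult


@dataclass
class DirtyKnob:
    """One counterpart pair, e.g. dirty_ratio and dirty_limit_in_bytes."""
    ratio: int = 0        # percentage of cgroup memory; reads 0 if bytes is set
    limit_bytes: int = 0  # absolute limit; reads 0 if ratio is set

    def set_ratio(self, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("ratio must be a percentage")
        # Writing the ratio clears the byte limit.
        self.ratio, self.limit_bytes = percent, 0

    def set_bytes(self, text: str) -> None:
        # Writing the byte limit clears the ratio.
        self.ratio, self.limit_bytes = 0, parse_size(text)

    def threshold(self, cgroup_memory: int) -> int:
        """Dirty-byte threshold implied by whichever value is currently set."""
        if self.limit_bytes:
            return self.limit_bytes
        return cgroup_memory * self.ratio // 100
```

For example, after set_ratio(10) the byte limit reads 0, and after
set_bytes("64M") the ratio reads 0 and the threshold is the byte value,
regardless of cgroup size.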
Documentation/cgroups/memory.txt | 70 ++++++++++++++++++++++++++++++++++++++
1 files changed, 70 insertions(+), 0 deletions(-)
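Illustrative note, not part of the patch: the first-touch accounting rule
described in section 5.7 below can be modelled in a few lines of Python. This
is a simplified sketch under stated assumptions, not the kernel
implementation: Cgroup, Page, and dirty_page are hypothetical names, pages are
assumed to be 4096 bytes, and system-wide limits and hierarchy are ignored.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class Cgroup:
    def __init__(self, name: str, dirty_limit: int) -> None:
        self.name = name
        self.dirty_limit = dirty_limit  # bytes
        self.dirty = 0                  # bytes currently counted as dirty

    def over_dirty_limit(self) -> bool:
        return self.dirty > self.dirty_limit


class Page:
    def __init__(self, first_toucher: Cgroup) -> None:
        # The first cgroup to touch the page is charged for it.
        self.owner = first_toucher


def dirty_page(page: Page, dirtier: Cgroup) -> bool:
    """Dirty a page from a task in `dirtier`; return True if that task
    would be throttled in this simplified model."""
    # The dirty count goes to the originally charged cgroup.
    page.owner.dirty += PAGE_SIZE
    # The dirtying task is checked against its own cgroup's limit only.
    return dirtier.over_dirty_limit()
```

With two pages first touched by cgroup A and a one-page dirty limit on both
cgroups, dirtying both pages from a cgroup B task leaves A over its limit
while B's dirty count stays at 0 and the B task is never throttled.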
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 6f3c598..5fd6ab8 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -389,6 +389,10 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem)
pgpgin - # of pages paged in (equivalent to # of charging events).
pgpgout - # of pages paged out (equivalent to # of uncharging events).
swap - # of bytes of swap usage
+dirty - # of bytes that are waiting to get written back to the disk.
+writeback - # of bytes that are actively being written back to the disk.
+nfs_unstable - # of bytes sent to the NFS server, but not yet committed to
+ the actual storage.
inactive_anon - # of bytes of anonymous memory and swap cache memory on
LRU list.
active_anon - # of bytes of anonymous and swap cache memory on active
@@ -410,6 +414,9 @@ total_mapped_file - sum of all children's "cache"
total_pgpgin - sum of all children's "pgpgin"
total_pgpgout - sum of all children's "pgpgout"
total_swap - sum of all children's "swap"
+total_dirty - sum of all children's "dirty"
+total_writeback - sum of all children's "writeback"
+total_nfs_unstable - sum of all children's "nfs_unstable"
total_inactive_anon - sum of all children's "inactive_anon"
total_active_anon - sum of all children's "active_anon"
total_inactive_file - sum of all children's "inactive_file"
@@ -567,6 +574,69 @@ unevictable=<total anon pages> N0=<node 0 pages> N1=<node 1 pages> ...
And we have total = file + anon + unevictable.
+5.7 dirty memory
+
+Control the maximum amount of dirty pages a cgroup can have at any given time.
+
+Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim)
+page cache used by a cgroup. So, in case of multiple cgroup writers, they will
+not be able to consume more than their designated share of dirty pages and will
+be throttled if they cross that limit. System-wide dirty limits are also
+consulted. Dirty memory consumption is checked against both system-wide and
+per-cgroup dirty limits.
+
+The interface is similar to the procfs interface: /proc/sys/vm/dirty_*. It is
+possible to configure a limit to trigger throttling of a dirtier or queue
+background writeback. The root cgroup memory.dirty_* control files are
+read-only and match the contents of the /proc/sys/vm/dirty_* files.
+
+Per-cgroup dirty limits can be set using the following files in the cgroupfs:
+
+- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of
+ cgroup memory) at which a process generating dirty pages will be throttled.
+ The default value is the system-wide dirty ratio, /proc/sys/vm/dirty_ratio.
+
+- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes)
+ in the cgroup at which a process generating dirty pages will be throttled.
+ A suffix (k, K, m, M, g, or G) indicates that the value is in kilo-, mega-,
+ or gigabytes. The default value is the system-wide dirty limit,
+ /proc/sys/vm/dirty_bytes.
+
+ Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio.
+ Only one may be specified at a time. When one is written it is immediately
+ taken into account to evaluate the dirty memory limits and the other appears
+ as 0 when read.
+
+- memory.dirty_background_ratio: the amount of dirty memory of the cgroup
+ (expressed as a percentage of cgroup memory) at which background writeback
+ kernel threads will start writing out dirty data. The default value is the
+ system-wide background dirty ratio, /proc/sys/vm/dirty_background_ratio.
+
+- memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed
+ in bytes) in the cgroup at which background writeback kernel threads will
+ start writing out dirty data. A suffix (k, K, m, M, g, or G) indicates that
+ the value is in kilo-, mega-, or gigabytes. The default value is the
+ system-wide dirty background limit, /proc/sys/vm/dirty_background_bytes.
+
+ Note: memory.dirty_background_limit_in_bytes is the counterpart of
+ memory.dirty_background_ratio. Only one may be specified at a time. When one
+ is written it is immediately taken into account to evaluate the dirty memory
+ limits and the other appears as 0 when read.
+
+A cgroup may contain more dirty memory than its dirty limit. This is possible
+because the first cgroup to touch a page is charged for it, and subsequent
+page counting events (dirty, writeback, nfs_unstable) are also counted against
+the originally charged cgroup. Example: if a page is allocated by a cgroup A
+task, then the page is charged to cgroup A. If the page is later dirtied by a
+task in cgroup B, then the cgroup A dirty count is incremented. If cgroup A is
+at its dirty limit but cgroup B is not, then dirtying a cgroup A page from a
+cgroup B task pushes cgroup A over its dirty limit without throttling the
+dirtying cgroup B task.
+
+When use_hierarchy=0, each cgroup has independent dirty memory usage and limits.
+When use_hierarchy=1, the dirty limits of parent cgroups are also checked to
+ensure that no dirty limit is exceeded.
+
6. Hierarchy support
The memory controller supports a deep hierarchy and hierarchical accounting.
--
1.7.3.1