From: Parav Pandit <pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org,
hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org,
dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org,
haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
Cc: corbet-T1hC0tSOHrs@public.gmane.org,
james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Subject: [PATCHv7 3/3] rdmacg: Added documentation for rdmacg
Date: Sun, 28 Feb 2016 19:43:41 +0530 [thread overview]
Message-ID: <1456668821-25799-4-git-send-email-pandit.parav@gmail.com> (raw)
In-Reply-To: <1456668821-25799-1-git-send-email-pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Added documentation for v1 and v2 version describing high
level design and usage examples on using rdma controller.
Signed-off-by: Parav Pandit <pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
Documentation/cgroup-v1/rdma.txt | 111 +++++++++++++++++++++++++++++++++++++++
Documentation/cgroup-v2.txt | 33 ++++++++++++
2 files changed, 144 insertions(+)
create mode 100644 Documentation/cgroup-v1/rdma.txt
diff --git a/Documentation/cgroup-v1/rdma.txt b/Documentation/cgroup-v1/rdma.txt
new file mode 100644
index 0000000..1973502
--- /dev/null
+++ b/Documentation/cgroup-v1/rdma.txt
@@ -0,0 +1,111 @@
+ RDMA Controller
+ ----------------
+
+Contents
+--------
+
+1. Overview
+ 1-1. What is RDMA controller?
+ 1-2. Why RDMA controller needed?
+ 1-3. How is RDMA controller implemented?
+2. Usage Examples
+
+1. Overview
+
+1-1. What is RDMA controller?
+-----------------------------
+
+RDMA controller allows user to limit RDMA/IB specific resources
+that a given set of processes can use. These processes are grouped using
+RDMA controller.
+
+RDMA controller allows operating on resources defined by the IB stack
+which are mainly IB verb resources and in future hardware specific
+well defined resources.
+
+1-2. Why RDMA controller needed?
+--------------------------------
+
+Currently user space applications can easily take away all the rdma device
+specific resources such as AH, CQ, QP, MR etc. Due to which other applications
+in other cgroup or kernel space ULPs may not even get chance to allocate any
+rdma resources. This leads to service unavailability.
+
+Therefore RDMA controller is needed through which resource consumption
+of processes can be limited. Through this controller various different rdma
+resources described by IB stack can be accounted.
+
+1-3. How is RDMA controller implemented?
+----------------------------------------
+
+RDMA cgroup allows limit configuration of resources. These resources are not
+defined by the rdma controller. Instead they are defined by the IB stack.
+This provides great flexibility to allow IB stack to define new resources,
+without any changes to rdma cgroup.
+Rdma cgroup maintains resource accounting per cgroup, per device using
+resource pool structure. Each such resource pool is limited up to
+64 resources in given resource pool by rdma cgroup, which can be extended
+later if required.
+
+This resource pool object is linked to the cgroup css. Typically there
+are 0 to 4 resource pool instances per cgroup, per device in most use cases.
+But nothing limits to have it more. At present hundreds of RDMA devices per
+single cgroup may not be handled optimally, however there is no
+known use case for such configuration either.
+
+Since RDMA resources can be allocated from any process and can be freed by any
+of the child processes which shares the address space, rdma resources are
+always owned by the creator cgroup css. This allows process migration from one
+to other cgroup without major complexity of transferring resource ownership;
+because such ownership is not really present due to shared nature of
+rdma resources. Linking resources around css also ensures that cgroups can be
+deleted after processes migrated. This allow progress migration as well with
+active resources, even though that’s not the primary use case.
+
+Whenever RDMA resource charing occurs, owner rdma cgroup is returned to
+the caller. Same rdma cgroup should be passed while uncharging the resource.
+This also allows process migrated with active RDMA resource to charge
+to new owner cgroup for new resource. It also allows to uncharge resource of
+a process from previously charged cgroup which is migrated to new cgroup,
+even though that is not a primary use case.
+
+Resource pool object is created in following situations.
+(a) User sets the limit and no previous resource pool exist for the device
+of interest for the cgroup.
+(b) No resource limits were configured, but IB/RDMA stack tries to
+charge the resource. So that it correctly uncharge them when applications are
+running without limits and later on when limits are enforced during uncharging,
+otherwise usage count will drop to negative.
+
+Resource pool is destroyed if it all the resource limits are set to max
+and it is the last resource getting deallocated.
+
+User should set all the limit to max value if it intents to remove/unconfigure
+the resource pool for a particular device.
+
+IB stack honors limits enforced by the rdma controller. When application
+query about maximum resource limits of IB device, it returns minimum of
+what is configured by user for a given cgroup and what is supported by
+IB device.
+
+2. Usage Examples
+-----------------
+
+(a) Configure resource limit:
+echo mlx4_0 mr=100 qp=10 ah=2 > /sys/fs/cgroup/rdma/1/rdma.max
+echo ocrdma1 mr=120 qp=20 cq=10 > /sys/fs/cgroup/rdma/2/rdma.max
+
+(b) Query resource limit:
+cat /sys/fs/cgroup/rdma/2/rdma.max
+#Output:
+mlx4_0 mr=100 qp=10 ah=2 pd=max
+ocrdma1 mr=120 qp=20 cq=10 pd=max ah=max
+
+(c) Query current usage:
+cat /sys/fs/cgroup/rdma/2/rdma.current
+#Output:
+mlx4_0 mr=95 qp=8 ah=2
+ocrdma1 mr=0 qp=20 cq=10
+
+(d) Delete resource limit:
+echo mlx4_0 mr=max qp=max ah=max > /sys/fs/cgroup/rdma/1/rdma.max
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index ff49cf9..0ec4605 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -47,6 +47,8 @@ CONTENTS
5-3. IO
5-3-1. IO Interface Files
5-3-2. Writeback
+ 5-4. RDMA
+ 5-4-1. RDMA Interface Files
P. Information on Kernel Programming
P-1. Filesystem Support for Writeback
D. Deprecated v1 Core Features
@@ -1088,6 +1090,37 @@ writeback as follows.
total available memory and applied the same way as
vm.dirty[_background]_ratio.
+5-4. RDMA
+
+The "rdma" controller regulates the distribution of RDMA resources.
+This controller implements resource accounting of resources defined
+by IB stack.
+
+5-4-1. RDMA Interface Files
+
+ rdma.max
+ A readwrite file that exists for all the cgroups except root that
+ describes current configured resource limit for a RDMA/IB device.
+
+ Lines are keyed by device name and are not ordered.
+ Each line contains space separated resource name and its configured
+ limit that can be distributed.
+
+ An example for mlx4 and ocrdma device follows.
+
+ mlx4_0 ah=2 mr=1000 qp=104
+ ocrdma1 cq=10 mr=900 qp=89
+ mlx4_1 uctx=max ah=max pd=max cq=max qp=max
+
+ rdma.current
+ A read-only file that describes current resource usage.
+ It exists for all the cgroup except root.
+
+ An example for mlx4 and ocrdma device follows.
+
+ mlx4_0 mr=1000 qp=102 ah=2 flow=10 srq=0
+ ocrdma1 mr=900 qp=79 cq=10 flow=0 srq=0
+ mlx4_1 uctx=max ah=max pd=max cq=max qp=max
P. Information on Kernel Programming
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-02-28 14:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-28 14:13 [PATCHv7 0/3] rdmacg: IB/core: rdma controller support Parav Pandit
2016-02-28 14:13 ` [PATCHv7 1/3] rdmacg: Added rdma cgroup controller Parav Pandit
[not found] ` <1456668821-25799-2-git-send-email-pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-03-01 8:26 ` Haggai Eran
[not found] ` <1456668821-25799-1-git-send-email-pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-02-28 14:13 ` [PATCHv7 2/3] IB/core: added support to use " Parav Pandit
2016-03-01 9:08 ` Haggai Eran
[not found] ` <56D55C06.8090102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-01 9:23 ` Parav Pandit
2016-03-01 9:12 ` Haggai Eran
2016-03-01 9:22 ` Parav Pandit
[not found] ` <CAG53R5UrM7WAuamCEynDRKV0YMJsJUe=SEYfTrueQD0rtx71bA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-01 11:06 ` Haggai Eran
2016-03-01 13:43 ` Parav Pandit
2016-02-28 14:13 ` Parav Pandit [this message]
2016-03-01 11:01 ` [PATCHv7 3/3] rdmacg: Added documentation for rdmacg Haggai Eran
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1456668821-25799-4-git-send-email-pandit.parav@gmail.com \
--to=pandit.parav-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
--cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
--cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=corbet-T1hC0tSOHrs@public.gmane.org \
--cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
--cc=james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
--cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
--cc=linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
--cc=matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
--cc=serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org \
--cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).