Subject: Re: RFC rdma cgroup
From: Haggai Eran
To: Parav Pandit, Tejun Heo, "Doug Ledford", "Hefty, Sean",
    "linux-rdma@vger.kernel.org", "cgroups@vger.kernel.org", Liran Liss,
    "linux-kernel@vger.kernel.org", "lizefan@huawei.com", Johannes Weiner,
    Jonathan Corbet, "james.l.morris@oracle.com", "serge@hallyn.com",
    Or Gerlitz, Matan Barak, "raindel@mellanox.com",
    "akpm@linux-foundation.org", "linux-security-module@vger.kernel.org",
    Jason Gunthorpe
Date: Thu, 29 Oct 2015 16:57:27 +0200
Message-ID: <563233D7.90808@mellanox.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 28/10/2015 10:29, Parav Pandit wrote:
> 3. Resources are not defined by the RDMA cgroup. Resources are defined
> by RDMA/IB subsystem and optionally by HCA vendor device drivers.
> Rationale: This allows rdma cgroup to remain constant while RDMA/IB
> subsystem can evolve without the need of rdma cgroup update. A new
> resource can be easily added by the RDMA/IB subsystem without touching
> rdma cgroup.

Resources exposed by the cgroup are basically a UAPI, so we have to be
careful to keep them stable as they evolve. I understand the need for
vendor-specific resources, following the discussion on the previous
proposal, but could you describe how you plan to allow this set of
resources to evolve?

> 8. Typically each RDMA cgroup will have 0 to 4 RDMA devices. Therefore
> each cgroup will have 0 to 4 verbs resource pool and optionally 0 to 4
> hw resource pool per such device.
> (Nothing stops to have more devices and pools, but design is around
> this use case).

In what way does the design depend on this assumption?

> 9. Resource pool object is created in following situations.
> (a) administrative operation is done to set the limit and no previous
> resource pool exist for the device of interest for the cgroup.
> (b) no resource limits were configured, but IB/RDMA subsystem tries to
> charge the resource. so that when applications are running without
> limits and later on when limits are enforced, during uncharging, it
> correctly uncharges them, otherwise usage count will drop to negative.
> This is done using default resource pool.
> Instead of implementing any sort of time markers, default pool
> simplifies the design.

Having a default resource pool kind of implies there is a non-default
one. Is the only difference between the two that the non-default pool
was created by an administrative operation and has specified limits, or
is there some other difference?

> (c) When process migrate from one to other cgroup, resource is
> continue to be owned by the creator cgroup (rather css).
> After process migration, whenever new resource is created in new
> cgroup, it will be owned by new cgroup.

This sounds a little different from how other cgroups behave. I agree
that processes will mostly create the resources in their own cgroup and
won't migrate, but why not move the charge during migration?

Finally, I wanted to ask about other limitations an RDMA cgroup could
handle. It would be great to be able to restrict a container to only a
subset of the MAC/VLAN pairs programmed to a device, or only a subset of
the P_Keys and GIDs it has. Do you see such limitations as part of this
cgroup as well?

Thanks,
Haggai
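
P.S. To check that I read items 9(a)-(c) correctly, here is a minimal
userspace sketch of the behaviour as I understand it. It is not taken
from the patches: the struct and function names (pool, get_pool,
set_limit, charge, uncharge), the single counter per pool, and the fixed
four-entry device array are all assumptions made only for illustration.

```c
/* Userspace model only; every name here is hypothetical, not from the RFC. */
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_DEVICES 4          /* the "0 to 4 devices" case from item 8 */

struct pool {
    int in_use;                /* a pool exists for this device */
    int is_default;            /* created by a charge, not by the admin */
    int limit;                 /* INT_MAX stands for "no limit configured" */
    int usage;                 /* resources currently charged */
};

struct cgroup {
    struct pool pools[MAX_DEVICES];
};

/* Find the pool for a device (dev < MAX_DEVICES), creating a default,
 * unlimited pool if none exists yet. */
static struct pool *get_pool(struct cgroup *cg, int dev)
{
    struct pool *p = &cg->pools[dev];

    if (!p->in_use) {
        p->in_use = 1;
        p->is_default = 1;
        p->limit = INT_MAX;
        p->usage = 0;
    }
    return p;
}

/* Case (a): an administrative limit makes the pool a non-default one.
 * Usage accumulated while the pool was a default pool is kept. */
static void set_limit(struct cgroup *cg, int dev, int limit)
{
    struct pool *p = get_pool(cg, dev);

    p->is_default = 0;
    p->limit = limit;
}

/* Case (b): a charge always lands in some pool, so nothing is lost if
 * limits are configured later. Returns the owning cgroup on success,
 * or NULL if the charge would exceed the limit. */
static struct cgroup *charge(struct cgroup *cg, int dev)
{
    struct pool *p = get_pool(cg, dev);

    if (p->usage >= p->limit)
        return NULL;
    p->usage++;
    return cg;
}

/* Case (c): the uncharge goes to the creator cgroup recorded at charge
 * time, even if the process has since migrated elsewhere. */
static void uncharge(struct cgroup *owner, int dev)
{
    assert(owner->pools[dev].usage > 0);   /* must never go negative */
    owner->pools[dev].usage--;
}
```

In this model the default and non-default pools differ only in the
is_default flag and the limit value, which is what my first question
above is asking about. Moving the charge on migration would instead mean
decrementing the old cgroup's usage and incrementing the new one's at
migration time, rather than keeping the owner pointer with the resource.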