From mboxrd@z Thu Jan 1 00:00:00 1970
From: Glauber Costa
Subject: Re: [PATCH v3 13/28] slub: create duplicate cache
Date: Tue, 29 May 2012 23:40:02 +0400
Message-ID: <4FC52612.5060006@parallels.com>
References: <1337951028-3427-1-git-send-email-glommer@parallels.com> <1337951028-3427-14-git-send-email-glommer@parallels.com> <4FC4F1A7.2010206@parallels.com> <4FC501E9.60607@parallels.com> <4FC506E6.8030108@parallels.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-ID:
Content-Type: text/plain; charset="us-ascii"; format="flowed"
To: Christoph Lameter
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org, Tejun Heo , Li Zefan , Greg Thelen , Suleiman Souhlal , Michal Hocko , Johannes Weiner , devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org, David Rientjes , Pekka Enberg

On 05/29/2012 11:26 PM, Christoph Lameter wrote:
> On Tue, 29 May 2012, Glauber Costa wrote:
>
>> But we really need a page to be filled with objects from the same cgroup, and
>> the non-shared objects to be accounted to the right place.
>
> No other subsystem has such a requirement. Even the NUMA nodes are mostly
> suggestions and can be ignored by the allocators to use memory from other
> pages.

Of course it does. Memcg itself has such a requirement. The collective set of processes needs to have the pages it uses accounted to it, and never go over its limit.

>> Otherwise, I don't think we can meet even the lighter of isolation guarantees.
>
> The approach works just fine with NUMA and cpusets. Isolation is mostly
> done on the per node boundaries and you already have per node statistics.

I don't know about cpusets in detail, but at least with NUMA, this is not an apples-to-apples comparison. A NUMA node is not meant to contain you.
A container is, and that is why it is called a container. NUMA is just about which node is the *best* one to put my memory on. Now, if you actually say, through your syscalls, "this is the node it should live in", then you have a constraint that, to the best of my knowledge, is respected.

Now, isolation here is done at the container boundary (cgroups, to be generic).
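
To make the preference-versus-constraint distinction concrete, here is a minimal userspace sketch in C. It is a toy model only, not kernel code and not what the patch does: every name in it (toy_group, toy_charge, toy_alloc_preferred) is hypothetical and invented for illustration. The first function treats the limit as a hard constraint, the way a container limit is meant to behave; the second treats the node as a mere preference and quietly falls back to another node.

```c
#include <stddef.h>

/* Hypothetical toy model of a container with a page limit. */
struct toy_group {
    long limit_pages;   /* hard limit for the whole group */
    long usage_pages;   /* pages currently charged to the group */
};

/*
 * Hard constraint: the charge is refused rather than letting the
 * group go over its limit. Returns 0 on success, -1 on refusal.
 */
static int toy_charge(struct toy_group *g, long pages)
{
    if (pages < 0 || pages > g->limit_pages - g->usage_pages)
        return -1;
    g->usage_pages += pages;
    return 0;
}

/*
 * Preference: try the preferred node first, then fall back to any
 * node that still has room. Returns the node used, or -1 if no
 * node can satisfy the request.
 */
static int toy_alloc_preferred(long *free_pages, size_t nr_nodes,
                               size_t preferred, long pages)
{
    if (preferred < nr_nodes && free_pages[preferred] >= pages) {
        free_pages[preferred] -= pages;
        return (int)preferred;
    }
    for (size_t i = 0; i < nr_nodes; i++) {
        if (free_pages[i] >= pages) {
            free_pages[i] -= pages;
            return (int)i;
        }
    }
    return -1;
}
```

With a group limited to 4 pages, charging 3 pages succeeds and a further charge of 2 is refused, leaving usage at 3. With two nodes holding 1 and 8 free pages, a 2-page request that prefers node 0 is served from node 1 instead. The first behaviour is what isolation needs; the second is what a placement hint gives you.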