From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tejun Heo
Subject: Re: [RFC] cgroup: removing css reference drain wait during cgroup removal
Date: Tue, 13 Mar 2012 15:16:47 -0700
Message-ID: <20120313221647.GG7349@google.com>
References: <20120312213155.GE23255@google.com> <20120313214526.GG19584@count0.beaverton.ibm.com> <20120313220551.GF7349@google.com>
In-Reply-To: <20120313220551.GF7349-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Matt Helsley
Cc: Jens Axboe, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org, Hugh Dickins, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Michal Hocko, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Vivek Goyal, Johannes Weiner, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

(fixed up mailing list addresses)

On Tue, Mar 13, 2012 at 03:05:51PM -0700, Tejun Heo wrote:
> Hey, Matt.
>
> On Tue, Mar 13, 2012 at 02:45:26PM -0700, Matt Helsley wrote:
> > If you want to spend your time doing archaeology there are some old threads
> > that touch on this idea (roughly around 2003-2005). One point against the
> > idea that I distinctly recall:
> >
> > Somewhat like configfs, object lifetimes in cgroups are determined
> > primarily by the user whereas sysfs object lifetimes are primarily
> > determined by the kernel. I think the closest we come to user-determined
> > objects in sysfs occur through debugfs, and module loading/unloading.
> > However those involve mount/umount and modprobe/rmmod rather than
> > mkdir/rmdir to create and remove the objects.
>
> The thing is that sysfs itself has been almost completely rewritten
> since that time to 1. decouple internal representation from vfs
> objects and 2. provide proper isolation between the userland and
> kernel code exposing data through sysfs.
>
> #1 began mostly due to the large size of dentries and inodes but, with
> the benefit of hindsight, I think it just was a bad idea to piggyback
> on vfs objects for object life-cycle management and locking for stuff
> which is wholly described in memory with simplistic locking.
>
> #2 was necessary to avoid hanging device detach due to open sysfs file
> from userland. sysfs now has a notion of "active access" encompassing
> only each show/store op invocation and it only guarantees that the
> associated device doesn't go away while active accesses are in
> progress.
>
> The sysfs heritage is almost recognizable and unfortunately almost the
> same set of problems (nobody wants show/store ops to be called on
> unlinked css waiting for references to be drained). As refactoring
> and sharing sysfs won't be a trivial task, my plan is to first augment
> cgroupfs as necessary with the longer-term goal of converging and later
> sharing the same code with sysfs.

Sorry, forgot to reply to the userland-determined object
creation/deletion part.
I don't think there are direct creation cases in sysfs but there is
plenty of deletion going on, especially the kind where a file requests
to delete its parent directly (*/device/delete). While using
mkdir/rmdir indeed is different for cgroupfs, I don't think that would
make much of a difference. Both calls are essentially unused by sysfs
currently and there's nothing preventing the addition of callbacks
there.

Thanks.

--
tejun
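
For readers following the "active access" point in the quoted text, here is a rough userspace sketch of the idea: each show/store invocation takes a short-lived active reference, and removal flips the counter negative so that new accesses are refused while the in-flight ones drain. This is an illustration under stated assumptions, not the sysfs or cgroup implementation; `struct node`, `node_get_active()`, `node_put_active()`, `node_deactivate()`, `node_drained()`, `node_show()` and `DEACTIVATED_BIAS` are all hypothetical names, and a real remover would sleep until drained rather than poll.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bias: adding it to a non-negative count makes it negative. */
#define DEACTIVATED_BIAS INT_MIN

/* Hypothetical node: active >= 0 means live, with that many ops in flight. */
struct node {
	atomic_int active;
};

/* Take an active reference; fails once the node has been deactivated. */
static bool node_get_active(struct node *n)
{
	int v = atomic_load(&n->active);

	while (v >= 0) {
		/* On failure v is reloaded and the liveness check is repeated. */
		if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
			return true;
	}
	return false;
}

/* Drop an active reference taken by node_get_active(). */
static void node_put_active(struct node *n)
{
	int old = atomic_fetch_sub(&n->active, 1);

	/* There must have been at least one reference in flight. */
	assert(old != 0 && old != DEACTIVATED_BIAS);
}

/* Refuse new accesses; in-flight ones still finish. Call once per node. */
static void node_deactivate(struct node *n)
{
	atomic_fetch_add(&n->active, DEACTIVATED_BIAS);
}

/* True once the node is deactivated and no access is in flight. */
static bool node_drained(struct node *n)
{
	return atomic_load(&n->active) == DEACTIVATED_BIAS;
}

/* A show op guarded by an active reference; -1 if the node is going away. */
static int node_show(struct node *n, char *buf, size_t len)
{
	int ret;

	if (!node_get_active(n))
		return -1;
	ret = snprintf(buf, len, "value\n");
	node_put_active(n);
	return ret;
}
```

The point of the design is that an open file pins nothing: only the window between get and put delays removal, so a reader holding a file descriptor open cannot hang the remover.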