From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tejun Heo
Subject: Re: [RFC] cgroup: removing css reference drain wait during cgroup removal
Date: Tue, 13 Mar 2012 15:16:47 -0700
Message-ID: <20120313221647.GG7349@google.com>
References: <20120312213155.GE23255@google.com>
        <20120313214526.GG19584@count0.beaverton.ibm.com>
        <20120313220551.GF7349@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
        h=sender:date:from:to:cc:subject:message-id:references:mime-version
        :content-type:content-disposition:in-reply-to:user-agent;
        bh=JyzvesLFEcFMu7jF7m2uAY/sJvKp1OuzkuEsH/PgkRg=;
        b=R5+FvUCaitGxzpixMwyoiLZt4mjfzUrZU1honlBsk3FENWRnt6FmTWNnR4062fPhO7
        fjuY4FLDk2btd+uwXFN1qwQqpSdLG82608kCScaOlTAzsLBYo5rRd76WVT+/J+G4AQpv
        5Dxxn/7Q+u6H0WpeoTMfYKMgLEyhiikAHiWi43Q/V2vgTJDcZlVjMA0poF6z3AK3ewKz
        +RyN8KXdXiNT8ikXiqmkkAfjGpMrP8REhKNT8QMSBWqYoL1spmrZ8qGJCViP/eBqDZBx
        oRZc+fKkzzdT5bt3RtHZwbd/9hyyducTnAn+Pw5hTEBIil6EwiTZU3DZoJmQz8R/I3fu
        s7qA==
Content-Disposition: inline
In-Reply-To: <20120313220551.GF7349-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Matt Helsley
Cc: Jens Axboe,
        containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
        Hugh Dickins,
        linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
        Michal Hocko,
        linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
        Vivek Goyal,
        Johannes Weiner,
        cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

(fixed up mailing list addresses)

On Tue, Mar 13, 2012 at 03:05:51PM -0700, Tejun Heo wrote:
> Hey, Matt.
>
> On Tue, Mar 13, 2012 at 02:45:26PM -0700, Matt Helsley wrote:
> > If you want to spend your time doing archaeology there are some old threads
> > that touch on this idea (roughly around 2003-2005). One point against the
> > idea that I distinctly recall:
> >
> > Somewhat like configfs, object lifetimes in cgroups are determined
> > primarily by the user whereas sysfs object lifetimes are primarily
> > determined by the kernel. I think the closest we come to user-determined
> > objects in sysfs occur through debugfs, and module loading/unloading.
> > However those involve mount/umount and modprobe/rmmod rather than
> > mkdir/rmdir to create and remove the objects.
>
> The thing is that sysfs itself has been almost completely rewritten
> since that time to 1. decouple internal representation from vfs
> objects and 2. provide proper isolation between the userland and
> kernel code exposing data through sysfs.
>
> #1 began mostly due to the large size of dentries and inodes but, with
> the benefit of hindsight, I think it just was a bad idea to piggyback
> on vfs objects for object life-cycle management and locking for stuff
> which is wholely described in memory with simplistic locking.
>
> #2 was necessary to avoid hanging device detach due to open sysfs file
> from userland. sysfs now has notion of "active access" encompassing
> only each show/store op invocation and it only guarantees that the
> associated device doesn't go away while active accesses are in
> progress.
>
> The sysfs heritage is almost recognizable and unfortunately almost the
> same set of problems (nobody wants show/store ops to be called on
> unlinked css waiting for references to be drained). As refactoring
> and sharing sysfs won't be a trivial task, my plan is to first augment
> cgroupfs as necessary with longer term goal of converging and later
> sharing the same code with sysfs.

Sorry, forgot to reply to the userland-determined object
creation/deletion part.
I don't think there are direct creation cases in sysfs but there are
plenty of deletions going on, especially the kind where a file requests
to delete its parent directly (*/device/delete). While using
mkdir/rmdir indeed is different for cgroupfs, I don't think that would
make too much of a difference. Both calls are essentially unused by
sysfs currently and there's nothing preventing addition of callbacks
there.

Thanks.

--
tejun