From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: [RFC] Storing cgroup id in page->private (Was: Re: [RFC] [PATCH 0/6] Provide cgroup isolation for buffered writes.)
Date: Thu, 10 Mar 2011 16:15:39 -0500
Message-ID: <1299791640-sup-1874@think>
References: <1299619256-12661-1-git-send-email-teravest@google.com> <20110309142237.6ab82523.kamezawa.hiroyu@jp.fujitsu.com> <20110310181529.GF29464@redhat.com> <20110310191115.GG29464@redhat.com> <20110310194106.GH29464@redhat.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Justin TerAvest, KAMEZAWA Hiroyuki, m-ikeda, jaxboe, linux-kernel, ryov, taka, "righi.andrea", guijianfeng, balbir, ctalbott, nauman, mrubin, linux-fsdevel
To: Vivek Goyal
Return-path:
In-reply-to: <20110310194106.GH29464@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Excerpts from Vivek Goyal's message of 2011-03-10 14:41:06 -0500:
> On Thu, Mar 10, 2011 at 02:11:15PM -0500, Vivek Goyal wrote:
> > On Thu, Mar 10, 2011 at 10:57:52AM -0800, Justin TerAvest wrote:
> > > On Thu, Mar 10, 2011 at 10:15 AM, Vivek Goyal wrote:
> > > > On Thu, Mar 10, 2011 at 10:08:03AM -0800, Justin TerAvest wrote:
> > > >
> > > > [..]
> > > >> > I don't like to increase size of page_cgroup but I think you can record
> > > >> > information without increasing size of page_cgroup.
> > > >> >
> > > >> > A) As Andrea did, encode it to pc->flags.
> > > >> >   But I'm afraid that there is a racy case because memory cgroup uses some
> > > >> >   test_and_set() bits.
> > > >> > B) I wonder why the information cannot be recorded in page->private.
> > > >> >   When page has buffers, you can record the information to buffer struct.
> > > >> >   About swapio (if you take care of), you can record information to bio.
> > > >>
> > > >> Hi Kame,
> > > >>
> > > >> I'm concerned that by using something like buffer_heads stored in
> > > >> page->private, we will only be supported on some filesystems and not
> > > >> others. In addition, I'm not sure if all filesystems attach buffer
> > > >> heads at the same time; if page->private is modified in the flusher
> > > >> thread, we might not be able to determine the thread that dirtied the
> > > >> page in the first place.
> > > >
> > > > I think the person who dirtied the page can store the information in
> > > > page->private (assuming buffer heads were not generated) and if flusher
> > > > thread later ends up generating buffer heads and ends up modifying
> > > > page->private, this can be copied in buffer heads?
> > >
> > > This scares me a bit.
> > >
> > > As I understand it, fs/ code expects total ownership of page->private.
> > > This adds a responsibility for every user to copy the data through and
> > > store it in the buffer head (or anything else). btrfs seems to do
> > > something entirely different in some cases and store a different kind
> > > of value.
> >
> > If filesystems are using page->private for some other purpose also, then
> > I guess we have issues.
> >
> > I am ccing linux-fsdevel to have some feedback on the idea of trying
> > to store cgroup id of page dirtying thread in page->private and/or buffer
> > head for tracking which group originally dirtied the page in IO controller
> > during writeback.
>
> A quick "grep" showed that btrfs, ceph and logfs are using page->private
> for other purposes also.
>
> I was under the impression that either page->private is null or it
> points to buffer heads for the writeback case. So storing the info
> directly in either buffer head directly or first in page->private and
> then transferring it to buffer heads would have helped.

Right, btrfs has its own uses for page->private, and we expect to own it.
With a proper callback, the FS could store the extra information you
need in our own structs.

-chris