From: Vivek Goyal <vgoyal@redhat.com>
To: Jonathan Corbet <corbet@lwn.net>
Cc: Andrea Righi <arighi@develer.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Greg Thelen <gthelen@google.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Ryo Tsuruta <ryov@valinux.co.jp>,
	Hirokazu Takahashi <taka@valinux.co.jp>,
	Jens Axboe <axboe@kernel.dk>,
	Andrew Morton <akpm@linux-foundation.org>,
	containers@lists.linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] page_cgroup: make page tracking available for blkio
Date: Tue, 22 Feb 2011 16:57:20 -0500
Message-ID: <20110222215720.GK28269@redhat.com>
In-Reply-To: <20110222130145.37cb151e@bike.lwn.net>

On Tue, Feb 22, 2011 at 01:01:45PM -0700, Jonathan Corbet wrote:
> On Tue, 22 Feb 2011 18:12:54 +0100
> Andrea Righi <arighi@develer.com> wrote:
> 
> > The page_cgroup infrastructure, currently available only for the memory
> > cgroup controller, can be used to store the owner of each page and
> > opportunely track the writeback IO. This information is encoded in
> > the upper 16-bits of the page_cgroup->flags.
> > 
> > An owner can be identified using a generic ID number and the following
> > interfaces are provided to store and retrieve this information:
> > 
> >   unsigned long page_cgroup_get_owner(struct page *page);
> >   int page_cgroup_set_owner(struct page *page, unsigned long id);
> >   int page_cgroup_copy_owner(struct page *npage, struct page *opage);
> 
> My immediate observation is that you're not really tracking the "owner"
> here - you're tracking an opaque 16-bit token known only to the block
> controller in a field which - if changed by anybody other than the block
> controller - will lead to mayhem in the block controller.  I think it
> might be clearer - and safer - to say "blkcg" or some such instead of
> "owner" here.
> 
> I'm tempted to say it might be better to just add a pointer to your
> throtl_grp structure into struct page_cgroup.

throtl_grp might not even be present when the page is being dirtied. When this
IO is actually submitted to the device, we might end up creating a new
throtl_grp. I guess the other concern here would be increasing the size of
the page_cgroup structure.

I guess you meant storing a pointer to blkio_cgroup, along the lines of
storing a pointer to mem_cgroup. That also means an extra 8 bytes, and only
one subsystem can use it at a time. So using the upper bits of pc->flags
is probably better.
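
To make the "upper bits of pc->flags" idea concrete, here is a minimal
user-space sketch of packing a 16-bit ID into the top of a flags word.
It is not the code from the patch: struct pc_sketch, the PC_OWNER_*
macros and the pc_*_owner() helpers are hypothetical names, and the
updates are plain read-modify-write, whereas a real implementation would
have to update the word atomically with respect to the other flag bits.

```c
#include <limits.h>

/* Hypothetical layout: low bits keep the state flags, top 16 bits keep the ID. */
#define PC_OWNER_BITS   16
#define PC_OWNER_SHIFT  (sizeof(unsigned long) * CHAR_BIT - PC_OWNER_BITS)
#define PC_OWNER_MASK   (((1UL << PC_OWNER_BITS) - 1) << PC_OWNER_SHIFT)

/* Stand-in for struct page_cgroup; only the flags word matters here. */
struct pc_sketch {
	unsigned long flags;
};

/* Extract the ID stored in the upper bits. */
static unsigned long pc_get_owner(const struct pc_sketch *pc)
{
	return (pc->flags & PC_OWNER_MASK) >> PC_OWNER_SHIFT;
}

/* Store an ID without disturbing the low flag bits; reject IDs wider than 16 bits. */
static int pc_set_owner(struct pc_sketch *pc, unsigned long id)
{
	if (id >> PC_OWNER_BITS)
		return -1;
	pc->flags = (pc->flags & ~PC_OWNER_MASK) | (id << PC_OWNER_SHIFT);
	return 0;
}

/* Copy only the ID bits from the old descriptor to the new one. */
static void pc_copy_owner(struct pc_sketch *npc, const struct pc_sketch *opc)
{
	npc->flags = (npc->flags & ~PC_OWNER_MASK) | (opc->flags & PC_OWNER_MASK);
}
```

The point of the sketch is that the ID costs no extra space in the
structure, at the price of a 16-bit limit and a single user of those bits.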

> Or maybe replace the
> mem_cgroup pointer with a single pointer to struct css_set.  Both of
> those ideas, though, probably just add unwanted extra overhead now to gain
> generality which may or may not be wanted in the future.

This sounds interesting. IIUC, this single pointer will allow all
the subsystems to retrieve their respective cgroups
without actually co-mounting them.
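
As a rough illustration of that idea, the sketch below keeps one pointer
per page descriptor to a set that holds one state pointer per subsystem.
It is only loosely modeled on struct css_set; every name in it
(subsys_id_sketch, subsys_state_sketch, css_set_sketch, pc_cset_sketch,
page_subsys_state) is hypothetical, and reference counting and locking
are left out entirely.

```c
#include <stddef.h>

/* Hypothetical subsystem IDs; the real list is generated by the kernel build. */
enum subsys_id_sketch { SUBSYS_MEMORY, SUBSYS_BLKIO, SUBSYS_COUNT };

/* Stand-in for the per-subsystem cgroup state. */
struct subsys_state_sketch {
	int group_id;
};

/* One set groups the state of every subsystem for a given task context. */
struct css_set_sketch {
	struct subsys_state_sketch *subsys[SUBSYS_COUNT];
};

/* Page descriptor carrying a single pointer instead of one per controller. */
struct pc_cset_sketch {
	unsigned long flags;
	const struct css_set_sketch *cset;
};

/* Each controller looks up its own state through the shared pointer. */
static const struct subsys_state_sketch *
page_subsys_state(const struct pc_cset_sketch *pc, enum subsys_id_sketch id)
{
	if (!pc->cset || id >= SUBSYS_COUNT)
		return NULL;
	return pc->cset->subsys[id];
}
```

The lookup costs one extra pointer dereference compared with storing the
controller's pointer directly, which is the overhead in question.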

I am not sure how much work is involved in making it happen. Also not sure
about the overhead involved in traversing one extra pointer. Also, apart
from the blkio controller, have we practically felt the need for this info in
any other controller (network controller?)? A few days back we were
experimenting with controlling block IO bandwidth over NFS with
the help of the network controller, but it did not really work well. There was
a host of issues, one of them being the loss of context information.

If storing a css_set pointer is a lot of work, maybe for the time being
we can go with hardcoding that these bits are used exclusively
by the blkio controller, and once some other controller wants to share them,
then look for ways to do the sharing.

Thanks
Vivek
