Date: Thu, 18 Sep 2008 09:55:13 -0400
From: Vivek Goyal
To: Andrea Righi
Cc: Hirokazu Takahashi, randy.dunlap@oracle.com, menage@google.com,
	chlunde@ping.uio.no, dpshah@google.com, eric.rannaud@gmail.com,
	balbir@linux.vnet.ibm.com, fernando@oss.ntt.co.jp,
	akpm@linux-foundation.org, agk@sourceware.org,
	subrata@linux.vnet.ibm.com, axboe@kernel.dk, m.innocenti@cineca.it,
	containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	dave@linux.vnet.ibm.com, matt@bluehost.com, roberto@unbit.it,
	ngupta@google.com
Subject: Re: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)
Message-ID: <20080918135513.GE20640@redhat.com>
References: <1219853257-11052-1-git-send-email-righi.andrea@gmail.com>
	<20080917.161811.27257227.taka@valinux.co.jp>
	<48D0C43A.2010102@gmail.com>
In-Reply-To: <48D0C43A.2010102@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 17, 2008 at 10:47:54AM +0200, Andrea Righi wrote:
> Hirokazu Takahashi wrote:
> > Hi,
> >
> >> TODO:
> >>
> >> * Try to push down the throttling and implement it directly in the I/O
> >>   schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
> >>   to keep track of the right cgroup context.
> >>   This approach could lead to more memory consumption and increases the
> >>   number of dirty pages (hard/slow to reclaim pages) in the system, since
> >>   dirty-page ratio in memory is not limited. This could even lead to
> >>   potential OOM conditions, but these problems can be resolved directly
> >>   into the memory cgroup subsystem
> >>
> >> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
> >>   generated by kswapd; try to use the page_cgroup functionality of the memory
> >>   cgroup controller to track this kind of I/O and charge the right cgroup when
> >>   pages are swapped in/out
> >
> > FYI, this also can be done with bio-cgroup, which determine the owner cgroup
> > of a given anonymous page.
> >
> > Thanks,
> > Hirokazu Takahashi
>
> That would be great! FYI here is how I would like to proceed:
>
> - today I'll post a new version of my cgroup-io-throttle patch rebased
>   to 2.6.27-rc5-mm1 (it's well tested and seems to be stable enough).
>   To keep the things light and simpler I've implemented custom
>   get_cgroup_from_page() / put_cgroup_from_page() in the memory
>   controller to retrieve the owner of a page, holding a reference to the
>   corresponding memcg, during async writes in submit_bio(); this is not
>   probably the best way to proceed, and a more generic framework like
>   bio-cgroup sounds better, but it seems to work quite well. The only
>   problem I've found is that during swap_writepage() the page is not
>   assigned to any page_cgroup (page_get_page_cgroup() returns NULL), and
>   so I'm not able to charge the cost of this I/O operation to the right
>   cgroup. Does bio-cgroup address or even resolve this issue?
> - begin to implement a new branch of cgroup-io-throttle on top of
>   bio-cgroup
> - also start to implement an additional request queue to provide first a
>   control at the cgroup level and a dispatcher to pass the request to
>   the elevator (as suggested by Vivek)
>

Hi Andrea,

So if we maintain an rb-tree per request queue and implement the cgroup
rules there, then that will also take care of io-throttling. (One can
control the release of bios/requests to the elevator based on any kind of
rule: proportional weight or max-bandwidth.)

If that's the case, I was wondering what you mean by "begin to implement a
new branch of cgroup-io-throttle on top of bio-cgroup".

Thanks
Vivek
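
To make the per-request-queue idea above a bit more concrete, here is a
rough userspace sketch in Python. It is not kernel code and not taken from
any posted patch. Everything in it is an assumption made for illustration:
the CgroupDispatcher name, the submit()/dispatch() methods, and the
virtual-time rule for proportional weight are all hypothetical. A binary
heap (heapq) stands in for the rb-tree, since both only need to hand back
the entry with the smallest key.

```python
import heapq
from collections import deque


class CgroupDispatcher:
    """Toy model of one request queue; a heap stands in for the rb-tree."""

    def __init__(self, weights):
        # Hypothetical setup: cgroup name -> proportional weight.
        self.weights = dict(weights)
        self.pending = {g: deque() for g in self.weights}
        self.vtime = {g: 0.0 for g in self.weights}
        # Ordered set of backlogged cgroups, keyed by virtual time.
        self.tree = []

    def submit(self, cgroup, nbytes):
        """Queue a request; add the cgroup to the tree if it was idle."""
        was_idle = not self.pending[cgroup]
        self.pending[cgroup].append(nbytes)
        if was_idle:
            heapq.heappush(self.tree, (self.vtime[cgroup], cgroup))

    def dispatch(self):
        """Release one request towards the elevator, or None if idle."""
        if not self.tree:
            return None
        # Pick the backlogged cgroup that has received the least service.
        _, cgroup = heapq.heappop(self.tree)
        nbytes = self.pending[cgroup].popleft()
        # Charge the service, scaled down by the cgroup's weight.
        self.vtime[cgroup] += nbytes / self.weights[cgroup]
        if self.pending[cgroup]:
            heapq.heappush(self.tree, (self.vtime[cgroup], cgroup))
        return cgroup, nbytes
```

With two cgroups weighted 2 and 1, both kept backlogged with equal-sized
requests, dispatch() releases their requests in a 2:1 ratio. A
max-bandwidth rule could presumably be modelled the same way by changing
what the key means (for example, the earliest time a cgroup is allowed to
dispatch again), which is the sense in which one structure could cover
both kinds of rules.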