From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1754593AbYIROiV (ORCPT ); Thu, 18 Sep 2008 10:38:21 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S1755920AbYIROiH (ORCPT ); Thu, 18 Sep 2008 10:38:07 -0400
Received: from rv-out-0506.google.com ([209.85.198.237]:64094
    "EHLO rv-out-0506.google.com" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1755931AbYIROiF (ORCPT );
    Thu, 18 Sep 2008 10:38:05 -0400
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
    h=message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject
    :references:in-reply-to:x-enigmail-version:content-type
    :content-transfer-encoding;
    b=xKUjdOO2fBEIO1KSKpubib9Q9JRDgl8npDITRwrW/S0P7a8+N0mOj//8QalyueamAA
    XAkR4tKzZCVCQHIfiJmL3ni80RMU8JqwywGjVmrKdV6PxULSJZ82eEXgHXUSwTnMeZC+
    Jq/o8qB+2ntJPqrVsyCOWjGzmDm/l9QbKz7HE=
Message-ID: <48D267C4.2060906@gmail.com>
Date: Thu, 18 Sep 2008 16:37:56 +0200
From: Andrea Righi
Reply-To: righi.andrea@gmail.com
User-Agent: Thunderbird 2.0.0.16 (X11/20080724)
MIME-Version: 1.0
To: Hirokazu Takahashi
CC: kamezawa.hiroyu@jp.fujitsu.com, balbir@linux.vnet.ibm.com,
    menage@google.com, agk@sourceware.org, akpm@linux-foundation.org,
    axboe@kernel.dk, baramsori72@gmail.com, chlunde@ping.uio.no,
    dave@linux.vnet.ibm.com, dpshah@google.com, eric.rannaud@gmail.com,
    fernando@oss.ntt.co.jp, lizf@cn.fujitsu.com, m.innocenti@cineca.it,
    matt@bluehost.com, ngupta@google.com, randy.dunlap@oracle.com,
    roberto@unbit.it, ryov@valinux.co.jp, s-uchida@ap.jp.nec.com,
    subrata@linux.vnet.ibm.com, yoshikawa.takuya@oss.ntt.co.jp,
    containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)
References: <1219853257-11052-1-git-send-email-righi.andrea@gmail.com>
    <20080917.161811.27257227.taka@valinux.co.jp>
    <48D0C43A.2010102@gmail.com>
    <20080918.202416.120249186.taka@valinux.co.jp>
In-Reply-To: <20080918.202416.120249186.taka@valinux.co.jp>
X-Enigmail-Version: 0.95.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hirokazu Takahashi wrote:
> Hi,
>
>>> Hi,
>>>
>>>> TODO:
>>>>
>>>> * Try to push down the throttling and implement it directly in the I/O
>>>>   schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
>>>>   to keep track of the right cgroup context. This approach could lead to more
>>>>   memory consumption and increases the number of dirty pages (hard/slow to
>>>>   reclaim pages) in the system, since dirty-page ratio in memory is not
>>>>   limited. This could even lead to potential OOM conditions, but these problems
>>>>   can be resolved directly into the memory cgroup subsystem
>>>>
>>>> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
>>>>   generated by kswapd; try to use the page_cgroup functionality of the memory
>>>>   cgroup controller to track this kind of I/O and charge the right cgroup when
>>>>   pages are swapped in/out
>>> FYI, this also can be done with bio-cgroup, which determine the owner cgroup
>>> of a given anonymous page.
>>>
>>> Thanks,
>>> Hirokazu Takahashi
>> That would be great! FYI here is how I would like to proceed:
>>
>> - today I'll post a new version of my cgroup-io-throttle patch rebased
>>   to 2.6.27-rc5-mm1 (it's well tested and seems to be stable enough).
>>   To keep the things light and simpler I've implemented custom
>>   get_cgroup_from_page() / put_cgroup_from_page() in the memory
>>   controller to retrieve the owner of a page, holding a reference to the
>>   corresponding memcg, during async writes in submit_bio(); this is not
>>   probably the best way to proceed, and a more generic framework like
>>   bio-cgroup sounds better, but it seems to work quite well.
>>   The only problem I've found is that during swap_writepage() the page is not
>>   assigned to any page_cgroup (page_get_page_cgroup() returns NULL), and
>
> This behavior depends on the version of memory-cgroup.
> In the previous version, pages in the swap cache were owned by one of
> the cgroups.
>
> Kamezawa-san, one of the implementer, told me he got this feature off
> temporarily and he was going to turn it on again. I think this
> workaround is chosen because the current implementation of memory
> cgroup has a weak point under memory pressure.
>
>>   so I'm not able to charge the cost of this I/O operation to the right
>>   cgroup. Does bio-cgroup address or even resolve this issue?
>
> Bio-cgroup can't support pages in the swap cache temporarily with the
> current linux kernel either since it shares the same infrastructure
> with memory-cgroup.
>
> Now, they have just started to rewrite the infrastructure to track pages
> with page_cgroup, which is going to give us good performance ever.
> After that I'm going to enhance bio-cgroup more, such as dirty page
> tracking. To tell the truth, I already have dirty pages tracking patch
> for the current linux in my hand, which isn't posted yet. I'm going to
> port it on the new infrastructure.
>
> If memory cgroup team change their mind, I will implement swap-pages
> tracking in bio-cgroup.

Very good! In any case, it seems I'll get the tracking of swap-pages from
someone else, so I don't have to change or implement anything in my
io-throttle patchset. :)

I'll start to use bio-cgroup in io-throttle ASAP and do some tests. I'll
keep you informed.

Thanks,
-Andrea
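
For readers following the thread, here is a minimal userspace sketch of the
get/put pattern discussed above: look up the owner of a page, hold a
reference while the async write is charged, then drop it. It is not code
from the io-throttle patchset or from the kernel. The struct layouts, the
model_ prefix, the function signatures, and the return convention are all
assumptions made for illustration; only the get/put naming idea and the
"no owner during swap_writepage()" case come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not the kernel's memcg structure. */
struct cgroup_model {
    const char *name;
    int refcount;           /* references held by in-flight lookups */
    unsigned long charged;  /* bytes of I/O charged to this group */
};

/* Hypothetical stand-in for a page. owner == NULL models the case where
 * the page has no page_cgroup, as reported for swap_writepage(). */
struct page_model {
    struct cgroup_model *owner;
};

/* Return the owner with a reference taken, or NULL if the page has none. */
struct cgroup_model *model_get_cgroup_from_page(struct page_model *page)
{
    struct cgroup_model *cg = page->owner;

    if (cg)
        cg->refcount++;
    return cg;
}

/* Drop a reference taken by model_get_cgroup_from_page(); NULL is a no-op. */
void model_put_cgroup_from_page(struct cgroup_model *cg)
{
    if (!cg)
        return;
    assert(cg->refcount > 0);
    cg->refcount--;
}

/* Charge an async write to the page owner.
 * Returns 1 if the I/O was charged, 0 if no owner could be found. */
int model_charge_async_write(struct page_model *page, unsigned long bytes)
{
    struct cgroup_model *cg = model_get_cgroup_from_page(page);

    if (!cg)
        return 0;  /* unowned page: this I/O stays unaccounted */
    cg->charged += bytes;
    model_put_cgroup_from_page(cg);
    return 1;
}
```

In this model, a page with an owner is charged and the reference count
returns to its previous value, while a page with a NULL owner is reported
as unaccounted. That second path is the gap that swap-page tracking in the
memory cgroup or bio-cgroup would have to close.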