From: Dave Chinner
Subject: Re: [Lsf] IO less throttling and cgroup aware writeback
Date: Fri, 8 Apr 2011 09:42:49 +1000
Message-ID: <20110407234249.GE30279@dastard>
In-Reply-To: <20110407192424.GE27778@redhat.com>
References: <20110331222756.GC2904@dastard> <20110401171838.GD20986@redhat.com> <20110401214947.GE6957@dastard> <20110405131359.GA14239@redhat.com> <20110405225639.GB31057@dastard> <20110406153954.GB18777@redhat.com> <20110406233602.GK31057@dastard> <20110407192424.GE27778@redhat.com>
To: Vivek Goyal
Cc: Greg Thelen, Curt Wohlgemuth, James Bottomley, lsf@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org

On Thu, Apr 07, 2011 at 03:24:24PM -0400, Vivek Goyal wrote:
> On Thu, Apr 07, 2011 at 09:36:02AM +1000, Dave Chinner wrote:
[...]
> > > When I_DIRTY is cleared, remove inode from bdi_memcg->b_dirty.  Delete bdi_memcg
> > > if the list is now empty.
> > >
> > > balance_dirty_pages() calls mem_cgroup_balance_dirty_pages(memcg, bdi)
> > >    if over bg limit, then
> > >        set bdi_memcg->b_over_limit
> > >            If there is no bdi_memcg (because all inodes of current’s
> > >            memcg dirty pages were first dirtied by other memcg) then
> > >            memcg lru to find inode and call writeback_single_inode().
> > >            This is to handle uncommon sharing.
> >
> > We don't want to introduce any new IO sources into
> > balance_dirty_pages(). This needs to trigger memcg-LRU based bdi
> > flusher writeback, not try to write back inodes itself.
>
> Will we not enjoy more sequential IO traffic once we find an inode by
> traversing memcg->lru list? So isn't that better than pure LRU based
> flushing?

Sorry, I wasn't particularly clear there. What I meant was that we
ask the bdi-flusher thread to select the inode to write back from
the LRU, not do it directly from balance_dirty_pages(). i.e. bdp
stays IO-less.

> > Alternatively, this problem won't exist if you transfer page cache
> > state from one memcg to another when you move the inode from one
> > memcg to another.
>
> But in case of shared inode problem still remains. inode is being written
> from two cgroups and it can't be in both the groups as per the existing
> design.

But we've already determined that there is no use case for this
shared inode behaviour, so we aren't going to explicitly support it,
right?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
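
For readers following the thread, the IO-less arrangement discussed above can be modelled as a toy userspace sketch. This is not kernel code and not anyone's actual patch: the `Memcg` and `Flusher` classes, the `kick()` and `run()` methods, and the `bg_limit` field are hypothetical names invented for illustration, and the `balance_dirty_pages()` function here only borrows the kernel function's name. The one point it tries to show is that the dirtying path merely flags the memcg and wakes the flusher, while the flusher alone picks inodes from the memcg LRU and writes them back.

```python
from collections import OrderedDict, deque


class Memcg:
    """Hypothetical stand-in for one memcg's dirty state on one bdi."""

    def __init__(self, name, bg_limit):
        self.name = name
        self.bg_limit = bg_limit   # background dirty threshold, in pages
        self.lru = OrderedDict()   # inode number -> dirty pages, oldest first
        self.over_limit = False    # models the b_over_limit flag

    def dirty_pages(self):
        return sum(self.lru.values())


class Flusher:
    """Hypothetical stand-in for the bdi flusher: the only place IO is issued."""

    def __init__(self):
        self.queue = deque()       # memcgs waiting for writeback
        self.written = []          # inode numbers, in writeback order

    def kick(self, memcg):
        # Queue the memcg once; repeated kicks are coalesced.
        if memcg not in self.queue:
            self.queue.append(memcg)

    def run(self):
        while self.queue:
            memcg = self.queue.popleft()
            # Write back oldest-dirtied inodes until the memcg is within its limit.
            while memcg.lru and memcg.dirty_pages() > memcg.bg_limit:
                ino, _pages = memcg.lru.popitem(last=False)
                self.written.append(ino)
            memcg.over_limit = False


def balance_dirty_pages(memcg, flusher, ino, pages):
    """Account newly dirtied pages; never issues IO itself."""
    memcg.lru[ino] = memcg.lru.get(ino, 0) + pages
    if memcg.dirty_pages() > memcg.bg_limit:
        memcg.over_limit = True
        flusher.kick(memcg)
```

In this model, dirtying three inodes of 4 pages each against a `bg_limit` of 8 leaves `flusher.written` empty until `flusher.run()` is called; only then is the oldest inode written back, which brings the memcg back to its limit.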