From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vivek Goyal
Subject: Re: [Lsf] IO less throttling and cgroup aware writeback (Was: Re: Preliminary Agenda and Activities for LSF)
Date: Tue, 19 Apr 2011 10:34:23 -0400
Message-ID: <20110419143423.GC31712@redhat.com>
References: <20110331222756.GC2904@dastard>
	<20110401171838.GD20986@redhat.com>
	<20110401214947.GE6957@dastard>
	<20110405131359.GA14239@redhat.com>
	<20110405225639.GB31057@dastard>
	<20110406153715.GA18777@redhat.com>
	<20110406235039.GL31057@dastard>
	<20110407175537.GD27778@redhat.com>
	<20110411013630.GM30279@dastard>
	<20110419141717.GA26482@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Dave Chinner, Greg Thelen, James Bottomley,
	lsf@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org
To: Wu Fengguang
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:59911 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752610Ab1DSOe3 (ORCPT ); Tue, 19 Apr 2011 10:34:29 -0400
Content-Disposition: inline
In-Reply-To: <20110419141717.GA26482@localhost>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, Apr 19, 2011 at 10:17:17PM +0800, Wu Fengguang wrote:
> [snip]
> > > > > For throttling case, apart from metadata, I found that with simple
> > > > > throttling of data I ran into issues with journalling with ext4 mounted
> > > > > in ordered mode. So it was suggested that WRITE IO throttling should
> > > > > not be done at device level; instead, try to do it in higher layers,
> > > > > possibly balance_dirty_pages(), and throttle the process early.
> > > >
> > > > The problem with doing it at the page cache entry level is that
> > > > cache hits then get throttled. It's not really an IO controller at
> > > > that point, and the impact on application performance could be huge
> > > > (i.e. MB/s instead of GB/s).
> > >
> > > Agreed that throttling cache hits is not a good idea. Can we determine
> > > if the page being asked for is in cache or not and charge for IO accordingly?
> >
> > You'd need hooks in find_or_create_page(), though you have no
> > context of whether a read or a write is in progress at that point.
>
> I'm confused. Where is the throttling at cache hits?
>
> The balance_dirty_pages() throttling kicks in at write() syscall and
> page fault time. For example, generic_perform_write(), do_wp_page()
> and __do_fault() will explicitly call
> balance_dirty_pages_ratelimited() to do the write throttling.

This comment was made in the context of what happens if we also move the
block IO controller's read throttling into higher layers. In that case we
don't want to throttle reads that are already in the page cache.

Currently the throttling hook is in generic_make_request(), so it kicks in
only if the data is not present in the page cache and actual disk IO is
initiated.

Thanks
Vivek
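
To make the distinction above concrete, here is a minimal user-space sketch
in C. It is not kernel code: toy_cache, toy_throttle, toy_submit_read() and
toy_read_page() are hypothetical names invented for illustration. It only
models the behaviour described above, where a read is charged when real IO
is submitted (in the kernel that hook sits in generic_make_request()) and a
page cache hit never reaches the throttle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8 /* hypothetical toy cache size */

/* Toy stand-in for the page cache: one flag per page index. */
struct toy_cache {
    bool present[NR_PAGES];
};

/* Toy stand-in for a group's read budget accounting. */
struct toy_throttle {
    unsigned long charged; /* pages charged so far */
};

/*
 * Models the IO submission path. Only requests that reach this point
 * are charged, mirroring a hook placed at submission time.
 */
static void toy_submit_read(struct toy_throttle *t, struct toy_cache *c,
                            size_t idx)
{
    t->charged++;           /* real disk IO is charged */
    c->present[idx] = true; /* page is cached after the read */
}

/*
 * Reads one page. Returns true on a cache hit, false on a miss.
 * A hit returns before the submission path, so it is never charged.
 */
static bool toy_read_page(struct toy_throttle *t, struct toy_cache *c,
                          size_t idx)
{
    assert(idx < NR_PAGES);
    if (c->present[idx])
        return true;
    toy_submit_read(t, c, idx);
    return false;
}
```

Reading the same page twice charges the group once. If the charge were
moved up into toy_read_page() before the cache check, the second read would
be charged as well, which is the cache-hit throttling concern raised
earlier in the thread.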