From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wu Fengguang
Subject: Re: [PATCH 1/2] writeback: Improve busyloop prevention
Date: Sat, 22 Oct 2011 12:20:19 +0800
Message-ID: <20111022042019.GA10287@localhost>
References: <20111013201835.GD27363@quack.suse.cz>
 <20111014160047.GA13330@localhost>
 <20111014162807.GA4617@localhost>
 <20111018005128.GI4528@quack.suse.cz>
 <20111018143504.GA17818@localhost>
 <20111019115630.GA22266@quack.suse.cz>
 <20111020120909.GA8193@localhost>
 <20111020123300.GA12317@localhost>
 <20111020133938.GA18058@localhost>
 <20111020222616.GA20542@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "linux-fsdevel@vger.kernel.org", Christoph Hellwig, Dave Chinner
To: Jan Kara
Return-path:
Received: from mga14.intel.com ([143.182.124.37]:32599 "EHLO mga14.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752013Ab1JVEWN
 (ORCPT ); Sat, 22 Oct 2011 00:22:13 -0400
Content-Disposition: inline
In-Reply-To: <20111020222616.GA20542@quack.suse.cz>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, Oct 21, 2011 at 06:26:16AM +0800, Jan Kara wrote:
> On Thu 20-10-11 21:39:38, Wu Fengguang wrote:
> > On Thu, Oct 20, 2011 at 08:33:00PM +0800, Wu Fengguang wrote:
> > > On Thu, Oct 20, 2011 at 08:09:09PM +0800, Wu Fengguang wrote:
> > > > Jan,
> > > >
> > > > I tried the below combined patch over the ioless one, and find some
> > > > minor regressions. I studied the thresh=1G/ext3-1dd case in particular
> > > > and find that nr_writeback and the iostat avgrq-sz drops from time to time.
> > > >
> > > > I'll try to bisect the changeset.
> >
> > This is interesting, the culprit is found to be patch 1, which is
> > simply
> >                 if (work->for_kupdate) {
> >                         oldest_jif = jiffies -
> >                                 msecs_to_jiffies(dirty_expire_interval * 10);
> > -                       work->older_than_this = &oldest_jif;
> > -               }
> > +               } else if (work->for_background)
> > +                       oldest_jif = jiffies;
>   Yeah. I had a look into the trace and you can notice that during the
> whole dd run, we were running a single background writeback work (you can
> verify that by work->nr_pages decreasing steadily).

Yes, it is.

> Without refreshing
> oldest_jif, we'd write block device inode for /dev/sda (you can identify
> that by bdi=8:0, ino=0) only once. When refreshing oldest_jif, we write it
> every 5 seconds (kjournald dirties the device inode after committing a
> transaction by dirtying metadata buffers which were just committed and can
> now be checkpointed either by kjournald or flusher thread).

OK, now I understand the regular drops of nr_writeback and avgrq-sz:
on every 5s, it takes _some time_ to write inode 0, during which the
flusher is blocked and the IO queue runs low.

>   So although the performance is slightly reduced, I'd say that the
> behavior is a desired one.

OK. However it's sad to see the flusher get blocked from time to time...

> Also if you observed the performance on a really long run, the difference
> should get smaller because eventually, kjournald has to flush the metadata
> blocks when the journal fills up and we need to free some journal space and
> at that point flushing is even more expensive because we have to do a
> blocking write during which all transaction operations, thus effectively
> the whole filesystem, are blocked.

OK. The dd test time was 300s, I'll increase it to 900s (cannot do
more because it's a 90GB disk partition).

Thanks,
Fengguang
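
For illustration only, the effect discussed above (device inode written
once vs. every 5 seconds) can be reproduced with a small userspace model.
This is a sketch under stated assumptions, not the kernel code: struct
sim_inode, count_dev_inode_writes() and the one-second time step are all
hypothetical stand-ins, "oldest" plays the role of oldest_jif, and the
only rule modelled is "write an inode if it was dirtied at or before the
cutoff".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode's dirty state; not a kernel struct. */
struct sim_inode {
	bool dirty;
	long dirtied_when;	/* second at which the inode was last dirtied */
};

/*
 * Model one long-running background writeback work of run_secs seconds.
 * The device inode is re-dirtied every commit_interval seconds (standing
 * in for kjournald commits).  Returns how many times it gets written.
 *
 * refresh_cutoff == false: the cutoff is sampled once at work start.
 * refresh_cutoff == true:  the cutoff is refreshed on every pass,
 *                          like "oldest_jif = jiffies" in the patch.
 */
static int count_dev_inode_writes(long run_secs, long commit_interval,
				  bool refresh_cutoff)
{
	struct sim_inode dev = { .dirty = false, .dirtied_when = 0 };
	long oldest = 0;	/* cutoff; models oldest_jif */
	int writes = 0;

	assert(commit_interval > 0);

	for (long now = 0; now < run_secs; now++) {
		/* A journal commit dirties the device inode again. */
		if (now % commit_interval == 0 && !dev.dirty) {
			dev.dirty = true;
			dev.dirtied_when = now;
		}

		if (refresh_cutoff)
			oldest = now;

		/* Only inodes dirtied at or before the cutoff are written. */
		if (dev.dirty && dev.dirtied_when <= oldest) {
			dev.dirty = false;
			writes++;
		}
	}
	return writes;
}
```

With a 300 second run and a 5 second commit interval, the model writes
the device inode once when the cutoff is fixed and 60 times when it is
refreshed on every pass. It says nothing about how long each write
blocks the flusher; that cost is what shows up as the nr_writeback and
avgrq-sz drops in the real traces.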