From: Jan Kara <jack@suse.cz>
To: Chris Mason <chris.mason@oracle.com>
Cc: Jan Kara <jack@suse.cz>, Vivek Goyal <vgoyal@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>, xfs <xfs@oss.sgi.com>,
	axboe <axboe@kernel.dk>
Subject: Re: buffered writeback torture program
Date: Thu, 21 Apr 2011 22:44:34 +0200
Message-ID: <20110421204434.GF4476@quack.suse.cz>
In-Reply-To: <1303404971-sup-7825@think>

On Thu 21-04-11 12:57:17, Chris Mason wrote:
> Excerpts from Jan Kara's message of 2011-04-21 12:55:29 -0400:
> > On Thu 21-04-11 11:25:41, Chris Mason wrote:
> > > Excerpts from Chris Mason's message of 2011-04-21 07:09:11 -0400:
> > > > Excerpts from Vivek Goyal's message of 2011-04-20 18:06:26 -0400:
> > > > > > 
> > > > > > In this case the 128s spent in write was on a single 4K overwrite on a
> > > > > > 4K file.
> > > > > 
> > > > > Chris, You seem to be doing 1MB (32768*32) writes on fsync file instead of 4K.
> > > > > I changed the size to 4K still not much difference though.
> > > > 
> > > > Whoops, I had that change made locally but didn't get it copied out.
> > > > 
> > > > > 
> > > > > Once the program has exited because of high write time, i restarted it and
> > > > > this time I don't see high write times.
> > > > 
> > > > I see this for some of my runs as well.
> > > > 
> > > > > 
> > > > > First run
> > > > > ---------
> > > > > # ./a.out 
> > > > > setting up random write file
> > > > > done setting up random write file
> > > > > starting fsync run
> > > > > starting random io!
> > > > > write time: 0.0006s fsync time: 0.3400s
> > > > > write time: 63.3270s fsync time: 0.3760s
> > > > > run done 2 fsyncs total, killing random writer
> > > > > 
> > > > > Second run
> > > > > ----------
> > > > > # ./a.out 
> > > > > starting fsync run
> > > > > starting random io!
> > > > > write time: 0.0006s fsync time: 0.5359s
> > > > > write time: 0.0007s fsync time: 0.3559s
> > > > > write time: 0.0009s fsync time: 0.3113s
> > > > > write time: 0.0008s fsync time: 0.4336s
> > > > > write time: 0.0009s fsync time: 0.3780s
> > > > > write time: 0.0008s fsync time: 0.3114s
> > > > > write time: 0.0009s fsync time: 0.3225s
> > > > > write time: 0.0009s fsync time: 0.3891s
> > > > > write time: 0.0009s fsync time: 0.4336s
> > > > > write time: 0.0009s fsync time: 0.4225s
> > > > > write time: 0.0009s fsync time: 0.4114s
> > > > > write time: 0.0007s fsync time: 0.4004s
> > > > > 
> > > > > > Not sure why that would happen.
> > > > > 
> > > > > > I am wondering why the pwrite/fsync process was throttled. It did not have
> > > > > > any pages in the page cache and it shouldn't have hit the task dirty limits.
> > > > > > Does that mean the per-task dirty limit logic does not work, or am I
> > > > > > completely missing the root cause of the problem?
> > > > 
> > > > I haven't traced it to see.  This test box only has 1GB of ram, so the
> > > > dirty ratios can be very tight.
> > > 
> > > Oh, I see now.  The test program first creates the file with a big
> > > streaming write.  So the task doing the streaming writes gets nailed
> > > with the per-task dirty accounting because it is making a ton of dirty
> > > data.
> > > 
> > > Then the task forks the random writer to do all the random IO.
> > > 
> > > Then the original pid goes back to do the fsyncs on the new file.
> > > 
> > > So, in the original run, we get stuffed into balance_dirty_pages because
> > > the per-task limits show we've done a lot of dirties.
> > > 
> > > In all later runs, the file already exists, so our fsyncing process
> > > hasn't done much dirtying at all.  Looks like the VM is doing something
> > > sane, we just get nailed with big random IO.
> >   Ok, so there isn't a problem with fsync() as such, if I understand it
> > right. We just block tasks in balance_dirty_pages() for a *long* time
> > because it takes a long time to write out that dirty IO, and we make it even
> > worse by trying to write out more on behalf of the throttled task. Am I
> > right? The IO-less throttling will solve this regardless of which patchset
> > we choose, so I wouldn't be too worried about the problem now.
> 
> You're right.  With one small exception, we probably do want to rotor
> out of the random buffered writes in hopes of finding some sequential IO
> even with the IO-less dirty throttling.
  The flusher thread should do this - it writes at most 1024 pages (as much
as a ->writepages call does) from the big file, then puts the inode on the
b_more_io queue and goes on to the next dirty inode. So if there is some
sequential IO as well, we should get to it...
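
  To illustrate the idea, here is a userspace toy model of that round-robin
behaviour. It is only a sketch, not the kernel code: struct toy_inode,
toy_flush_pass() and toy_passes_until_clean() are made-up names, the array
stands in for the dirty inode queues, and the only figure taken from above is
the 1024-page cap per inode per pass.

```c
#include <assert.h>
#include <stddef.h>

/* Per-inode cap for one pass; the 1024-page figure mentioned above. */
#define TOY_CHUNK_PAGES 1024

/* Hypothetical stand-in for a dirty inode: just a name and a page count. */
struct toy_inode {
	const char *name;
	long dirty_pages;
};

/*
 * One pass over all queued inodes: write at most TOY_CHUNK_PAGES from each
 * one, then move on to the next. An inode that still has dirty pages simply
 * stays in the array, which plays the role of requeueing it on b_more_io.
 * Returns the number of inodes that are still dirty after the pass.
 */
static size_t toy_flush_pass(struct toy_inode *inodes, size_t n)
{
	size_t still_dirty = 0;

	for (size_t i = 0; i < n; i++) {
		long chunk = inodes[i].dirty_pages;

		assert(chunk >= 0);
		if (chunk > TOY_CHUNK_PAGES)
			chunk = TOY_CHUNK_PAGES;
		inodes[i].dirty_pages -= chunk;	/* "write out" the chunk */
		if (inodes[i].dirty_pages > 0)
			still_dirty++;
	}
	return still_dirty;
}

/* Number of passes needed until inodes[target] has no dirty pages left. */
static long toy_passes_until_clean(struct toy_inode *inodes, size_t n,
				   size_t target)
{
	long passes = 0;

	assert(target < n);
	while (inodes[target].dirty_pages > 0) {
		toy_flush_pass(inodes, n);
		passes++;
	}
	return passes;
}
```

  With a 262144-page (1GB) randomly dirtied file and a 2048-page sequential
file queued together, toy_passes_until_clean() returns 2 for the small file,
and by then only 2048 pages of the big file have been written. The point of
the cap is that the big inode cannot monopolize the flusher.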

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
