From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: endless sync on bdi_sched_wait()? 2.6.33.1
Date: Mon, 12 Apr 2010 10:47:38 +1000 [thread overview]
Message-ID: <20100412004738.GC2493@dastard> (raw)
In-Reply-To: <20100408092850.GA20488@quack.suse.cz>
On Thu, Apr 08, 2010 at 11:28:50AM +0200, Jan Kara wrote:
> On Wed 31-03-10 19:07:31, Denys Fedorysychenko wrote:
> > I have a proxy server with "loaded" squid. On some moment i did sync, and
> > expecting it to finish in reasonable time. Waited more than 30 minutes, still
> > "sync". Can be reproduced easily.
....
> >
> > SUPERPROXY ~ # cat /proc/1753/stack
> > [<c019a93c>] bdi_sched_wait+0x8/0xc
> > [<c019a807>] wait_on_bit+0x20/0x2c
> > [<c019a9af>] sync_inodes_sb+0x6f/0x10a
> > [<c019dd53>] __sync_filesystem+0x28/0x49
> > [<c019ddf3>] sync_filesystems+0x7f/0xc0
> > [<c019de7a>] sys_sync+0x1b/0x2d
> > [<c02f7a25>] syscall_call+0x7/0xb
> > [<ffffffff>] 0xffffffff
> Hmm, I guess you are observing the problem reported in
> https://bugzilla.kernel.org/show_bug.cgi?id=14830
> There seem to be several issues in the per-bdi writeback code that
> cause sync on a busy filesystem to last almost forever. To that bug are
> attached two patches that fix two issues but apparently it's not all.
> I'm still looking into it...
Jan, just another data point that i haven't had a chance to look
into yet - I noticed that 2.6.34-rc1 writeback patterns have changed
on XFS from looking at blocktrace.
The bdi-flush background write threadi almost never completes - it
blocks in get_request() and it is doing 1-2 page IOs. If I do a
large dd write, the writeback thread starts with 512k IOs for a
short while, then suddenly degrades to 1-2 page IOs that get merged
in the elevator to 512k IOs.
My theory is that the inode is getting dirtied by the concurrent
write() and the inode is never moving back to the dirty list and
having it's dirtied_when time reset - it's being moved to the
b_more_io list in writeback_single_inode(), wbc->more_io is being
set, and then we re-enter writeback_inodes_wb() which splices the
b_more_io list back onto the b_io list and we try to write it out
again.
Because I have so many dirty pages in memory, nr_pages is quite high
and this pattern continues for some time until it is exhausted, at
which time throttling triggers background sync to run again and the
1-2 page IO pattern continues.
And for sync(), nr_pages is set to LONG_MAX, so regardless of how
many pages were dirty, if we keep dirtying pages it will stay in
this loop until LONG_MAX pages are written....
Anyway, that's my theory - if we had trace points in the writeback
code, I could confirm/deny this straight away. First thing I need to
do, though, is to forward port the original writeback tracng code
Jens posted a while back....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2010-04-13 1:20 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-31 16:07 endless sync on bdi_sched_wait()? 2.6.33.1 Denys Fedorysychenko
2010-03-31 22:12 ` Dave Chinner
2010-04-01 10:42 ` Denys Fedorysychenko
2010-04-01 11:13 ` Dave Chinner
2010-04-01 20:14 ` Jeff Moyer
2010-04-08 9:28 ` Jan Kara
2010-04-08 10:12 ` Denys Fedorysychenko
2010-04-12 0:47 ` Dave Chinner [this message]
2010-04-19 1:37 ` Dave Chinner
2010-04-19 7:04 ` Dave Chinner
2010-04-19 7:23 ` Dave Chinner
2010-04-21 0:33 ` Jan Kara
2010-04-21 1:54 ` Dave Chinner
2010-04-21 13:27 ` Jan Kara
2010-04-22 0:06 ` Dave Chinner
2010-04-22 12:48 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100412004738.GC2493@dastard \
--to=david@fromorbit.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nuclearcat@nuclearcat.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).