From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.cz>,
Eryu Guan <eguan@redhat.com>,
linux-kernel@vger.kernel.org, xfs@oss.sgi.com, axboe@fb.com,
linux-fsdevel@vger.kernel.org, Jan Kara <jack@suse.com>,
Tejun Heo <tj@kernel.org>,
kernel-team@fb.com
Subject: Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes
Date: Mon, 24 Aug 2015 11:19:59 +0200
Message-ID: <20150824091959.GA2936@quack.suse.cz>
In-Reply-To: <20150824062425.GU3902@dastard>
On Mon 24-08-15 16:24:25, Dave Chinner wrote:
> On Mon, Aug 24, 2015 at 11:18:16AM +0800, Eryu Guan wrote:
> > On Mon, Aug 24, 2015 at 11:11:23AM +1000, Dave Chinner wrote:
> > >
> > > Eryu, can you change the way you run the event trace to be:
> > >
> > > $ sudo trace-cmd <options> -o <outfile location> ./check <test options>
> > >
> > > rather than running the trace as a background operation elsewhere?
> > > Maybe that will give better results.
> >
> > The results are here
> >
> > http://128.199.137.77/writeback-v3/
<snip>
> What I can't see in the traces is where sync is doing a blocking
> sync pass on the filesystem. The wbc control structure being passed
> to XFS is:
>
> wbc_writepage: bdi 253:0: towrt=45569 skip=0 mode=0 kupd=0 bgrd=0 reclm=0 cyclic=0 start=0x0 end=0x7fffffffffffffff
>
> Which is not coming from sync_inodes_sb() as the sync mode is
> incorrect (i.e. not WB_SYNC_ALL). It looks to me that writeback is
> coming from a generic bdi flusher command rather than a directed
> superblock sync. i.e. through wakeup_flusher_threads() which sets:
>
> work->sync_mode = WB_SYNC_NONE;
> work->nr_pages = nr_pages;
> work->range_cyclic = range_cyclic;
> work->reason = reason;
> work->auto_free = 1;
>
> as the reason is "sync":
>
> sync-18849 writeback_queue: bdi 253:0: sb_dev 0:0 nr_pages=308986 sync_mode=0 kupdate=0 range_cyclic=0 background=0 reason=sync
> sync-18849 writeback_queue: bdi 253:0: sb_dev 253:1 nr_pages=9223372036854775807 sync_mode=1 kupdate=0 range_cyclic=0 background=0 reason=sync
> ....
> kworker/u8:1-1563 writeback_exec: bdi 253:0: sb_dev 0:0 nr_pages=308986 sync_mode=0 kupdate=0 range_cyclic=0 background=0 reason=sync
> kworker/u8:1-1563 writeback_start: bdi 253:0: sb_dev 0:0 nr_pages=308986 sync_mode=0 kupdate=0 range_cyclic=0 background=0 reason=sync
>
> The next writeback_queue/writeback_exec tracepoint pair are:
>
> ....
> kworker/2:1-17163 xfs_setfilesize: dev 253:6 ino 0xef6506 isize 0xa00000 disize 0x0 offset 0x0 count 10481664
> kworker/2:1-17163 xfs_setfilesize: dev 253:6 ino 0xef6506 isize 0xa00000 disize 0x9ff000 offset 0x9ff000 count 4096
> sync-18849 wbc_writepage: bdi 253:0: towrt=9223372036854775798 skip=0 mode=1 kupd=0 bgrd=0 reclm=0 cyclic=0 start=0x0 end=0x7fffffffffffffff
> sync-18849 wbc_writepage: bdi 253:0: towrt=9223372036854775797 skip=0 mode=1 kupd=0 bgrd=0 reclm=0 cyclic=0 start=0x0 end=0x7fffffffffffffff
> sync-18849 wbc_writepage: bdi 253:0: towrt=9223372036854775796 skip=0 mode=1 kupd=0 bgrd=0 reclm=0 cyclic=0 start=0x0 end=0x7fffffffffffffff
> sync-18849 wbc_writepage: bdi 253:0: towrt=9223372036854775795 skip=0 mode=1 kupd=0 bgrd=0 reclm=0 cyclic=0 start=0x0 end=0x7fffffffffffffff
> umount-18852 writeback_queue: bdi 253:0: sb_dev 253:6 nr_pages=22059 sync_mode=0 kupdate=0 range_cyclic=0 background=0 reason=sync
> kworker/u8:1-1563 writeback_exec: bdi 253:0: sb_dev 253:6 nr_pages=22059 sync_mode=0 kupdate=0 range_cyclic=0 background=0 reason=sync
> ....
>
> which shows unmount being the next writeback event queued and
> executed after the IO completions have come in (that missed the
> log). What is missing is the specific queue/exec events for
> sync_sb_inodes() from the sync code for each filesystem.

Bah, I see the problem, and indeed it was introduced by commit e79729123f639
"writeback: don't issue wb_writeback_work if clean". The problem is that
we bail out of sync_inodes_sb() if there is no dirty IO. That is wrong
because we have to wait for any outstanding IO (i.e. call wait_sb_inodes())
regardless of dirty state! It also explains why Tejun's patch fixes the
problem: it backs out the change to the exit condition in
sync_inodes_sb().
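
To make the failure mode concrete, here is a toy user-space sketch. It is not
the kernel code: the struct and every toy_* name are made up for illustration,
and "waiting" is modelled by simply zeroing a counter. It only shows why an
early return on "nothing dirty" also skips the wait for IO that an earlier
WB_SYNC_NONE pass has already submitted.

```c
/* Toy model only; all names here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct toy_sb {
	int dirty_pages;    /* pages not yet submitted for writeback */
	int inflight_pages; /* pages submitted earlier, IO not yet complete */
};

/* Stand-in for issuing writeback: dirty pages become in-flight IO. */
static void toy_write_dirty(struct toy_sb *sb)
{
	assert(sb->dirty_pages >= 0);
	sb->inflight_pages += sb->dirty_pages;
	sb->dirty_pages = 0;
}

/* Stand-in for wait_sb_inodes(): returns once no IO is in flight. */
static void toy_wait_inflight(struct toy_sb *sb)
{
	sb->inflight_pages = 0;
}

/* Buggy shape: bailing out when clean skips the wait as well. */
static void toy_sync_buggy(struct toy_sb *sb)
{
	if (sb->dirty_pages == 0)
		return;
	toy_write_dirty(sb);
	toy_wait_inflight(sb);
}

/* Fixed shape: only submission is conditional; the wait always runs. */
static void toy_sync_fixed(struct toy_sb *sb)
{
	if (sb->dirty_pages != 0)
		toy_write_dirty(sb);
	toy_wait_inflight(sb);
}

/* The guarantee sync must give: nothing dirty and nothing in flight. */
static bool toy_is_stable(const struct toy_sb *sb)
{
	return sb->dirty_pages == 0 && sb->inflight_pages == 0;
}
```

Starting from a superblock with zero dirty pages but some in-flight pages,
toy_sync_buggy() returns with IO still outstanding, while toy_sync_fixed()
leaves the model stable.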

So Tejun's patch from this thread is indeed fixing the real problem, but the
comment in sync_inodes_sb() should be fixed to mention that wait_sb_inodes()
must be called in all cases... Tejun, will you fix up the comment please?

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR