linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.cz>,
	Eryu Guan <eguan@redhat.com>,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com, axboe@fb.com,
	Jan Kara <jack@suse.com>,
	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes
Date: Tue, 25 Aug 2015 08:27:20 +1000	[thread overview]
Message-ID: <20150824222720.GD714@dastard> (raw)
In-Reply-To: <20150824181038.GA28944@mtj.duckdns.org>

On Mon, Aug 24, 2015 at 02:10:38PM -0400, Tejun Heo wrote:
> Hello, Dave.
> 
> On Fri, Aug 21, 2015 at 09:04:51AM +1000, Dave Chinner wrote:
> > > Maybe I'm misunderstanding the code but all xfs_writepage() calls are
> > > from unbound workqueues - the writeback workers - while
> > > xfs_setfilesize() are from bound workqueues, so I wondered why that
> > > was and looked at the code and the setsize functions are run off of a
> > > separate work item which is queued from the end_bio callback and I
> > > can't tell who would be waiting for them.  Dave, what am I missing?
> > 
> > xfs_setfilesize runs transactions, so it can't be run from IO
> > completion context as it needs to block (i.e. on log space or inode
> > locks). It also can't block log IO completion, nor metadata Io
> > completion, as only log IO completion can free log space, and the
> > inode lock might be waiting on metadata buffer IO completion (e.g.
> > during delayed allocation). Hence we have multiple IO completion
> > workqueues to keep these things separated and deadlock free. i.e.
> > they all get punted to a workqueue where they are then processed in
> > a context that can block safely.
> 
> I'm still a bit confused.  What prevents the following from happening?
> 
> 1. io completion of last dirty page of an inode and work item for
>    xfs_setfilesize() is queued.
> 
> 2. inode removed from dirty list.

The inode has already been removed from the dirty list - that
happens at inode writeback submission time, not IO completion.

> 3. __sync_filesystem() invokes sync_inodes_sb().  There are no dirty
>    pages, so it finishes.

There are no dirty pages, but the pages aren't clean, either. i.e
they are still under writeback.  Hence we need to invoke
wait_inodes_sb() to wait for writeback on all pages to complete
before returning.

> 4. xfs_fs_sync_fs() is called which calls _xfs_log_force() but the
>    work item from #1 hasn't run yet, so the size update isn't written
>    out.

The bug here is that wait_inodes_sb() has not been run, therefore
->syncfs is being run before IO completions have been processed and
pages marked clean.

> 5. Crash.
> 
> Is it that _xfs_log_force() waits for the setfilesize transaction
> created during writepage?

No, it's wait_inodes_sb() that does the waiting for data IO
completion for sync.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2015-08-24 22:27 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20150812101204.GE17933@dhcp-13-216.nay.redhat.com>
2015-08-13  0:44 ` generic/04[89] fail on XFS due to change in writeback code [4.2-rc1 regression] Dave Chinner
2015-08-13 15:34   ` Tejun Heo
2015-08-13 19:16     ` Tejun Heo
2015-08-13 22:44   ` [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes Tejun Heo
2015-08-14 11:14     ` Jan Kara
2015-08-14 15:14       ` Damien Wyart
2015-08-17 20:00         ` Tejun Heo
2015-08-18  5:33           ` Damien Wyart
2015-08-17 20:02       ` Tejun Heo
2015-08-18  9:16         ` Jan Kara
2015-08-18 17:47           ` Tejun Heo
2015-08-18 19:54             ` Tejun Heo
2015-08-18 21:56               ` Dave Chinner
2015-08-20  6:12                 ` Eryu Guan
2015-08-20 14:01                   ` Eryu Guan
2015-08-20 14:36                   ` Eryu Guan
2015-08-20 14:37                     ` Eryu Guan
2015-08-20 16:55                       ` Tejun Heo
2015-08-20 23:04                         ` Dave Chinner
2015-08-24 18:10                           ` Tejun Heo
2015-08-24 22:27                             ` Dave Chinner [this message]
2015-08-24 22:53                               ` Tejun Heo
2015-08-21 10:20                 ` Eryu Guan
2015-08-22  0:30                   ` Dave Chinner
2015-08-22  4:46                     ` Eryu Guan
2015-08-24  1:11                       ` Dave Chinner
2015-08-24  3:18                         ` Eryu Guan
2015-08-24  6:24                           ` Dave Chinner
2015-08-24  8:34                             ` Eryu Guan
2015-08-24  8:55                               ` Dave Chinner
2015-08-24  9:19                             ` Jan Kara
2015-08-24 14:51                               ` Tejun Heo
2015-08-24 17:11                                 ` Tejun Heo
2015-08-24 19:08                                   ` Jan Kara
2015-08-24 19:32                                     ` Tejun Heo
2015-08-24 21:09                                       ` Jan Kara
2015-08-24 21:45                                         ` Tejun Heo
2015-08-24 22:54                                           ` Tejun Heo
2015-08-24 22:57                                           ` Dave Chinner
2015-08-25 18:11     ` [PATCH v2 block/for-linus] writeback: sync_inodes_sb() must write out I_DIRTY_TIME inodes and always call wait_sb_inodes() Tejun Heo
2015-08-25 20:37       ` Jens Axboe
2015-08-26  9:00       ` Jan Kara
2015-08-13 23:24   ` generic/04[89] fail on XFS due to change in writeback code [4.2-rc1 regression] Tejun Heo
2015-08-14  6:19     ` Eryu Guan
2015-08-17 20:27       ` Tejun Heo
2015-08-18  3:57         ` Eryu Guan
2015-08-18  5:31           ` Eryu Guan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150824222720.GD714@dastard \
    --to=david@fromorbit.com \
    --cc=axboe@fb.com \
    --cc=axboe@kernel.dk \
    --cc=eguan@redhat.com \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=kernel-team@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).