From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Josef Bacik <josef@toxicpanda.com>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
David Sterba <dsterba@suse.com>,
"linux-btrfs @ vger . kernel . org" <linux-btrfs@vger.kernel.org>,
Filipe Manana <fdmanana@gmail.com>,
Christoph Hellwig <hch@lst.de>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC PATCH] btrfs: don't call btrfs_sync_file from iomap context
Date: Wed, 2 Sep 2020 22:20:08 +1000 [thread overview]
Message-ID: <20200902122008.GK12096@dread.disaster.area> (raw)
In-Reply-To: <20200902114414.GX14765@casper.infradead.org>
On Wed, Sep 02, 2020 at 12:44:14PM +0100, Matthew Wilcox wrote:
> On Wed, Sep 02, 2020 at 09:58:30AM +1000, Dave Chinner wrote:
> > Put simply: converting a filesystem to use iomap is not a "change
> > the filesystem interfacing code and it will work" modification. We
> > ask that filesystems are modified to conform to the iomap IO
> > exclusion model; adding special cases for every potential
> > locking and mapping quirk every different filesystem has is part of
> > what turned the old direct IO code into an unmaintainable nightmare.
> >
> > > That's fine, but this is kind of a bad way to find
> > > out. We really shouldn't have generic helper's that have different generic
> > > locking rules based on which file system uses them.
> >
> > We certainly can change the rules for new infrastructure. Indeed, we
> > had to change the rules to support DAX. The whole point of the
> > iomap infrastructure was that it enabled us to use code that already
> > worked for DAX (the XFS code) in multiple filesystems. And as people
> > have realised that the DIO via iomap is much faster than the old DIO
> > code and is a much more efficient way of doing large buffered IO,
> > other filesystems have started to use it.
> >
> > However....
> >
> > > Because then we end up
> > > with situations like this, where suddenly we're having to come up with some
> > > weird solution because the generic thing only works for a subset of file
> > > systems. Thanks,
> >
> > .... we've always said "you need to change the filesystem code to
> > use iomap". This is simply a reflection on the fact that iomap has
> > different rules and constraints to the old code and so it's not a
> > direct plug in replacement. There are no short cuts here...
>
> Can you point me (and I suspect Josef!) towards the documentation of the
> locking model? I was hoping to find Documentation/filesystems/iomap.rst
> but all the 'iomap' strings in Documentation/ refer to pci_iomap and
> similar, except for this in the DAX documentation:
There's no locking model documentation because there is no locking
in the iomap direct IO code. The filesystem defines and does all the
locking, so there's pretty much nothing to document for iomap.
IOWs, the only thing iomap_dio_rw requires is that the IO completion
paths do not take same locks that the IO submission path
requires. And that's because:
/*
* iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
* is being issued as AIO or not. [...]
So you obviously can't sit waiting for dio completion in
iomap_dio_rw() while holding the submission lock if completion
requires the submission lock to make progress.
FWIW, iomap_dio_rw() originally required the inode_lock() to be held
and contained a lockdep assert to verify this, but....
commit 3ad99bec6e82e32fa9faf2f84e74b134586b46f7
Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
Date: Sat Nov 30 09:59:25 2019 -0600
iomap: remove lockdep_assert_held()
Filesystems such as btrfs can perform direct I/O without holding the
inode->i_rwsem in some of the cases like writing within i_size. So,
remove the check for lockdep_assert_held() in iomap_dio_rw().
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
... btrfs has special corner cases for direct IO locking and hence
we removed the lockdep assert....
IOWs, iomap_dio_rw() really does not care what strategy filesystems
use to serialise DIO against other operations. Filesystems can use
whatever IO serialisation mechanism they want (mutex, rwsem, range
locks, etc) as long as they obey the one simple requirement: do not
take the DIO submission lock in the DIO completion path.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2020-09-02 12:20 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200901130644.12655-1-johannes.thumshirn@wdc.com>
2020-09-01 15:11 ` [RFC PATCH] btrfs: don't call btrfs_sync_file from iomap context Josef Bacik
2020-09-01 17:45 ` Darrick J. Wong
2020-09-01 17:55 ` Josef Bacik
2020-09-01 21:46 ` Dave Chinner
2020-09-01 22:19 ` Josef Bacik
2020-09-01 23:58 ` Dave Chinner
2020-09-02 0:22 ` Josef Bacik
2020-09-02 7:12 ` Johannes Thumshirn
2020-09-02 11:10 ` Josef Bacik
2020-09-02 16:29 ` Darrick J. Wong
2020-09-02 16:47 ` Josef Bacik
2020-09-02 11:44 ` Matthew Wilcox
2020-09-02 12:20 ` Dave Chinner [this message]
2020-09-02 12:42 ` Josef Bacik
2020-09-03 2:28 ` Dave Chinner
2020-09-03 9:49 ` Filipe Manana
2020-09-03 16:32 ` Christoph Hellwig
2020-09-03 16:46 ` Josef Bacik
2020-09-07 0:04 ` Dave Chinner
2020-09-15 21:48 ` Goldwyn Rodrigues
2020-09-17 3:09 ` Dave Chinner
2020-09-17 5:52 ` Christoph Hellwig
2020-09-17 6:29 ` Dave Chinner
2020-09-17 6:42 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200902122008.GK12096@dread.disaster.area \
--to=david@fromorbit.com \
--cc=dsterba@suse.com \
--cc=fdmanana@gmail.com \
--cc=hch@lst.de \
--cc=johannes.thumshirn@wdc.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox