From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2] xfs: serialize unaligned dio writes against all other dio writes
Date: Tue, 16 Apr 2019 14:18:31 -0400 [thread overview]
Message-ID: <20190416181830.GA10746@bfoster> (raw)
In-Reply-To: <20190416151434.GM4752@magnolia>
On Tue, Apr 16, 2019 at 08:14:34AM -0700, Darrick J. Wong wrote:
> On Mon, Mar 25, 2019 at 01:24:48PM -0400, Brian Foster wrote:
> > XFS applies more strict serialization constraints to unaligned
> > direct writes to accommodate things like direct I/O layer zeroing,
> > unwritten extent conversion, etc. Unaligned submissions acquire the
> > exclusive iolock and wait for in-flight dio to complete to ensure
> > multiple submissions do not race on the same block and cause data
> > corruption.
> >
> > This generally works in the case of an aligned dio followed by an
> > unaligned dio, but the serialization is lost if I/Os occur in the
> > opposite order. If an unaligned write is submitted first and
> > immediately followed by an overlapping, aligned write, the latter
> > submits without the typical unaligned serialization barriers because
> > there is no indication of an unaligned dio still in-flight. This can
> > lead to unpredictable results.
> >
> > To provide proper unaligned dio serialization, require that such
> > direct writes are always the only dio allowed in-flight at one time
> > for a particular inode. We already acquire the exclusive iolock and
> > drain pending dio before submitting the unaligned dio. Wait once
> > more after the dio submission to hold the iolock across the I/O and
> > prevent further submissions until the unaligned I/O completes. This
> > is heavy handed, but consistent with the current pre-submission
> > serialization for unaligned direct writes.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> > ---
> >
> > v2:
> > - Use dio return value instead of I/O type in wait logic.
> > - Drop spurious else logic and fix up comments.
> > v1: https://marc.info/?l=linux-xfs&m=155327356800415&w=2
> >
> > fs/xfs/xfs_file.c | 27 +++++++++++++++++----------
> > 1 file changed, 17 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 770cc2edf777..933d9c467f56 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -529,18 +529,17 @@ xfs_file_dio_aio_write(
> > count = iov_iter_count(from);
> >
> > /*
> > - * If we are doing unaligned IO, wait for all other IO to drain,
> > - * otherwise demote the lock if we had to take the exclusive lock
> > - * for other reasons in xfs_file_aio_write_checks.
> > + * If we are doing unaligned IO, we can't allow any other overlapping IO
> > + * in-flight at the same time or we risk data corruption. Wait for all
> > + * other IO to drain before we submit. If the IO is aligned, demote the
> > + * iolock if we had to take the exclusive lock in
> > + * xfs_file_aio_write_checks() for other reasons.
> > */
> > if (unaligned_io) {
> > - /* If we are going to wait for other DIO to finish, bail */
> > - if (iocb->ki_flags & IOCB_NOWAIT) {
> > - if (atomic_read(&inode->i_dio_count))
> > - return -EAGAIN;
> > - } else {
> > - inode_dio_wait(inode);
> > - }
> > + /* unaligned dio always waits, bail */
> > + if (iocb->ki_flags & IOCB_NOWAIT)
> > + return -EAGAIN;
>
> Hmm, Dave pointed out on IRC that this looks like we're bailing out with
> *iolock held. I took another look at the function and wondered why
> wouldn't we bail out as soon as we know that we're doing unaligned
> nowait directio, which is before we take all the locks and such?
>
Yeah, though it doesn't look like that's due to the patch above (though
I wish I noticed it then :P)..? The above patch basically just removed
the dio count check in that first hunk.
> --D
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index cdcc75735521..c586fd9f244c 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -517,7 +517,8 @@ xfs_file_dio_aio_write(
> }
>
> if (iocb->ki_flags & IOCB_NOWAIT) {
> - if (!xfs_ilock_nowait(ip, iolock))
> + /* unaligned dio always waits, bail */
> + if (unaligned_io || !xfs_ilock_nowait(ip, iolock))
I'd prefer to see the lock on a line of its own, but otherwise I think
that's reasonable. After the patch above, an unaligned dio is by
definition going to wait on itself at the very least.
Brian
> return -EAGAIN;
> } else {
> xfs_ilock(ip, iolock);
> @@ -536,9 +537,6 @@ xfs_file_dio_aio_write(
> * xfs_file_aio_write_checks() for other reasons.
> */
> if (unaligned_io) {
> - /* unaligned dio always waits, bail */
> - if (iocb->ki_flags & IOCB_NOWAIT)
> - return -EAGAIN;
> inode_dio_wait(inode);
> } else if (iolock == XFS_IOLOCK_EXCL) {
> xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
>
>
> > + inode_dio_wait(inode);
> > } else if (iolock == XFS_IOLOCK_EXCL) {
> > xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);
> > iolock = XFS_IOLOCK_SHARED;
> > @@ -548,6 +547,14 @@ xfs_file_dio_aio_write(
> >
> > trace_xfs_file_direct_write(ip, count, iocb->ki_pos);
> > ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io);
> > +
> > + /*
> > + * If unaligned, this is the only IO in-flight. If it has not yet
> > + * completed, wait on it before we release the iolock to prevent
> > + * subsequent overlapping IO.
> > + */
> > + if (ret == -EIOCBQUEUED && unaligned_io)
> > + inode_dio_wait(inode);
> > out:
> > xfs_iunlock(ip, iolock);
> >
> > --
> > 2.17.2
> >
next prev parent reply other threads:[~2019-04-16 18:18 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-25 17:24 [PATCH v2] xfs: serialize unaligned dio writes against all other dio writes Brian Foster
2019-03-25 23:51 ` Dave Chinner
2019-04-16 15:14 ` Darrick J. Wong
2019-04-16 18:18 ` Brian Foster [this message]
2019-04-16 19:03 ` Darrick J. Wong
2019-04-16 19:16 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190416181830.GA10746@bfoster \
--to=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).