linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6 14/15] xfs: prepare xfs_break_layouts() for another layout type
Date: Mon, 19 Mar 2018 11:09:02 -0700	[thread overview]
Message-ID: <CAPcyv4hM5HOn-Gjn-Nqp_5rPL9ZK2MAjW79WvZaS7DbJnKDhbA@mail.gmail.com> (raw)
In-Reply-To: <20180319174559.GG1757@magnolia>

On Mon, Mar 19, 2018 at 10:45 AM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Thu, Mar 15, 2018 at 08:52:45AM -0700, Dan Williams wrote:
>> When xfs is operating as the back-end of a pNFS block server, it prevents
>> collisions between local and remote operations by requiring a lease to
>> be held for remotely accessed blocks. Local filesystem operations break
>> those leases before writing or mutating the extent map of the file.
>>
>> A similar mechanism is needed to prevent operations on pinned dax
>> mappings, like device-DMA, from colliding with extent unmap operations.
>>
>> BREAK_WRITE and BREAK_TRUNCATE are introduced as two distinct levels of
>> layout breaking.
>>
>> Layouts are broken in the BREAK_WRITE case to ensure that layout-holders
>> do not collide with local writes. Additionally, layouts are broken in
>> the BREAK_TRUNCATE case to make sure the layout-holder has a consistent
>> view of the file's extent map. While BREAK_WRITE breaks can be satisfied
>> be recalling FL_LAYOUT leases, BREAK_TRUNCATE breaks additionally
>> require waiting for busy dax-pages to go idle.
>>
>> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
>> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>> Reported-by: Dave Chinner <david@fromorbit.com>
>> Reported-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  fs/xfs/xfs_file.c  |   23 +++++++++++++++++------
>>  fs/xfs/xfs_inode.h |   18 ++++++++++++++++--
>>  fs/xfs/xfs_ioctl.c |    2 +-
>>  fs/xfs/xfs_iops.c  |    5 +++--
>>  4 files changed, 37 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index 5742d395a4e4..399c5221f101 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -352,7 +352,7 @@ xfs_file_aio_write_checks(
>>
>>       xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
>>       *iolock |= XFS_MMAPLOCK_EXCL;
>> -     error = xfs_break_layouts(inode, iolock);
>> +     error = xfs_break_layouts(inode, iolock, BREAK_WRITE);
>>       if (error) {
>>               *iolock &= ~XFS_MMAPLOCK_EXCL;
>>               xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
>> @@ -762,7 +762,8 @@ xfs_file_write_iter(
>>  int
>>  xfs_break_layouts(
>>       struct inode            *inode,
>> -     uint                    *iolock)
>> +     uint                    *iolock,
>> +     enum layout_break_reason reason)
>>  {
>>       struct xfs_inode        *ip = XFS_I(inode);
>>       int                     ret;
>> @@ -770,9 +771,19 @@ xfs_break_layouts(
>>       ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL
>>                               | XFS_MMAPLOCK_EXCL));
>>
>> -     ret = xfs_break_leased_layouts(inode, iolock);
>> -     if (ret > 0)
>> -             ret = 0;
>> +     switch (reason) {
>> +     case BREAK_TRUNCATE:
>> +             /* fall through */
>> +     case BREAK_WRITE:
>> +             ret = xfs_break_leased_layouts(inode, iolock);
>> +             if (ret > 0)
>> +                     ret = 0;
>> +             break;
>> +     default:
>> +             ret = -EINVAL;
>> +             break;
>> +     }
>> +
>>       return ret;
>>  }
>>
>> @@ -802,7 +813,7 @@ xfs_file_fallocate(
>>               return -EOPNOTSUPP;
>>
>>       xfs_ilock(ip, iolock);
>> -     error = xfs_break_layouts(inode, &iolock);
>> +     error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
>>       if (error)
>>               goto out_unlock;
>>
>> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
>> index 74c63f3a720f..1a66c7afcf45 100644
>> --- a/fs/xfs/xfs_inode.h
>> +++ b/fs/xfs/xfs_inode.h
>> @@ -379,6 +379,20 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
>>                                       >> XFS_ILOCK_SHIFT)
>>
>>  /*
>> + * Layouts are broken in the BREAK_WRITE case to ensure that
>> + * layout-holders do not collide with local writes. Additionally,
>> + * layouts are broken in the BREAK_TRUNCATE case to make sure the
>> + * layout-holder has a consistent view of the file's extent map. While
>> + * BREAK_WRITE breaks can be satisfied be recalling FL_LAYOUT leases,
>> + * BREAK_TRUNCATE breaks additionally require waiting for busy dax-pages
>> + * to go idle.
>> + */
>> +enum layout_break_reason {
>> +        BREAK_WRITE,
>
> I wonder if this belongs in a system header file since the same general
> principle is going to apply to ext4 and others?  If this really is a
> special xfs thing, then please use the "XFS_" prefix.
>
>> +        BREAK_TRUNCATE,
>
> The "truncate" part of the name misled me for a bit.  That operation is
> really about "make sure there are no busy dax pages because we're about
> to change some file pmem^Wblock mappings", right?  Truncation, hole
> punching, reflinking, and zero_range (because it punches the range and
> reallocates unwritten extents) all have to wait for busy dax pages to
> become free before they can begin unmapping blocks.  So this isn't just
> about truncation specifically; can I suggest "BREAK_UNMAPI" ?

I like that name better.

It isn't clear that this needs to go to a system header because I
expect ext4 will only support BREAK_UNMAPI for dax and not care about
BREAK_WRITE since that is an XFS-only / pNFS-only distinction in
current code. However, I'll let Jan chime in if this guess is
incorrect and ext4 plans to add pNFS block-server support.

> (I also haven't figured out whether the same requirements apply to
> things that *add* extent maps to the file, but I have to run to a
> meeting in a few so wanted to get this email sent.)

Add should not be a problem because DMA only ever starts after extents
are added, i.e. there's no risk of DMA starting to invalid address
because fs/dax.c arranges for extents to be allocated at fault time.

  reply	other threads:[~2018-03-19 18:09 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-15 15:51 [PATCH v6 00/15] dax: fix dma vs truncate/hole-punch Dan Williams
2018-03-15 15:51 ` [PATCH v6 01/15] dax: store pfns in the radix Dan Williams
2018-03-15 15:51 ` [PATCH v6 02/15] fs, dax: prepare for dax-specific address_space_operations Dan Williams
2018-03-16 18:59   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 03/15] block, dax: remove dead code in blkdev_writepages() Dan Williams
2018-03-16 18:59   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 04/15] xfs, dax: introduce xfs_dax_aops Dan Williams
2018-03-16 19:00   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 05/15] ext4, dax: introduce ext4_dax_aops Dan Williams
2018-03-15 15:51 ` [PATCH v6 06/15] ext2, dax: introduce ext2_dax_aops Dan Williams
2018-03-18  4:02   ` kbuild test robot
2018-03-18  4:02   ` [RFC PATCH] ext2, dax: ext2_dax_aops can be static kbuild test robot
2018-03-15 15:52 ` [PATCH v6 07/15] fs, dax: use page->mapping to warn if truncate collides with a busy page Dan Williams
2018-03-18  6:26   ` kbuild test robot
2018-03-15 15:52 ` [PATCH v6 08/15] mm, dax: enable filesystems to trigger dev_pagemap ->page_free callbacks Dan Williams
2018-03-15 15:52 ` [PATCH v6 09/15] mm, dev_pagemap: introduce CONFIG_DEV_PAGEMAP_OPS Dan Williams
2018-03-15 15:52 ` [PATCH v6 10/15] memremap: mark devm_memremap_pages() EXPORT_SYMBOL_GPL Dan Williams
2018-03-15 15:52 ` [PATCH v6 11/15] mm, fs, dax: handle layout changes to pinned dax mappings Dan Williams
2018-03-16 19:01   ` Christoph Hellwig
2018-03-17 22:14   ` kbuild test robot
2018-03-15 15:52 ` [PATCH v6 12/15] xfs: require mmap lock for xfs_break_layouts() Dan Williams
2018-03-16 19:04   ` Christoph Hellwig
2018-03-16 19:10     ` Dan Williams
2018-03-19 17:33   ` Darrick J. Wong
2018-03-19 17:57     ` Dan Williams
2018-03-19 18:19       ` Darrick J. Wong
2018-03-19 18:34         ` Dan Williams
2018-03-19 19:45       ` Christoph Hellwig
2018-03-19 20:10         ` Dan Williams
2018-03-19 21:14           ` Christoph Hellwig
2018-03-15 15:52 ` [PATCH v6 13/15] xfs: communicate lock drop events from xfs_break_layouts() Dan Williams
2018-03-16 19:08   ` Christoph Hellwig
2018-03-15 15:52 ` [PATCH v6 14/15] xfs: prepare xfs_break_layouts() for another layout type Dan Williams
2018-03-16 19:08   ` Christoph Hellwig
2018-03-19 17:45   ` Darrick J. Wong
2018-03-19 18:09     ` Dan Williams [this message]
2018-03-15 15:52 ` [PATCH v6 15/15] xfs, dax: introduce xfs_break_dax_layouts() Dan Williams
2018-03-16 19:09   ` Christoph Hellwig
2018-03-17 22:11   ` kbuild test robot
2018-03-17 23:47   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4hM5HOn-Gjn-Nqp_5rPL9ZK2MAjW79WvZaS7DbJnKDhbA@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ross.zwisler@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).