linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
To: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
Cc: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
	"Darrick J. Wong"
	<darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Subject: Re: [PATCH v3 2/2] ext4: handle layout changes to pinned DAX mappings
Date: Thu, 5 Jul 2018 09:54:14 +1000	[thread overview]
Message-ID: <20180704235414.GU19934@dastard> (raw)
In-Reply-To: <20180704122723.lup2wovzb6u6ta6v-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>

On Wed, Jul 04, 2018 at 02:27:23PM +0200, Jan Kara wrote:
> On Wed 04-07-18 10:49:23, Dave Chinner wrote:
> > On Mon, Jul 02, 2018 at 11:29:12AM -0600, Ross Zwisler wrote:
> > > Follow the lead of xfs_break_dax_layouts() and add synchronization between
> > > operations in ext4 which remove blocks from an inode (hole punch, truncate
> > > down, etc.) and pages which are pinned due to DAX DMA operations.
> > > 
> > > Signed-off-by: Ross Zwisler <ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
> > > Reviewed-by: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> > > Reviewed-by: Lukas Czerner <lczerner-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > > ---
> > > 
> > > Changes since v2:
> > >  * Added a comment to ext4_insert_range() explaining why we don't call
> > >    ext4_break_layouts(). (Jan)
> > 
> > Which I think is wrong and will cause data corruption.
> > 
> > > @@ -5651,6 +5663,11 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
> > >  			LLONG_MAX);
> > >  	if (ret)
> > >  		goto out_mmap;
> > > +	/*
> > > +	 * We don't need to call ext4_break_layouts() because we aren't
> > > +	 * removing any blocks from the inode.  We are just changing their
> > > +	 * offset by inserting a hole.
> > > +	 */
> > 
> > The entire point of these leases is so that a thrid party can
> > directly access the blocks underlying the file. That means they are
> > keeping their own file offset<->disk block mapping internally, and
> > they are assuming that it is valid for as long as they hold the
> > lease. If the filesystem modifies the extent map - even something
> > like a shift here which changes the offset<->disk block mapping -
> > the userspace app now has a stale mapping and so the lease *must be
> > broken* to tell it that it's mappings are now stale and it needs to
> > refetch them.
> 
> Well, ext4 has no real concept of leases and no pNFS support. And DAX
> requirements wrt consistency are much weaker than those of pNFS. This is
> mostly caused by the fact that calls like invalidate_mapping_pages() will
> flush offset<->pfn mappings DAX maintains in the radix tree automatically
> (similarly as it happens when page cache is used).

I'm more concerned about apps that use file leases behaving the same
way, not just the pNFS stuff. if we are /delegating file layouts/ to
3rd parties, then all filesystems *need* to behave the same way.
We've already defined those semantics with XFS - every time the
filesystem changes an extent layout in any way it needs to break
existing layout delegations...

> What Ross did just keeps ext4 + DAX behave similarly as ext4 + page cache
> does

Sure. But the issue I'm raising is that ext4 is not playing by the
same extent layout delegation rules that XFS has already defined for
3rd party use.

i.e. don't fuck up layout delegation behaviour consistency right
from the start just because "<this subset of functionality> is all
we need right now for ext4". All the filesystems should implement
the same semantics and behaviour right from the start, otherwise
we're just going to make life a misery for anyone who tries to use
layout delegations in future.

Haven't we learnt this lesson the hard way enough times already?

Cheers,

Dave.

-- 
Dave Chinner
david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org

  parent reply	other threads:[~2018-07-04 23:54 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-27 21:22 [PATCH v2 0/2] ext4: fix DAX dma vs truncate/hole-punch Ross Zwisler
     [not found] ` <20180627212252.31032-1-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-06-27 21:22   ` [PATCH v2 1/2] dax: dax_layout_busy_page() warn on !exceptional Ross Zwisler
     [not found]     ` <20180627212252.31032-2-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-07-02 22:15       ` Theodore Y. Ts'o
     [not found]         ` <20180702221503.GA12830-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2018-07-03 15:41           ` Ross Zwisler
     [not found]             ` <20180703154137.GB13019-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-07-03 17:44               ` Theodore Y. Ts'o
2018-06-27 21:22   ` [PATCH v2 2/2] ext4: handle layout changes to pinned DAX mappings Ross Zwisler
     [not found]     ` <20180627212252.31032-3-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-06-29 12:02       ` Lukas Czerner
     [not found]         ` <20180629120223.oaslngsvspnwf4ae-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2018-06-29 15:13           ` Ross Zwisler
     [not found]             ` <20180629151300.GA3006-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-07-02  7:34               ` Jan Kara
2018-07-02  7:59               ` Lukas Czerner
     [not found]                 ` <20180702075948.i4aqjg5rrorwoxqj-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2018-07-02 16:27                   ` Ross Zwisler
2018-06-30  1:12           ` Dave Chinner
2018-07-02 17:29       ` [PATCH v3 " Ross Zwisler
     [not found]         ` <20180702172912.329-1-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-07-04  0:49           ` Dave Chinner
2018-07-04 12:27             ` Jan Kara
     [not found]               ` <20180704122723.lup2wovzb6u6ta6v-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2018-07-04 23:54                 ` Dave Chinner [this message]
2018-07-05  3:59                   ` Darrick J. Wong
2018-07-05 16:53                     ` Ross Zwisler
     [not found]                       ` <20180705165310.GB22200-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-07-09 12:33                         ` Jan Kara
     [not found]                           ` <20180709123347.nw3ixr64prgk7sxz-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2018-07-09 16:23                             ` Darrick J. Wong
2018-07-09 19:49                               ` Jan Kara
2018-07-05 20:40                     ` Dan Williams
     [not found]                       ` <CAPcyv4jSNh95XUPh4ZzguKmcJpgNG7AG5_9=+gbLEjsaZUTq4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-07-05 23:29                         ` Dave Chinner
2018-07-06  5:08                           ` Dan Williams
2018-07-09  9:59                           ` Lukas Czerner
     [not found]                             ` <20180709095907.i3mnyodvn6gpcidt-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2018-07-09 16:18                               ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180704235414.GU19934@dastard \
    --to=david-fqsqvqoi3ljby3ivrkzq2a@public.gmane.org \
    --cc=darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=hch-jcswGhMUV9g@public.gmane.org \
    --cc=jack-AlSwsSmVLrQ@public.gmane.org \
    --cc=linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).