All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ross Zwisler <ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
To: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: linux-xfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>,
	"linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org"
	<linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Linux MM <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>,
	linux-fsdevel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-ext4 <linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Subject: Re: [PATCH 17/17] xfs: support for synchronous DAX faults
Date: Tue, 31 Oct 2017 21:47:07 -0600	[thread overview]
Message-ID: <20171101034707.GA1662@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gBVyZ8KhB2s1M0BdhC1=HAean6Mb6oV7zsJBx0-t3bhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, Oct 31, 2017 at 02:50:01PM -0700, Dan Williams wrote:
> On Tue, Oct 31, 2017 at 8:19 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> > On Fri 27-10-17 12:08:34, Jan Kara wrote:
> >> On Fri 27-10-17 08:16:11, Dave Chinner wrote:
> >> > On Thu, Oct 26, 2017 at 05:48:04PM +0200, Jan Kara wrote:
> >> > > > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> >> > > > > index f179bdf1644d..b43be199fbdf 100644
> >> > > > > --- a/fs/xfs/xfs_iomap.c
> >> > > > > +++ b/fs/xfs/xfs_iomap.c
> >> > > > > @@ -33,6 +33,7 @@
> >> > > > >  #include "xfs_error.h"
> >> > > > >  #include "xfs_trans.h"
> >> > > > >  #include "xfs_trans_space.h"
> >> > > > > +#include "xfs_inode_item.h"
> >> > > > >  #include "xfs_iomap.h"
> >> > > > >  #include "xfs_trace.h"
> >> > > > >  #include "xfs_icache.h"
> >> > > > > @@ -1086,6 +1087,10 @@ xfs_file_iomap_begin(
> >> > > > >               trace_xfs_iomap_found(ip, offset, length, 0, &imap);
> >> > > > >       }
> >> > > > >
> >> > > > > +     if ((flags & IOMAP_WRITE) && xfs_ipincount(ip) &&
> >> > > > > +         (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> >> > > > > +             iomap->flags |= IOMAP_F_DIRTY;
> >> > > >
> >> > > > This is the very definition of an inode that is "fdatasync dirty".
> >> > > >
> >> > > > Hmmmm, shouldn't this also be set for read faults, too?
> >> > >
> >> > > No, read faults don't need to set IOMAP_F_DIRTY since user cannot write any
> >> > > data to the page which he'd then like to be persistent. The only reason why
> >> > > I thought it could be useful for a while was that it would be nice to make
> >> > > MAP_SYNC mapping provide the guarantee that data you see now is the data
> >> > > you'll see after a crash
> >> >
> >> > Isn't that the entire point of MAP_SYNC? i.e. That when we return
> >> > from a page fault, the app knows that the data and it's underlying
> >> > extent is on persistent storage?
> >> >
> >> > > but we cannot provide that guarantee for RO
> >> > > mapping anyway if someone else has the page mapped as well. So I just
> >> > > decided not to return IOMAP_F_DIRTY for read faults.
> >> >
> >> > If there are multiple MAP_SYNC mappings to the inode, I would have
> >> > expected that they all sync all of the data/metadata on every page
> >> > fault, regardless of who dirtied the inode. An RO mapping doesn't
> >>
> >> Well, they all do sync regardless of who dirtied the inode on every *write*
> >> fault.
> >>
> >> > mean the data/metadata on the inode can't change, it just means it
> >> > can't change through that mapping.  Running fsync() to guarantee the
> >> > persistence of that data/metadata doesn't actually changing any
> >> > data....
> >> >
> >> > IOWs, if read faults don't guarantee the mapped range has stable
> >> > extents on a MAP_SYNC mapping, then I think MAP_SYNC is broken
> >> > because it's not giving consistent guarantees to userspace. Yes, it
> >> > works fine when only one MAP_SYNC mapping is modifying the inode,
> >> > but the moment we have concurrent operations on the inode that
> >> > aren't MAP_SYNC or O_SYNC this goes out the window....
> >>
> >> MAP_SYNC as I've implemented it provides guarantees only for data the
> >> process has actually written. I agree with that and it was a conscious
> >> decision. In my opinion that covers most usecases, provides reasonably
> >> simple semantics (i.e., if you write data through MAP_SYNC mapping, you can
> >> persist it just using CPU instructions), and reasonable performance.
> >>
> >> Now you seem to suggest the semantics should be: "Data you have read from or
> >> written to a MAP_SYNC mapping can be persisted using CPU instructions." And
> >> from implementation POV we can do that rather easily (just rip out the
> >> IOMAP_WRITE checks). But I'm unsure whether this additional guarantee would
> >> be useful enough to justify the slowdown of read faults? I was not able to
> >> come up with a good usecase and so I've decided for current semantics. What
> >> do other people think?
> >
> > Nobody commented on this for couple of days so how do we proceed? I would
> > prefer to go just with a guarantee for data written and we can always make
> > the guarantee stronger (i.e. apply it also for read data) when some user
> > comes with a good usecase?
> 
> I think it is easier to strengthen the guarantee than loosen it later
> especially since it is not yet clear that we have a use case for the
> stronger semantic. At least the initial motivation for MAP_SYNC was
> for writers.

I agree.  It seems like all threads/processes in a given application need to
use MAP_SYNC consistently so they can be sure that data that is written (and
then possibly read) will be durable on media.  I think what you have is a good
starting point, and we can adjust later if necessary.

WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Linux API <linux-api@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 17/17] xfs: support for synchronous DAX faults
Date: Tue, 31 Oct 2017 21:47:07 -0600	[thread overview]
Message-ID: <20171101034707.GA1662@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gBVyZ8KhB2s1M0BdhC1=HAean6Mb6oV7zsJBx0-t3bhw@mail.gmail.com>

On Tue, Oct 31, 2017 at 02:50:01PM -0700, Dan Williams wrote:
> On Tue, Oct 31, 2017 at 8:19 AM, Jan Kara <jack@suse.cz> wrote:
> > On Fri 27-10-17 12:08:34, Jan Kara wrote:
> >> On Fri 27-10-17 08:16:11, Dave Chinner wrote:
> >> > On Thu, Oct 26, 2017 at 05:48:04PM +0200, Jan Kara wrote:
> >> > > > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> >> > > > > index f179bdf1644d..b43be199fbdf 100644
> >> > > > > --- a/fs/xfs/xfs_iomap.c
> >> > > > > +++ b/fs/xfs/xfs_iomap.c
> >> > > > > @@ -33,6 +33,7 @@
> >> > > > >  #include "xfs_error.h"
> >> > > > >  #include "xfs_trans.h"
> >> > > > >  #include "xfs_trans_space.h"
> >> > > > > +#include "xfs_inode_item.h"
> >> > > > >  #include "xfs_iomap.h"
> >> > > > >  #include "xfs_trace.h"
> >> > > > >  #include "xfs_icache.h"
> >> > > > > @@ -1086,6 +1087,10 @@ xfs_file_iomap_begin(
> >> > > > >               trace_xfs_iomap_found(ip, offset, length, 0, &imap);
> >> > > > >       }
> >> > > > >
> >> > > > > +     if ((flags & IOMAP_WRITE) && xfs_ipincount(ip) &&
> >> > > > > +         (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> >> > > > > +             iomap->flags |= IOMAP_F_DIRTY;
> >> > > >
> >> > > > This is the very definition of an inode that is "fdatasync dirty".
> >> > > >
> >> > > > Hmmmm, shouldn't this also be set for read faults, too?
> >> > >
> >> > > No, read faults don't need to set IOMAP_F_DIRTY since user cannot write any
> >> > > data to the page which he'd then like to be persistent. The only reason why
> >> > > I thought it could be useful for a while was that it would be nice to make
> >> > > MAP_SYNC mapping provide the guarantee that data you see now is the data
> >> > > you'll see after a crash
> >> >
> >> > Isn't that the entire point of MAP_SYNC? i.e. That when we return
> >> > from a page fault, the app knows that the data and it's underlying
> >> > extent is on persistent storage?
> >> >
> >> > > but we cannot provide that guarantee for RO
> >> > > mapping anyway if someone else has the page mapped as well. So I just
> >> > > decided not to return IOMAP_F_DIRTY for read faults.
> >> >
> >> > If there are multiple MAP_SYNC mappings to the inode, I would have
> >> > expected that they all sync all of the data/metadata on every page
> >> > fault, regardless of who dirtied the inode. An RO mapping doesn't
> >>
> >> Well, they all do sync regardless of who dirtied the inode on every *write*
> >> fault.
> >>
> >> > mean the data/metadata on the inode can't change, it just means it
> >> > can't change through that mapping.  Running fsync() to guarantee the
> >> > persistence of that data/metadata doesn't actually changing any
> >> > data....
> >> >
> >> > IOWs, if read faults don't guarantee the mapped range has stable
> >> > extents on a MAP_SYNC mapping, then I think MAP_SYNC is broken
> >> > because it's not giving consistent guarantees to userspace. Yes, it
> >> > works fine when only one MAP_SYNC mapping is modifying the inode,
> >> > but the moment we have concurrent operations on the inode that
> >> > aren't MAP_SYNC or O_SYNC this goes out the window....
> >>
> >> MAP_SYNC as I've implemented it provides guarantees only for data the
> >> process has actually written. I agree with that and it was a conscious
> >> decision. In my opinion that covers most usecases, provides reasonably
> >> simple semantics (i.e., if you write data through MAP_SYNC mapping, you can
> >> persist it just using CPU instructions), and reasonable performance.
> >>
> >> Now you seem to suggest the semantics should be: "Data you have read from or
> >> written to a MAP_SYNC mapping can be persisted using CPU instructions." And
> >> from implementation POV we can do that rather easily (just rip out the
> >> IOMAP_WRITE checks). But I'm unsure whether this additional guarantee would
> >> be useful enough to justify the slowdown of read faults? I was not able to
> >> come up with a good usecase and so I've decided for current semantics. What
> >> do other people think?
> >
> > Nobody commented on this for couple of days so how do we proceed? I would
> > prefer to go just with a guarantee for data written and we can always make
> > the guarantee stronger (i.e. apply it also for read data) when some user
> > comes with a good usecase?
> 
> I think it is easier to strengthen the guarantee than loosen it later
> especially since it is not yet clear that we have a use case for the
> stronger semantic. At least the initial motivation for MAP_SYNC was
> for writers.

I agree.  It seems like all threads/processes in a given application need to
use MAP_SYNC consistently so they can be sure that data that is written (and
then possibly read) will be durable on media.  I think what you have is a good
starting point, and we can adjust later if necessary.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs@vger.kernel.org, Linux API <linux-api@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 17/17] xfs: support for synchronous DAX faults
Date: Tue, 31 Oct 2017 21:47:07 -0600	[thread overview]
Message-ID: <20171101034707.GA1662@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gBVyZ8KhB2s1M0BdhC1=HAean6Mb6oV7zsJBx0-t3bhw@mail.gmail.com>

On Tue, Oct 31, 2017 at 02:50:01PM -0700, Dan Williams wrote:
> On Tue, Oct 31, 2017 at 8:19 AM, Jan Kara <jack@suse.cz> wrote:
> > On Fri 27-10-17 12:08:34, Jan Kara wrote:
> >> On Fri 27-10-17 08:16:11, Dave Chinner wrote:
> >> > On Thu, Oct 26, 2017 at 05:48:04PM +0200, Jan Kara wrote:
> >> > > > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> >> > > > > index f179bdf1644d..b43be199fbdf 100644
> >> > > > > --- a/fs/xfs/xfs_iomap.c
> >> > > > > +++ b/fs/xfs/xfs_iomap.c
> >> > > > > @@ -33,6 +33,7 @@
> >> > > > >  #include "xfs_error.h"
> >> > > > >  #include "xfs_trans.h"
> >> > > > >  #include "xfs_trans_space.h"
> >> > > > > +#include "xfs_inode_item.h"
> >> > > > >  #include "xfs_iomap.h"
> >> > > > >  #include "xfs_trace.h"
> >> > > > >  #include "xfs_icache.h"
> >> > > > > @@ -1086,6 +1087,10 @@ xfs_file_iomap_begin(
> >> > > > >               trace_xfs_iomap_found(ip, offset, length, 0, &imap);
> >> > > > >       }
> >> > > > >
> >> > > > > +     if ((flags & IOMAP_WRITE) && xfs_ipincount(ip) &&
> >> > > > > +         (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> >> > > > > +             iomap->flags |= IOMAP_F_DIRTY;
> >> > > >
> >> > > > This is the very definition of an inode that is "fdatasync dirty".
> >> > > >
> >> > > > Hmmmm, shouldn't this also be set for read faults, too?
> >> > >
> >> > > No, read faults don't need to set IOMAP_F_DIRTY since user cannot write any
> >> > > data to the page which he'd then like to be persistent. The only reason why
> >> > > I thought it could be useful for a while was that it would be nice to make
> >> > > MAP_SYNC mapping provide the guarantee that data you see now is the data
> >> > > you'll see after a crash
> >> >
> >> > Isn't that the entire point of MAP_SYNC? i.e. That when we return
> >> > from a page fault, the app knows that the data and it's underlying
> >> > extent is on persistent storage?
> >> >
> >> > > but we cannot provide that guarantee for RO
> >> > > mapping anyway if someone else has the page mapped as well. So I just
> >> > > decided not to return IOMAP_F_DIRTY for read faults.
> >> >
> >> > If there are multiple MAP_SYNC mappings to the inode, I would have
> >> > expected that they all sync all of the data/metadata on every page
> >> > fault, regardless of who dirtied the inode. An RO mapping doesn't
> >>
> >> Well, they all do sync regardless of who dirtied the inode on every *write*
> >> fault.
> >>
> >> > mean the data/metadata on the inode can't change, it just means it
> >> > can't change through that mapping.  Running fsync() to guarantee the
> >> > persistence of that data/metadata doesn't actually changing any
> >> > data....
> >> >
> >> > IOWs, if read faults don't guarantee the mapped range has stable
> >> > extents on a MAP_SYNC mapping, then I think MAP_SYNC is broken
> >> > because it's not giving consistent guarantees to userspace. Yes, it
> >> > works fine when only one MAP_SYNC mapping is modifying the inode,
> >> > but the moment we have concurrent operations on the inode that
> >> > aren't MAP_SYNC or O_SYNC this goes out the window....
> >>
> >> MAP_SYNC as I've implemented it provides guarantees only for data the
> >> process has actually written. I agree with that and it was a conscious
> >> decision. In my opinion that covers most usecases, provides reasonably
> >> simple semantics (i.e., if you write data through MAP_SYNC mapping, you can
> >> persist it just using CPU instructions), and reasonable performance.
> >>
> >> Now you seem to suggest the semantics should be: "Data you have read from or
> >> written to a MAP_SYNC mapping can be persisted using CPU instructions." And
> >> from implementation POV we can do that rather easily (just rip out the
> >> IOMAP_WRITE checks). But I'm unsure whether this additional guarantee would
> >> be useful enough to justify the slowdown of read faults? I was not able to
> >> come up with a good usecase and so I've decided for current semantics. What
> >> do other people think?
> >
> > Nobody commented on this for couple of days so how do we proceed? I would
> > prefer to go just with a guarantee for data written and we can always make
> > the guarantee stronger (i.e. apply it also for read data) when some user
> > comes with a good usecase?
> 
> I think it is easier to strengthen the guarantee than loosen it later
> especially since it is not yet clear that we have a use case for the
> stronger semantic. At least the initial motivation for MAP_SYNC was
> for writers.

I agree.  It seems like all threads/processes in a given application need to
use MAP_SYNC consistently so they can be sure that data that is written (and
then possibly read) will be durable on media.  I think what you have is a good
starting point, and we can adjust later if necessary.

WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs@vger.kernel.org, Linux API <linux-api@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 17/17] xfs: support for synchronous DAX faults
Date: Tue, 31 Oct 2017 21:47:07 -0600	[thread overview]
Message-ID: <20171101034707.GA1662@linux.intel.com> (raw)
In-Reply-To: <CAPcyv4gBVyZ8KhB2s1M0BdhC1=HAean6Mb6oV7zsJBx0-t3bhw@mail.gmail.com>

On Tue, Oct 31, 2017 at 02:50:01PM -0700, Dan Williams wrote:
> On Tue, Oct 31, 2017 at 8:19 AM, Jan Kara <jack@suse.cz> wrote:
> > On Fri 27-10-17 12:08:34, Jan Kara wrote:
> >> On Fri 27-10-17 08:16:11, Dave Chinner wrote:
> >> > On Thu, Oct 26, 2017 at 05:48:04PM +0200, Jan Kara wrote:
> >> > > > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> >> > > > > index f179bdf1644d..b43be199fbdf 100644
> >> > > > > --- a/fs/xfs/xfs_iomap.c
> >> > > > > +++ b/fs/xfs/xfs_iomap.c
> >> > > > > @@ -33,6 +33,7 @@
> >> > > > >  #include "xfs_error.h"
> >> > > > >  #include "xfs_trans.h"
> >> > > > >  #include "xfs_trans_space.h"
> >> > > > > +#include "xfs_inode_item.h"
> >> > > > >  #include "xfs_iomap.h"
> >> > > > >  #include "xfs_trace.h"
> >> > > > >  #include "xfs_icache.h"
> >> > > > > @@ -1086,6 +1087,10 @@ xfs_file_iomap_begin(
> >> > > > >               trace_xfs_iomap_found(ip, offset, length, 0, &imap);
> >> > > > >       }
> >> > > > >
> >> > > > > +     if ((flags & IOMAP_WRITE) && xfs_ipincount(ip) &&
> >> > > > > +         (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> >> > > > > +             iomap->flags |= IOMAP_F_DIRTY;
> >> > > >
> >> > > > This is the very definition of an inode that is "fdatasync dirty".
> >> > > >
> >> > > > Hmmmm, shouldn't this also be set for read faults, too?
> >> > >
> >> > > No, read faults don't need to set IOMAP_F_DIRTY since user cannot write any
> >> > > data to the page which he'd then like to be persistent. The only reason why
> >> > > I thought it could be useful for a while was that it would be nice to make
> >> > > MAP_SYNC mapping provide the guarantee that data you see now is the data
> >> > > you'll see after a crash
> >> >
> >> > Isn't that the entire point of MAP_SYNC? i.e. That when we return
> >> > from a page fault, the app knows that the data and it's underlying
> >> > extent is on persistent storage?
> >> >
> >> > > but we cannot provide that guarantee for RO
> >> > > mapping anyway if someone else has the page mapped as well. So I just
> >> > > decided not to return IOMAP_F_DIRTY for read faults.
> >> >
> >> > If there are multiple MAP_SYNC mappings to the inode, I would have
> >> > expected that they all sync all of the data/metadata on every page
> >> > fault, regardless of who dirtied the inode. An RO mapping doesn't
> >>
> >> Well, they all do sync regardless of who dirtied the inode on every *write*
> >> fault.
> >>
> >> > mean the data/metadata on the inode can't change, it just means it
> >> > can't change through that mapping.  Running fsync() to guarantee the
> >> > persistence of that data/metadata doesn't actually changing any
> >> > data....
> >> >
> >> > IOWs, if read faults don't guarantee the mapped range has stable
> >> > extents on a MAP_SYNC mapping, then I think MAP_SYNC is broken
> >> > because it's not giving consistent guarantees to userspace. Yes, it
> >> > works fine when only one MAP_SYNC mapping is modifying the inode,
> >> > but the moment we have concurrent operations on the inode that
> >> > aren't MAP_SYNC or O_SYNC this goes out the window....
> >>
> >> MAP_SYNC as I've implemented it provides guarantees only for data the
> >> process has actually written. I agree with that and it was a conscious
> >> decision. In my opinion that covers most usecases, provides reasonably
> >> simple semantics (i.e., if you write data through MAP_SYNC mapping, you can
> >> persist it just using CPU instructions), and reasonable performance.
> >>
> >> Now you seem to suggest the semantics should be: "Data you have read from or
> >> written to a MAP_SYNC mapping can be persisted using CPU instructions." And
> >> from implementation POV we can do that rather easily (just rip out the
> >> IOMAP_WRITE checks). But I'm unsure whether this additional guarantee would
> >> be useful enough to justify the slowdown of read faults? I was not able to
> >> come up with a good usecase and so I've decided for current semantics. What
> >> do other people think?
> >
> > Nobody commented on this for couple of days so how do we proceed? I would
> > prefer to go just with a guarantee for data written and we can always make
> > the guarantee stronger (i.e. apply it also for read data) when some user
> > comes with a good usecase?
> 
> I think it is easier to strengthen the guarantee than loosen it later
> especially since it is not yet clear that we have a use case for the
> stronger semantic. At least the initial motivation for MAP_SYNC was
> for writers.

I agree.  It seems like all threads/processes in a given application need to
use MAP_SYNC consistently so they can be sure that data that is written (and
then possibly read) will be durable on media.  I think what you have is a good
starting point, and we can adjust later if necessary.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2017-11-01  3:47 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-24 15:23 [PATCH 0/17 v5] dax, ext4, xfs: Synchronous page faults Jan Kara
2017-10-24 15:23 ` Jan Kara
2017-10-24 15:23 ` Jan Kara
2017-10-24 15:23 ` Jan Kara
2017-10-24 15:24 ` [PATCH 04/17] dax: Factor out getting of pfn out of iomap Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
     [not found] ` <20171024152415.22864-1-jack-AlSwsSmVLrQ@public.gmane.org>
2017-10-24 15:23   ` [PATCH 01/17] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
     [not found]     ` <20171024152415.22864-2-jack-AlSwsSmVLrQ@public.gmane.org>
2017-10-24 21:21       ` Ross Zwisler
2017-10-24 21:21         ` Ross Zwisler
2017-10-24 21:21         ` Ross Zwisler
2017-10-24 21:21         ` Ross Zwisler
2017-10-24 15:23   ` [PATCH 02/17] mm: Remove VM_FAULT_HWPOISON_LARGE_MASK Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:23     ` Jan Kara
2017-10-24 15:24   ` [PATCH 03/17] dax: Simplify arguments of dax_insert_mapping() Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 05/17] dax: Create local variable for VMA in dax_iomap_pte_fault() Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 06/17] dax: Create local variable for vmf->flags & FAULT_FLAG_WRITE test Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 07/17] dax: Inline dax_insert_mapping() into the callsite Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 08/17] dax: Inline dax_pmd_insert_mapping() " Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 09/17] dax: Fix comment describing dax_iomap_fault() Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 10/17] dax: Allow dax_iomap_fault() to return pfn Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 12/17] mm: Define MAP_SYNC and VM_SYNC flags Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 13/17] dax, iomap: Add support for synchronous faults Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 15/17] ext4: Simplify error handling in ext4_dax_huge_fault() Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH 16/17] ext4: Support for synchronous DAX faults Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24   ` [PATCH] mmap.2: Add description of MAP_SHARED_VALIDATE and MAP_SYNC Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 15:24     ` Jan Kara
2017-10-24 21:10     ` Ross Zwisler
2017-10-24 21:10       ` Ross Zwisler
2017-10-24 21:10       ` Ross Zwisler
2017-10-26 13:24       ` Jan Kara
2017-10-26 13:24         ` Jan Kara
2017-10-26 13:24         ` Jan Kara
     [not found]         ` <20171026132402.GB31161-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-10-26 18:22           ` Ross Zwisler
2017-10-26 18:22             ` Ross Zwisler
2017-10-26 18:22             ` Ross Zwisler
2017-10-26 18:22             ` Ross Zwisler
2017-10-24 15:24 ` [PATCH 11/17] dax: Allow tuning whether dax_insert_mapping_entry() dirties entry Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24 ` [PATCH 14/17] dax: Implement dax_finish_sync_fault() Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24 ` [PATCH 17/17] xfs: support for synchronous DAX faults Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 15:24   ` Jan Kara
2017-10-24 21:29   ` Ross Zwisler
2017-10-24 21:29     ` Ross Zwisler
2017-10-24 21:29     ` Ross Zwisler
2017-10-24 22:23   ` Dave Chinner
2017-10-24 22:23     ` Dave Chinner
2017-10-24 22:23     ` Dave Chinner
2017-10-26 15:48     ` Jan Kara
2017-10-26 15:48       ` Jan Kara
2017-10-26 15:48       ` Jan Kara
2017-10-26 15:48       ` Jan Kara
     [not found]       ` <20171026154804.GF31161-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-10-26 21:16         ` Dave Chinner
2017-10-26 21:16           ` Dave Chinner
2017-10-26 21:16           ` Dave Chinner
2017-10-26 21:16           ` Dave Chinner
2017-10-27 10:08           ` Jan Kara
2017-10-27 10:08             ` Jan Kara
2017-10-27 10:08             ` Jan Kara
     [not found]             ` <20171027100834.GH31161-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-10-31 15:19               ` Jan Kara
2017-10-31 15:19                 ` Jan Kara
2017-10-31 15:19                 ` Jan Kara
2017-10-31 15:19                 ` Jan Kara
     [not found]                 ` <20171031151907.GB26128-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-10-31 21:50                   ` Dan Williams
2017-10-31 21:50                     ` Dan Williams
2017-10-31 21:50                     ` Dan Williams
2017-10-31 21:50                     ` Dan Williams
     [not found]                     ` <CAPcyv4gBVyZ8KhB2s1M0BdhC1=HAean6Mb6oV7zsJBx0-t3bhw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-11-01  3:47                       ` Ross Zwisler [this message]
2017-11-01  3:47                         ` Ross Zwisler
2017-11-01  3:47                         ` Ross Zwisler
2017-11-01  3:47                         ` Ross Zwisler
2017-10-27  6:43       ` Christoph Hellwig
2017-10-27  6:43         ` Christoph Hellwig
     [not found]         ` <20171027064301.GC22931-jcswGhMUV9g@public.gmane.org>
2017-10-27  9:13           ` Jan Kara
2017-10-27  9:13             ` Jan Kara
2017-10-27  9:13             ` Jan Kara
2017-10-27  9:13             ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2017-10-19 12:57 [PATCH 0/17 v4] dax, ext4, xfs: Synchronous page faults Jan Kara
     [not found] ` <20171019125817.11580-1-jack-AlSwsSmVLrQ@public.gmane.org>
2017-10-19 12:58   ` [PATCH 17/17] xfs: support for synchronous DAX faults Jan Kara
2017-10-19 12:58     ` Jan Kara
2017-10-19 12:58     ` Jan Kara
2017-10-19 12:58     ` Jan Kara
2017-10-19 12:58     ` Jan Kara
2017-10-19 13:17     ` Christoph Hellwig
2017-10-19 13:17       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171101034707.GA1662@linux.intel.com \
    --to=ross.zwisler-vuqaysv1563yd54fqh9/ca@public.gmane.org \
    --cc=dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org \
    --cc=hch-jcswGhMUV9g@public.gmane.org \
    --cc=hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=jack-AlSwsSmVLrQ@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org \
    --cc=linux-xfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.