From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Dave Chinner <david@fromorbit.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 6/7] mm, fs: introduce file_operations->post_mmap()
Date: Tue, 26 Sep 2017 12:57:51 -0600
Message-ID: <20170926185751.GB31146@linux.intel.com>
In-Reply-To: <CAPcyv4jtO028KeZK7SdkOUsgMLGqgttLzBCYgH0M+RP3eAXf4A@mail.gmail.com>

On Mon, Sep 25, 2017 at 04:38:45PM -0700, Dan Williams wrote:
> On Mon, Sep 25, 2017 at 4:14 PM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > When mappings are created the vma->vm_flags that they use vary based on
> > whether the inode being mapped is using DAX or not.  This setup happens in
> > XFS via mmap_region()=>call_mmap()=>xfs_file_mmap().
> >
> > For us to be able to safely use the DAX per-inode flag we need to prevent
> > S_DAX transitions when any mappings are present, and we will do that by
> > looking at the address_space->i_mmap tree and returning -EBUSY if any
> > mappings are present.
> >
> > Unfortunately at the time that the filesystem's file_operations->mmap()
> > entry point is called the mapping has not yet been added to the
> > address_space->i_mmap tree.  This means that at that point in time we
> > cannot determine whether or not the mapping will be set up to support DAX.
> >
> > Fix this by adding a new file_operations entry called post_mmap() which is
> > called after the mapping has been added to the address_space->i_mmap tree.
> > This post_mmap() op now happens at a time when we can be sure whether the
> > mapping will use DAX or not, and we can set up the vma->vm_flags
> > appropriately.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > ---
> >  fs/xfs/xfs_file.c  | 15 ++++++++++++++-
> >  include/linux/fs.h |  1 +
> >  mm/mmap.c          |  2 ++
> >  3 files changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 2816858..9d66aaa 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1087,9 +1087,21 @@ xfs_file_mmap(
> >  {
> >         file_accessed(filp);
> >         vma->vm_ops = &xfs_file_vm_ops;
> > +       return 0;
> > +}
> > +
> > +/* This call happens during mmap(), after the vma has been inserted into the
> > + * inode->i_mapping->i_mmap tree.  At this point the decision on whether or
> > + * not to use DAX for this mapping has been set and will not change for the
> > + * duration of the mapping.
> > + */
> > +STATIC void
> > +xfs_file_post_mmap(
> > +       struct file     *filp,
> > +       struct vm_area_struct *vma)
> > +{
> >         if (IS_DAX(file_inode(filp)))
> >                 vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
> 
> It's not clear to me what this is actually protecting? vma_is_dax()
> returns true regardless of the vm_flags state, so what is the benefit
> to delaying the vm_flags setting to ->post_mmap()?

Right, but the point is that until the vma has been inserted into the
inode->i_mapping->i_mmap tree, the result of IS_DAX() can't be relied on
because it can still change.  Until this insertion happens we cannot know
whether or not we should set up the vma->vm_flags to support DAX mappings
(i.e. have VM_MIXEDMAP and VM_HUGEPAGE set).  This decision can only be made
(in this proposed scheme) *after* the inode->i_mapping->i_mmap tree has been
populated, which means we need another call into the filesystem after this
insertion has happened.

We don't want to mess with the existing file_operations->mmap() call because
in many filesystems that does sanity checking and setup that you really want
to have happen *before* the mapping is completed and inserted into the
inode->i_mapping->i_mmap tree.
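
To make the ordering argument concrete, here is a toy userspace model of it.
This is only a sketch and not kernel code: every toy_* name is made up for
illustration, the mapping count stands in for a non-empty i_mmap tree, and
the "flip" argument stands in for a racing S_DAX transition that lands
between ->mmap() and the insertion.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VM_MIXEDMAP and VM_HUGEPAGE. */
#define TOY_VM_MIXEDMAP 0x1u
#define TOY_VM_HUGEPAGE 0x2u

struct toy_inode {
	bool dax;		/* stands in for S_DAX */
	int nr_mappings;	/* stands in for the i_mmap tree population */
};

struct toy_vma {
	unsigned int flags;	/* stands in for vma->vm_flags */
};

/* Proposed rule: refuse an S_DAX transition while any mapping exists. */
static int toy_set_dax(struct toy_inode *inode, bool enable)
{
	if (inode->nr_mappings > 0)
		return -EBUSY;
	inode->dax = enable;
	return 0;
}

/* ->mmap() stage: runs before insertion, so it must not look at DAX. */
static int toy_file_mmap(struct toy_inode *inode, struct toy_vma *vma)
{
	(void)inode;
	vma->flags = 0;
	return 0;
}

/* ->post_mmap() stage: runs after insertion, DAX state is now pinned. */
static void toy_file_post_mmap(struct toy_inode *inode, struct toy_vma *vma)
{
	if (inode->dax)
		vma->flags |= TOY_VM_MIXEDMAP | TOY_VM_HUGEPAGE;
}

/* Models the call order: ->mmap(), insertion, then ->post_mmap(). */
static int toy_mmap_region(struct toy_inode *inode, struct toy_vma *vma,
			   bool flip_before_insert)
{
	int err = toy_file_mmap(inode, vma);

	if (err)
		return err;

	/* A racing transition here is allowed if no mapping exists yet. */
	if (flip_before_insert)
		(void)toy_set_dax(inode, !inode->dax);

	inode->nr_mappings++;		/* vma is now visible to toy_set_dax() */
	toy_file_post_mmap(inode, vma);
	return 0;
}
```

In this model, calling toy_mmap_region() with flip_before_insert set on an
inode that starts with dax == true and no mappings leaves the vma without
the DAX flags, because the flags are only chosen after the insertion.  Had
they been chosen in toy_file_mmap(), they would describe a state the inode
no longer has.  Any later toy_set_dax() call on that inode returns -EBUSY.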

> Also, why is this a file_operation and not a vm_operation?

Because ->mmap() is also a file_operation, and this is an analogous call from
the mmap code that needs to happen at a different time.  Or are you suggesting
that file_operations->mmap() should be moved to be a vm_operation?  If not,
why would one be in one operations table and one in another?