From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <mawilcox@microsoft.com>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 14/15] dax: associate mappings with inodes, and warn if dma collides with truncate
Date: Thu, 21 Dec 2017 09:14:47 +1100 [thread overview]
Message-ID: <20171220221447.GG4094@dastard> (raw)
In-Reply-To: <CAPcyv4irj_+pJdX1SO6MjsxURcKm8--i_QvyudgHTZE2w4w-sA@mail.gmail.com>
On Tue, Dec 19, 2017 at 05:11:38PM -0800, Dan Williams wrote:
> On Fri, Nov 10, 2017 at 1:08 AM, Christoph Hellwig <hch@lst.de> wrote:
> >> + struct {
> >> + /*
> >> + * ZONE_DEVICE pages are never on an lru or handled by
> >> + * a slab allocator, this points to the hosting device
> >> + * page map.
> >> + */
> >> + struct dev_pagemap *pgmap;
> >> + /*
> >> + * inode association for MEMORY_DEVICE_FS_DAX page-idle
> >> + * callbacks. Note that we don't use ->mapping since
> >> + * that has hard coded page-cache assumptions in
> >> + * several paths.
> >> + */
> >
> > What assumptions? I'd much rather fix those up than having two fields
> > that have the same functionality.
>
> [ Reviving this old thread where you asked why I introduce page->inode
> instead of reusing page->mapping ]
>
> For example, xfs_vm_set_page_dirty() assumes that page->mapping being
> non-NULL indicates a typical page cache page, this is a false
> assumption for DAX.
That means every single filesystem has an incorrect assumption for
DAX pages. xfs_vm_set_page_dirty() is derived directly from
__set_page_dirty_buffers(), which is the default function that
set_page_dirty() calls to do it's work. Indeed, ext4 also calls
__set_page_dirty_buffers(), so whatever problem XFS has here with
DAX and racing truncates is going to manifest in ext4 as well.
> My guess at a fix for this is to add
> pagecache_page() checks to locations like this, but I worry about how
> to find them all. Where pagecache_page() is:
>
> bool pagecache_page(struct page *page)
> {
> if (!page->mapping)
> return false;
> if (!IS_DAX(page->mapping->host))
> return false;
> return true;
> }
This is likely to be a problem in lots more places if we have to
treat "has page been truncated away" race checks on dax mappings
differently to page cache mappings. This smells of a whack-a-mole
style bandaid to me....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
Jan Kara <jack@suse.cz>, Matthew Wilcox <mawilcox@microsoft.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Jeff Moyer <jmoyer@redhat.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 14/15] dax: associate mappings with inodes, and warn if dma collides with truncate
Date: Thu, 21 Dec 2017 09:14:47 +1100 [thread overview]
Message-ID: <20171220221447.GG4094@dastard> (raw)
In-Reply-To: <CAPcyv4irj_+pJdX1SO6MjsxURcKm8--i_QvyudgHTZE2w4w-sA@mail.gmail.com>
On Tue, Dec 19, 2017 at 05:11:38PM -0800, Dan Williams wrote:
> On Fri, Nov 10, 2017 at 1:08 AM, Christoph Hellwig <hch@lst.de> wrote:
> >> + struct {
> >> + /*
> >> + * ZONE_DEVICE pages are never on an lru or handled by
> >> + * a slab allocator, this points to the hosting device
> >> + * page map.
> >> + */
> >> + struct dev_pagemap *pgmap;
> >> + /*
> >> + * inode association for MEMORY_DEVICE_FS_DAX page-idle
> >> + * callbacks. Note that we don't use ->mapping since
> >> + * that has hard coded page-cache assumptions in
> >> + * several paths.
> >> + */
> >
> > What assumptions? I'd much rather fix those up than having two fields
> > that have the same functionality.
>
> [ Reviving this old thread where you asked why I introduce page->inode
> instead of reusing page->mapping ]
>
> For example, xfs_vm_set_page_dirty() assumes that page->mapping being
> non-NULL indicates a typical page cache page, this is a false
> assumption for DAX.
That means every single filesystem has an incorrect assumption for
DAX pages. xfs_vm_set_page_dirty() is derived directly from
__set_page_dirty_buffers(), which is the default function that
set_page_dirty() calls to do it's work. Indeed, ext4 also calls
__set_page_dirty_buffers(), so whatever problem XFS has here with
DAX and racing truncates is going to manifest in ext4 as well.
> My guess at a fix for this is to add
> pagecache_page() checks to locations like this, but I worry about how
> to find them all. Where pagecache_page() is:
>
> bool pagecache_page(struct page *page)
> {
> if (!page->mapping)
> return false;
> if (!IS_DAX(page->mapping->host))
> return false;
> return true;
> }
This is likely to be a problem in lots more places if we have to
treat "has page been truncated away" race checks on dax mappings
differently to page cache mappings. This smells of a whack-a-mole
style bandaid to me....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
Jan Kara <jack@suse.cz>, Matthew Wilcox <mawilcox@microsoft.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Jeff Moyer <jmoyer@redhat.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 14/15] dax: associate mappings with inodes, and warn if dma collides with truncate
Date: Thu, 21 Dec 2017 09:14:47 +1100 [thread overview]
Message-ID: <20171220221447.GG4094@dastard> (raw)
In-Reply-To: <CAPcyv4irj_+pJdX1SO6MjsxURcKm8--i_QvyudgHTZE2w4w-sA@mail.gmail.com>
On Tue, Dec 19, 2017 at 05:11:38PM -0800, Dan Williams wrote:
> On Fri, Nov 10, 2017 at 1:08 AM, Christoph Hellwig <hch@lst.de> wrote:
> >> + struct {
> >> + /*
> >> + * ZONE_DEVICE pages are never on an lru or handled by
> >> + * a slab allocator, this points to the hosting device
> >> + * page map.
> >> + */
> >> + struct dev_pagemap *pgmap;
> >> + /*
> >> + * inode association for MEMORY_DEVICE_FS_DAX page-idle
> >> + * callbacks. Note that we don't use ->mapping since
> >> + * that has hard coded page-cache assumptions in
> >> + * several paths.
> >> + */
> >
> > What assumptions? I'd much rather fix those up than having two fields
> > that have the same functionality.
>
> [ Reviving this old thread where you asked why I introduce page->inode
> instead of reusing page->mapping ]
>
> For example, xfs_vm_set_page_dirty() assumes that page->mapping being
> non-NULL indicates a typical page cache page, this is a false
> assumption for DAX.
That means every single filesystem has an incorrect assumption for
DAX pages. xfs_vm_set_page_dirty() is derived directly from
__set_page_dirty_buffers(), which is the default function that
set_page_dirty() calls to do it's work. Indeed, ext4 also calls
__set_page_dirty_buffers(), so whatever problem XFS has here with
DAX and racing truncates is going to manifest in ext4 as well.
> My guess at a fix for this is to add
> pagecache_page() checks to locations like this, but I worry about how
> to find them all. Where pagecache_page() is:
>
> bool pagecache_page(struct page *page)
> {
> if (!page->mapping)
> return false;
> if (!IS_DAX(page->mapping->host))
> return false;
> return true;
> }
This is likely to be a problem in lots more places if we have to
treat "has page been truncated away" race checks on dax mappings
differently to page cache mappings. This smells of a whack-a-mole
style bandaid to me....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-12-20 22:10 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-31 23:21 [PATCH 00/15] dax: prep work for fixing dax-dma vs truncate collisions Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` [PATCH 01/15] dax: quiet bdev_dax_supported() Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-11-02 20:11 ` Christoph Hellwig
2017-11-02 20:11 ` Christoph Hellwig
2017-10-31 23:21 ` [PATCH 02/15] mm, dax: introduce pfn_t_special() Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-11-03 2:32 ` Michael Ellerman
2017-11-03 2:32 ` Michael Ellerman
2017-11-03 2:32 ` Michael Ellerman
2017-10-31 23:21 ` [PATCH 03/15] dax: require 'struct page' by default for filesystem dax Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` [PATCH 04/15] brd: remove dax support Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-10-31 23:21 ` Dan Williams
2017-11-02 20:12 ` Christoph Hellwig
2017-11-02 20:12 ` Christoph Hellwig
2017-11-04 16:31 ` Jens Axboe
2017-11-04 16:31 ` Jens Axboe
2017-11-04 16:31 ` Jens Axboe
2017-10-31 23:22 ` [PATCH 05/15] dax: stop using VM_MIXEDMAP for dax Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` [PATCH 06/15] dax: stop using VM_HUGEPAGE " Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` [PATCH 07/15] dax: stop requiring a live device for dax_flush() Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-02 20:12 ` Christoph Hellwig
2017-11-02 20:12 ` Christoph Hellwig
2017-10-31 23:22 ` [PATCH 08/15] dax: store pfns in the radix Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` [PATCH 09/15] tools/testing/nvdimm: add 'bio_delay' mechanism Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` [PATCH 10/15] IB/core: disable memory registration of fileystem-dax vmas Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-02 20:13 ` Christoph Hellwig
2017-11-02 20:13 ` Christoph Hellwig
2017-11-02 21:06 ` Dan Williams
2017-11-02 21:06 ` Dan Williams
2017-11-02 21:06 ` Dan Williams
2017-10-31 23:22 ` [PATCH 11/15] [media] v4l2: disable filesystem-dax mapping support Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` [PATCH 12/15] mm, dax: enable filesystems to trigger page-idle callbacks Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-10 9:04 ` Christoph Hellwig
2017-11-10 9:04 ` Christoph Hellwig
2017-10-31 23:22 ` [PATCH 13/15] mm, devmap: introduce CONFIG_DEVMAP_MANAGED_PAGES Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-10 9:06 ` Christoph Hellwig
2017-11-10 9:06 ` Christoph Hellwig
2017-10-31 23:22 ` [PATCH 14/15] dax: associate mappings with inodes, and warn if dma collides with truncate Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-10 9:08 ` Christoph Hellwig
2017-11-10 9:08 ` Christoph Hellwig
2017-11-10 9:08 ` Christoph Hellwig
2017-12-20 1:11 ` Dan Williams
2017-12-20 1:11 ` Dan Williams
2017-12-20 14:38 ` Jan Kara
2017-12-20 14:38 ` Jan Kara
2017-12-20 22:41 ` Dan Williams
2017-12-20 22:41 ` Dan Williams
2017-12-20 22:41 ` Dan Williams
2017-12-21 12:14 ` Jan Kara
2017-12-21 12:14 ` Jan Kara
2017-12-21 17:31 ` Dan Williams
2017-12-21 17:31 ` Dan Williams
2017-12-22 8:51 ` Jan Kara
2017-12-22 8:51 ` Jan Kara
2017-12-20 22:14 ` Dave Chinner [this message]
2017-12-20 22:14 ` Dave Chinner
2017-12-20 22:14 ` Dave Chinner
2017-10-31 23:22 ` [PATCH 15/15] wait_bit: introduce {wait_on,wake_up}_devmap_idle Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-10-31 23:22 ` Dan Williams
2017-11-10 9:09 ` Christoph Hellwig
2017-11-10 9:09 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171220221447.GG4094@dastard \
--to=david@fromorbit.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nvdimm@lists.01.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mawilcox@microsoft.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.