linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 00/13] fs/dax: Fix FS DAX page reference counts
@ 2024-06-27  0:54 Alistair Popple
From: Alistair Popple @ 2024-06-27  0:54 UTC
  To: dan.j.williams, vishal.l.verma, dave.jiang, logang, bhelgaas,
	jack, jgg
  Cc: catalin.marinas, will, mpe, npiggin, dave.hansen, ira.weiny,
	willy, djwong, tytso, linmiaohe, david, peterx, linux-doc,
	linux-kernel, linux-arm-kernel, linuxppc-dev, nvdimm, linux-cxl,
	linux-fsdevel, linux-mm, linux-ext4, linux-xfs, jhubbard, hch,
	david, Alistair Popple

FS DAX pages have always maintained their own page reference counts
without following the normal rules for page reference counting. In
particular, pages are considered free when the refcount hits one rather
than zero, and refcounts are not added when mapping the page.
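
As an illustration only (this is not code from the series), the two
conventions can be modelled in a few lines of userspace C. Everything
here is hypothetical: struct toy_page and the legacy_dax_*/normal_*/toy_*
helpers are made-up names, and the model ignores atomics, locking and
compound pages. It only shows where the "idle" threshold sits in each
scheme.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; NOT the kernel's definition. */
struct toy_page {
	int refcount;	/* models the page reference count */
	int mapcount;	/* models how many times the page is mapped */
};

/* Extra references such as those held by GUP (both schemes). */
static void toy_get(struct toy_page *p) { p->refcount++; }
static void toy_put(struct toy_page *p) { p->refcount--; }

/*
 * Legacy FS DAX convention as described above: the page rests at a
 * refcount of one, and mapping it does not take a reference.
 */
static void legacy_dax_init(struct toy_page *p)
{
	p->refcount = 1;
	p->mapcount = 0;
}

static void legacy_dax_map(struct toy_page *p)
{
	p->mapcount++;	/* no refcount taken for the mapping */
}

static bool legacy_dax_is_idle(const struct toy_page *p)
{
	return p->refcount == 1;	/* "free" at one, not zero */
}

/*
 * Normal convention: each mapping holds a reference and the page is
 * free only once the refcount drops to zero.
 */
static void normal_init(struct toy_page *p)
{
	p->refcount = 0;
	p->mapcount = 0;
}

static void normal_map(struct toy_page *p)
{
	p->mapcount++;
	p->refcount++;	/* the mapping holds a reference */
}

static void normal_unmap(struct toy_page *p)
{
	p->mapcount--;
	p->refcount--;
}

static bool normal_is_free(const struct toy_page *p)
{
	return p->refcount == 0;
}
```

In this toy model a legacy page that is mapped but has no other
references still reports idle, whereas under the normal rules a mapped
page is never free until it has been unmapped and every extra reference
has been dropped.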

Tracking this requires special PTE bits (PTE_DEVMAP) and a secondary
mechanism for allowing GUP to hold references on the page (see
get_dev_pagemap). However, there doesn't seem to be any reason why FS
DAX pages need their own reference counting scheme.

By treating the refcounts on these pages the same way as normal pages
we can remove a lot of special checks. In particular, pXd_trans_huge()
becomes the same as pXd_leaf(), although I haven't made that change
here. It also frees up a valuable software-defined PTE bit on
architectures that have devmap PTE bits defined.

It also almost certainly allows further clean-up of the devmap managed
functions, but I have left that as a future improvement.

This is an update to the original RFC rebased onto v6.10-rc5. Unlike
the original RFC it passes the same number of ndctl test suite
(https://github.com/pmem/ndctl) tests as my current development
environment does without these patches.

I am not intimately familiar with the FS DAX code, so I would
appreciate some careful review there. In particular, I have not given
any thought at all to CONFIG_FS_DAX_LIMITED.

Signed-off-by: Alistair Popple <apopple@nvidia.com>

Alistair Popple (13):
  mm/gup.c: Remove redundant check for PCI P2PDMA page
  pci/p2pdma: Don't initialise page refcount to one
  fs/dax: Refactor wait for dax idle page
  fs/dax: Add dax_page_free callback
  mm: Allow compound zone device pages
  mm/memory: Add dax_insert_pfn
  huge_memory: Allow mappings of PUD sized pages
  huge_memory: Allow mappings of PMD sized pages
  gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages
  fs/dax: Properly refcount fs dax pages
  huge_memory: Remove dead vmf_insert_pXd code
  mm: Remove pXX_devmap callers
  mm: Remove devmap related functions and page table bits

 Documentation/mm/arch_pgtable_helpers.rst     |   6 +-
 arch/arm64/Kconfig                            |   1 +-
 arch/arm64/include/asm/pgtable-prot.h         |   1 +-
 arch/arm64/include/asm/pgtable.h              |  24 +--
 arch/powerpc/Kconfig                          |   1 +-
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   6 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   7 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  52 +----
 arch/powerpc/include/asm/book3s/64/radix.h    |  14 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c       |   3 +-
 arch/powerpc/mm/book3s64/pgtable.c            |   8 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c      |   5 +-
 arch/powerpc/mm/pgtable.c                     |   2 +-
 arch/x86/Kconfig                              |   1 +-
 arch/x86/include/asm/pgtable.h                |  50 +----
 arch/x86/include/asm/pgtable_types.h          |   5 +-
 drivers/dax/device.c                          |  12 +-
 drivers/dax/super.c                           |   2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c        |   2 +-
 drivers/nvdimm/pmem.c                         |   9 +-
 drivers/pci/p2pdma.c                          |   4 +-
 fs/dax.c                                      | 204 +++++++---------
 fs/ext4/inode.c                               |   5 +-
 fs/fuse/dax.c                                 |   4 +-
 fs/fuse/virtio_fs.c                           |   8 +-
 fs/userfaultfd.c                              |   2 +-
 fs/xfs/xfs_inode.c                            |   4 +-
 include/linux/dax.h                           |  11 +-
 include/linux/huge_mm.h                       |  17 +-
 include/linux/memremap.h                      |  23 +-
 include/linux/migrate.h                       |   2 +-
 include/linux/mm.h                            |  40 +---
 include/linux/page-flags.h                    |   6 +-
 include/linux/pfn_t.h                         |  20 +--
 include/linux/pgtable.h                       |  21 +--
 include/linux/rmap.h                          |  14 +-
 lib/test_hmm.c                                |   2 +-
 mm/Kconfig                                    |   4 +-
 mm/debug_vm_pgtable.c                         |  59 +-----
 mm/gup.c                                      | 178 +--------------
 mm/hmm.c                                      |  12 +-
 mm/huge_memory.c                              | 248 +++++++------------
 mm/internal.h                                 |   2 +-
 mm/khugepaged.c                               |   2 +-
 mm/mapping_dirty_helpers.c                    |   4 +-
 mm/memory-failure.c                           |   6 +-
 mm/memory.c                                   | 114 ++++++---
 mm/memremap.c                                 |  38 +---
 mm/migrate_device.c                           |   6 +-
 mm/mlock.c                                    |   2 +-
 mm/mm_init.c                                  |   5 +-
 mm/mprotect.c                                 |   2 +-
 mm/mremap.c                                   |   5 +-
 mm/page_vma_mapped.c                          |   5 +-
 mm/pgtable-generic.c                          |   7 +-
 mm/rmap.c                                     |  48 ++++-
 mm/swap.c                                     |   2 +-
 mm/userfaultfd.c                              |   2 +-
 mm/vmscan.c                                   |   5 +-
 59 files changed, 485 insertions(+), 869 deletions(-)

base-commit: f2661062f16b2de5d7b6a5c42a9a5c96326b8454
-- 
git-series 0.9.1


Thread overview: 54+ messages
2024-06-27  0:54 [PATCH 00/13] fs/dax: Fix FS DAX page reference counts Alistair Popple
2024-06-27  0:54 ` [PATCH 01/13] mm/gup.c: Remove redundant check for PCI P2PDMA page Alistair Popple
2024-06-27  6:36   ` Dan Williams
2024-06-27  0:54 ` [PATCH 02/13] pci/p2pdma: Don't initialise page refcount to one Alistair Popple
2024-06-27  5:30   ` Christoph Hellwig
2024-06-29 21:28   ` Bjorn Helgaas
2024-06-27  0:54 ` [PATCH 03/13] fs/dax: Refactor wait for dax idle page Alistair Popple
2024-06-27  5:31   ` Christoph Hellwig
2024-06-27  0:54 ` [PATCH 04/13] fs/dax: Add dax_page_free callback Alistair Popple
2024-06-27  5:33   ` Christoph Hellwig
2024-06-27 23:48     ` Alistair Popple
2024-06-27  0:54 ` [PATCH 05/13] mm: Allow compound zone device pages Alistair Popple
2024-06-27  5:35   ` Christoph Hellwig
2024-06-27  0:54 ` [PATCH 06/13] mm/memory: Add dax_insert_pfn Alistair Popple
2024-06-27  5:22   ` Christoph Hellwig
2024-06-27 11:33   ` Jan Kara
2024-09-06  6:21     ` Alistair Popple
2024-07-02  7:18   ` David Hildenbrand
2024-07-02 10:47     ` Alistair Popple
2024-07-02 11:46     ` Christoph Hellwig
2024-07-02 11:53       ` David Hildenbrand
2024-06-27  0:54 ` [PATCH 07/13] huge_memory: Allow mappings of PUD sized pages Alistair Popple
2024-06-27 22:26   ` kernel test robot
2024-07-02  7:16   ` David Hildenbrand
2024-07-02 10:19     ` Alistair Popple
2024-07-02 11:02       ` David Hildenbrand
2024-07-02 11:30         ` Alistair Popple
2024-07-02 13:01           ` David Hildenbrand
2024-07-02 11:51       ` Christoph Hellwig
2024-06-27  0:54 ` [PATCH 08/13] huge_memory: Allow mappings of PMD " Alistair Popple
2024-06-27  0:54 ` [PATCH 09/13] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages Alistair Popple
2024-07-01  8:59   ` David Hildenbrand
2024-07-01 23:47     ` Alistair Popple
2024-07-02 10:48       ` David Hildenbrand
2024-06-27  0:54 ` [PATCH 10/13] fs/dax: Properly refcount fs dax pages Alistair Popple
2024-06-27  5:44   ` Christoph Hellwig
2024-09-06  6:00     ` Alistair Popple
2024-06-27  0:54 ` [PATCH 11/13] huge_memory: Remove dead vmf_insert_pXd code Alistair Popple
2024-07-05 14:24   ` Peter Xu
2024-07-09  4:07     ` Alistair Popple
2024-07-09 15:56       ` Peter Xu
2024-07-12  2:40         ` Alistair Popple
2024-07-12 15:52           ` Peter Xu
2024-06-27  0:54 ` [PATCH 12/13] mm: Remove pXX_devmap callers Alistair Popple
2024-06-27  0:54 ` [PATCH 13/13] mm: Remove devmap related functions and page table bits Alistair Popple
2024-06-27 23:04   ` kernel test robot
2024-06-28  2:12   ` kernel test robot
2024-07-08 11:35   ` Will Deacon
2024-06-27  6:58 ` [PATCH 00/13] fs/dax: Fix FS DAX page reference counts Dan Williams
2024-06-27  7:15   ` Alistair Popple
2024-06-27 20:24     ` Dan Williams
2024-06-28  0:06       ` Alistair Popple
2024-07-01  4:24 ` Dave Chinner
2024-07-01  8:33   ` Alistair Popple
