linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, etnaviv@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-arm-kernel@lists.infradead.org,
	linux-samsung-soc@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-media@vger.kernel.org, linux-kselftest@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>, Nadav Amit <namit@vmware.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Lucas Stach <l.stach@pengutronix.de>,
	David Airlie <airlied@gmail.com>,
	Oded Gabbay <ogabbay@kernel.org>, Arnd Bergmann <arnd@arndb.de>
Subject: [PATCH RFC 00/19] mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)
Date: Mon,  7 Nov 2022 17:17:21 +0100	[thread overview]
Message-ID: <20221107161740.144456-1-david@redhat.com> (raw)

For now, we did not support reliable R/O long-term pinning in COW mappings.
That means, if we would trigger R/O long-term pinning in MAP_PRIVATE
mapping, we could end up pinning the (R/O-mapped) shared zeropage or a
pagecache page.

The next write access would trigger a write fault and replace the pinned
page by an exclusive anonymous page in the process page table; whatever the
process would write to that private page copy would not be visible by the
owner of the previous page pin: for example, RDMA could read stale data.
The end result is essentially an unexpected and hard-to-debug memory
corruption.

Some drivers tried working around that limitation by using
"FOLL_FORCE|FOLL_WRITE|FOLL_LONGTERM" for R/O long-term pinning for now.
FOLL_WRITE would trigger a write fault, if required, and break COW before
pinning the page. FOLL_FORCE is required because the VMA might lack write
permissions, and drivers wanted to make that working as well, just like
one would expect (no write access, but still triggering a write access to
break COW).

However, that is not a practical solution, because
(1) Drivers that don't stick to that undocumented and debatable pattern
    would still run into that issue. For example, VFIO only uses
    FOLL_LONGTERM for R/O long-term pinning.
(2) Using FOLL_WRITE just to work around a COW mapping + page pinning
    limitation is unintuitive. FOLL_WRITE would, for example, mark the
    page softdirty or trigger uffd-wp, even though, there actually isn't
    going to be any write access.
(3) The purpose of FOLL_FORCE is debug access, not access without lack of
    VMA permissions by arbitrarty drivers.

So instead, make R/O long-term pinning work as expected, by breaking COW
in a COW mapping early, such that we can remove any FOLL_FORCE usage from
drivers. More details in patch #8.

Patches #1--#3 add COW tests for non-anonymous pages.
Patches #4--#7 prepare core MM for extended FAULT_FLAG_UNSHARE support in
COW mappings.
Patch #8 implements reliable R/O long-term pinning in COW mappings
Patches #9--#19 remove any FOLL_FORCE usage from drivers.

I'm refraining from CCing all driver maintainers on the whole patch set.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Shuah Khan <shuah@kernel.org
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>

David Hildenbrand (19):
  selftests/vm: anon_cow: prepare for non-anonymous COW tests
  selftests/vm: cow: basic COW tests for non-anonymous pages
  selftests/vm: cow: R/O long-term pinning reliability tests for
    non-anon pages
  mm: add early FAULT_FLAG_UNSHARE consistency checks
  mm: add early FAULT_FLAG_WRITE consistency checks
  mm: rework handling in do_wp_page() based on private vs. shared
    mappings
  mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for
    private mappings
  mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping
  mm/gup: reliable R/O long-term pinning in COW mappings
  RDMA/umem: remove FOLL_FORCE usage
  RDMA/usnic: remove FOLL_FORCE usage
  RDMA/siw: remove FOLL_FORCE usage
  media: videobuf-dma-sg: remove FOLL_FORCE usage
  drm/etnaviv: remove FOLL_FORCE usage
  media: pci/ivtv: remove FOLL_FORCE usage
  mm/frame-vector: remove FOLL_FORCE usage
  drm/exynos: remove FOLL_FORCE usage
  RDMA/hw/qib/qib_user_pages: remove FOLL_FORCE usage
  habanalabs: remove FOLL_FORCE usage

 drivers/gpu/drm/etnaviv/etnaviv_gem.c         |   8 +-
 drivers/gpu/drm/exynos/exynos_drm_g2d.c       |   2 +-
 drivers/infiniband/core/umem.c                |   8 +-
 drivers/infiniband/hw/qib/qib_user_pages.c    |   2 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c      |   9 +-
 drivers/infiniband/sw/siw/siw_mem.c           |   9 +-
 drivers/media/common/videobuf2/frame_vector.c |   2 +-
 drivers/media/pci/ivtv/ivtv-udma.c            |   2 +-
 drivers/media/pci/ivtv/ivtv-yuv.c             |   5 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c     |  14 +-
 drivers/misc/habanalabs/common/memory.c       |   3 +-
 include/linux/mm.h                            |  27 +-
 include/linux/mm_types.h                      |   8 +-
 mm/gup.c                                      |  10 +-
 mm/huge_memory.c                              |   5 +-
 mm/hugetlb.c                                  |  12 +-
 mm/memory.c                                   |  97 +++--
 tools/testing/selftests/vm/.gitignore         |   2 +-
 tools/testing/selftests/vm/Makefile           |  10 +-
 tools/testing/selftests/vm/check_config.sh    |   4 +-
 .../selftests/vm/{anon_cow.c => cow.c}        | 387 +++++++++++++++++-
 tools/testing/selftests/vm/run_vmtests.sh     |   8 +-
 22 files changed, 516 insertions(+), 118 deletions(-)
 rename tools/testing/selftests/vm/{anon_cow.c => cow.c} (74%)

-- 
2.38.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

             reply	other threads:[~2022-11-07 16:19 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-07 16:17 David Hildenbrand [this message]
2022-11-07 16:17 ` [PATCH RFC 01/19] selftests/vm: anon_cow: prepare for non-anonymous COW tests David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 02/19] selftests/vm: cow: basic COW tests for non-anonymous pages David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 03/19] selftests/vm: cow: R/O long-term pinning reliability tests for non-anon pages David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 04/19] mm: add early FAULT_FLAG_UNSHARE consistency checks David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 05/19] mm: add early FAULT_FLAG_WRITE " David Hildenbrand
2022-11-07 19:03   ` Nadav Amit
2022-11-07 19:27     ` David Hildenbrand
2022-11-07 19:50       ` Nadav Amit
2022-11-07 16:17 ` [PATCH RFC 06/19] mm: rework handling in do_wp_page() based on private vs. shared mappings David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 07/19] mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for private mappings David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 08/19] mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 09/19] mm/gup: reliable R/O long-term pinning in COW mappings David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 10/19] RDMA/umem: remove FOLL_FORCE usage David Hildenbrand
2022-11-14  8:30   ` Leon Romanovsky
2022-11-07 16:17 ` [PATCH RFC 11/19] RDMA/usnic: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 12/19] RDMA/siw: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 13/19] media: videobuf-dma-sg: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 14/19] drm/etnaviv: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 15/19] media: pci/ivtv: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 16/19] mm/frame-vector: " David Hildenbrand
2022-11-08  4:45   ` Tomasz Figa
2022-11-08  9:40     ` David Hildenbrand
2022-11-22 12:25     ` Hans Verkuil
2022-11-22 12:38       ` David Hildenbrand
2022-11-22 14:07         ` Hans Verkuil
2022-11-22 15:03           ` David Hildenbrand
2022-11-22 17:33       ` Linus Torvalds
2022-11-07 16:17 ` [PATCH RFC 17/19] drm/exynos: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 18/19] RDMA/hw/qib/qib_user_pages: " David Hildenbrand
2022-11-07 16:17 ` [PATCH RFC 19/19] habanalabs: " David Hildenbrand
2022-11-07 21:25   ` Oded Gabbay
2022-11-07 17:27 ` [PATCH RFC 00/19] mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning) Linus Torvalds
2022-11-08  9:29   ` David Hildenbrand
2022-11-14  6:03   ` Christoph Hellwig
2022-11-14  8:07     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221107161740.144456-1-david@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=airlied@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=etnaviv@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=hughd@google.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=l.stach@pengutronix.de \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=namit@vmware.com \
    --cc=ogabbay@kernel.org \
    --cc=peterx@redhat.com \
    --cc=songmuchun@bytedance.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).