All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Jerome Glisse <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Alistair Popple <apopple@nvidia.com>
Subject: Re: [PATCH v8 22/23] mm: Enable PTE markers by default
Date: Wed, 20 Apr 2022 09:46:07 -0400	[thread overview]
Message-ID: <YmAOn75O7zOOioXQ@cmpxchg.org> (raw)
In-Reply-To: <Yl8xKh9Dc+nPr30a@xz-m1.local>

On Tue, Apr 19, 2022 at 06:01:14PM -0400, Peter Xu wrote:
> On Tue, Apr 19, 2022 at 05:24:28PM -0400, Johannes Weiner wrote:
> > On Tue, Apr 19, 2022 at 04:28:16PM -0400, Peter Xu wrote:
> > > On Tue, Apr 19, 2022 at 04:14:11PM -0400, Johannes Weiner wrote:
> > > > On Tue, Apr 19, 2022 at 03:59:21PM -0400, Peter Xu wrote:
> > > > > @@ -910,16 +910,16 @@ config ANON_VMA_NAME
> > > > Btw, this doesn't do much without userfaultfd being enabled in
> > > > general, right?
> > > 
> > > So far yes, but I'm thinking there can be potential other users of
> > > PTE_MARKERS from mm world.  The most close discussion is on the swap read
> > > failures and this patch proposed by Miaohe:
> > > 
> > > https://lore.kernel.org/lkml/20220416030549.60559-1-linmiaohe@huawei.com/
> > >
> > > So I hope we can still keep them around here under mm/ if possible, and
> > > from the gut feeling it really should..
> > 
> > Agreed, mm/ seems a good fit for PTE_MARKER.
> > 
> > If it's invisible and gets selected as needed, it's less of a concern,
> > IMO. I'm somewhat worried about when and how the user-visible options
> > show up right now, though...
> > 
> > > > Would it make sense to have it next to 'config USERFAULTFD' as a
> > > > sub-option?
> > > 
> > > Yes another good question. :)
> > > 
> > > IIUC CONFIG_USERFAULTFD resides in init/Kconfig because it introduces a new
> > > syscall.  Same to the rest of the bits for uffd since then, namely:
> > > 
> > >   - USERFAULTFD_WP
> > >   - USERFAULTFD_MINOR
> > > 
> > > What I am thinking now is the other way round of your suggestion: whether
> > > we should move most of them out, at least the _WP and _MINOR configs into
> > > mm/?  Because IMHO they are really pure mm ideas and they're irrelevant to
> > > syscalls and init.
> > 
> > I'm thinking the MM submenu would probably be a better fit for all
> > user-visible userfaultfd options, including the syscall. Like you say,
> > it's an MM concept.
> > 
> > But if moving the syscall knob out from init isn't popular, IMO it
> > would be better to add the new WP option to init as well. This ensures
> > that when somebody selects userfaultfd, they also see the relevant
> > suboptions and don't have to chase them down across multiple submenus.
> > 
> > Conversely, they should also have the necessary depend clauses so that
> > suboptions aren't visible without the main feature. E.g. it asked me
> > for userfaultd options even though I have CONFIG_USERFAULTFD=n.
> 
> Hmm, this is a bit weird... since we do have that dependency chain for
> PTE_MARKER_UFFD_WP -> HAVE_ARCH_USERFAULTFD_WP -> USERFAULTFD:
> 
>   in arch/x86/Kconfig:
>   config X86
>           ...
>           select HAVE_ARCH_USERFAULTFD_WP         if X86_64 && USERFAULTFD
> 
>   in mm/Kconfig (with/without the "mm/uffd: Hide PTE_MARKER" patch applied):
>   config PTE_MARKER_UFFD_WP
>           ...
>           depends on HAVE_ARCH_USERFAULTFD_WP
> 
> So logically if !USERFAULTFD we shouldn't see PTE_MARKER_UFFD_WP at all?
> 
> That's also what I got when I tried it out for either !USERFAULTFD on x86,
> or any non-x86 platforms (because there we have !HAVE_ARCH_USERFAULTFD_WP
> constantly irrelevant of USERFAULTFD).  Though I could have missed
> something..

Sorry, it asked me about PTE_MARKERS, and that conclusion got stuck in
my head. Indeed, once that symbol is invisible we should be good.

> > What do you think?
> 
> I don't have a strong preference here, I think it's okay if it's preferred
> that we only put user-visible configs into mm/Kconfig.  It's just that I
> see we have tons of user-invisible configs already in mm/Kconfig, to list
> some:
> 
>         config ARCH_HAS_HUGEPD
>         config MAPPING_DIRTY_HELPERS
>         config KMAP_LOCAL
>         config KMAP_LOCAL_NON_LINEAR_PTE_ARRAY
> 
> But I'm not sure whether it's a rule of thumb somewhere else.

I wasn't objecting to invisible symbols in mm/.

My point was simply that for the user it might be easiest and most
intuitive if userfaultfd and its related suboptions are 1) grouped
together and 2) in the MM submenu.

> At the meantime, I also looked at whether syscall configs are always and
> only be put under init/, and funnily I got:
> 
> $ find . -name Kconfig | xargs grep --color -E "\".*syscall.*\""
> ./init/Kconfig: bool "Enable process_vm_readv/writev syscalls"
> ./init/Kconfig: bool "uselib syscall"
> ./init/Kconfig: bool "sgetmask/ssetmask syscalls support" if EXPERT
> ./init/Kconfig: bool "Sysfs syscall support" if EXPERT
> ./init/Kconfig: bool "open by fhandle syscalls" if EXPERT
> ./init/Kconfig: bool "Enable madvise/fadvise syscalls" if EXPERT
> ./arch/xtensa/Kconfig:  bool "Enable fast atomic syscalls"
> ./arch/xtensa/Kconfig:  bool "Enable spill registers syscall"
> ./arch/powerpc/Kconfig: bool "Support setting protections for 4k subpages (subpage_prot syscall)"
> ./arch/powerpc/Kconfig: bool "Enable filtering of RTAS syscalls"
> ./arch/Kconfig: bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
> ./arch/s390/Kconfig:    bool "Verify kernel signature during kexec_file_load() syscall"
> ./arch/sh/mm/Kconfig:   bool "Support vsyscall page"
> ./arch/x86/Kconfig:     bool "Enable vsyscall emulation" if EXPERT
> ./arch/x86/Kconfig:     bool "Verify kernel signature during kexec_file_load() syscall"
> ./arch/x86/Kconfig:     bool "Require a valid signature in kexec_file_load() syscall"
> ./arch/x86/Kconfig:     prompt "vsyscall table for legacy applications"
> ./arch/arm64/Kconfig:   bool "Verify kernel signature during kexec_file_load() syscall"
> ./arch/arm64/Kconfig:   bool "Enable the tagged user addresses syscall ABI"
> ./kernel/trace/Kconfig: bool "Trace syscalls"
> ./kernel/trace/Kconfig: bool "Run selftest on syscall events"
> 
> So let's put aside arch specific lines, ftrace does have FTRACE_SYSCALLS
> that lies in the kernel/trace/ dir.. not sure whether we could move
> USERFAULTFD and all the rest into mm/ as well?  Or perhaps that's just a
> bad example? :)

Yeah it looks like there is a healthy mix ;) To add to the list:

- mm/Kconfig has CONFIG_SWAP for the swapon/swapoff syscalls.
- fs/Kconfig has CONFIG_FILE_LOCKING, which adds the flock() syscall.
- Interestingly, fs/Kconfig has CONFIG_MEMFD_CREATE for memfd_create()
  which is implemented in mm/memfd.c.

It seems reasonable to me to move the userfaultfd stuff to mm as well,
especially when it's becoming more than just a single binary question
on whether you want a syscall or not, and has MM-specific suboptions.


  reply	other threads:[~2022-04-20 13:46 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05  1:46 [PATCH v8 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2022-04-05  1:46 ` [PATCH v8 01/23] mm: Introduce PTE_MARKER swap entry Peter Xu
2022-04-12  1:07   ` Alistair Popple
2022-04-12 19:45     ` Peter Xu
2022-04-13  0:30       ` Alistair Popple
2022-04-13 13:44         ` Peter Xu
2022-04-19  8:25   ` Alistair Popple
2022-04-19 19:44     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 02/23] mm: Teach core mm about pte markers Peter Xu
2022-04-12  1:22   ` Alistair Popple
2022-04-12 19:53     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 03/23] mm: Check against orig_pte for finish_fault() Peter Xu
2022-04-12  2:05   ` Alistair Popple
2022-04-12 19:54     ` Peter Xu
2022-04-13 14:03   ` Marek Szyprowski
2022-04-13 16:43     ` Peter Xu
2022-04-14  7:51       ` Marek Szyprowski
2022-04-14 16:30         ` Peter Xu
2022-04-14 20:57           ` Andrew Morton
2022-04-14 21:08             ` Peter Xu
2022-04-15 14:21   ` Guenter Roeck
2022-04-15 14:41     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 04/23] mm/uffd: PTE_MARKER_UFFD_WP Peter Xu
2022-04-06  1:41   ` kernel test robot
2022-04-05  1:48 ` [PATCH v8 05/23] mm/shmem: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2022-04-05  1:48 ` [PATCH v8 06/23] mm/shmem: Handle uffd-wp special pte in page fault handler Peter Xu
2022-05-11 16:30   ` David Hildenbrand
2022-05-12 16:34     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 07/23] mm/shmem: Persist uffd-wp bit across zapping for file-backed Peter Xu
2022-04-05  1:48 ` [PATCH v8 08/23] mm/shmem: Allow uffd wr-protect none pte for file-backed mem Peter Xu
2022-04-05  1:48 ` [PATCH v8 09/23] mm/shmem: Allows file-back mem to be uffd wr-protected on thps Peter Xu
2022-04-05  1:48 ` [PATCH v8 10/23] mm/shmem: Handle uffd-wp during fork() Peter Xu
2022-04-06  6:16   ` kernel test robot
2022-04-06 12:18     ` Peter Xu
2022-04-06 12:18       ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 11/23] mm/hugetlb: Introduce huge pte version of uffd-wp helpers Peter Xu
2022-04-05  1:49 ` [PATCH v8 12/23] mm/hugetlb: Hook page faults for uffd write protection Peter Xu
2022-04-05  1:49 ` [PATCH v8 13/23] mm/hugetlb: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2022-04-05  1:49 ` [PATCH v8 14/23] mm/hugetlb: Handle UFFDIO_WRITEPROTECT Peter Xu
2022-04-05  1:49 ` [PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults Peter Xu
2022-04-06 13:37   ` kernel test robot
2022-04-06 15:02     ` Peter Xu
2022-04-06 15:02       ` Peter Xu
2022-04-05  1:49 ` [PATCH v8 16/23] mm/hugetlb: Allow uffd wr-protect none ptes Peter Xu
2022-04-05  1:49 ` [PATCH v8 17/23] mm/hugetlb: Only drop uffd-wp special pte if required Peter Xu
2022-04-05  1:49 ` [PATCH v8 18/23] mm/hugetlb: Handle uffd-wp during fork() Peter Xu
2022-04-05  1:49 ` [PATCH v8 19/23] mm/khugepaged: Don't recycle vma pgtable if uffd-wp registered Peter Xu
2022-04-05  1:49 ` [PATCH v8 20/23] mm/pagemap: Recognize uffd-wp bit for shmem/hugetlbfs Peter Xu
2022-04-05  1:49 ` [PATCH v8 21/23] mm/uffd: Enable write protection for shmem & hugetlbfs Peter Xu
2022-04-05  1:49 ` [PATCH v8 22/23] mm: Enable PTE markers by default Peter Xu
2022-04-19 15:13   ` Johannes Weiner
2022-04-19 19:59     ` Peter Xu
2022-04-19 20:14       ` Johannes Weiner
2022-04-19 20:28         ` Peter Xu
2022-04-19 21:24           ` Johannes Weiner
2022-04-19 22:01             ` Peter Xu
2022-04-20 13:46               ` Johannes Weiner [this message]
2022-04-20 14:25                 ` Peter Xu
2022-04-05  1:49 ` [PATCH v8 23/23] selftests/uffd: Enable uffd-wp for shmem/hugetlbfs Peter Xu
2022-04-05 22:16 ` [PATCH v8 00/23] userfaultfd-wp: Support shmem and hugetlbfs Andrew Morton
2022-04-05 22:42   ` Peter Xu
2022-04-05 22:49     ` Andrew Morton
2022-04-05 23:02       ` Peter Xu
2022-04-05 23:08         ` Andrew Morton
2022-05-10 19:05 ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YmAOn75O7zOOioXQ@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.