From: Linus Torvalds <torvalds@linux-foundation.org>
To: Peter Xu <peterx@redhat.com>
Cc: David Hildenbrand <david@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
liubo <liubo254@huawei.com>, Matthew Wilcox <willy@infradead.org>,
Hugh Dickins <hughd@google.com>, Jason Gunthorpe <jgg@ziepe.ca>,
John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH v1 0/4] smaps / mm/gup: fix gup_can_follow_protnone fallout
Date: Fri, 28 Jul 2023 13:23:11 -0700
Message-ID: <CAHk-=wgRiP_9X0rRdZKT8nhemZGNateMtb366t37d8-x7VRs=g@mail.gmail.com>
In-Reply-To: <ZMQZfn/hUURmfqWN@x1n>
On Fri, 28 Jul 2023 at 12:39, Peter Xu <peterx@redhat.com> wrote:
>
> But then does it means that any gup-only user will have numa balancing
> completely disabled?
Why would we ever care about a GUP-only user?
Who knows where the actual access is coming from? It might be some
device that is on a different node entirely.
And even if the access is local from the CPU, it
(a) might have happened after we moved somewhere else
(b) who cares about the extra possible NUMA overhead when we just
wasted *thousands* of cycles on GUP?
So NUMA balancing really doesn't seem to make sense for GUP anyway as
far as I can see.
Now, the other side of the same thing is that (a) NUMA faulting should
be fairly rare and (b) once you do GUP, who cares anyway, so you can
also argue that "once you do GUP you might as well NUMA-fault, because
performance simply isn't an issue".
But I really think the real argument is "once you do GUP, numa
faulting is just crazy".
I think what happened is
- the GUP code couldn't tell NUMA and actual PROTNONE apart
- so the GUP code would punch through PROTNONE even when it shouldn't
- so people added FOLL_NUMA to say "I don't want you to punch
through, I want the NUMA fault"
- but then FOLL_FORCE ends up meaning that you actually *do* want to
punch through - regardless of NUMA or not - and now the two got tied
together, and we end up with nonsensical garbage like
        if (!(gup_flags & FOLL_FORCE))
                gup_flags |= FOLL_NUMA;
to say "oh, actually, to avoid punching through when we shouldn't,
we should NUMA fault".
so we ended up with that case where even if YOU DIDN'T CARE AT ALL,
you got FOLL_NUMA just so that you wouldn't punch through.
And now we're in the situation that we've confused FOLL_FORCE and
FOLL_NUMA, even though they have absolutely *nothing* to do with each
other, except for a random implementation detail about punching
through incorrectly that isn't even relevant any more.
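(Side note: "telling them apart" isn't even hard. As a rough sketch - this
is not the actual gup code, and the helper name below is made up - a
protnone PTE sitting under a VMA that is itself accessible can only be a
NUMA hinting entry, because mprotect(PROT_NONE) makes the whole VMA
inaccessible:

        /*
         * Illustration only, not the real GUP code; pte_is_numa_hint()
         * is a made-up name.  pte_protnone() and vma_is_accessible()
         * are the existing helpers.
         */
        static inline bool pte_is_numa_hint(struct vm_area_struct *vma,
                                            pte_t pte)
        {
                /*
                 * A protnone PTE in an accessible VMA is a NUMA hint,
                 * not a real PROT_NONE mapping.
                 */
                return pte_protnone(pte) && vma_is_accessible(vma);
        }

so none of this ever needed to be tied to FOLL_FORCE at all.)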
I really think FOLL_NUMA should just go away. And that FOLL_FORCE
replacement for it is just wrong. If you *don't* do something without
FOLL_FORCE, you damn well shouldn't do it just because FOLL_FORCE is
set.
The *only* semantic meaning FOLL_FORCE should have is that it
overrides the vma protections for debuggers (in a very limited
manner). It should *not* affect any NUMA faulting logic in any way,
shape, or form.
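(For concreteness: the debugger case is the ptrace / /proc/<pid>/mem style
access, which boils down to something like the sketch below - simplified,
not a copy of the real ptrace_access_vm() - where FOLL_FORCE exists purely
so the access works even when the tracee's VMA isn't readable or writable:

        /*
         * Simplified sketch of the one legitimate FOLL_FORCE caller:
         * debugger access to another process.  access_remote_vm() is
         * the real interface; the surrounding variables are made up.
         */
        unsigned int gup_flags = FOLL_FORCE;

        if (write)
                gup_flags |= FOLL_WRITE;

        /*
         * Works even if the target VMA is PROT_NONE or read-only,
         * because FOLL_FORCE overrides the vma protections (and
         * COW-breaks on write).
         */
        copied = access_remote_vm(mm, addr, buf, len, gup_flags);

Nothing about that has anything to do with NUMA.)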
Linus
Thread overview: 38+ messages
2023-07-27 21:28 [PATCH v1 0/4] smaps / mm/gup: fix gup_can_follow_protnone fallout David Hildenbrand
2023-07-27 21:28 ` [PATCH v1 1/4] smaps: Fix the abnormal memory statistics obtained through /proc/pid/smaps David Hildenbrand
2023-07-27 21:28 ` [PATCH v1 2/4] mm/gup: Make follow_page() succeed again on PROT_NONE PTEs/PMDs David Hildenbrand
2023-07-28 2:30 ` John Hubbard
2023-07-28 9:08 ` David Hildenbrand
2023-07-28 10:12 ` David Hildenbrand
2023-07-27 21:28 ` [PATCH v1 3/4] smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd() David Hildenbrand
2023-07-27 21:28 ` [PATCH v1 4/4] mm/gup: document FOLL_FORCE behavior David Hildenbrand
2023-07-28 16:18 ` [PATCH v1 0/4] smaps / mm/gup: fix gup_can_follow_protnone fallout Linus Torvalds
2023-07-28 17:30 ` David Hildenbrand
2023-07-28 17:54 ` David Hildenbrand
2023-07-28 19:40 ` David Hildenbrand
2023-07-28 19:50 ` Peter Xu
2023-07-28 20:00 ` David Hildenbrand
2023-08-02 10:24 ` Mel Gorman
2023-07-28 19:39 ` Peter Xu
2023-07-28 19:52 ` David Hildenbrand
2023-07-28 20:23 ` Linus Torvalds [this message]
2023-07-28 20:33 ` David Hildenbrand
2023-07-28 20:50 ` Linus Torvalds
2023-07-28 21:02 ` David Hildenbrand
2023-07-28 21:20 ` Peter Xu
2023-07-28 21:31 ` David Hildenbrand
2023-07-28 22:14 ` Jason Gunthorpe
2023-07-31 16:01 ` Peter Xu
2023-07-28 21:32 ` John Hubbard
2023-07-28 21:49 ` Peter Xu
2023-07-28 22:00 ` John Hubbard
2023-07-31 16:05 ` Peter Xu
[not found] ` <412bb30f-0417-802c-3fc4-a4e9d5891c5d@redhat.com>
2023-07-29 9:35 ` David Hildenbrand
2023-07-31 16:10 ` Peter Xu
2023-07-31 16:20 ` David Hildenbrand
2023-07-31 18:23 ` Linus Torvalds
2023-07-31 18:51 ` Peter Xu
2023-07-31 19:00 ` David Hildenbrand
2023-07-31 19:07 ` Linus Torvalds
2023-07-31 19:22 ` David Hildenbrand
2023-08-01 13:05 ` Jason Gunthorpe