From: Peter Xu <peterx@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: liubo <liubo254@huawei.com>,
akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, hughd@google.com,
willy@infradead.org
Subject: Re: [PATCH] smaps: Fix the abnormal memory statistics obtained through /proc/pid/smaps
Date: Thu, 27 Jul 2023 14:59:25 -0400 [thread overview]
Message-ID: <ZMK+jSDgOmJKySTr@x1n> (raw)
In-Reply-To: <5a2c9ae4-50f5-3301-3b50-f57026e1f8e8@redhat.com>
On Thu, Jul 27, 2023 at 07:27:02PM +0200, David Hildenbrand wrote:
> > >
> > > This was wrong from the very start. If we're not in GUP, we shouldn't call
> > > GUP functions.
> >
> > My understanding is !GET && !PIN is also called gup.. otherwise we don't
> > need GET and it can just be always implied.
>
> That's not the point. The point is that _arbitrary_ code shouldn't call into
> GUP internal helper functions, where they bypass, for example, any sanity
> checks.
What's the sanity checks that you're referring to?
>
> >
> > The other proof is try_grab_page() doesn't fail hard on !GET && !PIN. So I
> > don't know whether that's "wrong" to be used..
> >
>
> To me, that is arbitrary code using a GUP internal helper and, therefore,
> wrong.
>
> > Back to the topic: I'd say either of the patches look good to solve the
> > problem. If p2pdma pages are mapped as PFNMAP/MIXEDMAP (?), I guess
> > vm_normal_page_pmd() proposed here will also work on it, so nothing I see
> > wrong on 2nd one yet.
> >
> > It looks nicer indeed to not have FOLL_FORCE here, but it also makes me
> > just wonder whether we should document NUMA behavior for FOLL_* somewhere,
> > because we have an implication right now on !FOLL_FORCE over NUMA, which is
> > not obvious to me..
>
> Yes, we probably should. For get_use_pages() and friends that behavior was
> always like that and it makes sense: usually it represent application
> behavior.
>
> >
> > And to look more over that aspect, see follow_page(): previously we can
> > follow a page for protnone (as it never applies FOLL_NUMA) but now it won't
> > (it never applies FOLL_FORCE, either, so it seems "accidentally" implies
> > FOLL_NUMA now). Not sure whether it's intended, though..
>
> That was certainly an oversight, thanks for spotting that. That patch was
> not supposed to change semantics:
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 76d222ccc3ff..ac926e19ff72 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -851,6 +851,13 @@ struct page *follow_page(struct vm_area_struct *vma,
> unsigned long address,
> if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
> return NULL;
>
> + /*
> + * In contrast to get_user_pages() and friends, we don't want to
> + * fail if the PTE is PROT_NONE: see gup_can_follow_protnone().
> + */
> + if (!(foll_flags & FOLL_WRITE))
> + foll_flags |= FOLL_FORCE;
> +
> page = follow_page_mask(vma, address, foll_flags, &ctx);
> if (ctx.pgmap)
> put_dev_pagemap(ctx.pgmap);
This seems to be slightly against your other solution though for smaps,
where we want to avoid abusing FOLL_FORCE.. isn't it..
Why read only? That'll always attach FOLL_FORCE to all follow page call
sites indeed for now, but just curious - logically "I want to fetch the
page even if protnone" is orthogonal to do with write permission here to
me.
I still worry about further abuse of FOLL_FORCE, I believe you also worry
that so you proposed the other way for the smaps issue.
Do you think we can just revive FOLL_NUMA? That'll be very clear to me
from that aspect that we do still have valid use cases for it.
The very least is if with above we should really document FOLL_FORCE - we
should mention NUMA effects. But that's ... really confusing. Thinking
about that I personally prefer a revival of FOLL_NUMA, then smaps issue all
go away.
Thanks,
--
Peter Xu
next prev parent reply other threads:[~2023-07-27 19:00 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-26 7:34 [PATCH] smaps: Fix the abnormal memory statistics obtained through /proc/pid/smaps liubo
2023-07-27 11:37 ` David Hildenbrand
2023-07-27 13:15 ` Peter Xu
2023-07-27 13:28 ` David Hildenbrand
2023-07-27 15:13 ` Peter Xu
2023-07-27 17:27 ` David Hildenbrand
2023-07-27 18:59 ` Peter Xu [this message]
2023-07-27 19:17 ` David Hildenbrand
2023-07-27 20:30 ` Peter Xu
2023-07-27 20:43 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZMK+jSDgOmJKySTr@x1n \
--to=peterx@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=hughd@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liubo254@huawei.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.