Re: [PATCH v1 2/4] mm/gup: Make follow_page() succeed again on PROT_NONE PTEs/PMDs

From: David Hildenbrand <david@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	liubo <liubo254@huawei.com>, Peter Xu <peterx@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	stable@vger.kernel.org
Subject: Re: [PATCH v1 2/4] mm/gup: Make follow_page() succeed again on PROT_NONE PTEs/PMDs
Date: Fri, 28 Jul 2023 11:08:26 +0200
Message-ID: <9de80e22-e89f-2760-34f4-61be5f8fd39c@redhat.com>
In-Reply-To: <55c92738-e402-4657-3d46-162ad2c09d68@nvidia.com>

On 28.07.23 04:30, John Hubbard wrote:
> On 7/27/23 14:28, David Hildenbrand wrote:
>> We accidentally enforced PROT_NONE PTE/PMD permission checks for
>> follow_page() like we do for get_user_pages() and friends. That was
>> undesired, because follow_page() is usually only used to look up a currently
>> mapped page, not to actually access it. Further, follow_page() does not
>> actually trigger fault handling, but instead simply fails.
> 
> I see that follow_page() is also completely undocumented. And that
> reduces us to deducing how it should be used...these things that
> change follow_page()'s behavior maybe should have a go at documenting
> it too, perhaps.

I can certainly be motivated to do that. :)

> 
>>
>> Let's restore that behavior by conditionally setting FOLL_FORCE if
>> FOLL_WRITE is not set. This way, KSM and migration code, for example, will
>> no longer fail on PROT_NONE-mapped PTEs/PMDs.
>>
>> Handling this internally doesn't require us to add any new FOLL_FORCE
>> usage outside of GUP code.
>>
>> While at it, refuse to accept FOLL_FORCE: we don't even perform VMA
>> permission checks like in check_vma_flags(), so especially
>> FOLL_FORCE|FOLL_WRITE would be dodgy.
>>
>> This issue was identified by code inspection. We'll add some
>> documentation regarding FOLL_FORCE next.
>>
>> Reported-by: Peter Xu <peterx@redhat.com>
>> Fixes: 474098edac26 ("mm/gup: replace FOLL_NUMA by gup_can_follow_protnone()")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>    mm/gup.c | 10 +++++++++-
>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index 2493ffa10f4b..da9a5cc096ac 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -841,9 +841,17 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>>    	if (vma_is_secretmem(vma))
>>    		return NULL;
>>    
>> -	if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
>> +	if (WARN_ON_ONCE(foll_flags & (FOLL_PIN | FOLL_FORCE)))
>>    		return NULL;
> 
> This is not a super happy situation: follow_page() is now prohibited
> (see above: we should document that interface) from passing in
> FOLL_FORCE...

I guess you saw my patch #4.

If you take a look at the existing callers (that are fortunately very 
limited), you'll see that nobody cares.

Most of the FOLL flags don't make any sense for follow_page(), and 
limiting further (ab)use is at least to me very appealing.
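
As a rough illustration of the flag handling this patch introduces, here
is a small userspace mock, not the actual mm/gup.c code. The MOCK_* bit
values and mock_sanitize_follow_page_flags() are hypothetical names made
up for this sketch; the kernel's real flag values differ.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this mock; NOT the kernel's real values. */
enum {
	MOCK_FOLL_WRITE = 1u << 0,
	MOCK_FOLL_FORCE = 1u << 1,
	MOCK_FOLL_PIN   = 1u << 2,
};

/*
 * Mirrors the two steps from the patch: refuse FOLL_PIN/FOLL_FORCE coming
 * from the caller, then set FOLL_FORCE internally for non-write lookups.
 * Returns false where the patched follow_page() would warn and return NULL.
 */
static bool mock_sanitize_follow_page_flags(unsigned int *foll_flags)
{
	if (*foll_flags & (MOCK_FOLL_PIN | MOCK_FOLL_FORCE))
		return false;

	/* Non-write lookups get FOLL_FORCE so PROT_NONE mappings are followed. */
	if (!(*foll_flags & MOCK_FOLL_WRITE))
		*foll_flags |= MOCK_FOLL_FORCE;

	return true;
}
```

With this, a caller passing no flags ends up with FOLL_FORCE set, a caller
passing FOLL_WRITE does not, and a caller passing FOLL_FORCE itself is
rejected.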

> 
>>    
>> +	/*
>> +	 * Traditionally, follow_page() succeeded on PROT_NONE-mapped pages
>> +	 * but failed follow_page(FOLL_WRITE) on R/O-mapped pages. Let's
>> +	 * keep these semantics by setting FOLL_FORCE if FOLL_WRITE is not set.
>> +	 */
>> +	if (!(foll_flags & FOLL_WRITE))
>> +		foll_flags |= FOLL_FORCE;
>> +
> 
> ...but then we set it anyway, for special cases. It's awkward because
> FOLL_FORCE is not an "internal to gup" flag (yet?).
> 
> I don't yet have suggestions, other than:
> 
> 1) Yes, the FOLL_NUMA made things bad.
> 
> 2) And they are still very confusing, especially the new use of
>      FOLL_FORCE.
> 
> ...I'll try to let this soak in and maybe recommend something
> in a more productive way. :)

What I can offer that might be very appealing is the following:

Get rid of the flags parameter for follow_page() *completely*. Yes, then 
we can even rename FOLL_ to something reasonable in the context where it 
is nowadays used ;)


Internally, we'll then set

FOLL_GET | FOLL_DUMP | FOLL_FORCE

and document exactly what this function does. Any user that needs
something different should just look into using get_user_pages() instead.

I can prototype that on top of this work easily.
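
To make the idea a bit more concrete, here is a rough userspace sketch of
what a flag-less variant might look like. This is not kernel code and not
the prototype itself: the MOCK_* bit values, struct mock_mapping, and both
mock_* functions are assumptions made up for illustration. Only the flag
combination is taken from the text above.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this mock; NOT the kernel's real values. */
enum {
	MOCK_FOLL_GET   = 1u << 0,
	MOCK_FOLL_DUMP  = 1u << 1,
	MOCK_FOLL_FORCE = 1u << 2,
};

/* Hypothetical stand-in for the state of a single page table entry. */
struct mock_mapping {
	bool present;	/* something is mapped at the address */
	bool prot_none;	/* mapped, but with PROT_NONE protection */
};

/* The caller no longer chooses flags; the set is fixed internally. */
static unsigned int mock_follow_page_flags(void)
{
	return MOCK_FOLL_GET | MOCK_FOLL_DUMP | MOCK_FOLL_FORCE;
}

/*
 * Flag-less lookup: true if a page would be returned. Assumption for this
 * sketch: a PROT_NONE entry is only followed when FOLL_FORCE is set.
 */
static bool mock_follow_page(const struct mock_mapping *m)
{
	unsigned int flags = mock_follow_page_flags();

	if (!m->present)
		return false;
	if (m->prot_none && !(flags & MOCK_FOLL_FORCE))
		return false;
	return true;
}
```

Because FOLL_FORCE is always part of the internal set, a present PROT_NONE
entry is looked up successfully in this mock, and callers have no flag
left to get wrong.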

-- 
Cheers,

David / dhildenb
