linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] mm/gup: Minor fix, cleanup and improvements
@ 2025-04-07  3:03 Baoquan He
  2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Baoquan He @ 2025-04-07  3:03 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, osalvador, david, mingo, yanjun.zhu, linux-kernel,
	Baoquan He

These are made when I explore codes in mm/gup.c.

v2->v3:
=========
- In patch 1, change to use local variable 'start' as a loop cursor
  in fault_in_readable() and fault_in_writeable() so that they have the
  consistent code style with fault_in_safe_writeable(). Doing this can
  avoid ugly long line caused by kinds of type cast (const char __user *). 
  Thanks to David who suggested two ways including this one, I found this
  is better when changing code.

- In patch2, changes to add VM_WARN_ON_ONCE() for both below chekcing
  because all external users of __get_user_pages() have done the
  checking in is_valid_gup_args() before calling __get_user_pages().
  Just adding these VM_WARN_ON_ONCE() in case, e.g internal setting
  wrong flags in kernel code. Thanks to David for suggesting this.

   - (gup_flags & (FOLL_PIN | FOLL_GET)) == (FOLL_PIN | FOLL_GET))
   - !!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))

- Drop
      - the old patch 3 of v2 because it's from misunderstanding;
      - the old patch 4, 5 of v2 because they have been merged into x86
        tip tree by Ingo;
      - the old patch 7 of v2 because it has been coverred in another
        thread. Thanks to David for telling the thread link.

- In patch 3, add reviewing tags from Oscar and David.

v1->v2:
==========
- In patch 1, I carelessly copied the fault_in_readable() as
  fault_in_writeable(). Thanks to Yanjun for pointing it out.

- In patch 2, I copied the code in follow_page_pte() to
  __get_user_pages() directly w/o adjustment which is done but not
  merged to patch. That failed testing taken by lkp test robot, thanks
  for reporting.

Baoquan He (3):
  mm/gup: fix wrongly calculated returned value in
    fault_in_safe_writeable()
  mm/gup: remove unneeded checking in follow_page_pte()
  mm/gup: remove gup_fast_pgd_leaf() and clean up the relevant codes

 mm/gup.c | 100 ++++++++++++++++---------------------------------------
 1 file changed, 28 insertions(+), 72 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-07  3:03 [PATCH v3 0/3] mm/gup: Minor fix, cleanup and improvements Baoquan He
@ 2025-04-07  3:03 ` Baoquan He
  2025-04-08  9:40   ` Oscar Salvador
  2025-04-08  9:52   ` David Hildenbrand
  2025-04-07  3:03 ` [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte() Baoquan He
  2025-04-07  3:03 ` [PATCH v3 3/3] mm/gup: remove gup_fast_pgd_leaf() and clean up the relevant codes Baoquan He
  2 siblings, 2 replies; 11+ messages in thread
From: Baoquan He @ 2025-04-07  3:03 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, osalvador, david, mingo, yanjun.zhu, linux-kernel,
	Baoquan He

Not like fault_in_readable() or fault_in_writeable(), in
fault_in_safe_writeable() local variable 'start' is increased page
by page to loop till the whole address range is handled. However,
it mistakenly calcalates the size of handled range with 'uaddr - start'.

Here fix the code bug in fault_in_safe_writeable(), and also adjusting
the codes in fault_in_readable() and fault_in_writeable() to use local
variable 'start' to loop so that codes in these three functions are
consistent.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/gup.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 92351e2fa876..67a7de9e4f80 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2114,7 +2114,7 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
  */
 size_t fault_in_writeable(char __user *uaddr, size_t size)
 {
-	char __user *start = uaddr, *end;
+	unsigned long start = (unsigned long)uaddr, end;
 
 	if (unlikely(size == 0))
 		return 0;
@@ -2122,20 +2122,20 @@ size_t fault_in_writeable(char __user *uaddr, size_t size)
 		return size;
 	if (!PAGE_ALIGNED(uaddr)) {
 		unsafe_put_user(0, uaddr, out);
-		uaddr = (char __user *)PAGE_ALIGN((unsigned long)uaddr);
+		start = PAGE_ALIGN((unsigned long)uaddr);
 	}
-	end = (char __user *)PAGE_ALIGN((unsigned long)start + size);
+	end = PAGE_ALIGN(start + size);
 	if (unlikely(end < start))
-		end = NULL;
-	while (uaddr != end) {
-		unsafe_put_user(0, uaddr, out);
-		uaddr += PAGE_SIZE;
+		end = 0;
+	while (start != end) {
+		unsafe_put_user(0, (char __user *)start, out);
+		start += PAGE_SIZE;
 	}
 
 out:
 	user_write_access_end();
-	if (size > uaddr - start)
-		return size - (uaddr - start);
+	if (size > start - (unsigned long)uaddr)
+		return size - (start - (unsigned long)uaddr);
 	return 0;
 }
 EXPORT_SYMBOL(fault_in_writeable);
@@ -2207,8 +2207,8 @@ size_t fault_in_safe_writeable(const char __user *uaddr, size_t size)
 	} while (start != end);
 	mmap_read_unlock(mm);
 
-	if (size > (unsigned long)uaddr - start)
-		return size - ((unsigned long)uaddr - start);
+	if (size > start - (unsigned long)uaddr)
+		return size - (start - (unsigned long)uaddr);
 	return 0;
 }
 EXPORT_SYMBOL(fault_in_safe_writeable);
@@ -2223,7 +2223,7 @@ EXPORT_SYMBOL(fault_in_safe_writeable);
  */
 size_t fault_in_readable(const char __user *uaddr, size_t size)
 {
-	const char __user *start = uaddr, *end;
+	unsigned long start = (unsigned long)uaddr, end;
 	volatile char c;
 
 	if (unlikely(size == 0))
@@ -2232,21 +2232,21 @@ size_t fault_in_readable(const char __user *uaddr, size_t size)
 		return size;
 	if (!PAGE_ALIGNED(uaddr)) {
 		unsafe_get_user(c, uaddr, out);
-		uaddr = (const char __user *)PAGE_ALIGN((unsigned long)uaddr);
+		start = PAGE_ALIGN((unsigned long)uaddr);
 	}
-	end = (const char __user *)PAGE_ALIGN((unsigned long)start + size);
+	end = PAGE_ALIGN(start + size);
 	if (unlikely(end < start))
-		end = NULL;
-	while (uaddr != end) {
-		unsafe_get_user(c, uaddr, out);
-		uaddr += PAGE_SIZE;
+		end = 0;
+	while (start != end) {
+		unsafe_get_user(c, (const char __user *)start, out);
+		start += PAGE_SIZE;
 	}
 
 out:
 	user_read_access_end();
 	(void)c;
-	if (size > uaddr - start)
-		return size - (uaddr - start);
+	if (size > start - (unsigned long)uaddr)
+		return size - (start - (unsigned long)uaddr);
 	return 0;
 }
 EXPORT_SYMBOL(fault_in_readable);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte()
  2025-04-07  3:03 [PATCH v3 0/3] mm/gup: Minor fix, cleanup and improvements Baoquan He
  2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
@ 2025-04-07  3:03 ` Baoquan He
  2025-04-08  9:04   ` Oscar Salvador
  2025-04-07  3:03 ` [PATCH v3 3/3] mm/gup: remove gup_fast_pgd_leaf() and clean up the relevant codes Baoquan He
  2 siblings, 1 reply; 11+ messages in thread
From: Baoquan He @ 2025-04-07  3:03 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, osalvador, david, mingo, yanjun.zhu, linux-kernel,
	Baoquan He

In __get_user_pages(), it will traverse page table and take a reference
to the page the given user address corresponds to if GUP_GET or GUP_PIN
is set. However, it's not supported both GUP_GET and GUP_PIN are set.
Even though this check need be done, it should be done earlier, but not
doing it till entering into follow_page_pte() and failed.

Furthermore, this checking has been done in is_valid_gup_args() and all
external users of __get_user_pages() will call is_valid_gup_args() to
catch the illegal setting. We don't need to worry about internal users
of __get_user_pages() because the gup_flags are set by MM code correctly.

Here remove the checking in follow_page_pte(), and add VM_WARN_ON_ONCE()
to catch the possible exceptional setting just in case.

And also change the VM_BUG_ON to VM_WARN_ON_ONCE() for checking
(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))) because the checking
has been done in is_valid_gup_args() for external users of __get_user_pages().

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/gup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 67a7de9e4f80..5b3ac5a867a3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -844,11 +844,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	pte_t *ptep, pte;
 	int ret;
 
-	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
-	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
-			 (FOLL_PIN | FOLL_GET)))
-		return ERR_PTR(-EINVAL);
-
 	ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
 	if (!ptep)
 		return no_page_table(vma, flags, address);
@@ -1432,7 +1427,11 @@ static long __get_user_pages(struct mm_struct *mm,
 
 	start = untagged_addr_remote(mm, start);
 
-	VM_BUG_ON(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN)));
+	VM_WARN_ON_ONCE(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN)));
+
+	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
+	VM_WARN_ON_ONCE((gup_flags & (FOLL_PIN | FOLL_GET)) ==
+			(FOLL_PIN | FOLL_GET));
 
 	do {
 		struct page *page;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 3/3] mm/gup: remove gup_fast_pgd_leaf() and clean up the relevant codes
  2025-04-07  3:03 [PATCH v3 0/3] mm/gup: Minor fix, cleanup and improvements Baoquan He
  2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
  2025-04-07  3:03 ` [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte() Baoquan He
@ 2025-04-07  3:03 ` Baoquan He
  2 siblings, 0 replies; 11+ messages in thread
From: Baoquan He @ 2025-04-07  3:03 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, osalvador, david, mingo, yanjun.zhu, linux-kernel,
	Baoquan He

In the current kernel, only pud huge page is supported in some
architectures. P4d and pgd huge pages haven't been supported yet.
And in mm/gup.c, there's no pgd huge page handling in the
follow_page_mask() code path. Hence it doesn't make sense to only
have gup_fast_pgd_leaf() in gup_fast code path.

Here remove gup_fast_pgd_leaf() and clean up the relevant codes.

Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
---
 mm/gup.c | 49 +++----------------------------------------------
 1 file changed, 3 insertions(+), 46 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 5b3ac5a867a3..c9237db3cdb3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3172,46 +3172,6 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
 	return 1;
 }
 
-static int gup_fast_pgd_leaf(pgd_t orig, pgd_t *pgdp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
-{
-	int refs;
-	struct page *page;
-	struct folio *folio;
-
-	if (!pgd_access_permitted(orig, flags & FOLL_WRITE))
-		return 0;
-
-	BUILD_BUG_ON(pgd_devmap(orig));
-
-	page = pgd_page(orig);
-	refs = record_subpages(page, PGDIR_SIZE, addr, end, pages + *nr);
-
-	folio = try_grab_folio_fast(page, refs, flags);
-	if (!folio)
-		return 0;
-
-	if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) {
-		gup_put_folio(folio, refs, flags);
-		return 0;
-	}
-
-	if (!pgd_write(orig) && gup_must_unshare(NULL, flags, &folio->page)) {
-		gup_put_folio(folio, refs, flags);
-		return 0;
-	}
-
-	if (!gup_fast_folio_allowed(folio, flags)) {
-		gup_put_folio(folio, refs, flags);
-		return 0;
-	}
-
-	*nr += refs;
-	folio_set_referenced(folio);
-	return 1;
-}
-
 static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
 		unsigned long end, unsigned int flags, struct page **pages,
 		int *nr)
@@ -3306,12 +3266,9 @@ static void gup_fast_pgd_range(unsigned long addr, unsigned long end,
 		next = pgd_addr_end(addr, end);
 		if (pgd_none(pgd))
 			return;
-		if (unlikely(pgd_leaf(pgd))) {
-			if (!gup_fast_pgd_leaf(pgd, pgdp, addr, next, flags,
-					       pages, nr))
-				return;
-		} else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags,
-					       pages, nr))
+		BUILD_BUG_ON(pgd_leaf(pgd));
+		if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags,
+					pages, nr))
 			return;
 	} while (pgdp++, addr = next, addr != end);
 }
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte()
  2025-04-07  3:03 ` [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte() Baoquan He
@ 2025-04-08  9:04   ` Oscar Salvador
  2025-04-08  9:15     ` David Hildenbrand
  0 siblings, 1 reply; 11+ messages in thread
From: Oscar Salvador @ 2025-04-08  9:04 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, david, mingo, yanjun.zhu, linux-kernel

On Mon, Apr 07, 2025 at 11:03:05AM +0800, Baoquan He wrote:
> In __get_user_pages(), it will traverse page table and take a reference
> to the page the given user address corresponds to if GUP_GET or GUP_PIN
> is set. However, it's not supported both GUP_GET and GUP_PIN are set.
> Even though this check need be done, it should be done earlier, but not
> doing it till entering into follow_page_pte() and failed.
> 
> Furthermore, this checking has been done in is_valid_gup_args() and all
> external users of __get_user_pages() will call is_valid_gup_args() to
> catch the illegal setting. We don't need to worry about internal users
> of __get_user_pages() because the gup_flags are set by MM code correctly.
> 
> Here remove the checking in follow_page_pte(), and add VM_WARN_ON_ONCE()
> to catch the possible exceptional setting just in case.
> 
> And also change the VM_BUG_ON to VM_WARN_ON_ONCE() for checking
> (!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))) because the checking
> has been done in is_valid_gup_args() for external users of __get_user_pages().

I will also note that you changed the flags check to VM_WARN_ON_ONCE.
I guess this is fine as we have the WARN_ON_ONCE version in
is_valid_gup_args().

> Signed-off-by: Baoquan He <bhe@redhat.com>

LGTM,

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte()
  2025-04-08  9:04   ` Oscar Salvador
@ 2025-04-08  9:15     ` David Hildenbrand
  0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2025-04-08  9:15 UTC (permalink / raw)
  To: Oscar Salvador, Baoquan He
  Cc: linux-mm, akpm, mingo, yanjun.zhu, linux-kernel

On 08.04.25 11:04, Oscar Salvador wrote:
> On Mon, Apr 07, 2025 at 11:03:05AM +0800, Baoquan He wrote:
>> In __get_user_pages(), it will traverse page table and take a reference
>> to the page the given user address corresponds to if GUP_GET or GUP_PIN
>> is set. However, it's not supported both GUP_GET and GUP_PIN are set.
>> Even though this check need be done, it should be done earlier, but not
>> doing it till entering into follow_page_pte() and failed.
>>
>> Furthermore, this checking has been done in is_valid_gup_args() and all
>> external users of __get_user_pages() will call is_valid_gup_args() to
>> catch the illegal setting. We don't need to worry about internal users
>> of __get_user_pages() because the gup_flags are set by MM code correctly.
>>
>> Here remove the checking in follow_page_pte(), and add VM_WARN_ON_ONCE()
>> to catch the possible exceptional setting just in case.
>>
>> And also change the VM_BUG_ON to VM_WARN_ON_ONCE() for checking
>> (!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))) because the checking
>> has been done in is_valid_gup_args() for external users of __get_user_pages().
> 
> I will also note that you changed the flags check to VM_WARN_ON_ONCE.
> I guess this is fine as we have the WARN_ON_ONCE version in
> is_valid_gup_args().
> 

Yes, that was my reasoning when I proposed it.

>> Signed-off-by: Baoquan He <bhe@redhat.com>
> 
> LGTM,
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 
> 

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
@ 2025-04-08  9:40   ` Oscar Salvador
  2025-04-08 15:01     ` Baoquan He
  2025-04-08  9:52   ` David Hildenbrand
  1 sibling, 1 reply; 11+ messages in thread
From: Oscar Salvador @ 2025-04-08  9:40 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, david, mingo, yanjun.zhu, linux-kernel

On Mon, Apr 07, 2025 at 11:03:04AM +0800, Baoquan He wrote:
> Not like fault_in_readable() or fault_in_writeable(), in
> fault_in_safe_writeable() local variable 'start' is increased page
> by page to loop till the whole address range is handled. However,
> it mistakenly calcalates the size of handled range with 'uaddr - start'.
                ^^ calculates
> 
> Here fix the code bug in fault_in_safe_writeable(), and also adjusting
> the codes in fault_in_readable() and fault_in_writeable() to use local
> variable 'start' to loop so that codes in these three functions are
> consistent.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

The fix for the bug in fault_in_safe_writeable() looks good to me.
But I think that David suggested the other way around wrt. uaddr and
start variables in those three functions? I think he had in mind that
fault_in_safe_writeable() follows fault_in_safe_writeable() and
fault_in_readable() lead.

Other than that looks good to me.


-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
  2025-04-08  9:40   ` Oscar Salvador
@ 2025-04-08  9:52   ` David Hildenbrand
  2025-04-08 10:00     ` David Hildenbrand
  1 sibling, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2025-04-08  9:52 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, osalvador, mingo, yanjun.zhu, linux-kernel

On 07.04.25 05:03, Baoquan He wrote:
> Not like fault_in_readable() or fault_in_writeable(), in
> fault_in_safe_writeable() local variable 'start' is increased page
> by page to loop till the whole address range is handled. However,
> it mistakenly calcalates the size of handled range with 'uaddr - start'.
> 
> Here fix the code bug in fault_in_safe_writeable(), and also adjusting
> the codes in fault_in_readable() and fault_in_writeable() to use local
> variable 'start' to loop so that codes in these three functions are
> consistent.
> 

I probably phrased it poorly in my other reply: the confusing part (to 
me) is adjusting "start". Maybe we should have unsigned long start,end,cur;

Maybe we should really split the "fix" from the cleanups, and tag the 
fix with a Fixes:.

I was wondering if these functions could be simplified a bit. But the 
overflow handling is a bit nasty.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-08  9:52   ` David Hildenbrand
@ 2025-04-08 10:00     ` David Hildenbrand
  2025-04-08 14:59       ` Baoquan He
  0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2025-04-08 10:00 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, osalvador, mingo, yanjun.zhu, linux-kernel

On 08.04.25 11:52, David Hildenbrand wrote:
> On 07.04.25 05:03, Baoquan He wrote:
>> Not like fault_in_readable() or fault_in_writeable(), in
>> fault_in_safe_writeable() local variable 'start' is increased page
>> by page to loop till the whole address range is handled. However,
>> it mistakenly calcalates the size of handled range with 'uaddr - start'.
>>
>> Here fix the code bug in fault_in_safe_writeable(), and also adjusting
>> the codes in fault_in_readable() and fault_in_writeable() to use local
>> variable 'start' to loop so that codes in these three functions are
>> consistent.
>>
> 
> I probably phrased it poorly in my other reply: the confusing part (to
> me) is adjusting "start". Maybe we should have unsigned long start,end,cur;
> 
> Maybe we should really split the "fix" from the cleanups, and tag the
> fix with a Fixes:.
> 
> I was wondering if these functions could be simplified a bit. But the
> overflow handling is a bit nasty.

FWIW, maybe the following could work and clarify things. Just a thought.


diff --git a/mm/gup.c b/mm/gup.c
index 92351e2fa876b..7a3f78a209f8b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2223,30 +2223,23 @@ EXPORT_SYMBOL(fault_in_safe_writeable);
   */
  size_t fault_in_readable(const char __user *uaddr, size_t size)
  {
-       const char __user *start = uaddr, *end;
+       const unsigned long start = (unsigned long)uaddr;
+       const unsigned long end = start + size;
+       unsigned long cur;
         volatile char c;
  
         if (unlikely(size == 0))
                 return 0;
         if (!user_read_access_begin(uaddr, size))
                 return size;
-       if (!PAGE_ALIGNED(uaddr)) {
-               unsafe_get_user(c, uaddr, out);
-               uaddr = (const char __user *)PAGE_ALIGN((unsigned long)uaddr);
-       }
-       end = (const char __user *)PAGE_ALIGN((unsigned long)start + size);
-       if (unlikely(end < start))
-               end = NULL;
-       while (uaddr != end) {
-               unsafe_get_user(c, uaddr, out);
-               uaddr += PAGE_SIZE;
-       }
-
-out:
+       /* Stop once we overflow to 0. */
+       for (cur = start; cur && cur < end; cur = PAGE_ALIGN_DOWN(cur + PAGE_SIZE))
+               unsafe_get_user(c, (const char __user *)cur, out);
         user_read_access_end();
         (void)c;
-       if (size > uaddr - start)
-               return size - (uaddr - start);
+out:
+       if (size > cur - start)
+               return size - (cur - start);
         return 0;
  }
  EXPORT_SYMBOL(fault_in_readable);


-- 
Cheers,

David / dhildenb



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-08 10:00     ` David Hildenbrand
@ 2025-04-08 14:59       ` Baoquan He
  0 siblings, 0 replies; 11+ messages in thread
From: Baoquan He @ 2025-04-08 14:59 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, akpm, osalvador, mingo, yanjun.zhu, linux-kernel

On 04/08/25 at 12:00pm, David Hildenbrand wrote:
> On 08.04.25 11:52, David Hildenbrand wrote:
> > On 07.04.25 05:03, Baoquan He wrote:
> > > Not like fault_in_readable() or fault_in_writeable(), in
> > > fault_in_safe_writeable() local variable 'start' is increased page
> > > by page to loop till the whole address range is handled. However,
> > > it mistakenly calcalates the size of handled range with 'uaddr - start'.
> > > 
> > > Here fix the code bug in fault_in_safe_writeable(), and also adjusting
> > > the codes in fault_in_readable() and fault_in_writeable() to use local
> > > variable 'start' to loop so that codes in these three functions are
> > > consistent.
> > > 
> > 
> > I probably phrased it poorly in my other reply: the confusing part (to
> > me) is adjusting "start". Maybe we should have unsigned long start,end,cur;
> > 
> > Maybe we should really split the "fix" from the cleanups, and tag the
> > fix with a Fixes:.

> > 
> > I was wondering if these functions could be simplified a bit. But the
> > overflow handling is a bit nasty.
> 
> FWIW, maybe the following could work and clarify things. Just a thought.

The code simplification looks great to me. I will make a patch to only
contains the code bug fixing with Fixes so that it's eaiser to back port
to stable kernel, and make another patch as below to refactor codes in
fault_in_readable/writable/safe_writable(). Thanks for suggestion.

> 
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 92351e2fa876b..7a3f78a209f8b 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2223,30 +2223,23 @@ EXPORT_SYMBOL(fault_in_safe_writeable);
>   */
>  size_t fault_in_readable(const char __user *uaddr, size_t size)
>  {
> -       const char __user *start = uaddr, *end;
> +       const unsigned long start = (unsigned long)uaddr;
> +       const unsigned long end = start + size;
> +       unsigned long cur;
>         volatile char c;
>         if (unlikely(size == 0))
>                 return 0;
>         if (!user_read_access_begin(uaddr, size))
>                 return size;
> -       if (!PAGE_ALIGNED(uaddr)) {
> -               unsafe_get_user(c, uaddr, out);
> -               uaddr = (const char __user *)PAGE_ALIGN((unsigned long)uaddr);
> -       }
> -       end = (const char __user *)PAGE_ALIGN((unsigned long)start + size);
> -       if (unlikely(end < start))
> -               end = NULL;
> -       while (uaddr != end) {
> -               unsafe_get_user(c, uaddr, out);
> -               uaddr += PAGE_SIZE;
> -       }
> -
> -out:
> +       /* Stop once we overflow to 0. */
> +       for (cur = start; cur && cur < end; cur = PAGE_ALIGN_DOWN(cur + PAGE_SIZE))
> +               unsafe_get_user(c, (const char __user *)cur, out);
>         user_read_access_end();
>         (void)c;
> -       if (size > uaddr - start)
> -               return size - (uaddr - start);
> +out:
> +       if (size > cur - start)
> +               return size - (cur - start);
>         return 0;
>  }
>  EXPORT_SYMBOL(fault_in_readable);
> 
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable()
  2025-04-08  9:40   ` Oscar Salvador
@ 2025-04-08 15:01     ` Baoquan He
  0 siblings, 0 replies; 11+ messages in thread
From: Baoquan He @ 2025-04-08 15:01 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: linux-mm, akpm, david, mingo, yanjun.zhu, linux-kernel

On 04/08/25 at 11:40am, Oscar Salvador wrote:
> On Mon, Apr 07, 2025 at 11:03:04AM +0800, Baoquan He wrote:
> > Not like fault_in_readable() or fault_in_writeable(), in
> > fault_in_safe_writeable() local variable 'start' is increased page
> > by page to loop till the whole address range is handled. However,
> > it mistakenly calcalates the size of handled range with 'uaddr - start'.
>                 ^^ calculates

Will fix, thanks.
> > 
> > Here fix the code bug in fault_in_safe_writeable(), and also adjusting
> > the codes in fault_in_readable() and fault_in_writeable() to use local
> > variable 'start' to loop so that codes in these three functions are
> > consistent.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> 
> The fix for the bug in fault_in_safe_writeable() looks good to me.
> But I think that David suggested the other way around wrt. uaddr and
> start variables in those three functions? I think he had in mind that
> fault_in_safe_writeable() follows fault_in_safe_writeable() and
> fault_in_readable() lead.

Right, will follow the way he suggested in another sub-thread, thanks
for careful reviewing.

> 
> Other than that looks good to me.
> 
> 
> -- 
> Oscar Salvador
> SUSE Labs
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-04-08 15:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-07  3:03 [PATCH v3 0/3] mm/gup: Minor fix, cleanup and improvements Baoquan He
2025-04-07  3:03 ` [PATCH v3 1/3] mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() Baoquan He
2025-04-08  9:40   ` Oscar Salvador
2025-04-08 15:01     ` Baoquan He
2025-04-08  9:52   ` David Hildenbrand
2025-04-08 10:00     ` David Hildenbrand
2025-04-08 14:59       ` Baoquan He
2025-04-07  3:03 ` [PATCH v3 2/3] mm/gup: remove unneeded checking in follow_page_pte() Baoquan He
2025-04-08  9:04   ` Oscar Salvador
2025-04-08  9:15     ` David Hildenbrand
2025-04-07  3:03 ` [PATCH v3 3/3] mm/gup: remove gup_fast_pgd_leaf() and clean up the relevant codes Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).