[PATCH] filemap: optimize order0 folio in filemap_map

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] filemap: optimize order0 folio in filemap_map_pages
@ 2025-08-19 14:06 Jinjiang Tu
  2025-08-19 15:52 ` Matthew Wilcox
  0 siblings, 1 reply; 13+ messages in thread
From: Jinjiang Tu @ 2025-08-19 14:06 UTC (permalink / raw)
  To: fengwei.yin, willy, akpm, david, linux-mm; +Cc: wangkefeng.wang, tujinjiang

There are two meaningless folio refcount update for order0 folio in
filemap_map_pages(). First, filemap_map_order0_folio() adds folio refcount
after the folio is mapped to pte. And then, filemap_map_pages() drops a
refcount grabbed by next_uptodate_folio(). We could remain the refcount
unchanged in this case.

With this patch, we can get 8% performance gain for lmbench testcase
'lat_pagefault -P 1 file', the size of file is 512M.

Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 mm/filemap.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 751838ef05e5..5de52deab138 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3693,6 +3693,8 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
 	}
 
 	vmf->pte = old_ptep;
+	folio_unlock(folio);
+	folio_put(folio);
 
 	return ret;
 }
@@ -3705,7 +3707,7 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
 	struct page *page = &folio->page;
 
 	if (PageHWPoison(page))
-		return ret;
+		goto out;
 
 	/* See comment of filemap_map_folio_range() */
 	if (!folio_test_workingset(folio))
@@ -3717,15 +3719,19 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
 	 * the fault-around logic.
 	 */
 	if (!pte_none(ptep_get(vmf->pte)))
-		return ret;
+		goto out;
 
 	if (vmf->address == addr)
 		ret = VM_FAULT_NOPAGE;
 
 	set_pte_range(vmf, folio, page, 1, addr);
 	(*rss)++;
-	folio_ref_inc(folio);
+	folio_unlock(folio);
+	return ret;
 
+out:
+	folio_unlock(folio);
+	folio_put(folio);
 	return ret;
 }
 
@@ -3783,9 +3789,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 			ret |= filemap_map_folio_range(vmf, folio,
 					xas.xa_index - folio->index, addr,
 					nr_pages, &rss, &mmap_miss);
-
-		folio_unlock(folio);
-		folio_put(folio);
 	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
 	add_mm_counter(vma->vm_mm, folio_type, rss);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-19 14:06 [PATCH] filemap: optimize order0 folio in filemap_map_pages Jinjiang Tu
@ 2025-08-19 15:52 ` Matthew Wilcox
  2025-08-20  1:10   ` Jinjiang Tu
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2025-08-19 15:52 UTC (permalink / raw)
  To: Jinjiang Tu; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

On Tue, Aug 19, 2025 at 10:06:53PM +0800, Jinjiang Tu wrote:
> There are two meaningless folio refcount update for order0 folio in
> filemap_map_pages(). First, filemap_map_order0_folio() adds folio refcount
> after the folio is mapped to pte. And then, filemap_map_pages() drops a
> refcount grabbed by next_uptodate_folio(). We could remain the refcount
> unchanged in this case.
> 
> With this patch, we can get 8% performance gain for lmbench testcase
> 'lat_pagefault -P 1 file', the size of file is 512M.

You don't explain why you move the folio_unlock() call

> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>  mm/filemap.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 751838ef05e5..5de52deab138 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3693,6 +3693,8 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>  	}
>  
>  	vmf->pte = old_ptep;
> +	folio_unlock(folio);
> +	folio_put(folio);
>  
>  	return ret;
>  }
> @@ -3705,7 +3707,7 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
>  	struct page *page = &folio->page;
>  
>  	if (PageHWPoison(page))
> -		return ret;
> +		goto out;
>  
>  	/* See comment of filemap_map_folio_range() */
>  	if (!folio_test_workingset(folio))
> @@ -3717,15 +3719,19 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
>  	 * the fault-around logic.
>  	 */
>  	if (!pte_none(ptep_get(vmf->pte)))
> -		return ret;
> +		goto out;
>  
>  	if (vmf->address == addr)
>  		ret = VM_FAULT_NOPAGE;
>  
>  	set_pte_range(vmf, folio, page, 1, addr);
>  	(*rss)++;
> -	folio_ref_inc(folio);
> +	folio_unlock(folio);
> +	return ret;
>  
> +out:
> +	folio_unlock(folio);
> +	folio_put(folio);
>  	return ret;
>  }
>  
> @@ -3783,9 +3789,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>  			ret |= filemap_map_folio_range(vmf, folio,
>  					xas.xa_index - folio->index, addr,
>  					nr_pages, &rss, &mmap_miss);
> -
> -		folio_unlock(folio);
> -		folio_put(folio);
>  	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
>  	add_mm_counter(vma->vm_mm, folio_type, rss);
>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> -- 
> 2.43.0
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-19 15:52 ` Matthew Wilcox
@ 2025-08-20  1:10   ` Jinjiang Tu
  2025-08-20 12:42     ` Matthew Wilcox
  0 siblings, 1 reply; 13+ messages in thread
From: Jinjiang Tu @ 2025-08-20  1:10 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

[-- Attachment #1: Type: text/plain, Size: 2547 bytes --]


在 2025/8/19 23:52, Matthew Wilcox 写道:
> On Tue, Aug 19, 2025 at 10:06:53PM +0800, Jinjiang Tu wrote:
>> There are two meaningless folio refcount update for order0 folio in
>> filemap_map_pages(). First, filemap_map_order0_folio() adds folio refcount
>> after the folio is mapped to pte. And then, filemap_map_pages() drops a
>> refcount grabbed by next_uptodate_folio(). We could remain the refcount
>> unchanged in this case.
>>
>> With this patch, we can get 8% performance gain for lmbench testcase
>> 'lat_pagefault -P 1 file', the size of file is 512M.
> You don't explain why you move the folio_unlock() call

We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
if we doesn't set folio into pte, we should unlock and put folio.

>
>> Signed-off-by: Jinjiang Tu<tujinjiang@huawei.com>
>> ---
>>   mm/filemap.c | 15 +++++++++------
>>   1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 751838ef05e5..5de52deab138 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -3693,6 +3693,8 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>>   	}
>>   
>>   	vmf->pte = old_ptep;
>> +	folio_unlock(folio);
>> +	folio_put(folio);
>>   
>>   	return ret;
>>   }
>> @@ -3705,7 +3707,7 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
>>   	struct page *page = &folio->page;
>>   
>>   	if (PageHWPoison(page))
>> -		return ret;
>> +		goto out;
>>   
>>   	/* See comment of filemap_map_folio_range() */
>>   	if (!folio_test_workingset(folio))
>> @@ -3717,15 +3719,19 @@ static vm_fault_t filemap_map_order0_folio(struct vm_fault *vmf,
>>   	 * the fault-around logic.
>>   	 */
>>   	if (!pte_none(ptep_get(vmf->pte)))
>> -		return ret;
>> +		goto out;
>>   
>>   	if (vmf->address == addr)
>>   		ret = VM_FAULT_NOPAGE;
>>   
>>   	set_pte_range(vmf, folio, page, 1, addr);
>>   	(*rss)++;
>> -	folio_ref_inc(folio);
>> +	folio_unlock(folio);
>> +	return ret;
>>   
>> +out:
>> +	folio_unlock(folio);
>> +	folio_put(folio);
>>   	return ret;
>>   }
>>   
>> @@ -3783,9 +3789,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>>   			ret |= filemap_map_folio_range(vmf, folio,
>>   					xas.xa_index - folio->index, addr,
>>   					nr_pages, &rss, &mmap_miss);
>> -
>> -		folio_unlock(folio);
>> -		folio_put(folio);
>>   	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
>>   	add_mm_counter(vma->vm_mm, folio_type, rss);
>>   	pte_unmap_unlock(vmf->pte, vmf->ptl);
>> -- 
>> 2.43.0
>>
>>

[-- Attachment #2: Type: text/html, Size: 3243 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-20  1:10   ` Jinjiang Tu
@ 2025-08-20 12:42     ` Matthew Wilcox
  2025-08-21  1:22       ` Jinjiang Tu
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2025-08-20 12:42 UTC (permalink / raw)
  To: Jinjiang Tu; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
> 
> 在 2025/8/19 23:52, Matthew Wilcox 写道:
> > On Tue, Aug 19, 2025 at 10:06:53PM +0800, Jinjiang Tu wrote:
> > > There are two meaningless folio refcount update for order0 folio in
> > > filemap_map_pages(). First, filemap_map_order0_folio() adds folio refcount
> > > after the folio is mapped to pte. And then, filemap_map_pages() drops a
> > > refcount grabbed by next_uptodate_folio(). We could remain the refcount
> > > unchanged in this case.
> > > 
> > > With this patch, we can get 8% performance gain for lmbench testcase
> > > 'lat_pagefault -P 1 file', the size of file is 512M.
> > You don't explain why you move the folio_unlock() call
> 
> We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
> if we doesn't set folio into pte, we should unlock and put folio.

I agree that folio_unlock() needs to be called before folio_put().
What I don't understand is why we need to delay folio_unlock() until
right before folio_put().  Can't we leave folio_unlock() where it
currently is and only move the folio_put()?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-20 12:42     ` Matthew Wilcox
@ 2025-08-21  1:22       ` Jinjiang Tu
  2025-08-21 12:45         ` Matthew Wilcox
  0 siblings, 1 reply; 13+ messages in thread
From: Jinjiang Tu @ 2025-08-21  1:22 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

[-- Attachment #1: Type: text/plain, Size: 1426 bytes --]


在 2025/8/20 20:42, Matthew Wilcox 写道:
> On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
>> 在 2025/8/19 23:52, Matthew Wilcox 写道:
>>> On Tue, Aug 19, 2025 at 10:06:53PM +0800, Jinjiang Tu wrote:
>>>> There are two meaningless folio refcount update for order0 folio in
>>>> filemap_map_pages(). First, filemap_map_order0_folio() adds folio refcount
>>>> after the folio is mapped to pte. And then, filemap_map_pages() drops a
>>>> refcount grabbed by next_uptodate_folio(). We could remain the refcount
>>>> unchanged in this case.
>>>>
>>>> With this patch, we can get 8% performance gain for lmbench testcase
>>>> 'lat_pagefault -P 1 file', the size of file is 512M.
>>> You don't explain why you move the folio_unlock() call
>> We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
>> if we doesn't set folio into pte, we should unlock and put folio.
> I agree that folio_unlock() needs to be called before folio_put().
> What I don't understand is why we need to delay folio_unlock() until
> right before folio_put().  Can't we leave folio_unlock() where it
> currently is and only move the folio_put()?

In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.

[-- Attachment #2: Type: text/html, Size: 2358 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21  1:22       ` Jinjiang Tu
@ 2025-08-21 12:45         ` Matthew Wilcox
  2025-08-21 12:57           ` David Hildenbrand
  2025-08-22  2:01           ` Jinjiang Tu
  0 siblings, 2 replies; 13+ messages in thread
From: Matthew Wilcox @ 2025-08-21 12:45 UTC (permalink / raw)
  To: Jinjiang Tu; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
> 在 2025/8/20 20:42, Matthew Wilcox 写道:
> > On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
> > > We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
> > > if we doesn't set folio into pte, we should unlock and put folio.
> > I agree that folio_unlock() needs to be called before folio_put().
> > What I don't understand is why we need to delay folio_unlock() until
> > right before folio_put().  Can't we leave folio_unlock() where it
> > currently is and only move the folio_put()?
> 
> In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
> the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
> the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.
Oh, I see.  I misread your patch; sorry about that.

However, it is still safe to move only the folio_put() and not move
the folio_unlock()!  It's a little subtle, so I'll explain.

We must not free a locked folio.  The page allocator has checks for this
and will complain (assuming appropriate debug options are enabled).  So
this sequence:

	folio_put(folio);
	folio_unlock(folio);

is _generally_ unsafe because the folio_put() might be the last put of
the refcount which will cause the folio to be freed.  However, if we know
that the folio has a refcount > 1, it's safe because the folio_put()
won't free the folio.  We do know that the folio has a refcount >1
because it's in the page cache, which keeps a refcount on the folio.
Since we have it locked, we know that truncation will wait for the unlock
to happen, and truncation will be the last one to put the refcount.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21 12:45         ` Matthew Wilcox
@ 2025-08-21 12:57           ` David Hildenbrand
  2025-08-21 13:20             ` Matthew Wilcox
  2025-08-22  2:01           ` Jinjiang Tu
  1 sibling, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2025-08-21 12:57 UTC (permalink / raw)
  To: Matthew Wilcox, Jinjiang Tu; +Cc: fengwei.yin, akpm, linux-mm, wangkefeng.wang

On 21.08.25 14:45, Matthew Wilcox wrote:
> On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
>> 在 2025/8/20 20:42, Matthew Wilcox 写道:
>>> On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
>>>> We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
>>>> if we doesn't set folio into pte, we should unlock and put folio.
>>> I agree that folio_unlock() needs to be called before folio_put().
>>> What I don't understand is why we need to delay folio_unlock() until
>>> right before folio_put().  Can't we leave folio_unlock() where it
>>> currently is and only move the folio_put()?
>>
>> In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
>> the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
>> the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.
> Oh, I see.  I misread your patch; sorry about that.
> 
> However, it is still safe to move only the folio_put() and not move
> the folio_unlock()!  It's a little subtle, so I'll explain.
> 
> We must not free a locked folio.  The page allocator has checks for this
> and will complain (assuming appropriate debug options are enabled).  So
> this sequence:
> 
> 	folio_put(folio);
> 	folio_unlock(folio);
> 
> is _generally_ unsafe because the folio_put() might be the last put of
> the refcount which will cause the folio to be freed.  However, if we know
> that the folio has a refcount > 1, it's safe because the folio_put()
> won't free the folio.  We do know that the folio has a refcount >1
> because it's in the page cache, which keeps a refcount on the folio.
> Since we have it locked, we know that truncation will wait for the unlock
> to happen, and truncation will be the last one to put the refcount.

I agree that it is save, but is it worth it having that subtle detail 
here instead of just doing unlock+put?

IOW, what do we gain by doing it differently? :)

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21 12:57           ` David Hildenbrand
@ 2025-08-21 13:20             ` Matthew Wilcox
  2025-08-21 13:35               ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2025-08-21 13:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Jinjiang Tu, fengwei.yin, akpm, linux-mm, wangkefeng.wang

On Thu, Aug 21, 2025 at 02:57:54PM +0200, David Hildenbrand wrote:
> On 21.08.25 14:45, Matthew Wilcox wrote:
> > On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
> > > 在 2025/8/20 20:42, Matthew Wilcox 写道:
> > > > On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
> > > > > We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
> > > > > if we doesn't set folio into pte, we should unlock and put folio.
> > > > I agree that folio_unlock() needs to be called before folio_put().
> > > > What I don't understand is why we need to delay folio_unlock() until
> > > > right before folio_put().  Can't we leave folio_unlock() where it
> > > > currently is and only move the folio_put()?
> > > 
> > > In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
> > > the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
> > > the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.
> > Oh, I see.  I misread your patch; sorry about that.
> > 
> > However, it is still safe to move only the folio_put() and not move
> > the folio_unlock()!  It's a little subtle, so I'll explain.
> > 
> > We must not free a locked folio.  The page allocator has checks for this
> > and will complain (assuming appropriate debug options are enabled).  So
> > this sequence:
> > 
> > 	folio_put(folio);
> > 	folio_unlock(folio);
> > 
> > is _generally_ unsafe because the folio_put() might be the last put of
> > the refcount which will cause the folio to be freed.  However, if we know
> > that the folio has a refcount > 1, it's safe because the folio_put()
> > won't free the folio.  We do know that the folio has a refcount >1
> > because it's in the page cache, which keeps a refcount on the folio.
> > Since we have it locked, we know that truncation will wait for the unlock
> > to happen, and truncation will be the last one to put the refcount.
> 
> I agree that it is save, but is it worth it having that subtle detail here
> instead of just doing unlock+put?
> 
> IOW, what do we gain by doing it differently? :)

That was in the initial mail:

> With this patch, we can get 8% performance gain for lmbench testcase
> 'lat_pagefault -P 1 file', the size of file is 512M.

Obviously the exact gains are going to depend on how good your CPU is
at doing atomic inc/decs, but reducing the number of atomics is always
a good thing.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21 13:20             ` Matthew Wilcox
@ 2025-08-21 13:35               ` David Hildenbrand
  2025-09-02 14:04                 ` Kefeng Wang
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2025-08-21 13:35 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jinjiang Tu, fengwei.yin, akpm, linux-mm, wangkefeng.wang

On 21.08.25 15:20, Matthew Wilcox wrote:
> On Thu, Aug 21, 2025 at 02:57:54PM +0200, David Hildenbrand wrote:
>> On 21.08.25 14:45, Matthew Wilcox wrote:
>>> On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
>>>> 在 2025/8/20 20:42, Matthew Wilcox 写道:
>>>>> On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
>>>>>> We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
>>>>>> if we doesn't set folio into pte, we should unlock and put folio.
>>>>> I agree that folio_unlock() needs to be called before folio_put().
>>>>> What I don't understand is why we need to delay folio_unlock() until
>>>>> right before folio_put().  Can't we leave folio_unlock() where it
>>>>> currently is and only move the folio_put()?
>>>>
>>>> In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
>>>> the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
>>>> the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.
>>> Oh, I see.  I misread your patch; sorry about that.
>>>
>>> However, it is still safe to move only the folio_put() and not move
>>> the folio_unlock()!  It's a little subtle, so I'll explain.
>>>
>>> We must not free a locked folio.  The page allocator has checks for this
>>> and will complain (assuming appropriate debug options are enabled).  So
>>> this sequence:
>>>
>>> 	folio_put(folio);
>>> 	folio_unlock(folio);
>>>
>>> is _generally_ unsafe because the folio_put() might be the last put of
>>> the refcount which will cause the folio to be freed.  However, if we know
>>> that the folio has a refcount > 1, it's safe because the folio_put()
>>> won't free the folio.  We do know that the folio has a refcount >1
>>> because it's in the page cache, which keeps a refcount on the folio.
>>> Since we have it locked, we know that truncation will wait for the unlock
>>> to happen, and truncation will be the last one to put the refcount.
>>
>> I agree that it is save, but is it worth it having that subtle detail here
>> instead of just doing unlock+put?
>>
>> IOW, what do we gain by doing it differently? :)
> 
> That was in the initial mail:
> 
>> With this patch, we can get 8% performance gain for lmbench testcase
>> 'lat_pagefault -P 1 file', the size of file is 512M.
> 
> Obviously the exact gains are going to depend on how good your CPU is
> at doing atomic inc/decs, but reducing the number of atomics is always
> a good thing.

Ah, yes, obviously. I misread your mail, assuming you would argue for a 
further change.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21 12:45         ` Matthew Wilcox
  2025-08-21 12:57           ` David Hildenbrand
@ 2025-08-22  2:01           ` Jinjiang Tu
  1 sibling, 0 replies; 13+ messages in thread
From: Jinjiang Tu @ 2025-08-22  2:01 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: fengwei.yin, akpm, david, linux-mm, wangkefeng.wang

[-- Attachment #1: Type: text/plain, Size: 2038 bytes --]


在 2025/8/21 20:45, Matthew Wilcox 写道:
> On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
>> 在 2025/8/20 20:42, Matthew Wilcox 写道:
>>> On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
>>>> We should call folio_unlock() before folio_put(). In filemap_map_order0_folio(),
>>>> if we doesn't set folio into pte, we should unlock and put folio.
>>> I agree that folio_unlock() needs to be called before folio_put().
>>> What I don't understand is why we need to delay folio_unlock() until
>>> right before folio_put().  Can't we leave folio_unlock() where it
>>> currently is and only move the folio_put()?
>> In filemap_map_order0_folio(), assuming the page is hwpoisoned, we skip set_pte_range(),
>> the folio should be unlocked and put. If we only move folio_put() to filemap_map_order0_folio(),
>> the folio is unlocked when filemap_map_pages() doesn't hold any folio refcount.
> Oh, I see.  I misread your patch; sorry about that.
>
> However, it is still safe to move only the folio_put() and not move
> the folio_unlock()!  It's a little subtle, so I'll explain.
>
> We must not free a locked folio.  The page allocator has checks for this
> and will complain (assuming appropriate debug options are enabled).  So
> this sequence:
>
> 	folio_put(folio);
> 	folio_unlock(folio);
>
> is _generally_ unsafe because the folio_put() might be the last put of
> the refcount which will cause the folio to be freed.  However, if we know
> that the folio has a refcount > 1, it's safe because the folio_put()
> won't free the folio.  We do know that the folio has a refcount >1
> because it's in the page cache, which keeps a refcount on the folio.
> Since we have it locked, we know that truncation will wait for the unlock
> to happen, and truncation will be the last one to put the refcount.

Ah, I see. Thanks for explaining this.

The performance gain is from decreasing atomic operations of folio's refcount,
is irrelevant to folio_unlock() call. So maybe it's no need to introduce this sublte change?

[-- Attachment #2: Type: text/html, Size: 2962 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-08-21 13:35               ` David Hildenbrand
@ 2025-09-02 14:04                 ` Kefeng Wang
  2025-09-03  4:50                   ` Matthew Wilcox
  0 siblings, 1 reply; 13+ messages in thread
From: Kefeng Wang @ 2025-09-02 14:04 UTC (permalink / raw)
  To: David Hildenbrand, Matthew Wilcox
  Cc: Jinjiang Tu, fengwei.yin, akpm, linux-mm



On 2025/8/21 21:35, David Hildenbrand wrote:
> On 21.08.25 15:20, Matthew Wilcox wrote:
>> On Thu, Aug 21, 2025 at 02:57:54PM +0200, David Hildenbrand wrote:
>>> On 21.08.25 14:45, Matthew Wilcox wrote:
>>>> On Thu, Aug 21, 2025 at 09:22:48AM +0800, Jinjiang Tu wrote:
>>>>> 在 2025/8/20 20:42, Matthew Wilcox 写道:
>>>>>> On Wed, Aug 20, 2025 at 09:10:56AM +0800, Jinjiang Tu wrote:
>>>>>>> We should call folio_unlock() before folio_put(). In 
>>>>>>> filemap_map_order0_folio(),
>>>>>>> if we doesn't set folio into pte, we should unlock and put folio.
>>>>>> I agree that folio_unlock() needs to be called before folio_put().
>>>>>> What I don't understand is why we need to delay folio_unlock() until
>>>>>> right before folio_put().  Can't we leave folio_unlock() where it
>>>>>> currently is and only move the folio_put()?
>>>>>
>>>>> In filemap_map_order0_folio(), assuming the page is hwpoisoned, we 
>>>>> skip set_pte_range(),
>>>>> the folio should be unlocked and put. If we only move folio_put() 
>>>>> to filemap_map_order0_folio(),
>>>>> the folio is unlocked when filemap_map_pages() doesn't hold any 
>>>>> folio refcount.
>>>> Oh, I see.  I misread your patch; sorry about that.
>>>>
>>>> However, it is still safe to move only the folio_put() and not move
>>>> the folio_unlock()!  It's a little subtle, so I'll explain.
>>>>
>>>> We must not free a locked folio.  The page allocator has checks for 
>>>> this
>>>> and will complain (assuming appropriate debug options are enabled).  So
>>>> this sequence:
>>>>
>>>>     folio_put(folio);
>>>>     folio_unlock(folio);
>>>>
>>>> is _generally_ unsafe because the folio_put() might be the last put of
>>>> the refcount which will cause the folio to be freed.  However, if we 
>>>> know
>>>> that the folio has a refcount > 1, it's safe because the folio_put()
>>>> won't free the folio.  We do know that the folio has a refcount >1
>>>> because it's in the page cache, which keeps a refcount on the folio.
>>>> Since we have it locked, we know that truncation will wait for the 
>>>> unlock
>>>> to happen, and truncation will be the last one to put the refcount.
>>>
>>> I agree that it is save, but is it worth it having that subtle detail 
>>> here
>>> instead of just doing unlock+put?
>>>
>>> IOW, what do we gain by doing it differently? :)
>>
>> That was in the initial mail:
>>
>>> With this patch, we can get 8% performance gain for lmbench testcase
>>> 'lat_pagefault -P 1 file', the size of file is 512M.
>>
>> Obviously the exact gains are going to depend on how good your CPU is
>> at doing atomic inc/decs, but reducing the number of atomics is always
>> a good thing.
> 
> Ah, yes, obviously. I misread your mail, assuming you would argue for a 
> further change.
> 

So, Andrew, could you help to pick this up?



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-09-02 14:04                 ` Kefeng Wang
@ 2025-09-03  4:50                   ` Matthew Wilcox
  2025-09-03  6:22                     ` Kefeng Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2025-09-03  4:50 UTC (permalink / raw)
  To: Kefeng Wang; +Cc: David Hildenbrand, Jinjiang Tu, fengwei.yin, akpm, linux-mm

On Tue, Sep 02, 2025 at 10:04:09PM +0800, Kefeng Wang wrote:
> So, Andrew, could you help to pick this up?

NAK.  Send a version that does not move the folio_unlock().


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] filemap: optimize order0 folio in filemap_map_pages
  2025-09-03  4:50                   ` Matthew Wilcox
@ 2025-09-03  6:22                     ` Kefeng Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Kefeng Wang @ 2025-09-03  6:22 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: David Hildenbrand, Jinjiang Tu, akpm, linux-mm



On 2025/9/3 12:50, Matthew Wilcox wrote:
> On Tue, Sep 02, 2025 at 10:04:09PM +0800, Kefeng Wang wrote:
>> So, Andrew, could you help to pick this up?
> 
> NAK.  Send a version that does not move the folio_unlock().
> 
OK, Jinjian, please update.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-09-03  6:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-19 14:06 [PATCH] filemap: optimize order0 folio in filemap_map_pages Jinjiang Tu
2025-08-19 15:52 ` Matthew Wilcox
2025-08-20  1:10   ` Jinjiang Tu
2025-08-20 12:42     ` Matthew Wilcox
2025-08-21  1:22       ` Jinjiang Tu
2025-08-21 12:45         ` Matthew Wilcox
2025-08-21 12:57           ` David Hildenbrand
2025-08-21 13:20             ` Matthew Wilcox
2025-08-21 13:35               ` David Hildenbrand
2025-09-02 14:04                 ` Kefeng Wang
2025-09-03  4:50                   ` Matthew Wilcox
2025-09-03  6:22                     ` Kefeng Wang
2025-08-22  2:01           ` Jinjiang Tu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).