Linux Btrfs filesystem development
From: Qu Wenruo <wqu@suse.com>
To: Wang Yugui <wangyugui@e16-tech.com>
Cc: linux-btrfs@vger.kernel.org,
	Julian Taylor <julian.taylor@1und1.de>,
	Filipe Manana <fdmanana@suse.com>
Subject: Re: [PATCH v2] btrfs: do not wait for short bulk allocation
Date: Mon, 15 Apr 2024 07:49:06 +0930
Message-ID: <4ad8e412-1a37-42b1-8cc4-2ffec664b035@suse.com>
In-Reply-To: <20240414202622.B092.409509F4@e16-tech.com>



On 2024/4/14 21:56, Wang Yugui wrote:
> Hi,
> 
>> [BUG]
>> There is a recent report that when memory pressure is high (including
>> cached pages), btrfs can spend most of its time on memory allocation in
>> btrfs_alloc_page_array() for compressed read/write.
>>
>> [CAUSE]
>> btrfs_alloc_page_array() always calls alloc_pages_bulk_array(), and
>> even if the bulk allocation failed (fell back to single page
>> allocation) we still retry, but with an extra memalloc_retry_wait().
>>
>> If the bulk alloc only returns one page at a time, we spend a lot of
>> time on the retry wait.
>>
>> The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
>> incomplete batch memory allocations").
>>
>> [FIX]
>> Although the commit mentioned that other filesystems do the wait, it's
>> not the case at least nowadays.
>>
>> All the mainlined filesystems only call memalloc_retry_wait() if they
>> failed to allocate any page (not only for bulk allocation).
>> If there is any progress, they won't call memalloc_retry_wait() at all.
>>
>> For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
>> if there is no allocation progress at all, and the call is not for
>> metadata readahead.
>>
>> So I don't believe we should call memalloc_retry_wait() unconditionally
>> for short allocation.
>>
>> This patch only calls memalloc_retry_wait() if it fails to allocate
>> any page for tree block allocation (which goes with __GFP_NOFAIL and may
>> not need the special handling anyway), and reduces the latency of
>> btrfs_alloc_page_array().
>>
>> Reported-by: Julian Taylor <julian.taylor@1und1.de>
>> Tested-by: Julian Taylor <julian.taylor@1und1.de>
>> Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
>> Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
> 
> It seems this patch removes all the logic of
> 	395cb57e8560 ("btrfs: wait between incomplete batch memory allocations"),
> 
> so should we revert this part too?

Oh, right.

Feel free to submit a patch to clean up the headers here.

Thanks,
Qu
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index c140dd0..df4675e 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -6,7 +6,6 @@
>   #include <linux/mm.h>
>   #include <linux/pagemap.h>
>   #include <linux/page-flags.h>
> -#include <linux/sched/mm.h>
>   #include <linux/spinlock.h>
>   #include <linux/blkdev.h>
>   #include <linux/swap.h>
> 
> 
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2024/04/14
> 
> 
>> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Still use bulk allocation function
>>    Since alloc_pages_bulk_array() would fall back to single page
>>    allocation by itself, there is no need to go alloc_page() manually.
>>
>> - Update the commit message to indicate other fses do not call
>>    memalloc_retry_wait() unconditionally
>>    In fact, they only call it when they need to retry hard and can not
>>    really fail.
>> ---
>>   fs/btrfs/extent_io.c | 22 +++++++++-------------
>>   1 file changed, 9 insertions(+), 13 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 7441245b1ceb..c96089b6f388 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -681,31 +681,27 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
>>   int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
>>   			   gfp_t extra_gfp)
>>   {
>> +	const gfp_t gfp = GFP_NOFS | extra_gfp;
>>   	unsigned int allocated;
>>   
>>   	for (allocated = 0; allocated < nr_pages;) {
>>   		unsigned int last = allocated;
>>   
>> -		allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
>> -						   nr_pages, page_array);
>> +		allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
>> +		if (unlikely(allocated == last)) {
>> +			/* Can not fail, wait and retry. */
>> +			if (extra_gfp & __GFP_NOFAIL) {
>> +				memalloc_retry_wait(GFP_NOFS);
>> +				continue;
>> +			}
>>   
>> -		if (allocated == nr_pages)
>> -			return 0;
>> -
>> -		/*
>> -		 * During this iteration, no page could be allocated, even
>> -		 * though alloc_pages_bulk_array() falls back to alloc_page()
>> -		 * if  it could not bulk-allocate. So we must be out of memory.
>> -		 */
>> -		if (allocated == last) {
>> +			/* Allowed to fail, error out. */
>>   			for (int i = 0; i < allocated; i++) {
>>   				__free_page(page_array[i]);
>>   				page_array[i] = NULL;
>>   			}
>>   			return -ENOMEM;
>>   		}
>> -
>> -		memalloc_retry_wait(GFP_NOFS);
>>   	}
>>   	return 0;
>>   }
>> -- 
>> 2.44.0
>>
> 
> 
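
For readers following the retry logic in the quoted patch, here is a
minimal userspace sketch of the same control flow. It is not the kernel
code: mock_allocator, mock_bulk_alloc(), mock_retry_wait() and
alloc_array() are hypothetical stand-ins for alloc_pages_bulk_array(),
memalloc_retry_wait() and btrfs_alloc_page_array(), and the "refill"
done inside the wait is an assumption added only so that the must-not-fail
case terminates in a test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an allocator under memory pressure. */
struct mock_allocator {
	unsigned int per_call;	/* max new entries handed out per bulk call */
	unsigned int budget;	/* entries still available before "OOM" */
	unsigned int refill;	/* entries "reclaimed" by one retry wait */
};

/* Counts how often the retry wait was taken. */
static unsigned int wait_calls;

/*
 * Stand-in for alloc_pages_bulk_array(): fills empty slots and returns
 * the total number of populated slots in the array.
 */
static unsigned int mock_bulk_alloc(struct mock_allocator *a,
				    unsigned int nr, void **array)
{
	unsigned int populated = 0;
	unsigned int given = 0;

	for (unsigned int i = 0; i < nr; i++) {
		if (!array[i] && given < a->per_call && a->budget > 0) {
			array[i] = malloc(1);
			if (array[i]) {
				given++;
				a->budget--;
			}
		}
		if (array[i])
			populated++;
	}
	return populated;
}

/* Stand-in for memalloc_retry_wait(); the refill is an assumption. */
static void mock_retry_wait(struct mock_allocator *a)
{
	wait_calls++;
	a->budget += a->refill;
}

/* Same shape as the patched loop: wait only when there is no progress. */
int alloc_array(struct mock_allocator *a, unsigned int nr, void **array,
		bool nofail)
{
	unsigned int allocated;

	assert(a && array);
	for (allocated = 0; allocated < nr;) {
		unsigned int last = allocated;

		allocated = mock_bulk_alloc(a, nr, array);
		if (allocated == last) {
			/* Can not fail, wait and retry. */
			if (nofail) {
				mock_retry_wait(a);
				continue;
			}
			/* Allowed to fail, clean up and error out. */
			for (unsigned int i = 0; i < allocated; i++) {
				free(array[i]);
				array[i] = NULL;
			}
			return -1;
		}
	}
	return 0;
}
```

In this model, an allocator that hands out one entry per call still
completes without a single wait, since every iteration makes progress.
The wait is only reached when an iteration makes no progress and the
caller asked for must-not-fail behavior.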
