linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
@ 2018-07-02  7:44 Ethan Lien
  2018-07-02  7:49 ` Nikolay Borisov
  2018-07-04 16:26 ` David Sterba
  0 siblings, 2 replies; 5+ messages in thread
From: Ethan Lien @ 2018-07-02  7:44 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Ethan Lien

We use customized, nodesize batch value to update dirty_metadata_bytes.
We should also use batch version of compare function or we will easily
goto fast path and get false result from percpu_counter_compare().

Signed-off-by: Ethan Lien <ethanlien@synology.com>
---
 fs/btrfs/disk-io.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 205092dc9390..dfed08e70ec1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -961,8 +961,9 @@ static int btree_writepages(struct address_space *mapping,
 
 		fs_info = BTRFS_I(mapping->host)->root->fs_info;
 		/* this is a bit racy, but that's ok */
-		ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
-					     BTRFS_DIRTY_METADATA_THRESH);
+		ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
+					     BTRFS_DIRTY_METADATA_THRESH,
+					     fs_info->dirty_metadata_batch);
 		if (ret < 0)
 			return 0;
 	}
@@ -4150,8 +4151,9 @@ static void __btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info,
 	if (flush_delayed)
 		btrfs_balance_delayed_items(fs_info);
 
-	ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
-				     BTRFS_DIRTY_METADATA_THRESH);
+	ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
+				     BTRFS_DIRTY_METADATA_THRESH,
+				     fs_info->dirty_metadata_batch);
 	if (ret > 0) {
 		balance_dirty_pages_ratelimited(fs_info->btree_inode->i_mapping);
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
  2018-07-02  7:44 [PATCH] btrfs: use correct compare function of dirty_metadata_bytes Ethan Lien
@ 2018-07-02  7:49 ` Nikolay Borisov
  2018-07-04 16:55   ` David Sterba
  2018-07-04 16:26 ` David Sterba
  1 sibling, 1 reply; 5+ messages in thread
From: Nikolay Borisov @ 2018-07-02  7:49 UTC (permalink / raw)
  To: Ethan Lien, linux-btrfs



On  2.07.2018 10:44, Ethan Lien wrote:
> We use customized, nodesize batch value to update dirty_metadata_bytes.
> We should also use batch version of compare function or we will easily
> goto fast path and get false result from percpu_counter_compare().
> 
> Signed-off-by: Ethan Lien <ethanlien@synology.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

Also this needs the following tag: 

Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count")

And perhaps stable from 4.4 onwards. But David will take care
of that. So the patch is good. 

> ---
>  fs/btrfs/disk-io.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 205092dc9390..dfed08e70ec1 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -961,8 +961,9 @@ static int btree_writepages(struct address_space *mapping,
>  
>  		fs_info = BTRFS_I(mapping->host)->root->fs_info;
>  		/* this is a bit racy, but that's ok */
> -		ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
> -					     BTRFS_DIRTY_METADATA_THRESH);
> +		ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
> +					     BTRFS_DIRTY_METADATA_THRESH,
> +					     fs_info->dirty_metadata_batch);
>  		if (ret < 0)
>  			return 0;
>  	}
> @@ -4150,8 +4151,9 @@ static void __btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info,
>  	if (flush_delayed)
>  		btrfs_balance_delayed_items(fs_info);
>  
> -	ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
> -				     BTRFS_DIRTY_METADATA_THRESH);
> +	ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
> +				     BTRFS_DIRTY_METADATA_THRESH,
> +				     fs_info->dirty_metadata_batch);
>  	if (ret > 0) {
>  		balance_dirty_pages_ratelimited(fs_info->btree_inode->i_mapping);
>  	}
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
  2018-07-02  7:44 [PATCH] btrfs: use correct compare function of dirty_metadata_bytes Ethan Lien
  2018-07-02  7:49 ` Nikolay Borisov
@ 2018-07-04 16:26 ` David Sterba
  1 sibling, 0 replies; 5+ messages in thread
From: David Sterba @ 2018-07-04 16:26 UTC (permalink / raw)
  To: Ethan Lien; +Cc: linux-btrfs

On Mon, Jul 02, 2018 at 03:44:58PM +0800, Ethan Lien wrote:
> We use customized, nodesize batch value to update dirty_metadata_bytes.
> We should also use batch version of compare function or we will easily
> goto fast path and get false result from percpu_counter_compare().
> 
> Signed-off-by: Ethan Lien <ethanlien@synology.com>

This looks like it could have some observable effects. The default batch
is 32 * cpus online, while the one supplied by btrfs is much higher as
it's a multiple of nodesize.

The BTRFS_DIRTY_METADATA_THRESH is 32MiB so the comparision is likely to
be off scale in some if not most cases. I got lost in the callchains and
how the return values of writepage are interpreted.

The effect I see is: there are many pages batched (by the _add_batch
calls) but none of the compare calls dects that and exits early. So
there are more metadata being built up in memory and will have to be
synced later.

The change looks correct to me, we have to compare the numbers of the
same units. I'll add it to misc-next for testing, maybe the 0day bot
would detect some changes in performance.

Please let me know if you have further insights or clarifications to
what's written above, the changelog could use some update as the change
has some unobvious consequences.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
  2018-07-02  7:49 ` Nikolay Borisov
@ 2018-07-04 16:55   ` David Sterba
  2018-07-04 16:58     ` Nikolay Borisov
  0 siblings, 1 reply; 5+ messages in thread
From: David Sterba @ 2018-07-04 16:55 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Ethan Lien, linux-btrfs

On Mon, Jul 02, 2018 at 10:49:46AM +0300, Nikolay Borisov wrote:
> On  2.07.2018 10:44, Ethan Lien wrote:
> > We use customized, nodesize batch value to update dirty_metadata_bytes.
> > We should also use batch version of compare function or we will easily
> > goto fast path and get false result from percpu_counter_compare().
> > 
> > Signed-off-by: Ethan Lien <ethanlien@synology.com>
> 
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> 
> Also this needs the following tag: 
> 
> Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count")

I'll add the tags, thanks.

2939         fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));

What a commit, sigh. Arbitrary constants without any explanation. I
don't quite understand why the ilog2(nr_cpui_ids) was chosen. If cpus
are onlined/offlined there's no adjustment. I'd think that there should
be some amount of data that are supposed to be dirtied per-cpu and this
value is independent of the number of cpus. That the online/offline
status is abstracted by the percpu API. I don't have time to study that
too deeply, but this is something that needs to be revisited.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
  2018-07-04 16:55   ` David Sterba
@ 2018-07-04 16:58     ` Nikolay Borisov
  0 siblings, 0 replies; 5+ messages in thread
From: Nikolay Borisov @ 2018-07-04 16:58 UTC (permalink / raw)
  To: dsterba, Ethan Lien, linux-btrfs



On  4.07.2018 19:55, David Sterba wrote:
> On Mon, Jul 02, 2018 at 10:49:46AM +0300, Nikolay Borisov wrote:
>> On  2.07.2018 10:44, Ethan Lien wrote:
>>> We use customized, nodesize batch value to update dirty_metadata_bytes.
>>> We should also use batch version of compare function or we will easily
>>> goto fast path and get false result from percpu_counter_compare().
>>>
>>> Signed-off-by: Ethan Lien <ethanlien@synology.com>
>>
>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>>
>> Also this needs the following tag: 
>>
>> Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count")
> 
> I'll add the tags, thanks.
> 
> 2939         fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
> 
> What a commit, sigh. Arbitrary constants without any explanation. I
> don't quite understand why the ilog2(nr_cpui_ids) was chosen. If cpus
> are onlined/offlined there's no adjustment. I'd think that there should
> be some amount of data that are supposed to be dirtied per-cpu and this
> value is independent of the number of cpus. That the online/offline
> status is abstracted by the percpu API. I don't have time to study that
> too deeply, but this is something that needs to be revisited.

TBH, what is perhaps the safest approach would be to modify the code to
eliminate the custom batch related variables/code. The code has been
working like that for years.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-07-04 16:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-02  7:44 [PATCH] btrfs: use correct compare function of dirty_metadata_bytes Ethan Lien
2018-07-02  7:49 ` Nikolay Borisov
2018-07-04 16:55   ` David Sterba
2018-07-04 16:58     ` Nikolay Borisov
2018-07-04 16:26 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).