From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/4] btrfs: make autodefrag to defrag and only defrag small write ranges
Date: Tue, 15 Feb 2022 14:55:17 +0800
Message-ID: <870cb1f0-4108-75d5-6b45-e6a26a2be3d2@suse.com>
In-Reply-To: <cover.1644737297.git.wqu@suse.com>



On 2022/2/13 15:42, Qu Wenruo wrote:
> When a small write reaches disk, btrfs will mark the inode for
> autodefrag, and record the transid of the inode for autodefrag.
> 
> Then autodefrag uses the transid to only scan newer file extents.
> 
> However, this transid based scanning scheme has a hole: the small write
> size threshold that triggers autodefrag is not the same as the extent
> size threshold autodefrag uses.
> 
> For the following write sequence on a non-compressed inode:
> 
>   pwrite 0 4k
>   sync
>   pwrite 4k 128k
>   sync
>   pwrite 132k 128k
>   sync
> 
> The first 4K is indeed a small write (<64K), but the later two 128K ones
> are definitely not (>64K).
> 
> However, autodefrag will try to defrag all three writes, as the
> extent_threshold used for autodefrag is fixed at 256K.
> 
> This extra scanning on extents which didn't trigger autodefrag can cause
> extra IO.
> 
> This patchset will try to address the problem by:
> 
> - Remove the inode_defrag re-queue behavior
>    Now we just scan one file until its end (while keeping the
>    max_sectors_to_defrag limit, and frequently checking the root refs
>    and remount situation to exit early).
> 
>    This also saves several bytes from the inode_defrag structure.
> 
>    This is done in the 3rd patch.
> 
> - Save the @small_write value into inode_defrag and use it as the
>    autodefrag extent threshold
>    Now there is no gap between the autodefrag threshold and small_write.
> 
>    This is done in the 4th patch.
> 
> The remaining patches are:
> 
> - Remove one dead parameter
> 
> - Add extra trace events for autodefrag
>    So end users will no longer need to re-compile kernel modules, and can
>    use trace events to provide debug info on the autodefrag/defrag ioctl.
> 
> Unfortunately I don't have a good benchmark setup for the patchset yet,
> but unlike the previous RFC version, this one brings very little extra
> resource usage, and only changes the extent_threshold for
> autodefrag.
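
To make the threshold gap from the cover letter concrete, here is a minimal sketch that models only the extent-size filter for the three-write example above. The function name `targets` is made up for illustration, sizes are in KiB, and the real autodefrag code performs other checks that this model ignores.

```shell
#!/bin/sh
# Model only: thresholds are taken from the cover letter text.
SMALL_WRITE_KB=64   # small-write trigger for a non-compressed inode
OLD_THRESH_KB=256   # fixed autodefrag extent_threshold before this series

# targets THRESH SIZE...
# Print the extent sizes that fall below THRESH, i.e. the ones the
# size filter alone would let autodefrag pick up (hypothetical helper).
targets() {
	thresh=$1
	shift
	out=""
	for size in "$@"; do
		if [ "$size" -lt "$thresh" ]; then
			out="$out ${size}k"
		fi
	done
	echo "${out# }"
}

# The three synced writes from the example: 4K, 128K, 128K.
echo "old (256K threshold): $(targets "$OLD_THRESH_KB" 4 128 128)"
echo "new (64K threshold):  $(targets "$SMALL_WRITE_KB" 4 128 128)"
```

With the fixed 256K threshold all three extents pass the size filter, while with the small-write value only the 4K extent does.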

Got a small benchmark result for it.

Using the following fio job:

  [torrent]
  filename=torrent-test
  rw=randwrite
  ioengine=sync
  size=4g
  randseed=123456
  allrandrepeat=1
  fallocate=none

The VM only has 2G of RAM.

This should really be the worst-case scenario.

The full benchmark script is:

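# $tracedir is the tracefs directory (e.g. /sys/kernel/tracing);
# $dev and $mnt are the test device and mount point.
start_trace()
{
	# Stop tracing, then clear the old buffer and event list
	echo 0 > "$tracedir/tracing_on"
	echo > "$tracedir/trace"
	#echo > "$tracedir/trace_options"
	echo > "$tracedir/set_event"
	# Record only the defrag_file_end event, then start tracing
	echo "btrfs:defrag_file_end" >> "$tracedir/set_event"
	echo 1 > "$tracedir/tracing_on"
}

end_trace()
{
	# Save the trace buffer, then stop tracing
	cp "$tracedir/trace" /home/adam
	echo 0 > "$tracedir/tracing_on"
}

	mkfs.btrfs -f "$dev"

	start_trace
	mount "$dev" "$mnt" -o autodefrag
	cd "$mnt"
	fio /home/adam/torrent.fio
	cd
	umount "$mnt"
	end_trace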

With all defragged sectors accounted for, before the last two patches:

Total sectors defragged		= 6846831
Total defrag_file() calls	= 6701

After the last two patches:

Total sectors defragged		= 3466851
Total defrag_file() calls	= 3396

This shows an obvious drop, roughly by half, in the sectors marked for
autodefrag.
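
As a quick sanity check on those totals, the relative drop can be computed with plain shell integer arithmetic. This is only a sketch: `pct_drop` is a helper name made up here, and it truncates to whole percent.

```shell
#!/bin/sh
# pct_drop BEFORE AFTER
# Print the whole-percent reduction from BEFORE to AFTER
# (integer arithmetic, truncated; hypothetical helper).
pct_drop() {
	echo $(( ($1 - $2) * 100 / $1 ))
}

# Totals reported above, before and after the last two patches.
echo "sectors defragged:   $(pct_drop 6846831 3466851)% fewer"
echo "defrag_file() calls: $(pct_drop 6701 3396)% fewer"
```

Both lines come out at 49%, so the number of defrag_file() calls falls in step with the number of sectors defragged.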

Thanks,
Qu




> 
> Changelog:
> RFC->v1:
> - Add ftrace events for defrag
> 
> - Add a new patch to change how we run defrag inodes
>    Instead of saving the previous location and re-queueing, just run it
>    in one go.
>    Previously btrfs_run_defrag_inodes() would always exhaust the existing
>    inode_defrag anyway, so the change should not make much difference.
> 
> - Change autodefrag extent_thresh to close the gap, rather than using
>    another extent io tree
>    Now it uses fewer resources and keeps the critical section small,
>    while almost reaching the same objective.
> 
> Qu Wenruo (4):
>    btrfs: remove unused parameter for btrfs_add_inode_defrag()
>    btrfs: add trace events for defrag
>    btrfs: autodefrag: only scan one inode once
>    btrfs: close the gap between inode_should_defrag() and autodefrag
>      extent size threshold
> 
>   fs/btrfs/ctree.h             |   3 +-
>   fs/btrfs/file.c              | 165 +++++++++++++++--------------------
>   fs/btrfs/inode.c             |   4 +-
>   fs/btrfs/ioctl.c             |   4 +
>   include/trace/events/btrfs.h | 127 +++++++++++++++++++++++++++
>   5 files changed, 206 insertions(+), 97 deletions(-)
> 


