From: Arne Jansen <sensille@gmx.net>
To: Liu Bo <liubo2009@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/5] btrfs: snapshot deletion via readahead
Date: Fri, 13 Apr 2012 08:53:33 +0200
Message-ID: <4F87CD6D.5040100@gmx.net>
In-Reply-To: <4F87A034.3030607@cn.fujitsu.com>

On 13.04.2012 05:40, Liu Bo wrote:
> On 04/12/2012 11:54 PM, Arne Jansen wrote:
>> This patchset reimplements snapshot deletion with the help of the readahead
>> framework. For this, callbacks are added to the framework. The main idea is
>> to traverse many snapshots at once and read many branches at once. This way
>> readahead gets many requests at once (currently about 50000), giving it the
>> chance to order disk accesses properly. On a single disk, the effect is
>> currently spoiled by sync operations that still take place, mainly checksum
>> deletion. The most benefit can be gained with multiple devices, as all devices
>> can be fully utilized. It scales quite well with the number of devices.
>> For more details see the commit messages of the individual patches and the
>> source code comments.
>>
>> How it is tested:
>> I created a test volume using David Sterba's stress-subvol-git-aging.sh. It
>> checks out random versions of the kernel git tree, creating a snapshot from it
>> from time to time and checks out other versions there, and so on. In the end
>> the fs had 80 subvols with various degrees of sharing between them. The
>> following tests were conducted on it:
>>  - delete a subvol using droptree and check the fs with btrfsck afterwards
>>    for consistency
>>  - delete all subvols and verify with btrfs-debug-tree that the extent
>>    allocation tree is clean
>>  - delete 70 subvols, and in parallel empty the other 10 with rm -rf to get
>>    a good pressure on locking
>>  - add various degrees of memory pressure to the previous test to get pages
>>    to expire early from page cache
>>  - enable all relevant kernel debugging options during all tests
>>
>> The performance gain on a single drive was about 20%, on 8 drives about 600%.
>> It depends heavily on the maximum parallelism of the readahead, which is
>> currently hardcoded to about 50000. This number is subject to two factors: the
>> available RAM and the size of the saved state for a commit. As the full state
>> has to be saved on commit, high parallelism leads to a large state.
>>
>> Based on this I'll see if I can add delayed checksum deletions and running
>> the delayed refs via readahead, to gain a maximum ordering of I/O ops.
>>
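As a rough illustration of the ordering idea quoted above (handing readahead
many requests at once so it can order disk accesses per device), here is a
minimal sketch. It is not code from the patchset; ReadRequest, order_requests
and seek_distance are hypothetical names, and the seek model is deliberately
simplistic.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadRequest(NamedTuple):
    # Hypothetical request record; not a structure from the patchset.
    device: int  # index of the device holding the block
    offset: int  # byte offset of the block on that device


def order_requests(requests):
    """Group pending reads per device and sort each group by offset."""
    per_device = defaultdict(list)
    for req in requests:
        per_device[req.device].append(req)
    return {dev: sorted(reqs, key=lambda r: r.offset)
            for dev, reqs in per_device.items()}


def seek_distance(requests):
    """Sum of offset jumps when serving requests in the given order."""
    return sum(abs(b.offset - a.offset)
               for a, b in zip(requests, requests[1:]))
```

With a large batch, each device gets its own sorted queue, so several devices
can be kept busy independently while each one sees mostly ascending offsets.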
>
> Hi Arne,
>
> Can you please show us some use cases for this, or can we get some extra benefits from it? :)
The case I'm most concerned with is large filesystems (like 20x3T)
with thousands of users in thousands of subvolumes, where snapshots are
taken hourly. Creating these snapshots is relatively easy; getting rid
of them is not.
There are already reports of deleting a single snapshot taking
several days. So we really need a huge speedup here, and this is only
the beginning :)
-Arne
>
> thanks,
> liubo
>
>> This patchset is also available at
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/arne/linux-btrfs.git droptree
>>
>> Arne Jansen (5):
>>   btrfs: extend readahead interface
>>   btrfs: add droptree inode
>>   btrfs: droptree structures and initialization
>>   btrfs: droptree implementation
>>   btrfs: use droptree for snapshot deletion
>>
>>  fs/btrfs/Makefile           |    2 +-
>>  fs/btrfs/btrfs_inode.h      |    4 +
>>  fs/btrfs/ctree.h            |   78 ++-
>>  fs/btrfs/disk-io.c          |   19 +
>>  fs/btrfs/droptree.c         | 1916 +++++++++++++++++++++++++++++++++++++++++++
>>  fs/btrfs/free-space-cache.c |  131 +++-
>>  fs/btrfs/free-space-cache.h |   32 +
>>  fs/btrfs/inode.c            |    3 +-
>>  fs/btrfs/reada.c            |  494 +++++++++---
>>  fs/btrfs/scrub.c            |   29 +-
>>  fs/btrfs/transaction.c      |   35 +-
>>  11 files changed, 2592 insertions(+), 151 deletions(-)
>>  create mode 100644 fs/btrfs/droptree.c
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html