From: Arne Jansen <sensille@gmx.net>
To: Linux Btrfs <linux-btrfs@vger.kernel.org>
Cc: Chris Mason <chris.mason@oracle.com>
Subject: Re: [RFC] Tree fragmentation and prefetching
Date: Wed, 23 Mar 2011 16:01:04 +0100 [thread overview]
Message-ID: <4D8A0B30.7080608@gmx.net> (raw)
In-Reply-To: <4D89F03A.9050306@gmx.net>
On 23.03.2011 14:06, Arne Jansen wrote:
> While looking into the performance of scrub I noticed that a significant
> amount of time is being used for loading the extent tree and the csum
> tree. While this is no surprise I did some prototyping on how to improve
> on it.
> The main idea is to load the tree (or parts of it) top-down, order the
> needed blocks and distribute it over all disks.
> To keep you interested, some results first.
>
> a) by tree enumeration with reada=2
> reading extent tree: 242s
> reading csum tree: 140s
> reading both trees: 324s
sorry, the number is wrong, it should be 384s (just the sum of both
./. rounding errors).
>
> b) prefetch prototype
> reading extent tree: 23.5s
> reading csum tree: 20.4s
> reading both trees: 25.7s
>
> The test setup consists of a 7 Seagate ES.2 1TB disks filesystem, filled
> 28%. It is created with the current git tree + the round robin patch and
> filled with
>
> fs_mark -D 512 -t 16 -n 4096 -F -S0
>
> The 'normal' read is done by enumerating the leaves by btrfs_next_leaf()
> with path->reada=2. Both trees are being enumerated one after the other.
> The prototype currently just uses raw bios, does not make use of the
> page cache and does not enter the read pages into the cache. This will
> probably add some overhead. It also does not check the crcs.
>
> While it is very promising to implement it for scrub, I think a more
> general interface which can be used for every enumeration would be
> beneficial. Use cases that come to mind are rebalance, reflink, deletion
> of large files, listing of large directories etc..
>
> I'd imagine an interface along the lines of
>
> int btrfs_readahead_init(struct btrfs_reada_ctx *reada);
> int btrfs_readahead_add(struct btrfs_root *root,
> struct btrfs_key *start,
> struct btrfs_key *end,
> struct btrfs_reada_ctx *reada);
> void btrfs_readahead_wait(struct btrfs_reada_ctx *reada);
>
> to trigger the readahead of parts of a tree. Multiple readahead
> requests can be given before waiting. This would enable the very
> beneficial folding seen above for 'reading both trees'.
>
> Also it would be possible to add a cascading readahead, where the
> content of leaves would trigger readaheads in other trees, maybe by
> giving a callback for the decisions what to read instead of the fixed
> start/end range.
>
> For the implementation I'd need an interface which I haven't been able
> to find yet. Currently I can trigger the read of several pages / tree
> blocks and wait for the completion of each of them. What I'd need would
> be an interface that gives me a callback on each completion or a waiting
> function that wakes up on each completion with the information which
> pages just completed.
> One way to achieve this would be to add a hook, but I gladly take any
> implementation hints.
>
> --
> Arne
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2011-03-23 15:01 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-23 13:06 [RFC] Tree fragmentation and prefetching Arne Jansen
2011-03-23 15:01 ` Arne Jansen [this message]
2011-03-23 19:26 ` Andrey Kuzmin
2011-03-23 20:28 ` Arne Jansen
2011-03-23 21:47 ` Andrey Kuzmin
2011-03-24 1:38 ` Miao Xie
2011-03-24 7:29 ` Arne Jansen
2011-03-24 12:45 ` Miao Xie
2011-03-23 19:32 ` Chris Mason
2011-03-25 20:14 ` Arne Jansen
2011-03-25 20:15 ` Chris Mason
2011-03-25 20:53 ` Arne Jansen
2011-03-25 21:01 ` Chris Mason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D8A0B30.7080608@gmx.net \
--to=sensille@gmx.net \
--cc=chris.mason@oracle.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.