linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* the idea for improving the performance of b-tree search
@ 2010-10-13  9:00 Miao Xie
  2010-10-13 11:24 ` Chris Mason
  2010-10-18  2:41 ` Shaohua Li
  0 siblings, 2 replies; 7+ messages in thread
From: Miao Xie @ 2010-10-13  9:00 UTC (permalink / raw)
  To: Chris Mason; +Cc: Linux Btrfs

Hi, Chris

When I investigated the performance problem of file creation/deletion, I found
btrfs spends lots of time in the b-tree search, so I consider whether we can use
the latest search result in the same transaction or not.

My idea follows:
we can add mask or time stamp into b-tree's node and leaf, then we know whether
the node/leaf is COWed by the other task. If not, we check if the node/leaf of
the latest search result contains the key that we want to search. By this way,
we can reuse the latest search result in the same transaction and reduce the CPU
time spent in the b-tree search.

Chris, how do you think about it?

Regards
Miao

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-13  9:00 the idea for improving the performance of b-tree search Miao Xie
@ 2010-10-13 11:24 ` Chris Mason
  2010-10-13 15:22   ` David Nicol
  2010-10-18  2:41 ` Shaohua Li
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Mason @ 2010-10-13 11:24 UTC (permalink / raw)
  To: Miao Xie; +Cc: Linux Btrfs

On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote:
> Hi, Chris
> 
> When I investigated the performance problem of file creation/deletion, I found
> btrfs spends lots of time in the b-tree search, so I consider whether we can use
> the latest search result in the same transaction or not.
> 
> My idea follows:
> we can add mask or time stamp into b-tree's node and leaf, then we know whether
> the node/leaf is COWed by the other task. If not, we check if the node/leaf of
> the latest search result contains the key that we want to search. By this way,
> we can reuse the latest search result in the same transaction and reduce the CPU
> time spent in the b-tree search.
> 
> Chris, how do you think about it?

Do your results show which btree searches are happening most often?  My
impression is that we are usually searching for inodes, and I see 3
different optimizations that are likely to help:

1) cache the block number of the leaf where the inode lives in the
inode, or perhaps the whole path down from the root.

2) Do delayed insertion.  Each time we create a file we do a
number of btree insertions.  One for the inode+name backrefs,
one for any xattrs, two for the directory entries, one for the directory
inode update.

The directory name index is especially CPU intensive because it goes in
hash order, which will mean a lot of btree balances.

Batching these insertions, even in groups of 4 or 8 will make a big
difference.

3) Do delayed inode deletion.  See above, but change the word insert
with delete ;)

-chris

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-13 11:24 ` Chris Mason
@ 2010-10-13 15:22   ` David Nicol
  2010-10-13 15:45     ` Chris Mason
  0 siblings, 1 reply; 7+ messages in thread
From: David Nicol @ 2010-10-13 15:22 UTC (permalink / raw)
  To: Linux Btrfs

On Wed, Oct 13, 2010 at 6:24 AM, Chris Mason <chris.mason@oracle.com> w=
rote:

> 3) Do delayed inode deletion. =C2=A0See above, but change the word in=
sert
> with delete ;)

And the reserved-for-the-future-but-not-used-yet flags field in the
ioctl#21 control structure gets a use: bits to indicate for the
completion of which of the potential delayed things the caller is
waiting.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-13 15:22   ` David Nicol
@ 2010-10-13 15:45     ` Chris Mason
  2010-10-13 20:38       ` David Nicol
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Mason @ 2010-10-13 15:45 UTC (permalink / raw)
  To: David Nicol; +Cc: Linux Btrfs

On Wed, Oct 13, 2010 at 10:22:30AM -0500, David Nicol wrote:
> On Wed, Oct 13, 2010 at 6:24 AM, Chris Mason <chris.mason@oracle.com>=
 wrote:
>=20
> > 3) Do delayed inode deletion. =A0See above, but change the word ins=
ert
> > with delete ;)
>=20
> And the reserved-for-the-future-but-not-used-yet flags field in the
> ioctl#21 control structure gets a use: bits to indicate for the
> completion of which of the potential delayed things the caller is
> waiting.

This is similar to the delayed allocation we already do for file data
extents.  It is supposed to be transparent to userland, and just a way
to more efficiently poke the on disk structures.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-13 15:45     ` Chris Mason
@ 2010-10-13 20:38       ` David Nicol
  0 siblings, 0 replies; 7+ messages in thread
From: David Nicol @ 2010-10-13 20:38 UTC (permalink / raw)
  To: Chris Mason, Linux Btrfs

>> > 3) Do delayed inode deletion. =C2=A0See above, but change the word=
 insert
>> > with delete ;)
>>
>> And the reserved-for-the-future-but-not-used-yet flags field in the
>> ioctl#21 control structure gets a use: bits to indicate for the
>> completion of which of the potential delayed things the caller is
>> waiting.
>
> This is similar to the delayed allocation we already do for file data
> extents. =C2=A0It is supposed to be transparent to userland, and just=
 a way
> to more efficiently poke the on disk structures.
>
> -chris

What I mean is, if we want to give a userland way to see when any
deferred anything is done, using one wait queue that wakes all waiters
on all completion events will be simpler than having a separate wait
queue structure for every class of event, at the cost of unnecessary
tests when a waiter is woken on the wrong event. That is what is
traded off when deciding how many wait queues -- just like locks -- to
have; it's a granularity question.

Now that there is rmdir of empty subvolumes, though, ioctl#21 might be
redundant; I suppose that's your call.

Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-13  9:00 the idea for improving the performance of b-tree search Miao Xie
  2010-10-13 11:24 ` Chris Mason
@ 2010-10-18  2:41 ` Shaohua Li
  2010-10-18  9:36   ` Miao Xie
  1 sibling, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2010-10-18  2:41 UTC (permalink / raw)
  To: Miao Xie; +Cc: Chris Mason, Linux Btrfs

Hi Miao & Chris,
On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote:
> When I investigated the performance problem of file creation/deletion, I found
> btrfs spends lots of time in the b-tree search, so I consider whether we can use
> the latest search result in the same transaction or not.
> 
> My idea follows:
> we can add mask or time stamp into b-tree's node and leaf, then we know whether
> the node/leaf is COWed by the other task. If not, we check if the node/leaf of
> the latest search result contains the key that we want to search. By this way,
> we can reuse the latest search result in the same transaction and reduce the CPU
> time spent in the b-tree search.
Does this patch help a little for the tree search?
http://marc.info/?l=linux-btrfs&m=127175532803943&w=2
It doesn't solve all the issues, but in my test, it does help.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: the idea for improving the performance of b-tree search
  2010-10-18  2:41 ` Shaohua Li
@ 2010-10-18  9:36   ` Miao Xie
  0 siblings, 0 replies; 7+ messages in thread
From: Miao Xie @ 2010-10-18  9:36 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Chris Mason, Linux Btrfs

On Mon, 18 Oct 2010 10:41:41 +0800, Shaohua Li wrote:
> Hi Miao&  Chris,
> On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote:
>> When I investigated the performance problem of file creation/deletion, I found
>> btrfs spends lots of time in the b-tree search, so I consider whether we can use
>> the latest search result in the same transaction or not.
>>
>> My idea follows:
>> we can add mask or time stamp into b-tree's node and leaf, then we know whether
>> the node/leaf is COWed by the other task. If not, we check if the node/leaf of
>> the latest search result contains the key that we want to search. By this way,
>> we can reuse the latest search result in the same transaction and reduce the CPU
>> time spent in the b-tree search.
> Does this patch help a little for the tree search?
> http://marc.info/?l=linux-btrfs&m=127175532803943&w=2
> It doesn't solve all the issues, but in my test, it does help.

Thanks for your patch, but according to my test result, the time of b-tree search
has not been reduced. I think the reason is that extent-state is used when btrfs writes
data into the disk, so your patch is helpful when the disk I/O happens, but my test just
measures the cost of the b-tree searchs/insertions/removals, the disk I/O doesn't happen.

Regards
Miao

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-10-18  9:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-13  9:00 the idea for improving the performance of b-tree search Miao Xie
2010-10-13 11:24 ` Chris Mason
2010-10-13 15:22   ` David Nicol
2010-10-13 15:45     ` Chris Mason
2010-10-13 20:38       ` David Nicol
2010-10-18  2:41 ` Shaohua Li
2010-10-18  9:36   ` Miao Xie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).