From: Steven Pratt <slpratt@austin.ibm.com>
To: Chris Mason <chris.mason@oracle.com>, linux-btrfs@vger.kernel.org
Subject: Re: New experimental btrfs branch ready for testing
Date: Thu, 04 Jun 2009 14:02:20 -0500
Message-ID: <4A281A3C.6000006@austin.ibm.com>
In-Reply-To: <20090601210447.GC3890@think>
Chris Mason wrote:
> Hello everyone,
>
> Yan Zheng has been doing some major surgery to the back references and
> extent allocation code, tackling bottlenecks in the code that tracks
> extents. It scales better with many snapshots and performs better in
> the common case of no snapshots at all.
>
> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE. This means it is
> compatible with the current btrfs disk format, but once you mount a
> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
> KERNELS. Old kernels spit out an error message when you try them on new
> format filesystems.
>
> This is a large change, and I'm hoping to have it stable in time for the
> 2.6.31 merge window. I've been testing it for about a week now, and
> haven't been able to cause major problems yet. But, testing the
> compatibility with old format filesystems is the hard part, and
> everyone that pulls the new code should back up their data first.
>
> I've set up git branches called newformat where you can pull the new code.
>
> For the kernel (based on 2.6.30-rc7):
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>
>
So I started the performance runs on this. The base tests completed fine
on the raid system and I will post results as soon as I can finish
postprocessing, but when I tried the nodatacow run on that machine, it
crashed pretty early. Here is the console log:
btrfs2 kernel: [82057.882255] ------------[ cut here ]------------
Message from syslogd@ at Thu Jun 4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] invalid opcode: 0000 [#1] SMP
btrfs2 kernel: [82057.882535] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
btrfs2 kernel: [82057.882535] Stack:
btrfs2 kernel: [82057.882535] ffff88011786d800 ffff8801259f6ea0 000000b21f256030 00000000000000e9
btrfs2 kernel: [82057.882535] 000000352231b250 ffff880089abbf40 ffff88013d0e2440 0000000000000001
btrfs2 kernel: [82057.882535] Call Trace:
btrfs2 kernel: [82057.882535] [<ffffffffa0445198>] run_one_delayed_ref+0x382/0x42f [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa0464bd1>] ? map_extent_buffer+0xab/0xbe [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa0445f75>] run_clustered_refs+0x237/0x2b4 [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa044609e>] btrfs_run_delayed_refs+0xac/0x195 [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa044e86e>] __btrfs_end_transaction+0x59/0xfe [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa044e92e>] btrfs_end_transaction+0xb/0xd [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa045418b>] btrfs_finish_ordered_io+0x224/0x24d [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa04541c4>] btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa0467599>] end_bio_extent_writepage+0xa3/0x18f [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
btrfs2 kernel: [82057.882535] [<ffffffff802cbbee>] bio_endio+0x26/0x28
btrfs2 kernel: [82057.882535] [<ffffffffa044b5d6>] end_workqueue_fn+0x111/0x11e [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa046eff5>] worker_loop+0x67/0x1ee [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffffa046ef8e>] ? worker_loop+0x0/0x1ee [btrfs]
btrfs2 kernel: [82057.882535] [<ffffffff8024c324>] kthread+0x56/0x86
btrfs2 kernel: [82057.882535] [<ffffffff8020c9fa>] child_rip+0xa/0x20
btrfs2 kernel: [82057.882535] [<ffffffff8024c2ce>] ? kthread+0x0/0x86
btrfs2 kernel: [82057.882535] [<ffffffff8020c9f0>] ? child_rip+0x0/0x20
btrfs2 kernel: [82057.882535] Code: 08 4c 8d 45 d4 41 8d 44 24 18 48 8b 73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45 d4 e8 df e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83 cf 01 4c 89 e7 48 6b
I also ran this on the single disk system and it did not make it through
the base tests. The errors are different.
[101511.664497] Pid: 28597, comm: btrfs-transacti Tainted: G D 2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101511.675497] RIP: 0010:[<ffffffff804cd70d>] [<ffffffff804cd70d>] _spin_lock+0x14/0x1a
[101511.684494] RSP: 0018:ffff8801309bbb40 EFLAGS: 00000297
[101511.689494] RAX: 0000000000001514 RBX: ffff8801309bbb40 RCX: ffff8801309bbb40
[101511.697493] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800b7427d70
[101511.705491] RBP: ffffffff8020c50e R08: 0000000000000001 R09: ffff8801309bba68
[101511.713490] R10: ffff88012231b910 R11: ffff8800478ad5b0 R12: 0000001a00000032
[101511.721488] R13: ffffffffa04370b1 R14: ffff8801309bbb60 R15: 00000000000003bf
[101511.729486] FS: 0000000000000000(0000) GS:ffff88002bac0000(0000) knlGS:0000000000000000
[101511.738483] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101511.744482] CR2: 00007fbcd3ff1b80 CR3: 0000000000201000 CR4: 00000000000006e0
[101511.752480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[101511.760479] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[101511.768478] Call Trace:
[101511.771478] [<ffffffffa0471187>] ? btrfs_try_spin_lock+0x1c/0x61 [btrfs]
[101511.778476] [<ffffffffa043ea17>] ? btrfs_search_slot+0x619/0x73e [btrfs]
[101511.786474] [<ffffffffa043f11d>] ? btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101511.803472] [<ffffffffa0440ce0>] ? alloc_reserved_file_extent+0x89/0x1c3 [btrfs]
[101511.811470] [<ffffffffa04401d8>] ? update_reserved_extents+0x98/0xab [btrfs]
[101511.819468] [<ffffffffa0445198>] ? run_one_delayed_ref+0x382/0x42f [btrfs]
[101511.827467] [<ffffffff802a5387>] ? cache_flusharray+0xa2/0xae
[101511.833466] [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 [btrfs]
[101511.840463] [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101511.848462] [<ffffffff804cbdad>] ? thread_return+0x3e/0x91
[101511.854461] [<ffffffffa044609e>] ? btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101511.862459] [<ffffffffa044f59f>] ? btrfs_commit_transaction+0x7b/0x69c [btrfs]
[101511.870458] [<ffffffff8024c460>] ? autoremove_wake_function+0x0/0x38
[101511.877458] [<ffffffffa044ee87>] ? start_transaction+0x103/0x10f [btrfs]
[101511.885456] [<ffffffffa044c2c6>] ? transaction_kthread+0x17f/0x20a [btrfs]
[101511.892453] [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a [btrfs]
[101511.900453] [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a [btrfs]
[101511.907452] [<ffffffff8024c324>] ? kthread+0x56/0x86
[101511.912450] [<ffffffff8020c9fa>] ? child_rip+0xa/0x20
[101511.918449] [<ffffffff8024c2ce>] ? kthread+0x0/0x86
[101511.923449] [<ffffffff8020c9f0>] ? child_rip+0x0/0
[101536.249729] Pid: 28594, comm: btrfs-endio-wri Tainted: G D 2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101536.249729] RIP: 0010:[<ffffffff804cd70d>] [<ffffffff804cd70d>] _spin_lock+0x14/0x1a
[101536.249729] RSP: 0018:ffff88011a80da80 EFLAGS: 00000297
[101536.249729] RAX: 000000000000c6c2 RBX: ffff88011a80da80 RCX: 0000000000000000
[101536.249729] RDX: 0000000000000000 RSI: ffff88013d080000 RDI: ffff8800478ad6b0
[101536.249729] RBP: ffffffff8020c50e R08: 000000000000004c R09: 0000000000000001
[101536.249729] R10: 0000000000000008 R11: 0000000000086000 R12: ffff88011a80da40
[101536.249729] R13: ffff8800aa254800 R14: 0000000b470c7fff R15: ffff88011f256030
[101536.249729] FS: 0000000000000000(0000) GS:ffff88002ba30000(0000) knlGS:0000000000000000
[101536.249729] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101536.249729] CR2: 000000000065b078 CR3: 0000000000201000 CR4: 00000000000006e0
[101536.249729] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[101536.249729] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[101536.249729] Call Trace:
[101536.249729] [<ffffffffa04710cf>] ? btrfs_tree_lock+0x54/0x9e [btrfs]
[101536.249729] [<ffffffffa0471022>] ? btrfs_wake_function+0x0/0x10 [btrfs]
[101536.249729] [<ffffffffa0438104>] ? btrfs_lock_root_node+0x1d/0x4b [btrfs]
[101536.249729] [<ffffffffa043e4c5>] ? btrfs_search_slot+0xc7/0x73e [btrfs]
[101536.249729] [<ffffffffa043f11d>] ? btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101536.249729] [<ffffffffa0444f7a>] ? run_one_delayed_ref+0x164/0x42f [btrfs]
[101536.249729] [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 [btrfs]
[101536.249729] [<ffffffffa0478f85>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101536.249729] [<ffffffffa044609e>] ? btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101536.249729] [<ffffffffa044e86e>] ? __btrfs_end_transaction+0x59/0xfe [btrfs]
[101536.249729] [<ffffffffa044e92e>] ? btrfs_end_transaction+0xb/0xd [btrfs]
[101536.249729] [<ffffffffa045418b>] ? btrfs_finish_ordered_io+0x224/0x24d [btrfs]
[101536.249729] [<ffffffffa04541c4>] ? btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
[101536.249729] [<ffffffffa0467599>] ? end_bio_extent_writepage+0xa3/0x18f [btrfs]
[101536.249729] [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
[101536.249729] [<ffffffff802cbbee>] ? bio_endio+0x26/0x28
[101536.249729] [<ffffffffa044b5d6>] ? end_workqueue_fn+0x111/0x11e [btrfs]
[101536.249729] [<ffffffffa046eff5>] ? worker_loop+0x67/0x1ee [btrfs]
:
> For the progs:
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
>
I should mention that I missed the part about the new user tools, so
while these were newly formatted filesystems, they were created with the
old tools. Both systems are running 64-bit. I plan to install the new
tools and re-run.
Steve
> The main benefit of the new code is that backrefs on the extent
> allocation tree use a fuzzier format. It basically means that we search
> for the key in the extent allocation tree instead of providing an exact
> backref to the parent block.
>
> This means we can predict how many blocks will be changed when changing
> the extent allocation tree, and it makes enospc much less complex. It
> is also significantly faster.
>
> For regular subvolume trees, a similar change is made as long as there
> are no snapshots against a given block. This is the common case, and it
> makes COW less expensive overall.
>
> Yan Zheng also worked out a way to free blocks during the transaction
> without needing to do an explicit snapshot deletion on the old root when
> the transaction was done. This gets rid of some complex caching code,
> and fixes worst-case problems where btrfs could take a very very long
> time to unmount.
>
> btrfs-vol -b is faster with the new code as well; he added caching of
> high levels in the tree to speed things up.
>
> (Many kudos to Yan Zheng for all of this work!)
>
> -chris
>