public inbox for linux-btrfs@vger.kernel.org
From: Steven Pratt <slpratt@austin.ibm.com>
To: Chris Mason <chris.mason@oracle.com>, linux-btrfs@vger.kernel.org
Subject: Re: New experimental btrfs branch ready for testing
Date: Thu, 04 Jun 2009 14:02:20 -0500
Message-ID: <4A281A3C.6000006@austin.ibm.com>
In-Reply-To: <20090601210447.GC3890@think>

Chris Mason wrote:
> Hello everyone,
>
> Yan Zheng has been doing some major surgery to the back references and
> extent allocation code, tackling bottlenecks in the code that tracks
> extents.  It scales better with many snapshots and performs better in
> the common case of no snapshots at all.
>
> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
> compatible with the current btrfs disk format, but once you mount a
> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
> KERNELS.  Old kernels spit out an error message when you try them on new
> format filesystems.
>
> This is a large change, and I'm hoping to have it stable in time for the
> 2.6.31 merge window.  I've been testing it for about a week now, and
> haven't been able to cause major problems yet.  But, testing the
> compatibility with old format filesystems is the hard part, and
> everyone that pulls the new code should back up their data first.
>
> I've set up git branches called newformat where you can pull the new code.
>
> For the kernel (based on 2.6.30-rc7):
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>
>   
So I started the performance runs on this. The base tests completed fine
on the RAID system and I will post results as soon as I can finish
postprocessing, but when I tried the nodatacow run on that machine it
crashed pretty early. Here is the console log:

btrfs2 kernel: [82057.882255] ------------[ cut here ]------------
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] invalid opcode: 0000 [#1] SMP
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] last sysfs file: 
/sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] Stack:
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  ffff88011786d800 ffff8801259f6ea0 
000000b21f256030 00000000000000e9
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  000000352231b250 ffff880089abbf40 
ffff88013d0e2440 0000000000000001
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535] Call Trace:
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0445198>] 
run_one_delayed_ref+0x382/0x42f [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0464bd1>] ? 
map_extent_buffer+0xab/0xbe [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0445f75>] 
run_clustered_refs+0x237/0x2b4 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0478f85>] ? 
btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:47 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044609e>] 
btrfs_run_delayed_refs+0xac/0x195 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044e86e>] 
__btrfs_end_transaction+0x59/0xfe [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044e92e>] 
btrfs_end_transaction+0xb/0xd [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa045418b>] 
btrfs_finish_ordered_io+0x224/0x24d [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa04541c4>] 
btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa0467599>] 
end_bio_extent_writepage+0xa3/0x18f [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024276e>] ? 
del_timer_sync+0x14/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff802cbbee>] bio_endio+0x26/0x28
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa044b5d6>] 
end_workqueue_fn+0x111/0x11e [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa046eff5>] 
worker_loop+0x67/0x1ee [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffffa046ef8e>] ? 
worker_loop+0x0/0x1ee [btrfs]
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024c324>] kthread+0x56/0x86
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8020c9fa>] child_rip+0xa/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535]  [<ffffffff8020c9f0>] ? child_rip+0x0/0x20
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
btrfs2 kernel: [82057.882535] Code: 08 4c 8d 45 d4 41 8d 44 24 18 48 8b 
73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45 d4 e8 df 
e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83 cf 01 
4c 89 e7 48 6b
Message from syslogd@ at Thu Jun  4 08:02:48 2009 ...
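
The paste above is hard to read because syslogd's broadcast headers are
interleaved with the kernel lines and the mail client wrapped the long
ones. A minimal Python sketch for cleaning it up and listing the frames
in the call trace follows. The function names (clean_console_log,
call_trace) are made up for illustration, and the regular expressions
are assumptions based only on the line shapes visible in this message,
not on any general syslog format.

```python
import re

# Broadcast header that syslogd writes before each emergency-level message.
BROADCAST = re.compile(r"^Message from syslogd@")
# Start of a kernel line, with or without a "host kernel: " prefix,
# e.g. "btrfs2 kernel: [82057.882535] ..." or "[101511.768478] ...".
KERNEL_LINE = re.compile(r"^(?:\S+ kernel: )?\[\s*\d+\.\d+\] ?(.*)$")
# A call trace entry: "[<address>] symbol+0xoffset/0xsize", optionally
# preceded by "?" before the symbol name.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def clean_console_log(text):
    """Drop broadcast headers and rejoin kernel lines split by mail wrapping."""
    lines = []
    for raw in text.splitlines():
        if not raw.strip() or BROADCAST.match(raw):
            continue
        m = KERNEL_LINE.match(raw)
        if m:
            lines.append(m.group(1).rstrip())
        elif lines:
            # No timestamp: treat it as the wrapped tail of the previous line.
            lines[-1] = lines[-1] + " " + raw.strip()
    return lines


def call_trace(lines):
    """Return (symbol, reliable) pairs for the frames after 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in lines:
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            m = FRAME.search(line)
            if not m:
                break  # first non-frame line ends the trace
            frames.append((m.group(2), m.group(1) is None))
    return frames
```

Feeding the saved console text through clean_console_log() and then
call_trace() gives one entry per frame. The second element is False for
frames printed with a "?", which the kernel uses for addresses it found
on the stack but could not confirm as part of the call chain.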


I also ran this on the single disk system and it did not make it through
the base tests.  The errors are different.

[101511.664497] Pid: 28597, comm: btrfs-transacti Tainted: G      D    
2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101511.675497] RIP: 0010:[<ffffffff804cd70d>]  [<ffffffff804cd70d>] 
_spin_lock+0x14/0x1a
[101511.684494] RSP: 0018:ffff8801309bbb40  EFLAGS: 00000297
[101511.689494] RAX: 0000000000001514 RBX: ffff8801309bbb40 RCX: 
ffff8801309bbb40
[101511.697493] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
ffff8800b7427d70
[101511.705491] RBP: ffffffff8020c50e R08: 0000000000000001 R09: 
ffff8801309bba68
[101511.713490] R10: ffff88012231b910 R11: ffff8800478ad5b0 R12: 
0000001a00000032
[101511.721488] R13: ffffffffa04370b1 R14: ffff8801309bbb60 R15: 
00000000000003bf
[101511.729486] FS:  0000000000000000(0000) GS:ffff88002bac0000(0000) 
knlGS:0000000000000000
[101511.738483] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101511.744482] CR2: 00007fbcd3ff1b80 CR3: 0000000000201000 CR4: 
00000000000006e0
[101511.752480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[101511.760479] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[101511.768478] Call Trace:
[101511.771478]  [<ffffffffa0471187>] ? btrfs_try_spin_lock+0x1c/0x61 
[btrfs]
[101511.778476]  [<ffffffffa043ea17>] ? btrfs_search_slot+0x619/0x73e 
[btrfs]
[101511.786474]  [<ffffffffa043f11d>] ? 
btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101511.803472]  [<ffffffffa0440ce0>] ? 
alloc_reserved_file_extent+0x89/0x1c3 [btrfs]
[101511.811470]  [<ffffffffa04401d8>] ? 
update_reserved_extents+0x98/0xab [btrfs]
[101511.819468]  [<ffffffffa0445198>] ? run_one_delayed_ref+0x382/0x42f 
[btrfs]
[101511.827467]  [<ffffffff802a5387>] ? cache_flusharray+0xa2/0xae
[101511.833466]  [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 
[btrfs]
[101511.840463]  [<ffffffffa0478f85>] ? 
btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101511.848462]  [<ffffffff804cbdad>] ? thread_return+0x3e/0x91
[101511.854461]  [<ffffffffa044609e>] ? 
btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101511.862459]  [<ffffffffa044f59f>] ? 
btrfs_commit_transaction+0x7b/0x69c [btrfs]
[101511.870458]  [<ffffffff8024c460>] ? autoremove_wake_function+0x0/0x38
[101511.877458]  [<ffffffffa044ee87>] ? start_transaction+0x103/0x10f 
[btrfs]
[101511.885456]  [<ffffffffa044c2c6>] ? transaction_kthread+0x17f/0x20a 
[btrfs]
[101511.892453]  [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a 
[btrfs]
[101511.900453]  [<ffffffffa044c147>] ? transaction_kthread+0x0/0x20a 
[btrfs]
[101511.907452]  [<ffffffff8024c324>] ? kthread+0x56/0x86
[101511.912450]  [<ffffffff8020c9fa>] ? child_rip+0xa/0x20
[101511.918449]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
[101511.923449]  [<ffffffff8020c9f0>] ? child_rip+0x0/0

[101536.249729] Pid: 28594, comm: btrfs-endio-wri Tainted: G      D    
2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
[101536.249729] RIP: 0010:[<ffffffff804cd70d>]  [<ffffffff804cd70d>] 
_spin_lock+0x14/0x1a
[101536.249729] RSP: 0018:ffff88011a80da80  EFLAGS: 00000297
[101536.249729] RAX: 000000000000c6c2 RBX: ffff88011a80da80 RCX: 
0000000000000000
[101536.249729] RDX: 0000000000000000 RSI: ffff88013d080000 RDI: 
ffff8800478ad6b0
[101536.249729] RBP: ffffffff8020c50e R08: 000000000000004c R09: 
0000000000000001
[101536.249729] R10: 0000000000000008 R11: 0000000000086000 R12: 
ffff88011a80da40
[101536.249729] R13: ffff8800aa254800 R14: 0000000b470c7fff R15: 
ffff88011f256030
[101536.249729] FS:  0000000000000000(0000) GS:ffff88002ba30000(0000) 
knlGS:0000000000000000
[101536.249729] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[101536.249729] CR2: 000000000065b078 CR3: 0000000000201000 CR4: 
00000000000006e0
[101536.249729] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[101536.249729] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[101536.249729] Call Trace:
[101536.249729]  [<ffffffffa04710cf>] ? btrfs_tree_lock+0x54/0x9e [btrfs]
[101536.249729]  [<ffffffffa0471022>] ? btrfs_wake_function+0x0/0x10 [btrfs]
[101536.249729]  [<ffffffffa0438104>] ? btrfs_lock_root_node+0x1d/0x4b 
[btrfs]
[101536.249729]  [<ffffffffa043e4c5>] ? btrfs_search_slot+0xc7/0x73e [btrfs]
[101536.249729]  [<ffffffffa043f11d>] ? 
btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
[101536.249729]  [<ffffffffa0444f7a>] ? run_one_delayed_ref+0x164/0x42f 
[btrfs]
[101536.249729]  [<ffffffffa0445f75>] ? run_clustered_refs+0x237/0x2b4 
[btrfs]
[101536.249729]  [<ffffffffa0478f85>] ? 
btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
[101536.249729]  [<ffffffffa044609e>] ? 
btrfs_run_delayed_refs+0xac/0x195 [btrfs]
[101536.249729]  [<ffffffffa044e86e>] ? 
__btrfs_end_transaction+0x59/0xfe [btrfs]
[101536.249729]  [<ffffffffa044e92e>] ? btrfs_end_transaction+0xb/0xd 
[btrfs]
[101536.249729]  [<ffffffffa045418b>] ? 
btrfs_finish_ordered_io+0x224/0x24d [btrfs]
[101536.249729]  [<ffffffffa04541c4>] ? 
btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
[101536.249729]  [<ffffffffa0467599>] ? 
end_bio_extent_writepage+0xa3/0x18f [btrfs]
[101536.249729]  [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
[101536.249729]  [<ffffffff802cbbee>] ? bio_endio+0x26/0x28
[101536.249729]  [<ffffffffa044b5d6>] ? end_workqueue_fn+0x111/0x11e [btrfs]
[101536.249729]  [<ffffffffa046eff5>] ? worker_loop+0x67/0x1ee [btrfs]
:                                                                                               



> For the progs:
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git newformat
>   

I should mention that I missed the part about the new user tools, so
while these were newly formatted filesystems, they were created with the
old tools.  Both systems are running 64-bit. I plan to install the new
tools and re-run.

Steve


> The main benefit of the new code is that backrefs on the extent
> allocation tree use a fuzzier format.  It basically means that we search
> for the key in the extent allocation tree instead of providing an exact
> backref to the parent block.
>
> This means we can predict how many blocks will be changed when changing
> the extent allocation tree, and it makes enospc much less complex.  It
> is also significantly faster.
>
> For regular subvolume trees, a similar change is made as long as there
> are no snapshots against a given block.  This is the common case, and it
> makes COW less expensive overall.
>
> Yan Zheng also worked out a way to free blocks during the transaction
> without needing to do an explicit snapshot deletion on the old root when
> the transaction was done.  This gets rid of some complex caching code,
> and fixes worst-case problems where btrfs could take a very very long
> time to unmount.
>
> btrfs-vol -b is faster with the new code as well, he added caching of
> high levels in the tree to speed things up.
>
> (Many kudos to Yan Zheng for all of this work!)
>
> -chris
>


