* [PATCH] btrfs: fix write_dev_supers
From: Hisashi Hifumi @ 2009-06-09 1:46 UTC
To: chris.mason; +Cc: linux-btrfs
Hi.
I got the following BUG trace.
It is a violation of the BUG_ON(!buffer_locked(bh)) check in submit_bh().
In write_dev_supers(), if the wait parameter is set and the buffer_uptodate()
check fails, submit_bh() is called on an unlocked buffer and hits the BUG_ON above.
The patch below fixes this by taking a reference and locking the buffer first.
Thanks.
Jun 9 00:41:32 dl580 kernel: ------------[ cut here ]------------
Jun 9 00:41:32 dl580 kernel: kernel BUG at fs/buffer.c:2933!
Jun 9 00:41:32 dl580 kernel: invalid opcode: 0000 [#1] SMP
Jun 9 00:41:32 dl580 kernel: last sysfs file: /sys/devices/system/cpu/cpu7/cache/index1/shared_cpu_map
Jun 9 00:41:32 dl580 kernel: CPU 3
Jun 9 00:41:32 dl580 kernel: Modules linked in: btrfs zlib_deflate ext4 jbd2 crc16 sg qla2xxx scsi_transport_fc autofs4 i2c_dev i2c_core sunrpc ipv6 serio_raw tg3 libphy ata_piix libata shpchp rtc_cmos rtc_core rtc_lib cciss sd_mod scsi_mod ext3 jbd [last unloaded: scsi_transport_fc]
Jun 9 00:41:32 dl580 kernel: Pid: 5207, comm: umount Tainted: G W 2.6.30-rc6 #1 ProLiant DL580 G3
Jun 9 00:41:32 dl580 kernel: RIP: 0010:[<ffffffff802c458b>] [<ffffffff802c458b>] submit_bh+0x1a/0x105
Jun 9 00:41:32 dl580 kernel: RSP: 0018:ffff8801f46e5bf8 EFLAGS: 00010246
Jun 9 00:41:32 dl580 kernel: RAX: 0000000000000028 RBX: ffff88018a7ea420 RCX: 0000000000000000
Jun 9 00:41:32 dl580 kernel: RDX: ffff88018a7ea420 RSI: ffff88018a7ea420 RDI: 0000000000000419
Jun 9 00:41:32 dl580 kernel: RBP: ffff8801f46e5c18 R08: ffffffff802c533d R09: 0000000000000000
Jun 9 00:41:32 dl580 kernel: R10: 0000000000000001 R11: 0000000000000088 R12: ffff88021d448248
Jun 9 00:41:32 dl580 kernel: R13: 0000000000000419 R14: ffff8802191dacbb R15: 0000000000000000
Jun 9 00:41:32 dl580 kernel: FS: 00007fd64fef3760(0000) GS:ffff880028150000(0000) knlGS:0000000000000000
Jun 9 00:41:32 dl580 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 9 00:41:32 dl580 kernel: CR2: 000000000044ef40 CR3: 0000000104287000 CR4: 00000000000006e0
Jun 9 00:41:32 dl580 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 9 00:41:32 dl580 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 9 00:41:32 dl580 kernel: Process umount (pid: 5207, threadinfo ffff8801f46e4000, task ffff8801e1168000)
Jun 9 00:41:32 dl580 kernel: Stack:
Jun 9 00:41:32 dl580 kernel: 0000000000000003 ffff88018a7ea420 ffff88021d448248 0000000000000003
Jun 9 00:41:32 dl580 kernel: ffff8801f46e5c68 ffffffffa02d9979 0000000000000000 0000000100000001
Jun 9 00:41:32 dl580 kernel: 0000000100000000 ffff88021d448248 0000000000000000 ffff8802191dacbb
Jun 9 00:41:32 dl580 kernel: Call Trace:
Jun 9 00:41:33 dl580 kernel: [<ffffffffa02d9979>] write_dev_supers+0x1eb/0x258 [btrfs]
Jun 9 00:41:33 dl580 kernel: [<ffffffffa02d9b6d>] write_all_supers+0x187/0x1c8 [btrfs]
Jun 9 00:41:33 dl580 kernel: [<ffffffffa02d9bbc>] write_ctree_super+0xe/0x10 [btrfs]
Jun 9 00:41:33 dl580 kernel: [<ffffffffa02de39f>] btrfs_commit_transaction+0x6bb/0x841 [btrfs]
Jun 9 00:41:33 dl580 kernel: [<ffffffff80246914>] ? autoremove_wake_function+0x0/0x38
Jun 9 00:41:33 dl580 kernel: [<ffffffffa02c14ed>] btrfs_sync_fs+0x67/0x72 [btrfs]
Jun 9 00:41:33 dl580 kernel: [<ffffffff802e6e3a>] quota_sync_sb+0x42/0xf3
Jun 9 00:41:33 dl580 kernel: [<ffffffff802e6f14>] sync_dquots+0x29/0x138
Jun 9 00:41:33 dl580 kernel: [<ffffffff802a8c29>] __fsync_super+0x1e/0x7b
Jun 9 00:41:33 dl580 kernel: [<ffffffff802a8c97>] fsync_super+0x11/0x22
Jun 9 00:41:33 dl580 kernel: [<ffffffff802a8ea9>] generic_shutdown_super+0x26/0xe2
Jun 9 00:41:33 dl580 kernel: [<ffffffff802a8fb6>] kill_anon_super+0x17/0x3b
Jun 9 00:41:33 dl580 kernel: [<ffffffff802a92e8>] deactivate_super+0x62/0x77
Jun 9 00:41:33 dl580 kernel: [<ffffffff802bb7ae>] mntput_no_expire+0xec/0x12c
Jun 9 00:41:33 dl580 kernel: [<ffffffff802bbcff>] sys_umount+0x2c5/0x31c
Jun 9 00:41:33 dl580 kernel: [<ffffffff8020aeeb>] system_call_fastpath+0x16/0x
Jun 9 00:41:33 dl580 kernel: Code: e0 eb ec 44 89 e8 48 83 c4 18 5b 41 5c 41 5d 5d c3 55 48 89 e5 41 55 41 54 53 48 83 ec 08 41 89 fd 48 89 f3 48 8b 06 a8 04 75 04 <0f> 0b eb fe a8 20 75 04 0f 0b eb fe 48 83 7e 38 00 75 04 0f 0b
Jun 9 00:41:33 dl580 kernel: RIP [<ffffffff802c458b>] submit_bh+0x1a/0x105
Jun 9 00:41:33 dl580 kernel: RSP <ffff8801f46e5bf8>
Jun 9 00:41:33 dl580 kernel: ---[ end trace 4eaa2a86a8e2da24 ]---
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
--- linux-2.6.30-rc8.org/fs/btrfs/disk-io.c 2009-06-04 16:26:25.000000000 +0900
+++ linux-2.6.30-rc8.btrfs/fs/btrfs/disk-io.c 2009-06-08 18:42:46.000000000 +0900
@@ -2045,6 +2045,9 @@ static int write_dev_supers(struct btrfs
 			if (buffer_uptodate(bh)) {
 				brelse(bh);
 				continue;
+			} else {
+				get_bh(bh);
+				lock_buffer(bh);
 			}
 		} else {
 			btrfs_set_super_bytenr(sb, bytenr);
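
The invariant behind the BUG_ON can be illustrated with a small user-space
sketch in C. It does not use the kernel's struct buffer_head: struct model_bh
and the model_* and resubmit_* functions are hypothetical stand-ins, and the
immediate "completion" inside model_submit() is an assumption made so the
example stays single-threaded. It only shows that submitting an unlocked
buffer is rejected, and that taking a reference and locking first (as the
patch above does) satisfies the precondition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head: only the state needed here. */
struct model_bh {
	bool locked;    /* models buffer_locked(bh) */
	bool uptodate;  /* models buffer_uptodate(bh) */
	int refcount;   /* models the buffer's reference count */
};

/* Models get_bh(): take one reference. */
static void model_get(struct model_bh *bh) { bh->refcount++; }

/* Models lock_buffer(): no contention in this single-threaded sketch. */
static void model_lock(struct model_bh *bh) { bh->locked = true; }

/*
 * Models the precondition of submit_bh(): the buffer must already be locked.
 * The kernel would BUG(); the sketch returns -1 so the failure is observable.
 * Assumption: on success the write completes at once, and the completion
 * handler unlocks the buffer and drops the reference taken for the I/O.
 */
static int model_submit(struct model_bh *bh)
{
	if (!bh->locked)
		return -1;
	bh->uptodate = true;
	bh->locked = false;
	bh->refcount--;
	return 0;
}

/* The wait path before the patch: submit without locking. */
static int resubmit_unpatched(struct model_bh *bh)
{
	return model_submit(bh);
}

/* The wait path with the patch: take a reference and lock, then submit. */
static int resubmit_patched(struct model_bh *bh)
{
	model_get(bh);
	model_lock(bh);
	return model_submit(bh);
}
```

With a buffer that starts unlocked and not uptodate, resubmit_unpatched()
reports the violation, while resubmit_patched() succeeds and leaves the
reference count where it started.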
* Re: [PATCH] btrfs: fix write_dev_supers
From: Chris Mason @ 2009-06-09 11:25 UTC
To: Hisashi Hifumi; +Cc: linux-btrfs
On Tue, Jun 09, 2009 at 10:46:55AM +0900, Hisashi Hifumi wrote:
> Hi.
>
> I got following BUG trace.
> This is violation of BUG_ON(!buffer_locked(bh)) check on submit_bh() function.
> In write_dev_supers(), if wait parameter is set and buffer_uptodate() check
> is negative, submit_bh() is executed and hit above BUG_ON.
> So I fixed this issue.
Thanks for finding this bug and sending the patch.
This function is very confusing. If the wait parameter is set, it
isn't supposed to do any IO at all. The caller first calls
write_dev_supers with wait == 0, which sends all the supers down on
all the devices.
Then it calls again with wait == 1, which is supposed to make sure all
the supers actually got to disk.
We should change the wait == 0 behavior to leave a reference held on all
the buffers, and wait == 1 to drop that reference. That way the buffer
won't disappear while we are waiting, and we can return an error if the
buffer wasn't up to date when wait == 1.
Are you interested in fixing this?
-chris
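
The two-pass scheme described above can be sketched in user-space C as
follows. This is only an illustration under stated assumptions, not the btrfs
code: struct model_super_buf, MODEL_EIO and all model_* functions are
hypothetical names, and I/O completion is simulated synchronously inside the
wait pass rather than arriving asynchronously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EIO 5  /* hypothetical stand-in for the kernel's EIO */

/* Hypothetical model of one super block copy being written. */
struct model_super_buf {
	bool io_ok;      /* whether the simulated write will succeed */
	bool in_flight;  /* write submitted but not yet completed */
	bool uptodate;   /* set when the write completed successfully */
	int refcount;    /* references held on the buffer */
};

/* Pass 1 (wait == 0): start the write and keep one extra reference. */
static void model_start_write(struct model_super_buf *b)
{
	b->refcount++;          /* held until pass 2 so the buffer cannot vanish */
	b->in_flight = true;
}

/* Simulated I/O completion; in the kernel this would run asynchronously. */
static void model_complete(struct model_super_buf *b)
{
	b->uptodate = b->io_ok;
	b->in_flight = false;
}

/* Pass 2 (wait == 1): wait, check the result, drop the reference. No new I/O. */
static int model_wait_write(struct model_super_buf *b)
{
	int ret;

	if (b->in_flight)
		model_complete(b);  /* stands in for wait_on_buffer() */
	ret = b->uptodate ? 0 : -MODEL_EIO;
	b->refcount--;          /* drop the reference taken in pass 1 */
	return ret;
}

/* Caller: submit every copy first, then wait for every copy. Returns error count. */
static int model_write_all(struct model_super_buf *bufs, size_t n)
{
	int errors = 0;
	size_t i;

	for (i = 0; i < n; i++)
		model_start_write(&bufs[i]);
	for (i = 0; i < n; i++)
		if (model_wait_write(&bufs[i]) != 0)
			errors++;
	return errors;
}
```

Because the reference taken in the first pass is only dropped in the second,
the wait pass can report a failed write as an error instead of resubmitting.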
* Re: [PATCH] btrfs: fix write_dev_supers
From: Hisashi Hifumi @ 2009-06-09 11:28 UTC
To: Chris Mason; +Cc: linux-btrfs
At 20:25 09/06/09, Chris Mason wrote:
>Are you interested in fixing this?
Yes, I want to fix this.
* Re: [PATCH] btrfs: fix write_dev_supers
From: Hisashi Hifumi @ 2009-06-10 7:32 UTC
To: Chris Mason; +Cc: linux-btrfs
At 20:25 09/06/09, Chris Mason wrote:
>We should change the wait == 0 behavior to leave a reference held on all
>the buffers, and wait == 1 to drop that reference. That way the buffer
>won't disappear while we are waiting, and we can return an error if the
>buffer wasn't up to date when wait == 1.
>
Like this?
I changed the wait == 0 case to take an extra reference. In the wait == 1 case,
if the buffer is uptodate, the extra reference is dropped; otherwise the buffer
is locked so that it can proceed to submit_bh().
Thanks.
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
diff -Nrup linux-2.6.30-rc8.org/fs/btrfs/disk-io.c linux-2.6.30-rc8.btrfs/fs/btrfs/disk-io.c
--- linux-2.6.30-rc8.org/fs/btrfs/disk-io.c 2009-06-04 16:26:25.000000000 +0900
+++ linux-2.6.30-rc8.btrfs/fs/btrfs/disk-io.c 2009-06-10 15:41:03.000000000 +0900
@@ -2044,8 +2044,10 @@ static int write_dev_supers(struct btrfs
 			wait_on_buffer(bh);
 			if (buffer_uptodate(bh)) {
 				brelse(bh);
+				brelse(bh);
 				continue;
-			}
+			} else
+				lock_buffer(bh);
 		} else {
 			btrfs_set_super_bytenr(sb, bytenr);
@@ -2062,6 +2064,7 @@ static int write_dev_supers(struct btrfs
 			set_buffer_uptodate(bh);
 			get_bh(bh);
+			get_bh(bh);
 			lock_buffer(bh);
 			bh->b_end_io = btrfs_end_buffer_write_sync;
 		}
* Re: [PATCH] btrfs: fix write_dev_supers
From: Chris Mason @ 2009-06-10 21:02 UTC
To: Hisashi Hifumi; +Cc: linux-btrfs
On Wed, Jun 10, 2009 at 04:32:31PM +0900, Hisashi Hifumi wrote:
>
> At 20:25 09/06/09, Chris Mason wrote:
> >We should change the wait == 0 behavior to leave a reference held on all
> >the buffers, and wait == 1 to drop that reference. That way the buffer
> >won't disappear while we are waiting, and we can return an error if the
> >buffer wasn't up to date when wait == 1.
> >
>
> Like this?
>
> I changed wait == 0 case to get extra ref and on wait == 1 case if buffer is
> uptodate, bh releases ref otherwise buffer takes lock to proceed to submit_bh.
That's very close to what I had in mind, thank you. In reviewing this I
realized that write_dev_supers had other bugs, including a race with
device add/removal. So, I took your patch and edited it slightly. Could
you please check the change I put into the newformat2 branch?
In this version, wait == 1 only waits for IO and does not try to start
it; I think that makes it clearer overall.
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat2
Thanks!
-chris