* BUG at fs/inode.c
From: Amon Ott @ 2011-10-24 10:39 UTC
To: ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1274 bytes --]
Hi folks,
we have hit a kernel bug with current ceph-client master (commit
a2742a09568f81315e0f30021f29f14e7cd3924b), which I assume to be a Ceph bug.
Kernel is x86-32, Ceph is running on a two node cluster over ext4. The kernel
traces are attached, the system dies shortly after these messages. The bug is
reproducible. I have not found anything useful in the Ceph bug tracker when
searching for "fs/inode.c".
Around fs/inode.c line 1375 mentioned in the trace is the iput() function:
void iput(struct inode *inode)
{
	if (inode) {
		BUG_ON(inode->i_state & I_CLEAR);

		if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock))
			iput_final(inode);
	}
}
So inode->i_state seems to be incorrect when iput() is called, maybe a double
call to iput() or a missing iget() somewhere. Is this really a Ceph bug or
have I messed up our kernel code when merging patches?
Amon Ott
--
Dr. Amon Ott
m-privacy GmbH Tel: +49 30 24342334
Am Köllnischen Park 1 Fax: +49 30 24342336
10179 Berlin http://www.m-privacy.de
Amtsgericht Charlottenburg, HRB 84946
Geschäftsführer:
Dipl.-Kfm. Holger Maczkowsky,
Roman Maczkowsky
GnuPG-Key-ID: 0x2DD3A649
[-- Attachment #2: Console.log --]
[-- Type: text/x-log, Size: 2825 bytes --]
------------[ cut here ]------------
kernel BUG at fs/inode.c:1375!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: lp ceph libceph crc32c libcrc32c fuse parport_pc parport floppy evdev i2c_piix4 button 8139too 8139cp mii i2c_core
Pid: 14455, comm: find Tainted: G W 3.0.7-rsbac #1 Bochs Bochs
EIP: 0060:[<000e91bf>] EFLAGS: 00010202 CPU: 0
EIP is at iput+0x16/0x126
EAX: ea114950 EBX: ea114950 ECX: 00000282 EDX: ea1147a4
ESI: 005a1bc9 EDI: ea114950 EBP: e7bb9e34 ESP: e7bb9e28
DS: 0068 ES: 0068 FS: 00d8 GS: 0033 SS: 0068
Process find (pid: 14455, ti=ee3f6dc0 task=ee3f6b20 task.ti=ee3f6dc0)
Stack:
e7a6e200 005a1bc9 ea114950 e7bb9e40 005a1c4c e7a6e398 e7bb9e50 00196860
e7a6e200 00000000 e7bb9e6c 0058ffde ee247200 00000155 e7bb9f5c 0058ffeb
ea114950 e7bb9e84 00590005 e7bb9f34 ea114950 0058ffeb e7bb9ec4 e7bb9f34
Call Trace:
[<005a1bc9>] ? ceph_mdsc_create_request+0xf5/0xf5 [ceph]
[<005a1c4c>] ceph_mdsc_release_request+0x83/0xfb [ceph]
[<00196860>] kref_put+0x3f/0x48
[<0058ffde>] ceph_do_getattr+0xb6/0xc3 [ceph]
[<0058ffeb>] ? ceph_do_getattr+0xc3/0xc3 [ceph]
[<00590005>] ceph_getattr+0x1a/0xb6 [ceph]
[<0058ffeb>] ? ceph_do_getattr+0xc3/0xc3 [ceph]
[<000d68a2>] vfs_getattr+0x125/0x13e
[<000d6914>] vfs_fstatat+0x59/0x6c
[<000d6941>] sys_fstatat64+0x1a/0x2e
[<000081a4>] ? hw_breakpoint_exceptions_notify+0x2f/0x117
[<00003e34>] ? math_state_restore+0x2d/0x2d
[<00003e32>] ? math_state_restore+0x2b/0x2d
[<00003e3f>] ? do_device_not_available+0xb/0x15
[<004dea0a>] syscall_call+0x7/0xb
Code: 4b 3f 00 b8 e4 34 44 c2 e8 04 39 f9 ff 83 c4 10 5b 5e 5f 5d c3 55 85 c0 89 e5 57 56 53 89 c3 0f 84 11 01 00 00 f6 40 1c 40 74 04 <0f> 0b eb fe 8d 50 14 8d 40 64 e8 86 b5 0a 00 85 c0 0f 84 f4 00
EIP: [<000e91bf>] iput+0x16/0x126 SS:ESP 0068:e7bb9e28
---[ end trace fbba93cb09482261 ]---
------------[ cut here ]------------
WARNING: at fs/inode.c:334 ihold+0x27/0x29()
Hardware name: Bochs
Modules linked in: lp ceph libceph crc32c libcrc32c fuse parport_pc parport floppy evdev i2c_piix4 button 8139too 8139cp mii i2c_core
Pid: 14432, comm: genstatus Tainted: G D W 3.0.7-rsbac #1
Call Trace:
[<00061e40>] warn_slowpath_common+0x65/0x7a
[<000e85e1>] ? ihold+0x27/0x29
[<00061e64>] warn_slowpath_null+0xf/0x13
[<000e85e1>] ihold+0x27/0x29
[<0058ffb0>] ceph_do_getattr+0x88/0xc3 [ceph]
[<0058ffeb>] ? ceph_do_getattr+0xc3/0xc3 [ceph]
[<00590005>] ceph_getattr+0x1a/0xb6 [ceph]
[<0058ffeb>] ? ceph_do_getattr+0xc3/0xc3 [ceph]
[<000d68a2>] vfs_getattr+0x125/0x13e
[<000d6914>] vfs_fstatat+0x59/0x6c
[<000d69f8>] vfs_stat+0x13/0x15
[<000d6a0e>] sys_stat64+0x14/0x28
[<0006de3b>] ? set_current_blocked+0x37/0x3b
[<0006e006>] ? sigprocmask+0x7e/0x89
[<0006e134>] ? sys_rt_sigprocmask+0x123/0x138
[<004dea0a>] syscall_call+0x7/0xb
---[ end trace fbba93cb09482262 ]---
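The two traces above fit the hypothesis from the report (an unbalanced iput()). The following is a minimal userspace sketch of that reasoning, not kernel code. All names (`model_inode`, `model_ihold`, `model_iput`, `MODEL_I_CLEAR`) are hypothetical, the flag value is chosen for illustration only, and the warning condition in `model_ihold()` is an assumption about what the WARNING at fs/inode.c:334 checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag; the value is illustrative, not taken from a kernel header. */
#define MODEL_I_CLEAR 0x40u

/* Userspace stand-in for the few inode fields this thread is about. */
struct model_inode {
	int count;           /* models inode->i_count */
	unsigned int state;  /* models inode->i_state */
	bool warned;         /* set where the kernel would WARN */
	bool bugged;         /* set where the kernel would BUG */
};

/* Assumption: taking a reference is only valid if one is already held. */
static void model_ihold(struct model_inode *inode)
{
	if (++inode->count < 2)
		inode->warned = true;
}

/* Mirrors the iput() quoted in the report: check I_CLEAR, then drop a ref. */
static void model_iput(struct model_inode *inode)
{
	if (inode->state & MODEL_I_CLEAR) {
		inode->bugged = true;  /* BUG_ON(inode->i_state & I_CLEAR) */
		return;
	}
	if (--inode->count == 0)
		inode->state |= MODEL_I_CLEAR;  /* stands in for iput_final() */
}

/* One reference, two puts: the second put hits the I_CLEAR check. */
static bool model_double_put_trips_bug(void)
{
	struct model_inode inode = { .count = 1 };

	model_iput(&inode);  /* balanced put, inode is torn down */
	model_iput(&inode);  /* extra put on a cleared inode */
	return inode.bugged;
}

/* Taking a reference after the final put trips the warning. */
static bool model_hold_after_final_put_warns(void)
{
	struct model_inode inode = { .count = 1 };

	model_iput(&inode);   /* count drops to 0 */
	model_ihold(&inode);  /* count goes 0 -> 1, which is below 2 */
	return inode.warned;
}
```

Both helpers return true in this model, which matches the pair of messages in Console.log: a BUG in iput() in one task and a WARNING in ihold() in another.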
* Re: BUG at fs/inode.c
From: Yehuda Sadeh Weinraub @ 2011-10-24 16:51 UTC
To: Amon Ott; +Cc: ceph-devel@vger.kernel.org

On Mon, Oct 24, 2011 at 3:39 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> Hi folks,
>
> we have hit a kernel bug with current ceph-client master (commit
> a2742a09568f81315e0f30021f29f14e7cd3924b), which I assume to be a Ceph bug.

Is it easily reproducible? What's the scenario?

>
> Kernel is x86-32, Ceph is running on a two node cluster over ext4. The kernel
> traces are attached, the system dies shortly after these messages. The bug is
> reproducible. I have not found anything useful in ceph bug tracker when
> searching for "fs/inode.c".

How many mds servers?

>
> Around fs/inode.c line 1375 mentioned in the trace is the iput() function:
> void iput(struct inode *inode)
> {
>	if (inode) {
>		BUG_ON(inode->i_state & I_CLEAR);
>
>		if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock))
>			iput_final(inode);
>	}
> }
>
> So inode->i_state seems to be incorrect when iput() is called, maybe a double
> call to iput() or a missing iget() somewhere. Is this really a Ceph bug or
> have I messed up our kernel code when merging patches?
>

What patches?

Also, the client logs could help shedding a light on the issue. You should
have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and something along
the lines of:

# mount -t debugfs none /sys/kernel/debug
# echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
# echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

Thanks,
Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-10-25 8:38 UTC
To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

On Monday 24 October 2011 wrote Yehuda Sadeh Weinraub:
> On Mon, Oct 24, 2011 at 3:39 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> > we have hit a kernel bug with current ceph-client master (commit
> > a2742a09568f81315e0f30021f29f14e7cd3924b), which I assume to be a Ceph
> > bug.
>
> Is it easily reproducible? What's the scenario?

It is quite easy to reproduce. We run a virtual test cluster with two nodes,
each running OSD, MDS and MON, but using "max mon = 1". Cephfs is mounted on
both nodes so that they share the same data. Kernel is 3.0.7 with PaX, RSBAC
and ceph-client master.

The intention is to have a scalable cluster of servers where any number of
nodes may fail at any time, as long as there are always enough left to keep
at least one copy of the data and restore redundancy. If it works out as
expected, we want to scale to 20 or even more nodes, depending on the needs
of our customers.

> > Kernel is x86-32, Ceph is running on a two node cluster over ext4. The
> > kernel traces are attached, the system dies shortly after these messages.
> > The bug is reproducible. I have not found anything useful in ceph bug
> > tracker when searching for "fs/inode.c".
>
> How many mds servers?

We run a test cluster with two nodes, each running OSD, MDS and MON, but
using "max mon = 1".

> > Around fs/inode.c line 1375 mentioned in the trace is the iput()
> > function: void iput(struct inode *inode)
> > {
> >	if (inode) {
> >		BUG_ON(inode->i_state & I_CLEAR);
> >
> >		if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock))
> >			iput_final(inode);
> >	}
> > }
> >
> > So inode->i_state seems to be incorrect when iput() is called, maybe a
> > double call to iput() or a missing iget() somewhere. Is this really a
> > Ceph bug or have I messed up our kernel code when merging patches?
>
> What patches?

See above. PaX, RSBAC and Ceph master. I have been merging the first two in
for years now, being the RSBAC main author myself.

> Also, the client logs could help shedding a light on the issue. You
> should have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and
> something along the lines of:
>
> # mount -t debugfs none /sys/kernel/debug
> # echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> # echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

New kernels are building right now. Upgraded to 3.0.8, put in new
ceph-client master fix 8ba1683acc83aee4bcab304844f8e60330e5ef1f and added
CONFIG_DYNAMIC_DEBUG. This kernel will go into two big servers this time to
give it some real load. Let's see whether I can reproduce there, too. If so,
I will provide debug output as requested.

Amon Ott

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-10-25 14:35 UTC
To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1436 bytes --]

On Tuesday 25 October 2011 wrote Amon Ott:
> On Monday 24 October 2011 wrote Yehuda Sadeh Weinraub:
> > Also, the client logs could help shedding a light on the issue. You
> > should have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and
> > something along the lines of:
> >
> > # mount -t debugfs none /sys/kernel/debug
> > # echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> > # echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
>
> New kernels are building right now. Upgraded to 3.0.8, put in new
> ceph-client master fix 8ba1683acc83aee4bcab304844f8e60330e5ef1f and added
> CONFIG_DYNAMIC_DEBUG. This kernel will go into two big servers this time to
> give it some real load. Let's see whether I can reproduce there, too. If
> so, I will provide debug output as requested.

Finally, I could reproduce with debugging on and keep the system alive long
enough to copy the kernel log. Attached are two examples of the BUG happening
with surrounding Ceph logs. Before and after these extracts are some seconds
without logging, so I assume they are complete.

Amon Ott

[-- Attachment #2: bug1.log.gz --]
[-- Type: application/x-gzip, Size: 19794 bytes --]
[-- Attachment #3: bug2.log.gz --]
[-- Type: application/x-gzip, Size: 7125 bytes --]

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-11-01 8:23 UTC
To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

On Tuesday 25 October 2011 wrote Amon Ott:
> On Tuesday 25 October 2011 wrote Amon Ott:
> > On Monday 24 October 2011 wrote Yehuda Sadeh Weinraub:
> > > Also, the client logs could help shedding a light on the issue. You
> > > should have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and
> > > something along the lines of:
> > >
> > > # mount -t debugfs none /sys/kernel/debug
> > > # echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> > > # echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
> >
> > New kernels are building right now. Upgraded to 3.0.8, put in new
> > ceph-client master fix 8ba1683acc83aee4bcab304844f8e60330e5ef1f and added
> > CONFIG_DYNAMIC_DEBUG. This kernel will go into two big servers this time
> > to give it some real load. Let's see whether I can reproduce there, too.
> > If so, I will provide debug output as requested.
>
> Finally, I could reproduce with debugging on and keep the system alive long
> enough to copy the kernel log. Attached are two examples of the BUG
> happening with surrounding Ceph logs. Before and after these extracts are
> some seconds without logging, so I assume they are complete.

Any news on this bug? Do you need more info? I would really like to go on
testing, but this bug is a show stopper for me.

Amon Ott

* Re: BUG at fs/inode.c
From: Sage Weil @ 2011-11-01 16:51 UTC
To: Amon Ott; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Tue, 1 Nov 2011, Amon Ott wrote:
> On Tuesday 25 October 2011 wrote Amon Ott:
> > On Tuesday 25 October 2011 wrote Amon Ott:
> > > On Monday 24 October 2011 wrote Yehuda Sadeh Weinraub:
> > > > Also, the client logs could help shedding a light on the issue. You
> > > > should have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and
> > > > something along the lines of:
> > > >
> > > > # mount -t debugfs none /sys/kernel/debug
> > > > # echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> > > > # echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
> > >
> > > New kernels are building right now. Upgraded to 3.0.8, put in new
> > > ceph-client master fix 8ba1683acc83aee4bcab304844f8e60330e5ef1f and added
> > > CONFIG_DYNAMIC_DEBUG. This kernel will go into two big servers this time
> > > to give it some real load. Let's see whether I can reproduce there, too.
> > > If so, I will provide debug output as requested.
> >
> > Finally, I could reproduce with debugging on and keep the system alive long
> > enough to copy the kernel log. Attached are two examples of the BUG
> > happening with surrounding Ceph logs. Before and after these extracts are
> > some seconds without logging, so I assume they are complete.
>
> Any news on this bug? Do you need more info? I would really like to go on
> testing, but this bug is a show stopper for me.

Sorry, I dropped this one. Just added it to the tracker at

	http://tracker.newdream.net/issues/1667

There is enough in those logs to tell that there is a bad iput() somewhere.
In bug1, we create a new inode, link it to dn A, and shortly thereafter dn B
sees it is also linked to it (incorrectly) and we iput(). Presumably we
reused an inode address that was still in use (and the same thing happens
with the new inode right after that).

Can you capture a larger log segment? The hope is to catch the first
use-after-free, and not the subsequent side-effects.

Also, the below patch may help us parse the output with multiple threads.

Thanks!
sage

diff --git a/kernel/printk.c b/kernel/printk.c
index 28a40d8..9fcf993 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -929,9 +929,10 @@ asmlinkage int vprintk(const char *fmt, va_list args)
 				t = cpu_clock(printk_cpu);
 				nanosec_rem = do_div(t, 1000000000);
-				tlen = sprintf(tbuf, "[%5lu.%06lu] ",
+				tlen = sprintf(tbuf, "[%5lu.%06lu %6lu] ",
 						(unsigned long) t,
-						nanosec_rem / 1000);
+						nanosec_rem / 1000,
+						current->pid);
 
 				for (tp = tbuf; tp < tbuf + tlen; tp++)
 					emit_log_char(*tp);
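
The effect of the printk patch above on each log line can be shown with a small userspace sketch. This is an illustration under stated assumptions, not the kernel implementation: `format_log_prefix` is a hypothetical helper, and the timestamp is passed in as plain nanoseconds instead of coming from cpu_clock(). The pid is cast explicitly so that the argument type matches the `%6lu` conversion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes a "[seconds.microseconds pid] " prefix into buf,
 * using the same format string as the debugging patch.
 * Returns what snprintf() returns.
 */
static int format_log_prefix(char *buf, size_t len,
			     unsigned long long ns, int pid)
{
	unsigned long secs = (unsigned long)(ns / 1000000000ULL);
	unsigned long usecs = (unsigned long)((ns % 1000000000ULL) / 1000);

	/* Cast the pid so the argument matches the %lu conversion. */
	return snprintf(buf, len, "[%5lu.%06lu %6lu] ",
			secs, usecs, (unsigned long)pid);
}
```

With the pid in every prefix, interleaved output from several tasks (for example the `find` and `genstatus` processes in the original traces) can be separated by filtering on that column.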

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-11-02 8:53 UTC
To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Tuesday 01 November 2011 wrote Sage Weil:
> Can you capture a larger log segment? The hope is to catch the first
> use-after-free, and not the subsequent side-effects.

I still have the full kern.log here from boot till BUG, cleaned it up a bit
(no firewall lines, RSBAC stuff) and uploaded to
https://download.m-privacy.de/kern-full.log.bz2

Full ceph logging had been enabled as soon as possible, after boot and before
mounting ceph fs.

> Also, the below patch may help us parse the output with multiple threads.

Put in the patch, new kernel packages building now.

Amon Ott

* Re: BUG at fs/inode.c
From: Sage Weil @ 2011-11-02 14:23 UTC
To: Amon Ott; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Wed, 2 Nov 2011, Amon Ott wrote:
> On Tuesday 01 November 2011 wrote Sage Weil:
> > Can you capture a larger log segment? The hope is to catch the first
> > use-after-free, and not the subsequent side-effects.
>
> I still have the full kern.log here from boot till BUG, cleaned it up a bit
> (no firewall lines, RSBAC stuff) and uploaded to
> https://download.m-privacy.de/kern-full.log.bz2
>
> Full ceph logging had been enabled as soon as possible, after boot and before
> mounting ceph fs.
>
> > Also, the below patch may help us parse the output with multiple threads.

The following would also help: 4f9ea86237b8d0005f5467fe817b4f1f0955072c, or
wip-debug-inode-refs in ceph-client.git.

Thanks!
sage

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-11-05 17:06 UTC
To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Wednesday 02 November 2011 you wrote:
> On Wed, 2 Nov 2011, Amon Ott wrote:
> > On Tuesday 01 November 2011 wrote Sage Weil:
> > > Can you capture a larger log segment? The hope is to catch the first
> > > use-after-free, and not the subsequent side-effects.
> >
> > I still have the full kern.log here from boot till BUG, cleaned it up a
> > bit (no firewall lines, RSBAC stuff) and uploaded to
> > https://download.m-privacy.de/kern-full.log.bz2
> >
> > Full ceph logging had been enabled as soon as possible, after boot and
> > before mounting ceph fs.
> >
> > > Also, the below patch may help us parse the output with multiple
> > > threads.
>
> The following would also help: 4f9ea86237b8d0005f5467fe817b4f1f0955072c,
> or wip-debug-inode-refs in ceph-client.git.

The bug had been a lot harder to trigger with all that debugging slowing down
the systems, but now I have something. I hope it helps tracking that beast
down. 282K compressed size, so I uploaded the full log there:

https://download.m-privacy.de/kern.log2.bz2

Amon Ott

* Re: BUG at fs/inode.c
From: Sage Weil @ 2011-11-06 5:33 UTC
To: Amon Ott; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Sat, 5 Nov 2011, Amon Ott wrote:
> On Wednesday 02 November 2011 you wrote:
> > On Wed, 2 Nov 2011, Amon Ott wrote:
> > > On Tuesday 01 November 2011 wrote Sage Weil:
> > > > Can you capture a larger log segment? The hope is to catch the first
> > > > use-after-free, and not the subsequent side-effects.
> > >
> > > I still have the full kern.log here from boot till BUG, cleaned it up a
> > > bit (no firewall lines, RSBAC stuff) and uploaded to
> > > https://download.m-privacy.de/kern-full.log.bz2
> > >
> > > Full ceph logging had been enabled as soon as possible, after boot and
> > > before mounting ceph fs.
> > >
> > > > Also, the below patch may help us parse the output with multiple
> > > > threads.
> >
> > The following would also help: 4f9ea86237b8d0005f5467fe817b4f1f0955072c,
> > or wip-debug-inode-refs in ceph-client.git.
>
> The bug had been a lot harder to trigger with all that debugging slowing down
> the systems, but now I have something. I hope it helps tracking that beast
> down. 282K compressed size, so I uploaded the full log there:
>
> https://download.m-privacy.de/kern.log2.bz2

Pretty sure I've found this. Can you test the patch below?

Thanks!
sage

From 15a2015fbc692e1c97d7ce12d96e077f5ae7ea6d Mon Sep 17 00:00:00 2001
From: Sage Weil <sage@newdream.net>
Date: Sat, 5 Nov 2011 22:06:31 -0700
Subject: [PATCH] ceph: fix iput race when queueing inode work

If we queue a work item that calls iput(), make sure we ihold() before
attempting to queue work. Otherwise our queued work might miraculously run
before we notice the queue_work() succeeded and call ihold(), allowing the
inode to be destroyed.

That is, instead of

	if (queue_work(...))
		ihold();

we need to do

	ihold();
	if (!queue_work(...))
		iput();

Reported-by: Amon Ott <a.ott@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>
---
 fs/ceph/inode.c | 9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index e392bfc..116f365 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1328,12 +1328,13 @@ int ceph_inode_set_size(struct inode *inode, loff_t size)
  */
 void ceph_queue_writeback(struct inode *inode)
 {
+	ihold(inode);
 	if (queue_work(ceph_inode_to_client(inode)->wb_wq,
 		       &ceph_inode(inode)->i_wb_work)) {
 		dout("ceph_queue_writeback %p\n", inode);
-		ihold(inode);
 	} else {
 		dout("ceph_queue_writeback %p failed\n", inode);
+		iput(inode);
 	}
 }
 
@@ -1353,12 +1354,13 @@ static void ceph_writeback_work(struct work_struct *work)
  */
 void ceph_queue_invalidate(struct inode *inode)
 {
+	ihold(inode);
 	if (queue_work(ceph_inode_to_client(inode)->pg_inv_wq,
 		       &ceph_inode(inode)->i_pg_inv_work)) {
 		dout("ceph_queue_invalidate %p\n", inode);
-		ihold(inode);
 	} else {
 		dout("ceph_queue_invalidate %p failed\n", inode);
+		iput(inode);
 	}
 }
 
@@ -1434,13 +1436,14 @@ void ceph_queue_vmtruncate(struct inode *inode)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 
+	ihold(inode);
 	if (queue_work(ceph_sb_to_client(inode->i_sb)->trunc_wq,
 		       &ci->i_vmtruncate_work)) {
 		dout("ceph_queue_vmtruncate %p\n", inode);
-		ihold(inode);
 	} else {
 		dout("ceph_queue_vmtruncate %p failed, pending=%d\n",
 		     inode, ci->i_truncate_pending);
+		iput(inode);
 	}
 }
 
-- 
1.7.2.5
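
The ordering argument in the commit message above can be checked with a small userspace model. This is a sketch under stated assumptions, not the Ceph code: `model_obj`, `model_get`, `model_put` and `model_queue_work` are hypothetical names, and the stand-in for queue_work() runs the work item to completion before it returns, which models the worst-case schedule the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for the inode. */
struct model_obj {
	int refs;              /* reference count */
	bool destroyed;        /* set when refs reaches zero */
	bool used_after_free;  /* set on any access after destruction */
	bool work_pending;     /* work item already queued */
};

static void model_get(struct model_obj *o)
{
	if (o->destroyed)
		o->used_after_free = true;
	o->refs++;
}

static void model_put(struct model_obj *o)
{
	if (o->destroyed) {
		o->used_after_free = true;
		return;
	}
	if (--o->refs == 0)
		o->destroyed = true;
}

/* The work function drops the reference taken on its behalf. */
static void model_work(struct model_obj *o)
{
	o->work_pending = false;
	model_put(o);
}

/*
 * Stand-in for queue_work(): returns false if the work was already pending.
 * Worst case modeled here: the work runs before this call returns.
 */
static bool model_queue_work(struct model_obj *o)
{
	if (o->work_pending)
		return false;
	o->work_pending = true;
	model_work(o);
	return true;
}

/* Old ordering: reference taken only after queueing succeeded. */
static void queue_racy(struct model_obj *o)
{
	if (model_queue_work(o))
		model_get(o);
}

/* New ordering: take the reference first, drop it if queueing failed. */
static void queue_fixed(struct model_obj *o)
{
	model_get(o);
	if (!model_queue_work(o))
		model_put(o);
}

static bool racy_order_uses_freed_object(void)
{
	struct model_obj o = { .refs = 1 };

	queue_racy(&o);
	return o.used_after_free;
}

static bool fixed_order_is_safe(void)
{
	struct model_obj o = { .refs = 1 };

	queue_fixed(&o);
	return !o.used_after_free && !o.destroyed && o.refs == 1;
}

static bool fixed_order_balances_when_already_pending(void)
{
	struct model_obj o = { .refs = 1, .work_pending = true };

	queue_fixed(&o);
	return !o.destroyed && o.refs == 1;
}
```

In this model the old ordering lets the work item drop the caller's only reference, so the later get touches a destroyed object. The new ordering keeps the count balanced both when the work is queued and when it was already pending.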

* Re: BUG at fs/inode.c
From: Amon Ott @ 2011-11-07 15:32 UTC
To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, ceph-devel@vger.kernel.org

On Sunday 06 November 2011 wrote Sage Weil:
> Pretty sure I've found this. Can you test the patch below?

With this patch, the bug disappeared. Thank you, good work!

Unfortunately, I hit another bug with ceph_filldir(). Will report on that one
soon, when I know more. Might also be my fault, need to check.

Amon Ott