* rbd and ceph
@ 2012-05-15 14:01 Martin Wilderoth
2012-05-15 16:10 ` Josh Durgin
0 siblings, 1 reply; 4+ messages in thread
From: Martin Wilderoth @ 2012-05-15 14:01 UTC (permalink / raw)
To: ceph-devel
Hello,
I have a Xen host using rbd devices for the guests. In the guest I have a ceph file system mounted.
From time to time the guest hangs, and I find the following errors in the log files on the guest.
Maybe I should not use both rbd and ceph?
May 15 14:13:18 lintx2 kernel: [ 3560.225095] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
May 15 14:13:18 lintx2 kernel: [ 3560.225140] Pid: 18, comm: kworker/0:1 Tainted: G W 3.2.0-0.bpo.2-amd64 #1
May 15 14:13:18 lintx2 kernel: [ 3560.225148] Call Trace:
May 15 14:13:18 lintx2 kernel: [ 3560.225155] [<ffffffff810497b4>] ? warn_slowpath_common+0x78/0x8c
May 15 14:13:18 lintx2 kernel: [ 3560.225167] [<ffffffffa00db647>] ? ceph_add_cap+0x38e/0x49e [ceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225178] [<ffffffffa00d220a>] ? fill_inode+0x4eb/0x602 [ceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225186] [<ffffffff811157ad>] ? __d_instantiate+0x8b/0xda
May 15 14:13:18 lintx2 kernel: [ 3560.225197] [<ffffffffa00d317d>] ? ceph_readdir_prepopulate+0x2de/0x375 [ceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225209] [<ffffffffa00e2d3f>] ? dispatch+0xa35/0xef2 [ceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225220] [<ffffffffa00ae841>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225231] [<ffffffffa00b0821>] ? con_work+0x1070/0x13b8 [libceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225240] [<ffffffff81006f7f>] ? xen_restore_fl_direct_reloc+0x4/0x4
May 15 14:13:18 lintx2 kernel: [ 3560.225248] [<ffffffff81044549>] ? update_curr+0xbc/0x160
May 15 14:13:18 lintx2 kernel: [ 3560.225259] [<ffffffffa00af7b1>] ? try_write+0xbe1/0xbe1 [libceph]
May 15 14:13:18 lintx2 kernel: [ 3560.225268] [<ffffffff8105f897>] ? process_one_work+0x1cc/0x2ea
May 15 14:13:18 lintx2 kernel: [ 3560.225277] [<ffffffff8105fae2>] ? worker_thread+0x12d/0x247
May 15 14:13:18 lintx2 kernel: [ 3560.225285] [<ffffffff8105f9b5>] ? process_one_work+0x2ea/0x2ea
May 15 14:13:18 lintx2 kernel: [ 3560.225294] [<ffffffff810632ed>] ? kthread+0x7a/0x82
May 15 14:13:18 lintx2 kernel: [ 3560.225302] [<ffffffff8136b974>] ? kernel_thread_helper+0x4/0x10
May 15 14:13:18 lintx2 kernel: [ 3560.225311] [<ffffffff81369a33>] ? int_ret_from_sys_call+0x7/0x1b
May 15 14:13:18 lintx2 kernel: [ 3560.225319] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:18 lintx2 kernel: [ 3560.225328] [<ffffffff8136b970>] ? gs_change+0x13/0x13
May 15 14:13:18 lintx2 kernel: [ 3560.225335] ---[ end trace 111652db8892cd8b ]---
May 15 14:13:19 lintx2 kernel: [ 3560.467074] ------------[ cut here ]------------
May 15 14:13:19 lintx2 kernel: [ 3560.467086] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()
May 15 14:13:19 lintx2 kernel: [ 3560.467102] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
May 15 14:13:19 lintx2 kernel: [ 3560.467147] Pid: 18, comm: kworker/0:1 Tainted: G W 3.2.0-0.bpo.2-amd64 #1
May 15 14:13:19 lintx2 kernel: [ 3560.467155] Call Trace:
May 15 14:13:19 lintx2 kernel: [ 3560.467163] [<ffffffff810497b4>] ? warn_slowpath_common+0x78/0x8c
May 15 14:13:19 lintx2 kernel: [ 3560.467175] [<ffffffffa00db647>] ? ceph_add_cap+0x38e/0x49e [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467187] [<ffffffffa00d220a>] ? fill_inode+0x4eb/0x602 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467196] [<ffffffff811157ad>] ? __d_instantiate+0x8b/0xda
May 15 14:13:19 lintx2 kernel: [ 3560.467207] [<ffffffffa00d317d>] ? ceph_readdir_prepopulate+0x2de/0x375 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467220] [<ffffffffa00e2d3f>] ? dispatch+0xa35/0xef2 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467229] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.467241] [<ffffffffa00ae841>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467252] [<ffffffffa00b0821>] ? con_work+0x1070/0x13b8 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467262] [<ffffffff81006f7f>] ? xen_restore_fl_direct_reloc+0x4/0x4
May 15 14:13:19 lintx2 kernel: [ 3560.467271] [<ffffffff81044549>] ? update_curr+0xbc/0x160
May 15 14:13:19 lintx2 kernel: [ 3560.467280] [<ffffffff81362d0d>] ? __schedule+0x5a0/0x5cd
May 15 14:13:19 lintx2 kernel: [ 3560.467290] [<ffffffffa00af7b1>] ? try_write+0xbe1/0xbe1 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467299] [<ffffffff8105f897>] ? process_one_work+0x1cc/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.467308] [<ffffffff8105fae2>] ? worker_thread+0x12d/0x247
May 15 14:13:19 lintx2 kernel: [ 3560.467317] [<ffffffff8105f9b5>] ? process_one_work+0x2ea/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.467325] [<ffffffff810632ed>] ? kthread+0x7a/0x82
May 15 14:13:19 lintx2 kernel: [ 3560.467334] [<ffffffff8136b974>] ? kernel_thread_helper+0x4/0x10
May 15 14:13:19 lintx2 kernel: [ 3560.467343] [<ffffffff81369a33>] ? int_ret_from_sys_call+0x7/0x1b
May 15 14:13:19 lintx2 kernel: [ 3560.467352] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.467361] [<ffffffff8136b970>] ? gs_change+0x13/0x13
May 15 14:13:19 lintx2 kernel: [ 3560.467368] ---[ end trace 111652db8892cd8c ]---
May 15 14:13:19 lintx2 kernel: [ 3560.467384] ------------[ cut here ]------------
May 15 14:13:19 lintx2 kernel: [ 3560.467395] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()
May 15 14:13:19 lintx2 kernel: [ 3560.467408] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
May 15 14:13:19 lintx2 kernel: [ 3560.467452] Pid: 18, comm: kworker/0:1 Tainted: G W 3.2.0-0.bpo.2-amd64 #1
May 15 14:13:19 lintx2 kernel: [ 3560.467460] Call Trace:
May 15 14:13:19 lintx2 kernel: [ 3560.467467] [<ffffffff810497b4>] ? warn_slowpath_common+0x78/0x8c
May 15 14:13:19 lintx2 kernel: [ 3560.467478] [<ffffffffa00db647>] ? ceph_add_cap+0x38e/0x49e [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467489] [<ffffffffa00d220a>] ? fill_inode+0x4eb/0x602 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467498] [<ffffffff811157ad>] ? __d_instantiate+0x8b/0xda
May 15 14:13:19 lintx2 kernel: [ 3560.467508] [<ffffffffa00d317d>] ? ceph_readdir_prepopulate+0x2de/0x375 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467521] [<ffffffffa00e2d3f>] ? dispatch+0xa35/0xef2 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467529] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.467540] [<ffffffffa00ae841>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467551] [<ffffffffa00b0821>] ? con_work+0x1070/0x13b8 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467560] [<ffffffff81006f7f>] ? xen_restore_fl_direct_reloc+0x4/0x4
May 15 14:13:19 lintx2 kernel: [ 3560.467569] [<ffffffff81044549>] ? update_curr+0xbc/0x160
May 15 14:13:19 lintx2 kernel: [ 3560.467577] [<ffffffff81362d0d>] ? __schedule+0x5a0/0x5cd
May 15 14:13:19 lintx2 kernel: [ 3560.467587] [<ffffffffa00af7b1>] ? try_write+0xbe1/0xbe1 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.467596] [<ffffffff8105f897>] ? process_one_work+0x1cc/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.467605] [<ffffffff8105fae2>] ? worker_thread+0x12d/0x247
May 15 14:13:19 lintx2 kernel: [ 3560.467613] [<ffffffff8105f9b5>] ? process_one_work+0x2ea/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.467622] [<ffffffff810632ed>] ? kthread+0x7a/0x82
May 15 14:13:19 lintx2 kernel: [ 3560.467630] [<ffffffff8136b974>] ? kernel_thread_helper+0x4/0x10
May 15 14:13:19 lintx2 kernel: [ 3560.467639] [<ffffffff81369a33>] ? int_ret_from_sys_call+0x7/0x1b
May 15 14:13:19 lintx2 kernel: [ 3560.467647] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.467656] [<ffffffff8136b970>] ? gs_change+0x13/0x13
May 15 14:13:19 lintx2 kernel: [ 3560.467663] ---[ end trace 111652db8892cd8d ]---
May 15 14:13:19 lintx2 kernel: [ 3560.467777] ------------[ cut here ]------------
May 15 14:13:19 lintx2 kernel: [ 3560.467787] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()
May 15 14:13:19 lintx2 kernel: [ 3560.467801] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
May 15 14:13:19 lintx2 kernel: [ 3560.470923] Pid: 18, comm: kworker/0:1 Tainted: G W 3.2.0-0.bpo.2-amd64 #1
May 15 14:13:19 lintx2 kernel: [ 3560.470923] Call Trace:
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff810497b4>] ? warn_slowpath_common+0x78/0x8c
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00db647>] ? ceph_add_cap+0x38e/0x49e [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00d220a>] ? fill_inode+0x4eb/0x602 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff811157ad>] ? __d_instantiate+0x8b/0xda
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00d317d>] ? ceph_readdir_prepopulate+0x2de/0x375 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00e2d3f>] ? dispatch+0xa35/0xef2 [ceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00ae841>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00b0821>] ? con_work+0x1070/0x13b8 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff81006f7f>] ? xen_restore_fl_direct_reloc+0x4/0x4
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff81044549>] ? update_curr+0xbc/0x160
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff81362d0d>] ? __schedule+0x5a0/0x5cd
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffffa00af7b1>] ? try_write+0xbe1/0xbe1 [libceph]
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8105f897>] ? process_one_work+0x1cc/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8105fae2>] ? worker_thread+0x12d/0x247
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8105f9b5>] ? process_one_work+0x2ea/0x2ea
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff810632ed>] ? kthread+0x7a/0x82
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8136b974>] ? kernel_thread_helper+0x4/0x10
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff81369a33>] ? int_ret_from_sys_call+0x7/0x1b
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
May 15 14:13:19 lintx2 kernel: [ 3560.470923] [<ffffffff8136b970>] ? gs_change+0x13/0x13
May 15 14:13:19 lintx2 kernel: [ 3560.470923] ---[ end trace 111652db8892cd8e ]---
Regards
Martin
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: rbd and ceph
2012-05-15 14:01 rbd and ceph Martin Wilderoth
@ 2012-05-15 16:10 ` Josh Durgin
2012-05-15 16:24 ` Sage Weil
0 siblings, 1 reply; 4+ messages in thread
From: Josh Durgin @ 2012-05-15 16:10 UTC (permalink / raw)
To: Martin Wilderoth; +Cc: ceph-devel
On 05/15/2012 07:01 AM, Martin Wilderoth wrote:
> Hello,
>
> I have a xenhost using rbd device for the guests. In the guest I have a mounted ceph file system.
> From time to time I get the guest hanging and I have the following error in my logfiles on the guest.
>
> Maybe I should not use both rbd and ceph ?
While it's not a well tested configuration, I don't see any reason this
wouldn't work. Sage, Alex, are there any shared resources in libceph
that would cause problems with this?
> May 15 14:13:18 lintx2 kernel: [ 3560.225095] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
> May 15 14:13:18 lintx2 kernel: [ 3560.225140] Pid: 18, comm: kworker/0:1 Tainted: G W 3.2.0-0.bpo.2-amd64 #1
> May 15 14:13:18 lintx2 kernel: [ 3560.225148] Call Trace:
> May 15 14:13:18 lintx2 kernel: [ 3560.225155] [<ffffffff810497b4>] ? warn_slowpath_common+0x78/0x8c
> May 15 14:13:18 lintx2 kernel: [ 3560.225167] [<ffffffffa00db647>] ? ceph_add_cap+0x38e/0x49e [ceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225178] [<ffffffffa00d220a>] ? fill_inode+0x4eb/0x602 [ceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225186] [<ffffffff811157ad>] ? __d_instantiate+0x8b/0xda
> May 15 14:13:18 lintx2 kernel: [ 3560.225197] [<ffffffffa00d317d>] ? ceph_readdir_prepopulate+0x2de/0x375 [ceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225209] [<ffffffffa00e2d3f>] ? dispatch+0xa35/0xef2 [ceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225220] [<ffffffffa00ae841>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225231] [<ffffffffa00b0821>] ? con_work+0x1070/0x13b8 [libceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225240] [<ffffffff81006f7f>] ? xen_restore_fl_direct_reloc+0x4/0x4
> May 15 14:13:18 lintx2 kernel: [ 3560.225248] [<ffffffff81044549>] ? update_curr+0xbc/0x160
> May 15 14:13:18 lintx2 kernel: [ 3560.225259] [<ffffffffa00af7b1>] ? try_write+0xbe1/0xbe1 [libceph]
> May 15 14:13:18 lintx2 kernel: [ 3560.225268] [<ffffffff8105f897>] ? process_one_work+0x1cc/0x2ea
> May 15 14:13:18 lintx2 kernel: [ 3560.225277] [<ffffffff8105fae2>] ? worker_thread+0x12d/0x247
> May 15 14:13:18 lintx2 kernel: [ 3560.225285] [<ffffffff8105f9b5>] ? process_one_work+0x2ea/0x2ea
> May 15 14:13:18 lintx2 kernel: [ 3560.225294] [<ffffffff810632ed>] ? kthread+0x7a/0x82
> May 15 14:13:18 lintx2 kernel: [ 3560.225302] [<ffffffff8136b974>] ? kernel_thread_helper+0x4/0x10
> May 15 14:13:18 lintx2 kernel: [ 3560.225311] [<ffffffff81369a33>] ? int_ret_from_sys_call+0x7/0x1b
> May 15 14:13:18 lintx2 kernel: [ 3560.225319] [<ffffffff8136453c>] ? retint_restore_args+0x5/0x6
> May 15 14:13:18 lintx2 kernel: [ 3560.225328] [<ffffffff8136b970>] ? gs_change+0x13/0x13
> May 15 14:13:18 lintx2 kernel: [ 3560.225335] ---[ end trace 111652db8892cd8b ]---
The only warning in ceph_add_cap is when it can't look up the snap
realm. I'm not sure if this has any real consequences. Sage?
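One way to confirm that the repeated traces all come from that single warning site is to group the WARNING headers in the guest log by source location. The following is only a minimal sketch: the helper name `count_warning_sites` and the regular expression are illustrative assumptions, not part of any ceph or kernel tooling, and the sample lines are copied from the log posted above.

```python
import re
from collections import Counter

# Matches a kernel WARNING header such as
# "WARNING: at <path>/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()"
# (pattern is an assumption based on the lines quoted in this thread).
WARNING_RE = re.compile(
    r"WARNING: at (?P<path>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def count_warning_sites(log_lines):
    """Count kernel warnings per (source file, line number, function)."""
    sites = Counter()
    for text in log_lines:
        match = WARNING_RE.search(text)
        if match is None:
            continue  # not a WARNING header (e.g. a call-trace line)
        # Keep only the last three path components so that the
        # distribution's build prefix does not affect the grouping.
        short_path = "/".join(match.group("path").split("/")[-3:])
        key = (short_path, int(match.group("line")), match.group("func"))
        sites[key] += 1
    return sites


# Sample lines taken from the guest log in the original report.
sample = [
    "May 15 14:13:19 lintx2 kernel: [ 3560.467086] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()",
    "May 15 14:13:19 lintx2 kernel: [ 3560.467155] Call Trace:",
    "May 15 14:13:19 lintx2 kernel: [ 3560.467395] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()",
    "May 15 14:13:19 lintx2 kernel: [ 3560.467787] WARNING: at /build/buildd-linux-2.6_3.2.15-1~bpo60+1-amd64-Rdi2JW/linux-2.6-3.2.15/debian/build/source_amd64_none/fs/ceph/caps.c:590 ceph_add_cap+0x38e/0x49e [ceph]()",
]

print(count_warning_sites(sample))
# Counter({('fs/ceph/caps.c', 590, 'ceph_add_cap'): 3})
```

For the three WARNING headers present in the posted excerpt this yields a single site, fs/ceph/caps.c:590 in ceph_add_cap. On a full log, more than one key in the result would suggest that other warnings are involved as well.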
* Re: rbd and ceph
2012-05-15 16:10 ` Josh Durgin
@ 2012-05-15 16:24 ` Sage Weil
2012-05-15 18:22 ` Martin Wilderoth
0 siblings, 1 reply; 4+ messages in thread
From: Sage Weil @ 2012-05-15 16:24 UTC (permalink / raw)
To: Josh Durgin; +Cc: Martin Wilderoth, ceph-devel
On Tue, 15 May 2012, Josh Durgin wrote:
> On 05/15/2012 07:01 AM, Martin Wilderoth wrote:
> > Hello,
> >
> > I have a xenhost using rbd device for the guests. In the guest I have a
> > mounted ceph file system.
> > From time to time I get the guest hanging and I have the following error in
> > my logfiles on the guest.
> >
> > Maybe I should not use both rbd and ceph ?
>
> While it's not a well tested configuration, I don't see any reason this
> wouldn't work. Sage, Alex, are there any shared resources in libceph
> that would cause problems with this?
There shouldn't be any problems with running rbd + ceph together.
> > May 15 14:13:18 lintx2 kernel: [ 3560.225095] Modules linked in: cryptd
> > aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm
> > snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront
> > xen_blkfront
> > May 15 14:13:18 lintx2 kernel: [ 3560.225140] Pid: 18, comm: kworker/0:1
> > Tainted: G W 3.2.0-0.bpo.2-amd64 #1
> > May 15 14:13:18 lintx2 kernel: [ 3560.225148] Call Trace:
> > May 15 14:13:18 lintx2 kernel: [ 3560.225155] [<ffffffff810497b4>] ?
> > warn_slowpath_common+0x78/0x8c
> > May 15 14:13:18 lintx2 kernel: [ 3560.225167] [<ffffffffa00db647>] ?
> > ceph_add_cap+0x38e/0x49e [ceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225178] [<ffffffffa00d220a>] ?
> > fill_inode+0x4eb/0x602 [ceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225186] [<ffffffff811157ad>] ?
> > __d_instantiate+0x8b/0xda
> > May 15 14:13:18 lintx2 kernel: [ 3560.225197] [<ffffffffa00d317d>] ?
> > ceph_readdir_prepopulate+0x2de/0x375 [ceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225209] [<ffffffffa00e2d3f>] ?
> > dispatch+0xa35/0xef2 [ceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225220] [<ffffffffa00ae841>] ?
> > ceph_tcp_recvmsg+0x43/0x4f [libceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225231] [<ffffffffa00b0821>] ?
> > con_work+0x1070/0x13b8 [libceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225240] [<ffffffff81006f7f>] ?
> > xen_restore_fl_direct_reloc+0x4/0x4
> > May 15 14:13:18 lintx2 kernel: [ 3560.225248] [<ffffffff81044549>] ?
> > update_curr+0xbc/0x160
> > May 15 14:13:18 lintx2 kernel: [ 3560.225259] [<ffffffffa00af7b1>] ?
> > try_write+0xbe1/0xbe1 [libceph]
> > May 15 14:13:18 lintx2 kernel: [ 3560.225268] [<ffffffff8105f897>] ?
> > process_one_work+0x1cc/0x2ea
> > May 15 14:13:18 lintx2 kernel: [ 3560.225277] [<ffffffff8105fae2>] ?
> > worker_thread+0x12d/0x247
> > May 15 14:13:18 lintx2 kernel: [ 3560.225285] [<ffffffff8105f9b5>] ?
> > process_one_work+0x2ea/0x2ea
> > May 15 14:13:18 lintx2 kernel: [ 3560.225294] [<ffffffff810632ed>] ?
> > kthread+0x7a/0x82
> > May 15 14:13:18 lintx2 kernel: [ 3560.225302] [<ffffffff8136b974>] ?
> > kernel_thread_helper+0x4/0x10
> > May 15 14:13:18 lintx2 kernel: [ 3560.225311] [<ffffffff81369a33>] ?
> > int_ret_from_sys_call+0x7/0x1b
> > May 15 14:13:18 lintx2 kernel: [ 3560.225319] [<ffffffff8136453c>] ?
> > retint_restore_args+0x5/0x6
> > May 15 14:13:18 lintx2 kernel: [ 3560.225328] [<ffffffff8136b970>] ?
> > gs_change+0x13/0x13
> > May 15 14:13:18 lintx2 kernel: [ 3560.225335] ---[ end trace
> > 111652db8892cd8b ]---
>
> The only warning in ceph_add_cap is when it can't lookup the snap
> realm. I'm not sure if this has any real consequences. Sage?
Not really. It is a bug, but you're seeing the _guest_ hang, not the fs,
right? I suspect there is something else going on, and this is a red
herring.
FWIW, I've seen two RBD kernel hangs in the last few days in our QA (under
xfstests workload). We're still looking into that.
sage
* Re: rbd and ceph
2012-05-15 16:24 ` Sage Weil
@ 2012-05-15 18:22 ` Martin Wilderoth
0 siblings, 0 replies; 4+ messages in thread
From: Martin Wilderoth @ 2012-05-15 18:22 UTC (permalink / raw)
To: ceph-devel
> On Tue, 15 May 2012, Josh Durgin wrote:
> > On 05/15/2012 07:01 AM, Martin Wilderoth wrote:
> > > Hello,
> > >
> > > I have a xenhost using rbd device for the guests. In the guest I have a
> > > mounted ceph file system.
> > > From time to time I get the guest hanging and I have the following error in
> > > my logfiles on the guest.
> > >
> > > Maybe I should not use both rbd and ceph ?
> >
> > While it's not a well tested configuration, I don't see any reason this
> > wouldn't work. Sage, Alex, are there any shared resources in libceph
> > that would cause problems with this?
> There shouldn't be any problems with running rbd + ceph together.
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225095] Modules linked in: cryptd
> > > aes_x86_64 aes_generic cbc ceph libceph crc32c libcrc32c evdev snd_pcm
> > > snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_netfront
> > > xen_blkfront
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225140] Pid: 18, comm: kworker/0:1
> > > Tainted: G W 3.2.0-0.bpo.2-amd64 #1
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225148] Call Trace:
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225155] [<ffffffff810497b4>] ?
> > > warn_slowpath_common+0x78/0x8c
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225167] [<ffffffffa00db647>] ?
> > > ceph_add_cap+0x38e/0x49e [ceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225178] [<ffffffffa00d220a>] ?
> > > fill_inode+0x4eb/0x602 [ceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225186] [<ffffffff811157ad>] ?
> > > __d_instantiate+0x8b/0xda
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225197] [<ffffffffa00d317d>] ?
> > > ceph_readdir_prepopulate+0x2de/0x375 [ceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225209] [<ffffffffa00e2d3f>] ?
> > > dispatch+0xa35/0xef2 [ceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225220] [<ffffffffa00ae841>] ?
> > > ceph_tcp_recvmsg+0x43/0x4f [libceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225231] [<ffffffffa00b0821>] ?
> > > con_work+0x1070/0x13b8 [libceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225240] [<ffffffff81006f7f>] ?
> > > xen_restore_fl_direct_reloc+0x4/0x4
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225248] [<ffffffff81044549>] ?
> > > update_curr+0xbc/0x160
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225259] [<ffffffffa00af7b1>] ?
> > > try_write+0xbe1/0xbe1 [libceph]
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225268] [<ffffffff8105f897>] ?
> > > process_one_work+0x1cc/0x2ea
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225277] [<ffffffff8105fae2>] ?
> > > worker_thread+0x12d/0x247
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225285] [<ffffffff8105f9b5>] ?
> > > process_one_work+0x2ea/0x2ea
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225294] [<ffffffff810632ed>] ?
> > > kthread+0x7a/0x82
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225302] [<ffffffff8136b974>] ?
> > > kernel_thread_helper+0x4/0x10
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225311] [<ffffffff81369a33>] ?
> > > int_ret_from_sys_call+0x7/0x1b
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225319] [<ffffffff8136453c>] ?
> > > retint_restore_args+0x5/0x6
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225328] [<ffffffff8136b970>] ?
> > > gs_change+0x13/0x13
> > > May 15 14:13:18 lintx2 kernel: [ 3560.225335] ---[ end trace
> > > 111652db8892cd8b ]---
> >
> > The only warning in ceph_add_cap is when it can't lookup the snap
> > realm. I'm not sure if this has any real consequences. Sage?
>
> Not really. It is a bug, but you're seeing the _guest_ hang, not the fs,
> right? I suspect there is something else going on, and this is a red
> herring.
It's the guest that hangs or loops. I keep getting "CPU#0 stuck for ??s" messages on the console.
The only solution is to reset the guest.
I can remount the ceph file system, so the file system itself is not hanging.
Martin
>
> FWIW, I've seen two RBD kernel hangs in the last few days in our QA (under
> xfstests workload). We're still looking into that.
>