* crash on umount
@ 2011-06-03 10:40 Fyodor Ustinov
2011-06-03 15:39 ` Tommi Virtanen
0 siblings, 1 reply; 4+ messages in thread
From: Fyodor Ustinov @ 2011-06-03 10:40 UTC
To: ceph-devel
Hi!
kernel 2.6.39
ceph - 0.28.2
In sysctl.conf set
vm.min_free_kbytes=262144
Jun 2 03:08:17 amanda kernel: [35398.757055] libceph: msg_new can't allocate 4096 bytes
Jun 2 03:08:17 amanda kernel: [35398.757088] libceph: msg_new can't create type 0 front 4096
Jun 2 03:08:17 amanda kernel: [35398.757148] libceph: msgpool osd_op alloc failed
Jun 2 03:08:17 amanda kernel: [35398.759437] libceph: msg_new can't allocate 4096 bytes
Jun 2 03:08:17 amanda kernel: [35398.759469] libceph: msg_new can't create type 0 front 4096
Jun 2 03:08:17 amanda kernel: [35398.759491] libceph: msgpool osd_op alloc failed
Jun 2 03:08:17 amanda kernel: [35398.775955] libceph: msg_new can't allocate 4096 bytes
Jun 2 03:08:17 amanda kernel: [35398.775987] libceph: msg_new can't create type 0 front 4096
Jun 2 03:08:17 amanda kernel: [35398.776008] libceph: msgpool osd_op alloc failed
Jun 3 00:16:47 amanda kernel: [111508.961385] libceph: msg_new can't allocate 4096 bytes
Jun 3 00:16:47 amanda kernel: [111508.961415] libceph: msg_new can't create type 0 front 4096
Jun 3 00:16:47 amanda kernel: [111508.961442] libceph: msgpool osd_op alloc failed
Jun 3 13:33:10 amanda kernel: [159291.960881] ------------[ cut here ]------------
Jun 3 13:33:10 amanda kernel: [159291.960930] kernel BUG at mm/mempool.c:186!
Jun 3 13:33:10 amanda kernel: [159291.960971] invalid opcode: 0000 [#1] SMP
Jun 3 13:33:10 amanda kernel: [159291.961011] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Jun 3 13:33:10 amanda kernel: [159291.961074] CPU 1
Jun 3 13:33:10 amanda kernel: [159291.961093] Modules linked in: 8021q garp stp ceph libceph i915 drm_kms_helper bonding drm psmouse i2c_algo_bit serio_raw video btrfs lp parport zlib_deflate libcrc32c raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov e1000e raid6_pq async_tx raid1 raid0 multipath linear
Jun 3 13:33:10 amanda kernel: [159291.961414]
Jun 3 13:33:10 amanda kernel: [159291.961431] Pid: 4236, comm: umount Not tainted 2.6.39-ufm2+ #1 Gigabyte Technology Co., Ltd. EG41MF-US2H/EG41MF-US2H
Jun 3 13:33:10 amanda kernel: [159291.961525] RIP: 0010:[<ffffffff8110a5c8>] [<ffffffff8110a5c8>] mempool_destroy+0x18/0x20
Jun 3 13:33:10 amanda kernel: [159291.961601] RSP: 0018:ffff88000f1dbdf8 EFLAGS: 00010297
Jun 3 13:33:10 amanda kernel: [159291.961644] RAX: 000000000000000a RBX: ffff880078ebf1a8 RCX: 0000000000000000
Jun 3 13:33:10 amanda kernel: [159291.961701] RDX: 000000000000001e RSI: ffffea0000c0c130 RDI: ffff88003712a6c0
Jun 3 13:33:10 amanda kernel: [159291.961759] RBP: ffff88000f1dbdf8 R08: ffff88007d002200 R09: ffffffff8110a5a5
Jun 3 13:33:10 amanda kernel: [159291.961816] R10: 0000000000000001 R11: dead000000200200 R12: ffff8800795d3040
Jun 3 13:33:10 amanda kernel: [159291.961872] R13: ffff880078eeef00 R14: ffff88000f1dbf28 R15: ffff880078eeef40
Jun 3 13:33:10 amanda kernel: [159291.961929] FS: 00007f767c662760(0000) GS:ffff88007da80000(0000) knlGS:0000000000000000
Jun 3 13:33:10 amanda kernel: [159291.961993] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 3 13:33:10 amanda kernel: [159291.962040] CR2: 00007f767bd7b0d0 CR3: 00000000770d3000 CR4: 00000000000406e0
Jun 3 13:33:10 amanda kernel: [159291.962096] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 3 13:33:10 amanda kernel: [159291.963761] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 3 13:33:10 amanda kernel: [159291.965395] Process umount (pid: 4236, threadinfo ffff88000f1da000, task ffff880036d60000)
Jun 3 13:33:10 amanda kernel: [159291.967062] Stack:
Jun 3 13:33:10 amanda kernel: [159291.968711] ffff88000f1dbe08 ffffffffa02a59e2 ffff88000f1dbe28 ffffffffa02a7fc3
Jun 3 13:33:10 amanda kernel: [159291.970432] ffff880078ebe400 ffff880078ebf000 ffff88000f1dbe48 ffffffffa02a158d
Jun 3 13:33:10 amanda kernel: [159291.970496] ffff8800795d3040 ffff880078ebe400 ffff88000f1dbe68 ffffffffa02cebf3
Jun 3 13:33:10 amanda kernel: [159291.970496] Call Trace:
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a59e2>] ceph_msgpool_destroy+0x12/0x20 [libceph]
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a7fc3>] ceph_osdc_stop+0x83/0xb0 [libceph]
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a158d>] ceph_destroy_client+0x1d/0x60 [libceph]
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02cebf3>] destroy_fs_client+0x63/0x70 [ceph]
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02cec41>] ceph_kill_sb+0x41/0x50 [ceph]
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffff81162865>] deactivate_locked_super+0x45/0x70
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffff811632ba>] deactivate_super+0x4a/0x70
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffff8117ddb4>] mntput_no_expire+0xa4/0xf0
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffff8117eb0c>] sys_umount+0x6c/0x3a0
Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffff815ddec2>] system_call_fastpath+0x16/0x1b
Jun 3 13:33:10 amanda kernel: [159291.970496] Code: 48 89 df e8 ab 5a 04 00 48 83 c4 08 5b 41 5c 41 5d c9 c3 55 48 89 e5 0f 1f 44 00 00 8b 47 04 39 47 08 75 07 e8 8a ff ff ff c9 c3 <0f> 0b 66 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53
Jun 3 13:33:10 amanda kernel: [159291.970496] RIP [<ffffffff8110a5c8>] mempool_destroy+0x18/0x20
Jun 3 13:33:10 amanda kernel: [159291.970496] RSP <ffff88000f1dbdf8>
Jun 3 13:33:10 amanda kernel: [159292.007957] ---[ end trace d333fc57eb1e6ccd ]---
* Re: crash on umount
From: Tommi Virtanen @ 2011-06-03 15:39 UTC
To: Fyodor Ustinov; +Cc: ceph-devel
On Fri, Jun 03, 2011 at 01:40:44PM +0300, Fyodor Ustinov wrote:
> Hi!
>
> kernel 2.6.39
> ceph - 0.28.2
>
> In sysctl.conf set
> vm.min_free_kbytes=262144
>
> Jun 2 03:08:17 amanda kernel: [35398.757055] libceph: msg_new can't
> allocate 4096 bytes
... so first you run out of memory ...
> Jun 3 13:33:10 amanda kernel: [159291.960881] ------------[ cut
> here ]------------
> Jun 3 13:33:10 amanda kernel: [159291.960930] kernel BUG at
> mm/mempool.c:186!
...
> Jun 3 13:33:10 amanda kernel: [159291.970496] Call Trace:
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a59e2>]
> ceph_msgpool_destroy+0x12/0x20 [libceph]
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a7fc3>]
> ceph_osdc_stop+0x83/0xb0 [libceph]
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a158d>]
> ceph_destroy_client+0x1d/0x60 [libceph]
And then, the mempool destroy goes wrong. And that's because...
/**
 * mempool_destroy - deallocate a memory pool
 * @pool:      pointer to the memory pool which was allocated via
 *             mempool_create().
 *
 * this function only sleeps if the free_fn() function sleeps. The caller
 * has to guarantee that all elements have been returned to the pool (ie:
 * freed) prior to calling mempool_destroy().
 */
void mempool_destroy(mempool_t *pool)
{
	/* Check for outstanding elements */
	BUG_ON(pool->curr_nr != pool->min_nr);
	free_pool(pool);
}
Not every element had been returned to the pool before we tried to
destroy it. The failing call is one of these two:
	ceph_msgpool_destroy(&osdc->msgpool_op);
	ceph_msgpool_destroy(&osdc->msgpool_op_reply);
but I can't easily tell which one.
Summary so far: we're leaking msgpool_op or msgpool_op_reply entries
when unmounting kclient while out of memory.
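To make the BUG_ON condition concrete, here is a minimal user-space sketch of the reserve accounting. It assumes the usual mempool behaviour: an allocation tries the backing allocator first and only dips into the reserve when that fails, and a free refills the reserve only while it is below min_nr. The names toy_pool, backing_fails and toy_pool_would_bug are made up for illustration; this is a model, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy user-space model of mempool reserve accounting (hypothetical names). */
struct toy_pool {
	int min_nr;          /* elements the reserve is meant to hold */
	int curr_nr;         /* elements currently sitting in the reserve */
	size_t elem_size;    /* size of each element in bytes */
	void **elements;     /* the reserve itself */
	bool backing_fails;  /* set to simulate the backing allocator failing */
};

/* Mirrors the BUG_ON() test: true if the reserve is not full. */
bool toy_pool_would_bug(const struct toy_pool *p)
{
	return p->curr_nr != p->min_nr;
}

/* Free whatever is left in the reserve, then the pool itself. */
void toy_pool_destroy(struct toy_pool *p)
{
	while (p->curr_nr > 0)
		free(p->elements[--p->curr_nr]);
	free(p->elements);
	free(p);
}

/* Create a pool and pre-fill its reserve with min_nr (> 0) elements. */
struct toy_pool *toy_pool_create(int min_nr, size_t elem_size)
{
	struct toy_pool *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->min_nr = min_nr;
	p->elem_size = elem_size;
	p->elements = calloc((size_t)min_nr, sizeof(void *));
	if (!p->elements) {
		free(p);
		return NULL;
	}
	while (p->curr_nr < min_nr) {
		void *e = malloc(elem_size);

		if (!e) {
			toy_pool_destroy(p);
			return NULL;
		}
		p->elements[p->curr_nr++] = e;
	}
	return p;
}

/* Try the backing allocator first; fall back to the reserve on failure. */
void *toy_pool_alloc(struct toy_pool *p)
{
	if (!p->backing_fails) {
		void *e = malloc(p->elem_size);

		if (e)
			return e;
	}
	if (p->curr_nr > 0)
		return p->elements[--p->curr_nr];
	return NULL;
}

/* Refill the reserve if it is short; otherwise really free the element. */
void toy_pool_free(struct toy_pool *p, void *e)
{
	assert(e != NULL);
	if (p->curr_nr < p->min_nr)
		p->elements[p->curr_nr++] = e;
	else
		free(e);
}
```

Under these assumptions, an element that is still outstanding but came from the backing allocator never changes curr_nr, so the check stays quiet. Only an element taken from the reserve during an allocation failure and never returned leaves curr_nr below min_nr, which would fit the earlier "alloc failed" messages preceding the BUG.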
devs: If anyone else has a good idea where this is heading, please
take over.
--
:(){ :|:&};:
* Re: crash on umount
From: Fyodor Ustinov @ 2011-06-03 16:13 UTC
To: Tommi Virtanen; +Cc: ceph-devel
On 06/03/2011 06:39 PM, Tommi Virtanen wrote:
> On Fri, Jun 03, 2011 at 01:40:44PM +0300, Fyodor Ustinov wrote:
>> Hi!
>>
>> kernel 2.6.39
>> ceph - 0.28.2
>>
>> In sysctl.conf set
>> vm.min_free_kbytes=262144
>>
>> Jun 2 03:08:17 amanda kernel: [35398.757055] libceph: msg_new can't
>> allocate 4096 bytes
> ... so first you run out of memory ...
I do not understand one thing: why, on a server with 2G of RAM and
vm.min_free_kbytes set to 256M, can ceph (and _only_ ceph) not allocate
memory?
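For scale, a small sketch of the arithmetic behind the setting quoted above. It assumes "2G" means 2 GiB of RAM and that the vm.min_free_kbytes value is counted in KiB; the helper names are made up for illustration.

```c
#include <assert.h>

/* Convert a size in KiB to whole MiB. */
unsigned long kib_to_mib(unsigned long kib)
{
	return kib / 1024UL;
}

/* Share of total_kib taken by part_kib, in tenths of a percent. */
unsigned long share_permille(unsigned long part_kib, unsigned long total_kib)
{
	assert(total_kib != 0);
	return part_kib * 1000UL / total_kib;
}
```

With those assumptions, kib_to_mib(262144) gives 256 and share_permille(262144, 2097152) gives 125, so the setting asks the kernel to keep about 256 MiB, or 12.5% of RAM, free.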
WBR,
Fyodor.
* Re: crash on umount
From: Sage Weil @ 2011-06-03 16:26 UTC
To: Tommi Virtanen; +Cc: Fyodor Ustinov, ceph-devel
On Fri, 3 Jun 2011, Tommi Virtanen wrote:
> On Fri, Jun 03, 2011 at 01:40:44PM +0300, Fyodor Ustinov wrote:
> > Hi!
> >
> > kernel 2.6.39
> > ceph - 0.28.2
> >
> > In sysctl.conf set
> > vm.min_free_kbytes=262144
> >
> > Jun 2 03:08:17 amanda kernel: [35398.757055] libceph: msg_new can't
> > allocate 4096 bytes
>
> ... so first you run out of memory ...
>
> > Jun 3 13:33:10 amanda kernel: [159291.960881] ------------[ cut
> > here ]------------
> > Jun 3 13:33:10 amanda kernel: [159291.960930] kernel BUG at
> > mm/mempool.c:186!
> ...
> > Jun 3 13:33:10 amanda kernel: [159291.970496] Call Trace:
> > Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a59e2>]
> > ceph_msgpool_destroy+0x12/0x20 [libceph]
> > Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a7fc3>]
> > ceph_osdc_stop+0x83/0xb0 [libceph]
> > Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a158d>]
> > ceph_destroy_client+0x1d/0x60 [libceph]
>
> And then, the mempool destroy goes wrong. And that's because...
>
> /**
>  * mempool_destroy - deallocate a memory pool
>  * @pool:      pointer to the memory pool which was allocated via
>  *             mempool_create().
>  *
>  * this function only sleeps if the free_fn() function sleeps. The caller
>  * has to guarantee that all elements have been returned to the pool (ie:
>  * freed) prior to calling mempool_destroy().
>  */
> void mempool_destroy(mempool_t *pool)
> {
> 	/* Check for outstanding elements */
> 	BUG_ON(pool->curr_nr != pool->min_nr);
> 	free_pool(pool);
> }
>
> Not every element had been returned to the pool before we tried to
> destroy it. The failing call is one of these two:
>
> 	ceph_msgpool_destroy(&osdc->msgpool_op);
> 	ceph_msgpool_destroy(&osdc->msgpool_op_reply);
>
> but I can't easily tell which one.
>
> Summary so far: we're leaking msgpool_op or msgpool_op_reply entries
> when unmounting kclient while out of memory.
>
> devs: If anyone else has a good idea where this is heading, please
> take over.
Argh, I just saw this yesterday (#1136) but saved the wrong log file.
I'll see if I can reproduce.
sage