From: Olivier Bonvalet <ceph.list@daevel.fr>
To: ceph-devel@vger.kernel.org
Subject: Re: Issue #5876 : assertion failure in rbd_img_obj_callback()
Date: Sat, 05 Apr 2014 03:16:19 +0200
Message-ID: <1396660579.2130.103.camel@localhost>
In-Reply-To: <1395736765.2823.29.camel@localhost>
On Tuesday 25 March 2014 at 09:39 +0100, Olivier Bonvalet wrote:
> Hi,
>
> what can/should I do to help fix that problem ?
>
> for now, RBD kernel client hang on :
> Assertion failure in rbd_img_obj_callback() at line 2131:
> rbd_assert(which >= img_request->next_completion);
>
> or on :
> Assertion failure in rbd_img_obj_callback() at line 2127:
> rbd_assert(img_request != NULL);
>
>
> I have both case at least once per week, on latest 3.13.5 kernels.
>
> It seems that the problem occurs only on more loaded servers (I have 4
> near same servers, and crash occurs on two of them. If I move the VM,
> crash follows...).
>
> Olivier
>
> --
Hi,
After some days without any problems, RBD crashed tonight:
Apr 5 02:52:24 rurkh kernel: [799426.461742]
Apr 5 02:52:24 rurkh kernel: [799426.461742] Assertion failure in rbd_img_obj_callback() at line 2128:
Apr 5 02:52:24 rurkh kernel: [799426.461742]
Apr 5 02:52:24 rurkh kernel: [799426.461742] rbd_assert(img_request->obj_request_count > 0);
Apr 5 02:52:24 rurkh kernel: [799426.461742]
Apr 5 02:52:24 rurkh kernel: [799426.461958] ------------[ cut here ]------------
Apr 5 02:52:24 rurkh kernel: [799426.461997] kernel BUG at drivers/block/rbd.c:2128!
Apr 5 02:52:24 rurkh kernel: [799426.462036] invalid opcode: 0000 [#1] SMP
Apr 5 02:52:24 rurkh kernel: [799426.462080] Modules linked in: cbc rbd libceph xen_gntdev xt_physdev iptable_filter ip_tables x_tables xfs libcrc32c bridge loop iTCO_wdt gpio_ich iTCO_vendor_support serio_raw sb_edac edac_core evdev i2c_i801 lpc_ich mfd_core ioatdma shpchp ipmi_si ipmi_msghandler wmi ac button dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_common isci ahci ehci_pci libsas libahci megaraid_sas ehci_hcd libata scsi_transport_sas igb usbcore scsi_mod i2c_algo_bit ixgbe i2c_core usb_common dca ptp pps_core mdio
Apr 5 02:52:24 rurkh kernel: [799426.462579] CPU: 0 PID: 15975 Comm: kworker/0:0 Not tainted 3.13-dae-dom0 #24
Apr 5 02:52:24 rurkh kernel: [799426.462644] Hardware name: Supermicro X9DRW-7TPF+/X9DRW-7TPF+, BIOS 3.0 07/24/2013
Apr 5 02:52:24 rurkh kernel: [799426.462717] Workqueue: ceph-msgr con_work [libceph]
Apr 5 02:52:24 rurkh kernel: [799426.462759] task: ffff88024cd9a8a0 ti: ffff88021a4e4000 task.ti: ffff88021a4e4000
Apr 5 02:52:24 rurkh kernel: [799426.462825] RIP: e030:[<ffffffffa0305ae8>] [<ffffffffa0305ae8>] rbd_img_obj_callback+0x91/0x3a2 [rbd]
Apr 5 02:52:24 rurkh kernel: [799426.462901] RSP: e02b:ffff88021a4e5ce8 EFLAGS: 00010282
Apr 5 02:52:24 rurkh kernel: [799426.462940] RAX: 000000000000006d RBX: ffff88023f8f6ec8 RCX: 0000000000000000
Apr 5 02:52:24 rurkh kernel: [799426.463005] RDX: ffff88027fe0eb50 RSI: ffff88027fe0e1a8 RDI: ffff88021a4e02a8
Apr 5 02:52:24 rurkh kernel: [799426.463069] RBP: ffff88021c90a718 R08: 0000000000000000 R09: 0000000000000000
Apr 5 02:52:24 rurkh kernel: [799426.463134] R10: 0000000000000000 R11: 000000000000084e R12: 0000000000000001
Apr 5 02:52:24 rurkh kernel: [799426.463197] R13: 0000000000000000 R14: ffff88025584a130 R15: 0000000000000000
Apr 5 02:52:24 rurkh kernel: [799426.481060] FS: 00007f1c6138f720(0000) GS:ffff88027fe00000(0000) knlGS:0000000000000000
Apr 5 02:52:24 rurkh kernel: [799426.481130] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 5 02:52:24 rurkh kernel: [799426.481170] CR2: 00007f1c6139f000 CR3: 000000023825c000 CR4: 0000000000042660
Apr 5 02:52:24 rurkh kernel: [799426.481235] Stack:
Apr 5 02:52:24 rurkh kernel: [799426.481266] 000000000000000d ffff880254da107d ffffffffffffffff ffff880254da1048
Apr 5 02:52:24 rurkh kernel: [799426.481349] ffff88025584a128 ffff88026dc59718 0000000000000000 ffff88025584a130
Apr 5 02:52:24 rurkh kernel: [799426.481429] 0000000000000000 ffffffffa02e4595 0000000000000015 ffff88026dc59770
Apr 5 02:52:24 rurkh kernel: [799426.481510] Call Trace:
Apr 5 02:52:24 rurkh kernel: [799426.481554] [<ffffffffa02e4595>] ? dispatch+0x3e4/0x55e [libceph]
Apr 5 02:52:24 rurkh kernel: [799426.481600] [<ffffffffa02df0fc>] ? con_work+0xf6e/0x1a65 [libceph]
Apr 5 02:52:24 rurkh kernel: [799426.481646] [<ffffffff81051f83>] ? mmdrop+0xd/0x1c
Apr 5 02:52:24 rurkh kernel: [799426.481687] [<ffffffff8105265e>] ? finish_task_switch+0x4d/0x83
Apr 5 02:52:24 rurkh kernel: [799426.481732] [<ffffffff810484d7>] ? process_one_work+0x15a/0x214
Apr 5 02:52:24 rurkh kernel: [799426.481775] [<ffffffff8104895b>] ? worker_thread+0x139/0x1de
Apr 5 02:52:24 rurkh kernel: [799426.481817] [<ffffffff81048822>] ? rescuer_thread+0x26e/0x26e
Apr 5 02:52:24 rurkh kernel: [799426.481859] [<ffffffff8104cff6>] ? kthread+0x9e/0xa6
Apr 5 02:52:24 rurkh kernel: [799426.481900] [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
Apr 5 02:52:24 rurkh kernel: [799426.481944] [<ffffffff8137260c>] ? ret_from_fork+0x7c/0xb0
Apr 5 02:52:24 rurkh kernel: [799426.481985] [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
Apr 5 02:52:24 rurkh kernel: [799426.482025] Code: 26 06 e1 0f 0b 8b 45 5c 85 c0 75 21 48 c7 c1 66 88 30 a0 ba 50 08 00 00 48 c7 c6 50 99 30 a0 48 c7 c7 1f 81 30 a0 e8 5b 26 06 e1 <0f> 0b 41 83 fc ff 75 23 48 c7 c1 f4 8b 30 a0 ba 51 08 00 00 31
Apr 5 02:52:24 rurkh kernel: [799426.482413] RIP [<ffffffffa0305ae8>] rbd_img_obj_callback+0x91/0x3a2 [rbd]
Apr 5 02:52:24 rurkh kernel: [799426.482462] RSP <ffff88021a4e5ce8>
Apr 5 02:52:24 rurkh kernel: [799426.483907] ---[ end trace 4aea8b8c107c24be ]---
At that time there was a lot of I/O, because of backups running in the VMs
(but no RBD snapshot was being created or removed).
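
For tallying which assertion fires across several crashes, a minimal Python
sketch along these lines could pull the function, line number and failed
expression out of the syslog. The names find_assertions and sample are
hypothetical, and the two patterns only assume the message format seen in
the log above.

```python
import re

# Patterns for the two-part message seen in the log above (assumed format).
LOCATION = re.compile(r"Assertion failure in (\w+)\(\) at line (\d+):")
EXPRESSION = re.compile(r"rbd_assert\((.+)\);")


def find_assertions(lines):
    """Return (function, line, expression) for each assertion failure found."""
    results = []
    pending = None  # location line seen, expression line not yet seen
    for line in lines:
        loc = LOCATION.search(line)
        if loc:
            pending = (loc.group(1), int(loc.group(2)))
            continue
        expr = EXPRESSION.search(line)
        if expr and pending:
            results.append((pending[0], pending[1], expr.group(1)))
            pending = None
    return results


# Three lines copied from the log above, used as sample input.
sample = [
    "Apr 5 02:52:24 rurkh kernel: [799426.461742] Assertion failure in rbd_img_obj_callback() at line 2128:",
    "Apr 5 02:52:24 rurkh kernel: [799426.461742]",
    "Apr 5 02:52:24 rurkh kernel: [799426.461742] rbd_assert(img_request->obj_request_count > 0);",
]

for func, lineno, expr in find_assertions(sample):
    print(f"{func}:{lineno}: {expr}")
```

Run over the full kern.log of each server, this would give one line per
crash, which makes it easier to see whether the three assertions reported so
far occur with similar frequency.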