From: Alex Elder <elder@ieee.org>
To: Olivier Bonvalet <ceph.list@daevel.fr>
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>,
Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: Issue #5876 : assertion failure in rbd_img_obj_callback()
Date: Tue, 25 Mar 2014 12:21:57 -0500
Message-ID: <5331BB35.7070107@ieee.org>
In-Reply-To: <1395767705.9967.5.camel@localhost>
On 03/25/2014 12:15 PM, Olivier Bonvalet wrote:
> On Tuesday 25 March 2014 at 08:31 -0500, Alex Elder wrote:
>> ...
>>>> So, a (partial) fix can be this patch ?
>>>>
>>>
>>>
>>> Yes, roughly. I'd do the following instead. It would be great
>>> to learn whether it eliminates the one form of assertion failure
>>> you were seeing.
>>>
>>> -Alex
>>>
>>
>>
>> Strike that, my last patch was dead wrong. Sorry. Try this:
>>
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -2128,11 +2128,11 @@ static void rbd_img_obj_callback(struct
>> rbd_assert(img_request->obj_request_count > 0);
>> rbd_assert(which != BAD_WHICH);
>> rbd_assert(which < img_request->obj_request_count);
>> - rbd_assert(which >= img_request->next_completion);
>>
>> spin_lock_irq(&img_request->completion_lock);
>> - if (which != img_request->next_completion)
>> + if (which > img_request->next_completion)
>> goto out;
>> + rbd_assert(which == img_request->next_completion);
>>
>> for_each_obj_request_from(img_request, obj_request) {
>> rbd_assert(more);
>>
>>
>>
>
> Well, it just hung:

It's great to know you can reproduce this.

Let me put together another quick patch that might supply a bit
more information when it happens. I'll send something shortly.

-Alex

> Mar 25 17:58:36 rurkh kernel: [ 4135.913079] Assertion failure in rbd_img_obj_callback() at line 2135:
> Mar 25 17:58:36 rurkh kernel: [ 4135.913079]
> Mar 25 17:58:36 rurkh kernel: [ 4135.913079] rbd_assert(which == img_request->next_completion);
> Mar 25 17:58:36 rurkh kernel: [ 4135.913079]
> Mar 25 17:58:36 rurkh kernel: [ 4135.913252] ------------[ cut here ]------------
> Mar 25 17:58:36 rurkh kernel: [ 4135.913288] kernel BUG at drivers/block/rbd.c:2135!
> Mar 25 17:58:36 rurkh kernel: [ 4135.913331] invalid opcode: 0000 [#1] SMP
> Mar 25 17:58:36 rurkh kernel: [ 4135.913373] Modules linked in: cbc rbd libceph xen_gntdev xt_physdev iptable_filter ip_tables x_tables xfs libcrc32c bridge loop iTCO_wdt iTCO_vendor_support gpio_ich serio_raw sb_edac edac_core i2c_i801 lpc_ich mfd_core evdev ioatdma shpchp ipmi_si ipmi_msghandler wmi ac button dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_common isci ahci libsas libahci megaraid_sas libata scsi_transport_sas ehci_pci igb scsi_mod ehci_hcd ixgbe i2c_algo_bit i2c_core usbcore dca ptp usb_common pps_core mdio
> Mar 25 17:58:36 rurkh kernel: [ 4135.913821] CPU: 0 PID: 30629 Comm: kworker/0:1 Not tainted 3.13-dae-dom0 #20
> Mar 25 17:58:36 rurkh kernel: [ 4135.913863] Hardware name: Supermicro X9DRW-7TPF+/X9DRW-7TPF+, BIOS 3.0 07/24/2013
> Mar 25 17:58:36 rurkh kernel: [ 4135.913931] Workqueue: ceph-msgr con_work [libceph]
> Mar 25 17:58:36 rurkh kernel: [ 4135.913970] task: ffff88027374b760 ti: ffff88024933c000 task.ti: ffff88024933c000
> Mar 25 17:58:36 rurkh kernel: [ 4135.914033] RIP: e030:[<ffffffffa0304b86>] [<ffffffffa0304b86>] rbd_img_obj_callback+0x12f/0x3d0 [rbd]
> Mar 25 17:58:36 rurkh kernel: [ 4135.914104] RSP: e02b:ffff88024933dce8 EFLAGS: 00010082
> Mar 25 17:58:36 rurkh kernel: [ 4135.914141] RAX: 0000000000000070 RBX: ffff88024d2dcc48 RCX: 0000000000000000
> Mar 25 17:58:36 rurkh kernel: [ 4135.914182] RDX: ffff88027fe0eb50 RSI: ffff88027fe0e1a8 RDI: ffff8802493300a8
> Mar 25 17:58:36 rurkh kernel: [ 4135.914223] RBP: ffff88024ccc3e20 R08: 0000000000000000 R09: 0000000000000000
> Mar 25 17:58:36 rurkh kernel: [ 4135.914265] R10: 0000000000000000 R11: 0000000000000098 R12: 0000000000000001
> Mar 25 17:58:36 rurkh kernel: [ 4135.914306] R13: 0000000000000000 R14: ffff88027144b1d0 R15: 0000000000000000
> Mar 25 17:58:36 rurkh kernel: [ 4135.914351] FS: 00007f6ec996f700(0000) GS:ffff88027fe00000(0000) knlGS:0000000000000000
> Mar 25 17:58:36 rurkh kernel: [ 4135.914415] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 25 17:58:36 rurkh kernel: [ 4135.914453] CR2: 0000000001ff1b10 CR3: 00000002492b3000 CR4: 0000000000042660
> Mar 25 17:58:36 rurkh kernel: [ 4135.914495] Stack:
> Mar 25 17:58:36 rurkh kernel: [ 4135.914524] ffff88024ccc3e5c ffff88024a48eb5d ffffffffffffffff ffff88024a48eb28
> Mar 25 17:58:36 rurkh kernel: [ 4135.914610] ffff88027144b1c8 ffff8802656cc718 0000000000000000 ffff88027144b1d0
> Mar 25 17:58:36 rurkh kernel: [ 4135.914689] 0000000000000000 ffffffffa02e3595 0000000000000015 ffff8802656cc770
> Mar 25 17:58:36 rurkh kernel: [ 4135.914768] Call Trace:
> Mar 25 17:58:36 rurkh kernel: [ 4135.914809] [<ffffffffa02e3595>] ? dispatch+0x3e4/0x55e [libceph]
> Mar 25 17:58:36 rurkh kernel: [ 4135.914854] [<ffffffffa02de0fc>] ? con_work+0xf6e/0x1a65 [libceph]
> Mar 25 17:58:36 rurkh kernel: [ 4135.914901] [<ffffffff81005f00>] ? xen_timer_resume+0x4f/0x4f
> Mar 25 17:58:36 rurkh kernel: [ 4135.914944] [<ffffffff81051f83>] ? mmdrop+0xd/0x1c
> Mar 25 17:58:36 rurkh kernel: [ 4135.914984] [<ffffffff8105265e>] ? finish_task_switch+0x4d/0x83
> Mar 25 17:58:36 rurkh kernel: [ 4135.915029] [<ffffffff810484d7>] ? process_one_work+0x15a/0x214
> Mar 25 17:58:36 rurkh kernel: [ 4135.915072] [<ffffffff8104895b>] ? worker_thread+0x139/0x1de
> Mar 25 17:58:36 rurkh kernel: [ 4135.915113] [<ffffffff81048822>] ? rescuer_thread+0x26e/0x26e
> Mar 25 17:58:36 rurkh kernel: [ 4135.915155] [<ffffffff8104cff6>] ? kthread+0x9e/0xa6
> Mar 25 17:58:36 rurkh kernel: [ 4135.915195] [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
> Mar 25 17:58:36 rurkh kernel: [ 4135.915238] [<ffffffff8137260c>] ? ret_from_fork+0x7c/0xb0
> Mar 25 17:58:36 rurkh kernel: [ 4135.915279] [<ffffffff8104cf58>] ? __kthread_parkme+0x55/0x55
> Mar 25 17:58:36 rurkh kernel: [ 4135.915319] Code: 41 b5 01 48 89 44 24 08 eb 3b 48 c7 c1 2e 7c 30 a0 ba 57 08 00 00 31 c0 48 c7 c6 80 89 30 a0 48 c7 c7 1f 71 30 a0 e8 bd 35 06 e1 <0f> 0b 41 8b 45 5c ff c8 39 43 40 41 0f 92 c5 48 8b 5b 30 41 ff
> Mar 25 17:58:36 rurkh kernel: [ 4135.915701] RIP [<ffffffffa0304b86>] rbd_img_obj_callback+0x12f/0x3d0 [rbd]
> Mar 25 17:58:36 rurkh kernel: [ 4135.915749] RSP <ffff88024933dce8>
> Mar 25 17:58:36 rurkh kernel: [ 4135.916087] ---[ end trace ff823e5e2d6cd4e9 ]--
>
>
>
>
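
For readers following the patch quoted above, here is a minimal user-space
sketch of the in-order completion check it introduces. It is not the driver
code: the struct, the `done[]` array, and the names ending in `_sketch` are
hypothetical stand-ins, plain assert() replaces rbd_assert(), and the
completion_lock is only marked by a comment. With the patched check, a
callback for an index greater than next_completion bails out early, so the
new assertion can only fire when `which` is already less than
next_completion, which is the condition the log above reports.

```c
#include <assert.h>
#include <stdbool.h>

#define OBJ_COUNT 4	/* assumed size, for illustration only */

/* Hypothetical stand-in for the fields of the image request used here. */
struct img_request_sketch {
	unsigned int obj_request_count;
	unsigned int next_completion;	/* lowest index not yet retired */
	bool done[OBJ_COUNT];		/* which object requests have finished */
};

/*
 * Models the patched check. Returns how many object requests this
 * call retired, in index order.
 */
unsigned int obj_callback_sketch(struct img_request_sketch *img,
				 unsigned int which)
{
	unsigned int retired = 0;

	assert(img->obj_request_count > 0);
	assert(which < img->obj_request_count);
	img->done[which] = true;

	/* The driver takes img_request->completion_lock at this point. */
	if (which > img->next_completion)
		return 0;	/* an earlier request is still outstanding */
	assert(which == img->next_completion);

	/* Retire this request and any later ones that already finished. */
	while (img->next_completion < img->obj_request_count &&
	       img->done[img->next_completion]) {
		img->next_completion++;
		retired++;
	}
	return retired;
}
```

Starting from a zeroed struct with obj_request_count set to OBJ_COUNT,
calling the function with indices 2, 0, 1, 3 retires 0, 1, 2, and 1
requests respectively. The sketch is single-threaded, so it cannot
reproduce the failure itself; it only shows which ordering the assertion
expects.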