From: Olivier Bonvalet <ceph.list@daevel.fr>
To: Alex Elder <elder@ieee.org>
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>,
Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: Issue #5876 : assertion failure in rbd_img_obj_callback()
Date: Wed, 26 Mar 2014 03:40:25 +0100
Message-ID: <1395801625.2076.52.camel@localhost>
In-Reply-To: <5332339A.8030000@ieee.org>
On Tuesday 25 March 2014 at 20:55 -0500, Alex Elder wrote:
> On 03/25/2014 08:50 PM, Olivier Bonvalet wrote:
> > On Wednesday 26 March 2014 at 02:33 +0100, Olivier Bonvalet wrote:
> >> Thanks for your patch.
> >>
> >> This is an output of a crash case :
> >>
> >> Mar 26 02:31:18 alg kernel: [ 965.366895] rbd_img_obj_callback: bad image object request information:
> >> Mar 26 02:31:18 alg kernel: [ 965.366905] obj_request ffff880224bc9528
> >> Mar 26 02:31:18 alg kernel: [ 965.366909] ->object_name <(null)>
> >> Mar 26 02:31:18 alg kernel: [ 965.366913] ->offset 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366917] ->length 4096
> >> Mar 26 02:31:18 alg kernel: [ 965.366921] ->type 0x1
> >> Mar 26 02:31:18 alg kernel: [ 965.366925] ->flags 0x3
> >> Mar 26 02:31:18 alg kernel: [ 965.366929] ->img_request (null)
> >> Mar 26 02:31:18 alg kernel: [ 965.366933] ->which 4294967295
> >> Mar 26 02:31:18 alg kernel: [ 965.366936] ->xferred 4096
> >> Mar 26 02:31:18 alg kernel: [ 965.366940] ->result 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366943] ->kref 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366947] img_request ffff880222f4fb50
> >> Mar 26 02:31:18 alg kernel: [ 965.366950] ->snap 0xfffffffffffffffe
> >> Mar 26 02:31:18 alg kernel: [ 965.366954] ->offset 1417662464
> >> Mar 26 02:31:18 alg kernel: [ 965.366957] ->length 16384
> >> Mar 26 02:31:18 alg kernel: [ 965.366960] ->flags 0x0
> >> Mar 26 02:31:18 alg kernel: [ 965.366963] ->obj_request_count 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366966] ->next_completion 2
> >> Mar 26 02:31:18 alg kernel: [ 965.366969] ->xferred 16384
> >> Mar 26 02:31:18 alg kernel: [ 965.366973] ->result 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366976] ->obj_requests head ffff880222f4fbb0
> >> Mar 26 02:31:18 alg kernel: [ 965.366980] ->kref 0
> >> Mar 26 02:31:18 alg kernel: [ 965.366985]
> >> Mar 26 02:31:18 alg kernel: [ 965.366985] Assertion failure in rbd_img_obj_callback() at line 2165:
> >> Mar 26 02:31:18 alg kernel: [ 965.366985]
> >> Mar 26 02:31:18 alg kernel: [ 965.366985] rbd_assert(which == img_request->next_completion);
> >> Mar 26 02:31:18 alg kernel: [ 965.366985]
> >> Mar 26 02:31:18 alg kernel: [ 965.367185] ------------[ cut here ]------------
> >> Mar 26 02:31:18 alg kernel: [ 965.367241] kernel BUG at drivers/block/rbd.c:2165!
> >>
> >>
> >> I hope it can help.
> >>
> >>
>
>
> Thanks for sending these.
>
> >
> > and a second one, very similar :
> >
> > Mar 26 02:48:27 alg kernel: [ 681.167833] rbd_img_obj_callback: bad image object request information:
> > Mar 26 02:48:27 alg kernel: [ 681.167836] obj_request ffff88022e1e2828
> > Mar 26 02:48:27 alg kernel: [ 681.167837] ->object_name <(null)>
> > Mar 26 02:48:27 alg kernel: [ 681.167838] ->offset 0
> > Mar 26 02:48:27 alg kernel: [ 681.167839] ->length 4096
> > Mar 26 02:48:27 alg kernel: [ 681.167840] ->type 0x1
> > Mar 26 02:48:27 alg kernel: [ 681.167840] ->flags 0x3
> > Mar 26 02:48:27 alg kernel: [ 681.167841] ->img_request (null)
> > Mar 26 02:48:27 alg kernel: [ 681.167842] ->which 4294967295
> > Mar 26 02:48:27 alg kernel: [ 681.167843] ->xferred 4096
> > Mar 26 02:48:27 alg kernel: [ 681.167844] ->result 0
> > Mar 26 02:48:27 alg kernel: [ 681.167844] ->kref 0
>
> This confirms the reference count of the object request has gone
> to zero. This object request has already been destroyed (yet
> we're handling a callback for it).
>
> > Mar 26 02:48:27 alg kernel: [ 681.167845] img_request ffff88021f555f10
> > Mar 26 02:48:27 alg kernel: [ 681.167846] ->snap 0xfffffffffffffffe
> > Mar 26 02:48:27 alg kernel: [ 681.167847] ->offset 28072464384
> > Mar 26 02:48:27 alg kernel: [ 681.167847] ->length 16384
> > Mar 26 02:48:27 alg kernel: [ 681.167848] ->flags 0x0
> > Mar 26 02:48:27 alg kernel: [ 681.167849] ->obj_request_count 0
> > Mar 26 02:48:27 alg kernel: [ 681.167850] ->next_completion 2
> > Mar 26 02:48:27 alg kernel: [ 681.167850] ->xferred 16384
> > Mar 26 02:48:27 alg kernel: [ 681.167851] ->result 0
> > Mar 26 02:48:27 alg kernel: [ 681.167852] ->obj_requests head ffff88021f555f70
>
> The object request list is empty.
>
> > Mar 26 02:48:27 alg kernel: [ 681.167853] ->kref 0
>
> This confirms the reference count of the image request has gone
> to zero. So not only has the object request already completed,
> the image request has as well.
>
> I'm almost done composing a very large e-mail with some detailed
> analysis. No answer quite yet, but I am certain that we're
> getting duplicate callbacks on the second object request of
> an image request that spans two objects. That should help
> narrow the search for the root cause.
>
> -Alex
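The duplicate-callback scenario described in the quoted analysis can be sketched in userspace C. This is only an illustration under simplifying assumptions, not the code in drivers/block/rbd.c: the struct layouts, `attach()`, `obj_callback()` and `duplicate_callback_demo()` are hypothetical names, a plain integer stands in for the kref, and the callback returns -1 at the point where the driver's rbd_assert() would fire. The field names `which`, `next_completion` and `obj_request_count` are taken from the logs above, and the sentinel value is chosen to match the `->which 4294967295` they show.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Sentinel index for an object request that belongs to no image request.
 * UINT32_MAX prints as 4294967295, the "->which" value in the logs. */
#define BAD_WHICH UINT32_MAX

struct img_request {
    uint32_t obj_request_count;  /* object requests still attached */
    uint32_t next_completion;    /* index expected to complete next */
};

struct obj_request {
    uint32_t which;  /* position within the image request */
    int refs;        /* simplified stand-in for a kref */
};

/* Attach object request number `which` to the image request. */
static void attach(struct img_request *img, struct obj_request *obj,
                   uint32_t which)
{
    obj->which = which;
    obj->refs = 1;
    img->obj_request_count++;
}

/* Completion callback. Returns 0 when the completion arrives in order,
 * -1 where an ordering assertion would fail. */
static int obj_callback(struct img_request *img, struct obj_request *obj)
{
    assert(img != NULL && obj != NULL);

    if (obj->which != img->next_completion)
        return -1;  /* stale or duplicate completion */

    img->next_completion++;
    img->obj_request_count--;
    obj->which = BAD_WHICH;  /* detached from the image request */
    obj->refs--;             /* last reference dropped */
    return 0;
}

/* Complete both object requests, then deliver the second callback again. */
static int duplicate_callback_demo(void)
{
    struct img_request img = { 0, 0 };
    struct obj_request objs[2];
    int rc;

    attach(&img, &objs[0], 0);
    attach(&img, &objs[1], 1);

    rc = obj_callback(&img, &objs[0]);
    assert(rc == 0);
    rc = obj_callback(&img, &objs[1]);
    assert(rc == 0);

    /* Duplicate: the object now has which == BAD_WHICH and refs == 0,
     * while the image request expects index 2. */
    rc = obj_callback(&img, &objs[1]);
    printf("which=%u next_completion=%u refs=%d rc=%d\n",
           (unsigned)objs[1].which, (unsigned)img.next_completion,
           objs[1].refs, rc);
    return rc;
}
```

Calling `duplicate_callback_demo()` from a `main()` ends in the same state as the dumps: an object request with `which` 4294967295 and a zero reference count, checked against an image request whose `next_completion` is 2 and whose object request count is 0.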
Thanks again for taking the time to analyze this problem.
All my RBD images have daily snapshots; could this bug be related to
snapshots?
Maybe it's a stupid question, but is there a workaround I could use
to mitigate the problem in production until a proper fix is found?