From: Martin Mailand <martin@tuxadero.com>
To: Sage Weil <sage@newdream.net>
Cc: Gregory Farnum <gregory.farnum@dreamhost.com>,
	ceph-devel@vger.kernel.org
Subject: Re: osd/OSD.cc: 5534: FAILED assert(pending_ops > 0)
Date: Thu, 24 Nov 2011 14:23:44 +0100
Message-ID: <4ECE4560.9090500@tuxadero.com>
In-Reply-To: <Pine.LNX.4.64.1111161311220.6368@cobra.newdream.net>

Hi Sage,
I hit it again, this time on another OSD.

ceph version 0.38-181-g2e19550 (commit:2e195500b5d3a8ab8512bcf2a219a6b7ff922c97)

Thread 1 (Thread 2951):
#0  0x00007f36bbb41b3b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00000000005f5852 in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2  0x00000000005f5e4a in handle_fatal_signal (signum=6) at global/signal_handler.cc:106
#3  <signal handler called>
#4  0x00007f36ba0c2d05 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007f36ba0c6ab6 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f36ba9796dd in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f36ba977926 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007f36ba977953 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007f36ba977a5e in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00000000005f6956 in ceph::__ceph_assert_fail (assertion=<value optimized out>, file=<value optimized out>, line=<value optimized out>, func=<value optimized out>) at common/assert.cc:70
#11 0x000000000056616a in OSD::dequeue_op (this=0x25b0000, pg=<value optimized out>) at osd/OSD.cc:5518
#12 0x00000000005d4406 in ThreadPool::worker (this=0x25b0408) at common/WorkQueue.cc:54
#13 0x00000000005822dd in ThreadPool::WorkThread::entry (this=<value optimized out>) at ./common/WorkQueue.h:120
#14 0x00007f36bbb38d8c in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#15 0x00007f36ba17504d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 2951)]#0  0x00007f36bbb41b3b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) frame 11
#11 0x000000000056616a in OSD::dequeue_op (this=0x25b0000, pg=<value optimized out>) at osd/OSD.cc:5518
5518    osd/OSD.cc: No such file or directory.
         in osd/OSD.cc
(gdb) p pending_ops
$1 = 0
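
So the assert fires because a worker thread enters OSD::dequeue_op while pending_ops is already 0. To make the invariant in question concrete, here is a minimal sketch of a queue whose counter is supposed to mirror its length. This is not the actual Ceph code: OpQueueSketch, enqueue, dequeue and the int payload are hypothetical names made up for illustration, and it assumes the counter and the queue are only ever touched under one lock.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for an op queue; NOT the real OSD implementation.
struct OpQueueSketch {
  std::mutex lock;
  std::deque<int> queue;  // placeholder payload (the real queue holds PGs)
  int pending_ops = 0;    // counter that is expected to equal queue.size()

  // Invariant check; the caller must already hold `lock`.
  bool consistent() const {
    return pending_ops >= 0 &&
           static_cast<std::size_t>(pending_ops) == queue.size();
  }

  void enqueue(int item) {
    std::lock_guard<std::mutex> l(lock);
    queue.push_back(item);
    ++pending_ops;
    assert(consistent());
  }

  // Returns false when there is nothing to dequeue. The crash in this
  // thread corresponds to a worker being woken in that state.
  bool dequeue(int *out) {
    std::lock_guard<std::mutex> l(lock);
    assert(consistent());
    if (pending_ops == 0)
      return false;
    *out = queue.front();
    queue.pop_front();
    --pending_ops;
    return true;
  }
};
```

In this sketch a dequeue on an empty queue returns false rather than aborting; the point is only that the counter and the queue length change together under the same lock, which is the property that appears to be violated here.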



-martin


On 16.11.2011 22:12, Sage Weil wrote:
> Hi Martin,
>
> I've reread the code twice now and it's really not clear to me how
> pending_ops could get out of sync with the actual queue size.  I've pushed
> a couple of patches that remove surrounding dead code and add an
> additional assert sanity check to master.  Have you seen this again, or
> just that once?
>
> Opened http://tracker.newdream.net/issues/1727
>
> Thanks-
> sage
>
>
> On Wed, 16 Nov 2011, Martin Mailand wrote:
>
>> Hi,
>> So, after a little help from Greg:
>>
>> (gdb) print pending_ops
>> $1 = 0
>>
>> -martin
>>
>> Sage Weil wrote:
>>> On Mon, 14 Nov 2011, Gregory Farnum wrote:
>>>> It's not a big deal; logging is expensive. :) Just a backtrace isn't a
>>>> lot to go on, but it's better than nothing!
>>>>
>>>> On Mon, Nov 14, 2011 at 11:45 AM, Martin Mailand <martin@tuxadero.com>
>>>> wrote:
>>>>> Hi Gregory,
>>>>> I do not have more at the moment. As I cannot leave the debug log on
>>>>> all the time, would a core dump be the best solution?
>>>
>>> I'm mainly interested in whether pending_ops is 0 or < 0.  A 'thread apply
>>> all bt' may also be useful.
>>>
>>> Thanks!
>>> sage
>>>
>>>
>>>>> -martin
>>>>>
>>>>> Gregory Farnum wrote:
>>>>>> Do you have any other system state? (More logs, core dumps.)
>>>>>>
>>>>>> Make a bug in the tracker either way so we don't lose track of it. :)
>>>>>> -Greg
>>>>>>
>>>>>> On Mon, Nov 14, 2011 at 6:04 AM, Martin Mailand <martin@tuxadero.com>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>> today one of my OSDs died; the log is:
>>>>>>>
>>>>>>> osd/OSD.cc: In function 'void OSD::dequeue_op(PG*)', in thread '7faeb6139700'
>>>>>>> osd/OSD.cc: 5534: FAILED assert(pending_ops > 0)
>>>>>>>   ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9)
>>>>>>>   1: (OSD::dequeue_op(PG*)+0x4bb) [0x55a4db]
>>>>>>>   2: (ThreadPool::worker()+0x6e6) [0x5b7b16]
>>>>>>>   3: (ThreadPool::WorkThread::entry()+0xd) [0x57398d]
>>>>>>>   4: (()+0x6d8c) [0x7faec4d12d8c]
>>>>>>>   5: (clone()+0x6d) [0x7faec355404d]
>>>>>>>   ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9)
>>>>>>>   1: (OSD::dequeue_op(PG*)+0x4bb) [0x55a4db]
>>>>>>>   2: (ThreadPool::worker()+0x6e6) [0x5b7b16]
>>>>>>>   3: (ThreadPool::WorkThread::entry()+0xd) [0x57398d]
>>>>>>>   4: (()+0x6d8c) [0x7faec4d12d8c]
>>>>>>>   5: (clone()+0x6d) [0x7faec355404d]
>>>>>>> *** Caught signal (Aborted) **
>>>>>>>   in thread 7faeb6139700
>>>>>>>   ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9)
>>>>>>>   1: /usr/bin/ceph-osd() [0x5b8b52]
>>>>>>>   2: (()+0xfc60) [0x7faec4d1bc60]
>>>>>>>   3: (gsignal()+0x35) [0x7faec34a1d05]
>>>>>>>   4: (abort()+0x186) [0x7faec34a5ab6]
>>>>>>>   5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7faec3d586dd]
>>>>>>>   6: (()+0xb9926) [0x7faec3d56926]
>>>>>>>   7: (()+0xb9953) [0x7faec3d56953]
>>>>>>>   8: (()+0xb9a5e) [0x7faec3d56a5e]
>>>>>>>   9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x396) [0x5bddb6]
>>>>>>>   10: (OSD::dequeue_op(PG*)+0x4bb) [0x55a4db]
>>>>>>>   11: (ThreadPool::worker()+0x6e6) [0x5b7b16]
>>>>>>>   12: (ThreadPool::WorkThread::entry()+0xd) [0x57398d]
>>>>>>>   13: (()+0x6d8c) [0x7faec4d12d8c]
>>>>>>>   14: (clone()+0x6d) [0x7faec355404d]
>>>>>>>
>>>>>>> Anything else needed to debug this?
>>>>>>>
>>>>>>> -martin
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>>> ceph-devel" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>

