From: Stefan Priebe <s.priebe@profihost.ag>
To: Sage Weil <sage@inktank.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: OSD crashed today in os/JournalingObjectStore.cc
Date: Wed, 05 Dec 2012 23:29:15 +0100
Message-ID: <50BFCABB.6080207@profihost.ag>
In-Reply-To: <50BFC9BF.9050302@profihost.ag>

Hello,

This seems to have been happening since commit:
85574a3

Stefan

Am 05.12.2012 23:25, schrieb Stefan Priebe:
> Hello,
>
> I have now had 8 OSDs fail again with the same error.
>
>       0> 2012-12-05 23:10:41.213149 7f7fad109700 -1
> os/JournalingObjectStore.cc: In function 'uint64_t
> JournalingObjectStore::ApplyManager::op_apply_start(uint64_t)' thread
> 7f7fad109700 time 2012-12-05 23:10:41.212454
> os/JournalingObjectStore.cc: 134: FAILED assert(op > committed_seq)
>
>   ceph version 0.55-142-g22f794d (22f794da074dd1b3221c484a5ae05b2ff1bd0fa4)
>   1: (JournalingObjectStore::ApplyManager::op_apply_start(unsigned
> long)+0x816) [0x747626]
>   2: (FileStore::_do_op(FileStore::OpSequencer*)+0x52) [0x703c22]
>   3: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x82f81b]
>   4: (ThreadPool::WorkThread::entry()+0x10) [0x832000]
>   5: (()+0x68ca) [0x7f7fc17a78ca]
>   6: (clone()+0x6d) [0x7f7fbfc16bfd]
>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- logging levels ---
>     0/ 5 none
>     0/ 0 lockdep
>     0/ 0 context
>     0/ 0 crush
>     1/ 5 mds
>     1/ 5 mds_balancer
>     1/ 5 mds_locker
>     1/ 5 mds_log
>     1/ 5 mds_log_expire
>     1/ 5 mds_migrator
>     0/ 0 buffer
>     0/ 0 timer
>     0/ 1 filer
>     0/ 1 striper
>     0/ 1 objecter
>     0/ 5 rados
>     0/ 5 rbd
>     0/ 0 journaler
>     0/ 5 objectcacher
>     0/ 5 client
>     0/ 0 osd
>     0/ 0 optracker
>     0/ 0 objclass
>     0/ 0 filestore
>     0/ 0 journal
>     0/ 0 ms
>     1/ 5 mon
>     0/ 0 monc
>     0/ 5 paxos
>     0/ 0 tp
>     0/ 0 auth
>     1/ 5 crypto
>     0/ 0 finisher
>     0/ 0 heartbeatmap
>     0/ 0 perfcounter
>     1/ 5 rgw
>     1/ 5 hadoop
>     1/ 5 javaclient
>     0/ 0 asok
>     0/ 0 throttle
>    -2/-2 (syslog threshold)
>    -1/-1 (stderr threshold)
>    max_recent    100000
>    max_new         1000
>    log_file /var/log/ceph/ceph-osd.13.log
> --- end dump of recent events ---
> 2012-12-05 23:10:41.216011 7f7fad109700 -1 *** Caught signal (Aborted) **
>   in thread 7f7fad109700
>
>   ceph version 0.55-142-g22f794d (22f794da074dd1b3221c484a5ae05b2ff1bd0fa4)
>   1: /usr/bin/ceph-osd() [0x797bd9]
>   2: (()+0xeff0) [0x7f7fc17afff0]
>   3: (gsignal()+0x35) [0x7f7fbfb79215]
>   4: (abort()+0x180) [0x7f7fbfb7c020]
>   5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7fc040ddc5]
>   6: (()+0xcb166) [0x7f7fc040c166]
>   7: (()+0xcb193) [0x7f7fc040c193]
>   8: (()+0xcb28e) [0x7f7fc040c28e]
>   9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x7c9) [0x7fb939]
>   10: (JournalingObjectStore::ApplyManager::op_apply_start(unsigned
> long)+0x816) [0x747626]
>   11: (FileStore::_do_op(FileStore::OpSequencer*)+0x52) [0x703c22]
>   12: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x82f81b]
>   13: (ThreadPool::WorkThread::entry()+0x10) [0x832000]
>   14: (()+0x68ca) [0x7f7fc17a78ca]
>   15: (clone()+0x6d) [0x7f7fbfc16bfd]
>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- begin dump of recent events ---
>       0> 2012-12-05 23:10:41.216011 7f7fad109700 -1 *** Caught signal
> (Aborted) **
>   in thread 7f7fad109700
>
>   ceph version 0.55-142-g22f794d (22f794da074dd1b3221c484a5ae05b2ff1bd0fa4)
>   1: /usr/bin/ceph-osd() [0x797bd9]
>   2: (()+0xeff0) [0x7f7fc17afff0]
>   3: (gsignal()+0x35) [0x7f7fbfb79215]
>   4: (abort()+0x180) [0x7f7fbfb7c020]
>   5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7fc040ddc5]
>   6: (()+0xcb166) [0x7f7fc040c166]
>   7: (()+0xcb193) [0x7f7fc040c193]
>   8: (()+0xcb28e) [0x7f7fc040c28e]
>   9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x7c9) [0x7fb939]
>   10: (JournalingObjectStore::ApplyManager::op_apply_start(unsigned
> long)+0x816) [0x747626]
>   11: (FileStore::_do_op(FileStore::OpSequencer*)+0x52) [0x703c22]
>   12: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x82f81b]
>   13: (ThreadPool::WorkThread::entry()+0x10) [0x832000]
>   14: (()+0x68ca) [0x7f7fc17a78ca]
>   15: (clone()+0x6d) [0x7f7fbfc16bfd]
>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- logging levels ---
>     0/ 5 none
>     0/ 0 lockdep
>     0/ 0 context
>     0/ 0 crush
>     1/ 5 mds
>     1/ 5 mds_balancer
>     1/ 5 mds_locker
>     1/ 5 mds_log
>     1/ 5 mds_log_expire
>     1/ 5 mds_migrator
>     0/ 0 buffer
>     0/ 0 timer
>     0/ 1 filer
>     0/ 1 striper
>     0/ 1 objecter
>     0/ 5 rados
>     0/ 5 rbd
>     0/ 0 journaler
>     0/ 5 objectcacher
>     0/ 5 client
>     0/ 0 osd
>     0/ 0 optracker
>     0/ 0 objclass
>     0/ 0 filestore
>     0/ 0 journal
>     0/ 0 ms
>     1/ 5 mon
>     0/ 0 monc
>     0/ 5 paxos
>     0/ 0 tp
>     0/ 0 auth
>     1/ 5 crypto
>     0/ 0 finisher
>     0/ 0 heartbeatmap
>     0/ 0 perfcounter
>     1/ 5 rgw
>     1/ 5 hadoop
>     1/ 5 javaclient
>     0/ 0 asok
>     0/ 0 throttle
>    -2/-2 (syslog threshold)
>    -1/-1 (stderr threshold)
>    max_recent    100000
>    max_new         1000
>    log_file /var/log/ceph/ceph-osd.13.log
> --- end dump of recent events ---
>
> Stefan
> Am 05.12.2012 17:05, schrieb Stefan Priebe - Profihost AG:
>> There was a dump in the attached log.
>>
>> Stefan
>>
>> Am 05.12.2012 um 15:41 schrieb Sage Weil <sage@inktank.com>:
>>
>>> On Wed, 5 Dec 2012, Stefan Priebe - Profihost AG wrote:
>>>> Hello list,
>>>>
>>>> I updated to the latest next branch from today, and after 20 minutes an OSD
>>>> crashed in os/JournalingObjectStore.cc.
>>>>
>>>> Attached is the log.
>>>
>>> Hmm, this is perplexing.  It might just be a bad assert, but I can't see
>>> how it could happen.  Any chance you can reproduce with
>>>
>>>     debug journal = 0/10
>>>
>>> in the [osd] section?  That will give us a dump if it fails the assert.
>>>
>>> Thanks!
>>> s
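
Note on the requested setting: in the "logging levels" dumps quoted above, each line has the form "log-level/memory-level subsystem". The first number controls what is written to the log file; the second controls what is kept in memory and dumped when an assert fails. The dumps show "0/ 0 journal", so nothing from the journal subsystem was gathered, which is presumably why "debug journal = 0/10" was requested. The following Python is only a minimal sketch for checking a pasted dump; the function names parse_levels and silent_subsystems are hypothetical helpers, not part of Ceph.

```python
import re

# One line of the "--- logging levels ---" block, e.g. ">     0/ 5 none".
# Leading ">" quote markers from the email are tolerated.
LEVEL_RE = re.compile(r"^[>\s]*(-?\d+)\s*/\s*(-?\d+)\s+(\S+)\s*$")


def parse_levels(text):
    """Return {subsystem: (log_level, memory_level)} from a levels dump.

    Lines that do not look like "N/ M name" (thresholds, max_recent,
    log_file, stack frames) are ignored.
    """
    levels = {}
    for line in text.splitlines():
        m = LEVEL_RE.match(line)
        if m:
            levels[m.group(3)] = (int(m.group(1)), int(m.group(2)))
    return levels


def silent_subsystems(levels):
    """Subsystems with memory level 0: a crash dump holds no entries for them."""
    return sorted(name for name, (_, mem) in levels.items() if mem == 0)


if __name__ == "__main__":
    sample = """\
>     0/ 5 rados
>     0/ 0 filestore
>     0/ 0 journal
>    -2/-2 (syslog threshold)
"""
    levels = parse_levels(sample)
    print(levels["journal"])          # (0, 0)
    print(silent_subsystems(levels))  # ['filestore', 'journal']
```

Run against the full dump from ceph-osd.13.log, this would list journal and filestore among the silent subsystems, which matches the empty "recent events" section in the crash output.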