* ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying))
@ 2012-11-15 21:07 Stefan Priebe
  2012-11-19 23:39 ` Samuel Just
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Priebe @ 2012-11-15 21:07 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hello list,

Current master, including upstream/wip-fd-simple-cache, results in this crash
when I try to start some of my OSDs (others work fine) today on multiple
nodes:

     -2> 2012-11-15 22:04:09.226945 7f3af1c7a780  0 osd.52 pg_epoch: 657 
pg[3.3b( v 632'823 (632'823,632'823] n=5 ec=17 les/c 18/18 656/656/17) 
[] r=0 lpr=0 pi=17-655/2 (info mismatch, log(632'823,0'0]) (log bound 
mismatch, empty) lcod 0'0 mlcod 0'0 inactive] Got exception 
'read_log_error: read_log got 0 bytes, expected 126086-0=126086' while 
reading log. Moving corrupted log file to 
'corrupt_log_2012-11-15_22:04_3.3b' for later analysis.
     -1> 2012-11-15 22:04:09.233563 7f3af1c7a780  0 osd.52 pg_epoch: 657 
pg[3.557( v 632'753 (0'0,632'753] n=2 ec=17 les/c 18/18 656/656/17) [] 
r=0 lpr=0 pi=17-655/2 (info mismatch, log(0'0,0'0]) lcod 0'0 mlcod 0'0 
inactive] Got exception 'read_log_error: read_log got 0 bytes, expected 
115488-0=115488' while reading log. Moving corrupted log file to 
'corrupt_log_2012-11-15_22:04_3.557' for later analysis.
      0> 2012-11-15 22:04:09.234536 7f3ae87d0700 -1 os/FileStore.cc: In 
function 'int FileStore::_collection_add(coll_t, coll_t, const 
hobject_t&, const SequencerPosition&)' thread 7f3ae87d0700 time 
2012-11-15 22:04:09.233672
os/FileStore.cc: 4500: FAILED assert(replaying)

  ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
  1: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, 
SequencerPosition const&)+0x77d) [0x72ff0d]
  2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned 
long, int)+0x25fb) [0x73481b]
  3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, 
std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) 
[0x73952c]
  4: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
  5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
  6: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
  7: (()+0x68ca) [0x7f3af16578ca]
  8: (clone()+0x6d) [0x7f3aefac6bfd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- logging levels ---
    0/ 5 none
    0/ 0 lockdep
    0/ 0 context
    0/ 0 crush
    1/ 5 mds
    1/ 5 mds_balancer
    1/ 5 mds_locker
    1/ 5 mds_log
    1/ 5 mds_log_expire
    1/ 5 mds_migrator
    0/ 0 buffer
    0/ 0 timer
    0/ 1 filer
    0/ 1 striper
    0/ 1 objecter
    0/ 5 rados
    0/ 5 rbd
    0/ 0 journaler
    0/ 5 objectcacher
    0/ 5 client
    0/ 0 osd
    0/ 0 optracker
    0/ 0 objclass
    0/ 0 filestore
    0/ 0 journal
    0/ 0 ms
    1/ 5 mon
    0/ 0 monc
    0/ 5 paxos
    0/ 0 tp
    0/ 0 auth
    1/ 5 crypto
    0/ 0 finisher
    0/ 0 heartbeatmap
    0/ 0 perfcounter
    1/ 5 rgw
    1/ 5 hadoop
    1/ 5 javaclient
    0/ 0 asok
    0/ 0 throttle
   -2/-2 (syslog threshold)
   -1/-1 (stderr threshold)
   max_recent     10000
   max_new      1000000
   log_file /var/log/ceph/ceph-osd.52.log
--- end dump of recent events ---
2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal (Aborted) **
  in thread 7f3ae87d0700

  ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
  1: /usr/bin/ceph-osd() [0x799769]
  2: (()+0xeff0) [0x7f3af165fff0]
  3: (gsignal()+0x35) [0x7f3aefa29215]
  4: (abort()+0x180) [0x7f3aefa2c020]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
  6: (()+0xcb166) [0x7f3af02bc166]
  7: (()+0xcb193) [0x7f3af02bc193]
  8: (()+0xcb28e) [0x7f3af02bc28e]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x7c9) [0x7fd069]
  10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, 
SequencerPosition const&)+0x77d) [0x72ff0d]
  11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned 
long, int)+0x25fb) [0x73481b]
  12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, 
std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) 
[0x73952c]
  13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
  14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
  15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
  16: (()+0x68ca) [0x7f3af16578ca]
  17: (clone()+0x6d) [0x7f3aefac6bfd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- begin dump of recent events ---
      0> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal 
(Aborted) **
  in thread 7f3ae87d0700

  ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
  1: /usr/bin/ceph-osd() [0x799769]
  2: (()+0xeff0) [0x7f3af165fff0]
  3: (gsignal()+0x35) [0x7f3aefa29215]
  4: (abort()+0x180) [0x7f3aefa2c020]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
  6: (()+0xcb166) [0x7f3af02bc166]
  7: (()+0xcb193) [0x7f3af02bc193]
  8: (()+0xcb28e) [0x7f3af02bc28e]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x7c9) [0x7fd069]
  10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, 
SequencerPosition const&)+0x77d) [0x72ff0d]
  11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned 
long, int)+0x25fb) [0x73481b]
  12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, 
std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) 
[0x73952c]
  13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
  14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
  15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
  16: (()+0x68ca) [0x7f3af16578ca]
  17: (clone()+0x6d) [0x7f3aefac6bfd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- logging levels ---
    0/ 5 none
    0/ 0 lockdep
    0/ 0 context
    0/ 0 crush
    1/ 5 mds
    1/ 5 mds_balancer
    1/ 5 mds_locker
    1/ 5 mds_log
    1/ 5 mds_log_expire
    1/ 5 mds_migrator
    0/ 0 buffer
    0/ 0 timer
    0/ 1 filer
    0/ 1 striper
    0/ 1 objecter
    0/ 5 rados
    0/ 5 rbd
    0/ 0 journaler
    0/ 5 objectcacher
    0/ 5 client
    0/ 0 osd
    0/ 0 optracker
    0/ 0 objclass
    0/ 0 filestore
    0/ 0 journal
    0/ 0 ms
    1/ 5 mon
    0/ 0 monc
    0/ 5 paxos
    0/ 0 tp
    0/ 0 auth
    1/ 5 crypto
    0/ 0 finisher
    0/ 0 heartbeatmap
    0/ 0 perfcounter
    1/ 5 rgw
    1/ 5 hadoop
    1/ 5 javaclient
    0/ 0 asok
    0/ 0 throttle
   -2/-2 (syslog threshold)
   -1/-1 (stderr threshold)
   max_recent     10000
   max_new      1000000
   log_file /var/log/ceph/ceph-osd.52.log
--- end dump of recent events ---

Stefan


* Re: ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying))
  2012-11-15 21:07 ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying)) Stefan Priebe
@ 2012-11-19 23:39 ` Samuel Just
  2012-11-19 23:39   ` Stefan Priebe
  0 siblings, 1 reply; 5+ messages in thread
From: Samuel Just @ 2012-11-19 23:39 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

Seems to be a truncated log file...  That usually indicates filesystem
corruption.  Anything in dmesg?
-Sam


* Re: ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying))
  2012-11-19 23:39 ` Samuel Just
@ 2012-11-19 23:39   ` Stefan Priebe
  2012-11-19 23:43     ` Samuel Just
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Priebe @ 2012-11-19 23:39 UTC (permalink / raw)
  To: Samuel Just; +Cc: ceph-devel@vger.kernel.org

On 20.11.2012 00:39, Samuel Just wrote:
> Seems to be a truncated log file...  That usually indicates filesystem
> corruption.  Anything in dmesg?
> -Sam
No. Everything fine.



* Re: ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying))
  2012-11-19 23:39   ` Stefan Priebe
@ 2012-11-19 23:43     ` Samuel Just
  2012-11-19 23:44       ` Stefan Priebe
  0 siblings, 1 reply; 5+ messages in thread
From: Samuel Just @ 2012-11-19 23:43 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

Can you restart one of the affected osds with debug osd = 20, debug
filestore = 20, debug ms = 1 and post the log?
-Sam
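
For reference, a minimal sketch of how the debug levels requested above might be set in ceph.conf before restarting the daemon. The section name osd.52 is an assumption taken from the log_file path in the original report; use the id of whichever OSD is failing to start.

```ini
; ceph.conf fragment (sketch): raise log verbosity for a single OSD.
; "osd.52" is assumed from the log path in the report, not confirmed
; as the only affected daemon.
[osd.52]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```

With these settings the OSD log (here presumably /var/log/ceph/ceph-osd.52.log) grows quickly, so it is probably worth reverting them once the failing startup has been captured.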

On Mon, Nov 19, 2012 at 3:39 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
> On 20.11.2012 00:39, Samuel Just wrote:
>
>> Seems to be a truncated log file...  That usually indicates filesystem
>> corruption.  Anything in dmesg?
>> -Sam
>
> No. Everything fine.

* Re: ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying))
  2012-11-19 23:43     ` Samuel Just
@ 2012-11-19 23:44       ` Stefan Priebe
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Priebe @ 2012-11-19 23:44 UTC (permalink / raw)
  To: Samuel Just; +Cc: ceph-devel@vger.kernel.org

I've formatted the cluster since then, but I'll report back if this
happens again.

Stefan
On 20.11.2012 00:43, Samuel Just wrote:
> Can you restart one of the affected osds with debug osd = 20, debug
> filestore = 20, debug ms = 1 and post the log?
> -Sam
>
> On Mon, Nov 19, 2012 at 3:39 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
>> On 20.11.2012 00:39, Samuel Just wrote:
>>
>>> Seems to be a truncated log file...  That usually indicates filesystem
>>> corruption.  Anything in dmesg?
>>> -Sam
>>
>> No. Everything fine.
>>
>>
>>
>>> On Thu, Nov 15, 2012 at 1:07 PM, Stefan Priebe <s.priebe@profihost.ag>
>>> wrote:
>>>>
>>>> Hello list,
>>>>
>>>> current master incl. upstream/wip-fd-simple-cache results in this crash
>>>> when I try to start some of my OSDs (others work fine) today on multiple
>>>> nodes:
>>>>
>>>>       -2> 2012-11-15 22:04:09.226945 7f3af1c7a780  0 osd.52 pg_epoch: 657
>>>> pg[3.3b( v 632'823 (632'823,632'823] n=5 ec=17 les/c 18/18 656/656/17) []
>>>> r=0 lpr=0 pi=17-655/2 (info mismatch, log(632'823,0'0]) (log bound
>>>> mismatch,
>>>> empty) lcod 0'0 mlcod 0'0 inactive] Got exception 'read_log_error:
>>>> read_log
>>>> got 0 bytes, expected 126086-0=126086' while reading log. Moving
>>>> corrupted
>>>> log file to 'corrupt_log_2012-11-15_22:04_3.3b' for later analysis.
>>>>       -1> 2012-11-15 22:04:09.233563 7f3af1c7a780  0 osd.52 pg_epoch: 657
>>>> pg[3.557( v 632'753 (0'0,632'753] n=2 ec=17 les/c 18/18 656/656/17) []
>>>> r=0
>>>> lpr=0 pi=17-655/2 (info mismatch, log(0'0,0'0]) lcod 0'0 mlcod 0'0
>>>> inactive]
>>>> Got exception 'read_log_error: read_log got 0 bytes, expected
>>>> 115488-0=115488' while reading log. Moving corrupted log file to
>>>> 'corrupt_log_2012-11-15_22:04_3.557' for later analysis.
>>>>        0> 2012-11-15 22:04:09.234536 7f3ae87d0700 -1 os/FileStore.cc: In
>>>> function 'int FileStore::_collection_add(coll_t, coll_t, const
>>>> hobject_t&,
>>>> const SequencerPosition&)' thread 7f3ae87d0700 time 2012-11-15
>>>> 22:04:09.233672
>>>> os/FileStore.cc: 4500: FAILED assert(replaying)
>>>>
>>>>    ceph version 0.54-607-gf89e101
>>>> (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>>    1: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
>>>> SequencerPosition const&)+0x77d) [0x72ff0d]
>>>>    2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>>>> long,
>>>> int)+0x25fb) [0x73481b]
>>>>    3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>>>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>>>> [0x73952c]
>>>>    4: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>>    5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>>    6: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>>    7: (()+0x68ca) [0x7f3af16578ca]
>>>>    8: (clone()+0x6d) [0x7f3aefac6bfd]
>>>>    NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>> needed to
>>>> interpret this.
>>>>
>>>> --- logging levels ---
>>>>      0/ 5 none
>>>>      0/ 0 lockdep
>>>>      0/ 0 context
>>>>      0/ 0 crush
>>>>      1/ 5 mds
>>>>      1/ 5 mds_balancer
>>>>      1/ 5 mds_locker
>>>>      1/ 5 mds_log
>>>>      1/ 5 mds_log_expire
>>>>      1/ 5 mds_migrator
>>>>      0/ 0 buffer
>>>>      0/ 0 timer
>>>>      0/ 1 filer
>>>>      0/ 1 striper
>>>>      0/ 1 objecter
>>>>      0/ 5 rados
>>>>      0/ 5 rbd
>>>>      0/ 0 journaler
>>>>      0/ 5 objectcacher
>>>>      0/ 5 client
>>>>      0/ 0 osd
>>>>      0/ 0 optracker
>>>>      0/ 0 objclass
>>>>      0/ 0 filestore
>>>>      0/ 0 journal
>>>>      0/ 0 ms
>>>>      1/ 5 mon
>>>>      0/ 0 monc
>>>>      0/ 5 paxos
>>>>      0/ 0 tp
>>>>      0/ 0 auth
>>>>      1/ 5 crypto
>>>>      0/ 0 finisher
>>>>      0/ 0 heartbeatmap
>>>>      0/ 0 perfcounter
>>>>      1/ 5 rgw
>>>>      1/ 5 hadoop
>>>>      1/ 5 javaclient
>>>>      0/ 0 asok
>>>>      0/ 0 throttle
>>>>     -2/-2 (syslog threshold)
>>>>     -1/-1 (stderr threshold)
>>>>     max_recent     10000
>>>>     max_new      1000000
>>>>     log_file /var/log/ceph/ceph-osd.52.log
>>>> --- end dump of recent events ---
>>>> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal (Aborted) **
>>>>    in thread 7f3ae87d0700
>>>>
>>>>    ceph version 0.54-607-gf89e101
>>>> (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>>    1: /usr/bin/ceph-osd() [0x799769]
>>>>    2: (()+0xeff0) [0x7f3af165fff0]
>>>>    3: (gsignal()+0x35) [0x7f3aefa29215]
>>>>    4: (abort()+0x180) [0x7f3aefa2c020]
>>>>    5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
>>>>    6: (()+0xcb166) [0x7f3af02bc166]
>>>>    7: (()+0xcb193) [0x7f3af02bc193]
>>>>    8: (()+0xcb28e) [0x7f3af02bc28e]
>>>>    9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>> const*)+0x7c9) [0x7fd069]
>>>>    10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
>>>> SequencerPosition const&)+0x77d) [0x72ff0d]
>>>>    11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>>>> long,
>>>> int)+0x25fb) [0x73481b]
>>>>    12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>>>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>>>> [0x73952c]
>>>>    13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>>    14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>>    15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>>    16: (()+0x68ca) [0x7f3af16578ca]
>>>>    17: (clone()+0x6d) [0x7f3aefac6bfd]
>>>>    NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>> needed to
>>>> interpret this.
>>>>
>>>> --- begin dump of recent events ---
>>>>        0> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal
>>>> (Aborted) **
>>>>    in thread 7f3ae87d0700
>>>>
>>>>    ceph version 0.54-607-gf89e101
>>>> (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>>    1: /usr/bin/ceph-osd() [0x799769]
>>>>    2: (()+0xeff0) [0x7f3af165fff0]
>>>>    3: (gsignal()+0x35) [0x7f3aefa29215]
>>>>    4: (abort()+0x180) [0x7f3aefa2c020]
>>>>    5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
>>>>    6: (()+0xcb166) [0x7f3af02bc166]
>>>>    7: (()+0xcb193) [0x7f3af02bc193]
>>>>    8: (()+0xcb28e) [0x7f3af02bc28e]
>>>>    9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>> const*)+0x7c9) [0x7fd069]
>>>>    10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
>>>> SequencerPosition const&)+0x77d) [0x72ff0d]
>>>>    11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned
>>>> long,
>>>> int)+0x25fb) [0x73481b]
>>>>    12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
>>>> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
>>>> [0x73952c]
>>>>    13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>>    14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>>    15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>>    16: (()+0x68ca) [0x7f3af16578ca]
>>>>    17: (clone()+0x6d) [0x7f3aefac6bfd]
>>>>    NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>> needed to
>>>> interpret this.
>>>>
>>>> --- logging levels ---
>>>>      0/ 5 none
>>>>      0/ 0 lockdep
>>>>      0/ 0 context
>>>>      0/ 0 crush
>>>>      1/ 5 mds
>>>>      1/ 5 mds_balancer
>>>>      1/ 5 mds_locker
>>>>      1/ 5 mds_log
>>>>      1/ 5 mds_log_expire
>>>>      1/ 5 mds_migrator
>>>>      0/ 0 buffer
>>>>      0/ 0 timer
>>>>      0/ 1 filer
>>>>      0/ 1 striper
>>>>      0/ 1 objecter
>>>>      0/ 5 rados
>>>>      0/ 5 rbd
>>>>      0/ 0 journaler
>>>>      0/ 5 objectcacher
>>>>      0/ 5 client
>>>>      0/ 0 osd
>>>>      0/ 0 optracker
>>>>      0/ 0 objclass
>>>>      0/ 0 filestore
>>>>      0/ 0 journal
>>>>      0/ 0 ms
>>>>      1/ 5 mon
>>>>      0/ 0 monc
>>>>      0/ 5 paxos
>>>>      0/ 0 tp
>>>>      0/ 0 auth
>>>>      1/ 5 crypto
>>>>      0/ 0 finisher
>>>>      0/ 0 heartbeatmap
>>>>      0/ 0 perfcounter
>>>>      1/ 5 rgw
>>>>      1/ 5 hadoop
>>>>      1/ 5 javaclient
>>>>      0/ 0 asok
>>>>      0/ 0 throttle
>>>>     -2/-2 (syslog threshold)
>>>>     -1/-1 (stderr threshold)
>>>>     max_recent     10000
>>>>     max_new      1000000
>>>>     log_file /var/log/ceph/ceph-osd.52.log
>>>> --- end dump of recent events ---
>>>>
>>>> Stefan
>>>
>>>
>>
>
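
For anyone who sees the same 'read_log_error' lines, a small helper along these lines could list the affected PGs from an OSD log such as /var/log/ceph/ceph-osd.52.log. This is only a sketch and not part of Ceph: the function name is made up, and it assumes each log entry sits on a single line, as in the log file itself (not rewrapped as in the quoted mail above).

```python
import re

# Matches the message format seen in the quoted log, for example:
#   pg[3.3b( ... ] Got exception 'read_log_error: read_log got 0 bytes,
#   expected 126086-0=126086' while reading log.
# Assumption: the whole entry is on one line in the real log file.
READ_LOG_ERROR = re.compile(
    r"pg\[(?P<pgid>\d+\.[0-9a-f]+)\(.*"
    r"read_log_error: read_log got (?P<got>\d+) bytes, "
    r"expected \d+-\d+=(?P<expected>\d+)"
)


def find_truncated_pg_logs(lines):
    """Return (pgid, bytes_read, bytes_expected) for each read_log_error line.

    `lines` is any iterable of strings, e.g. an open log file.
    Hypothetical helper for illustration only.
    """
    results = []
    for line in lines:
        match = READ_LOG_ERROR.search(line)
        if match:
            results.append(
                (
                    match.group("pgid"),
                    int(match.group("got")),
                    int(match.group("expected")),
                )
            )
    return results
```

Calling it on an open log file (`with open(path) as f: find_truncated_pg_logs(f)`) would give one tuple per PG whose log could not be read in full, which may help when comparing several OSDs or nodes.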


end of thread, other threads:[~2012-11-19 23:44 UTC | newest]

Thread overview: 5+ messages
2012-11-15 21:07 ceph-osd crashing (os/FileStore.cc: 4500: FAILED assert(replaying)) Stefan Priebe
2012-11-19 23:39 ` Samuel Just
2012-11-19 23:39   ` Stefan Priebe
2012-11-19 23:43     ` Samuel Just
2012-11-19 23:44       ` Stefan Priebe
