A couple of OSD-crashes after serious network trouble


* A couple of OSD-crashes after serious network trouble
@ 2012-12-05 11:15 Oliver Francke
  2012-12-05 14:54 ` Sage Weil
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-05 11:15 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hi *,

Around midnight yesterday we faced some layer-2 network problems. OSDs
started to lose heartbeats and so on. Slow requests... you name it.
Once all OSDs had done their work, about 6 of them had crashed in total,
and 2 had to be restarted again after their first restart. That should
make 8 crashes in total.

Typical output:


=== 8-< ===
--- begin dump of recent events ---
    -10> 2012-12-04 23:35:26.623091 7f1db7895700  5 filestore(/data/osd6-1) _do_op 0x21035870 seq 111010292 osr(65.72 0x9e13570)/0x9e13570 start
     -9> 2012-12-04 23:35:26.623995 7f1db7895700  5 filestore(/data/osd6-1) _do_op 0x21035500 seq 111010294 osr(10.3 0x5b5c170)/0x5b5c170 start
     -8> 2012-12-04 23:35:26.624013 7f1db6893700  5 --OSD::tracker-- reqid: client.290626.0:798537, seq: 151093878, time: 2012-12-04 23:35:26.624012, event: sub_op_applied, request: osd_sub_op(client.290626.0:798537 65.72 c9612472/rb.0.2d5e5.39bd39.000000000652/head//65 [] v 8084'770407 snapset=0=[]:[] snapc=0=[]) v7
     -7> 2012-12-04 23:35:26.624047 7f1db8096700  5 filestore(/data/osd6-1) _do_op 0x21035c80 seq 111010293 osr(65.72 0x9e13570)/0x9e13570 start
     -6> 2012-12-04 23:35:26.624119 7f1db6893700  5 --OSD::tracker-- reqid: client.290626.0:798537, seq: 151093878, time: 2012-12-04 23:35:26.624119, event: done, request: osd_sub_op(client.290626.0:798537 65.72 c9612472/rb.0.2d5e5.39bd39.000000000652/head//65 [] v 8084'770407 snapset=0=[]:[] snapc=0=[]) v7
     -5> 2012-12-04 23:35:26.624953 7f1db6893700  5 --OSD::tracker-- reqid: client.290626.0:798549, seq: 151093879, time: 2012-12-04 23:35:26.624953, event: sub_op_applied, request: osd_sub_op(client.290626.0:798549 65.72 c9612472/rb.0.2d5e5.39bd39.000000000652/head//65 [] v 8084'770408 snapset=0=[]:[] snapc=0=[]) v7
     -4> 2012-12-04 23:35:26.625017 7f1db6893700  5 --OSD::tracker-- reqid: client.290626.0:798549, seq: 151093879, time: 2012-12-04 23:35:26.625017, event: done, request: osd_sub_op(client.290626.0:798549 65.72 c9612472/rb.0.2d5e5.39bd39.000000000652/head//65 [] v 8084'770408 snapset=0=[]:[] snapc=0=[]) v7
     -3> 2012-12-04 23:35:26.626220 7f1db7895700  5 filestore(/data/osd6-1) _do_op 0x21035f00 seq 111010296 osr(6.7 0x5ca4570)/0x5ca4570 start
     -2> 2012-12-04 23:35:26.626218 7f1db8096700  5 filestore(/data/osd6-1) _do_op 0x21035e10 seq 111010295 osr(10.3 0x5b5c170)/0x5b5c170 start
     -1> 2012-12-04 23:35:26.652283 7f1daed81700  5 throttle(msgr_dispatch_throttler-cluster 0x2791560) get 1049621 (0 -> 1049621)
      0> 2012-12-04 23:35:26.654669 7f1db1f89700 -1 *** Caught signal (Aborted) **
  in thread 7f1db1f89700

  ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)
  1: /usr/bin/ceph-osd() [0x6edaba]
  2: (()+0xfcb0) [0x7f1dc34c7cb0]
  3: (gsignal()+0x35) [0x7f1dc208e425]
  4: (abort()+0x17b) [0x7f1dc2091b8b]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f1dc29e769d]
  6: (()+0xb5846) [0x7f1dc29e5846]
  7: (()+0xb5873) [0x7f1dc29e5873]
  8: (()+0xb596e) [0x7f1dc29e596e]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1de) [0x7a82fe]
  10: (ReplicatedPG::recover_got(hobject_t, eversion_t)+0x4ae) [0x52b5ee]
  11: (ReplicatedPG::submit_push_complete(ObjectRecoveryInfo&, ObjectStore::Transaction*)+0x470) [0x52ddd0]
  12: (ReplicatedPG::handle_pull_response(std::tr1::shared_ptr<OpRequest>)+0x4d4) [0x54b124]
  13: (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0x98) [0x54bef8]
  14: (ReplicatedPG::do_sub_op(std::tr1::shared_ptr<OpRequest>)+0x3f7) [0x54c3a7]
  15: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x9f) [0x60073f]
  16: (OSD::dequeue_op(PG*)+0x238) [0x5bfaf8]
  17: (ThreadPool::worker()+0x4d5) [0x79f835]
  18: (ThreadPool::WorkThread::entry()+0xd) [0x5d87cd]
  19: (()+0x7e9a) [0x7f1dc34bfe9a]
  20: (clone()+0x6d) [0x7f1dc214bcbd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- end dump of recent events ---

=== 8-< ===

A not very scientific, but useful, aggregation of all OSD outputs
follows. My hope is that someone says:
"Uhm, OK, that's fixed in ..." ;)

(count of occurrences and corresponding string)

=== 8-< ===

       4 (boost::statechart::simple_state<PG::RecoveryState::Stray,
       4 (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine,
      18 (ceph::__ceph_assert_fail(char
      36 (clone()+0x6d)
      18 (gsignal()+0x35)
      16 (OSD::dequeue_op(PG*)+0x238)
      16 (OSD::dequeue_op(PG*)+0x39a)
       4 (OSD::_dispatch(Message*)+0x173)
       4 (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b)
       4 (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666)
       4 (OSD::ms_dispatch(Message*)+0x184)
      16 (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x9f)
      16 (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0xab)
       4 (PG::merge_log(ObjectStore::Transaction&,
       4 (PG::RecoveryState::handle_log(int,
       4 (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec
      16 (ReplicatedPG::do_sub_op(std::tr1::shared_ptr<OpRequest>)+0x32e)
      16 (ReplicatedPG::do_sub_op(std::tr1::shared_ptr<OpRequest>)+0x3f7)
      12 (ReplicatedPG::handle_pull_response(std::tr1::shared_ptr<OpRequest>)+0x4d4)
      16 (ReplicatedPG::handle_pull_response(std::tr1::shared_ptr<OpRequest>)+0xb24)
       4 (ReplicatedPG::handle_push(std::tr1::shared_ptr<OpRequest>)+0x263)
      32 (ReplicatedPG::recover_got(hobject_t,
      32 (ReplicatedPG::submit_push_complete(ObjectRecoveryInfo&,
      12 (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0x98)
      16 (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0xa2)
       4 (ReplicatedPG::sub_op_push(std::tr1::shared_ptr<OpRequest>)+0xf3)
       4 (SimpleMessenger::dispatch_entry()+0x15)
       4 (SimpleMessenger::DispatchQueue::entry()+0x5e9)
       4 (SimpleMessenger::DispatchThread::entry()+0xd)
      16 (ThreadPool::worker()+0x4d5)
      16 (ThreadPool::worker()+0x76f)
      32 (ThreadPool::WorkThread::entry()+0xd)

=== 8-< ===
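
A tally like the one above can be produced with a few lines of Python. This is only a minimal sketch, not the command actually used for the list above: it assumes unwrapped log lines with one backtrace frame per line, and the names FRAME_RE and tally_frames are made up for illustration. Unlike the list above, it keeps the full "(symbol+offset)" string instead of cutting it at the first space.

```python
import re
from collections import Counter

# Matches backtrace frames such as
#   " 10: (ReplicatedPG::recover_got(hobject_t, eversion_t)+0x4ae) [0x52b5ee]"
# and captures the "(symbol+offset)" part without the load address.
# Frames that are not wrapped in parentheses (e.g. frame 1) are skipped.
FRAME_RE = re.compile(r"^\s*\d+:\s+(\(.*\))\s+\[0x[0-9a-f]+\]\s*$")


def tally_frames(lines):
    """Count how often each symbolized frame occurs in the given log lines."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample input: two crashes sharing the gsignal frame.
sample = [
    " 3: (gsignal()+0x35) [0x7f1dc208e425]",
    " 10: (ReplicatedPG::recover_got(hobject_t, eversion_t)+0x4ae) [0x52b5ee]",
    " 3: (gsignal()+0x35) [0x7f1dc208e425]",
    "some unrelated log line",
]
for frame, count in sorted(tally_frames(sample).items()):
    print(f"{count:7d} {frame}")
```

Note that a per-frame count loses the order of the frames, so it cannot tell which frames belonged to the same crash.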

Everything has cleared up so far, so that's some good news ;)

Comments welcome,

Oliver.

-- 

Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: A couple of OSD-crashes after serious network trouble
  2012-12-05 11:15 A couple of OSD-crashes after serious network trouble Oliver Francke
@ 2012-12-05 14:54 ` Sage Weil
  2012-12-06 17:27   ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Sage Weil @ 2012-12-05 14:54 UTC (permalink / raw)
  To: Oliver Francke; +Cc: ceph-devel@vger.kernel.org

On Wed, 5 Dec 2012, Oliver Francke wrote:
> Hi *,
> 
> Around midnight yesterday we faced some layer-2 network problems. OSDs
> started to lose heartbeats and so on. Slow requests... you name it.
> Once all OSDs had done their work, about 6 of them had crashed in total,
> and 2 had to be restarted again after their first restart. That should
> make 8 crashes in total.

The recover_got() crash has definitely been resolved in recent code.
The others are hard to read since they've been sorted and summed; a full
backtrace is better for identifying each crash.  Do you have those
available?

Thanks!
sage
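
To keep the frames of each crash together instead of summing them, the log lines can be split into whole backtraces and grouped by the function that hit the assert. The following is a rough sketch under stated assumptions, not an existing Ceph tool: it assumes raw, unwrapped OSD log lines (no grep filename prefix), treats every frame numbered 1 as the start of a new backtrace, and all function names in it are hypothetical.

```python
import re
from collections import defaultdict

# A frame line looks like " 9: (symbol+0x1de) [0x7a82fe]"; capture number and symbol part.
FRAME_RE = re.compile(r"^\s*(\d+):\s+(.*?)\s+\[0x[0-9a-f]+\]\s*$")
# Trailing "+0x..." offset inside the closing parenthesis of a frame.
OFFSET_RE = re.compile(r"\+0x[0-9a-f]+\)$")


def split_backtraces(lines):
    """Yield each backtrace as a list of frame strings, in original order."""
    current = []
    for line in lines:
        match = FRAME_RE.match(line)
        if not match:
            continue
        number, frame = int(match.group(1)), match.group(2)
        if number == 1 and current:
            # Assumption: frame 1 always starts a new backtrace.
            yield current
            current = []
        current.append(frame)
    if current:
        yield current


def asserting_frame(trace):
    """Return the caller of __ceph_assert_fail with its offset stripped, or None."""
    for index, frame in enumerate(trace[:-1]):
        if "__ceph_assert_fail" in frame:
            return OFFSET_RE.sub(")", trace[index + 1])
    return None


def group_by_assert(lines):
    """Group complete backtraces by the function that triggered the assert."""
    groups = defaultdict(list)
    for trace in split_backtraces(lines):
        groups[asserting_frame(trace)].append(trace)
    return groups
```

Each group then still contains the complete, ordered backtraces, which is the form asked for above.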



* Re: A couple of OSD-crashes after serious network trouble
  2012-12-05 14:54 ` Sage Weil
@ 2012-12-06 17:27   ` Oliver Francke
  2012-12-07 14:39     ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-06 17:27 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,

On 12/05/2012 03:54 PM, Sage Weil wrote:
> On Wed, 5 Dec 2012, Oliver Francke wrote:
>> Hi *,
>>
>> Around midnight yesterday we faced some layer-2 network problems. OSDs
>> started to lose heartbeats and so on. Slow requests... you name it.
>> Once all OSDs had done their work, about 6 of them had crashed in total,
>> and 2 had to be restarted again after their first restart. That should
>> make 8 crashes in total.
> The recover_got() crash has definitely been resolved in the recent code.
> The others are hard to read since they've been sorted/summed; the full
> backtrace is better for identifying the crash.  Do you have those
> available?

There is "the other" pattern:

/var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd() [0x706c59]
/var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0) [0x7f7f306c0ff0]
/var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35) [0x7f7f2f35f1b5]
/var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180) [0x7f7f2f361fc0]
/var/log/ceph/ceph-osd.40.log.1.gz: 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
/var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166) [0x7f7f2fbf2166]
/var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193) [0x7f7f2fbf2193]
/var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e) [0x7f7f2fbf228e]
/var/log/ceph/ceph-osd.40.log.1.gz: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x793) [0x77e903]
/var/log/ceph/ceph-osd.40.log.1.gz: 10: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1de3) [0x63db93]
/var/log/ceph/ceph-osd.40.log.1.gz: 11: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x2cc) [0x63e00c]
/var/log/ceph/ceph-osd.40.log.1.gz: 12: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x203) [0x658a63]
/var/log/ceph/ceph-osd.40.log.1.gz: 13: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x650b4b]
/var/log/ceph/ceph-osd.40.log.1.gz: 14: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x190) [0x60a520]
/var/log/ceph/ceph-osd.40.log.1.gz: 15: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666) [0x5c62e6]
/var/log/ceph/ceph-osd.40.log.1.gz: 16: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b) [0x5c6f3b]
/var/log/ceph/ceph-osd.40.log.1.gz: 17: (OSD::_dispatch(Message*)+0x173) [0x5d1983]
/var/log/ceph/ceph-osd.40.log.1.gz: 18: (OSD::ms_dispatch(Message*)+0x184) [0x5d2254]
/var/log/ceph/ceph-osd.40.log.1.gz: 19: (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
/var/log/ceph/ceph-osd.40.log.1.gz: 20: (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
/var/log/ceph/ceph-osd.40.log.1.gz: 21: (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
/var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca) [0x7f7f306b88ca]
/var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d) [0x7f7f2f3fc92d]

State at the end of the day: active+clean.

Unfortunately, after some scrubbing today, we are seeing
inconsistencies again... *sigh*

End-of-year syndrome?

I picked one OSD that crashed yesterday and fired off a
"ceph osd scrub 0", followed by a "ceph osd repair 0":

2012-12-06 16:46:29.818551 7f49f1923700  0 log [ERR] : 65.ad repair stat mismatch, got 4204/4205 objects, 0/0 clones, 16466529280/16470149632 bytes.
2012-12-06 16:46:29.818734 7f49f1923700  0 log [ERR] : 65.ad repair 1 errors, 1 fixed
2012-12-06 16:46:30.104722 7f49f2124700  0 log [ERR] : 65.23 repair stat mismatch, got 4258/4259 objects, 0/0 clones, 16686233712/16690428016 bytes.
2012-12-06 16:46:30.104890 7f49f2124700  0 log [ERR] : 65.23 repair 1 errors, 1 fixed
2012-12-06 16:51:26.973407 7f49f2124700  0 log [ERR] : 6.1 osd.31: soid bafe2559/rb.0.1adf5.6733efe2.0000000007ce/head//6 size 4194304 != known size 3046912
2012-12-06 16:51:26.973426 7f49f2124700  0 log [ERR] : 6.1 repair 0 missing, 1 inconsistent objects
2012-12-06 16:51:26.981234 7f49f2124700  0 log [ERR] : 6.1 repair stat mismatch, got 2153/2154 objects, 0/0 clones, 7013002752/7017197056 bytes.
2012-12-06 16:51:26.981402 7f49f2124700  0 log [ERR] : 6.1 repair 1 errors, 1 fixed

Um... is it repaired? Really? Is everything fine now for osd.0?
In addition, half a dozen headers are missing, again. If the
corresponding VMs are stopped now, they will not restart, of course.

The first tickets have been raised by customers seeing something like
"filesystem errors... mounted read-only..." on the console, and that
kind of crap... again.

Well then, should one now run a ceph osd repair \* ? Fix the headers? Is
there a best practice?

Any other hints? How can we discover all of the potential errors
_before_ customers see them, too?
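
One partial way to get an overview is to collect the "[ERR]" messages that scrub and repair write to the OSD logs and group them per PG. The sketch below assumes unwrapped log lines in the format shown above; ERR_RE, SUMMARY_RE and summarize_errors are illustrative names, not part of Ceph. It only reports what the OSD logged, so a "1 errors, 1 fixed" line says nothing about whether the object contents are actually correct.

```python
import re
from collections import defaultdict

# Matches lines such as
#   "... 7f49f1923700  0 log [ERR] : 65.ad repair 1 errors, 1 fixed"
# and captures the PG id (pool.hex) and the rest of the message.
ERR_RE = re.compile(r"log \[ERR\] : (\d+\.[0-9a-f]+) (.*)$")
# Matches the per-PG repair summary message.
SUMMARY_RE = re.compile(r"^repair (\d+) errors, (\d+) fixed$")


def summarize_errors(lines):
    """Collect [ERR] messages per PG and note PGs where repair fixed fewer errors than it found."""
    per_pg = defaultdict(list)
    unfixed = {}
    for line in lines:
        match = ERR_RE.search(line.rstrip())
        if not match:
            continue
        pgid, message = match.groups()
        per_pg[pgid].append(message)
        summary = SUMMARY_RE.match(message)
        if summary:
            errors, fixed = int(summary.group(1)), int(summary.group(2))
            if fixed < errors:
                unfixed[pgid] = errors - fixed
    return per_pg, unfixed
```

Running summarize_errors over the logs of all OSDs after a scrub gives a list of PGs that reported problems, which is a starting point for deciding where to look further.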

Too many questions, but even more trouble...

Thanks for your attention.

Oliver.



* Re: A couple of OSD-crashes after serious network trouble
  2012-12-06 17:27   ` Oliver Francke
@ 2012-12-07 14:39     ` Oliver Francke
  2012-12-07 18:37       ` Samuel Just
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-07 14:39 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,

Is the following a "known one", too? It would be good to get it out of my head:

> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd() [0x706c59]
> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0) [0x7f7f306c0ff0]
> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35) [0x7f7f2f35f1b5]
> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180) [0x7f7f2f361fc0]
> /var/log/ceph/ceph-osd.40.log.1.gz: 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166) [0x7f7f2fbf2166]
> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193) [0x7f7f2fbf2193]
> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e) [0x7f7f2fbf228e]
> /var/log/ceph/ceph-osd.40.log.1.gz: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x793) [0x77e903]
> /var/log/ceph/ceph-osd.40.log.1.gz: 10: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1de3) [0x63db93]
> /var/log/ceph/ceph-osd.40.log.1.gz: 11: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x2cc) [0x63e00c]
> /var/log/ceph/ceph-osd.40.log.1.gz: 12: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x203) [0x658a63]
> /var/log/ceph/ceph-osd.40.log.1.gz: 13: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x650b4b]
> /var/log/ceph/ceph-osd.40.log.1.gz: 14: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x190) [0x60a520]
> /var/log/ceph/ceph-osd.40.log.1.gz: 15: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666) [0x5c62e6]
> /var/log/ceph/ceph-osd.40.log.1.gz: 16: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b) [0x5c6f3b]
> /var/log/ceph/ceph-osd.40.log.1.gz: 17: (OSD::_dispatch(Message*)+0x173) [0x5d1983]
> /var/log/ceph/ceph-osd.40.log.1.gz: 18: (OSD::ms_dispatch(Message*)+0x184) [0x5d2254]
> /var/log/ceph/ceph-osd.40.log.1.gz: 19: (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
> /var/log/ceph/ceph-osd.40.log.1.gz: 20: (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
> /var/log/ceph/ceph-osd.40.log.1.gz: 21: (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca) [0x7f7f306b88ca]
> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d) [0x7f7f2f3fc92d]
>

Thanks for looking,

Oliver.


* Re: A couple of OSD-crashes after serious network trouble
  2012-12-07 14:39     ` Oliver Francke
@ 2012-12-07 18:37       ` Samuel Just
  2012-12-07 19:09         ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Samuel Just @ 2012-12-07 18:37 UTC (permalink / raw)
  To: Oliver Francke; +Cc: Sage Weil, ceph-devel@vger.kernel.org

That is very likely one of the merge_log bugs fixed between 0.48
and 0.55.  I could confirm with a stack trace from gdb with line
numbers, or with the remainder of the log output dumped when the
daemon crashed.

My understanding of your situation is that all PGs are currently
active+clean, but you are missing some rbd image headers and some rbd
images appear to be corrupted.  Is that accurate?
-Sam


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: A couple of OSD-crashes after serious network trouble
  2012-12-07 18:37       ` Samuel Just
@ 2012-12-07 19:09         ` Oliver Francke
  2012-12-07 21:18           ` Samuel Just
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-07 19:09 UTC (permalink / raw)
  To: Samuel Just; +Cc: Sage Weil, ceph-devel@vger.kernel.org

Hi Sam,

Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:

> That is very likely to be one of the merge_log bugs fixed between 0.48
> and 0.55.  I could confirm with a stacktrace from gdb with line
> numbers or the remainder of the logging dumped when the daemon
> crashed.
> 
> My understanding of your situation is that currently all pgs are
> active+clean but you are missing some rbd image headers and some rbd
> images appear to be corrupted.  Is that accurate?
> -Sam
> 

Thanks for dropping in.

Um, almost correct: there are now 6 PGs in the inconsistent state:

HEALTH_WARN 6 pgs inconsistent
pg 65.da is active+clean+inconsistent, acting [1,33]
pg 65.d7 is active+clean+inconsistent, acting [13,42]
pg 65.10 is active+clean+inconsistent, acting [12,40]
pg 65.f is active+clean+inconsistent, acting [13,31]
pg 65.75 is active+clean+inconsistent, acting [1,33]
pg 65.6a is active+clean+inconsistent, acting [13,31]

I know which images are affected, but does a repair help?

0 log [ERR] : 65.10 osd.40: soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known size 699904
0 log [ERR] : 65.6a osd.31: soid 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known size 2757632
0 log [ERR] : 65.75 osd.33: soid 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known size 1238016
0 log [ERR] : 65.d7 osd.42: soid fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known size 1382912
0 log [ERR] : 65.da osd.33: soid c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known size 1815552
0 log [ERR] : 65.f osd.31: soid e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known size 2331648

or make things worse?

I could only check 14 out of 20 OSDs so far, because on two older nodes a scrub leads to slow requests lasting more than a couple of minutes, so VMs stalled and customers pressed the "reset button", losing their caches…

Comments welcome,

Oliver.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: A couple of OSD-crashes after serious network trouble
  2012-12-07 19:09         ` Oliver Francke
@ 2012-12-07 21:18           ` Samuel Just
  2012-12-10 10:48             ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Samuel Just @ 2012-12-07 21:18 UTC (permalink / raw)
  To: Oliver Francke; +Cc: Sage Weil, ceph-devel@vger.kernel.org

Ah... unfortunately doing a repair in these 6 cases would probably
result in the wrong object surviving.  It should work, but it might
corrupt the rbd image contents.  If the images are expendable, you
could repair and then delete the images.

The red flag here is that the "known size" is smaller than the other
size.  This indicates that it most likely chose the wrong file as the
"correct" one since rbd image blocks usually get bigger over time.  To
fix this, you will need to manually copy the file for the larger of
the two object replicas to replace the smaller of the two object
replicas.
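
As a rough illustration of the size check described above, the following Python sketch parses scrub error lines of the form quoted earlier in this thread and flags the cases where the "known size" is the smaller one. The names `SCRUB_ERR` and `parse_scrub_error` are made up for this example, and the regular expression is an assumption based only on the six log lines shown in the thread; other Ceph versions may format these lines differently.

```python
import re

# Assumed line format, taken from the log lines quoted in this thread, e.g.
# "0 log [ERR] : 65.10 osd.40: soid <hash>/<name>/head//65 size 4194304 != known size 699904"
SCRUB_ERR = re.compile(
    r"(?P<pg>[0-9a-f]+\.[0-9a-f]+) osd\.(?P<osd>\d+): soid (?P<soid>\S+) "
    r"size (?P<size>\d+) != known size (?P<known>\d+)"
)


def parse_scrub_error(line):
    """Return a dict describing one size mismatch, or None if the line does not match."""
    m = SCRUB_ERR.search(line)
    if m is None:
        return None
    size, known = int(m.group("size")), int(m.group("known"))
    return {
        "pg": m.group("pg"),
        "osd": int(m.group("osd")),
        "soid": m.group("soid"),
        "size": size,
        "known_size": known,
        # Heuristic from the mail: a smaller "known size" is the red flag.
        "known_is_smaller": known < size,
    }


if __name__ == "__main__":
    sample = (
        "0 log [ERR] : 65.10 osd.40: soid "
        "87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 "
        "size 4194304 != known size 699904"
    )
    print(parse_scrub_error(sample))
```

This only classifies the log lines; it does not decide which replica holds the right data.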

For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
in pg 65.10:
1) Find the object on the primary and the replica (from above, the primary
is osd.12 and the replica is osd.40).  You can use find in the primary and
replica current/65.10_head directories to look for a file matching
*rb.0.47d9b.1014b7b4.0000000002df*.  The file name should be
'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
2) Stop the primary and replica osds
3) Compare the file sizes for the two files -- you should find that
the file sizes do not match.
4) Replace the smaller file with the larger one (you'll probably want
to keep a copy of the smaller one around just in case).
5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
(possibly with a relatively harmless stat mismatch)
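
Steps 1, 3 and 4 above could be sketched in Python roughly as follows. This is only an illustration under several assumptions: both OSDs are already stopped (step 2), both PG directories are reachable from the machine running the script (for example after copying one file over), and the helper names `find_object_file` and `replace_smaller_with_larger` are invented for this example. It matches on the object name with a wildcard rather than the full file name, since the exact suffix is not certain. Only file contents are copied; how extended attributes should be handled is not covered in this thread and is left out.

```python
import shutil
from pathlib import Path


def find_object_file(pg_dir, object_name):
    """Step 1: search a PG directory (e.g. .../current/65.10_head) for the object file."""
    matches = sorted(p for p in Path(pg_dir).rglob(f"*{object_name}*") if p.is_file())
    if len(matches) != 1:
        raise RuntimeError(f"expected exactly one match in {pg_dir}, found {len(matches)}")
    return matches[0]


def replace_smaller_with_larger(file_a, file_b, backup_dir):
    """Steps 3 and 4: compare sizes, back up the smaller file, overwrite it with the larger."""
    file_a, file_b = Path(file_a), Path(file_b)
    size_a, size_b = file_a.stat().st_size, file_b.stat().st_size
    if size_a == size_b:
        raise RuntimeError("file sizes match; nothing to do")
    smaller, larger = (file_a, file_b) if size_a < size_b else (file_b, file_a)

    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (smaller.name + ".smaller.bak")
    shutil.copy2(smaller, backup)      # keep the smaller copy around just in case
    shutil.copyfile(larger, smaller)   # overwrite the contents of the smaller file
    return smaller, larger, backup
```

Restarting the OSDs and scrubbing the PG (step 5) is deliberately left as a manual step.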

If this worked out correctly, you can repeat for the other 5 cases.

Let me know if you have any questions.
-Sam


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: A couple of OSD-crashes after serious network trouble
  2012-12-07 21:18           ` Samuel Just
@ 2012-12-10 10:48             ` Oliver Francke
  2012-12-11 15:19               ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-10 10:48 UTC (permalink / raw)
  To: Samuel Just; +Cc: ceph-devel@vger.kernel.org

Hi Sam,

helpful input.. and... not so...

On 12/07/2012 10:18 PM, Samuel Just wrote:
> Ah... unfortunately doing a repair in these 6 cases would probably
> result in the wrong object surviving.  It should work, but it might
> corrupt the rbd image contents.  If the images are expendable, you
> could repair and then delete the images.
>
> The red flag here is that the "known size" is smaller than the other
> size.  This indicates that it most likely chose the wrong file as the
> "correct" one since rbd image blocks usually get bigger over time.  To
> fix this, you will need to manually copy the file for the larger of
> the two object replicas to replace the smaller of the two object
> replicas.
>
> For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
> in pg 65.10:
> 1) Find the object on the primary and the replica (from above, primary
> is 12 and replica is 40).  You can use find in the primary and replica
> current/65.10_head directories to look for a file matching
> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
> 2) Stop the primary and replica osds
> 3) Compare the file sizes for the two files -- you should find that
> the file sizes do not match.
> 4) Replace the smaller file with the larger one (you'll probably want
> to keep a copy of the smaller one around just in case).
> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
> (possibly with a relatively harmless stat mismatch)

Been there. On OSD.12 it's:
-rw-r--r-- 1 root root 699904 Dec  9 06:25 rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41

On OSD.40:
-rw-r--r-- 1 root root 4194304 Dec  9 06:25 rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41

Going by a quick glance into the files, both contain some readable
syslog entries.
As bad luck would have it in this example, the shorter file contains the
more recent entries?!
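
One way to go beyond a quick glance is to locate where the two copies actually differ. The Python sketch below is only an illustration: `differing_ranges` is a made-up helper, and it assumes both replica files have been copied somewhere readable. It reports byte ranges that differ and says nothing about which copy is newer; that still has to be judged from the contents.

```python
def differing_ranges(path_a, path_b, chunk_size=4096):
    """Return (offset, length) ranges, merged when adjacent, where two files differ.

    Bytes beyond the end of the shorter file count as differing.
    """
    ranges = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(chunk_size), fb.read(chunk_size)
            if not a and not b:
                break
            length = max(len(a), len(b))
            if a != b:
                if ranges and ranges[-1][0] + ranges[-1][1] == offset:
                    # Extend the previous range instead of starting a new one.
                    ranges[-1] = (ranges[-1][0], ranges[-1][1] + length)
                else:
                    ranges.append((offset, length))
            offset += length
    return ranges
```

The differing offsets can then be inspected directly in each copy to compare, for example, the timestamps of the syslog entries found there.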

What exactly happens if I try to copy or export the file? Which block
will be chosen?
The VM is running as I write, so my flexibility is limited.

Regards,

Oliver.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: A couple of OSD-crashes after serious network trouble
  2012-12-10 10:48             ` Oliver Francke
@ 2012-12-11 15:19               ` Oliver Francke
  2012-12-11 17:04                 ` Sage Weil
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-11 15:19 UTC (permalink / raw)
  To: Samuel Just; +Cc: ceph-devel@vger.kernel.org

Hi Sam,

Perhaps you overlooked my comments further down, beginning with
"been there"? ;)

If so, please have a look, because I'm clueless 8-)

On 12/10/2012 11:48 AM, Oliver Francke wrote:
> Hi Sam,
>
> helpful input.. and... not so...
>
> On 12/07/2012 10:18 PM, Samuel Just wrote:
>> Ah... unfortunately doing a repair in these 6 cases would probably
>> result in the wrong object surviving.  It should work, but it might
>> corrupt the rbd image contents.  If the images are expendable, you
>> could repair and then delete the images.
>>
>> The red flag here is that the "known size" is smaller than the other
>> size.  This indicates that it most likely chose the wrong file as the
>> "correct" one since rbd image blocks usually get bigger over time.  To
>> fix this, you will need to manually copy the file for the larger of
>> the two object replicas to replace the smaller of the two object
>> replicas.
>>
>> For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
>> in pg 65.10:
>> 1) Find the object on the primary and the replica (from above, primary
>> is 12 and replica is 40).  You can use find in the primary and replica
>> current/65.10_head directories to look for a file matching
>> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
>> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
>> 2) Stop the primary and replica osds
>> 3) Compare the file sizes for the two files -- you should find that
>> the file sizes do not match.
>> 4) Replace the smaller file with the larger one (you'll probably want
>> to keep a copy of the smaller one around just in case).
>> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
>> (possibly with a relatively harmless stat mismatch)
>
> been there. on OSD.12 it's
> -rw-r--r-- 1 root root 699904 Dec  9 06:25 
> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>
> on OSD.40:
> -rw-r--r-- 1 root root 4194304 Dec  9 06:25 
> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>
> going by a short glance into the file, there are some readable 
> syslog-entries, in both files.
> For the bad luck in this example, the shorter file contains the more 
> current entries?!
>
> What exactly happens, if I try to copy or export the file? Which block 
> will be chosen?
> VM is running as I'm writing, so flexibility reduced.
>
> Regards,
>
> Oliver.
>
>> If this worked our correctly, you can repeat for the other 5 cases.
>>
>> Let me know if you have any questions.
>> -Sam
>>
>> On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke 
>> <Oliver.Francke@filoo.de> wrote:
>>> Hi Sam,
>>>
>>> Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
>>>
>>>> That is very likely to be one of the merge_log bugs fixed between 0.48
>>>> and 0.55.  I could confirm with a stacktrace from gdb with line
>>>> numbers or the remainder of the logging dumped when the daemon
>>>> crashed.
>>>>
>>>> My understanding of your situation is that currently all pgs are
>>>> active+clean but you are missing some rbd image headers and some rbd
>>>> images appear to be corrupted.  Is that accurate?
>>>> -Sam
>>>>
>>> thnx for droppig in.
>>>
>>> Uhm almost correct, there are now 6 pg in state inconsistent:
>>>
>>> HEALTH_WARN 6 pgs inconsistent
>>> pg 65.da is active+clean+inconsistent, acting [1,33]
>>> pg 65.d7 is active+clean+inconsistent, acting [13,42]
>>> pg 65.10 is active+clean+inconsistent, acting [12,40]
>>> pg 65.f is active+clean+inconsistent, acting [13,31]
>>> pg 65.75 is active+clean+inconsistent, acting [1,33]
>>> pg 65.6a is active+clean+inconsistent, acting [13,31]
>>>
>>> I know which images are affected, but does a repair help?
>>>
>>> 0 log [ERR] : 65.10 osd.40: soid 
>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != 
>>> known size 699904
>>> 0 log [ERR] : 65.6a osd.31: soid 
>>> 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != 
>>> known size 2757632
>>> 0 log [ERR] : 65.75 osd.33: soid 
>>> 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != 
>>> known size 1238016
>>> 0 log [ERR] : 65.d7 osd.42: soid 
>>> fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != 
>>> known size 1382912
>>> 0 log [ERR] : 65.da osd.33: soid 
>>> c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != 
>>> known size 1815552
>>> 0 log [ERR] : 65.f osd.31: soid 
>>> e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != 
>>> known size 2331648
>>>
>>> of make things worse?
>>>
>>> I could only check 14 out of 20 OSD's so far, cause from two older 
>>> nodes a scrub leads to slow-requests… > couple of minutes, so VM's 
>>> got stalled… customers pressing the "reset-button", so losing caches…
>>>
>>> Comments welcome,
>>>
>>> Oliver.
>>>
>>>> On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke 
>>>> <Oliver.Francke@filoo.de> wrote:
>>>>> Hi,
>>>>>
>>>>> is the following a "known one", too? Would be good to get it out 
>>>>> of my head:
>>>>>
>>>>>
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd() 
>>>>>> [0x706c59]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0) [0x7f7f306c0ff0]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35) 
>>>>>> [0x7f7f2f35f1b5]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180) 
>>>>>> [0x7f7f2f361fc0]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 5:
>>>>>> (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166) [0x7f7f2fbf2166]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193) [0x7f7f2fbf2193]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e) [0x7f7f2fbf228e]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 9: 
>>>>>> (ceph::__ceph_assert_fail(char
>>>>>> const*, char const*, int, char const*)+0x793) [0x77e903]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 10:
>>>>>> (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&,
>>>>>> int)+0x1de3) [0x63db93]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 11:
>>>>>> (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec 
>>>>>> const&)+0x2cc)
>>>>>> [0x63e00c]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 12:
>>>>>> (boost::statechart::simple_state<PG::RecoveryState::Stray,
>>>>>> PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, 
>>>>>> mpl_::na,
>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
>>>>>> mpl_::na,
>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
>>>>>> mpl_::na,
>>>>>> mpl_::na, mpl_::na, mpl_::na>,
>>>>>> (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base 
>>>>>>
>>>>>> const&, void const*)+0x203) [0x658a63]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 13:
>>>>>> (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, 
>>>>>>
>>>>>> PG::RecoveryState::Initial, std::allocator<void>,
>>>>>> boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base 
>>>>>>
>>>>>> const&)+0x6b) [0x650b4b]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 14:
>>>>>> (PG::RecoveryState::handle_log(int, MOSDPGLog*, 
>>>>>> PG::RecoveryCtx*)+0x190)
>>>>>> [0x60a520]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 15:
>>>>>> (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666) 
>>>>>> [0x5c62e6]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 16:
>>>>>> (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b) [0x5c6f3b]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 17: 
>>>>>> (OSD::_dispatch(Message*)+0x173)
>>>>>> [0x5d1983]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 18: 
>>>>>> (OSD::ms_dispatch(Message*)+0x184)
>>>>>> [0x5d2254]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 19:
>>>>>> (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 20:
>>>>>> (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 21:
>>>>>> (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca) [0x7f7f306b88ca]
>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d) 
>>>>>> [0x7f7f2f3fc92d]
>>>>>>
>>>>> Thnx for looking,
>>>>>
>>>>>
>>>>> Oliver.
>>>>>
>>>>> -- 
>>>>>
>>>>> Oliver Francke
>>>>>
>>>>> filoo GmbH
>>>>> Moltkestraße 25a
>>>>> 33330 Gütersloh
>>>>> HRB4355 AG Gütersloh
>>>>>
>>>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>
>>>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>>>
>
>


-- 

Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh



* Re: A couple of OSD-crashes after serious network trouble
  2012-12-11 15:19               ` Oliver Francke
@ 2012-12-11 17:04                 ` Sage Weil
  2012-12-11 19:38                   ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Sage Weil @ 2012-12-11 17:04 UTC (permalink / raw)
  To: Oliver Francke; +Cc: Samuel Just, ceph-devel@vger.kernel.org

On Tue, 11 Dec 2012, Oliver Francke wrote:
> Hi Sam,
> 
> perhaps you have overlooked my comments further down, beginning with
> "been there" ? ;)

We're pretty swamped with bobtail stuff at the moment, so ceph-devel 
inquiries are low on the priority list right now.

See below:

> 
> If so, please have a look, cause I'm clueless 8-)
> 
> On 12/10/2012 11:48 AM, Oliver Francke wrote:
> > Hi Sam,
> > 
> > helpful input.. and... not so...
> > 
> > On 12/07/2012 10:18 PM, Samuel Just wrote:
> > > Ah... unfortunately doing a repair in these 6 cases would probably
> > > result in the wrong object surviving.  It should work, but it might
> > > corrupt the rbd image contents.  If the images are expendable, you
> > > could repair and then delete the images.
> > > 
> > > The red flag here is that the "known size" is smaller than the other
> > > size.  This indicates that it most likely chose the wrong file as the
> > > "correct" one since rbd image blocks usually get bigger over time.  To
> > > fix this, you will need to manually copy the file for the larger of
> > > the two object replicas to replace the smaller of the two object
> > > replicas.
> > > 
> > > For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
> > > in pg 65.10:
> > > 1) Find the object on the primary and the replica (from above, primary
> > > is 12 and replica is 40).  You can use find in the primary and replica
> > > current/65.10_head directories to look for a file matching
> > > *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
> > > 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
> > > 2) Stop the primary and replica osds
> > > 3) Compare the file sizes for the two files -- you should find that
> > > the file sizes do not match.
> > > 4) Replace the smaller file with the larger one (you'll probably want
> > > to keep a copy of the smaller one around just in case).
> > > 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
> > > (possibly with a relatively harmless stat mismatch)
> > 
> > been there. on OSD.12 it's
> > -rw-r--r-- 1 root root 699904 Dec  9 06:25
> > rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
> > 
> > on OSD.40:
> > -rw-r--r-- 1 root root 4194304 Dec  9 06:25
> > rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
> > 
> > going by a short glance into the file, there are some readable
> > syslog-entries, in both files.
> > For the bad luck in this example, the shorter file contains the more current
> > entries?!

It sounds like the larger one was correct at one point, but after the two 
got out of sync an update was applied only to the other.  What fs is this 
(inside the VM)?  If we're lucky the whole block is file data, in which 
case I would extend the smaller one (the one with the more recent data) 
out to the full size by appending the last chunk of the larger one.  (Or, 
if the bytes look like they belong to an unimportant file, just use 
truncate(1) to extend it, and get zeros for that region.)  Make backups of 
the object first, and fsck inside the VM afterwards.
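
The two options above (append the missing tail from the larger replica, or
pad with zeros) could be sketched roughly as follows.  This is only an
illustration under stated assumptions: the function names and the idea of
writing to a separate output file are hypothetical, not part of any Ceph
tool, and it assumes both OSDs are stopped and the object files have been
backed up first.

```python
import shutil
from pathlib import Path


def extend_with_tail(small: Path, large: Path, out: Path) -> int:
    """Write `small` followed by the bytes of `large` beyond len(small) to `out`.

    Returns the size of the resulting file. Neither input file is modified.
    (Hypothetical helper for illustration only.)
    """
    small_size = small.stat().st_size
    large_size = large.stat().st_size
    if small_size > large_size:
        raise ValueError("'small' is larger than 'large'; arguments swapped?")

    with open(out, "wb") as dst:
        # Keep everything from the smaller (more recent) replica.
        with open(small, "rb") as src:
            shutil.copyfileobj(src, dst)
        # Append only the region that the smaller replica is missing.
        with open(large, "rb") as src:
            src.seek(small_size)
            shutil.copyfileobj(src, dst)
    return out.stat().st_size


def extend_with_zeros(small: Path, out: Path, full_size: int) -> int:
    """Copy `small` to `out` and grow it to `full_size`, like truncate(1).

    On Linux the added region reads back as zero bytes.
    """
    shutil.copyfile(small, out)
    with open(out, "r+b") as dst:
        dst.truncate(full_size)
    return out.stat().st_size
```

Writing to a new file rather than editing in place leaves both original
replicas untouched, so the result can be inspected before it is copied over
the object files on the two OSDs.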

--

We've seen this issue bite twice now, both times on argonaut.  So far it 
has not hit anybody using anything more recent, but that is a smaller pool 
of people, so no real comfort there.  We are working on setting up a 
higher-stress long-term testing cluster to trigger this.

Can you remind me what kernel version you are using?

sage


> > 
> > What exactly happens, if I try to copy or export the file? Which block will
> > be chosen?
> > VM is running as I'm writing, so flexibility reduced.
> > 
> > Regards,
> > 
> > Oliver.
> > 
> > > If this worked out correctly, you can repeat for the other 5 cases.
> > > 
> > > Let me know if you have any questions.
> > > -Sam
> > > 
> > > On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke <Oliver.Francke@filoo.de>
> > > wrote:
> > > > Hi Sam,
> > > > 
> > > > Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
> > > > 
> > > > > That is very likely to be one of the merge_log bugs fixed between 0.48
> > > > > and 0.55.  I could confirm with a stacktrace from gdb with line
> > > > > numbers or the remainder of the logging dumped when the daemon
> > > > > crashed.
> > > > > 
> > > > > My understanding of your situation is that currently all pgs are
> > > > > active+clean but you are missing some rbd image headers and some rbd
> > > > > images appear to be corrupted.  Is that accurate?
> > > > > -Sam
> > > > > 
> > > > thnx for dropping in.
> > > > 
> > > > Uhm almost correct, there are now 6 pg in state inconsistent:
> > > > 
> > > > HEALTH_WARN 6 pgs inconsistent
> > > > pg 65.da is active+clean+inconsistent, acting [1,33]
> > > > pg 65.d7 is active+clean+inconsistent, acting [13,42]
> > > > pg 65.10 is active+clean+inconsistent, acting [12,40]
> > > > pg 65.f is active+clean+inconsistent, acting [13,31]
> > > > pg 65.75 is active+clean+inconsistent, acting [1,33]
> > > > pg 65.6a is active+clean+inconsistent, acting [13,31]
> > > > 
> > > > I know which images are affected, but does a repair help?
> > > > 
> > > > 0 log [ERR] : 65.10 osd.40: soid
> > > > 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known
> > > > size 699904
> > > > 0 log [ERR] : 65.6a osd.31: soid
> > > > 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known
> > > > size 2757632
> > > > 0 log [ERR] : 65.75 osd.33: soid
> > > > 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known
> > > > size 1238016
> > > > 0 log [ERR] : 65.d7 osd.42: soid
> > > > fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known
> > > > size 1382912
> > > > 0 log [ERR] : 65.da osd.33: soid
> > > > c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known
> > > > size 1815552
> > > > 0 log [ERR] : 65.f osd.31: soid
> > > > e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known
> > > > size 2331648
> > > > 
> > > > or make things worse?
> > > > 
> > > > I could only check 14 out of 20 OSD's so far, cause from two older nodes
> > > > a scrub leads to slow-requests… > couple of minutes, so VM's got
> > > > stalled… customers pressing the "reset-button", so losing caches…
> > > > 
> > > > Comments welcome,
> > > > 
> > > > Oliver.
> > > > 
> > > > > On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke
> > > > > <Oliver.Francke@filoo.de> wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > is the following a "known one", too? Would be good to get it out of
> > > > > > my head:
> > > > > > 
> > > > > > 
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd()
> > > > > > > [0x706c59]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0)
> > > > > > > [0x7f7f306c0ff0]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35)
> > > > > > > [0x7f7f2f35f1b5]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180)
> > > > > > > [0x7f7f2f361fc0]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 5:
> > > > > > > (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166)
> > > > > > > [0x7f7f2fbf2166]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193)
> > > > > > > [0x7f7f2fbf2193]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e)
> > > > > > > [0x7f7f2fbf228e]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 9:
> > > > > > > (ceph::__ceph_assert_fail(char
> > > > > > > const*, char const*, int, char const*)+0x793) [0x77e903]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 10:
> > > > > > > (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&,
> > > > > > > int)+0x1de3) [0x63db93]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 11:
> > > > > > > (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec
> > > > > > > const&)+0x2cc)
> > > > > > > [0x63e00c]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 12:
> > > > > > > (boost::statechart::simple_state<PG::RecoveryState::Stray,
> > > > > > > PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na,
> > > > > > > mpl_::na,
> > > > > > > mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
> > > > > > > mpl_::na,
> > > > > > > mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
> > > > > > > mpl_::na,
> > > > > > > mpl_::na, mpl_::na, mpl_::na>,
> > > > > > > (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base 
> > > > > > > const&, void const*)+0x203) [0x658a63]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 13:
> > > > > > > (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, 
> > > > > > > PG::RecoveryState::Initial, std::allocator<void>,
> > > > > > > boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base 
> > > > > > > const&)+0x6b) [0x650b4b]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 14:
> > > > > > > (PG::RecoveryState::handle_log(int, MOSDPGLog*,
> > > > > > > PG::RecoveryCtx*)+0x190)
> > > > > > > [0x60a520]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 15:
> > > > > > > (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666)
> > > > > > > [0x5c62e6]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 16:
> > > > > > > (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b)
> > > > > > > [0x5c6f3b]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 17:
> > > > > > > (OSD::_dispatch(Message*)+0x173)
> > > > > > > [0x5d1983]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 18:
> > > > > > > (OSD::ms_dispatch(Message*)+0x184)
> > > > > > > [0x5d2254]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 19:
> > > > > > > (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 20:
> > > > > > > (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 21:
> > > > > > > (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca)
> > > > > > > [0x7f7f306b88ca]
> > > > > > > /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d)
> > > > > > > [0x7f7f2f3fc92d]
> > > > > > > 
> > > > > > Thnx for looking,
> > > > > > 
> > > > > > 
> > > > > > Oliver.
> > > > > > 
> > > > > > -- 
> > > > > > 
> > > > > > Oliver Francke
> > > > > > 
> > > > > > filoo GmbH
> > > > > > > Moltkestraße 25a
> > > > > > > 33330 Gütersloh
> > > > > > > HRB4355 AG Gütersloh
> > > > > > > 
> > > > > > > Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
> > > > > > 
> > > > > > Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
> > > > > > 
> > > > > > -- 
> > > > > > To unsubscribe from this list: send the line "unsubscribe
> > > > > > ceph-devel" in
> > > > > > the body of a message to majordomo@vger.kernel.org
> > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > -- 
> > > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > > > > in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > 
> > 
> 
> 
> -- 
> 
> Oliver Francke
> 
> filoo GmbH
> Moltkestraße 25a
> 33330 Gütersloh
> HRB4355 AG Gütersloh
> 
> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
> 
> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
> 
> 
> 


* Re: A couple of OSD-crashes after serious network trouble
  2012-12-11 17:04                 ` Sage Weil
@ 2012-12-11 19:38                   ` Oliver Francke
  2012-12-13  4:15                     ` Samuel Just
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-11 19:38 UTC (permalink / raw)
  To: Sage Weil; +Cc: Samuel Just, ceph-devel@vger.kernel.org

Hi Sage,

Am 11.12.2012 um 18:04 schrieb Sage Weil <sage@inktank.com>:

> On Tue, 11 Dec 2012, Oliver Francke wrote:
>> Hi Sam,
>> 
>> perhaps you have overlooked my comments further down, beginning with
>> "been there" ? ;)
> 
> We're pretty swamped with bobtail stuff at the moment, so ceph-devel 
> inquiries are low on the priority list right now.
> 

100% agree, this thing here is "best effort" right now, true.

> See below:
> 
>> 
>> If so, please have a look, cause I'm clueless 8-)
>> 
>> On 12/10/2012 11:48 AM, Oliver Francke wrote:
>>> Hi Sam,
>>> 
>>> helpful input.. and... not so...
>>> 
>>> On 12/07/2012 10:18 PM, Samuel Just wrote:
>>>> Ah... unfortunately doing a repair in these 6 cases would probably
>>>> result in the wrong object surviving.  It should work, but it might
>>>> corrupt the rbd image contents.  If the images are expendable, you
>>>> could repair and then delete the images.
>>>> 
>>>> The red flag here is that the "known size" is smaller than the other
>>>> size.  This indicates that it most likely chose the wrong file as the
>>>> "correct" one since rbd image blocks usually get bigger over time.  To
>>>> fix this, you will need to manually copy the file for the larger of
>>>> the two object replicas to replace the smaller of the two object
>>>> replicas.
>>>> 
>>>> For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
>>>> in pg 65.10:
>>>> 1) Find the object on the primary and the replica (from above, primary
>>>> is 12 and replica is 40).  You can use find in the primary and replica
>>>> current/65.10_head directories to look for a file matching
>>>> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
>>>> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
>>>> 2) Stop the primary and replica osds
>>>> 3) Compare the file sizes for the two files -- you should find that
>>>> the file sizes do not match.
>>>> 4) Replace the smaller file with the larger one (you'll probably want
>>>> to keep a copy of the smaller one around just in case).
>>>> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
>>>> (possibly with a relatively harmless stat mismatch)
>>> 
>>> been there. on OSD.12 it's
>>> -rw-r--r-- 1 root root 699904 Dec  9 06:25
>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>> 
>>> on OSD.40:
>>> -rw-r--r-- 1 root root 4194304 Dec  9 06:25
>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>> 
>>> going by a short glance into the file, there are some readable
>>> syslog-entries, in both files.
>>> For the bad luck in this example, the shorter file contains the more current
>>> entries?!
> 
> It sounds like the larger one was at one point correct, but since they got 
> out of sync an update was applied to the other.  What fs is this (inside 
> the VM)?  If we're lucky the whole block is file data, in which case I 
> would extend the small one with more recent out to the full size by taking 
> the last chunk of the second one.  (Or, if the bytes look like an 
> unimportant file, just use truncate(1) to extend it, and get zeros for 
> that region.)  Make backups of the object first, and fsck inside the VM 
> afterwards.
> 
> --
> 
> We've seen this issue bite twice now, both times on argonaut.  So far 
> nobody using anything more recent..but that is a smaller pool of people, 
> so no real comfort there.  Working on setting up a higher-stress long-term 
> testing cluster to trigger this.
> 
> Can you remind me what kernel version you are using?

One of the affected nodes is running 3.5.4; the newer ones are nowadays on Ubuntu 12.04.1 LTS with a self-compiled 3.6.6.
Inside the VMs you can imagine all flavors: the less forgiving CentOS 5.8, some Debian 5.0 (ext3)… mostly ext3, I think. Not optimal, at least.

A couple of problems were caused by slow requests: in some log files I can see customers pressing the "RESET" button, which is implemented via qemu-monitor.
That is as destructive as can be, given some megs of cache on the rbd device.

Thnx n regards,

Oliver.

> 
> sage
> 
> 
>>> 
>>> What exactly happens, if I try to copy or export the file? Which block will
>>> be chosen?
>>> VM is running as I'm writing, so flexibility reduced.
>>> 
>>> Regards,
>>> 
>>> Oliver.
>>> 
>>>> If this worked out correctly, you can repeat for the other 5 cases.
>>>> 
>>>> Let me know if you have any questions.
>>>> -Sam
>>>> 
>>>> On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke <Oliver.Francke@filoo.de>
>>>> wrote:
>>>>> Hi Sam,
>>>>> 
>>>>> Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
>>>>> 
>>>>>> That is very likely to be one of the merge_log bugs fixed between 0.48
>>>>>> and 0.55.  I could confirm with a stacktrace from gdb with line
>>>>>> numbers or the remainder of the logging dumped when the daemon
>>>>>> crashed.
>>>>>> 
>>>>>> My understanding of your situation is that currently all pgs are
>>>>>> active+clean but you are missing some rbd image headers and some rbd
>>>>>> images appear to be corrupted.  Is that accurate?
>>>>>> -Sam
>>>>>> 
>>>>> thnx for dropping in.
>>>>> 
>>>>> Uhm almost correct, there are now 6 pg in state inconsistent:
>>>>> 
>>>>> HEALTH_WARN 6 pgs inconsistent
>>>>> pg 65.da is active+clean+inconsistent, acting [1,33]
>>>>> pg 65.d7 is active+clean+inconsistent, acting [13,42]
>>>>> pg 65.10 is active+clean+inconsistent, acting [12,40]
>>>>> pg 65.f is active+clean+inconsistent, acting [13,31]
>>>>> pg 65.75 is active+clean+inconsistent, acting [1,33]
>>>>> pg 65.6a is active+clean+inconsistent, acting [13,31]
>>>>> 
>>>>> I know which images are affected, but does a repair help?
>>>>> 
>>>>> 0 log [ERR] : 65.10 osd.40: soid
>>>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known
>>>>> size 699904
>>>>> 0 log [ERR] : 65.6a osd.31: soid
>>>>> 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known
>>>>> size 2757632
>>>>> 0 log [ERR] : 65.75 osd.33: soid
>>>>> 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known
>>>>> size 1238016
>>>>> 0 log [ERR] : 65.d7 osd.42: soid
>>>>> fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known
>>>>> size 1382912
>>>>> 0 log [ERR] : 65.da osd.33: soid
>>>>> c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known
>>>>> size 1815552
>>>>> 0 log [ERR] : 65.f osd.31: soid
>>>>> e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known
>>>>> size 2331648
>>>>> 
>>>>> or make things worse?
>>>>> 
>>>>> I could only check 14 out of 20 OSD's so far, cause from two older nodes
>>>>> a scrub leads to slow-requests… > couple of minutes, so VM's got
>>>>> stalled… customers pressing the "reset-button", so losing caches…
>>>>> 
>>>>> Comments welcome,
>>>>> 
>>>>> Oliver.
>>>>> 
>>>>>> On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke
>>>>>> <Oliver.Francke@filoo.de> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> is the following a "known one", too? Would be good to get it out of
>>>>>>> my head:
>>>>>>> 
>>>>>>> 
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd()
>>>>>>>> [0x706c59]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0)
>>>>>>>> [0x7f7f306c0ff0]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35)
>>>>>>>> [0x7f7f2f35f1b5]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180)
>>>>>>>> [0x7f7f2f361fc0]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 5:
>>>>>>>> (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166)
>>>>>>>> [0x7f7f2fbf2166]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193)
>>>>>>>> [0x7f7f2fbf2193]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e)
>>>>>>>> [0x7f7f2fbf228e]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 9:
>>>>>>>> (ceph::__ceph_assert_fail(char
>>>>>>>> const*, char const*, int, char const*)+0x793) [0x77e903]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 10:
>>>>>>>> (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&,
>>>>>>>> int)+0x1de3) [0x63db93]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 11:
>>>>>>>> (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec
>>>>>>>> const&)+0x2cc)
>>>>>>>> [0x63e00c]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 12:
>>>>>>>> (boost::statechart::simple_state<PG::RecoveryState::Stray,
>>>>>>>> PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na,
>>>>>>>> mpl_::na,
>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>> mpl_::na,
>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>> mpl_::na,
>>>>>>>> mpl_::na, mpl_::na, mpl_::na>,
>>>>>>>> (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base 
>>>>>>>> const&, void const*)+0x203) [0x658a63]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 13:
>>>>>>>> (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, 
>>>>>>>> PG::RecoveryState::Initial, std::allocator<void>,
>>>>>>>> boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base 
>>>>>>>> const&)+0x6b) [0x650b4b]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 14:
>>>>>>>> (PG::RecoveryState::handle_log(int, MOSDPGLog*,
>>>>>>>> PG::RecoveryCtx*)+0x190)
>>>>>>>> [0x60a520]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 15:
>>>>>>>> (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666)
>>>>>>>> [0x5c62e6]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 16:
>>>>>>>> (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b)
>>>>>>>> [0x5c6f3b]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 17:
>>>>>>>> (OSD::_dispatch(Message*)+0x173)
>>>>>>>> [0x5d1983]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 18:
>>>>>>>> (OSD::ms_dispatch(Message*)+0x184)
>>>>>>>> [0x5d2254]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 19:
>>>>>>>> (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 20:
>>>>>>>> (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 21:
>>>>>>>> (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca)
>>>>>>>> [0x7f7f306b88ca]
>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d)
>>>>>>>> [0x7f7f2f3fc92d]
>>>>>>>> 
>>>>>>> Thnx for looking,
>>>>>>> 
>>>>>>> 
>>>>>>> Oliver.
>>>>>>> 
>>>>>>> -- 
>>>>>>> 
>>>>>>> Oliver Francke
>>>>>>> 
>>>>>>> filoo GmbH
>>>>>>> Moltkestraße 25a
>>>>>>> 33330 Gütersloh
>>>>>>> HRB4355 AG Gütersloh
>>>>>>> 
>>>>>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>>> 
>>>>>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>>>>> 
>>> 
>>> 
>> 
>> 
>> -- 
>> 
>> Oliver Francke
>> 
>> filoo GmbH
>> Moltkestraße 25a
>> 33330 Gütersloh
>> HRB4355 AG Gütersloh
>> 
>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>> 
>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>> 
>> 
>> 



* Re: A couple of OSD-crashes after serious network trouble
  2012-12-11 19:38                   ` Oliver Francke
@ 2012-12-13  4:15                     ` Samuel Just
  2012-12-13 16:48                       ` Oliver Francke
  0 siblings, 1 reply; 14+ messages in thread
From: Samuel Just @ 2012-12-13  4:15 UTC (permalink / raw)
  To: Oliver Francke; +Cc: Sage Weil, ceph-devel@vger.kernel.org

Apologies, I missed your reply on Monday.  Any attempt to read or
write the object will hit the file on the primary (the smaller one
with the newer syslog entries).  If you take down both OSDs (12 and
40) while performing the repair, the VM in question will hang if it
tries to access that block, but should recover when you bring the OSDs
back up.  To expand on the response Sage posted, writes/reads to
that block have been hitting the primary (osd.12), which unfortunately
holds the incorrect file.  I would, however, have expected those
writes to have been replicated to the larger file on osd.40 as
well.  Are you certain that the newer syslog entries on 12 aren't also
present on 40?
-Sam
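
One way to check whether the data on osd.12 is also present on osd.40 is to
compare the overlapping part of the two object files block by block.  The
sketch below is only an illustration: the function name and the 4096-byte
block size are assumptions, and it expects copies of both object files to be
available on one host.

```python
from pathlib import Path


def differing_ranges(a: Path, b: Path, block: int = 4096) -> list[tuple[int, int]]:
    """Return (offset, length) ranges where the overlapping part of a and b differs.

    Only the first min(len(a), len(b)) bytes are compared; adjacent
    differing blocks are merged into one range.
    """
    ranges: list[tuple[int, int]] = []
    overlap = min(a.stat().st_size, b.stat().st_size)
    with open(a, "rb") as fa, open(b, "rb") as fb:
        offset = 0
        while offset < overlap:
            n = min(block, overlap - offset)
            if fa.read(n) != fb.read(n):
                # Merge with the previous range when the two are adjacent.
                if ranges and ranges[-1][0] + ranges[-1][1] == offset:
                    ranges[-1] = (ranges[-1][0], ranges[-1][1] + n)
                else:
                    ranges.append((offset, n))
            offset += n
    return ranges
```

An empty result would mean the first 699904 bytes are identical on both
replicas, i.e. the newer entries did reach osd.40; any reported ranges mark
where the two copies have diverged and are worth inspecting by hand.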

On Tue, Dec 11, 2012 at 11:38 AM, Oliver Francke
<Oliver.Francke@filoo.de> wrote:
> Hi Sage,
>
> Am 11.12.2012 um 18:04 schrieb Sage Weil <sage@inktank.com>:
>
>> On Tue, 11 Dec 2012, Oliver Francke wrote:
>>> Hi Sam,
>>>
>>> perhaps you have overlooked my comments further down, beginning with
>>> "been there" ? ;)
>>
>> We're pretty swamped with bobtail stuff at the moment, so ceph-devel
>> inquiries are low on the priority list right now.
>>
>
> 100% agree, this thing here is "best effort" right now, true.
>
>> See below:
>>
>>>
>>> If so, please have a look, cause I'm clueless 8-)
>>>
>>> On 12/10/2012 11:48 AM, Oliver Francke wrote:
>>>> Hi Sam,
>>>>
>>>> helpful input.. and... not so...
>>>>
>>>> On 12/07/2012 10:18 PM, Samuel Just wrote:
>>>>> Ah... unfortunately doing a repair in these 6 cases would probably
>>>>> result in the wrong object surviving.  It should work, but it might
>>>>> corrupt the rbd image contents.  If the images are expendable, you
>>>>> could repair and then delete the images.
>>>>>
>>>>> The red flag here is that the "known size" is smaller than the other
>>>>> size.  This indicates that it most likely chose the wrong file as the
>>>>> "correct" one since rbd image blocks usually get bigger over time.  To
>>>>> fix this, you will need to manually copy the file for the larger of
>>>>> the two object replicas to replace the smaller of the two object
>>>>> replicas.
>>>>>
>>>>> For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
>>>>> in pg 65.10:
>>>>> 1) Find the object on the primary and the replica (from above, primary
>>>>> is 12 and replica is 40).  You can use find in the primary and replica
>>>>> current/65.10_head directories to look for a file matching
>>>>> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
>>>>> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
>>>>> 2) Stop the primary and replica osds
>>>>> 3) Compare the file sizes for the two files -- you should find that
>>>>> the file sizes do not match.
>>>>> 4) Replace the smaller file with the larger one (you'll probably want
>>>>> to keep a copy of the smaller one around just in case).
>>>>> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
>>>>> (possibly with a relatively harmless stat mismatch)
>>>>
>>>> been there. on OSD.12 it's
>>>> -rw-r--r-- 1 root root 699904 Dec  9 06:25
>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>
>>>> on OSD.40:
>>>> -rw-r--r-- 1 root root 4194304 Dec  9 06:25
>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>
>>>> going by a short glance into the file, there are some readable
>>>> syslog-entries, in both files.
>>>> For the bad luck in this example, the shorter file contains the more current
>>>> entries?!
>>
>> It sounds like the larger one was at one point correct, but since they got
>> out of sync an update was applied to the other.  What fs is this (inside
>> the VM)?  If we're lucky the whole block is file data, in which case I
>> would extend the small one with more recent out to the full size by taking
>> the last chunk of the second one.  (Or, if the bytes look like an
>> unimportant file, just use truncate(1) to extend it, and get zeros for
>> that region.)  Make backups of the object first, and fsck inside the VM
>> afterwards.
>>
>> --
>>
>> We've seen this issue bite twice now, both times on argonaut.  So far
>> nobody using anything more recent..but that is a smaller pool of people,
>> so no real comfort there.  Working on setting up a higher-stress long-term
>> testing cluster to trigger this.
>>
>> Can you remind me what kernel version you are using?
>
> one of the affected nodes is driven by 3.5.4, the newer ones are nowadays Ubuntu 12.04.1 LTS with self-compiled 3.6.6.
> Inside the VM's you can imagine all flavors, less forgiving CentOS 5.8, some debian5.0 ( ext3)… mostly ext3, I think. Not optimum, at least.
>
> Couple of problems caused by slow requests, I can see in some log-files customers pressing the "RESET" button, implemented via qemu-monitor.
> Destructive as can be, with having some megs of cache with the rbd-device.
>
> Thnx n regards,
>
> Oliver.
>
>>
>> sage
>>
>>
>>>>
>>>> What exactly happens, if I try to copy or export the file? Which block will
>>>> be chosen?
>>>> VM is running as I'm writing, so flexibility reduced.
>>>>
>>>> Regards,
>>>>
>>>> Oliver.
>>>>
>>>>> If this worked out correctly, you can repeat for the other 5 cases.
>>>>>
>>>>> Let me know if you have any questions.
>>>>> -Sam
>>>>>
>>>>> On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke <Oliver.Francke@filoo.de>
>>>>> wrote:
>>>>>> Hi Sam,
>>>>>>
>>>>>> Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
>>>>>>
>>>>>>> That is very likely to be one of the merge_log bugs fixed between 0.48
>>>>>>> and 0.55.  I could confirm with a stacktrace from gdb with line
>>>>>>> numbers or the remainder of the logging dumped when the daemon
>>>>>>> crashed.
>>>>>>>
>>>>>>> My understanding of your situation is that currently all pgs are
>>>>>>> active+clean but you are missing some rbd image headers and some rbd
>>>>>>> images appear to be corrupted.  Is that accurate?
>>>>>>> -Sam
>>>>>>>
>>>>>> thnx for dropping in.
>>>>>>
>>>>>> Uhm almost correct, there are now 6 pg in state inconsistent:
>>>>>>
>>>>>> HEALTH_WARN 6 pgs inconsistent
>>>>>> pg 65.da is active+clean+inconsistent, acting [1,33]
>>>>>> pg 65.d7 is active+clean+inconsistent, acting [13,42]
>>>>>> pg 65.10 is active+clean+inconsistent, acting [12,40]
>>>>>> pg 65.f is active+clean+inconsistent, acting [13,31]
>>>>>> pg 65.75 is active+clean+inconsistent, acting [1,33]
>>>>>> pg 65.6a is active+clean+inconsistent, acting [13,31]
>>>>>>
>>>>>> I know which images are affected, but does a repair help?
>>>>>>
>>>>>> 0 log [ERR] : 65.10 osd.40: soid
>>>>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known
>>>>>> size 699904
>>>>>> 0 log [ERR] : 65.6a osd.31: soid
>>>>>> 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known
>>>>>> size 2757632
>>>>>> 0 log [ERR] : 65.75 osd.33: soid
>>>>>> 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known
>>>>>> size 1238016
>>>>>> 0 log [ERR] : 65.d7 osd.42: soid
>>>>>> fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known
>>>>>> size 1382912
>>>>>> 0 log [ERR] : 65.da osd.33: soid
>>>>>> c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known
>>>>>> size 1815552
>>>>>> 0 log [ERR] : 65.f osd.31: soid
>>>>>> e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known
>>>>>> size 2331648
>>>>>>
>>>>>> or make things worse?
>>>>>>
>>>>>> I could only check 14 out of 20 OSD's so far, cause from two older nodes
>>>>>> a scrub leads to slow-requests? > couple of minutes, so VM's got
>>>>>> stalled? customers pressing the "reset-button", so losing caches?
>>>>>>
>>>>>> Comments welcome,
>>>>>>
>>>>>> Oliver.
>>>>>>
>>>>>>> On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke
>>>>>>> <Oliver.Francke@filoo.de> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> is the following a "known one", too? Would be good to get it out of
>>>>>>>> my head:
>>>>>>>>
>>>>>>>>
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd()
>>>>>>>>> [0x706c59]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0)
>>>>>>>>> [0x7f7f306c0ff0]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35)
>>>>>>>>> [0x7f7f2f35f1b5]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180)
>>>>>>>>> [0x7f7f2f361fc0]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 5:
>>>>>>>>> (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166)
>>>>>>>>> [0x7f7f2fbf2166]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193)
>>>>>>>>> [0x7f7f2fbf2193]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e)
>>>>>>>>> [0x7f7f2fbf228e]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 9:
>>>>>>>>> (ceph::__ceph_assert_fail(char
>>>>>>>>> const*, char const*, int, char const*)+0x793) [0x77e903]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 10:
>>>>>>>>> (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&,
>>>>>>>>> int)+0x1de3) [0x63db93]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 11:
>>>>>>>>> (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec
>>>>>>>>> const&)+0x2cc)
>>>>>>>>> [0x63e00c]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 12:
>>>>>>>>> (boost::statechart::simple_state<PG::RecoveryState::Stray,
>>>>>>>>> PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na,
>>>>>>>>> mpl_::na,
>>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>>> mpl_::na,
>>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>>> mpl_::na,
>>>>>>>>> mpl_::na, mpl_::na, mpl_::na>,
>>>>>>>>> (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base
>>>>>>>>> const&, void const*)+0x203) [0x658a63]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 13:
>>>>>>>>> (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine,
>>>>>>>>> PG::RecoveryState::Initial, std::allocator<void>,
>>>>>>>>> boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base
>>>>>>>>> const&)+0x6b) [0x650b4b]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 14:
>>>>>>>>> (PG::RecoveryState::handle_log(int, MOSDPGLog*,
>>>>>>>>> PG::RecoveryCtx*)+0x190)
>>>>>>>>> [0x60a520]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 15:
>>>>>>>>> (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666)
>>>>>>>>> [0x5c62e6]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 16:
>>>>>>>>> (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b)
>>>>>>>>> [0x5c6f3b]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 17:
>>>>>>>>> (OSD::_dispatch(Message*)+0x173)
>>>>>>>>> [0x5d1983]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 18:
>>>>>>>>> (OSD::ms_dispatch(Message*)+0x184)
>>>>>>>>> [0x5d2254]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 19:
>>>>>>>>> (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 20:
>>>>>>>>> (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 21:
>>>>>>>>> (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca)
>>>>>>>>> [0x7f7f306b88ca]
>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d)
>>>>>>>>> [0x7f7f2f3fc92d]
>>>>>>>>>
>>>>>>>> Thnx for looking,
>>>>>>>>
>>>>>>>>
>>>>>>>> Oliver.
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Oliver Francke
>>>>>>>>
>>>>>>>> filoo GmbH
>>>>>>>> Moltkestraße 25a
>>>>>>>> 33330 Gütersloh
>>>>>>>> HRB4355 AG Gütersloh
>>>>>>>>
>>>>>>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>>>>
>>>>>>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Oliver Francke
>>>
>>> filoo GmbH
>>> Moltkestraße 25a
>>> 33330 Gütersloh
>>> HRB4355 AG Gütersloh
>>>
>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>
>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>
>>>
>>>
>


* Re: A couple of OSD-crashes after serious network trouble
  2012-12-13  4:15                     ` Samuel Just
@ 2012-12-13 16:48                       ` Oliver Francke
  2012-12-13 20:48                         ` Samuel Just
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Francke @ 2012-12-13 16:48 UTC (permalink / raw)
  To: Samuel Just; +Cc: Sage Weil, ceph-devel@vger.kernel.org

Hi Sam,

On 12/13/2012 05:15 AM, Samuel Just wrote:
> Apologies, I missed your reply on Monday.  Any attempt to read or

no prob ;) We are busy, too, with preparing new nodes and switch to 10GE 
this evening.

> write the object will hit the file on the primary (the smaller one
> with the newer syslog entries).  If you take down both OSDs (12 and
> 40) while performing the repair, the vm in question will hang if it
> tries to access that block, but should recover when you bring the OSDs
> back up.  To expand on the response Sage posted, writes/reads to
> that block have been hitting the primary (osd.12) which unfortunately
> is the incorrect file.  I would, however, have expected that those
> writes would have been replicated to the larger file on osd.40 as
> well.  Are you certain that the newer syslog entries on 12 aren't also
> present on 40?

well... time heals... I re-checked right now and both files are md5-wise 
identical?!
Not checked the other 5 inconsistencies.
Still having three headers missing and 6 OSD's not checked with scrub, 
though.
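
A minimal sketch of the re-check described above: compare the two on-disk
copies of one object by size and MD5. The paths are placeholders for wherever
the object file lives under each OSD's current/65.10_head directory, and the
helper names file_md5 and compare_replicas are made up for illustration; they
are not part of any Ceph tooling.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_replicas(primary_path, replica_path):
    """Compare two copies of the same object file by size and MD5."""
    primary_size = os.path.getsize(primary_path)
    replica_size = os.path.getsize(replica_path)
    return {
        "primary_size": primary_size,
        "replica_size": replica_size,
        "same_size": primary_size == replica_size,
        "same_md5": file_md5(primary_path) == file_md5(replica_path),
    }
```

It is probably safest to run this against copies of the files, or with both
OSDs stopped, so the contents cannot change between the two reads.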

Will be back... for sure ;)

Thnx for now,

Oliver.


> -Sam
>
> On Tue, Dec 11, 2012 at 11:38 AM, Oliver Francke
> <Oliver.Francke@filoo.de> wrote:
>> Hi Sage,
>>
>> Am 11.12.2012 um 18:04 schrieb Sage Weil <sage@inktank.com>:
>>
>>> On Tue, 11 Dec 2012, Oliver Francke wrote:
>>>> Hi Sam,
>>>>
>>>> perhaps you have overlooked my comments further down, beginning with
>>>> "been there" ? ;)
>>> We're pretty swamped with bobtail stuff at the moment, so ceph-devel
>>> inquiries are low on the priority list right now.
>>>
>> 100% agree, this thing here is "best effort" right now, true.
>>
>>> See below:
>>>
>>>> If so, please have a look, cause I'm clueless 8-)
>>>>
>>>> On 12/10/2012 11:48 AM, Oliver Francke wrote:
>>>>> Hi Sam,
>>>>>
>>>>> helpful input.. and... not so...
>>>>>
>>>>> On 12/07/2012 10:18 PM, Samuel Just wrote:
>>>>>> Ah... unfortunately doing a repair in these 6 cases would probably
>>>>>> result in the wrong object surviving.  It should work, but it might
>>>>>> corrupt the rbd image contents.  If the images are expendable, you
>>>>>> could repair and then delete the images.
>>>>>>
>>>>>> The red flag here is that the "known size" is smaller than the other
>>>>>> size.  This indicates that it most likely chose the wrong file as the
>>>>>> "correct" one since rbd image blocks usually get bigger over time.  To
>>>>>> fix this, you will need to manually copy the file for the larger of
>>>>>> the two object replicas to replace the smaller of the two object
>>>>>> replicas.
>>>>>>
>>>>>> For the first, soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
>>>>>> in pg 65.10:
>>>>>> 1) Find the object on the primary and the replica (from above, primary
>>>>>> is 12 and replica is 40).  You can use find in the primary and replica
>>>>>> current/65.10_head directories to look for a file matching
>>>>>> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
>>>>>> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
>>>>>> 2) Stop the primary and replica osds
>>>>>> 3) Compare the file sizes for the two files -- you should find that
>>>>>> the file sizes do not match.
>>>>>> 4) Replace the smaller file with the larger one (you'll probably want
>>>>>> to keep a copy of the smaller one around just in case).
>>>>>> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
>>>>>> (possibly with a relatively harmless stat mismatch)
>>>>> been there. on OSD.12 it's
>>>>> -rw-r--r-- 1 root root 699904 Dec  9 06:25
>>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>>
>>>>> on OSD.40:
>>>>> -rw-r--r-- 1 root root 4194304 Dec  9 06:25
>>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>>
>>>>> going by a short glance into the file, there are some readable
>>>>> syslog-entries, in both files.
>>>>> For the bad luck in this example, the shorter file contains the more current
>>>>> entries?!
>>> It sounds like the larger one was at one point correct, but since they got
>>> out of sync an update was applied to the other.  What fs is this (inside
>>> the VM)?  If we're lucky the whole block is file data, in which case I
>>> would extend the small one with more recent out to the full size by taking
>>> the last chunk of the second one.  (Or, if the bytes look like an
>>> unimportant file, just use truncate(1) to extend it, and get zeros for
>>> that region.)  Make backups of the object first, and fsck inside the VM
>>> afterwards.
>>>
>>> --
>>>
>>> We've seen this issue bite twice now, both times on argonaut.  So far
>>> nobody using anything more recent..but that is a smaller pool of people,
>>> so no real comfort there.  Working on setting up a higher-stress long-term
>>> testing cluster to trigger this.
>>>
>>> Can you remind me what kernel version you are using?
>> one of the affected nodes is driven by 3.5.4, the newer ones are nowadays Ubuntu 12.04.1 LTS with self-compiled 3.6.6.
>> Inside the VM's you can imagine all flavors, less forgiving CentOS 5.8, some debian5.0 ( ext3)… mostly ext3, I think. Not optimum, at least.
>>
>> Couple of problems caused by slow requests, I can see in some log-files customers pressing the "RESET" button, implemented via qemu-monitor.
>> Destructive as can be, with having some megs of cache with the rbd-device.
>>
>> Thnx n regards,
>>
>> Oliver.
>>
>>> sage
>>>
>>>
>>>>> What exactly happens, if I try to copy or export the file? Which block will
>>>>> be chosen?
>>>>> VM is running as I'm writing, so flexibility reduced.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Oliver.
>>>>>
>>>>>> If this worked out correctly, you can repeat for the other 5 cases.
>>>>>>
>>>>>> Let me know if you have any questions.
>>>>>> -Sam
>>>>>>
>>>>>> On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke <Oliver.Francke@filoo.de>
>>>>>> wrote:
>>>>>>> Hi Sam,
>>>>>>>
>>>>>>> Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
>>>>>>>
>>>>>>>> That is very likely to be one of the merge_log bugs fixed between 0.48
>>>>>>>> and 0.55.  I could confirm with a stacktrace from gdb with line
>>>>>>>> numbers or the remainder of the logging dumped when the daemon
>>>>>>>> crashed.
>>>>>>>>
>>>>>>>> My understanding of your situation is that currently all pgs are
>>>>>>>> active+clean but you are missing some rbd image headers and some rbd
>>>>>>>> images appear to be corrupted.  Is that accurate?
>>>>>>>> -Sam
>>>>>>>>
>>>>>>> thnx for dropping in.
>>>>>>>
>>>>>>> Uhm almost correct, there are now 6 pg in state inconsistent:
>>>>>>>
>>>>>>> HEALTH_WARN 6 pgs inconsistent
>>>>>>> pg 65.da is active+clean+inconsistent, acting [1,33]
>>>>>>> pg 65.d7 is active+clean+inconsistent, acting [13,42]
>>>>>>> pg 65.10 is active+clean+inconsistent, acting [12,40]
>>>>>>> pg 65.f is active+clean+inconsistent, acting [13,31]
>>>>>>> pg 65.75 is active+clean+inconsistent, acting [1,33]
>>>>>>> pg 65.6a is active+clean+inconsistent, acting [13,31]
>>>>>>>
>>>>>>> I know which images are affected, but does a repair help?
>>>>>>>
>>>>>>> 0 log [ERR] : 65.10 osd.40: soid
>>>>>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known
>>>>>>> size 699904
>>>>>>> 0 log [ERR] : 65.6a osd.31: soid
>>>>>>> 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known
>>>>>>> size 2757632
>>>>>>> 0 log [ERR] : 65.75 osd.33: soid
>>>>>>> 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known
>>>>>>> size 1238016
>>>>>>> 0 log [ERR] : 65.d7 osd.42: soid
>>>>>>> fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known
>>>>>>> size 1382912
>>>>>>> 0 log [ERR] : 65.da osd.33: soid
>>>>>>> c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known
>>>>>>> size 1815552
>>>>>>> 0 log [ERR] : 65.f osd.31: soid
>>>>>>> e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known
>>>>>>> size 2331648
>>>>>>>
>>>>>>> or make things worse?
>>>>>>>
>>>>>>> I could only check 14 out of 20 OSD's so far, cause from two older nodes
>>>>>>> a scrub leads to slow-requests? > couple of minutes, so VM's got
>>>>>>> stalled? customers pressing the "reset-button", so losing caches?
>>>>>>>
>>>>>>> Comments welcome,
>>>>>>>
>>>>>>> Oliver.
>>>>>>>
>>>>>>>> On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke
>>>>>>>> <Oliver.Francke@filoo.de> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> is the following a "known one", too? Would be good to get it out of
>>>>>>>>> my head:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd()
>>>>>>>>>> [0x706c59]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0)
>>>>>>>>>> [0x7f7f306c0ff0]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35)
>>>>>>>>>> [0x7f7f2f35f1b5]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180)
>>>>>>>>>> [0x7f7f2f361fc0]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 5:
>>>>>>>>>> (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166)
>>>>>>>>>> [0x7f7f2fbf2166]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193)
>>>>>>>>>> [0x7f7f2fbf2193]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e)
>>>>>>>>>> [0x7f7f2fbf228e]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 9:
>>>>>>>>>> (ceph::__ceph_assert_fail(char
>>>>>>>>>> const*, char const*, int, char const*)+0x793) [0x77e903]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 10:
>>>>>>>>>> (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&,
>>>>>>>>>> int)+0x1de3) [0x63db93]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 11:
>>>>>>>>>> (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec
>>>>>>>>>> const&)+0x2cc)
>>>>>>>>>> [0x63e00c]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 12:
>>>>>>>>>> (boost::statechart::simple_state<PG::RecoveryState::Stray,
>>>>>>>>>> PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na,
>>>>>>>>>> mpl_::na,
>>>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>>>> mpl_::na,
>>>>>>>>>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>>>>>>>> mpl_::na,
>>>>>>>>>> mpl_::na, mpl_::na, mpl_::na>,
>>>>>>>>>> (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base
>>>>>>>>>> const&, void const*)+0x203) [0x658a63]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 13:
>>>>>>>>>> (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine,
>>>>>>>>>> PG::RecoveryState::Initial, std::allocator<void>,
>>>>>>>>>> boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base
>>>>>>>>>> const&)+0x6b) [0x650b4b]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 14:
>>>>>>>>>> (PG::RecoveryState::handle_log(int, MOSDPGLog*,
>>>>>>>>>> PG::RecoveryCtx*)+0x190)
>>>>>>>>>> [0x60a520]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 15:
>>>>>>>>>> (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666)
>>>>>>>>>> [0x5c62e6]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 16:
>>>>>>>>>> (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b)
>>>>>>>>>> [0x5c6f3b]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 17:
>>>>>>>>>> (OSD::_dispatch(Message*)+0x173)
>>>>>>>>>> [0x5d1983]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 18:
>>>>>>>>>> (OSD::ms_dispatch(Message*)+0x184)
>>>>>>>>>> [0x5d2254]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 19:
>>>>>>>>>> (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 20:
>>>>>>>>>> (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 21:
>>>>>>>>>> (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca)
>>>>>>>>>> [0x7f7f306b88ca]
>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d)
>>>>>>>>>> [0x7f7f2f3fc92d]
>>>>>>>>>>
>>>>>>>>> Thnx for looking,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Oliver.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Oliver Francke
>>>>>>>>>
>>>>>>>>> filoo GmbH
>>>>>>>>> Moltkestraße 25a
>>>>>>>>> 33330 Gütersloh
>>>>>>>>> HRB4355 AG Gütersloh
>>>>>>>>>
>>>>>>>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>>>>>
>>>>>>>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>>>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> Oliver Francke
>>>>
>>>> filoo GmbH
>>>> Moltkestraße 25a
>>>> 33330 Gütersloh
>>>> HRB4355 AG Gütersloh
>>>>
>>>> Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz
>>>>
>>>> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
>>>>
>>>>
>>>>


-- 

Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh



* Re: A couple of OSD-crashes after serious network trouble
  2012-12-13 16:48                       ` Oliver Francke
@ 2012-12-13 20:48                         ` Samuel Just
  0 siblings, 0 replies; 14+ messages in thread
From: Samuel Just @ 2012-12-13 20:48 UTC (permalink / raw)
  To: Oliver Francke; +Cc: Sage Weil, ceph-devel@vger.kernel.org

Most likely what happened is that the block represented by that file
was fully overwritten replacing both copies.  You can probably
consider that one healed.  The others should be dealt with similarly:
the larger file should be the more correct one (since it should also
reflect writes made recently to the smaller one).
-Sam
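
To see at a glance which of the remaining objects show the red flag (known
size smaller than the size reported by the other OSD), the scrub errors quoted
below can be tabulated. This is only a sketch: the regular expression is
fitted to the six [ERR] lines as they appear in this thread, not to any
documented log format, and parse_scrub_errors is a hypothetical helper name.

```python
import re

# Pattern fitted to the "[ERR]" lines quoted in this thread; an assumption,
# not a documented Ceph log format.
SCRUB_ERR = re.compile(
    r"(?P<pg>\d+\.[0-9a-f]+) osd\.(?P<osd>\d+): soid (?P<soid>\S+) "
    r"size (?P<size>\d+) != known size (?P<known>\d+)"
)


def parse_scrub_errors(text):
    """Extract size mismatches from (possibly quoted and wrapped) log text."""
    # Drop mail quote markers at the start of each line, then undo wrapping.
    lines = [re.sub(r"^[>\s]+", "", line) for line in text.splitlines()]
    normalized = " ".join(" ".join(lines).split())
    mismatches = []
    for match in SCRUB_ERR.finditer(normalized):
        size = int(match.group("size"))
        known = int(match.group("known"))
        mismatches.append({
            "pg": match.group("pg"),
            "osd": int(match.group("osd")),
            "soid": match.group("soid"),
            "size": size,
            "known_size": known,
            # True when the copy treated as "known" is the smaller one.
            "known_is_smaller": known < size,
        })
    return mismatches
```

Each returned entry names the pg, the OSD that reported the differing size,
and both sizes, which is enough to decide which copy to back up first.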

On Thu, Dec 13, 2012 at 8:48 AM, Oliver Francke <Oliver.Francke@filoo.de> wrote:
> Hi Sam,
>
>
> On 12/13/2012 05:15 AM, Samuel Just wrote:
>>
>> Apologies, I missed your reply on Monday.  Any attempt to read or
>
>
> no prob ;) We are busy, too, with preparing new nodes and switch to 10GE
> this evening.
>
>
>> write the object will hit the file on the primary (the smaller one
>> with the newer syslog entries).  If you take down both OSDs (12 and
>> 40) while performing the repair, the vm in question will hang if it
>> tries to access that block, but should recover when you bring the OSDs
>> back up.  To expand on the response Sage posted, writes/reads to
>> that block have been hitting the primary (osd.12) which unfortunately
>> is the incorrect file.  I would, however, have expected that those
>> writes would have been replicated to the larger file on osd.40 as
>> well.  Are you certain that the newer syslog entries on 12 aren't also
>> present on 40?
>
>
> well... time heals... I re-checked right now and both files are md5-wise
> identical?!
> Not checked the other 5 inconsistencies.
> Still having three headers missing and 6 OSD's not checked with scrub,
> though.
>
> Will be back... for sure ;)
>
> Thnx for now,
>
> Oliver.
>
>
>
>> -Sam
>>
>> On Tue, Dec 11, 2012 at 11:38 AM, Oliver Francke
>> <Oliver.Francke@filoo.de> wrote:
>>>
>>> Hi Sage,
>>>
>>> Am 11.12.2012 um 18:04 schrieb Sage Weil <sage@inktank.com>:
>>>
>>>> On Tue, 11 Dec 2012, Oliver Francke wrote:
>>>>>
>>>>> Hi Sam,
>>>>>
>>>>> perhaps you have overlooked my comments further down, beginning with
>>>>> "been there" ? ;)
>>>>
>>>> We're pretty swamped with bobtail stuff at the moment, so ceph-devel
>>>> inquiries are low on the priority list right now.
>>>>
>>> 100% agree, this thing here is "best effort" right now, true.
>>>
>>>> See below:
>>>>
>>>>> If so, please have a look, cause I'm clueless 8-)
>>>>>
>>>>> On 12/10/2012 11:48 AM, Oliver Francke wrote:
>>>>>>
>>>>>> Hi Sam,
>>>>>>
>>>>>> helpful input.. and... not so...
>>>>>>
>>>>>> On 12/07/2012 10:18 PM, Samuel Just wrote:
>>>>>>>
>>>>>>> Ah... unfortunately doing a repair in these 6 cases would probably
>>>>>>> result in the wrong object surviving.  It should work, but it might
>>>>>>> corrupt the rbd image contents.  If the images are expendable, you
>>>>>>> could repair and then delete the images.
>>>>>>>
>>>>>>> The red flag here is that the "known size" is smaller than the other
>>>>>>> size.  This indicates that it most likely chose the wrong file as the
>>>>>>> "correct" one since rbd image blocks usually get bigger over time.
>>>>>>> To
>>>>>>> fix this, you will need to manually copy the file for the larger of
>>>>>>> the two object replicas to replace the smaller of the two object
>>>>>>> replicas.
>>>>>>>
>>>>>>> For the first, soid
>>>>>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65
>>>>>>> in pg 65.10:
>>>>>>> 1) Find the object on the primary and the replica (from above,
>>>>>>> primary
>>>>>>> is 12 and replica is 40).  You can use find in the primary and
>>>>>>> replica
>>>>>>> current/65.10_head directories to look for a file matching
>>>>>>> *rb.0.47d9b.1014b7b4.0000000002df*).  The file name should be
>>>>>>> 'rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__65' I think.
>>>>>>> 2) Stop the primary and replica osds
>>>>>>> 3) Compare the file sizes for the two files -- you should find that
>>>>>>> the file sizes do not match.
>>>>>>> 4) Replace the smaller file with the larger one (you'll probably want
>>>>>>> to keep a copy of the smaller one around just in case).
>>>>>>> 5) Restart the osds and scrub pg 65.10 -- the pg should come up clean
>>>>>>> (possibly with a relatively harmless stat mismatch)
>>>>>>
>>>>>> been there. on OSD.12 it's
>>>>>> -rw-r--r-- 1 root root 699904 Dec  9 06:25
>>>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>>>
>>>>>> on OSD.40:
>>>>>> -rw-r--r-- 1 root root 4194304 Dec  9 06:25
>>>>>> rb.0.47d9b.1014b7b4.0000000002df__head_87C96F10__41
>>>>>>
>>>>>> going by a short glance into the file, there are some readable
>>>>>> syslog-entries, in both files.
>>>>>> For the bad luck in this example, the shorter file contains the more
>>>>>> current
>>>>>> entries?!
>>>>
>>>> It sounds like the larger one was at one point correct, but since they
>>>> got
>>>> out of sync an update was applied to the other.  What fs is this (inside
>>>> the VM)?  If we're lucky the whole block is file data, in which case I
>>>> would extend the small one with more recent out to the full size by
>>>> taking
>>>> the last chunk of the second one.  (Or, if the bytes look like an
>>>> unimportant file, just use truncate(1) to extend it, and get zeros for
>>>> that region.)  Make backups of the object first, and fsck inside the VM
>>>> afterwards.
>>>>
>>>> --
>>>>
>>>> We've seen this issue bite twice now, both times on argonaut.  So far
>>>> nobody using anything more recent..but that is a smaller pool of people,
>>>> so no real comfort there.  Working on setting up a higher-stress
>>>> long-term
>>>> testing cluster to trigger this.
>>>>
>>>> Can you remind me what kernel version you are using?
>>>
>>> one of the affected nodes is driven by 3.5.4, the newer ones are
>>> nowadays Ubuntu 12.04.1 LTS with self-compiled 3.6.6.
>>> Inside the VM's you can imagine all flavors, less forgiving CentOS 5.8,
>>> some debian5.0 ( ext3)… mostly ext3, I think. Not optimum, at least.
>>>
>>> Couple of problems caused by slow requests, I can see in some log-files
>>> customers pressing the "RESET" button, implemented via qemu-monitor.
>>> Destructive as can be, with having some megs of cache with the
>>> rbd-device.
>>>
>>> Thnx n regards,
>>>
>>> Oliver.
>>>
>>>> sage
>>>>
>>>>
>>>>>> What exactly happens, if I try to copy or export the file? Which block
>>>>>> will
>>>>>> be chosen?
>>>>>> VM is running as I'm writing, so flexibility reduced.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Oliver.
>>>>>>
>>>>>>> If this worked out correctly, you can repeat for the other 5 cases.
>>>>>>>
>>>>>>> Let me know if you have any questions.
>>>>>>> -Sam
>>>>>>>
>>>>>>> On Fri, Dec 7, 2012 at 11:09 AM, Oliver Francke
>>>>>>> <Oliver.Francke@filoo.de>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Sam,
>>>>>>>>
>>>>>>>> Am 07.12.2012 um 19:37 schrieb Samuel Just <sam.just@inktank.com>:
>>>>>>>>
>>>>>>>>> That is very likely to be one of the merge_log bugs fixed between
>>>>>>>>> 0.48
>>>>>>>>> and 0.55.  I could confirm with a stacktrace from gdb with line
>>>>>>>>> numbers or the remainder of the logging dumped when the daemon
>>>>>>>>> crashed.
>>>>>>>>>
>>>>>>>>> My understanding of your situation is that currently all pgs are
>>>>>>>>> active+clean but you are missing some rbd image headers and some
>>>>>>>>> rbd
>>>>>>>>> images appear to be corrupted.  Is that accurate?
>>>>>>>>> -Sam
>>>>>>>>>
>>>>>>>> thnx for dropping in.
>>>>>>>>
>>>>>>>> Uhm almost correct, there are now 6 pg in state inconsistent:
>>>>>>>>
>>>>>>>> HEALTH_WARN 6 pgs inconsistent
>>>>>>>> pg 65.da is active+clean+inconsistent, acting [1,33]
>>>>>>>> pg 65.d7 is active+clean+inconsistent, acting [13,42]
>>>>>>>> pg 65.10 is active+clean+inconsistent, acting [12,40]
>>>>>>>> pg 65.f is active+clean+inconsistent, acting [13,31]
>>>>>>>> pg 65.75 is active+clean+inconsistent, acting [1,33]
>>>>>>>> pg 65.6a is active+clean+inconsistent, acting [13,31]
>>>>>>>>
>>>>>>>> I know which images are affected, but does a repair help?
>>>>>>>>
>>>>>>>> 0 log [ERR] : 65.10 osd.40: soid 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 != known size 699904
>>>>>>>> 0 log [ERR] : 65.6a osd.31: soid 19a2526a/rb.0.2dcf2.1da2a31e.000000000737/head//65 size 4191744 != known size 2757632
>>>>>>>> 0 log [ERR] : 65.75 osd.33: soid 20550575/rb.0.2d520.5c17a6e3.000000000339/head//65 size 4194304 != known size 1238016
>>>>>>>> 0 log [ERR] : 65.d7 osd.42: soid fa3a5d7/rb.0.2c2a8.12ec359d.00000000205c/head//65 size 4194304 != known size 1382912
>>>>>>>> 0 log [ERR] : 65.da osd.33: soid c2a344da/rb.0.2be17.cb4bd69.000000000081/head//65 size 4191744 != known size 1815552
>>>>>>>> 0 log [ERR] : 65.f osd.31: soid e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 != known size 2331648
>>>>>>>>
>>>>>>>> or make things worse?
>>>>>>>>
>>>>>>>> I could only check 14 out of 20 OSDs so far, because on the two
>>>>>>>> older nodes a scrub leads to slow requests… > couple of minutes, so
>>>>>>>> VMs got stalled… customers pressing the "reset-button", so losing
>>>>>>>> caches…
>>>>>>>>
>>>>>>>> Comments welcome,
>>>>>>>>
>>>>>>>> Oliver.
>>>>>>>>
>>>>>>>>> On Fri, Dec 7, 2012 at 6:39 AM, Oliver Francke
>>>>>>>>> <Oliver.Francke@filoo.de> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> is the following a "known one", too? Would be good to get it out
>>>>>>>>>> of
>>>>>>>>>> my head:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 1: /usr/bin/ceph-osd() [0x706c59]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 2: (()+0xeff0) [0x7f7f306c0ff0]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 3: (gsignal()+0x35) [0x7f7f2f35f1b5]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 4: (abort()+0x180) [0x7f7f2f361fc0]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7f2fbf3dc5]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 6: (()+0xcb166) [0x7f7f2fbf2166]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 7: (()+0xcb193) [0x7f7f2fbf2193]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 8: (()+0xcb28e) [0x7f7f2fbf228e]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x793) [0x77e903]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 10: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1de3) [0x63db93]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 11: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x2cc) [0x63e00c]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 12: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x203) [0x658a63]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 13: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x650b4b]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 14: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x190) [0x60a520]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 15: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x666) [0x5c62e6]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 16: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x11b) [0x5c6f3b]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 17: (OSD::_dispatch(Message*)+0x173) [0x5d1983]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 18: (OSD::ms_dispatch(Message*)+0x184) [0x5d2254]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 19: (SimpleMessenger::DispatchQueue::entry()+0x5e9) [0x7d3c09]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 20: (SimpleMessenger::dispatch_entry()+0x15) [0x7d5195]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 21: (SimpleMessenger::DispatchThread::entry()+0xd) [0x726bad]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 22: (()+0x68ca) [0x7f7f306b88ca]
>>>>>>>>>>> /var/log/ceph/ceph-osd.40.log.1.gz: 23: (clone()+0x6d) [0x7f7f2f3fc92d]
>>>>>>>>>>>
>>>>>>>>>> Thnx for looking,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Oliver.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Oliver Francke
>>>>>>>>>>
>>>>>>>>>> filoo GmbH
>>>>>>>>>> Moltkestraße 25a
>>>>>>>>>> 33330 Gütersloh
>>>>>>>>>> HRB4355 AG Gütersloh
>>>>>>>>>>
>>>>>>>>>> Managing directors: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>>>>>>
>>>>>>>>>> Follow us on Twitter: http://twitter.com/filoogmbh
>>>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Oliver Francke
>>>>>
>>>>> filoo GmbH
>>>>> Moltkestraße 25a
>>>>> 33330 Gütersloh
>>>>> HRB4355 AG Gütersloh
>>>>>
>>>>> Managing directors: S.Grewing | J.Rehpöhler | C.Kunz
>>>>>
>>>>> Follow us on Twitter: http://twitter.com/filoogmbh
>>>>>
>>>>>
>>>>>
>
>
>
> --
>
> Oliver Francke
>
> filoo GmbH
> Moltkestraße 25a
> 33330 Gütersloh
> HRB4355 AG Gütersloh
>
> Managing directors: S.Grewing | J.Rehpöhler | C.Kunz
>
>
> Follow us on Twitter: http://twitter.com/filoogmbh
>
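
For anyone who wants to tabulate the scrub errors quoted further up
(the "size ... != known size ..." lines), here is a minimal sketch in
Python. It is not from the thread: the names QUOTED, ERR_RE and
parse_scrub_errors are made up for illustration, and the pattern only
assumes the line layout visible in the quoted log excerpt. It strips the
mail quote markers, undoes the line wrapping, and returns one record per
error. It only reads the log text and says nothing about whether a
repair is safe.

```python
import re

# Two of the scrub errors as they appear in the quoted mail
# (quote markers and mail-client line wrapping left in on purpose).
QUOTED = """\
>>>>>>>> 0 log [ERR] : 65.10 osd.40: soid
>>>>>>>> 87c96f10/rb.0.47d9b.1014b7b4.0000000002df/head//65 size 4194304 !=
>>>>>>>> known
>>>>>>>> size 699904
>>>>>>>> 0 log [ERR] : 65.f osd.31: soid
>>>>>>>> e8d2430f/rb.0.2d1e9.1339c5dd.000000000c41/head//65 size 2424832 !=
>>>>>>>> known
>>>>>>>> size 2331648
"""

# Assumed layout, taken only from the excerpt above:
#   log [ERR] : <pg> osd.<id>: soid <object> size <n> != known size <m>
ERR_RE = re.compile(
    r"log \[ERR\] : (?P<pg>\S+) osd\.(?P<osd>\d+): soid (?P<soid>\S+) "
    r"size (?P<size>\d+) != known size (?P<known>\d+)"
)


def parse_scrub_errors(text):
    """Return one dict per size-mismatch error found in quoted log text."""
    # Drop leading '>' quote markers, then collapse all whitespace so that
    # records wrapped across several lines become contiguous again.
    unquoted = [re.sub(r"^[>\s]+", "", line) for line in text.splitlines()]
    flat = " ".join(" ".join(unquoted).split())
    return [
        {
            "pg": m["pg"],
            "osd": int(m["osd"]),
            "soid": m["soid"],
            "size": int(m["size"]),
            "known_size": int(m["known"]),
        }
        for m in ERR_RE.finditer(flat)
    ]


if __name__ == "__main__":
    for err in parse_scrub_errors(QUOTED):
        delta = err["size"] - err["known_size"]
        print(f'pg {err["pg"]} osd.{err["osd"]}: {delta} bytes larger than known size')
```

Feeding it the full list of six errors works the same way; the records
can then be matched against the pg ids from the HEALTH_WARN output.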


end of thread, other threads:[~2012-12-13 20:48 UTC | newest]

Thread overview: 14+ messages
2012-12-05 11:15 A couple of OSD-crashes after serious network trouble Oliver Francke
2012-12-05 14:54 ` Sage Weil
2012-12-06 17:27   ` Oliver Francke
2012-12-07 14:39     ` Oliver Francke
2012-12-07 18:37       ` Samuel Just
2012-12-07 19:09         ` Oliver Francke
2012-12-07 21:18           ` Samuel Just
2012-12-10 10:48             ` Oliver Francke
2012-12-11 15:19               ` Oliver Francke
2012-12-11 17:04                 ` Sage Weil
2012-12-11 19:38                   ` Oliver Francke
2012-12-13  4:15                     ` Samuel Just
2012-12-13 16:48                       ` Oliver Francke
2012-12-13 20:48                         ` Samuel Just
