From: Dave Spano <dspano@optogenics.com>
To: ceph-devel <ceph-devel@vger.kernel.org>
Subject: OSD Crash
Date: Mon, 4 Mar 2013 13:02:03 -0500 (EST)
Message-ID: <33027299.339.1362420122090.JavaMail.dspano@it1>
In-Reply-To: <8566685.312.1362419807745.JavaMail.dspano@it1>

[-- Attachment #1: Type: text/plain, Size: 525 bytes --]

I had one of my OSDs crash yesterday. I'm using ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5). 

The part of the log file where the crash happened is attached. I'm not really sure what led up to it, but I did get an alert from my server monitor telling me that my swap space got really low around the time it crashed.
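
For anyone who wants to pull just the backtrace out of a dump like the attached one, here is a minimal sketch. It assumes frames are formatted as `N: (symbol+0xoffset) [0xaddress]`, as they appear in the attached log; the name `extract_frames` and the regular expression are illustrative and not part of any Ceph tooling.

```python
import re

# Matches frames such as
#   " 9: (ceph::buffer::create_page_aligned(unsigned int)+0x95) [0x82ef25]"
#   " 1: /usr/bin/ceph-osd() [0x78430a]"
# Assumption: this layout is taken from the attached log only.
FRAME_RE = re.compile(r"^\s*(\d+): (.*) \[(0x[0-9a-f]+)\]\s*$")


def extract_frames(lines):
    """Return (index, symbol, address) tuples for each backtrace frame."""
    frames = []
    for line in lines:
        match = FRAME_RE.match(line)
        if not match:
            continue  # tracker events, logging levels, etc.
        symbol = match.group(2)
        # Drop the parentheses that wrap "symbol+offset" when present.
        if symbol.startswith("(") and symbol.endswith(")"):
            symbol = symbol[1:-1]
        frames.append((int(match.group(1)), symbol, match.group(3)))
    return frames


if __name__ == "__main__":
    sample = [
        " 8: (()+0xb596e) [0x7f5d602d296e]",
        " 9: (ceph::buffer::create_page_aligned(unsigned int)+0x95) [0x82ef25]",
        " 10: (Pipe::read_message(Message**)+0x2421) [0x8d3591]",
    ]
    for index, symbol, address in extract_frames(sample):
        print(index, symbol, address)
```

Frames that print as `()+0x...` have no symbol name in the log; resolving them needs the executable or the `objdump` output mentioned in the NOTE at the end of the trace.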

The OSD reconnected after I restarted the service. Currently, I'm waiting patiently for 1 of my 400 PGs to get out of active+clean+scrubbing status.

Dave Spano 
Optogenics 
Systems Administrator 



[-- Attachment #2: osd.0.log --]
[-- Type: text/x-log; name=osd.0.log, Size: 6851 bytes --]

   -17> 2013-03-03 13:02:13.478152 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.13039.0:6860359, seq: 5393222, time: 2013-03-03 13:02:13.478134, event: write_thread_in_journal_buffer, request: osd_sub_op(client.13039.0:6860359 3.0 a10c17c8/rb.0.2dd7.16d28c4f.00000000002f/head//3 [] v 411'1980074 snapset=0=[]:[] snapc=0=[]) v7
   -16> 2013-03-03 13:02:13.478153 7f5d559ab700  1 -- 192.168.3.11:6801/4500 --> osd.1 192.168.3.12:6802/2467 -- osd_sub_op_reply(client.14000.1:570700 0.16 5e01a96/100003797f2.00000000/head//0 [] ondisk, result = 0) v1 -- ?+0 0xc45cc80
   -15> 2013-03-03 13:02:13.478184 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.14000.1:570701, seq: 5393223, time: 2013-03-03 13:02:13.478184, event: write_thread_in_journal_buffer, request: osd_sub_op(client.14000.1:570701 0.22 40dccca2/100001164ca.00000002/head//0 [] v 411'447369 snapset=0=[]:[] snapc=0=[]) v7
   -14> 2013-03-03 13:02:13.478209 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625658, seq: 5393225, time: 2013-03-03 13:02:13.478209, event: write_thread_in_journal_buffer, request: osd_sub_op(client.11755.0:2625658 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095529 snapset=0=[]:[] snapc=0=[]) v7
   -13> 2013-03-03 13:02:13.478234 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625659, seq: 5393226, time: 2013-03-03 13:02:13.478234, event: write_thread_in_journal_buffer, request: osd_sub_op(client.11755.0:2625659 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095530 snapset=0=[]:[] snapc=0=[]) v7
   -12> 2013-03-03 13:02:13.484696 7f5d549a9700  1 -- 192.168.3.11:6800/4500 <== client.11755 192.168.1.64:0/1062411 90128 ==== ping v1 ==== 0+0+0 (0 0 0) 0xff4e000 con 0x307a6e0
   -11> 2013-03-03 13:02:13.489457 7f5d4f99f700  5 --OSD::tracker-- reqid: client.11755.0:2625660, seq: 5393227, time: 2013-03-03 13:02:13.489457, event: started, request: osd_sub_op(client.11755.0:2625660 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095531 snapset=0=[]:[] snapc=0=[]) v7
   -10> 2013-03-03 13:02:13.489503 7f5d4f99f700  5 --OSD::tracker-- reqid: client.11755.0:2625660, seq: 5393227, time: 2013-03-03 13:02:13.489503, event: commit_queued_for_journal_write, request: osd_sub_op(client.11755.0:2625660 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095531 snapset=0=[]:[] snapc=0=[]) v7
    -9> 2013-03-03 13:02:13.571632 7f5d501a0700  5 --OSD::tracker-- reqid: client.11755.0:2625657, seq: 5393224, time: 2013-03-03 13:02:13.571631, event: started, request: osd_op(client.11755.0:2625657 rb.0.2ea4.614c277f.00000000003d [write 1253376~4096] 3.c7bd6ff1) v4
    -8> 2013-03-03 13:02:13.571661 7f5d501a0700  5 --OSD::tracker-- reqid: client.11755.0:2625657, seq: 5393224, time: 2013-03-03 13:02:13.571661, event: started, request: osd_op(client.11755.0:2625657 rb.0.2ea4.614c277f.00000000003d [write 1253376~4096] 3.c7bd6ff1) v4
    -7> 2013-03-03 13:02:13.571733 7f5d501a0700  5 --OSD::tracker-- reqid: client.11755.0:2625657, seq: 5393224, time: 2013-03-03 13:02:13.571733, event: waiting for subops from [1], request: osd_op(client.11755.0:2625657 rb.0.2ea4.614c277f.00000000003d [write 1253376~4096] 3.c7bd6ff1) v4
    -6> 2013-03-03 13:02:13.598028 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.13039.0:6860359, seq: 5393222, time: 2013-03-03 13:02:13.598027, event: journaled_completion_queued, request: osd_sub_op(client.13039.0:6860359 3.0 a10c17c8/rb.0.2dd7.16d28c4f.00000000002f/head//3 [] v 411'1980074 snapset=0=[]:[] snapc=0=[]) v7
    -5> 2013-03-03 13:02:13.598061 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.14000.1:570701, seq: 5393223, time: 2013-03-03 13:02:13.598061, event: journaled_completion_queued, request: osd_sub_op(client.14000.1:570701 0.22 40dccca2/100001164ca.00000002/head//0 [] v 411'447369 snapset=0=[]:[] snapc=0=[]) v7
    -4> 2013-03-03 13:02:13.598081 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625658, seq: 5393225, time: 2013-03-03 13:02:13.598081, event: journaled_completion_queued, request: osd_sub_op(client.11755.0:2625658 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095529 snapset=0=[]:[] snapc=0=[]) v7
    -3> 2013-03-03 13:02:13.598098 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625659, seq: 5393226, time: 2013-03-03 13:02:13.598098, event: journaled_completion_queued, request: osd_sub_op(client.11755.0:2625659 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095530 snapset=0=[]:[] snapc=0=[]) v7
    -2> 2013-03-03 13:02:13.598134 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625660, seq: 5393227, time: 2013-03-03 13:02:13.598134, event: write_thread_in_journal_buffer, request: osd_sub_op(client.11755.0:2625660 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095531 snapset=0=[]:[] snapc=0=[]) v7
    -1> 2013-03-03 13:02:13.598257 7f5d5a9b5700  5 --OSD::tracker-- reqid: client.11755.0:2625660, seq: 5393227, time: 2013-03-03 13:02:13.598257, event: journaled_completion_queued, request: osd_sub_op(client.11755.0:2625660 3.7 2cb006a7/rb.0.2ea4.614c277f.00000000103d/head//3 [] v 411'6095531 snapset=0=[]:[] snapc=0=[]) v7
     0> 2013-03-03 13:02:13.753064 7f5d4c097700 -1 *** Caught signal (Aborted) **
 in thread 7f5d4c097700

 ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)
 1: /usr/bin/ceph-osd() [0x78430a]
 2: (()+0xfcb0) [0x7f5d60fc3cb0]
 3: (gsignal()+0x35) [0x7f5d5f982425]
 4: (abort()+0x17b) [0x7f5d5f985b8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f5d602d469d]
 6: (()+0xb5846) [0x7f5d602d2846]
 7: (()+0xb5873) [0x7f5d602d2873]
 8: (()+0xb596e) [0x7f5d602d296e]
 9: (ceph::buffer::create_page_aligned(unsigned int)+0x95) [0x82ef25]
 10: (Pipe::read_message(Message**)+0x2421) [0x8d3591]
 11: (Pipe::reader()+0x8c2) [0x8e3db2]
 12: (Pipe::Reader::entry()+0xd) [0x8e668d]
 13: (()+0x7e9a) [0x7f5d60fbbe9a]
 14: (clone()+0x6d) [0x7f5d5fa3fcbd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   0/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 hadoop
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent    100000
  max_new         1000
  log_file /var/log/ceph/osd.0.log
--- end dump of recent events ---
root@ha1:/var/log/ceph# 

