* Assertion: ./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
From: Martin Mailand @ 2011-12-22 13:48 UTC
To: ceph-devel
Hi,
today two of my OSDs (osd.4 and osd.7) crashed with the same error:
2011-12-21 14:41:18.896008 7fae9f3a5700 journal check_for_full at
80625664 : JOURNAL FULL 80625664 >= 368639 (max_size 107372544 start
80994304)
2011-12-21 14:41:23.205993 7fae9fba6700 journal FULL_FULL -> FULL_WAIT.
last commit epoch committed, waiting for a new one to start.
2011-12-21 14:41:24.075990 7fae9fba6700 journal FULL_WAIT ->
FULL_NOTFULL. journal now active, setting completion plug.
./messages/MOSDRepScrub.h: In function 'virtual void
MOSDRepScrub::decode_payload(CephContext*)', in thread '7fae93977700'
./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
ceph version 0.39-171-gdcedda8
(commit:dcedda84d0e1f69af985c301276c67c1b11e7efc)
1: /usr/bin/ceph-osd() [0x685e77]
2: (decode_message(CephContext*, ceph_msg_header&, ceph_msg_footer&,
ceph::buffer::list&, ceph::buffer::list&, ceph::buffer::list&)+0xcd2)
[0x6a7202]
3: (SimpleMessenger::Pipe::read_message(Message**)+0x136d) [0x62c9cd]
4: (SimpleMessenger::Pipe::reader()+0xb99) [0x6357d9]
5: (SimpleMessenger::Pipe::Reader::entry()+0xd) [0x4c244d]
6: (()+0x6d8c) [0x7faea6873d8c]
7: (clone()+0x6d) [0x7faea4eb004d]
ceph version 0.39-171-gdcedda8
(commit:dcedda84d0e1f69af985c301276c67c1b11e7efc)
1: /usr/bin/ceph-osd() [0x685e77]
2: (decode_message(CephContext*, ceph_msg_header&, ceph_msg_footer&,
ceph::buffer::list&, ceph::buffer::list&, ceph::buffer::list&)+0xcd2)
[0x6a7202]
3: (SimpleMessenger::Pipe::read_message(Message**)+0x136d) [0x62c9cd]
4: (SimpleMessenger::Pipe::reader()+0xb99) [0x6357d9]
5: (SimpleMessenger::Pipe::Reader::entry()+0xd) [0x4c244d]
6: (()+0x6d8c) [0x7faea6873d8c]
7: (clone()+0x6d) [0x7faea4eb004d]
*** Caught signal (Aborted) **
in thread 7fae93977700
ceph version 0.39-171-gdcedda8
(commit:dcedda84d0e1f69af985c301276c67c1b11e7efc)
1: /usr/bin/ceph-osd() [0x645172]
2: (()+0xfc60) [0x7faea687cc60]
3: (gsignal()+0x35) [0x7faea4dfdd05]
4: (abort()+0x186) [0x7faea4e01ab6]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7faea56b46dd]
6: (()+0xb9926) [0x7faea56b2926]
7: (()+0xb9953) [0x7faea56b2953]
8: (()+0xb9a5e) [0x7faea56b2a5e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x396) [0x6193d6]
10: /usr/bin/ceph-osd() [0x685e77]
11: (decode_message(CephContext*, ceph_msg_header&, ceph_msg_footer&,
ceph::buffer::list&, ceph::buffer::list&, ceph::buffer::list&)+0xcd2)
[0x6a7202]
12: (SimpleMessenger::Pipe::read_message(Message**)+0x136d) [0x62c9cd]
13: (SimpleMessenger::Pipe::reader()+0xb99) [0x6357d9]
14: (SimpleMessenger::Pipe::Reader::entry()+0xd) [0x4c244d]
15: (()+0x6d8c) [0x7faea6873d8c]
16: (clone()+0x6d) [0x7faea4eb004d]
(gdb) thread apply all bt
<snip>
Thread 1 (Thread 2400):
#0 0x00007faea687cb3b in raise () from
/lib/x86_64-linux-gnu/libpthread.so.0
#1 0x0000000000644dc2 in reraise_fatal (signum=6) at
global/signal_handler.cc:59
#2 0x00000000006453ba in handle_fatal_signal (signum=6) at
global/signal_handler.cc:106
#3 <signal handler called>
---Type <return> to continue, or q <return> to quit---
#4 0x00007faea4dfdd05 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007faea4e01ab6 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007faea56b46dd in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007faea56b2926 in ?? () from
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007faea56b2953 in std::terminate() () from
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007faea56b2a5e in __cxa_throw () from
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00000000006193d6 in ceph::__ceph_assert_fail (assertion=<value
optimized out>, file=<value optimized out>, line=<value optimized out>,
func=<value optimized out>) at common/assert.cc:70
#11 0x0000000000685e77 in MOSDRepScrub::decode_payload (this=0x33c0c40,
cct=<value optimized out>) at ./messages/MOSDRepScrub.h:64
#12 0x00000000006a7202 in decode_message (cct=0x2722000, header=...,
footer=<value optimized out>, front=<value optimized out>, middle=<value
optimized out>,
data=...) at msg/Message.cc:551
#13 0x000000000062c9cd in SimpleMessenger::Pipe::read_message
(this=0x2ed3780, pm=0x7fae93976d88) at msg/SimpleMessenger.cc:1987
#14 0x00000000006357d9 in SimpleMessenger::Pipe::reader (this=0x2ed3780)
at msg/SimpleMessenger.cc:1601
#15 0x00000000004c244d in SimpleMessenger::Pipe::Reader::entry
(this=<value optimized out>) at msg/SimpleMessenger.h:208
#16 0x00007faea6873d8c in start_thread () from
/lib/x86_64-linux-gnu/libpthread.so.0
#17 0x00007faea4eb004d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#18 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 2400)]#0 0x00007faea687cb3b in raise ()
from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) frame 11
#11 0x0000000000685e77 in MOSDRepScrub::decode_payload (this=0x33c0c40,
cct=<value optimized out>) at ./messages/MOSDRepScrub.h:64
64 ./messages/MOSDRepScrub.h: No such file or directory.
in ./messages/MOSDRepScrub.h
(gdb) p v
$1 = 1 '\001'
-martin
* Re: Assertion: ./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
From: Gregory Farnum @ 2011-12-22 20:24 UTC
To: Martin Mailand; +Cc: ceph-devel
I see you're following master! :) You got bit by a wire-incompatible
change in one of the OSD messages that Sam made, although I think he's
actually going to be walking it back after a conversation we just had.
In any case, restarting all of your OSDs so they're running the same
code will fix it. :)
-Greg
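
To illustrate the failure mode Greg describes, here is a minimal C++ sketch of a versioned wire format in which an old decoder accepts only version 0. It is not Ceph's code: `ScrubMsg`, its fields, `encode`, `decode_strict_v0`, and `decode_v1` are hypothetical names, and the sketch throws an exception where Ceph's `assert` would abort the daemon. It assumes, consistent with the `p v` output of `1` in the gdb session above, that the upgraded sender writes a leading version byte of 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a scrub request; NOT Ceph's MOSDRepScrub.
struct ScrubMsg {
    uint32_t scrub_from = 0;
    uint32_t chunk = 0;  // assumed to exist only from encoding version 1 on
};

// Encode with a leading version byte, followed by little-endian fields.
std::vector<uint8_t> encode(const ScrubMsg& m, uint8_t version) {
    std::vector<uint8_t> out{version};
    auto put32 = [&out](uint32_t x) {
        for (int i = 0; i < 4; ++i)
            out.push_back(static_cast<uint8_t>(x >> (8 * i)));
    };
    put32(m.scrub_from);
    if (version >= 1) put32(m.chunk);
    return out;
}

// Read a little-endian 32-bit value at byte offset `off`.
static uint32_t get32(const std::vector<uint8_t>& in, std::size_t off) {
    if (off + 4 > in.size()) throw std::runtime_error("short buffer");
    uint32_t x = 0;
    for (int i = 0; i < 4; ++i)
        x |= static_cast<uint32_t>(in[off + i]) << (8 * i);
    return x;
}

// Old-style decoder: understands only version 0 and rejects anything else.
// This mirrors the spirit of `assert(v == 0)`, but throws instead of aborting.
ScrubMsg decode_strict_v0(const std::vector<uint8_t>& in) {
    if (in.empty()) throw std::runtime_error("empty buffer");
    const uint8_t v = in[0];
    if (v != 0)
        throw std::runtime_error("FAILED check (v == 0), got v=" + std::to_string(v));
    ScrubMsg m;
    m.scrub_from = get32(in, 1);
    return m;
}

// New-style decoder: accepts versions 0 and 1, so it can read old messages too.
ScrubMsg decode_v1(const std::vector<uint8_t>& in) {
    if (in.empty()) throw std::runtime_error("empty buffer");
    const uint8_t v = in[0];
    if (v > 1)
        throw std::runtime_error("unsupported version " + std::to_string(v));
    ScrubMsg m;
    m.scrub_from = get32(in, 1);
    if (v >= 1) m.chunk = get32(in, 5);
    return m;
}
```

In a mixed cluster, a node still running `decode_strict_v0` fails as soon as a peer sends a message produced by `encode(m, 1)`, which is why bringing every OSD onto the same build makes the crash disappear.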
* Re: Assertion: ./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
From: Martin Mailand @ 2011-12-22 20:40 UTC
To: Gregory Farnum; +Cc: ceph-devel, Samuel Just
Hi Greg,
OK. At the moment I also have one PG that stays in scrubbing. Is that
also a result of the different versions I am running?
Do you know if Sam needs the cluster in this state to debug the scrubbing
problem? Or is it unusable for that because of the different versions?
-martin
* Re: Assertion: ./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
From: Samuel Just @ 2011-12-22 21:08 UTC
To: martin; +Cc: Gregory Farnum, ceph-devel
Martin, that bug should actually be fixed in current master. You'll
need to upgrade the whole cluster, though.
-Sam
* Re: Assertion: ./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)
From: Martin Mailand @ 2011-12-22 21:27 UTC
To: Samuel Just; +Cc: Gregory Farnum, ceph-devel
Hi Sam,
okay, after I upgraded the whole cluster, the stuck pg went away.
-martin