From: "Jim Schutt" <jaschut@sandia.gov>
To: Samuel Just <sam.just@inktank.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: osd/OSDMap.h: 330: FAILED assert(is_up(osd))
Date: Wed, 18 Jul 2012 09:29:48 -0600
Message-ID: <5006D66C.50006@sandia.gov>
In-Reply-To: <CA+4uBUb237Z-dqyCTe49Ce73Csxz_Kop7U9vbYj35L9CLXiNag@mail.gmail.com>

On 07/17/2012 06:03 PM, Samuel Just wrote:
> master should now have a fix for that, let me know how it goes.  I opened
> bug #2798 for this issue.
>

Hmmm, it seems handle_osd_ping() now runs into a case where,
for the first ping it gets, service.osdmap can still be empty?

      0> 2012-07-18 09:17:23.977497 7fffe6ec6700 -1 *** Caught signal (Segmentation fault) **
  in thread 7fffe6ec6700

  ceph version 0.48argonaut-419-g4e1d973 (commit:4e1d973e466cd45138f004e84ab8631d9b2a60fa)
  1: /usr/bin/ceph-osd() [0x723c39]
  2: (()+0xf4a0) [0x7ffff76584a0]
  3: (OSD::handle_osd_ping(MOSDPing*)+0x7d4) [0x5d7894]
  4: (OSD::heartbeat_dispatch(Message*)+0x71) [0x5d8111]
  5: (SimpleMessenger::DispatchQueue::entry()+0x583) [0x7d5103]
  6: (SimpleMessenger::dispatch_entry()+0x15) [0x7d6485]
  7: (SimpleMessenger::DispatchThread::entry()+0xd) [0x79523d]
  8: (()+0x77f1) [0x7ffff76507f1]
  9: (clone()+0x6d) [0x7ffff6aa1ccd]

gdb has this to say:

(gdb) bt
#0  0x00007ffff765836b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x0000000000724067 in reraise_fatal (signum=11) at global/signal_handler.cc:58
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:104
#3  <signal handler called>
#4  get_epoch (this=0x15d0000, m=0x1587000) at ./osd/OSDMap.h:210
#5  OSD::handle_osd_ping (this=0x15d0000, m=0x1587000) at osd/OSD.cc:1711
#6  0x00000000005d8111 in OSD::heartbeat_dispatch (this=0x15d0000, m=0x1587000) at osd/OSD.cc:2769
#7  0x00000000007d5103 in ms_deliver_dispatch (this=0x1472960) at msg/Messenger.h:504
#8  SimpleMessenger::DispatchQueue::entry (this=0x1472960) at msg/SimpleMessenger.cc:367
#9  0x00000000007d6485 in SimpleMessenger::dispatch_entry (this=0x1472880) at msg/SimpleMessenger.cc:384
#10 0x000000000079523d in SimpleMessenger::DispatchThread::entry (this=<value optimized out>) at ./msg/SimpleMessenger.h:807
#11 0x00007ffff76507f1 in start_thread (arg=0x7fffe6ec6700) at pthread_create.c:301
#12 0x00007ffff6aa1ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) f 5
#5  OSD::handle_osd_ping (this=0x15d0000, m=0x1587000) at osd/OSD.cc:1711
1711					m->stamp);
(gdb) l
1706		}
1707	      }
1708	      Message *r = new MOSDPing(monc->get_fsid(),
1709					curmap->get_epoch(),
1710					MOSDPing::PING_REPLY,
1711					m->stamp);
1712	      hbserver_messenger->send_message(r, m->get_connection());
1713	
1714	      if (curmap->is_up(from)) {
1715		note_peer_epoch(from, m->map_epoch);
(gdb) p curmap
$1 = std::tr1::shared_ptr (empty) 0x0
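
To illustrate what the backtrace suggests, here is a minimal standalone
sketch, not the actual Ceph code. Map, Service, publish(), get_map() and
handle_ping() are hypothetical stand-ins, and std::shared_ptr is used in
place of std::tr1::shared_ptr. It assumes the published map pointer stays
empty until the first publish, so a handler that can run earlier has to
check the pointer before dereferencing it. Whether dropping the ping is the
right response in the OSD is a separate question.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the map type; only the epoch matters here.
struct Map {
  uint32_t epoch;
  uint32_t get_epoch() const { return epoch; }
};

// Hypothetical stand-in for a service that hands out the most recently
// published map. The pointer is empty until publish() is called once.
class Service {
  mutable std::mutex lock;
  std::shared_ptr<const Map> published;

public:
  void publish(std::shared_ptr<const Map> m) {
    std::lock_guard<std::mutex> l(lock);
    published = std::move(m);
  }

  std::shared_ptr<const Map> get_map() const {
    std::lock_guard<std::mutex> l(lock);
    return published;  // may be empty
  }
};

// Returns true and fills *reply_epoch if a reply could be built.
// Returns false if no map has been published yet, instead of
// dereferencing an empty shared_ptr as in frame #4 above.
bool handle_ping(const Service& svc, uint32_t* reply_epoch) {
  std::shared_ptr<const Map> curmap = svc.get_map();
  if (!curmap)
    return false;
  *reply_epoch = curmap->get_epoch();
  return true;
}
```

With this sketch, calling handle_ping() on a freshly constructed Service
returns false, and it starts returning true with the published epoch once
publish() has been called.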

-- Jim

> Thanks for the info!
> -Sam
>
> On Tue, Jul 17, 2012 at 2:54 PM, Jim Schutt <jaschut@sandia.gov> wrote:
>> On 07/17/2012 03:44 PM, Samuel Just wrote:
>>>
>>> Not quite.  OSDService::get_osdmap() returns the most recently
>>> published osdmap.  Generally, OSD::osdmap is safe to use when you are
>>> holding the osd lock.  Otherwise, OSDService::get_osdmap() should be
>>> used.  There are a few other things that should be fixed surrounding
>>> this issue as well, I'll put some time into it today.  The map_lock
>>> should probably be removed altogether.
>>
>>
>> Thanks for taking a look.  Let me know when
>> you get something, and I'll take it for a spin.
>>
>> Thanks -- Jim
>>
>>> -Sam
>>
>>
>>
>
>
