udev remove events to mark OSD down/out on disk-pull

* udev remove events to mark OSD down/out on disk-pull
@ 2016-11-16  2:30 David Disseldorp
  2016-11-16  7:05 ` Loic Dachary
  2016-11-16 14:50 ` Sage Weil
  0 siblings, 2 replies; 5+ messages in thread
From: David Disseldorp @ 2016-11-16  2:30 UTC (permalink / raw)
  To: ceph-devel

Hi,

I'm currently looking at ways to speed up OSD down/out notifications
for disk-pull events, and was investigating using udev remove events
for this.

IIUC, the outage currently propagates through to the mons via OSD device
I/O error -> filestore I/O error -> ceph-osd ceph_abort() -> heartbeat
failure.

For the disk-pull case, this should be relatively easy to speed up
by handling the remove event in 95-ceph-osd.rules with an appropriate
osd down/out PDU. The problem then becomes maintaining consistent
information in the udev database (all stashed via IMPORT{program}):
- cluster / OSD ids
- appropriate cephx creds
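
A minimal sketch of what such a remove handler might look like, assuming
the udev database already carries the OSD and cluster ids. The property
names ID_CEPH_OSD_ID and ID_CEPH_CLUSTER and the function name are
hypothetical and only illustrate the idea; cephx credential handling is
deliberately left out. The function only builds the `ceph osd down` and
`ceph osd out` command lines and does not run them.

```python
def build_remove_commands(env):
    """Return ceph CLI command lines for a udev remove event, or [] to ignore it.

    `env` is a mapping of udev event properties (for example os.environ when
    invoked from a RUN rule). ID_CEPH_OSD_ID and ID_CEPH_CLUSTER are
    hypothetical properties assumed to have been stashed at add time.
    """
    if env.get("ACTION") != "remove":
        return []

    osd_id = env.get("ID_CEPH_OSD_ID")  # hypothetical property name
    if osd_id is None or not osd_id.isdigit():
        # Without a trustworthy OSD id there is nothing safe to do.
        return []

    cluster = env.get("ID_CEPH_CLUSTER", "ceph")  # hypothetical property name
    base = ["ceph", "--cluster", cluster]
    return [
        base + ["osd", "down", osd_id],
        base + ["osd", "out", osd_id],
    ]
```

A caller could pass os.environ and hand each returned list to
subprocess.run(); returning an empty list on missing or malformed ids
keeps the handler from acting on stale udev database contents.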

Before I hack something up for this, I'm interested in what others
think, and whether anyone has already gone down this path. I seem to
recall someone attempting to change the ceph-osd behaviour on I/O
error at some stage.

Cheers, David

* Re: udev remove events to mark OSD down/out on disk-pull
  2016-11-16  2:30 udev remove events to mark OSD down/out on disk-pull David Disseldorp
@ 2016-11-16  7:05 ` Loic Dachary
  2016-11-16 12:57   ` David Disseldorp
  2016-11-16 14:50 ` Sage Weil
  1 sibling, 1 reply; 5+ messages in thread
From: Loic Dachary @ 2016-11-16  7:05 UTC (permalink / raw)
  To: David Disseldorp, ceph-devel

Hi David,

How would you distinguish a udev remove occurring because the disk was pulled from a udev remove occurring because partprobe was run by a user / utility program?

Cheers

On 16/11/2016 03:30, David Disseldorp wrote:
> Hi,
> 
> I'm currently looking at ways to speed up OSD down/out notifications
> for disk-pull events, and was investigating using udev remove events
> for this.
> 
> IIUC, the outage currently propagates through to the mons via OSD device
> I/O error -> filestore I/O error ->  ceph-osd ceph_abort() -> heartbeat
> failure.
> 
> For the disk-pull case, this should be relatively easy to speed up
> by handling the remove event in 95-ceph-osd.rules with an appropriate
> osd down/out PDU. The problem then becomes maintaining consistent
> information in the udev database (all stashed via IMPORT{program}):
> - cluster / OSD ids
> - appropriate cephx creds
> 
> Before I hack something up for this, I'm interested in what others
> think, and whether anyone has already gone down this path. I seem to
> recall someone attempting to change the ceph-osd behaviour on I/O
> error at some stage.
> 
> Cheers, David
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

* Re: udev remove events to mark OSD down/out on disk-pull
  2016-11-16  7:05 ` Loic Dachary
@ 2016-11-16 12:57   ` David Disseldorp
  0 siblings, 0 replies; 5+ messages in thread
From: David Disseldorp @ 2016-11-16 12:57 UTC (permalink / raw)
  To: Loic Dachary; +Cc: ceph-devel

On Wed, 16 Nov 2016 08:05:42 +0100, Loic Dachary wrote:

> How would you distinguish a udev remove occurring because the disk was pulled from a udev remove occurring because partprobe was run by a user / utility program?

That could be detected based on whether the corresponding disk (backing
the partition) is still around, although that may complicate things if
the order of the udev events changes.
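
As a rough sketch of that check, assuming the handler sees the usual
ACTION, DEVTYPE and DEVPATH event properties for the partition: the
parent of the partition's DEVPATH is the disk, so its presence under
sysfs can be tested. The function names and return labels are made up
for illustration, and the event-ordering caveat above still applies.

```python
import os
import posixpath


def parent_disk_present(devpath, sysfs_root="/sys"):
    """Check whether the disk backing a partition still exists in sysfs.

    `devpath` is the partition's DEVPATH, e.g. ".../block/sdb/sdb1"; its
    parent directory is the whole-disk device.
    """
    parent = posixpath.dirname(devpath.rstrip("/"))
    return os.path.isdir(os.path.join(sysfs_root, parent.lstrip("/")))


def classify_remove(env, sysfs_root="/sys"):
    """Classify a udev event as 'ignore', 'repartition' or 'disk-pull'."""
    if env.get("ACTION") != "remove" or env.get("DEVTYPE") != "partition":
        return "ignore"
    if parent_disk_present(env["DEVPATH"], sysfs_root):
        # Disk is still there: most likely a partition table re-read.
        return "repartition"
    return "disk-pull"
```

The sysfs root is a parameter only so the logic can be exercised against
a temporary directory instead of a live system.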

Cheers, David

* Re: udev remove events to mark OSD down/out on disk-pull
  2016-11-16  2:30 udev remove events to mark OSD down/out on disk-pull David Disseldorp
  2016-11-16  7:05 ` Loic Dachary
@ 2016-11-16 14:50 ` Sage Weil
  2016-11-16 15:16   ` David Disseldorp
  1 sibling, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-11-16 14:50 UTC (permalink / raw)
  To: David Disseldorp; +Cc: ceph-devel

On Wed, 16 Nov 2016, David Disseldorp wrote:
> Hi,
> 
> I'm currently looking at ways to speed up OSD down/out notifications
> for disk-pull events, and was investigating using udev remove events
> for this.
> 
> IIUC, the outage currently propagates through to the mons via OSD device
> I/O error -> filestore I/O error ->  ceph-osd ceph_abort() -> heartbeat
> failure.

We just merged (post-jewel) a change that makes connection refused events 
trigger an immediate mark-down of the peer OSD.  I think this will have 
the same effect, as long as the ceph-osd process is killed in a timely 
manner.  Have you tried it?  I'd suggest making sure that it's not 
sufficient before investing too much time into a udev-based approach...

See a033dc6f5b4cef357db6f5951062d680e880ba0e
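
To illustrate the general idea only (this is not the code from that
commit, and the function names are invented): a refused TCP connection
means the host answered but nothing is listening on the port, which is a
much stronger signal than a timeout and can be acted on at once.

```python
import socket


def probe_peer(host, port, timeout=1.0):
    """Return 'up', 'refused' or 'unknown' for a TCP peer.

    'refused' means the host replied but no process is listening, i.e. the
    daemon has exited. 'unknown' covers timeouts and other network errors,
    where the usual heartbeat grace period would still be needed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "up"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Includes socket.timeout, unreachable hosts, and similar errors.
        return "unknown"


def should_mark_down_immediately(state):
    """Only an explicit refusal justifies skipping the heartbeat timeout."""
    return state == "refused"
```

This also shows why the approach depends on the ceph-osd process being
killed promptly: while the process is still alive and listening, the
probe keeps reporting the peer as up.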

sage

> 
> For the disk-pull case, this should be relatively easy to speed up
> by handling the remove event in 95-ceph-osd.rules with an appropriate
> osd down/out PDU. The problem then becomes maintaining consistent
> information in the udev database (all stashed via IMPORT{program}):
> - cluster / OSD ids
> - appropriate cephx creds
> 
> Before I hack something up for this, I'm interested in what others
> think, and whether anyone has already gone down this path. I seem to
> recall someone attempting to change the ceph-osd behaviour on I/O
> error at some stage.
> 
> Cheers, David

* Re: udev remove events to mark OSD down/out on disk-pull
  2016-11-16 14:50 ` Sage Weil
@ 2016-11-16 15:16   ` David Disseldorp
  0 siblings, 0 replies; 5+ messages in thread
From: David Disseldorp @ 2016-11-16 15:16 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On Wed, 16 Nov 2016 14:50:30 +0000 (UTC), Sage Weil wrote:

> On Wed, 16 Nov 2016, David Disseldorp wrote:
> > Hi,
> > 
> > I'm currently looking at ways to speed up OSD down/out notifications
> > for disk-pull events, and was investigating using udev remove events
> > for this.
> > 
> > IIUC, the outage currently propagates through to the mons via OSD device
> > I/O error -> filestore I/O error ->  ceph-osd ceph_abort() -> heartbeat
> > failure.  
> 
> We just merged (post-jewel) a change that makes connection refused events 
> trigger an immediate mark-down of the peer OSD.  I think this will have 
> the same effect, as long as the ceph-osd process is killed in a timely 
> manner.  Have you tried it?  I'd suggest making sure that it's not 
> sufficient before investing too much time into a udev-based approach...
> 
> See a033dc6f5b4cef357db6f5951062d680e880ba0e

Looks much cleaner than handling this in udev. I'll test this with
Jewel and follow up - thanks Sage!

Cheers, David
