* udev remove events to mark OSD down/out on disk-pull
@ 2016-11-16  2:30 David Disseldorp
  2016-11-16  7:05 ` Loic Dachary
  2016-11-16 14:50 ` Sage Weil
  0 siblings, 2 replies; 5+ messages in thread
From: David Disseldorp @ 2016-11-16  2:30 UTC (permalink / raw)
  To: ceph-devel

Hi,

I'm currently looking at ways to speed up OSD down/out notifications
for disk-pull events, and was investigating using udev remove events
for this.

IIUC, the outage currently propagates through to the mons via OSD device
I/O error -> filestore I/O error -> ceph-osd ceph_abort() -> heartbeat
failure.

For the disk-pull case, this should be relatively easy to speed up
by handling the remove event in 95-ceph-osd.rules with an appropriate
osd down/out PDU. The problem then becomes maintaining consistent
information in the udev database (all stashed via IMPORT{program}):
- cluster / OSD ids
- appropriate cephx creds
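
To make the idea concrete, here is a rough sketch of a helper that a
remove rule in 95-ceph-osd.rules could invoke via RUN+=. It is only an
illustration under assumptions, not existing ceph code: the property
names CEPH_OSD_ID, CEPH_CLUSTER, CEPH_CLIENT_NAME and CEPH_KEYRING are
hypothetical placeholders for whatever would be stashed in the udev
database via IMPORT{program}. The only ceph pieces assumed to exist are
the "ceph osd down" / "ceph osd out" commands and the --cluster, --name
and --keyring options.

```python
import os
import subprocess


def build_osd_commands(env):
    """Return the ceph CLI invocations for a udev remove event, or [] to skip."""
    # udev exports the event type in ACTION; only react to removals.
    if env.get("ACTION") != "remove":
        return []

    # Hypothetical property, assumed to be stashed at add time via IMPORT{program}.
    osd_id = env.get("CEPH_OSD_ID")
    if osd_id is None or not osd_id.isdigit():
        return []

    # Hypothetical property; "ceph" is the default cluster name.
    base = ["ceph", "--cluster", env.get("CEPH_CLUSTER", "ceph")]

    # Hypothetical properties describing which cephx identity to use.
    name = env.get("CEPH_CLIENT_NAME")
    keyring = env.get("CEPH_KEYRING")
    if name:
        base += ["--name", name]
    if keyring:
        base += ["--keyring", keyring]

    return [base + ["osd", "down", osd_id], base + ["osd", "out", osd_id]]


def main():
    for cmd in build_osd_commands(os.environ):
        # Keep this short: udev does not expect long-running RUN programs.
        subprocess.run(cmd, check=False, timeout=10)


if __name__ == "__main__":
    main()
```

Building the command list separately from running it keeps the event
filtering testable without a cluster. It also shows where the
consistency problem bites: if the stashed OSD id or creds are stale,
the helper would mark the wrong OSD down/out or fail to authenticate.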

Before I hack something up for this, I'm interested in what others
think, and whether anyone has already gone down this path. I seem to
recall someone attempting to change the ceph-osd behaviour on I/O
error at some stage.

Cheers, David


Thread overview: 5+ messages
2016-11-16  2:30 udev remove events to mark OSD down/out on disk-pull David Disseldorp
2016-11-16  7:05 ` Loic Dachary
2016-11-16 12:57   ` David Disseldorp
2016-11-16 14:50 ` Sage Weil
2016-11-16 15:16   ` David Disseldorp
