* Failed HD, automate procedure?
@ 2012-04-02 10:31 Marco Aroldi
2012-04-02 15:07 ` Wido den Hollander
2012-04-02 16:23 ` Tommi Virtanen
0 siblings, 2 replies; 4+ messages in thread
From: Marco Aroldi @ 2012-04-02 10:31 UTC
To: ceph-devel
Hi all,
I'm looking at the procedure to replace a failed HD
(http://ceph.newdream.net/wiki/Replacing_a_failed_disk/OSD)
and I was wondering if the procedure could be more automated like:
1- The operator replaces the failed hd, say osd.23
2- Give a new command like "ceph reborn osd.23"
Thanks, and thanks to the whole community for this big piece of software!
Marco Aroldi
* Re: Failed HD, automate procedure?
From: Wido den Hollander @ 2012-04-02 15:07 UTC
To: Marco Aroldi; +Cc: ceph-devel
Hi,
On 04/02/2012 12:31 PM, Marco Aroldi wrote:
> Hi all,
> I'm looking at the procedure to replace a failed HD
> (http://ceph.newdream.net/wiki/Replacing_a_failed_disk/OSD)
The wiki is a bit outdated; most of the docs are moving to
http://ceph.newdream.net/docs/, but the wiki is still linked on the
front page.
> and I was wondering if the procedure could be more automated like:
>
> 1- The operator replaces the failed hd, say osd.23
> 2- Give a new command like "ceph reborn osd.23"
Currently that isn't available. But some discussion has been going on
about this: http://marc.info/?l=ceph-devel&m=133106885906229&w=2
That might not seem related at first glance, but it is. Somehow that new
disk has to be formatted and linked to the new OSD.
An OSD simply wants a data directory in which to store its data.
If you replace the disk, something/somebody has to format that disk and
make sure it's mounted.
When that's done, the OSD can format the fresh data directory and
connect to the cluster again.
The OSD also needs the current monitor map and its key to
authenticate to the cluster.
That data needs to come from somewhere, so some form of external
involvement is needed to get this done.
To make a long story short: this is on the radar.
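
The steps described above (format the disk, mount it, let the OSD format its data directory, supply the monitor map and key) could be sketched as a dry-run plan. This is only an illustration, not a procedure from this thread: the function and parameter names are hypothetical, xfs is an arbitrary filesystem choice, and the command lines are assumptions based on commonly documented Ceph tooling that should be checked against the installed version. Nothing is executed; the plan is only built and printed. The monitor map is fetched before the OSD formats its directory because the sketch passes it to that step.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    description: str
    argv: List[str]  # illustrative command line (assumption, verify before use)


def replacement_plan(osd_id: int, device: str, data_dir: str,
                     monmap: str, keyring: str) -> List[Step]:
    """Return the ordered steps for bringing a replaced disk back as osd.<osd_id>.

    All paths are supplied by the caller; nothing here runs a command.
    """
    name = f"osd.{osd_id}"
    return [
        Step("format the new disk", ["mkfs.xfs", device]),
        Step("mount it on the OSD data directory", ["mount", device, data_dir]),
        Step("fetch the current monitor map",
             ["ceph", "mon", "getmap", "-o", monmap]),
        Step("let the OSD format its data directory and create a key",
             ["ceph-osd", "-i", str(osd_id), "--mkfs", "--mkkey",
              "--monmap", monmap]),
        Step("register the key with the cluster",
             ["ceph", "auth", "add", name, "osd", "allow *",
              "mon", "allow rwx", "-i", keyring]),
    ]


def render(plan: List[Step]) -> str:
    """Format a plan as numbered, shell-quoted lines for an operator to review."""
    return "\n".join(
        f"{i}. {step.description}: {shlex.join(step.argv)}"
        for i, step in enumerate(plan, start=1)
    )


if __name__ == "__main__":
    # Hypothetical device and paths for osd.23.
    print(render(replacement_plan(23, "/dev/sdx", "/srv/osd.23",
                                  "/tmp/monmap", "/srv/osd.23/keyring")))
```

Keeping the plan as data rather than running it directly leaves the "external involvement" explicit: an operator or a higher-level tool still decides when each step is safe to run.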
Wido
* Re: Failed HD, automate procedure?
From: Tommi Virtanen @ 2012-04-02 16:23 UTC
To: Marco Aroldi; +Cc: ceph-devel
On Mon, Apr 2, 2012 at 03:31, Marco Aroldi <marco.aroldi@gmail.com> wrote:
> I'm looking at the procedure to replace a failed HD
> (http://ceph.newdream.net/wiki/Replacing_a_failed_disk/OSD)
> and I was wondering if the procedure could be more automated like:
>
> 1- The operator replaces the failed hd, say osd.23
> 2- Give a new command like "ceph reborn osd.23"
To echo what Wido said, this is definitely in the plan. The goal is
something like this:
- failed disk is unmounted and the "service light" is set (where available)
- a technician pulls out the failed disk and plugs in a prepared new one
- new disk is automatically taken into use, a new osd is started
- later, the failed disk can be examined and recovered, if desired; OR
the osd that used that disk is deleted from the monitors and the disk
discarded
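
The flow in the list above can be read as a small state machine for a disk slot. The sketch below is one possible way to model it; the state and event names are hypothetical and do not correspond to anything in Ceph, since the thread describes the goal only in prose.

```python
# Hypothetical state and event names; they only mirror the prose description.
TRANSITIONS = {
    ("in_service", "disk_failed"): "unmounted_light_on",
    ("unmounted_light_on", "disk_swapped"): "new_disk_present",
    ("new_disk_present", "osd_started"): "in_service",
}


def advance(state: str, event: str) -> str:
    """Return the next slot state, or raise ValueError for an out-of-order event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state!r}"
        ) from None


def old_disk_action(recoverable: bool) -> str:
    """Pick one of the two later outcomes for the disk that was pulled."""
    if recoverable:
        return "examine and recover the disk"
    return "delete the old osd from the monitors and discard the disk"


if __name__ == "__main__":
    state = "in_service"
    for event in ("disk_failed", "disk_swapped", "osd_started"):
        state = advance(state, event)
        print(event, "->", state)
    print(old_disk_action(recoverable=False))
```

Treating the old disk's fate as a separate decision matches the list: the slot returns to service as soon as the new osd starts, and the failed disk is dealt with later.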
* Re: Failed HD, automate procedure?
From: Marco Aroldi @ 2012-04-02 20:33 UTC
To: ceph-devel
Wido, Tommi, thanks
It would be great!
> Somehow that new disk has to be formatted and linked to the new OSD.
Yes, of course.
But in theory, the cluster's OSD information could be stored on the
mon or the mds, so that they know the size, the filesystem, and the
partitions of the failed osd and, once the hd is replaced, can recreate
a new "twin" osd.
On 2 April 2012 at 18:23, Tommi Virtanen <tommi.virtanen@dreamhost.com>
wrote:
> To echo what Wido said, this is definitely in the plan
Great news Tommi!
Marco
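
The idea in this message, keeping each OSD's size, filesystem, and partitions somewhere central so that a "twin" can be recreated, might look roughly like the sketch below. It is purely hypothetical: the thread does not say that the monitors store this information, a plain dictionary stands in for whatever central store would hold it, and every name is invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class OsdLayout:
    osd_id: int
    size_bytes: int
    filesystem: str
    partitions: Tuple[str, ...]  # hypothetical labels, e.g. ("data", "journal")


def twin_layout(registry: Dict[int, OsdLayout], osd_id: int,
                new_disk_bytes: int) -> OsdLayout:
    """Look up the failed OSD's recorded layout and adapt it to the new disk.

    Raises KeyError if the OSD was never recorded and ValueError if the
    replacement disk is smaller than the one it replaces.
    """
    old = registry[osd_id]
    if new_disk_bytes < old.size_bytes:
        raise ValueError(
            f"new disk ({new_disk_bytes} bytes) is smaller than "
            f"osd.{osd_id}'s old disk ({old.size_bytes} bytes)"
        )
    # Same filesystem and partitions, sized for the replacement disk.
    return replace(old, size_bytes=new_disk_bytes)


if __name__ == "__main__":
    registry = {23: OsdLayout(23, 10**12, "btrfs", ("data", "journal"))}
    print(twin_layout(registry, 23, 2 * 10**12))
```

Even with such a record, the formatting and mounting of the new disk still has to be carried out on the host, as noted earlier in the thread.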