From: Loic Dachary <loic@dachary.org>
To: Gregory Farnum <greg@inktank.com>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: Removing disks / OSDs
Date: Mon, 21 Oct 2013 18:57:10 +0200
Message-ID: <52655CE6.9000203@dachary.org>
In-Reply-To: <CAPYLRzjBmdewVvNVQd5JwnAAF2dv+q7Qk4qtzwQO7jVtpDGDRg@mail.gmail.com>

On 21/10/2013 18:49, Gregory Farnum wrote:
> I'm not quite sure what questions you're actually asking here...

I guess I was asking if my understanding was correct.

> In general, the OSD is not removed from the system without explicit
> admin intervention. When it is removed, all traces of it should be
> zapped (including its key), so it can't reconnect.

OK. So reusing OSD ids is not an issue. If you reconnect a disk after its id has been removed with ceph osd rm, you get what you deserve, and you can zap the disk so that it is formatted again. Is that what you mean?
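
A minimal sketch of that scenario, as a toy Python model rather than anything taken from Ceph. ToyCluster and its methods are hypothetical names made up for illustration. The model assumes that ids are handed out lowest-free-first, as in the OSDMonitor.cc loop quoted further down, and that removing an OSD also drops its registered key, as Greg describes.

```python
import itertools


class ToyCluster:
    """Hypothetical stand-in for the monitor's view of OSD ids and keys."""

    def __init__(self, max_osd):
        self.max_osd = max_osd
        self.keys = {}                    # osd id -> currently registered key
        self._serial = itertools.count()  # makes every generated key distinct

    def allocate_id(self):
        # Lowest id that does not exist yet (assumed allocation policy).
        for i in range(self.max_osd):
            if i not in self.keys:
                return i
        raise RuntimeError("no free osd id")

    def create_osd(self):
        osd_id = self.allocate_id()
        key = "key-%d" % next(self._serial)
        self.keys[osd_id] = key
        return osd_id, key

    def remove_osd(self, osd_id):
        # Models a full removal: the id and its key are both forgotten.
        del self.keys[osd_id]

    def can_connect(self, osd_id, key):
        # A disk may rejoin only if its key matches the registered one.
        return self.keys.get(osd_id) == key


cluster = ToyCluster(max_osd=64)
for _ in range(43):
    old_id, old_key = cluster.create_osd()  # the last OSD created gets id 42

cluster.remove_osd(old_id)                  # machine A dies, OSD 42 is removed
new_id, new_key = cluster.create_osd()      # a new disk is prepared

old_disk_rejoins = cluster.can_connect(old_id, old_key)
new_disk_joins = cluster.can_connect(new_id, new_key)
```

In this model the new disk reuses id 42, and the old disk is refused when it is plugged back in because the key it carries is no longer the one registered for that id.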

> If it hasn't been removed, then indeed it will continue working
> properly even if moved to a different box.

Cool.

That makes it really simple from the point of view of tools like Puppet :-)

Cheers

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> 
> 
> On Mon, Oct 21, 2013 at 9:15 AM, Loic Dachary <loic@dachary.org> wrote:
>> Hi Ceph,
>>
>> In the context of the ceph puppet module ( https://wiki.openstack.org/wiki/Puppet-openstack/ceph-blueprint ) I tried to think about what should be provided to deal with disks / OSDs when they are removed or moved around.
>>
>> Here is a possible scenario:
>>
>> * Machine A dies and contains OSD 42
>> * ceph osd rm 42 is done to get rid of the OSD
>> * ceph-prepare is called on a new disk and gets OSD id 42
>>
>> ceph/src/mon/OSDMonitor.cc
>>
>>     // allocate a new id
>>     for (i=0; i < osdmap.get_max_osd(); i++) {
>>       if (!osdmap.exists(i) &&
>>           pending_inc.new_up_client.count(i) == 0 &&
>>           (pending_inc.new_state.count(i) == 0 ||
>>            (pending_inc.new_state[i] & CEPH_OSD_EXISTS) == 0))
>>         goto done;
>>     }
>>
>> * The disk of machine A is still good and is plugged into machine C
>> * the udev logic sees it has the ceph magic uuid and contains a well formed osd file system and runs the osd daemon on it. The osd daemon fails and dies because its key does not match. It will try again and fail in the same way when the machine reboots.
>>
>> If the OSD id were not reused, the disk would find its way back into the cluster and be reused without manual intervention. Since ceph-disk uses the OSD uuid to create the disk, it does not matter that it has been removed: https://github.com/ceph/ceph/blob/master/src/ceph-disk#L458 . I'm not sure I understand how the key that was previously registered is re-imported. If I understand correctly, it is created with ceph-osd --mkkey https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1301 and stored in the osd tree at the location specified by --keyring https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1307 .
>>
>> At this point my understanding is that, as long as OSD ids are not reused, removing a disk or moving it from machine to machine, even over a long period of time, does not require any action. The OSD id is an int, which is probably large enough for any kind of cluster in the near future. The OSD ids that are not used and not removed could be cleaned up from time to time to garbage collect the space they use.
>>
>> Please let me know if I've missed something :-)
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.


