From: Loic Dachary
Subject: Re: Removing disks / OSDs
Date: Tue, 22 Oct 2013 08:13:10 +0200
Message-ID: <52661776.20803@dachary.org>
References: <52655309.8070500@dachary.org>
Sender: ceph-devel-owner@vger.kernel.org
To: Gregory Farnum
Cc: Ceph Development

On 21/10/2013 18:49, Gregory Farnum wrote:
> I'm not quite sure what questions you're actually asking here...
> In general, the OSD is not removed from the system without explicit
> admin intervention. When it is removed, all traces of it should be
> zapped (including its key), so it can't reconnect.
> If it hasn't been removed, then indeed it will continue working
> properly even if moved to a different box.

If there is an external journal, the device containing the journal needs
to be moved together with the device containing the data. If I read
ceph/src/upstart/ceph-osd.conf correctly, when the data device is plugged
into the new machine, the OSD will fail to start because the journal is
not there yet. When the journal device is then plugged in, the
ceph-osd.conf job would be run, because the udev rule in
ceph/udev/95-ceph-osd.rules calls ceph-disk activate-journal.

Is my understanding correct?
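To make the ordering described above concrete, here is a minimal Python
sketch. It is only a toy model of that reading, not of what ceph-disk,
udev or the upstart job actually do: `Host`, `plug` and the device names
are made up for illustration, and the single rule it encodes is the
assumption that the OSD can start once both its data and journal devices
are present.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one machine; every name here is hypothetical."""
    devices: set = field(default_factory=set)  # devices currently plugged in
    running: set = field(default_factory=set)  # OSD ids with a running daemon

    def plug(self, device: str, osd_id: int, data: str, journal: str) -> bool:
        """Plug a device in, then attempt to start the OSD, as a hotplug
        event would. Returns True if the OSD is running afterwards."""
        self.devices.add(device)
        # Assumption: the daemon starts only when data and journal are both present.
        if data in self.devices and journal in self.devices:
            self.running.add(osd_id)
        return osd_id in self.running


host = Host()
# Data disk arrives first: the start attempt fails because the journal is missing.
first = host.plug("data-42", osd_id=42, data="data-42", journal="journal-42")
# Journal arrives later: the second activation attempt succeeds.
second = host.plug("journal-42", osd_id=42, data="data-42", journal="journal-42")
print(first, second)
```

In this model the order in which the two devices are plugged in does not
matter for the end result; only the first attempt fails.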
Cheers

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Mon, Oct 21, 2013 at 9:15 AM, Loic Dachary wrote:
>> Hi Ceph,
>>
>> In the context of the ceph puppet module
>> ( https://wiki.openstack.org/wiki/Puppet-openstack/ceph-blueprint )
>> I tried to think about what should be provided to deal with disks /
>> OSDs when they are removed or moved around.
>>
>> Here is a possible scenario:
>>
>> * Machine A, which contains OSD 42, dies
>> * ceph osd rm 42 is done to get rid of the OSD
>> * ceph-prepare is called on a new disk and gets OSD id 42
>>
>>   ceph/src/mon/OSDMonitor.cc
>>
>>     // allocate a new id
>>     for (i=0; i < osdmap.get_max_osd(); i++) {
>>       if (!osdmap.exists(i) &&
>>           pending_inc.new_up_client.count(i) == 0 &&
>>           (pending_inc.new_state.count(i) == 0 ||
>>            (pending_inc.new_state[i] & CEPH_OSD_EXISTS) == 0))
>>         goto done;
>>     }
>>
>> * The disk of machine A is still good and is plugged into machine C
>> * The udev logic sees that it has the ceph magic uuid and contains a
>>   well-formed osd file system, and runs the osd daemon on it. The osd
>>   daemon fails and dies because its key does not match. It will try
>>   again and fail in the same way when the machine reboots.
>>
>> If the osd id was not reused, the disk would find its way back into
>> the cluster and be reused without manual intervention. Since ceph-disk
>> uses the osd uuid to create the disk, it does not matter that it has
>> been removed:
>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L458 .
>> I'm not sure I understand how the key that was previously registered
>> is re-imported. If I understand correctly, it is created with
>> ceph-osd --mkkey
>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1301
>> and stored in the osd tree at the location specified by --keyring
>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1307 .
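
The id reuse in the scenario quoted above follows from the allocation
loop picking the lowest free id. A minimal Python sketch of that
behaviour, with assumptions: it paraphrases only the `osdmap.exists(i)`
test and leaves out the `pending_inc` checks, returning `max_osd` when
no slot is free is an assumption of the sketch, and `never_reused_id` is
a hypothetical alternative that is not part of Ceph.

```python
def lowest_free_id(existing: set, max_osd: int) -> int:
    """Return the first id in [0, max_osd) that is not in use.

    Simplified paraphrase of the quoted loop (pending_inc checks omitted).
    Falls back to max_osd when every slot is taken (assumption).
    """
    for i in range(max_osd):
        if i not in existing:
            return i
    return max_osd


def never_reused_id(highest_ever_allocated: int) -> int:
    """Hypothetical alternative: always hand out a fresh id."""
    return highest_ever_allocated + 1


existing = set(range(100))  # OSDs 0..99 exist, including OSD 42
existing.discard(42)        # "ceph osd rm 42" after machine A dies

reused = lowest_free_id(existing, max_osd=100)
fresh = never_reused_id(highest_ever_allocated=99)
print(reused, fresh)
```

With lowest-free allocation the new disk is given id 42 again, which is
what makes the old disk's key mismatch when it shows up on machine C; a
never-reuse policy would give the new disk id 100 and leave 42 unclaimed.
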
>>
>> At this point my understanding is that (as long as OSD ids are not
>> reused), removing a disk or moving it from machine to machine, even
>> over a long period of time does not require any action. The OSD id is
>> an int which is probably large enough for any kind of cluster in the
>> near future. The OSD ids that are not used and not removed could be
>> cleaned from time to time to garbage collect the space they use.
>>
>> Please let me know if I've missed something :-)
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do
>> nothing.
>>

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do
nothing.