* dmcrypt with luks keys in hammer
@ 2015-07-20 19:52 Wyllys Ingersoll
  2015-07-20 21:22 ` Sage Weil
  0 siblings, 1 reply; 9+ messages in thread
From: Wyllys Ingersoll @ 2015-07-20 19:52 UTC (permalink / raw)
  To: ceph-devel

We're running a cluster with Hammer v0.94.2 and are running into issues
with the LUKS-encrypted OSD data and journal partitions.  The
installation goes smoothly and everything runs OK, but we've had to
reboot a couple of the storage nodes for various reasons and when they
come back online, a large number of OSD processes fail to start
because the LUKS encrypted partitions are not getting mounted
correctly.

I'm not sure if it is a udev issue or a problem with the OSD process
itself, but the encrypted partitions end up mapped as
"temporary-cryptsetup-PID" devices and never recover.  From below, you
can see that some of the OSDs did come up correctly, but the majority
did not.  We've seen this problem now on several storage nodes, and it
only occurs for those OSDs that used LUKS (the new default).  The only
recovery that we've found is to wipe them all out and rebuild them
using "plain" dmcrypt (as it used to be).

Running "blkid" on a partition that is in the "temporary-cryptsetup"
state does show that it has the right ID_PART_ENTRY_UUID and TYPE
values and I can confirm that there is an associated key in
/etc/ceph/dmcrypt-keys, but it still isn't mounting correctly.

$ sudo blkid -p -o udev /dev/sdv2
ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b
ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b
ID_FS_VERSION=1
ID_FS_TYPE=crypto_LUKS
ID_FS_USAGE=crypto
ID_PART_ENTRY_SCHEME=gpt
ID_PART_ENTRY_NAME=ceph\x20journal
ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1
ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106
ID_PART_ENTRY_NUMBER=2
ID_PART_ENTRY_OFFSET=2048
ID_PART_ENTRY_SIZE=20969473
ID_PART_ENTRY_DISK=65:80
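
For anyone repeating this check across many partitions, here is a minimal Python sketch (not part of ceph-disk) that parses the KEY=VALUE output shown above and looks for a matching key file. The helper names are hypothetical, and it assumes key file names begin with the partition UUID, which is a guess about the layout of /etc/ceph/dmcrypt-keys rather than something confirmed here.

```python
import os

# Shortened copy of the blkid output above, used only as sample input.
SAMPLE = """ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b
ID_FS_TYPE=crypto_LUKS
ID_FS_USAGE=crypto
ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1
ID_PART_ENTRY_NUMBER=2
"""


def parse_blkid_udev(text):
    """Turn `blkid -p -o udev` output (KEY=VALUE lines) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that are not KEY=VALUE
            props[key] = value
    return props


def is_luks(props):
    """True if blkid identified the partition as a LUKS container."""
    return props.get("ID_FS_TYPE") == "crypto_LUKS"


def find_key_files(part_uuid, key_dir="/etc/ceph/dmcrypt-keys"):
    """List files in key_dir whose names start with the partition UUID.

    Assumption: key files are named after the partition UUID, possibly
    with a suffix. Returns [] if the directory cannot be read.
    """
    try:
        names = os.listdir(key_dir)
    except OSError:
        return []
    return sorted(n for n in names if n.startswith(part_uuid))
```

On a storage node, the text would come from running the blkid command above, and find_key_files(props["ID_PART_ENTRY_UUID"]) would report which key files, if any, are present.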

So I'm checking to see if this is a known issue or if we are missing
something in the installation or configuration that would fix this
problem.

-Wyllys Ingersoll


Ex:
$ lsblk -l
NAME                                         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                            8:0    0 111.8G  0 disk
sda1                                           8:1    0  15.3G  0 part  [SWAP]
sda2                                           8:2    0     1K  0 part
sda5                                           8:5    0  96.5G  0 part  /
sdb                                            8:16   0   3.7T  0 disk
sdb1                                           8:17   0   3.6T  0 part
e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0)  252:0    0   3.6T  0 crypt
sdb2                                           8:18   0    10G  0 part
temporary-cryptsetup-1235 (dm-6)             252:6    0   125K  1 crypt
sdc                                            8:32   0   3.7T  0 disk
sdc1                                           8:33   0   3.6T  0 part
temporary-cryptsetup-1788 (dm-37)            252:37   0   125K  1 crypt
sdc2                                           8:34   0    10G  0 part
temporary-cryptsetup-1789 (dm-36)            252:36   0   125K  1 crypt
sdd                                            8:48   0   3.7T  0 disk
sdd1                                           8:49   0   3.6T  0 part
temporary-cryptsetup-1252 (dm-1)             252:1    0   125K  1 crypt
sdd2                                           8:50   0    10G  0 part
temporary-cryptsetup-1246 (dm-3)             252:3    0   125K  1 crypt
sde                                            8:64   0   3.7T  0 disk
sde1                                           8:65   0   3.6T  0 part
temporary-cryptsetup-1260 (dm-14)            252:14   0   125K  1 crypt
sde2                                           8:66   0    10G  0 part
temporary-cryptsetup-1255 (dm-12)            252:12   0   125K  1 crypt
sdf                                            8:80   0   3.7T  0 disk
sdf1                                           8:81   0   3.6T  0 part
temporary-cryptsetup-1268 (dm-15)            252:15   0   125K  1 crypt
sdf2                                           8:82   0    10G  0 part
temporary-cryptsetup-1245 (dm-5)             252:5    0   125K  1 crypt
sdg                                            8:96   0   3.7T  0 disk
sdg1                                           8:97   0   3.6T  0 part
temporary-cryptsetup-1271 (dm-17)            252:17   0   125K  1 crypt
sdg2                                           8:98   0    10G  0 part
temporary-cryptsetup-1278 (dm-2)             252:2    0   125K  1 crypt
sdh                                            8:112  0   3.7T  0 disk
sdh1                                           8:113  0   3.6T  0 part
69dcd1e1-6e11-41ec-af19-1e0d90013957 (dm-43) 252:43   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-42
sdh2                                           8:114  0    10G  0 part
3382723d-b0d9-4b50-affe-fb9f5df78d6f (dm-45) 252:45   0    10G  0 crypt
sdi                                            8:128  0   3.7T  0 disk
sdi1                                           8:129  0   3.6T  0 part
temporary-cryptsetup-1265 (dm-20)            252:20   0   125K  1 crypt
sdi2                                           8:130  0    10G  0 part
temporary-cryptsetup-1277 (dm-16)            252:16   0   125K  1 crypt
sdj                                            8:144  0   3.7T  0 disk
sdj1                                           8:145  0   3.6T  0 part
temporary-cryptsetup-1359 (dm-13)            252:13   0   125K  1 crypt
sdj2                                           8:146  0    10G  0 part
temporary-cryptsetup-1280 (dm-4)             252:4    0   125K  1 crypt
sdk                                            8:160  0   3.7T  0 disk
sdk1                                           8:161  0   3.6T  0 part
temporary-cryptsetup-1760 (dm-34)            252:34   0   125K  1 crypt
sdk2                                           8:162  0    10G  0 part
temporary-cryptsetup-1761 (dm-31)            252:31   0   125K  1 crypt
sdl                                            8:176  0   3.7T  0 disk
sdl1                                           8:177  0   3.6T  0 part
c3175d9f-ae12-4852-bbbc-b1d2c344c4ac (dm-38) 252:38   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-32
sdl2                                           8:178  0    10G  0 part
e4e10521-985a-4d94-a766-56d6de26443a (dm-41) 252:41   0    10G  0 crypt
sdm                                            8:192  0   3.7T  0 disk
sdm1                                           8:193  0   3.6T  0 part
temporary-cryptsetup-1407 (dm-9)             252:9    0   125K  1 crypt
sdm2                                           8:194  0    10G  0 part
temporary-cryptsetup-1423 (dm-19)            252:19   0   125K  1 crypt
sdn                                            8:208  0   3.7T  0 disk
sdn1                                           8:209  0   3.6T  0 part
temporary-cryptsetup-1442 (dm-11)            252:11   0   125K  1 crypt
sdn2                                           8:210  0    10G  0 part
temporary-cryptsetup-1433 (dm-7)             252:7    0   125K  1 crypt
sdo                                            8:224  0   3.7T  0 disk
sdo1                                           8:225  0   3.6T  0 part
temporary-cryptsetup-1600 (dm-23)            252:23   0   125K  1 crypt
sdo2                                           8:226  0    10G  0 part
temporary-cryptsetup-1602 (dm-24)            252:24   0   125K  1 crypt
sdp                                            8:240  0   3.7T  0 disk
sdp1                                           8:241  0   3.6T  0 part
temporary-cryptsetup-1634 (dm-27)            252:27   0   125K  1 crypt
sdp2                                           8:242  0    10G  0 part
temporary-cryptsetup-1638 (dm-25)            252:25   0   125K  1 crypt
sdq                                           65:0    0   3.7T  0 disk
sdq1                                          65:1    0   3.6T  0 part
temporary-cryptsetup-1428 (dm-18)            252:18   0   125K  1 crypt
sdq2                                          65:2    0    10G  0 part
temporary-cryptsetup-1430 (dm-10)            252:10   0   125K  1 crypt
sdr                                           65:16   0   3.7T  0 disk
sdr1                                          65:17   0   3.6T  0 part
temporary-cryptsetup-1727 (dm-29)            252:29   0   125K  1 crypt
sdr2                                          65:18   0    10G  0 part
temporary-cryptsetup-1728 (dm-32)            252:32   0   125K  1 crypt
sds                                           65:32   0   3.7T  0 disk
sds1                                          65:33   0   3.6T  0 part
temporary-cryptsetup-1366 (dm-8)             252:8    0   125K  1 crypt
sds2                                          65:34   0    10G  0 part
temporary-cryptsetup-1611 (dm-21)            252:21   0   125K  1 crypt
sdt                                           65:48   0   3.7T  0 disk
sdt1                                          65:49   0   3.6T  0 part
temporary-cryptsetup-1734 (dm-30)            252:30   0   125K  1 crypt
sdt2                                          65:50   0    10G  0 part
temporary-cryptsetup-1735 (dm-28)            252:28   0   125K  1 crypt
sdu                                           65:64   0   3.7T  0 disk
sdu1                                          65:65   0   3.6T  0 part
temporary-cryptsetup-1605 (dm-22)            252:22   0   125K  1 crypt
sdu2                                          65:66   0    10G  0 part
temporary-cryptsetup-1607 (dm-26)            252:26   0   125K  1 crypt
sdv                                           65:80   0   3.7T  0 disk
sdv1                                          65:81   0   3.6T  0 part
temporary-cryptsetup-1739 (dm-33)            252:33   0   125K  1 crypt
sdv2                                          65:82   0    10G  0 part
temporary-cryptsetup-1772 (dm-35)            252:35   0   125K  1 crypt
sdw                                           65:96   0   3.7T  0 disk
sdw1                                          65:97   0   3.6T  0 part
3171a1b9-e0f8-4521-a31a-821fcb549731 (dm-46) 252:46   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-14
sdw2                                          65:98   0    10G  0 part
8c5882fd-21ef-4d9c-b62b-676248236514 (dm-47) 252:47   0    10G  0 crypt
sdx                                           65:112  0   3.7T  0 disk
sdx1                                          65:113  0   3.6T  0 part
a576166d-07c4-468c-a704-c4080290a12e (dm-40) 252:40   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-7
sdx2                                          65:114  0    10G  0 part
1a93e588-dbd4-4ce4-9955-e2f450576314 (dm-42) 252:42   0    10G  0 crypt
sdy                                           65:128  0   3.7T  0 disk
sdy1                                          65:129  0   3.6T  0 part
da2f4e17-f2ba-49ce-bc11-fa699fbf0ba2 (dm-39) 252:39   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-2
sdy2                                          65:130  0    10G  0 part
14422a1f-083c-44a8-ac6d-d2b4fe20650e (dm-44) 252:44   0    10G  0 crypt
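
A listing this long is tedious to scan by eye. The following Python sketch (a hypothetical helper, not an existing tool) reads `lsblk -l` text and reports the partitions whose child mapping is still named "temporary-cryptsetup-PID". It assumes the column order shown above (NAME, MAJ:MIN, RM, SIZE, RO, TYPE) and that each crypt mapping is listed directly after its parent partition.

```python
# A few rows copied from the listing above, used only as sample input.
SAMPLE = """\
sdb                                            8:16   0   3.7T  0 disk
sdb1                                           8:17   0   3.6T  0 part
e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0)  252:0    0   3.6T  0 crypt
sdb2                                           8:18   0    10G  0 part
temporary-cryptsetup-1235 (dm-6)             252:6    0   125K  1 crypt
sdc                                            8:32   0   3.7T  0 disk
sdc1                                           8:33   0   3.6T  0 part
temporary-cryptsetup-1788 (dm-37)            252:37   0   125K  1 crypt
"""


def stuck_partitions(lsblk_text):
    """Return partitions whose next row is a temporary-cryptsetup mapping."""
    stuck = []
    prev_part = None  # name of the partition on the previous row, if any
    for line in lsblk_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("temporary-cryptsetup-") and prev_part:
            stuck.append(prev_part)
        # The sixth column is TYPE for plain device rows; mapping rows carry
        # an extra "(dm-N)" field, so they never match here.
        is_part = len(fields) >= 6 and fields[5] == "part"
        prev_part = fields[0] if is_part else None
    return stuck


print(stuck_partitions(SAMPLE))
```

For the sample rows this reports sdb2 and sdc1, matching what the listing shows for those two disks.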


* Re: dmcrypt with luks keys in hammer
  2015-07-20 19:52 dmcrypt with luks keys in hammer Wyllys Ingersoll
@ 2015-07-20 21:22 ` Sage Weil
  2015-07-20 21:46   ` Wyllys Ingersoll
  0 siblings, 1 reply; 9+ messages in thread
From: Sage Weil @ 2015-07-20 21:22 UTC (permalink / raw)
  To: Wyllys Ingersoll; +Cc: ceph-devel

On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
> We're running a cluster with Hammer v0.94.2 and are running into issues
> with the LUKS-encrypted OSD data and journal partitions.  The
> installation goes smoothly and everything runs OK, but we've had to
> reboot a couple of the storage nodes for various reasons and when they
> come back online, a large number of OSD processes fail to start
> because the LUKS encrypted partitions are not getting mounted
> correctly.
> 
> I'm not sure if it is a udev issue or a problem with the OSD process
> itself, but the encrypted partitions end up mapped as
> "temporary-cryptsetup-PID" devices and never recover.  From below, you
> can see that some of the OSDs did come up correctly, but the majority
> did not.  We've seen this problem now on several storage nodes, and it
> only occurs for those OSDs that used LUKS (the new default).  The only
> recovery that we've found is to wipe them all out and rebuild them
> using "plain" dmcrypt (as it used to be).
> 
> Running "blkid" on a partition that is in the "temporary-cryptsetup"
> state does show that it has the right ID_PART_ENTRY_UUID and TYPE
> values and I can confirm that there is an associated key in
> /etc/ceph/dmcrypt-keys, but it still isn't mounting correctly.
> 
> $ sudo blkid -p -o udev /dev/sdv2
> ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b
> ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b
> ID_FS_VERSION=1
> ID_FS_TYPE=crypto_LUKS
> ID_FS_USAGE=crypto
> ID_PART_ENTRY_SCHEME=gpt
> ID_PART_ENTRY_NAME=ceph\x20journal
> ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1
> ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106
> ID_PART_ENTRY_NUMBER=2
> ID_PART_ENTRY_OFFSET=2048
> ID_PART_ENTRY_SIZE=20969473
> ID_PART_ENTRY_DISK=65:80
> 
> So I'm checking to see if this is a known issue or if we are missing
> something in the installation or configuration that would fix this
> problem.

This isn't a known issue, although I think we have seen problems in 
general with hosts with lots of OSDs not always coming up on boot.  If it 
is specifically a problem with luks+dmcrypt that would be interesting!

Does an explicit 'ceph-disk activate /dev/...' on one of the devices make 
it come up?  And/or a 'ceph-disk activate-all'?  If so that would indicate 
a race issue in udev.

Thanks-
sage
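
The test suggested above can be scripted across many devices. Below is a minimal Python sketch that runs 'ceph-disk activate' on each device and records the exit status. The function names are hypothetical, and the command runner is injectable so the loop can be exercised without ceph-disk installed.

```python
import subprocess


def try_activate(devices, run=subprocess.call):
    """Run `ceph-disk activate <dev>` per device; map device -> exit status.

    `run` defaults to subprocess.call, which returns the exit status of
    the command. Pass a stand-in function to test without ceph-disk.
    """
    results = {}
    for dev in devices:
        results[dev] = run(["ceph-disk", "activate", dev])
    return results


def failed_devices(results):
    """Devices whose activation returned a non-zero exit status."""
    return sorted(dev for dev, status in results.items() if status != 0)


# Example (needs root and ceph-disk on the PATH):
#   results = try_activate(["/dev/sdv1", "/dev/sdc1"])
#   print(failed_devices(results))
```

If manual activation succeeds on devices that failed at boot, that would support the udev race theory; if it fails the same way, the problem is more likely in the activation path itself.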




* Re: dmcrypt with luks keys in hammer
  2015-07-20 21:22 ` Sage Weil
@ 2015-07-20 21:46   ` Wyllys Ingersoll
  2015-07-20 22:21     ` Sage Weil
  0 siblings, 1 reply; 9+ messages in thread
From: Wyllys Ingersoll @ 2015-07-20 21:46 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

No luck with ceph-disk-activate (all or just one device).

$ sudo ceph-disk-activate /dev/sdv1
mount: unknown filesystem type 'crypto_LUKS'
ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
'/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32


It's odd that it should complain about the "crypto_LUKS" filesystem not
being recognized, because it did mount some of the LUKS systems
successfully, though sometimes just the data and not the journal
(or vice versa).
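
One possible reading of the mount error: "crypto_LUKS" is the type blkid reports for a LUKS container, not a filesystem that mount can handle, so the partition has to be opened with cryptsetup before anything is mounted. A minimal Python sketch of that ordering follows. The function name, the key file path, and the choice of mapping name are illustrative assumptions, not what ceph-disk actually runs.

```python
def plan_mount(device, fs_type, mapping_name, key_file, mountpoint):
    """Return the commands, in order, needed to mount `device`.

    For a LUKS container the partition is first opened with
    `cryptsetup luksOpen`, and the resulting /dev/mapper device is
    mounted instead of the raw partition. `key_file` and `mapping_name`
    are supplied by the caller (hypothetical values, not guessed here).
    """
    if fs_type != "crypto_LUKS":
        return [["mount", "-t", fs_type, "--", device, mountpoint]]
    mapped = "/dev/mapper/" + mapping_name
    return [
        ["cryptsetup", "--key-file", key_file, "luksOpen", device, mapping_name],
        ["mount", "--", mapped, mountpoint],
    ]


# Example with placeholder values:
for cmd in plan_mount("/dev/sdv1", "crypto_LUKS", "example-mapping",
                      "/path/to/keyfile", "/mnt/example"):
    print(" ".join(cmd))
```

Seen this way, the failing command in the error output is a mount of the raw partition with '-t crypto_LUKS', which would fail on any LUKS partition that has not been opened first.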

$ lsblk /dev/sdb
NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdb                                               8:16   0   3.7T  0 disk
├─sdb1                                            8:17   0   3.6T  0 part
│ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
└─sdb2                                            8:18   0    10G  0 part
  └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt


$ blkid /dev/sdb1
/dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"


A race condition (or other issue) with udev seems likely given that
it's rather random which ones come up and which ones don't.




>> sdr                                           65:16   0   3.7T  0 disk
>> sdr1                                          65:17   0   3.6T  0 part
>> temporary-cryptsetup-1727 (dm-29)            252:29   0   125K  1 crypt
>> sdr2                                          65:18   0    10G  0 part
>> temporary-cryptsetup-1728 (dm-32)            252:32   0   125K  1 crypt
>> sds                                           65:32   0   3.7T  0 disk
>> sds1                                          65:33   0   3.6T  0 part
>> temporary-cryptsetup-1366 (dm-8)             252:8    0   125K  1 crypt
>> sds2                                          65:34   0    10G  0 part
>> temporary-cryptsetup-1611 (dm-21)            252:21   0   125K  1 crypt
>> sdt                                           65:48   0   3.7T  0 disk
>> sdt1                                          65:49   0   3.6T  0 part
>> temporary-cryptsetup-1734 (dm-30)            252:30   0   125K  1 crypt
>> sdt2                                          65:50   0    10G  0 part
>> temporary-cryptsetup-1735 (dm-28)            252:28   0   125K  1 crypt
>> sdu                                           65:64   0   3.7T  0 disk
>> sdu1                                          65:65   0   3.6T  0 part
>> temporary-cryptsetup-1605 (dm-22)            252:22   0   125K  1 crypt
>> sdu2                                          65:66   0    10G  0 part
>> temporary-cryptsetup-1607 (dm-26)            252:26   0   125K  1 crypt
>> sdv                                           65:80   0   3.7T  0 disk
>> sdv1                                          65:81   0   3.6T  0 part
>> temporary-cryptsetup-1739 (dm-33)            252:33   0   125K  1 crypt
>> sdv2                                          65:82   0    10G  0 part
>> temporary-cryptsetup-1772 (dm-35)            252:35   0   125K  1 crypt
>> sdw                                           65:96   0   3.7T  0 disk
>> sdw1                                          65:97   0   3.6T  0 part
>> 3171a1b9-e0f8-4521-a31a-821fcb549731 (dm-46) 252:46   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-14
>> sdw2                                          65:98   0    10G  0 part
>> 8c5882fd-21ef-4d9c-b62b-676248236514 (dm-47) 252:47   0    10G  0 crypt
>> sdx                                           65:112  0   3.7T  0 disk
>> sdx1                                          65:113  0   3.6T  0 part
>> a576166d-07c4-468c-a704-c4080290a12e (dm-40) 252:40   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-7
>> sdx2                                          65:114  0    10G  0 part
>> 1a93e588-dbd4-4ce4-9955-e2f450576314 (dm-42) 252:42   0    10G  0 crypt
>> sdy                                           65:128  0   3.7T  0 disk
>> sdy1                                          65:129  0   3.6T  0 part
>> da2f4e17-f2ba-49ce-bc11-fa699fbf0ba2 (dm-39) 252:39   0   3.6T  0 crypt /var/lib/ceph/osd/ceph-2
>> sdy2                                          65:130  0    10G  0 part
>> 14422a1f-083c-44a8-ac6d-d2b4fe20650e (dm-44) 252:44   0    10G  0 crypt
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: dmcrypt with luks keys in hammer
  2015-07-20 21:46   ` Wyllys Ingersoll
@ 2015-07-20 22:21     ` Sage Weil
  2015-07-20 22:23       ` Wyllys Ingersoll
  2015-07-21 11:14       ` David Disseldorp
  0 siblings, 2 replies; 9+ messages in thread
From: Sage Weil @ 2015-07-20 22:21 UTC (permalink / raw)
  To: Wyllys Ingersoll; +Cc: ceph-devel

On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
> No luck with ceph-disk-activate (all or just one device).
> 
> $ sudo ceph-disk-activate /dev/sdv1
> mount: unknown filesystem type 'crypto_LUKS'
> ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
> 'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
> '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32
> 
> 
> Its odd that it should complain about the "crypto_LUKS" filesystem not
> being recognized, because it did mount some of the LUKS systems
> successfully, though not sometimes just the data and not the journal
> (or vice versa).
> 
> $ lsblk /dev/sdb
> NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sdb                                               8:16   0   3.7T  0 disk
> ├─sdb1                                            8:17   0   3.6T  0 part
> │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
> └─sdb2                                            8:18   0    10G  0 part
>   └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt
> 
> 
> $ blkid /dev/sdb1
> /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"
> 
> 
> A race condition (or other issue) with udev seems likely given that
> its rather random which ones come up and which ones don't.

A race condition during creation or activation?  If it's activation I 
would expect ceph-disk activate ... to work reasonably reliably when 
called manually (on a single device at a time).

sage


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: dmcrypt with luks keys in hammer
  2015-07-20 22:21     ` Sage Weil
@ 2015-07-20 22:23       ` Wyllys Ingersoll
  2015-07-21 11:14       ` David Disseldorp
  1 sibling, 0 replies; 9+ messages in thread
From: Wyllys Ingersoll @ 2015-07-20 22:23 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On Mon, Jul 20, 2015 at 6:21 PM, Sage Weil <sage@newdream.net> wrote:
> On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
>> No luck with ceph-disk-activate (all or just one device).
>>
>> $ sudo ceph-disk-activate /dev/sdv1
>> mount: unknown filesystem type 'crypto_LUKS'
>> ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
>> 'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
>> '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32
>>
>>
>> Its odd that it should complain about the "crypto_LUKS" filesystem not
>> being recognized, because it did mount some of the LUKS systems
>> successfully, though not sometimes just the data and not the journal
>> (or vice versa).
>>
>> $ lsblk /dev/sdb
>> NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
>> sdb                                               8:16   0   3.7T  0 disk
>> ├─sdb1                                            8:17   0   3.6T  0 part
>> │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
>> └─sdb2                                            8:18   0    10G  0 part
>>   └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt
>>
>>
>> $ blkid /dev/sdb1
>> /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"
>>
>>
>> A race condition (or other issue) with udev seems likely given that
>> its rather random which ones come up and which ones don't.
>
> A race condition during creation or activation?  If it's activation I
> would expect ceph-disk activate ... to work reasonably reliably when
> called manually (on a single device at a time).
>
> sage
>


I'm not sure. I do know that all of the disks *did* work after the
initial installation and activation, but they fail after reboot, and
the failures are non-deterministic.  I'm not really sure how to debug
it any further.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: dmcrypt with luks keys in hammer
  2015-07-20 22:21     ` Sage Weil
  2015-07-20 22:23       ` Wyllys Ingersoll
@ 2015-07-21 11:14       ` David Disseldorp
  2015-07-21 14:00         ` Sage Weil
  2015-07-21 14:25         ` Milan Broz
  1 sibling, 2 replies; 9+ messages in thread
From: David Disseldorp @ 2015-07-21 11:14 UTC (permalink / raw)
  To: Sage Weil; +Cc: Wyllys Ingersoll, ceph-devel, Lars Marowsky-Bree

Hi,

On Mon, 20 Jul 2015 15:21:50 -0700 (PDT), Sage Weil wrote:

> On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
> > No luck with ceph-disk-activate (all or just one device).
> > 
> > $ sudo ceph-disk-activate /dev/sdv1
> > mount: unknown filesystem type 'crypto_LUKS'
> > ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
> > 'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
> > '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32
> > 
> > 
> > Its odd that it should complain about the "crypto_LUKS" filesystem not
> > being recognized, because it did mount some of the LUKS systems
> > successfully, though not sometimes just the data and not the journal
> > (or vice versa).
> > 
> > $ lsblk /dev/sdb
> > NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> > sdb                                               8:16   0   3.7T  0 disk
> > ├─sdb1                                            8:17   0   3.6T  0 part
> > │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
> > └─sdb2                                            8:18   0    10G  0 part
> >   └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt
> > 
> > 
> > $ blkid /dev/sdb1
> > /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"
> > 
> > 
> > A race condition (or other issue) with udev seems likely given that
> > its rather random which ones come up and which ones don't.
> 
> A race condition during creation or activation?  If it's activation I 
> would expect ceph-disk activate ... to work reasonably reliably when 
> called manually (on a single device at a time).

We encountered similar issues on a non-dmcrypt firefly deployment with
10 OSDs per node.

I've been working on a patch set to defer device activation to systemd
services. ceph-disk activate is extended to support mapping of dmcrypt
devices prior to OSD startup.

The master-based changes aren't ready for upstream yet, but can be found
in my WIP branch at:
https://github.com/ddiss/ceph/tree/wip_bnc926756_split_udev_systemd_master

There are a few things that I'd still like to address before submitting
upstream, mostly covering activate-journal:
- The test/ceph-disk.sh unit tests need to be extended and fixed.
- The activate-journal --dmcrypt changes are less than optimal, and leave
  me with a few unanswered questions:
  + Does get_journal_osd_uuid(dev) return the plaintext or cyphertext
    uuid?
  + If a journal is encrypted, is the data partition also always
    encrypted?
- dmcrypt journal device mapping should probably also be split out into
  a separate systemd service, as that'll be needed for the future
  network based key retrieval feature.

Feedback on the approach taken would be appreciated.

Cheers, David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: dmcrypt with luks keys in hammer
  2015-07-21 11:14       ` David Disseldorp
@ 2015-07-21 14:00         ` Sage Weil
  2015-07-21 14:26           ` Wyllys Ingersoll
  2015-07-21 14:25         ` Milan Broz
  1 sibling, 1 reply; 9+ messages in thread
From: Sage Weil @ 2015-07-21 14:00 UTC (permalink / raw)
  To: David Disseldorp; +Cc: Wyllys Ingersoll, ceph-devel, Lars Marowsky-Bree

On Tue, 21 Jul 2015, David Disseldorp wrote:
> Hi,
> 
> On Mon, 20 Jul 2015 15:21:50 -0700 (PDT), Sage Weil wrote:
> 
> > On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
> > > No luck with ceph-disk-activate (all or just one device).
> > > 
> > > $ sudo ceph-disk-activate /dev/sdv1
> > > mount: unknown filesystem type 'crypto_LUKS'
> > > ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
> > > 'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
> > > '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32
> > > 
> > > 
> > > Its odd that it should complain about the "crypto_LUKS" filesystem not
> > > being recognized, because it did mount some of the LUKS systems
> > > successfully, though not sometimes just the data and not the journal
> > > (or vice versa).
> > > 
> > > $ lsblk /dev/sdb
> > > NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> > > sdb                                               8:16   0   3.7T  0 disk
> > > ├─sdb1                                            8:17   0   3.6T  0 part
> > > │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
> > > └─sdb2                                            8:18   0    10G  0 part
> > >   └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt
> > > 
> > > 
> > > $ blkid /dev/sdb1
> > > /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"
> > > 
> > > 
> > > A race condition (or other issue) with udev seems likely given that
> > > its rather random which ones come up and which ones don't.
> > 
> > A race condition during creation or activation?  If it's activation I 
> > would expect ceph-disk activate ... to work reasonably reliably when 
> > called manually (on a single device at a time).
> 
> We encountered similar issues on a non-dmcrypt firefly deployment with
> 10 OSDs per node.
> 
> I've been working on a patch set to defer device activation to systemd
> services. ceph-disk activate is extended to support mapping of dmcrypt
> devices prior to OSD startup.
> 
> The master-based changes aren't ready for upstream yet, but can be found
> in my WIP branch at:
> https://github.com/ddiss/ceph/tree/wip_bnc926756_split_udev_systemd_master

This approach looks to be MUCH MUCH better than what we're doing right 
now!
 
> There are a few things that I'd still like to address before submitting
> upstream, mostly covering activate-journal:
> - The test/ceph-disk.sh unit tests need to be extended and fixed.
> - The activate-journal --dmcrypt changes are less than optimal, and leave
>   me with a few unanswered questions:
>   + Does get_journal_osd_uuid(dev) return the plaintext or cyphertext
>     uuid?

The uuid is never encrypted.

>   + If a journal is encrypted, is the data partition also always
>     encrypted?

Yes (I don't think it's useful to support a mixed encrypted/unencrypted 
OSD).

> - dmcrypt journal device mapping should probably also be split out into
>   a separate systemd service, as that'll be needed for the future
>   network based key retrieval feature.
> 
> Feedback on the approach taken would be appreciated.

My only regret is that it won't help non-systemd cases, but I'm okay with 
leaving those as is (users can use the existing workarounds, like 
'ceph-disk activate-all' in rc.local to mop up stragglers) and focus 
instead on the new systemd world.
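
For reference, the rc.local workaround could look roughly like the
sketch below.  It is an illustration only: the function name, the
SETTLE_TIMEOUT and CEPH_DISK variables, and the 60 second default are
assumptions made here, not something ceph ships.

```shell
# Hypothetical helper for /etc/rc.local (names and defaults are assumptions).
activate_stragglers() {
    # Wait (bounded) for the udev event queue to drain; carry on even if
    # the wait times out.
    udevadm settle --timeout="${SETTLE_TIMEOUT:-60}" || true
    # Ask ceph-disk to activate whatever udev did not bring up.
    "${CEPH_DISK:-ceph-disk}" activate-all
}

# Usage (as root, at the end of rc.local): activate_stragglers
```

Waiting on udevadm settle first is a design choice in this sketch, so
that the mop-up pass is less likely to run while udev is still
processing the same devices.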

Let us know if there's anything else we can do to help!

sage


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: dmcrypt with luks keys in hammer
  2015-07-21 11:14       ` David Disseldorp
  2015-07-21 14:00         ` Sage Weil
@ 2015-07-21 14:25         ` Milan Broz
  1 sibling, 0 replies; 9+ messages in thread
From: Milan Broz @ 2015-07-21 14:25 UTC (permalink / raw)
  To: David Disseldorp, Sage Weil
  Cc: Wyllys Ingersoll, ceph-devel, Lars Marowsky-Bree

On 07/21/2015 01:14 PM, David Disseldorp wrote:

>>> A race condition (or other issue) with udev seems likely given that
>>> its rather random which ones come up and which ones don't
>>
>> A race condition during creation or activation?  If it's activation I 
>> would expect ceph-disk activate ... to work reasonably reliably when 
>> called manually (on a single device at a time).

I still do not understand completely how the dmcrypt activation
in Ceph is designed, but there are clear problems in the current design.

Activating another device-mapper device from inside udev rules (here a
LUKS or plain dmcrypt device) is broken by design; it can work only
with ugly workarounds.

The first reason is correctly mentioned in your WIP branch: udev RUN is
intended for short-running commands. For example, I think that if you
increase the iteration count on a LUKS device, the Ceph udev rules will
fail completely, because udev will kill the event processing on timeout.
(Unlocking can take minutes when you move an encrypted disk to a very
slow machine.)
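
One rough way to see how long a single unlock takes on a given machine
is to time a key check that does not activate the volume.  This is only
a sketch: the function name is made up for illustration, and the device
and key file are whatever the caller passes in.  The result can then be
compared with the udev event timeout configured on that host.

```shell
# Time one LUKS key check without activating the volume.
# $1 = LUKS partition, $2 = key file (both supplied by the caller).
time_luks_unlock() {
    start=$(date +%s)
    # --test-passphrase verifies the key only; no mapping is left behind.
    cryptsetup luksOpen --test-passphrase --key-file "$2" "$1"
    rc=$?
    end=$(date +%s)
    echo "unlock of $1 took $((end - start))s (exit status $rc)"
    return $rc
}

# Usage (as root): time_luks_unlock <partition> <key file>
```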

The second reason is even more serious: cryptsetup itself uses udev
(through libdevmapper) to create nodes and must synchronize with
some other device-mapper udev rules. So here it is a race by design:
udev waits for another udev process. Ditto for creating the /dev/by*
links (created by a udev rule as well).

(And add to the mix the +watch rules, which react to a close after
write on every node by running another udev rule blkid scan. If you see
leftover temporary-cryptsetup* devices, something is really wrong. These
devices are internal to libcryptsetup and map keyslots only; they are
never kept open in correct operation.)

So moving activation outside of the udev rules is the correct solution
here; only processing of device nodes should happen there, and the rest
should be offloaded until after the udev rules have run.

> We encountered similar issues on a non-dmcrypt firefly deployment with
> 10 OSDs per node.
> 
> I've been working on a patch set to defer device activation to systemd
> services. ceph-disk activate is extended to support mapping of dmcrypt
> devices prior to OSD startup.

Well, using a systemd service is one option. But then it should handle all
cryptsetup device activations.

Milan



* Re: dmcrypt with luks keys in hammer
  2015-07-21 14:00         ` Sage Weil
@ 2015-07-21 14:26           ` Wyllys Ingersoll
  0 siblings, 0 replies; 9+ messages in thread
From: Wyllys Ingersoll @ 2015-07-21 14:26 UTC (permalink / raw)
  To: Sage Weil; +Cc: David Disseldorp, ceph-devel, Lars Marowsky-Bree

"ceph-disk activate-all" does not fix the problem for non-systemd
users.  Once the partitions get into the "temporary-cryptsetup-PID" state,
they have to be manually cleared and reopened as follows:


1. "cryptsetup close" all of the ones in the "temporary-cryptsetup" state
2. find the UUID for each block device (journal and data partitions)
3. cryptsetup luksOpen on those devices individually


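for i in $(ls /dev/sd?[12] | grep -v sda)
do
   # Read the GPT partition UUID (PART_ENTRY_UUID) reported by blkid
   UUID=$(sudo blkid -p "$i" | sed 's/ /\n/g' | grep PART_ENTRY_UUID \
          | cut -f2 -d= | tr -d '"')
   # Map the partition under its UUID, using the matching key file
   sudo cryptsetup luksOpen "$i" "$UUID" \
        --key-file "/etc/ceph/dmcrypt-keys/${UUID}.luks.key"
done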

$ sudo start ceph-osd-all
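
Steps 2 and 3 can also be sketched with blkid's udev-style output
("blkid -p -o udev", as shown earlier in this thread), which prints one
KEY=value pair per line and so avoids splitting on spaces. The helper names
part_entry_uuid and luks_open_cmd are hypothetical, and the key file naming
is simply the one used in the loop above. The second helper only prints the
command, so it can be reviewed before anything is run as root.

```shell
# Extract the GPT partition UUID from "blkid -p -o udev" output on stdin.
# (Hypothetical helper.)
part_entry_uuid() {
    sed -n 's/^ID_PART_ENTRY_UUID=//p'
}

# Print (do not run) the luksOpen command for one partition.
#   $1 = block device, $2 = partition UUID
# Assumes keys are stored as /etc/ceph/dmcrypt-keys/<UUID>.luks.key.
luks_open_cmd() {
    printf 'cryptsetup luksOpen %s %s --key-file /etc/ceph/dmcrypt-keys/%s.luks.key\n' \
        "$1" "$2" "$2"
}

# Typical use (needs root for blkid -p, so it is not executed here):
#   uuid=$(sudo blkid -p -o udev /dev/sdv2 | part_entry_uuid)
#   luks_open_cmd /dev/sdv2 "$uuid"
```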

On Tue, Jul 21, 2015 at 10:00 AM, Sage Weil <sage@newdream.net> wrote:
> On Tue, 21 Jul 2015, David Disseldorp wrote:
>> Hi,
>>
>> On Mon, 20 Jul 2015 15:21:50 -0700 (PDT), Sage Weil wrote:
>>
>> > On Mon, 20 Jul 2015, Wyllys Ingersoll wrote:
>> > > No luck with ceph-disk-activate (all or just one device).
>> > >
>> > > $ sudo ceph-disk-activate /dev/sdv1
>> > > mount: unknown filesystem type 'crypto_LUKS'
>> > > ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t',
>> > > 'crypto_LUKS', '-o', '', '--', '/dev/sdv1',
>> > > '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32
>> > >
>> > >
>> > > It's odd that it should complain about the "crypto_LUKS" filesystem not
>> > > being recognized, because it did mount some of the LUKS systems
>> > > successfully, though sometimes just the data and not the journal
>> > > (or vice versa).
>> > >
>> > > $ lsblk /dev/sdb
>> > > NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
>> > > sdb                                               8:16   0   3.7T  0 disk
>> > > ├─sdb1                                            8:17   0   3.6T  0 part
>> > > │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:0    0   3.6T  0 crypt /var/lib/ceph/osd/ceph-54
>> > > └─sdb2                                            8:18   0    10G  0 part
>> > >   └─temporary-cryptsetup-1235 (dm-6)            252:6    0   125K  1 crypt
>> > >
>> > >
>> > > $ blkid /dev/sdb1
>> > > /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS"
>> > >
>> > >
>> > > A race condition (or other issue) with udev seems likely given that
>> > > it's rather random which ones come up and which ones don't.
>> >
>> > A race condition during creation or activation?  If it's activation I
>> > would expect ceph-disk activate ... to work reasonably reliably when
>> > called manually (on a single device at a time).
>>
>> We encountered similar issues on a non-dmcrypt firefly deployment with
>> 10 OSDs per node.
>>
>> I've been working on a patch set to defer device activation to systemd
>> services. ceph-disk activate is extended to support mapping of dmcrypt
>> devices prior to OSD startup.
>>
>> The master-based changes aren't ready for upstream yet, but can be found
>> in my WIP branch at:
>> https://github.com/ddiss/ceph/tree/wip_bnc926756_split_udev_systemd_master
>
> This approach looks to be MUCH MUCH better than what we're doing right
> now!
>
>> There are a few things that I'd still like to address before submitting
>> upstream, mostly covering activate-journal:
>> - The test/ceph-disk.sh unit tests need to be extended and fixed.
>> - The activate-journal --dmcrypt changes are less than optimal, and leave
>>   me with a few unanswered questions:
>>   + Does get_journal_osd_uuid(dev) return the plaintext or cyphertext
>>     uuid?
>
> The uuid is never encrypted.
>
>>   + If a journal is encrypted, is the data partition also always
>>     encrypted?
>
> Yes (I don't think it's useful to support a mixed encrypted/unencrypted
> OSD).
>
>> - dmcrypt journal device mapping should probably also be split out into
>>   a separate systemd service, as that'll be needed for the future
>>   network based key retrieval feature.
>>
>> Feedback on the approach taken would be appreciated.
>
> My only regret is that it won't help non-systemd cases, but I'm okay with
> leaving those as is (users can use the existing workarounds, like
> 'ceph-disk activate-all' in rc.local to mop up stragglers) and focus
> instead on the new systemd world.
>
> Let us know if there's anything else we can do to help!
>
> sage
>


end of thread, other threads:[~2015-07-21 14:26 UTC | newest]

Thread overview: 9+ messages
2015-07-20 19:52 dmcrypt with luks keys in hammer Wyllys Ingersoll
2015-07-20 21:22 ` Sage Weil
2015-07-20 21:46   ` Wyllys Ingersoll
2015-07-20 22:21     ` Sage Weil
2015-07-20 22:23       ` Wyllys Ingersoll
2015-07-21 11:14       ` David Disseldorp
2015-07-21 14:00         ` Sage Weil
2015-07-21 14:26           ` Wyllys Ingersoll
2015-07-21 14:25         ` Milan Broz
