* pre-Infernalis ceph-disk bug
@ 2015-10-13 22:02 Jeremy Hanmer
  2015-10-13 22:31 ` Loic Dachary
  0 siblings, 1 reply; 3+ messages in thread
From: Jeremy Hanmer @ 2015-10-13 22:02 UTC (permalink / raw)
  To: Ceph Development

I think I've found a bug in ceph-disk when running on Ubuntu 14.04
(and I believe 12.04 as well, but haven't confirmed) and using
--dmcrypt.

The problem is that when update_partition() is called on Ubuntu,
partprobe is used to re-read the partition table (as opposed to partx
on all other distros), and partprobe does not appear to refresh all of
the device's metadata as reported by udev. Specifically,
ID_PART_ENTRY_TYPE isn't updated:

root@ceph-osd03:~# udevadm info --query=env --name=/dev/vdd1 | grep ID_PART_ENTRY_TYPE
ID_PART_ENTRY_TYPE=89c57f98-2fe5-4dc0-89c1-5ec00ceff2be

Running `partx -u` instead of `partprobe` does the right thing:

root@ceph-osd03:~# partx -u /dev/vdd1
root@ceph-osd03:~# udevadm info --query=env --name=/dev/vdd1 | grep ID_PART_ENTRY_TYPE
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-5ec00ceff05d
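
The manual check above could be scripted roughly as follows. This is only a sketch, not code from ceph-disk: the helper names parse_udev_env and partition_type_is are hypothetical, and the sample strings are copied from the udevadm output shown in this message. On a real host the text would come from `udevadm info --query=env --name=<device>`.

```python
def parse_udev_env(output):
    """Parse KEY=VALUE lines, as printed by `udevadm info --query=env`, into a dict."""
    env = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not KEY=VALUE pairs
            env[key.strip()] = value.strip()
    return env


def partition_type_is(output, expected_guid):
    """Return True if ID_PART_ENTRY_TYPE in the udev output equals expected_guid."""
    actual = parse_udev_env(output).get("ID_PART_ENTRY_TYPE", "")
    return actual.lower() == expected_guid.lower()


# Sample values taken from the two udevadm runs in this message.
AFTER_PARTPROBE = "ID_PART_ENTRY_TYPE=89c57f98-2fe5-4dc0-89c1-5ec00ceff2be\n"
AFTER_PARTX = "ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-5ec00ceff05d\n"
EXPECTED = "4fbd7e29-9d25-41b8-afd0-5ec00ceff05d"

if __name__ == "__main__":
    print("after partprobe:", partition_type_is(AFTER_PARTPROBE, EXPECTED))
    print("after partx -u: ", partition_type_is(AFTER_PARTX, EXPECTED))
```

Comparing case-insensitively is a defensive choice here, since GUIDs may be printed in either case depending on the tool.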


I have an experimental patch here that Works For Me, but Sage wanted
me to ping the list for input:

https://github.com/fzylogic/ceph/commit/8c83f75392d68fbec7def8aa61f20b2c9c237571


I also want to test the new Infernalis code for this same bug (after a
cursory check, I strongly suspect it's there as well), but it'll take
a little bit to get another test cluster up to confirm.

* Re: pre-Infernalis ceph-disk bug
  2015-10-13 22:02 pre-Infernalis ceph-disk bug Jeremy Hanmer
@ 2015-10-13 22:31 ` Loic Dachary
  2015-10-13 22:51   ` Jeremy Hanmer
  0 siblings, 1 reply; 3+ messages in thread
From: Loic Dachary @ 2015-10-13 22:31 UTC (permalink / raw)
  To: Jeremy Hanmer, Ceph Development

Hi,

There have been many changes in Infernalis, most of them to make it more robust. It would be great if you could try to reproduce the problem with Infernalis.

Your patch looks good, and you could also remove https://github.com/fzylogic/ceph/blob/8c83f75392d68fbec7def8aa61f20b2c9c237571/src/ceph-disk#L1505, since that will happen immediately after the function returns.

An alternative fix would be to run `udevadm settle` both before and after https://github.com/fzylogic/ceph/blob/8c83f75392d68fbec7def8aa61f20b2c9c237571/src/ceph-disk#L985 to avoid races. I think partprobe does not appear to work because it triggers udev events that race with the udev events triggered by sgdisk while creating the partition.
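
To make the settle-before-and-after idea concrete, here is a minimal sketch. It is not the code at the links above: the function name reread_partition_table and its `tool` and `run` parameters are made up for illustration, and the only commands it issues are the ones mentioned in this thread (`udevadm settle`, `partprobe`, `partx -u`).

```python
import subprocess


def reread_partition_table(dev, tool="partx", run=subprocess.check_call):
    """Re-read the partition table of dev, waiting for udev before and after.

    `run` is injectable so the command sequence can be inspected without
    touching a real device; by default each command is executed and a
    non-zero exit status raises CalledProcessError.
    """
    if tool == "partx":
        reread = ["partx", "-u", dev]
    elif tool == "partprobe":
        reread = ["partprobe", dev]
    else:
        raise ValueError("unknown tool: %r" % (tool,))

    commands = [
        ["udevadm", "settle"],  # let pending events (e.g. from sgdisk) drain
        reread,                 # ask the kernel to re-read the partition table
        ["udevadm", "settle"],  # wait for the events the re-read itself triggers
    ]
    for cmd in commands:
        run(cmd)
    return commands
```

Passing a recorder such as `calls.append` as `run` shows the order of commands without requiring root or a block device.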

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: pre-Infernalis ceph-disk bug
  2015-10-13 22:31 ` Loic Dachary
@ 2015-10-13 22:51   ` Jeremy Hanmer
  0 siblings, 0 replies; 3+ messages in thread
From: Jeremy Hanmer @ 2015-10-13 22:51 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

Cool, I'll try to clean things up and submit a PR (probably one for
pre-Infernalis and one for Infernalis, assuming it's still broken there).

I'm 99% certain I ruled out any udev silliness by throwing a bunch of
settle calls every place I thought might make some theoretical sense.
No number of those calls ever resulted in a properly functioning
partprobe call.

