Re: [PATCH 2/9] uuid: use random32_get

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
       [not found] ` <1351408746-8623-2-git-send-email-akinobu.mita@gmail.com>
@ 2012-10-29 20:52   ` Theodore Ts'o
  2012-10-30  1:49     ` Huang Ying
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2012-10-29 20:52 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: linux-kernel, akpm, ying.huang, chris.mason, linux-btrfs

On Sun, Oct 28, 2012 at 04:18:59PM +0900, Akinobu Mita wrote:
> Use random32_get_bytes() to generate 16 bytes of pseudo-random bytes.
> 
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

Since your patch is going to allow users to set the random seed, it
means that what had previously been a bad security bug has just become
a grievous security bug.  If you are going to be generating UUID's
they _must_ use a truly random random generator, since the whole point
of uuid's is that they be unique.  If someone can trivially set the
random seed of a prng, and thus cause the uuid generator to generate,
well, non-unique UUID's, the results can range anywhere from
confusion, to file system corruption and data loss.

Fortunately, there is only one user of lib/uuid.c, and that's the
btrfs file system.

Chris and the Btrfs folks --- my recommendation would be to ditch the
use of uuid_be_gen, "git rm lib/uuid.c" with extreme prejudice, and
use generate_random_uuid() which was coded over a decade ago in
drivers/char/random.c.  Not only does this properly use the kernel
random number generator, but it also creates a UUID with the correct
format.  (It's not enough to set the UUID version to 4; you also need
to set the UUID variant to be DCE if you want to be properly compliant
with RFC 4122 --- see section 4.1.1.)

The btrfs file system doesn't generate uuid's in any critical fast
paths as near as I can determine, so it should be perfectly safe to
use uuid_generate() --- I also would drop the whole distinction
between little-endian and big-endian uuid's, BTW.  RFC 4122 is quite
explicit about how uuid's should be encoded, and it's in internet byte
order.  This is what OSF/DCE uses, and it's what the rest of the world
(Microsoft, SAP AG, Apple, GNOME, KDE) uses as well.

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
  2012-10-29 20:52   ` [PATCH 2/9] uuid: use random32_get_bytes() Theodore Ts'o
@ 2012-10-30  1:49     ` Huang Ying
  2012-10-30  4:48       ` Theodore Ts'o
  0 siblings, 1 reply; 6+ messages in thread
From: Huang Ying @ 2012-10-30  1:49 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Akinobu Mita, linux-kernel, akpm, chris.mason, linux-btrfs

On Mon, 2012-10-29 at 16:52 -0400, Theodore Ts'o wrote:
> On Sun, Oct 28, 2012 at 04:18:59PM +0900, Akinobu Mita wrote:
> > Use random32_get_bytes() to generate 16 bytes of pseudo-random bytes.
> > 
> > Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> 
> Since your patch is going to allow users to set the random seed, it
> means that what had previously been a bad security bug has just become
> a grievous security bug.  If you are going to be generating UUID's
> they _must_ use a truly random random generator, since the whole point
> of uuid's is that they be unique.  If someone can trivially set the
> random seed of a prng, and thus cause the uuid generator to generate,
> well, non-unique UUID's, the results can range anywhere from
> confusion, to file system corruption and data loss.
> 
> Fortunately, there is only one user of lib/uuid.c, and that's the
> btrfs file system.
> 
> Chris and the Btrfs folks --- my recommendation would be to ditch the
> use of uuid_be_gen, "git rm lib/uuid.c" with extreme prejudice, and
> use generate_random_uuid() which was coded over a decade ago in
> drivers/char/random.c.  Not only does this properly use the kernel
> random number generator, but it also creates a UUID with the correct
> format.  (It's not enough to set the UUID version to 4; you also need
> to set the UUID variant to be DCE if you want to be properly compliant
> with RFC 4122 --- see section 4.1.1.)

The uuid_le/be_gen() in lib/uuid.c has set UUID variants to be DCE,
that is done in __uuid_gen_common() with "b[8] = (b[8] & 0x3F) | 0x80".

To deal with random number generation issue, how about use
get_random_bytes() in __uuid_gen_common()?

> The btrfs file system doesn't generate uuid's in any critical fast
> paths as near as I can determine, so it should be perfectly safe to
> use uuid_generate() --- I also would drop the whole distinction
> between little-endian and big-endian uuid's, BTW.  RFC 4122 is quite
> explicit about how uuid's should be encoded, and it's in internet byte
> order.  This is what OSF/DCE uses, and it's what the rest of the world
> (Microsoft, SAP AG, Apple, GNOME, KDE) uses as well.

uuid_be complies RFC4122, it uses internet byte order.  But EFI uses
little endian.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
  2012-10-30  1:49     ` Huang Ying
@ 2012-10-30  4:48       ` Theodore Ts'o
  2012-10-31  1:35         ` Huang Ying
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2012-10-30  4:48 UTC (permalink / raw)
  To: Huang Ying; +Cc: Akinobu Mita, linux-kernel, akpm, chris.mason, linux-btrfs

On Tue, Oct 30, 2012 at 09:49:58AM +0800, Huang Ying wrote:
> The uuid_le/be_gen() in lib/uuid.c has set UUID variants to be DCE,
> that is done in __uuid_gen_common() with "b[8] = (b[8] & 0x3F) | 0x80".

Oh, I see, I missed that.

> To deal with random number generation issue, how about use
> get_random_bytes() in __uuid_gen_common()?

We already have generate_random_uuid() in drivers/char/random.c, and
no users for lib/uuid.c's equivalent uuid_be_gen().  So here's a
counter-proposal, why don't we drop lib/uuid.c, and include in
drivers/char/random.c:

/*
 * Generate random GUID
 *
 * GUID's is like UUID's, but they uses the non-standard little-endian
 * layout, compared to what is defined in RFC-4112; it is primarily
 * used by the EFI specification.
 */
void generate_random_guid(unsigned char uuid_out[16])
{
	get_random_bytes(uuid_out, 16);
	/* Set UUID version to 4 --- truly random generation */
	uuid_out[7] = (uuid_out[7] & 0x0F) | 0x40;
	/* Set the UUID variant to DCE */
	uuid_out[8] = (uuid_out[8] & 0x3F) | 0x80;
}
EXPORT_SYMBOL(generate_random_guid);

I really don't think it's worth it to have a __uuid_gen_common once we
are using get_random_bytes(), since there isn't much code to be
factored out, and it's simpler just to have two functions in one place.

Using UUID vs. GUID I think makes things much clearer, since the EFI
specification talks about GUID's, not UUID's, and that way we don't
have to worry about people getting confused about whether they should
be using the little-endian versus big-endian variant.  (And I'd love
to ask to whoever wrote the EFI specification what on *Earth* were
they thinking when they decided to diverge from the rest of the
world....)

	      	       	       	       	    - Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
  2012-10-30  4:48       ` Theodore Ts'o
@ 2012-10-31  1:35         ` Huang Ying
  2012-10-31  2:38           ` Theodore Ts'o
  0 siblings, 1 reply; 6+ messages in thread
From: Huang Ying @ 2012-10-31  1:35 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Akinobu Mita, linux-kernel, akpm, chris.mason, linux-btrfs

On Tue, 2012-10-30 at 00:48 -0400, Theodore Ts'o wrote:
> On Tue, Oct 30, 2012 at 09:49:58AM +0800, Huang Ying wrote:
> > The uuid_le/be_gen() in lib/uuid.c has set UUID variants to be DCE,
> > that is done in __uuid_gen_common() with "b[8] = (b[8] & 0x3F) | 0x80".
> 
> Oh, I see, I missed that.
> 
> > To deal with random number generation issue, how about use
> > get_random_bytes() in __uuid_gen_common()?
> 
> We already have generate_random_uuid() in drivers/char/random.c, and
> no users for lib/uuid.c's equivalent uuid_be_gen().  So here's a
> counter-proposal, why don't we drop lib/uuid.c, and include in
> drivers/char/random.c:
> 
> /*
>  * Generate random GUID
>  *
>  * GUID's is like UUID's, but they uses the non-standard little-endian
>  * layout, compared to what is defined in RFC-4112; it is primarily
>  * used by the EFI specification.
>  */
> void generate_random_guid(unsigned char uuid_out[16])
> {
> 	get_random_bytes(uuid_out, 16);
> 	/* Set UUID version to 4 --- truly random generation */
> 	uuid_out[7] = (uuid_out[7] & 0x0F) | 0x40;
> 	/* Set the UUID variant to DCE */
> 	uuid_out[8] = (uuid_out[8] & 0x3F) | 0x80;
> }
> EXPORT_SYMBOL(generate_random_guid);
> 
> I really don't think it's worth it to have a __uuid_gen_common once we
> are using get_random_bytes(), since there isn't much code to be
> factored out, and it's simpler just to have two functions in one place.

The intention of lib/uuid.c is to unify various UUID related code, and
put them in same place.  In addition to UUID generation, it provide some
other utility and may provide/collect more in the future.  So do you
think it is a good idea to put generate_rand_uuid/guid into lib/uuid.c
and maybe change the name/prototype to make it consistent with other
uuid definitions?

> Using UUID vs. GUID I think makes things much clearer, since the EFI
> specification talks about GUID's, not UUID's, and that way we don't
> have to worry about people getting confused about whether they should
> be using the little-endian versus big-endian variant.  (And I'd love
> to ask to whoever wrote the EFI specification what on *Earth* were
> they thinking when they decided to diverge from the rest of the
> world....)

I think that is a good idea.  From Wikipedia, GUID is in native byte
order, while UUID is in internet byte order.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
  2012-10-31  1:35         ` Huang Ying
@ 2012-10-31  2:38           ` Theodore Ts'o
  2012-10-31  3:06             ` Huang Ying
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2012-10-31  2:38 UTC (permalink / raw)
  To: Huang Ying; +Cc: Akinobu Mita, linux-kernel, akpm, chris.mason, linux-btrfs

On Wed, Oct 31, 2012 at 09:35:37AM +0800, Huang Ying wrote:
> 
> The intention of lib/uuid.c is to unify various UUID related code, and
> put them in same place.  In addition to UUID generation, it provide some
> other utility and may provide/collect more in the future.  So do you
> think it is a good idea to put generate_rand_uuid/guid into lib/uuid.c
> and maybe change the name/prototype to make it consistent with other
> uuid definitions?

I had trouble understanding why lib/uuid.c existed, since the only
thing I saw was the uuid generation function.  After some more
looking, I see you also created inline functions which wrapped
memcmp().

The problem I have with your abstractions is that it just makes life
more complicated for the callers.  All of the current places which use
generate_random_uuid() merely want to fill in a unsigned char array.
This includes btrfs, by the way, which is already using
generate_random_uuid in some places, and I'm not sure why they are
using uuid_le_gen(), since there doesn't seem to be any need for a
little-endian uuid/guid here (it's just used as unique bag of bits
which is 16 bytes long), and using uuid_le_gen() means extra memory
has to be allocated on the stack, and then an extra memory copy is
required.  Contrast (in fs/btrfs/root-tree.c):

	   uuid_le uuid;
	   ...
		uuid_le_gen(&uuid);
		memcpy(item->uuid, uuid.b, BTRFS_UUID_SIZE);

versus, simply doing (fs/btrfs/volumes.c):

	generate_random_uuid(fs_devices->fsid);

see which one is easier?  And after the uuid is generated, none of the
current callers ever do any manipulation of the uuid, so there's no
real point to play fancy typedef games; it just adds more work for no
real gain.

> > Using UUID vs. GUID I think makes things much clearer, since the EFI
> > specification talks about GUID's, not UUID's, and that way we don't
> > have to worry about people getting confused about whether they should
> > be using the little-endian versus big-endian variant.  (And I'd love
> > to ask to whoever wrote the EFI specification what on *Earth* were
> > they thinking when they decided to diverge from the rest of the
> > world....)
> 
> I think that is a good idea.  From Wikipedia, GUID is in native byte
> order, while UUID is in internet byte order.

Well, technially GUID is "intel/little-endian byte order".  If someone
tried to implement the GPT on a big-endian system, such as PowerPC,
they would still have to use the little-endian byte order, even though
it's not the native byte order for that architecture.  Otherwise
devices wouldn't be portable between those systems.  (This is why I
think the GUID was such a bad idea; everyone basically treats them as
16 byte octet strings, so this whole idea of "native byte order" just
to save a few byte swaps at UUID generation time was really, IMHO, a
very bad idea.)

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9] uuid: use random32_get_bytes()
  2012-10-31  2:38           ` Theodore Ts'o
@ 2012-10-31  3:06             ` Huang Ying
  0 siblings, 0 replies; 6+ messages in thread
From: Huang Ying @ 2012-10-31  3:06 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Akinobu Mita, linux-kernel, akpm, chris.mason, linux-btrfs

On Tue, 2012-10-30 at 22:38 -0400, Theodore Ts'o wrote:
> On Wed, Oct 31, 2012 at 09:35:37AM +0800, Huang Ying wrote:
> > 
> > The intention of lib/uuid.c is to unify various UUID related code, and
> > put them in same place.  In addition to UUID generation, it provide some
> > other utility and may provide/collect more in the future.  So do you
> > think it is a good idea to put generate_rand_uuid/guid into lib/uuid.c
> > and maybe change the name/prototype to make it consistent with other
> > uuid definitions?
> 
> I had trouble understanding why lib/uuid.c existed, since the only
> thing I saw was the uuid generation function.  After some more
> looking, I see you also created inline functions which wrapped
> memcmp().
> 
> The problem I have with your abstractions is that it just makes life
> more complicated for the callers.  All of the current places which use
> generate_random_uuid() merely want to fill in a unsigned char array.
> This includes btrfs, by the way, which is already using
> generate_random_uuid in some places, and I'm not sure why they are
> using uuid_le_gen(), since there doesn't seem to be any need for a
> little-endian uuid/guid here (it's just used as unique bag of bits
> which is 16 bytes long), and using uuid_le_gen() means extra memory
> has to be allocated on the stack, and then an extra memory copy is
> required.  Contrast (in fs/btrfs/root-tree.c):
> 
> 	   uuid_le uuid;
> 	   ...
> 		uuid_le_gen(&uuid);
> 		memcpy(item->uuid, uuid.b, BTRFS_UUID_SIZE);
> 
> versus, simply doing (fs/btrfs/volumes.c):
> 
> 	generate_random_uuid(fs_devices->fsid);
> 
> see which one is easier?  And after the uuid is generated, none of the
> current callers ever do any manipulation of the uuid, so there's no
> real point to play fancy typedef games; it just adds more work for no
> real gain.

If we use uuid_le when we define the data structure, life will be eaiser

struct btrfs_root_item {
	...
	uuid_le uuid;
	...
};

Then it is quite easy to use it.

uuid_le_gen(&item->uuid);

That is the intended usage model.

UUID_LE() macro definition has some user.  It makes it easier to
construct UUID/GUID defined in some specs.

> > > Using UUID vs. GUID I think makes things much clearer, since the EFI
> > > specification talks about GUID's, not UUID's, and that way we don't
> > > have to worry about people getting confused about whether they should
> > > be using the little-endian versus big-endian variant.  (And I'd love
> > > to ask to whoever wrote the EFI specification what on *Earth* were
> > > they thinking when they decided to diverge from the rest of the
> > > world....)
> > 
> > I think that is a good idea.  From Wikipedia, GUID is in native byte
> > order, while UUID is in internet byte order.
> 
> Well, technially GUID is "intel/little-endian byte order".  If someone
> tried to implement the GPT on a big-endian system, such as PowerPC,
> they would still have to use the little-endian byte order, even though
> it's not the native byte order for that architecture.  Otherwise
> devices wouldn't be portable between those systems.  (This is why I
> think the GUID was such a bad idea; everyone basically treats them as
> 16 byte octet strings, so this whole idea of "native byte order" just
> to save a few byte swaps at UUID generation time was really, IMHO, a
> very bad idea.)

Yes.  Explicit byte order is better.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-10-31  3:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1351408746-8623-1-git-send-email-akinobu.mita@gmail.com>
     [not found] ` <1351408746-8623-2-git-send-email-akinobu.mita@gmail.com>
2012-10-29 20:52   ` [PATCH 2/9] uuid: use random32_get_bytes() Theodore Ts'o
2012-10-30  1:49     ` Huang Ying
2012-10-30  4:48       ` Theodore Ts'o
2012-10-31  1:35         ` Huang Ying
2012-10-31  2:38           ` Theodore Ts'o
2012-10-31  3:06             ` Huang Ying

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).