All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Don't use floor() in _bitset_with_random_bits
@ 2010-07-30 17:05 Przemyslaw Iskra
  2010-08-06 19:24 ` Petr Rockai
  0 siblings, 1 reply; 3+ messages in thread
From: Przemyslaw Iskra @ 2010-07-30 17:05 UTC (permalink / raw)
  To: lvm-devel

Use _even_rand() function instead of floor() in
_bitset_with_random_bits(). floor() function is missing in dietlibc (on
architectures other than x86). Moreover using floor() to clip rand
results does not assure even result distribution.
_even_rand() uses integer arithmetic only and is designed to return evenly
distributed results.

Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>
---
 configure.in            |    3 +--
 lib/metadata/metadata.c |   16 +++++++++++++++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/configure.in b/configure.in
index bd56136..6f6c67e 100644
--- a/configure.in
+++ b/configure.in
@@ -125,8 +125,7 @@ AC_STRUCT_TM
 
 ################################################################################
 dnl -- Check for functions
-AC_SEARCH_LIBS([floor], [m], , [AC_MSG_ERROR(bailing out)])
-AC_CHECK_FUNCS([floor ftruncate gethostname getpagesize \
+AC_CHECK_FUNCS([ftruncate gethostname getpagesize \
   gettimeofday memset mkdir mkfifo rmdir munmap nl_langinfo setenv setlocale \
   strcasecmp strchr strcspn strspn strdup strncasecmp strerror strrchr \
   strstr strtol strtoul uname], , [AC_MSG_ERROR(bailing out)])
diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
index 07222a7..6ee7731 100644
--- a/lib/metadata/metadata.c
+++ b/lib/metadata/metadata.c
@@ -1018,6 +1018,20 @@ static int _recalc_extents(uint32_t *extents, const char *desc1,
 	return 1;
 }
 
+/* return random integer in [0,max) interval */
+static unsigned _even_rand( unsigned *seed, unsigned max )
+{
+	unsigned r, ret;
+
+	/* make sure distribution is even */
+	do {
+		r = (unsigned) rand_r( seed );
+		ret = r % max;
+	} while ( r - ret >= RAND_MAX - max );
+
+	return ret;
+}
+
 static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bits,
 					    uint32_t num_set_bits, unsigned *seed)
 {
@@ -1040,7 +1054,7 @@ static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bi
 	/* Perform loop num_set_bits times, selecting one bit each time */
 	while (i++ < num_bits) {
 		/* Select a random bit between 0 and (i-1) inclusive. */
-		bit_selected = (unsigned) floor(i * (rand_r(seed) / (RAND_MAX + 1.0)));
+		bit_selected = _even_rand( seed, i );
 
 		/*
 		 * If the bit was already set, set the new bit that became
-- 
1.7.1



-- 
 ____  Sparky{PI] -- Przemyslaw _  ___  _  _  ......... LANG...Pl,Ca,Es,En
/____) ___  ___  _ _ || Iskra  |  | _ \| |  | : WWW...ppcrcd.pld-linux.org
\____\| -_)'___| ||^'||//\\// <   |  _/| |  | : WWW2..............rsget.pl
(____/||   (_-_|_||  ||\\ ||   |_ |_|  |_| _| : Mail..sparky@pld-linux.org



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH] Don't use floor() in _bitset_with_random_bits
  2010-07-30 17:05 [PATCH] Don't use floor() in _bitset_with_random_bits Przemyslaw Iskra
@ 2010-08-06 19:24 ` Petr Rockai
  2010-08-07  1:51   ` Przemyslaw Iskra
  0 siblings, 1 reply; 3+ messages in thread
From: Petr Rockai @ 2010-08-06 19:24 UTC (permalink / raw)
  To: lvm-devel

Hi,

Przemyslaw Iskra <sparky@pld-linux.org> writes:
> Use _even_rand() function instead of floor() in
> _bitset_with_random_bits(). floor() function is missing in dietlibc (on
> architectures other than x86). Moreover using floor() to clip rand
> results does not assure even result distribution.
> _even_rand() uses integer arithmetic only and is designed to return evenly
> distributed results.
>
> Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>

Reviewed-by: Petr Rockai <prockai@redhat.com>

Looks OK to me. It took a while to decipher what is the exact meaning of
the loop in _even_rand (to a non-pseudorandomness-expert) but I am
fairly comfortable with it now. If I understand this correctly, it
rejects numbers that come from an "incomplete" slice of the RAND_MAX
space (considering the number space [0, RAND_MAX] is divided into some
"max"-sized slices and at most a single smaller slice, between [n*max,
RAND_MAX] for suitable n -- numbers from this last slice are discarded
because they could distort the distribution in favour of smaller
numbers).

> diff --git a/configure.in b/configure.in
> index bd56136..6f6c67e 100644
> --- a/configure.in
> +++ b/configure.in
> @@ -125,8 +125,7 @@ AC_STRUCT_TM
>  
>  ################################################################################
>  dnl -- Check for functions
> -AC_SEARCH_LIBS([floor], [m], , [AC_MSG_ERROR(bailing out)])
> -AC_CHECK_FUNCS([floor ftruncate gethostname getpagesize \
> +AC_CHECK_FUNCS([ftruncate gethostname getpagesize \
>    gettimeofday memset mkdir mkfifo rmdir munmap nl_langinfo setenv setlocale \
>    strcasecmp strchr strcspn strspn strdup strncasecmp strerror strrchr \
>    strstr strtol strtoul uname], , [AC_MSG_ERROR(bailing out)])

OK, since floor is not used anywhere else in LVM.

> diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
> index 07222a7..6ee7731 100644
> --- a/lib/metadata/metadata.c
> +++ b/lib/metadata/metadata.c
> @@ -1018,6 +1018,20 @@ static int _recalc_extents(uint32_t *extents, const char *desc1,
>  	return 1;
>  }
>  
> +/* return random integer in [0,max) interval */
> +static unsigned _even_rand( unsigned *seed, unsigned max )
> +{
> +	unsigned r, ret;
> +
> +	/* make sure distribution is even */
> +	do {
> +		r = (unsigned) rand_r( seed );
> +		ret = r % max;
> +	} while ( r - ret >= RAND_MAX - max );
> +
> +	return ret;
> +}
OK as per above.

>  static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bits,
>  					    uint32_t num_set_bits, unsigned *seed)
>  {
> @@ -1040,7 +1054,7 @@ static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bi
>  	/* Perform loop num_set_bits times, selecting one bit each time */
>  	while (i++ < num_bits) {
>  		/* Select a random bit between 0 and (i-1) inclusive. */
> -		bit_selected = (unsigned) floor(i * (rand_r(seed) / (RAND_MAX + 1.0)));
> +		bit_selected = _even_rand( seed, i );

OK. Gets a random integer from [0, i). The previous code was doing the
same thing (less efficiently and less precisely).

Yours,
   Petr.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH] Don't use floor() in _bitset_with_random_bits
  2010-08-06 19:24 ` Petr Rockai
@ 2010-08-07  1:51   ` Przemyslaw Iskra
  0 siblings, 0 replies; 3+ messages in thread
From: Przemyslaw Iskra @ 2010-08-07  1:51 UTC (permalink / raw)
  To: lvm-devel

On Fri, Aug 06, 2010 at 09:24:05PM +0200, Petr Rockai wrote:
> Hi,
> 
> Przemyslaw Iskra <sparky@pld-linux.org> writes:
> > Use _even_rand() function instead of floor() in
> > _bitset_with_random_bits(). floor() function is missing in dietlibc (on
> > architectures other than x86). Moreover using floor() to clip rand
> > results does not assure even result distribution.
> > _even_rand() uses integer arithmetic only and is designed to return evenly
> > distributed results.
> >
> > Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>
> 
> Reviewed-by: Petr Rockai <prockai@redhat.com>
> 
> Looks OK to me. It took a while to decipher what is the exact meaning of
> the loop in _even_rand (to a non-pseudorandomness-expert) but I am
> fairly comfortable with it now. If I understand this correctly, it
> rejects numbers that come from an "incomplete" slice of the RAND_MAX
> space (considering the number space [0, RAND_MAX] is divided into some
> "max"-sized slices and at most a single smaller slice, between [n*max,
> RAND_MAX] for suitable n -- numbers from this last slice are discarded
> because they could distort the distribution in favour of smaller
> numbers).

That's exactly what it does.

However, right now it cutts off the last slice even if it's complete.
You could replace:

> > +	} while ( r - ret >= RAND_MAX - max );

With:

+	} while ( r - ret > (unsigned) RAND_MAX + 1 - max );

Result distribution is correct in both cases. It just avoids the branch
in 0.00something %.


Przemyslaw Iskra

-- 
 ____  Sparky{PI] -- Przemyslaw _  ___  _  _  ......... LANG...Pl,Ca,Es,En
/____) ___  ___  _ _ || Iskra  |  | _ \| |  | : WWW...ppcrcd.pld-linux.org
\____\| -_)'___| ||^'||//\\// <   |  _/| |  | : WWW2..............rsget.pl
(____/||   (_-_|_||  ||\\ ||   |_ |_|  |_| _| : Mail..sparky@pld-linux.org



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-08-07  1:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-30 17:05 [PATCH] Don't use floor() in _bitset_with_random_bits Przemyslaw Iskra
2010-08-06 19:24 ` Petr Rockai
2010-08-07  1:51   ` Przemyslaw Iskra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.