* [PATCH] Don't use floor() in _bitset_with_random_bits
@ 2010-07-30 17:05 Przemyslaw Iskra
2010-08-06 19:24 ` Petr Rockai
0 siblings, 1 reply; 3+ messages in thread
From: Przemyslaw Iskra @ 2010-07-30 17:05 UTC (permalink / raw)
To: lvm-devel
Use _even_rand() function instead of floor() in
_bitset_with_random_bits(). floor() function is missing in dietlibc (on
architectures other than x86). Moreover using floor() to clip rand
results does not assure even result distribution.
_even_rand() uses integer arithmetic only and is designed to return evenly
distributed results.
Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>
---
configure.in | 3 +--
lib/metadata/metadata.c | 16 +++++++++++++++-
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/configure.in b/configure.in
index bd56136..6f6c67e 100644
--- a/configure.in
+++ b/configure.in
@@ -125,8 +125,7 @@ AC_STRUCT_TM
################################################################################
dnl -- Check for functions
-AC_SEARCH_LIBS([floor], [m], , [AC_MSG_ERROR(bailing out)])
-AC_CHECK_FUNCS([floor ftruncate gethostname getpagesize \
+AC_CHECK_FUNCS([ftruncate gethostname getpagesize \
gettimeofday memset mkdir mkfifo rmdir munmap nl_langinfo setenv setlocale \
strcasecmp strchr strcspn strspn strdup strncasecmp strerror strrchr \
strstr strtol strtoul uname], , [AC_MSG_ERROR(bailing out)])
diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
index 07222a7..6ee7731 100644
--- a/lib/metadata/metadata.c
+++ b/lib/metadata/metadata.c
@@ -1018,6 +1018,20 @@ static int _recalc_extents(uint32_t *extents, const char *desc1,
return 1;
}
+/* return random integer in [0,max) interval */
+static unsigned _even_rand( unsigned *seed, unsigned max )
+{
+ unsigned r, ret;
+
+ /* make sure distribution is even */
+ do {
+ r = (unsigned) rand_r( seed );
+ ret = r % max;
+ } while ( r - ret >= RAND_MAX - max );
+
+ return ret;
+}
+
static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bits,
uint32_t num_set_bits, unsigned *seed)
{
@@ -1040,7 +1054,7 @@ static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bi
/* Perform loop num_set_bits times, selecting one bit each time */
while (i++ < num_bits) {
/* Select a random bit between 0 and (i-1) inclusive. */
- bit_selected = (unsigned) floor(i * (rand_r(seed) / (RAND_MAX + 1.0)));
+ bit_selected = _even_rand( seed, i );
/*
* If the bit was already set, set the new bit that became
--
1.7.1
--
____ Sparky{PI] -- Przemyslaw _ ___ _ _ ......... LANG...Pl,Ca,Es,En
/____) ___ ___ _ _ || Iskra | | _ \| | | : WWW...ppcrcd.pld-linux.org
\____\| -_)'___| ||^'||//\\// < | _/| | | : WWW2..............rsget.pl
(____/|| (_-_|_|| ||\\ || |_ |_| |_| _| : Mail..sparky@pld-linux.org
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH] Don't use floor() in _bitset_with_random_bits
2010-07-30 17:05 [PATCH] Don't use floor() in _bitset_with_random_bits Przemyslaw Iskra
@ 2010-08-06 19:24 ` Petr Rockai
2010-08-07 1:51 ` Przemyslaw Iskra
0 siblings, 1 reply; 3+ messages in thread
From: Petr Rockai @ 2010-08-06 19:24 UTC (permalink / raw)
To: lvm-devel
Hi,
Przemyslaw Iskra <sparky@pld-linux.org> writes:
> Use _even_rand() function instead of floor() in
> _bitset_with_random_bits(). floor() function is missing in dietlibc (on
> architectures other than x86). Moreover using floor() to clip rand
> results does not assure even result distribution.
> _even_rand() uses integer arithmetic only and is designed to return evenly
> distributed results.
>
> Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>
Reviewed-by: Petr Rockai <prockai@redhat.com>
Looks OK to me. It took a while to decipher what is the exact meaning of
the loop in _even_rand (to a non-pseudorandomness-expert) but I am
fairly comfortable with it now. If I understand this correctly, it
rejects numbers that come from an "incomplete" slice of the RAND_MAX
space (considering the number space [0, RAND_MAX] is divided into some
"max"-sized slices and at most a single smaller slice, between [n*max,
RAND_MAX] for suitable n -- numbers from this last slice are discarded
because they could distort the distribution in favour of smaller
numbers).
> diff --git a/configure.in b/configure.in
> index bd56136..6f6c67e 100644
> --- a/configure.in
> +++ b/configure.in
> @@ -125,8 +125,7 @@ AC_STRUCT_TM
>
> ################################################################################
> dnl -- Check for functions
> -AC_SEARCH_LIBS([floor], [m], , [AC_MSG_ERROR(bailing out)])
> -AC_CHECK_FUNCS([floor ftruncate gethostname getpagesize \
> +AC_CHECK_FUNCS([ftruncate gethostname getpagesize \
> gettimeofday memset mkdir mkfifo rmdir munmap nl_langinfo setenv setlocale \
> strcasecmp strchr strcspn strspn strdup strncasecmp strerror strrchr \
> strstr strtol strtoul uname], , [AC_MSG_ERROR(bailing out)])
OK, since floor is not used anywhere else in LVM.
> diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
> index 07222a7..6ee7731 100644
> --- a/lib/metadata/metadata.c
> +++ b/lib/metadata/metadata.c
> @@ -1018,6 +1018,20 @@ static int _recalc_extents(uint32_t *extents, const char *desc1,
> return 1;
> }
>
> +/* return random integer in [0,max) interval */
> +static unsigned _even_rand( unsigned *seed, unsigned max )
> +{
> + unsigned r, ret;
> +
> + /* make sure distribution is even */
> + do {
> + r = (unsigned) rand_r( seed );
> + ret = r % max;
> + } while ( r - ret >= RAND_MAX - max );
> +
> + return ret;
> +}
OK as per above.
> static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bits,
> uint32_t num_set_bits, unsigned *seed)
> {
> @@ -1040,7 +1054,7 @@ static dm_bitset_t _bitset_with_random_bits(struct dm_pool *mem, uint32_t num_bi
> /* Perform loop num_set_bits times, selecting one bit each time */
> while (i++ < num_bits) {
> /* Select a random bit between 0 and (i-1) inclusive. */
> - bit_selected = (unsigned) floor(i * (rand_r(seed) / (RAND_MAX + 1.0)));
> + bit_selected = _even_rand( seed, i );
OK. Gets a random integer from [0, i). The previous code was doing the
same thing (less efficiently and less precisely).
Yours,
Petr.
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH] Don't use floor() in _bitset_with_random_bits
2010-08-06 19:24 ` Petr Rockai
@ 2010-08-07 1:51 ` Przemyslaw Iskra
0 siblings, 0 replies; 3+ messages in thread
From: Przemyslaw Iskra @ 2010-08-07 1:51 UTC (permalink / raw)
To: lvm-devel
On Fri, Aug 06, 2010 at 09:24:05PM +0200, Petr Rockai wrote:
> Hi,
>
> Przemyslaw Iskra <sparky@pld-linux.org> writes:
> > Use _even_rand() function instead of floor() in
> > _bitset_with_random_bits(). floor() function is missing in dietlibc (on
> > architectures other than x86). Moreover using floor() to clip rand
> > results does not assure even result distribution.
> > _even_rand() uses integer arithmetic only and is designed to return evenly
> > distributed results.
> >
> > Signed-off-by: Przemyslaw Iskra <sparky@pld-linux.org>
>
> Reviewed-by: Petr Rockai <prockai@redhat.com>
>
> Looks OK to me. It took a while to decipher what is the exact meaning of
> the loop in _even_rand (to a non-pseudorandomness-expert) but I am
> fairly comfortable with it now. If I understand this correctly, it
> rejects numbers that come from an "incomplete" slice of the RAND_MAX
> space (considering the number space [0, RAND_MAX] is divided into some
> "max"-sized slices and at most a single smaller slice, between [n*max,
> RAND_MAX] for suitable n -- numbers from this last slice are discarded
> because they could distort the distribution in favour of smaller
> numbers).
That's exactly what it does.
However, right now it cutts off the last slice even if it's complete.
You could replace:
> > + } while ( r - ret >= RAND_MAX - max );
With:
+ } while ( r - ret > (unsigned) RAND_MAX + 1 - max );
Result distribution is correct in both cases. It just avoids the branch
in 0.00something %.
Przemyslaw Iskra
--
____ Sparky{PI] -- Przemyslaw _ ___ _ _ ......... LANG...Pl,Ca,Es,En
/____) ___ ___ _ _ || Iskra | | _ \| | | : WWW...ppcrcd.pld-linux.org
\____\| -_)'___| ||^'||//\\// < | _/| | | : WWW2..............rsget.pl
(____/|| (_-_|_|| ||\\ || |_ |_| |_| _| : Mail..sparky@pld-linux.org
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2010-08-07 1:51 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-30 17:05 [PATCH] Don't use floor() in _bitset_with_random_bits Przemyslaw Iskra
2010-08-06 19:24 ` Petr Rockai
2010-08-07 1:51 ` Przemyslaw Iskra
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.