All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>
Subject: Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
Date: Fri, 08 Oct 2021 09:23:21 +0200	[thread overview]
Message-ID: <87tuhsez93.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <b73452fa-f5b0-0ab4-25e8-7494637c49f5@web.de>


On Fri, Oct 08 2021, René Scharfe wrote:

> Use MINSTD to generate pseudo-random numbers consistently instead of
> using rand(3), whose output can vary from system to system, and reset
> its seed before filling in the test values.  This gives repeatable
> results across versions and systems, which simplifies sharing and
> comparing of results between developers.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Change: Use uint32_t to avoid relying on unsigned int being exactly
> 4 bytes wide.  D'oh!
>
>  t/helper/test-mergesort.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
> index 29758cf89b..c6fa816be3 100644
> --- a/t/helper/test-mergesort.c
> +++ b/t/helper/test-mergesort.c
> @@ -2,6 +2,12 @@
>  #include "cache.h"
>  #include "mergesort.h"
>
> +static uint32_t minstd_rand(uint32_t *state)
> +{
> +	*state = (uint64_t)*state * 48271 % 2147483647;
> +	return *state;
> +}
> +
>  struct line {
>  	char *text;
>  	struct line *next;
> @@ -60,8 +66,9 @@ static void dist_sawtooth(int *arr, int n, int m)
>  static void dist_rand(int *arr, int n, int m)
>  {
>  	int i;
> +	uint32_t seed = 1;
>  	for (i = 0; i < n; i++)
> -		arr[i] = rand() % m;
> +		arr[i] = minstd_rand(&seed) % m;
>  }
>
>  static void dist_stagger(int *arr, int n, int m)
> @@ -81,8 +88,9 @@ static void dist_plateau(int *arr, int n, int m)
>  static void dist_shuffle(int *arr, int n, int m)
>  {
>  	int i, j, k;
> +	uint32_t seed = 1;
>  	for (i = j = 0, k = 1; i < n; i++)
> -		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
> +		arr[i] = minstd_rand(&seed) % m ? (j += 2) : (k += 2);
>  }
>
>  #define DIST(name) { #name, dist_##name }

Just to your upthread:

    "Right, so we'd need to ship our own random number generator."

I don't really think this matters in either case here, and if anything a
flaky failure in this test would quickly point us in the right
direction, as opposed to say having the N test_expect_success being run
in rand() order or whatever.

If we'd like results we can compare across platforms we're surely better
of here running this in a loop with different per-platform srand()
values N times for some high value of N, than we are in picking one
"golden" distribution.

But just on srand() and rand() use more generally in the test suite: I
think it's fine to just assume that we can call srand()/rand() and get
"predictable" results, because what we're really after in most cases is
to avoid hard-to-diagnose flakyness. If as a result of random
distribution we'll get a consistent failure on one OS (or the flakyness
is just OpenBSD...).

Also generally: If you'd like "portable" rand() for a test just shell
out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
Linux, OSX, Solaris & OpenBSD, all returned the same number for both:

    ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'

Whereas a C program doing the same:
    
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void)
    {
            srand(1);
            printf("rand = %d\n", rand());
            return 0;
    }

Returns different numbers an all, and on OpenBSD the number is different
each time, per their well-known non-standard srand()/rand() behavior.

  parent reply	other threads:[~2021-10-08  7:45 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
2021-10-02 16:56     ` René Scharfe
2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` René Scharfe
2021-10-03 17:33         ` Junio C Hamano
2021-10-07 20:00           ` René Scharfe
2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
2021-10-08  4:17               ` Jeff King
2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason [this message]
2021-10-08 17:30                 ` René Scharfe
2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
2022-01-17 18:22     ` René Scharfe
2022-01-18  5:07       ` René Scharfe
2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
2022-01-18 12:27           ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tuhsez93.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.