[PATCH 2/2] selftests/x86/fsgsbase: Default to trying to run the test repeatedly

From: Ingo Molnar <mingo@kernel.org>
To: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@amacapital.net>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-kselftest@vger.kernel.org, Dan Rue <dan.rue@linaro.org>
Subject: Re: [PATCH 2/2] selftests/x86/fsgsbase: Default to trying to run the test repeatedly
Date: Mon, 11 Feb 2019 09:49:16 +0100
Message-ID: <20190211084916.GB62722@gmail.com>
In-Reply-To: <20190203134017.9375-3-broonie@kernel.org>


* Mark Brown <broonie@kernel.org> wrote:

> In automated testing it has been found that on many systems the fsgsbase
> test fails intermittently.  This was reported and discussed a while
> back:
> 
>     https://lore.kernel.org/lkml/20180126153631.ha7yc33fj5uhitjo@xps/
> 
> with the analysis concluding that this is a hardware issue affecting a
> subset of systems but no fix has been merged as yet.  As well as the
> actual problem found by testing the intermittent test failure is causing
> issues for the people doing the automated testing due to the noise.
> 
> In order to make the testing stable modify the test program to iterate
> through the test repeatedly, choosing 5000 iterations based on prior
> reports and local testing.  This unfortunately greatly increases the
> execution time for the selftests when things succeed which isn't great,
> in my local tests on a range of systems it pushes the execution time up
> to approximately a minute when no failures are encountered.
> 
> Reported-by: Dan Rue <dan.rue@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  tools/testing/selftests/x86/fsgsbase.c | 27 +++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
> index 6cda6daa1f8c..83410749ff1f 100644
> --- a/tools/testing/selftests/x86/fsgsbase.c
> +++ b/tools/testing/selftests/x86/fsgsbase.c
> @@ -379,7 +379,7 @@ static void test_unexpected_base(void)
>  	}
>  }
>  
> -int main()
> +int test()
>  {
>  	pthread_t thread;
>  
> @@ -437,3 +437,28 @@ int main()
>  
>  	return nerrs == 0 ? 0 : 1;
>  }
> +
> +int main()
> +{
> +	int tries = 5000;
> +	int i;
> +
> +	if (tries > 1)
> +		quiet = true;
> +
> +	for (i = 0; i < tries; i++) {
> +		if (test() != 0)
> +			break;
> +	}
> +
> +	if (quiet) {
> +		if (nerrs) {
> +			printf("[FAIL] %d errors detected in %d tries\n",
> +				nerrs, i + 1);
> +		} else {
> +			printf("[PASS] %d runs succeeded\n", i);
> +		}
> +	}
> +
> +	return nerrs == 0 ? 0 : 1;
> +}

So this isn't very user-friendly either: previously it would run a
testcase and immediately provide output.

Now it's just starting and 'hanging':

  galatea:~/linux/linux/tools/testing/selftests/x86> ./fsgsbase_64 

I got bored and Ctrl-C-ed it after ~30 seconds.

How long is this supposed to run, and why isn't the user informed?
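
As a rough illustration of what "informing the user" could look like,
here is a minimal sketch, not the patch's code. The names
run_with_progress(), placeholder_test() and the report_every parameter
are hypothetical; placeholder_test() merely stands in for one pass of
the real test() from the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical signature for one pass of the test; returns 0 on success. */
typedef int (*test_fn)(void);

/* Hypothetical stand-in for the patch's test(); always passes. */
static int placeholder_test(void)
{
	return 0;
}

/*
 * Run fn up to 'tries' times, stopping at the first failure.
 * Announces the iteration count up front and prints a progress line
 * every 'report_every' iterations (0 disables progress lines).
 * Returns 0 if every iteration passed, 1 otherwise.
 */
static int run_with_progress(test_fn fn, int tries, int report_every)
{
	int i;

	assert(fn != NULL && tries > 0);

	printf("[RUN]\tup to %d iterations, stopping at the first failure\n",
	       tries);
	fflush(stdout);

	for (i = 0; i < tries; i++) {
		if (fn() != 0) {
			printf("[FAIL]\titeration %d of %d failed\n",
			       i + 1, tries);
			return 1;
		}
		if (report_every > 0 && (i + 1) % report_every == 0) {
			printf("[INFO]\t%d/%d iterations passed\n",
			       i + 1, tries);
			fflush(stdout);
		}
	}

	printf("[PASS]\t%d iterations succeeded\n", tries);
	return 0;
}
```

Calling run_with_progress(placeholder_test, 5000, 500) would print the
planned iteration count immediately and a status line every 500
iterations, so the run no longer looks like a hang.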

Also, testcases should really be short, so I think a better approach
would be to thread the test-case and start an instance on every CPU. That
should also exercise SMP bugs, if any.
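
A minimal sketch of the per-CPU idea, under stated assumptions:
run_on_each_cpu(), struct worker and placeholder_test() are
hypothetical names, and placeholder_test() only stands in for one pass
of the real checks. The existing test keeps its state in globals such
as nerrs, so it would presumably need to be made thread-safe before it
could be plugged in like this:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical signature for one pass of the test; returns 0 on success. */
typedef int (*test_fn)(void);

/* Hypothetical stand-in for the real checks; always passes. */
static int placeholder_test(void)
{
	return 0;
}

struct worker {
	pthread_t thread;
	int cpu;	/* CPU this worker pins itself to */
	int result;	/* 0 on success, nonzero on failure */
	test_fn fn;
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(w->cpu, &set);

	/* Treat a failure to pin as a test failure rather than ignoring it. */
	if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0) {
		w->result = -1;
		return NULL;
	}

	w->result = w->fn();
	return NULL;
}

/*
 * Start one pinned instance of fn on every CPU this process is allowed
 * to run on, wait for all of them, and return the number of failures
 * (or -1 if the setup itself failed).
 */
static int run_on_each_cpu(test_fn fn)
{
	cpu_set_t allowed;
	struct worker *workers;
	int ncpus, cpu, i, n = 0, failures = 0;

	assert(fn != NULL);

	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	ncpus = CPU_COUNT(&allowed);
	workers = calloc(ncpus, sizeof(*workers));
	if (!workers)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE && n < ncpus; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;
		workers[n].cpu = cpu;
		workers[n].fn = fn;
		if (pthread_create(&workers[n].thread, NULL,
				   worker_main, &workers[n]) != 0) {
			failures++;
			break;
		}
		n++;
	}

	/* Join only the threads that were actually started. */
	for (i = 0; i < n; i++) {
		pthread_join(workers[i].thread, NULL);
		if (workers[i].result != 0)
			failures++;
	}

	free(workers);
	return failures;
}
```

The sketch uses the process's allowed CPU mask from sched_getaffinity()
rather than the number of CPUs in the machine, so it should still
behave sensibly when the test runs under a restricted cpuset.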

Thanks,

	Ingo
