Linux Kernel Selftest development
From: mingo at kernel.org (Ingo Molnar)
Subject: [PATCH 2/2] selftests/x86/fsgsbase: Default to trying to run the test repeatedly
Date: Mon, 11 Feb 2019 09:49:16 +0100
Message-ID: <20190211084916.GB62722@gmail.com>
In-Reply-To: <20190203134017.9375-3-broonie@kernel.org>


* Mark Brown <broonie at kernel.org> wrote:

> In automated testing it has been found that on many systems the fsgsbase
> test fails intermittently.  This was reported and discussed a while
> back:
> 
>     https://lore.kernel.org/lkml/20180126153631.ha7yc33fj5uhitjo@xps/
> 
> with the analysis concluding that this is a hardware issue affecting a
> subset of systems but no fix has been merged as yet.  As well as the
> actual problem found by testing the intermittent test failure is causing
> issues for the people doing the automated testing due to the noise.
> 
> In order to make the testing stable modify the test program to iterate
> through the test repeatedly, choosing 5000 iterations based on prior
> reports and local testing.  This unfortunately greatly increases the
> execution time for the selftests when things succeed which isn't great,
> in my local tests on a range of systems it pushes the execution time up
> to approximately a minute when no failures are encountered.
> 
> Reported-by: Dan Rue <dan.rue at linaro.org>
> Signed-off-by: Mark Brown <broonie at kernel.org>
> ---
>  tools/testing/selftests/x86/fsgsbase.c | 27 +++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
> index 6cda6daa1f8c..83410749ff1f 100644
> --- a/tools/testing/selftests/x86/fsgsbase.c
> +++ b/tools/testing/selftests/x86/fsgsbase.c
> @@ -379,7 +379,7 @@ static void test_unexpected_base(void)
>  	}
>  }
>  
> -int main()
> +int test()
>  {
>  	pthread_t thread;
>  
> @@ -437,3 +437,28 @@ int main()
>  
>  	return nerrs == 0 ? 0 : 1;
>  }
> +
> +int main()
> +{
> +	int tries = 5000;
> +	int i;
> +
> +	if (tries > 1)
> +		quiet = true;
> +
> +	for (i = 0; i < tries; i++) {
> +		if (test() != 0)
> +			break;
> +	}
> +
> +	if (quiet) {
> +		if (nerrs) {
> +			printf("[FAIL] %d errors detected in %d tries\n",
> +				nerrs, i + 1);
> +		} else {
> +			printf("[PASS] %d runs succeeded\n", i);
> +		}
> +	}
> +
> +	return nerrs == 0 ? 0 : 1;
> +}

So this isn't very user-friendly either: previously it would run a 
testcase and immediately provide output.

Now it's just starting and 'hanging':

  galatea:~/linux/linux/tools/testing/selftests/x86> ./fsgsbase_64 

I got bored and Ctrl-C-ed it after ~30 seconds.

How long is this supposed to run, and why isn't the user informed?
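
To illustrate what informing the user could look like, here is a rough
sketch only. run_with_progress(), fake_test() and the report interval are
made-up names for illustration and are not part of the patch; fake_test()
merely stands in for the patch's test(). The idea is to announce the
iteration count up front and print a progress line every so often:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the patch's test(); it always passes here. */
static int fake_test(void)
{
	return 0;
}

/*
 * Run test_fn up to 'tries' times. Announce the plan before starting and
 * print a progress line every 'report_every' iterations, so the user can
 * tell the program is working rather than hung.
 *
 * Returns the number of iterations that completed successfully.
 */
static int run_with_progress(int tries, int report_every, int (*test_fn)(void))
{
	int i;

	assert(test_fn != NULL);

	printf("[RUN]\tup to %d iterations, this may take a while\n", tries);
	fflush(stdout);

	for (i = 0; i < tries; i++) {
		if (test_fn() != 0) {
			printf("[FAIL]\titeration %d failed\n", i + 1);
			return i;
		}
		if (report_every > 0 && (i + 1) % report_every == 0) {
			printf("[INFO]\t%d/%d iterations done\n", i + 1, tries);
			fflush(stdout);
		}
	}

	printf("[PASS]\t%d runs succeeded\n", i);
	return i;
}
```

A caller would do something like run_with_progress(5000, 500, test) and
derive the exit code from whether the return value equals the requested
count.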

Also, testcases should really be short, so I think a better approach 
would be to thread the test-case and start an instance on every CPU. That 
should also exercise SMP bugs, if any.
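
A minimal sketch of the per-CPU idea, assuming glibc on Linux. The names
run_on_all_cpus(), worker_fn(), struct worker and one_iteration() are
hypothetical, and one_iteration() is only a placeholder for the real test
body. It starts one thread for each CPU the process is allowed to run on,
pins each thread with pthread_setaffinity_np(), and sums the errors:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

struct worker {
	pthread_t thread;
	int cpu;		/* CPU this worker is pinned to */
	int iterations;		/* how many times to run the test body */
	int errors;		/* written by the worker, read after join */
};

/* Hypothetical placeholder for the real per-iteration test body. */
static int one_iteration(void)
{
	return 0;
}

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	cpu_set_t set;
	int i;

	/* Pin this thread to its assigned CPU before running anything. */
	CPU_ZERO(&set);
	CPU_SET(w->cpu, &set);
	if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0) {
		w->errors++;
		return NULL;
	}

	for (i = 0; i < w->iterations; i++) {
		if (one_iteration() != 0)
			w->errors++;
	}

	return NULL;
}

/*
 * Start one pinned worker per CPU in the process's affinity mask.
 * Returns the total error count, or -1 if setup failed.
 */
static int run_on_all_cpus(int iterations_per_cpu)
{
	cpu_set_t allowed;
	struct worker *workers;
	int cpu, i, n = 0, errors = 0;

	assert(iterations_per_cpu > 0);

	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	workers = calloc(CPU_SETSIZE, sizeof(*workers));
	if (!workers)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;
		workers[n].cpu = cpu;
		workers[n].iterations = iterations_per_cpu;
		if (pthread_create(&workers[n].thread, NULL,
				   worker_fn, &workers[n]) != 0) {
			errors++;	/* count a failed start as an error */
			continue;
		}
		n++;
	}

	for (i = 0; i < n; i++) {
		pthread_join(workers[i].thread, NULL);
		errors += workers[i].errors;
	}

	free(workers);
	return errors;
}
```

With the work spread across CPUs, the per-CPU iteration count can be much
lower than 5000 for the same total number of runs, which keeps the
wall-clock time short. Whether the existing test body is safe to run
concurrently is a separate question this sketch does not address.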

Thanks,

	Ingo
