public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Tvrtko Ursulin <tursulin@ursulin.net>,
	igt-dev@lists.freedesktop.org, Intel-gfx@lists.freedesktop.org,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t] tests/perf_pmu: Handle CPU hotplug failures better
Date: Fri, 23 Feb 2018 14:20:59 +0000	[thread overview]
Message-ID: <37e6341c-67c2-ac09-6f3d-806895344c2d@linux.intel.com> (raw)
In-Reply-To: <20180223115837.limfwpxjpyotnvvt@platvala-desk.ger.corp.intel.com>


On 23/02/2018 11:58, Petri Latvala wrote:
> On Fri, Feb 23, 2018 at 11:34:53AM +0000, Tvrtko Ursulin wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> CPU hotplug, especially CPU0, can be flaky on commodity hardware.
>>
>> To improve test reliability and reponse times when testing larger runs we
>> need to handle those cases better.
>>
>> Handle failures to off-line a CPU by immediately skipping the test, and
>> failures to on-line a CPU by immediately rebooting the machine.
>>
>> This patch includes igt_sysrq_reboot implementation from Chris Wilson.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   lib/Makefile.sources |  2 ++
>>   lib/igt_sysrq.c      | 22 ++++++++++++++++++++++
>>   lib/igt_sysrq.h      | 30 ++++++++++++++++++++++++++++++
>>   lib/meson.build      |  2 ++
>>   tests/perf_pmu.c     | 42 ++++++++++++++++++++++++++++++++++--------
>>   5 files changed, 90 insertions(+), 8 deletions(-)
>>   create mode 100644 lib/igt_sysrq.c
>>   create mode 100644 lib/igt_sysrq.h
>>
>> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
>> index 5b13ef8896c0..3d37ef1d1984 100644
>> --- a/lib/Makefile.sources
>> +++ b/lib/Makefile.sources
>> @@ -35,6 +35,8 @@ lib_source_list =	 	\
>>   	igt_stats.h		\
>>   	igt_sysfs.c		\
>>   	igt_sysfs.h		\
>> +	igt_sysrq.c		\
>> +	igt_sysrq.h		\
>>   	igt_x86.h		\
>>   	igt_x86.c		\
>>   	igt_vgem.c		\
>> diff --git a/lib/igt_sysrq.c b/lib/igt_sysrq.c
>> new file mode 100644
>> index 000000000000..fe3d2e344ff1
>> --- /dev/null
>> +++ b/lib/igt_sysrq.c
>> @@ -0,0 +1,22 @@
>> +#include <unistd.h>
>> +#include <fcntl.h>
>> +#include <stdlib.h>
>> +#include <sys/reboot.h>
>> +
>> +#include "igt_core.h"
>> +
>> +#include "igt_sysrq.h"
>> +
>> +void igt_sysrq_reboot(void)
>> +{
>> +	sync();
>> +
>> +	/* Try to be nice at first, and if that fails pull the trigger */
>> +	if (reboot(RB_AUTOBOOT)) {
>> +		int fd = open("/proc/sysrq-trigger", O_WRONLY);
>> +		igt_ignore_warn(write(fd, "b", 2));
>> +		close(fd);
>> +	}
>> +
>> +	abort();
>> +}
> 
> 
> While the cause for taking this action might be dire, rebooting
> people's machines can be kind of a dick move, even considering they're
> running tests that can be fatal to the machine in other ways.
> 
> We have IGT_HANG and IGT_HANG_WITHOUT_RESET so the users can opt
> in/out of some fatal behaviour already. I'm fine with auto-rebooting,
> even as the default, if users can opt out of it with
> IGT_NO_REBOOT_PRETTY_PLEASE or so.

I am fine with something like that. Just lets define how to call the env 
variable and what the default should be?

Do we have a return code from a test which stops the test runner?

I am thinking that the best approach would be not to reboot but to halt 
testing, unless this environment option is set.

But then it is up to CI people to say if they want to be setting this 
option across all systems, or would actually prefer to reboot by default.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2018-02-23 14:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-23 11:34 [PATCH i-g-t] tests/perf_pmu: Handle CPU hotplug failures better Tvrtko Ursulin
2018-02-23 11:58 ` [igt-dev] " Petri Latvala
2018-02-23 14:20   ` Tvrtko Ursulin [this message]
2018-02-26 10:03     ` Petri Latvala
2018-02-26 10:14       ` Tomi Sarvela
2018-02-28 10:05         ` [PATCH i-g-t v2] " Tvrtko Ursulin
2018-03-02 11:12           ` Petri Latvala
2018-03-02 11:28             ` [PATCH i-g-t v3] " Tvrtko Ursulin
2018-03-02 11:39               ` [PATCH i-g-t v4] " Tvrtko Ursulin
2018-03-02 11:42                 ` Chris Wilson
2018-03-02 11:55                   ` [PATCH i-g-t v5] " Tvrtko Ursulin
2018-03-02 11:32             ` [PATCH i-g-t v2] " Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37e6341c-67c2-ac09-6f3d-806895344c2d@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=Intel-gfx@lists.freedesktop.org \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=tursulin@ursulin.net \
    --cc=tvrtko.ursulin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox