From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Tvrtko Ursulin <tursulin@ursulin.net>,
igt-dev@lists.freedesktop.org, Intel-gfx@lists.freedesktop.org,
Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t] tests/perf_pmu: Handle CPU hotplug failures better
Date: Fri, 23 Feb 2018 14:20:59 +0000 [thread overview]
Message-ID: <37e6341c-67c2-ac09-6f3d-806895344c2d@linux.intel.com> (raw)
In-Reply-To: <20180223115837.limfwpxjpyotnvvt@platvala-desk.ger.corp.intel.com>
On 23/02/2018 11:58, Petri Latvala wrote:
> On Fri, Feb 23, 2018 at 11:34:53AM +0000, Tvrtko Ursulin wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> CPU hotplug, especially CPU0, can be flaky on commodity hardware.
>>
>> To improve test reliability and reponse times when testing larger runs we
>> need to handle those cases better.
>>
>> Handle failures to off-line a CPU by immediately skipping the test, and
>> failures to on-line a CPU by immediately rebooting the machine.
>>
>> This patch includes igt_sysrq_reboot implementation from Chris Wilson.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>> lib/Makefile.sources | 2 ++
>> lib/igt_sysrq.c | 22 ++++++++++++++++++++++
>> lib/igt_sysrq.h | 30 ++++++++++++++++++++++++++++++
>> lib/meson.build | 2 ++
>> tests/perf_pmu.c | 42 ++++++++++++++++++++++++++++++++++--------
>> 5 files changed, 90 insertions(+), 8 deletions(-)
>> create mode 100644 lib/igt_sysrq.c
>> create mode 100644 lib/igt_sysrq.h
>>
>> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
>> index 5b13ef8896c0..3d37ef1d1984 100644
>> --- a/lib/Makefile.sources
>> +++ b/lib/Makefile.sources
>> @@ -35,6 +35,8 @@ lib_source_list = \
>> igt_stats.h \
>> igt_sysfs.c \
>> igt_sysfs.h \
>> + igt_sysrq.c \
>> + igt_sysrq.h \
>> igt_x86.h \
>> igt_x86.c \
>> igt_vgem.c \
>> diff --git a/lib/igt_sysrq.c b/lib/igt_sysrq.c
>> new file mode 100644
>> index 000000000000..fe3d2e344ff1
>> --- /dev/null
>> +++ b/lib/igt_sysrq.c
>> @@ -0,0 +1,22 @@
>> +#include <unistd.h>
>> +#include <fcntl.h>
>> +#include <stdlib.h>
>> +#include <sys/reboot.h>
>> +
>> +#include "igt_core.h"
>> +
>> +#include "igt_sysrq.h"
>> +
>> +void igt_sysrq_reboot(void)
>> +{
>> + sync();
>> +
>> + /* Try to be nice at first, and if that fails pull the trigger */
>> + if (reboot(RB_AUTOBOOT)) {
>> + int fd = open("/proc/sysrq-trigger", O_WRONLY);
>> + igt_ignore_warn(write(fd, "b", 2));
>> + close(fd);
>> + }
>> +
>> + abort();
>> +}
>
>
> While the cause for taking this action might be dire, rebooting
> people's machines can be kind of a dick move, even considering they're
> running tests that can be fatal to the machine in other ways.
>
> We have IGT_HANG and IGT_HANG_WITHOUT_RESET so the users can opt
> in/out of some fatal behaviour already. I'm fine with auto-rebooting,
> even as the default, if users can opt out of it with
> IGT_NO_REBOOT_PRETTY_PLEASE or so.
I am fine with something like that. Just lets define how to call the env
variable and what the default should be?
Do we have a return code from a test which stops the test runner?
I am thinking that the best approach would be not to reboot but to halt
testing, unless this environment option is set.
But then it is up to CI people to say if they want to be setting this
option across all systems, or would actually prefer to reboot by default.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2018-02-23 14:20 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-23 11:34 [PATCH i-g-t] tests/perf_pmu: Handle CPU hotplug failures better Tvrtko Ursulin
2018-02-23 11:58 ` [igt-dev] " Petri Latvala
2018-02-23 14:20 ` Tvrtko Ursulin [this message]
2018-02-26 10:03 ` Petri Latvala
2018-02-26 10:14 ` Tomi Sarvela
2018-02-28 10:05 ` [PATCH i-g-t v2] " Tvrtko Ursulin
2018-03-02 11:12 ` Petri Latvala
2018-03-02 11:28 ` [PATCH i-g-t v3] " Tvrtko Ursulin
2018-03-02 11:39 ` [PATCH i-g-t v4] " Tvrtko Ursulin
2018-03-02 11:42 ` Chris Wilson
2018-03-02 11:55 ` [PATCH i-g-t v5] " Tvrtko Ursulin
2018-03-02 11:32 ` [PATCH i-g-t v2] " Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=37e6341c-67c2-ac09-6f3d-806895344c2d@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-gfx@lists.freedesktop.org \
--cc=igt-dev@lists.freedesktop.org \
--cc=tursulin@ursulin.net \
--cc=tvrtko.ursulin@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox