* [igt-dev] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs
@ 2018-02-19 13:40 Tvrtko Ursulin
2018-02-19 13:54 ` Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Tvrtko Ursulin @ 2018-02-19 13:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bail out from the cpu-hotplug test if we failed to bring a CPU back
online.
This still leaves the machine in a quite bad state, but at least it
avoids hard hanging it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 7fab73e22c2d..a334d3b5770e 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -965,6 +965,7 @@ static void cpu_hotplug(int gem_fd)
int link[2];
int fd, ret;
int cur = 0;
+ char buf;
igt_require(cpu0_hotplug_support());
@@ -1012,7 +1013,15 @@ static void cpu_hotplug(int gem_fd)
/* Offline followed by online a CPU. */
igt_assert_eq(write(cpufd, "0", 2), 2);
usleep(1e6);
- igt_assert_eq(write(cpufd, "1", 2), 2);
+ ret = write(cpufd, "1", 2);
+ if (ret < 0) {
+ /*
+ * Abort the test if we failed to bring a CPU
+ * back online.
+ */
+ igt_assert_eq(write(link[1], "s", 1), 1);
+ break;
+ }
close(cpufd);
cpu++;
@@ -1026,7 +1035,6 @@ static void cpu_hotplug(int gem_fd)
* until the CPU core shuffler finishes one loop.
*/
for (;;) {
- char buf;
int ret2;
usleep(500e3);
@@ -1053,6 +1061,9 @@ static void cpu_hotplug(int gem_fd)
close(fd);
close(link[0]);
+ /* Skip if child signals a problem with bringing a CPU back online. */
+ igt_skip_on(buf == 's');
+
assert_within_epsilon(val, ts[1] - ts[0], tolerance);
}
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
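Editorial sketch of the signalling pattern the patch relies on: the forked child writes a single status byte into link[1], and the parent reads it from link[0] and decides whether to skip. This is a minimal standalone illustration, not the perf_pmu.c code. The helper name run_child_and_read_status, the simulate_online_failure flag and the 'd' ("done") byte are assumptions made for the example; only the 's' byte and the link[] pipe come from the patch.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that stands in for the CPU shuffler.
 * On a simulated online failure the child sends 's' (skip) over the pipe,
 * otherwise it sends 'd' (done, an assumed value). Returns the byte the
 * parent received, or -1 on error.
 */
static int run_child_and_read_status(bool simulate_online_failure)
{
	int link[2];
	char buf = 0;
	pid_t pid;

	if (pipe(link) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(link[0]);
		close(link[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: report the outcome as one byte and exit. */
		char msg = simulate_online_failure ? 's' : 'd';

		close(link[0]);
		_exit(write(link[1], &msg, 1) == 1 ? 0 : 1);
	}

	/* Parent: read the status byte, then reap the child. */
	close(link[1]);
	if (read(link[0], &buf, 1) != 1)
		buf = 0;
	close(link[0]);
	waitpid(pid, NULL, 0);

	return buf ? buf : -1;
}
```

In the patch the parent-side decision is the igt_skip_on(buf == 's') call; in this sketch the caller would compare the return value against 's' instead.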
* Re: [igt-dev] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 13:40 [igt-dev] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs Tvrtko Ursulin
@ 2018-02-19 13:54 ` Chris Wilson
2018-02-19 14:08 ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
2018-02-19 14:01 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
2018-02-19 15:55 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2018-02-19 13:54 UTC (permalink / raw)
To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-19 13:40:37)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Bail out from the cpu-hotplug test if we failed to bring a CPU back
> online.
>
> This still leaves the machine in a quite bad state, but at least it
> avoids hard hanging it.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> tests/perf_pmu.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 7fab73e22c2d..a334d3b5770e 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -965,6 +965,7 @@ static void cpu_hotplug(int gem_fd)
> int link[2];
> int fd, ret;
> int cur = 0;
> + char buf;
>
> igt_require(cpu0_hotplug_support());
>
> @@ -1012,7 +1013,15 @@ static void cpu_hotplug(int gem_fd)
> /* Offline followed by online a CPU. */
> igt_assert_eq(write(cpufd, "0", 2), 2);
> usleep(1e6);
> - igt_assert_eq(write(cpufd, "1", 2), 2);
> + ret = write(cpufd, "1", 2);
> + if (ret < 0) {
> + /*
> + * Abort the test if we failed to bring a CPU
> + * back online.
> + */
> + igt_assert_eq(write(link[1], "s", 1), 1);
> + break;
> + }
>
> close(cpufd);
> cpu++;
> @@ -1026,7 +1035,6 @@ static void cpu_hotplug(int gem_fd)
> * until the CPU core shuffler finishes one loop.
> */
> for (;;) {
> - char buf;
> int ret2;
>
> usleep(500e3);
> @@ -1053,6 +1061,9 @@ static void cpu_hotplug(int gem_fd)
> close(fd);
> close(link[0]);
>
> + /* Skip if child signals a problem with bringing a CPU back online. */
> + igt_skip_on(buf == 's');
The logic makes sense.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
What happens if we try to online a CPU that is already online? Will that
report a failure, or just tell us to stop wasting its time?
-Chris
* [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 13:40 [igt-dev] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs Tvrtko Ursulin
2018-02-19 13:54 ` Chris Wilson
@ 2018-02-19 14:01 ` Patchwork
2018-02-19 15:55 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2018-02-19 14:01 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: tests/perf_pmu: Avoid off-lining all CPUs
URL : https://patchwork.freedesktop.org/series/38513/
State : success
== Summary ==
IGT patchset tested on top of latest successful build
bf777c92448d51010aba51d1f1b657b0fdc673a6 tests/gem_ctx_param: Update invalid param
with latest DRM-Tip kernel build CI_DRM_3797
282bcdb38286 drm-tip: 2018y-02m-19d-12h-01m-54s UTC integration manifest
No testlist changes.
Test gem_ctx_switch:
Subgroup basic-default-heavy:
pass -> INCOMPLETE (fi-cnl-y3) fdo#105086
Test kms_chamelium:
Subgroup dp-edid-read:
fail -> PASS (fi-kbl-7500u) fdo#102505
Subgroup hdmi-hpd-fast:
skip -> FAIL (fi-kbl-7500u) fdo#102672
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass -> INCOMPLETE (fi-snb-2520m) fdo#103713
fdo#105086 https://bugs.freedesktop.org/show_bug.cgi?id=105086
fdo#102505 https://bugs.freedesktop.org/show_bug.cgi?id=102505
fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:425s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:431s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:375s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:491s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:288s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:484s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:488s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:481s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:464s
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:566s
fi-cnl-y3 total:22 pass:21 dwarn:0 dfail:0 fail:0 skip:0
fi-elk-e7500 total:288 pass:229 dwarn:0 dfail:0 fail:0 skip:59 time:418s
fi-gdg-551 total:288 pass:179 dwarn:0 dfail:0 fail:1 skip:108 time:286s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:511s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:392s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:413s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:462s
fi-ivb-3770 total:288 pass:255 dwarn:0 dfail:0 fail:0 skip:33 time:412s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:1 skip:23 time:458s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:492s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:454s
fi-kbl-r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:501s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:593s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:434s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:512s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:525s
fi-skl-6700k2 total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:496s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:489s
fi-skl-guc total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:412s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:430s
fi-snb-2520m total:245 pass:211 dwarn:0 dfail:0 fail:0 skip:33
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:399s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_951/issues.html
* Re: [igt-dev] [Intel-gfx] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 13:54 ` Chris Wilson
@ 2018-02-19 14:08 ` Tvrtko Ursulin
0 siblings, 0 replies; 8+ messages in thread
From: Tvrtko Ursulin @ 2018-02-19 14:08 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx
On 19/02/2018 13:54, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-19 13:40:37)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Bail out from the cpu-hotplug test if we failed to bring a CPU back
>> online.
>>
>> This still leaves the machine in a quite bad state, but at least it
>> avoids hard hanging it.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> tests/perf_pmu.c | 15 +++++++++++++--
>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>> index 7fab73e22c2d..a334d3b5770e 100644
>> --- a/tests/perf_pmu.c
>> +++ b/tests/perf_pmu.c
>> @@ -965,6 +965,7 @@ static void cpu_hotplug(int gem_fd)
>> int link[2];
>> int fd, ret;
>> int cur = 0;
>> + char buf;
>>
>> igt_require(cpu0_hotplug_support());
>>
>> @@ -1012,7 +1013,15 @@ static void cpu_hotplug(int gem_fd)
>> /* Offline followed by online a CPU. */
>> igt_assert_eq(write(cpufd, "0", 2), 2);
>> usleep(1e6);
>> - igt_assert_eq(write(cpufd, "1", 2), 2);
>> + ret = write(cpufd, "1", 2);
>> + if (ret < 0) {
>> + /*
>> + * Abort the test if we failed to bring a CPU
>> + * back online.
>> + */
>> + igt_assert_eq(write(link[1], "s", 1), 1);
>> + break;
>> + }
>>
>> close(cpufd);
>> cpu++;
>> @@ -1026,7 +1035,6 @@ static void cpu_hotplug(int gem_fd)
>> * until the CPU core shuffler finishes one loop.
>> */
>> for (;;) {
>> - char buf;
>> int ret2;
>>
>> usleep(500e3);
>> @@ -1053,6 +1061,9 @@ static void cpu_hotplug(int gem_fd)
>> close(fd);
>> close(link[0]);
>>
>> + /* Skip if child signals a problem with bringing a CPU back online. */
>> + igt_skip_on(buf == 's');
>
> The logic makes sense.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
I wasn't so sure after I sent it. There was already an assert on the
write error, so that would have aborted the child. We will see what the
shards say.
> What happens if we try to online an already on cpu? Will that report the
> failure or just tell us to stop wasting its time?
Seems the attempt is simply ignored.
Regards,
Tvrtko
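Editorial sketch for anyone who wants to probe the question above by hand: a small helper that writes a value to a control file and returns the error instead of asserting, so the caller can tell a rejected write from an accepted one. The helper name write_control_file is hypothetical. Pointing it at a path such as /sys/devices/system/cpu/cpu1/online requires root, and whether the kernel accepts or rejects a redundant "1" is whatever the running kernel does; the code makes no assumption about it.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 'value' to the file at 'path'.
 * Returns 0 on success, a negative errno if open() or write() fails,
 * or -EIO on a short write.
 */
static int write_control_file(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t ret;
	int fd, err = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	ret = write(fd, value, len);
	if (ret < 0)
		err = -errno;
	else if ((size_t)ret != len)
		err = -EIO;

	close(fd);
	return err;
}
```

Calling write_control_file(path, "1") twice in a row on a CPU's online file and comparing the two return values would show how a given kernel treats the second attempt.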
* [igt-dev] ✗ Fi.CI.IGT: failure for tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 13:40 [igt-dev] [PATCH i-g-t] tests/perf_pmu: Avoid off-lining all CPUs Tvrtko Ursulin
2018-02-19 13:54 ` Chris Wilson
2018-02-19 14:01 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
@ 2018-02-19 15:55 ` Patchwork
2018-02-19 16:07 ` Chris Wilson
2 siblings, 1 reply; 8+ messages in thread
From: Patchwork @ 2018-02-19 15:55 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: tests/perf_pmu: Avoid off-lining all CPUs
URL : https://patchwork.freedesktop.org/series/38513/
State : failure
== Summary ==
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
skip -> PASS (shard-snb) fdo#103375 +1
Test perf:
Subgroup polling:
pass -> FAIL (shard-hsw) fdo#102252
Subgroup oa-exponents:
fail -> PASS (shard-apl) fdo#102254
Test gem_eio:
Subgroup in-flight-suspend:
fail -> PASS (shard-hsw) fdo#104676
Test kms_cursor_crc:
Subgroup cursor-64x64-suspend:
incomplete -> PASS (shard-hsw) fdo#103540
Test perf_pmu:
Subgroup cpu-hotplug:
incomplete -> SKIP (shard-apl) fdo#104965
Subgroup other-read-1:
pass -> FAIL (shard-apl)
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-pri-indfb-multidraw:
pass -> FAIL (shard-snb) fdo#103167
Subgroup fbc-1p-offscren-pri-shrfb-draw-mmap-cpu:
pass -> FAIL (shard-apl) fdo#101623
Test gem_exec_capture:
Subgroup capture-vebox:
pass -> INCOMPLETE (shard-apl)
fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#102254 https://bugs.freedesktop.org/show_bug.cgi?id=102254
fdo#104676 https://bugs.freedesktop.org/show_bug.cgi?id=104676
fdo#103540 https://bugs.freedesktop.org/show_bug.cgi?id=103540
fdo#104965 https://bugs.freedesktop.org/show_bug.cgi?id=104965
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
shard-apl total:3429 pass:1793 dwarn:1 dfail:0 fail:14 skip:1620 time:12125s
shard-hsw total:3434 pass:1761 dwarn:1 dfail:0 fail:3 skip:1668 time:11649s
shard-snb total:3434 pass:1350 dwarn:1 dfail:0 fail:3 skip:2080 time:6549s
Blacklisted hosts:
shard-kbl total:3413 pass:1911 dwarn:1 dfail:0 fail:12 skip:1488 time:9581s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_951/shards.html
* Re: [igt-dev] ✗ Fi.CI.IGT: failure for tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 15:55 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-02-19 16:07 ` Chris Wilson
2018-02-19 17:53 ` Tvrtko Ursulin
0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2018-02-19 16:07 UTC (permalink / raw)
To: igt-dev, Patchwork, Tvrtko Ursulin
Quoting Patchwork (2018-02-19 15:55:19)
> Test perf_pmu:
> Subgroup cpu-hotplug:
> incomplete -> SKIP (shard-apl) fdo#104965
> Subgroup other-read-1:
> pass -> FAIL (shard-apl)
I guess that's the expected fallout from deciding to skip after trying
to offline a cpu.
I also guess that makes it a no-go (without some tweaking) if we cause
random fallout in subsequent tests, i.e. unpredictable flip-flops.
-Chris
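Editorial sketch: one way a harness could notice that a previous test left CPUs offline is to compare /sys/devices/system/cpu/online against /sys/devices/system/cpu/present. Both files hold a kernel cpulist string such as "0-3,5". The parser below counts the CPUs in such a string; the function name count_cpus_in_list is hypothetical and this is an illustration only, not something IGT is known to provide.

```c
#include <stdlib.h>

/*
 * Count the CPUs named in a kernel cpulist string such as "0-3,5\n".
 * Returns the number of CPUs, or -1 if the string is malformed.
 */
static int count_cpus_in_list(const char *list)
{
	const char *p = list;
	int count = 0;

	while (*p && *p != '\n') {
		char *end;
		long first, last;

		/* Parse a single CPU number or the start of a range. */
		first = strtol(p, &end, 10);
		if (end == p || first < 0)
			return -1;
		last = first;
		p = end;

		/* Optional "-last" part of a range. */
		if (*p == '-') {
			p++;
			last = strtol(p, &end, 10);
			if (end == p || last < first)
				return -1;
			p = end;
		}

		count += (int)(last - first + 1);

		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;
	}

	return count;
}
```

If the count for "online" is lower than the count for "present" before a test starts, an earlier test probably failed to bring a CPU back, and results on that machine should be treated with suspicion.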
* Re: [igt-dev] ✗ Fi.CI.IGT: failure for tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 16:07 ` Chris Wilson
@ 2018-02-19 17:53 ` Tvrtko Ursulin
2018-02-20 9:41 ` Chris Wilson
0 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2018-02-19 17:53 UTC (permalink / raw)
To: Chris Wilson, igt-dev, Patchwork, Tvrtko Ursulin
On 19/02/2018 16:07, Chris Wilson wrote:
> Quoting Patchwork (2018-02-19 15:55:19)
>> Test perf_pmu:
>> Subgroup cpu-hotplug:
>> incomplete -> SKIP (shard-apl) fdo#104965
>> Subgroup other-read-1:
>> pass -> FAIL (shard-apl)
>
> I guess that's the expected fallout from deciding to skip after trying
> to offline a cpu.
Kind of expected because I am not quite sure why it behaves differently
from the existing code which has igt_assert on the write. Maybe I am
particularly blind today.
> I also guess that makes it a no-go (without some tweaking) if we cause
> random fallout in subsequent tests, i.e. unpredictable flip-flops.
Yep agreed 100%.
Sounds like a case for igt_fail2 = please reboot me, or igt_wedge_machine.
A lighter-weight option, with weaker coverage, might be to add an
i915.debug_pmu_cpu=N modparam (present only when I915_DEBUG is set) and
to live with not testing the cpu0 offline paths?
Regards,
Tvrtko
* Re: [igt-dev] ✗ Fi.CI.IGT: failure for tests/perf_pmu: Avoid off-lining all CPUs
2018-02-19 17:53 ` Tvrtko Ursulin
@ 2018-02-20 9:41 ` Chris Wilson
0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2018-02-20 9:41 UTC (permalink / raw)
To: Tvrtko Ursulin, igt-dev, Patchwork, Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-19 17:53:06)
>
> On 19/02/2018 16:07, Chris Wilson wrote:
> > Quoting Patchwork (2018-02-19 15:55:19)
> >> Test perf_pmu:
> >> Subgroup cpu-hotplug:
> >> incomplete -> SKIP (shard-apl) fdo#104965
> >> Subgroup other-read-1:
> >> pass -> FAIL (shard-apl)
> >
> > I guess that's the expected fallout from deciding to skip after trying
> > to offline a cpu.
>
> Kind of expected because I am not quite sure why it behaves differently
> from the existing code which has igt_assert on the write. Maybe I am
> particularly blind today.
>
> > I also guess that makes it a no-go (without some tweaking) if we cause
> > random fallout in subsequent tests, i.e. unpredictable flip-flops.
>
> Yep agreed 100%.
>
> Sounds like a case for igt_fail2 = please reboot me, or igt_wedge_machine.
Indeed, igt_sysrq_reboot() sounds like a useful intermediate patch.
-Chris
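Editorial sketch of what a sysrq-based reboot helper could build on. igt_sysrq_reboot() is only a proposal in this thread, so nothing here is its implementation. The kernel's /proc/sysrq-trigger file accepts single command characters ('s' syncs filesystems, 'b' reboots immediately). The helper name sysrq_write is hypothetical, and the trigger path is a parameter so the function can be exercised against an ordinary file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one sysrq command character to the trigger
 * file at 'trigger_path'. Returns 0 on success or a negative errno.
 */
static int sysrq_write(const char *trigger_path, char cmd)
{
	ssize_t ret;
	int fd, err = 0;

	fd = open(trigger_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	ret = write(fd, &cmd, 1);
	if (ret < 0)
		err = -errno;
	else if (ret != 1)
		err = -EIO;

	close(fd);
	return err;
}
```

A reboot path would presumably call sysrq_write("/proc/sysrq-trigger", 's') followed by sysrq_write("/proc/sysrq-trigger", 'b'). That needs root, and the second call does not return on success.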