public inbox for intel-gfx@lists.freedesktop.org
* [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
@ 2017-10-19  9:51 Daniel Vetter
  2017-10-19  9:57 ` Daniel Vetter
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Daniel Vetter @ 2017-10-19  9:51 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

CI gets upset about it resulting in an incomplete, let's skip it until
that's fixed to avoid havoc in the CI farm. Of course this should/will
be reverted as soon as we have a fix (similar to how we dealt with the
snb-dies-in-blt-hangs issue).

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 tests/gem_eio.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/gem_eio.c b/tests/gem_eio.c
index 899cb62728e3..28375e208232 100644
--- a/tests/gem_eio.c
+++ b/tests/gem_eio.c
@@ -218,6 +218,9 @@ static void test_inflight_suspend(int fd)
 	igt_require(gem_has_exec_fence(fd));
 	igt_require(i915_reset_control(false));
 
+	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
+		      "random incompletes in CI with this test\n");
+
 	memset(obj, 0, sizeof(obj));
 	obj[0].flags = EXEC_OBJECT_WRITE;
 	obj[1].handle = gem_create(fd, 4096);
-- 
2.15.0.rc1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19  9:51 [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb Daniel Vetter
@ 2017-10-19  9:57 ` Daniel Vetter
  2017-10-19 11:37   ` Lofstedt, Marta
  2017-10-19 10:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2017-10-19  9:57 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Thu, Oct 19, 2017 at 11:51:51AM +0200, Daniel Vetter wrote:
> CI gets upset about it resulting in an incomplete, let's skip it until
> that's fixed to avoid havoc in the CI farm. Of course this should/will
> be reverted as soon as we have a fix (similar to how we dealt with the
> snb-dies-in-blt-hangs issue).
> 
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> Cc: Martin Peres <martin.peres@linux.intel.com>
> References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

For more context, since I forgot to add: I'm definitely not advertising
for abusing igt_skip to handle problematic testcases in general. What
makes this special here is the combo of
- new testcase
- old machine where we don't have priority to fix things

Hence why I think it'll make sense to treat this as a feature-like thing,
where we simply skip if stuff doesn't work/isn't exposed on older
platforms and shrug it off. And once someone does a free time project to
fix it up, we can then remove the skip.

I hope that explains a bit of my reasoning behind using skip here.

The other bit is that if/once Maarten figures out what's wrong with
watermarks, we should be able to enable shard-snb reporting in CI results,
which would be really great. Except we really can't have tests that
incomplete, because they victimize too much else and so we would need to
blacklist them until fixed one way or the other anyways.
-Daniel

> ---
>  tests/gem_eio.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 899cb62728e3..28375e208232 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -218,6 +218,9 @@ static void test_inflight_suspend(int fd)
>  	igt_require(gem_has_exec_fence(fd));
>  	igt_require(i915_reset_control(false));
>  
> +	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
> +		      "random incompletes in CI with this test\n");
> +
>  	memset(obj, 0, sizeof(obj));
>  	obj[0].flags = EXEC_OBJECT_WRITE;
>  	obj[1].handle = gem_create(fd, 4096);
> -- 
> 2.15.0.rc1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* ✗ Fi.CI.BAT: failure for tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19  9:51 [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb Daniel Vetter
  2017-10-19  9:57 ` Daniel Vetter
@ 2017-10-19 10:12 ` Patchwork
  2017-10-19 10:28   ` Chris Wilson
  2017-10-19 12:01 ` ✗ Fi.CI.BAT: warning " Patchwork
  2017-10-19 13:29 ` [PATCH i-g-t] " Martin Peres
  3 siblings, 1 reply; 12+ messages in thread
From: Patchwork @ 2017-10-19 10:12 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: tests/gem_eio: Skip in-flight-suspend on snb
URL   : https://patchwork.freedesktop.org/series/32280/
State : failure

== Summary ==

IGT patchset tested on top of latest successful build
abc08cba366a64a07f7f4deb167ae7d6ae059958 lib: Free all internal buffers before measuring available memory

with latest DRM-Tip kernel build CI_DRM_3261
399dd92e2b42 drm-tip: 2017y-10m-18d-21h-54m-46s UTC integration manifest

No testlist changes.

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-a-frame-sequence:
                pass       -> INCOMPLETE (fi-skl-6700hq)

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:441s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:455s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:376s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:529s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:264s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:493s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:496s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:499s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:480s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:564s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:413s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:248s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:586s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:455s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:431s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:433s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:486s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:463s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:488s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:572s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:478s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:584s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:552s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:450s
fi-skl-6700hq    total:240  pass:215  dwarn:0   dfail:0   fail:0   skip:24 
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:524s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:510s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:456s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:571s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:430s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_384/

* Re: ✗ Fi.CI.BAT: failure for tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19 10:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2017-10-19 10:28   ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-10-19 10:28 UTC (permalink / raw)
  To: Patchwork, Daniel Vetter; +Cc: intel-gfx

Ah, domains.

It is a real bug, but the quickest way to paper over it is to
s/TEST_DEVICES/TEST_NONE/ (that way we still have the test in arguably a
more realistic mode). We have other reset tests that also trigger
the death. The bug tells us reset on snb is unreliable (writing to GDRST
causes a machine hang a short time later before the reset is complete).
Exactly what engine setup prior to the GPU hang triggers the death and
how to work around the death are currently eluding me.
-Chris
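
For readers unfamiliar with the two modes: a minimal sketch of the
difference, assuming the test level ends up as a string written to the
kernel's /sys/power/pm_test file ("devices" stops after suspending devices
and resumes, "none" runs the full suspend cycle). The enum and function
names below are hypothetical stand-ins, not the IGT library's own
identifiers.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the suspend test levels; the names are
 * illustrative and are not copied from the IGT library. */
enum pm_test_level {
	PM_TEST_NONE,     /* full suspend/resume cycle */
	PM_TEST_DEVICES,  /* stop after suspending devices, then resume */
};

/* String the kernel's /sys/power/pm_test file accepts for each level. */
const char *pm_test_level_name(enum pm_test_level level)
{
	switch (level) {
	case PM_TEST_DEVICES:
		return "devices";
	case PM_TEST_NONE:
	default:
		return "none";
	}
}

/* Write the chosen level to an already opened pm_test stream.
 * Returns 0 on success, -1 on a write error. */
int pm_test_select(FILE *pm_test, enum pm_test_level level)
{
	return fputs(pm_test_level_name(level), pm_test) < 0 ? -1 : 0;
}
```

Switching the level from PM_TEST_DEVICES to PM_TEST_NONE in this sketch
corresponds to the substitution suggested above.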

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19  9:57 ` Daniel Vetter
@ 2017-10-19 11:37   ` Lofstedt, Marta
  2017-10-19 13:00     ` Daniel Vetter
  0 siblings, 1 reply; 12+ messages in thread
From: Lofstedt, Marta @ 2017-10-19 11:37 UTC (permalink / raw)
  To: Daniel Vetter, Intel Graphics Development; +Cc: Daniel Vetter



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Thursday, October 19, 2017 12:57 PM
> To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>; Joonas Lahtinen
> <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wilson.co.uk>;
> Lofstedt, Marta <marta.lofstedt@intel.com>; Martin Peres
> <martin.peres@linux.intel.com>
> Subject: Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
> 
> On Thu, Oct 19, 2017 at 11:51:51AM +0200, Daniel Vetter wrote:
> > CI gets upset about it resulting in an incomplete, let's skip it until
> > that's fixed to avoid havoc in the CI farm. Of course this should/will
> > be reverted as soon as we have a fix (similar to how we dealt with the
> > snb-dies-in-blt-hangs issue).
> >
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> > Cc: Martin Peres <martin.peres@linux.intel.com>
> > References:
> > https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend
> > .html
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> For more context, since I forgot to add: I'm definitely not advertising for
> abusing igt_skip to handle problematic testcases in general. What makes this
> special here is the combo of
> - new testcase
> - old machine where we don't have priority to fix things
> 
> Hence why I think it'll make sense to treat this as a feature-like thing, where
> we simply skip if stuff doesn't work/isn't exposed on older platforms and
> shrug it off. And once someone does a free time project to fix it up, we can
> then remove the skip.
> 
> I hope that explains a bit of my reasoning behind using skip here.

I am not buying this.
Could you define which old machines we are not going to care about when it comes to finding out that we have a real issue like this?
I also don't understand why new test-cases should be treated differently compared to the old badly behaving ones we already have.

/Marta
> 
> The other bit is that if/once Maarten figured out what's wrong with
> watermarks, we should be able to enable shard-snb reporting in CI results,
> which would be really great. Except we really can't have tests that
> incomplete, because they victimize too much else and so would need to
> blacklist until fixed one way or the other anyways.
> -Daniel
> 
> > ---
> >  tests/gem_eio.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/tests/gem_eio.c b/tests/gem_eio.c index
> > 899cb62728e3..28375e208232 100644
> > --- a/tests/gem_eio.c
> > +++ b/tests/gem_eio.c
> > @@ -218,6 +218,9 @@ static void test_inflight_suspend(int fd)
> >  	igt_require(gem_has_exec_fence(fd));
> >  	igt_require(i915_reset_control(false));
> >
> > +	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
> > +		      "random incompletes in CI with this test\n");
> > +
> >  	memset(obj, 0, sizeof(obj));
> >  	obj[0].flags = EXEC_OBJECT_WRITE;
> >  	obj[1].handle = gem_create(fd, 4096);
> > --
> > 2.15.0.rc1
> >
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

* ✗ Fi.CI.BAT: warning for tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19  9:51 [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb Daniel Vetter
  2017-10-19  9:57 ` Daniel Vetter
  2017-10-19 10:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2017-10-19 12:01 ` Patchwork
  2017-10-19 13:29 ` [PATCH i-g-t] " Martin Peres
  3 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-10-19 12:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: tests/gem_eio: Skip in-flight-suspend on snb
URL   : https://patchwork.freedesktop.org/series/32280/
State : warning

== Summary ==

IGT patchset tested on top of latest successful build
abc08cba366a64a07f7f4deb167ae7d6ae059958 lib: Free all internal buffers before measuring available memory

with latest DRM-Tip kernel build CI_DRM_3265
93f001963c99 drm-tip: 2017y-10m-19d-10h-52m-17s UTC integration manifest

No testlist changes.

Test drv_module_reload:
        Subgroup basic-no-display:
                pass       -> DMESG-WARN (fi-skl-6770hq)

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:445s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:453s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:373s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:525s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:264s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:504s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:499s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:493s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:489s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:561s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:427s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:250s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:585s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:452s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:427s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:438s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:494s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:460s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:493s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:571s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:474s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:584s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:548s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:454s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:648s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:522s
fi-skl-6770hq    total:289  pass:268  dwarn:1   dfail:0   fail:0   skip:20  time:504s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:458s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:568s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:420s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_386/

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19 11:37   ` Lofstedt, Marta
@ 2017-10-19 13:00     ` Daniel Vetter
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2017-10-19 13:00 UTC (permalink / raw)
  To: Lofstedt, Marta; +Cc: Intel Graphics Development

On Thu, Oct 19, 2017 at 1:37 PM, Lofstedt, Marta
<marta.lofstedt@intel.com> wrote:
>
>
>> -----Original Message-----
>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
>> Vetter
>> Sent: Thursday, October 19, 2017 12:57 PM
>> To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>; Joonas Lahtinen
>> <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wilson.co.uk>;
>> Lofstedt, Marta <marta.lofstedt@intel.com>; Martin Peres
>> <martin.peres@linux.intel.com>
>> Subject: Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
>>
>> On Thu, Oct 19, 2017 at 11:51:51AM +0200, Daniel Vetter wrote:
>> > CI gets upset about it resulting in an incomplete, let's skip it until
>> > that's fixed to avoid havoc in the CI farm. Of course this should/will
>> > be reverted as soon as we have a fix (similar to how we dealt with the
>> > snb-dies-in-blt-hangs issue).
>> >
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
>> > Cc: Martin Peres <martin.peres@linux.intel.com>
>> > References:
>> > https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend
>> > .html
>> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
>> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> For more context, since I forgot to add: I'm definitely not advertising for
>> abusing igt_skip to handle problematic testcases in general. What makes this
>> special here is the combo of
>> - new testcase
>> - old machine where we don't have priority to fix things
>>
>> Hence why I think it'll make sense to treat this as a feature-like thing, where
>> we simply skip if stuff doesn't work/isn't exposed on older platforms and
>> shrug it off. And once someone does a free time project to fix it up, we can
>> then remove the skip.
>>
>> I hope that explains a bit of my reasoning behind using skip here.
>
> I am not buying this.
> Could you define which old machines we are not going to care about when it comes to finding out that we have a real issue like this?
> I also don't understand why new test-cases should be treated differently compared to the old badly behaving ones we already have.

What other old badly behaving ones do we have? The only other incomplete
I'm seeing on older boxes is gem_exec_suspend@basic-s3, and I'm
semi-tempted to do the same for that box too. But at least with
fast-feedback the run order is fixed, so the tests which are not run
are always the same ones.

There's one more cibuglog entry for pre-gen9 machines:

https://bugs.freedesktop.org/show_bug.cgi?id=102890

But cibuglog stats say reproduction rate is just 1%. I think that's
ok, even for an incomplete.

I don't see anything else. There's tons of issues on newer machines
where we still care, and where we still are working on stabilizing
them. But nothing else causing incompletes on gen8 or older afaict.
-Daniel

>
> /Marta
>>
>> The other bit is that if/once Maarten figured out what's wrong with
>> watermarks, we should be able to enable shard-snb reporting in CI results,
>> which would be really great. Except we really can't have tests that
>> incomplete, because they victimize too much else and so would need to
>> blacklist until fixed one way or the other anyways.
>> -Daniel
>>
>> > ---
>> >  tests/gem_eio.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> >
>> > diff --git a/tests/gem_eio.c b/tests/gem_eio.c index
>> > 899cb62728e3..28375e208232 100644
>> > --- a/tests/gem_eio.c
>> > +++ b/tests/gem_eio.c
>> > @@ -218,6 +218,9 @@ static void test_inflight_suspend(int fd)
>> >     igt_require(gem_has_exec_fence(fd));
>> >     igt_require(i915_reset_control(false));
>> >
>> > +   igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
>> > +                 "random incompletes in CI with this test\n");
>> > +
>> >     memset(obj, 0, sizeof(obj));
>> >     obj[0].flags = EXEC_OBJECT_WRITE;
>> >     obj[1].handle = gem_create(fd, 4096);
>> > --
>> > 2.15.0.rc1
>> >
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19  9:51 [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb Daniel Vetter
                   ` (2 preceding siblings ...)
  2017-10-19 12:01 ` ✗ Fi.CI.BAT: warning " Patchwork
@ 2017-10-19 13:29 ` Martin Peres
  2017-10-20  9:26   ` Joonas Lahtinen
  3 siblings, 1 reply; 12+ messages in thread
From: Martin Peres @ 2017-10-19 13:29 UTC (permalink / raw)
  To: Daniel Vetter, Intel Graphics Development

On 19/10/17 12:51, Daniel Vetter wrote:
> CI gets upset about it resulting in an incomplete, let's skip it until
> that's fixed to avoid havoc in the CI farm. Of course this should/will
> be reverted as soon as we have a fix (similar to how we dealt with the
> snb-dies-in-blt-hangs issue).
> 
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> Cc: Martin Peres <martin.peres@linux.intel.com>
> References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   tests/gem_eio.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 899cb62728e3..28375e208232 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -218,6 +218,9 @@ static void test_inflight_suspend(int fd)
>   	igt_require(gem_has_exec_fence(fd));
>   	igt_require(i915_reset_control(false));
>   
> +	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
> +		      "random incompletes in CI with this test\n");
> +

So, let's recap the problem here:
  - Any incomplete in sharded runs means that the platform is unfit for
pre-merge (because any other test after will go from pass to notrun)
  - We can't fix issues immediately, especially for old platforms

This patch is sweeping the test under the rug by using the skip output, 
which is not only hard to track, it is also misleading.

After discussing with Marta, Arek and Petri, we found some consensus on 
the following proposal (terminology is up for debate):

- Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail' when
the condition is true. Make sure this can be overridden with IGT_DODGE=0
so that we can easily run these tests without recompiling them.

- Introduce a new piglit result (dodged), so that we can more easily keep
track of the issue (no need to open the piglit results).

Any thoughts?
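
A minimal sketch of how the proposed helper could behave, under the
assumption that it is on by default and switched off only by IGT_DODGE=0.
igt_dodge_on does not exist in IGT; the function names and result codes
below are hypothetical and only model the decision, not IGT's real exit
handling or the new piglit result.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes; IGT's real exit codes are not modelled. */
enum dodge_result { DODGE_RUN, DODGE_FAIL };

/* Dodging is enabled unless the environment says IGT_DODGE=0. */
bool dodge_enabled(void)
{
	const char *env = getenv("IGT_DODGE");

	return !(env && strcmp(env, "0") == 0);
}

/* Report a pre-emptive fail when cond holds and dodging is enabled;
 * otherwise tell the caller to run the test body as usual. */
enum dodge_result dodge_on(bool cond, const char *label)
{
	if (cond && dodge_enabled()) {
		fprintf(stderr, "dodged: %s\n", label);
		return DODGE_FAIL;
	}
	return DODGE_RUN;
}
```

With this shape, running the binary with IGT_DODGE=0 in the environment
executes the dodged test without a rebuild.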

>   	memset(obj, 0, sizeof(obj));
>   	obj[0].flags = EXEC_OBJECT_WRITE;
>   	obj[1].handle = gem_create(fd, 4096);
> 

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-19 13:29 ` [PATCH i-g-t] " Martin Peres
@ 2017-10-20  9:26   ` Joonas Lahtinen
  2017-10-23  9:28     ` Martin Peres
  0 siblings, 1 reply; 12+ messages in thread
From: Joonas Lahtinen @ 2017-10-20  9:26 UTC (permalink / raw)
  To: Martin Peres, Daniel Vetter, Intel Graphics Development,
	Petri Latvala

+ Petri

On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
> On 19/10/17 12:51, Daniel Vetter wrote:
> > CI gets upset about it resulting in an incomplete, let's skip it until
> > that's fixed to avoid havoc in the CI farm. Of course this should/will
> > be reverted as soon as we have a fix (similar to how we dealt with the
> > snb-dies-in-blt-hangs issue).
> > 
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> > Cc: Martin Peres <martin.peres@linux.intel.com>
> > References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

<SNIP>

> So, let's recap the problem here:
>   - Any incomplete in sharded runs mean that the platform is unfit for 
> pre-merge (because any other test after will go from pass to notrun)
>   - We can't fix issues immediately, especially for old platforms
> 
> This patch is sweeping the test under the rug by using the skip output, 
> which is not only hard to track, it is also misleading.
> 
> After discussing with Marta, Arek and Petri, we found some consensus on 
> the following proposal (terminology is up for debate):
> 
> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail' when 
> the condition is true. Make sure this is over-ridable with IGT_DODGE=0 
> so as we can easily run these tests without recompiling them.

Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
Much like with simulation.
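
A rough sketch of that opt-in variant, assuming the skip only takes effect
when IGT_CI=1 is set and is inert everywhere else. igt_skip_on_ci is not
an existing IGT function; the names and result codes here are hypothetical
and the sketch only models the decision.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes for the sketch. */
enum ci_skip_result { CI_RUN, CI_SKIP };

/* CI behaviour is opt-in: only IGT_CI=1 activates it. */
bool running_in_ci(void)
{
	const char *env = getenv("IGT_CI");

	return env && strcmp(env, "1") == 0;
}

/* Skip only when both the condition holds and we are running in CI. */
enum ci_skip_result skip_on_ci(bool cond)
{
	return (cond && running_in_ci()) ? CI_SKIP : CI_RUN;
}
```

The default is the reverse of the IGT_DODGE proposal: a developer running
the test locally gets the full test unless they set the variable.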

Still, a BIOS update to one of the CI machines might mean (if it's not
the case now, it's not very far-fetched for the future) that we go churn
in the IGT codebase to drop a bunch of these. That's not the optimal
workflow I can think of when we're discussing a separate mailing list
for IGT discussion and patches to make it more self-contained. Then we
bind that new mailing list to our CI farm contents, and bind making
fixes to the CI farm operation directly to the IGT reviewing bandwidth?

I'm still thinking the best way would be that CI would mask the known
problematic ones from the failure/pass criteria, and then somebody
actually looks at the masked ones after their testing coverage is
prioritized. I think IGT should try to provide a wide range of tests
that are supposed to work on any certain hardware. If they don't, it's
not a reason to change the tests themselves.

With the filter, we can grow the testing coverage for the new
platforms, even if CI happens to have odd machines that may not pass
those tests (and we may not have the resources to immediately fix
those). All this without churning on the IGT codebase.
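
As an illustration of such a filter, a minimal sketch assuming CI keeps a
list of known-problematic (machine, test) pairs and ignores them when
judging a run. The struct and function names are hypothetical and do not
describe the actual CI tooling; the filter only changes how results are
judged and does not stop a test from running.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One test result from one machine (hypothetical layout). */
struct ci_result_entry {
	const char *machine;
	const char *test;
	bool passed;
};

/* One known-problematic (machine, test) pair kept on the CI side. */
struct ci_mask_entry {
	const char *machine;
	const char *test;
};

/* True if the result matches an entry in the mask list. */
bool ci_is_masked(const struct ci_mask_entry *mask, size_t n_mask,
		  const struct ci_result_entry *r)
{
	for (size_t i = 0; i < n_mask; i++) {
		if (strcmp(mask[i].machine, r->machine) == 0 &&
		    strcmp(mask[i].test, r->test) == 0)
			return true;
	}
	return false;
}

/* A run passes if every failing result is covered by the mask. */
bool ci_run_passes(const struct ci_result_entry *results, size_t n_results,
		   const struct ci_mask_entry *mask, size_t n_mask)
{
	for (size_t i = 0; i < n_results; i++) {
		if (!results[i].passed &&
		    !ci_is_masked(mask, n_mask, &results[i]))
			return false;
	}
	return true;
}
```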

But if this is the only technically viable solution in short-term, then
so be it. I just see better options too.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-20  9:26   ` Joonas Lahtinen
@ 2017-10-23  9:28     ` Martin Peres
  2017-10-26 10:57       ` Lofstedt, Marta
  2017-10-26 11:19       ` Lofstedt, Marta
  0 siblings, 2 replies; 12+ messages in thread
From: Martin Peres @ 2017-10-23  9:28 UTC (permalink / raw)
  To: Joonas Lahtinen, Daniel Vetter, Intel Graphics Development,
	Petri Latvala

On 20/10/17 12:26, Joonas Lahtinen wrote:
> + Petri
> 
> On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
>> On 19/10/17 12:51, Daniel Vetter wrote:
>>> CI gets upset about it resulting in an incomplete, let's skip it until
>>> that's fixed to avoid havoc in the CI farm. Of course this should/will
>>> be reverted as soon as we have a fix (similar to how we dealt with the
>>> snb-dies-in-blt-hangs issue).
>>>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
>>> Cc: Martin Peres <martin.peres@linux.intel.com>
>>> References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
>>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> <SNIP>
> 
>> So, let's recap the problem here:
>>   - Any incomplete in sharded runs mean that the platform is unfit for 
>> pre-merge (because any other test after will go from pass to notrun)
>>   - We can't fix issues immediately, especially for old platforms
>>
>> This patch is sweeping the test under the rug by using the skip output, 
>> which is not only hard to track, it is also misleading.
>>
>> After discussing with Marta, Arek and Petri, we found some consensus on 
>> the following proposal (terminology is up for debate):
>>
>> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail' when 
>> the condition is true. Make sure this is over-ridable with IGT_DODGE=0 
>> so as we can easily run these tests without recompiling them.
> 
> Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
> Much like with simulation.

Replace skip with fail, and we agree. Skips are too easy to ignore!

> 
> Still, a BIOS update to one of the CI machines might mean (if it's not
> now the case, not very far fetched for the future) that we go churn in
> the IGT codebase to drop bunch of these. That's not the optimal
> workflow I can think of when we're discussing a separate mailing list
> for IGT discussion and patches to make it more self-contained. Then we
> bind that new mailing list to our CI farm contents, and bind making
> fixes to the CI farm operation directly to the IGT reviewing bandwidth?

Isn't what we are proposing doing exactly this? By changing the source
code of IGT, we allow people to send patches to remove some workarounds
and see if they pass or fail in the same way they would propose any
change to IGT. Moreover, we make the running of IGT in our farm as
transparent as possible.

> 
> I'm still thinking best way would be that CI would mask the known
> problematic ones from the failure/pass criteria, and then somebody
> actually looks at the masked on after their testing coverage is
> prioritized. I think IGT should try to provide a wide range of tests
> that are supposed to work on any certain hardware. If they don't, it's
> not a reason to change the tests itself.

It is true that some skips will be highly machine-specific, but isn't it
our role as developers to know what does and doesn't work? By pushing
this to an external whitelist only for CI, we miss an opportunity to
improve this CI blacklist.

Now, let me remind you that this blacklist is *only* for tests that hang
the machine or leave it in an inconsistent state which will lead to a
crash later.

> 
> With the filter, we can grow the testing coverage for the new
> platforms, even if CI happens to have odd machines that may not pass
> those tests (and we may not have the resources to immediately fix
> those). All this without churning on the IGT codebase.

You are describing what cibuglog is already doing. Failing test cases
are suppressed in pre-merge and associated with bugs[1].

See above for what this proposal is about.

[1] https://intel-gfx-ci.01.org/cibuglog/

> 
> But if this is the only technically viable solution in short-term, then
> so be it. I just see better options too.

Maybe we need to write up our workflow. This time, a public one!

I hope

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-23  9:28     ` Martin Peres
@ 2017-10-26 10:57       ` Lofstedt, Marta
  2017-10-26 11:19       ` Lofstedt, Marta
  1 sibling, 0 replies; 12+ messages in thread
From: Lofstedt, Marta @ 2017-10-26 10:57 UTC (permalink / raw)
  To: Martin Peres, Joonas Lahtinen, Daniel Vetter,
	Intel Graphics Development, Latvala, Petri

The discussion on this has died, and I believe that everyone is scared that
the dodge suggestion would require someone to do some work.
I could Ack the patch if:

+	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
+		      "random incompletes in CI with this test\n");
+

was replaced with an igt_warn.

/Marta

> -----Original Message-----
> From: Martin Peres [mailto:martin.peres@linux.intel.com]
> Sent: Monday, October 23, 2017 12:29 PM
> To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Daniel Vetter
> <daniel.vetter@ffwll.ch>; Intel Graphics Development
> <intel-gfx@lists.freedesktop.org>; Latvala, Petri <petri.latvala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>; Lofstedt, Marta
> <marta.lofstedt@intel.com>
> Subject: Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
> 
> On 20/10/17 12:26, Joonas Lahtinen wrote:
> > + Petri
> >
> > On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
> >> On 19/10/17 12:51, Daniel Vetter wrote:
> >>> CI gets upset about it resulting in an incomplete, let's skip it
> >>> until that's fixed to avoid havoc in the CI farm. Of course this
> >>> should/will be reverted as soon as we have a fix (similar to how we
> >>> dealt with the snb-dies-in-blt-hangs issue).
> >>>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> >>> Cc: Martin Peres <martin.peres@linux.intel.com>
> >>> References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
> >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > <SNIP>
> >
> >> So, let's recap the problem here:
> >>   - Any incomplete in sharded runs mean that the platform is unfit
> >> for pre-merge (because any other test after will go from pass to notrun)
> >>   - We can't fix issues immediately, especially for old platforms
> >>
> >> This patch is sweeping the test under the rug by using the skip
> >> output, which is not only hard to track, it is also misleading.
> >>
> >> After discussing with Marta, Arek and Petri, we found some consensus
> >> on the following proposal (terminology is up for debate):
> >>
> >> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail'
> >> when the condition is true. Make sure this is over-ridable with
> >> IGT_DODGE=0 so as we can easily run these tests without recompiling
> >> them.
> >
> > Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
> > Much like with simulation.
> 
> replace skip with fail, and we agree. Skips are too easy to ignore!
> 
> >
> > Still, a BIOS update to one of the CI machines might mean (if it's not
> > now the case, not very far fetched for the future) that we go churn in
> > the IGT codebase to drop bunch of these. That's not the optimal
> > workflow I can think of when we're discussing a separate mailing list
> > for IGT discussion and patches to make it more self-contained. Then we
> > bind that new mailing list to our CI farm contents, and bind making
> > fixes to the CI farm operation directly to the IGT reviewing bandwidth?
> 
> Isn't what we are proposing doing exactly this? By changing the source code
> of IGT, we allow people to send patches to remove some workarounds and
> see if they pass or fail in the same way they would propose any change to
> IGT. Moreover, we make the running of IGT in our farm as transparent as
> possible.
> 
> >
> > I'm still thinking best way would be that CI would mask the known
> > problematic ones from the failure/pass criteria, and then somebody
> > actually looks at the masked on after their testing coverage is
> > prioritized. I think IGT should try to provide a wide range of tests
> > that are supposed to work on any certain hardware. If they don't, it's
> > not a reason to change the tests itself.
> 
> This is true that some skips will be highly-machine specific, but isn't our role
> as developers to know what and what doesn't work? By pushing this to an
> external whitelist only for CI, we miss an opportunity to improve this CI
> blacklist.
> 
> Now, let me remind you that this blacklist is *only* for tests that hang the
> machine or leave it in an inconsistent state which will lead to a crash later.
> 
> >
> > With the filter, we can grow the testing coverage for the new
> > platforms, even if CI happens to have odd machines that may not pass
> > those tests (and we may not have the resources to immediately fix
> > those). All this without churning on the IGT codebase.
> 
> You are describing what cibuglog is already doing. Failing tests cases are
> suppressed in pre-merge, and associated bugs[1].
> 
> See above for what this proposal is about.
> 
> [1] https://intel-gfx-ci.01.org/cibuglog/
> 
> >
> > But if this is the only technically viable solution in short-term,
> > then so be it. I just see better options too.
> 
> Maybe we need a write up our workflow. This time, a public one!
> 
> I hope

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
  2017-10-23  9:28     ` Martin Peres
  2017-10-26 10:57       ` Lofstedt, Marta
@ 2017-10-26 11:19       ` Lofstedt, Marta
  1 sibling, 0 replies; 12+ messages in thread
From: Lofstedt, Marta @ 2017-10-26 11:19 UTC (permalink / raw)
  To: Martin Peres, Joonas Lahtinen, Daniel Vetter,
	Intel Graphics Development, Latvala, Petri



> -----Original Message-----
> From: Lofstedt, Marta
> Sent: Thursday, October 26, 2017 1:57 PM
> To: 'Martin Peres' <martin.peres@linux.intel.com>; Joonas Lahtinen
> <joonas.lahtinen@linux.intel.com>; Daniel Vetter <daniel.vetter@ffwll.ch>;
> Intel Graphics Development <intel-gfx@lists.freedesktop.org>; Latvala, Petri
> <petri.latvala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Subject: RE: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb
> 
> Since the discussion on this died and I believe that everyone are scared that
> the dodge suggestion would require someone to do some work.
> I could Ack the patch if:
> 
> +	igt_skip_on_f(IS_SANDYBRIDGE(intel_get_drm_devid(fd)),
> +		      "random incompletes in CI with this test\n");
> +
> 
> Was replaced with an igt_warn
> 
I forgot to write: it should be igt_warn paired with success on the test.
This will produce the WARN result; note that this is not the same as
dmesg-warn. WARN results are quite rare, so it will be noticed.
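
To make the suggestion concrete, here is a minimal sketch of how a result
could be derived from a test's exit status and from whether it emitted a
warning. It assumes, as the message above implies, that a successful exit
paired with warning output is reported as WARN. classify_result(), the enum
and the SKETCH_* constants are hypothetical names, and using exit code 77
for skip is an assumption of this sketch; the real mapping lives in the test
runner.

```c
/* Hypothetical result categories for this sketch. */
enum test_result { RESULT_PASS, RESULT_WARN, RESULT_SKIP, RESULT_FAIL };

/* Assumed exit codes; check the test runner for the real values. */
#define SKETCH_EXIT_SUCCESS 0
#define SKETCH_EXIT_SKIP    77

/*
 * Map an exit code plus a "did the test print a warning" flag to a result.
 * A warning only matters when the test otherwise succeeded.
 */
enum test_result classify_result(int exit_code, int emitted_warning)
{
	if (exit_code == SKETCH_EXIT_SKIP)
		return RESULT_SKIP;
	if (exit_code != SKETCH_EXIT_SUCCESS)
		return RESULT_FAIL;
	return emitted_warning ? RESULT_WARN : RESULT_PASS;
}
```

Under this mapping, replacing the igt_skip_on_f() call with an igt_warn and
letting the test succeed would move the snb result from SKIP to WARN, which
stands out more in the results than a skip does.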

> /Marta
> 
> > -----Original Message-----
> > From: Martin Peres [mailto:martin.peres@linux.intel.com]
> > Sent: Monday, October 23, 2017 12:29 PM
> > To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Daniel Vetter
> > <daniel.vetter@ffwll.ch>; Intel Graphics Development
> > <intel-gfx@lists.freedesktop.org>; Latvala, Petri <petri.latvala@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>; Lofstedt, Marta
> > <marta.lofstedt@intel.com>
> > Subject: Re: [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on
> > snb
> >
> > On 20/10/17 12:26, Joonas Lahtinen wrote:
> > > + Petri
> > >
> > > On Thu, 2017-10-19 at 16:29 +0300, Martin Peres wrote:
> > >> On 19/10/17 12:51, Daniel Vetter wrote:
> > >>> CI gets upset about it resulting in an incomplete, let's skip it
> > >>> until that's fixed to avoid havoc in the CI farm. Of course this
> > >>> should/will be reverted as soon as we have a fix (similar to how
> > >>> we dealt with the snb-dies-in-blt-hangs issue).
> > >>>
> > >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > >>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > >>> Cc: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> > >>> Cc: Martin Peres <martin.peres@linux.intel.com>
> > >>> References: https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_eio@in-flight-suspend.html
> > >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=103289
> > >>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > >
> > > <SNIP>
> > >
> > >> So, let's recap the problem here:
> > >>   - Any incomplete in sharded runs mean that the platform is unfit
> > >> for pre-merge (because any other test after will go from pass to notrun)
> > >>   - We can't fix issues immediately, especially for old platforms
> > >>
> > >> This patch is sweeping the test under the rug by using the skip
> > >> output, which is not only hard to track, it is also misleading.
> > >>
> > >> After discussing with Marta, Arek and Petri, we found some
> > >> consensus on the following proposal (terminology is up for debate):
> > >>
> > >> - Introduce igt_dodge_on(cond, label): Report a pre-emptive 'fail'
> > >> when the condition is true. Make sure this is over-ridable with
> > >> IGT_DODGE=0 so as we can easily run these tests without recompiling
> > >> them.
> > >
> > > Make this igt_skip_on_ci(cond) and require IGT_CI=1 to activate them.
> > > Much like with simulation.
> >
> > replace skip with fail, and we agree. Skips are too easy to ignore!
> >
> > >
> > > Still, a BIOS update to one of the CI machines might mean (if it's
> > > not now the case, not very far fetched for the future) that we go
> > > churn in the IGT codebase to drop bunch of these. That's not the
> > > optimal workflow I can think of when we're discussing a separate
> > > mailing list for IGT discussion and patches to make it more
> > > self-contained. Then we bind that new mailing list to our CI farm
> > > contents, and bind making fixes to the CI farm operation directly to
> > > the IGT reviewing bandwidth?
> >
> > Isn't what we are proposing doing exactly this? By changing the source
> > code of IGT, we allow people to send patches to remove some
> > workarounds and see if they pass or fail in the same way they would
> > propose any change to IGT. Moreover, we make the running of IGT in our
> > farm as transparent as possible.
> >
> > >
> > > I'm still thinking best way would be that CI would mask the known
> > > problematic ones from the failure/pass criteria, and then somebody
> > > actually looks at the masked on after their testing coverage is
> > > prioritized. I think IGT should try to provide a wide range of tests
> > > that are supposed to work on any certain hardware. If they don't,
> > > it's not a reason to change the tests itself.
> >
> > This is true that some skips will be highly-machine specific, but
> > isn't our role as developers to know what and what doesn't work? By
> > pushing this to an external whitelist only for CI, we miss an
> > opportunity to improve this CI blacklist.
> >
> > Now, let me remind you that this blacklist is *only* for tests that
> > hang the machine or leave it in an inconsistent state which will lead
> > to a crash later.
> >
> > >
> > > With the filter, we can grow the testing coverage for the new
> > > platforms, even if CI happens to have odd machines that may not pass
> > > those tests (and we may not have the resources to immediately fix
> > > those). All this without churning on the IGT codebase.
> >
> > You are describing what cibuglog is already doing. Failing tests cases
> > are suppressed in pre-merge, and associated bugs[1].
> >
> > See above for what this proposal is about.
> >
> > [1] https://intel-gfx-ci.01.org/cibuglog/
> >
> > >
> > > But if this is the only technically viable solution in short-term,
> > > then so be it. I just see better options too.
> >
> > Maybe we need a write up our workflow. This time, a public one!
> >
> > I hope

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-10-26 11:19 UTC | newest]

Thread overview: 12+ messages
2017-10-19  9:51 [PATCH i-g-t] tests/gem_eio: Skip in-flight-suspend on snb Daniel Vetter
2017-10-19  9:57 ` Daniel Vetter
2017-10-19 11:37   ` Lofstedt, Marta
2017-10-19 13:00     ` Daniel Vetter
2017-10-19 10:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
2017-10-19 10:28   ` Chris Wilson
2017-10-19 12:01 ` ✗ Fi.CI.BAT: warning " Patchwork
2017-10-19 13:29 ` [PATCH i-g-t] " Martin Peres
2017-10-20  9:26   ` Joonas Lahtinen
2017-10-23  9:28     ` Martin Peres
2017-10-26 10:57       ` Lofstedt, Marta
2017-10-26 11:19       ` Lofstedt, Marta
