* [PATCH] e1000e: Taint a HW lockup
From: Chris Wilson @ 2017-12-05 18:00 UTC (permalink / raw)
To: intel-gfx; +Cc: Tomi Sarvela, Daniel Vetter
When we see an e1000e HW lockup in CI, it is typically fatal with the
hang repeating until the host is forcibly rebooted. Speed up that
process by tainting the kernel, which CI can trivially detect (and is
being used to detect similarly fatal CI conditions) and reboot soon
after.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
---
drivers/net/ethernet/intel/e1000e/netdev.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9f18d39bdc8f..bcc4b226a184 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1170,6 +1170,8 @@ static void e1000_print_hw_hang(struct work_struct *work)
 	/* Suggest workaround for known h/w issue */
 	if ((hw->mac.type == e1000_pchlan) && (er32(CTRL) & E1000_CTRL_TFCE))
 		e_err("Try turning off Tx pause (flow control) via ethtool\n");
+
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 }

 /**
--
2.15.1
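A note on the CI side: the taint added above surfaces to userspace as a
bit in /proc/sys/kernel/tainted, where TAINT_WARN corresponds to bit 9
(decimal 512, the 'W' flag). A minimal sketch of how a CI runner might
test for it follows. The helper names are hypothetical, and this is not
the actual intel-gfx CI code, which the thread does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit position of TAINT_WARN ('W') in the kernel taint mask. */
#define TAINT_WARN_BIT 9

/* Hypothetical helper: true if the warning taint bit is set in mask. */
static bool taint_mask_has_warn(unsigned long mask)
{
	return (mask >> TAINT_WARN_BIT) & 1UL;
}

/*
 * Hypothetical helper: read the decimal taint mask from a procfs-style
 * file. Returns 0 and stores the value in *mask on success, -1 if the
 * file cannot be opened or parsed.
 */
static int read_taint_mask(const char *path, unsigned long *mask)
{
	FILE *f;
	int parsed;

	assert(mask != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	parsed = fscanf(f, "%lu", mask);
	fclose(f);

	return parsed == 1 ? 0 : -1;
}
```

A runner could call read_taint_mask("/proc/sys/kernel/tainted", &mask)
between tests and schedule a reboot once taint_mask_has_warn(mask)
returns true. Any WARN() in the kernel sets the same bit, so this check
alone cannot tell an e1000e lockup apart from other warnings.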
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] e1000e: Taint a HW lockup
From: Chris Wilson @ 2017-12-05 18:05 UTC (permalink / raw)
To: intel-gfx; +Cc: Tomi Sarvela, Daniel Vetter
Quoting Chris Wilson (2017-12-05 18:00:00)
> When we see an e1000e HW lockup in CI, it is typically fatal with the
> hang repeating until the host is forcibly rebooted. Speed up that
> process by tainting the kernel, which CI can trivially detect (and is
> being used to detect similarly fatal CI conditions) and reboot soon
> after.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
I'm not concerned with selling this to e1000e, but if it helps improve
CI robustness, then it can go into topic/core-for-CI. Or maybe we should create a new
topic, Daniel? topic/taints-for-CI?
-Chris
* ✓ Fi.CI.BAT: success for e1000e: Taint a HW lockup
From: Patchwork @ 2017-12-05 18:52 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: e1000e: Taint a HW lockup
URL : https://patchwork.freedesktop.org/series/34931/
State : success
== Summary ==
Series 34931v1 e1000e: Taint a HW lockup
https://patchwork.freedesktop.org/api/1.0/series/34931/revisions/1/mbox/
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:442s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:385s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:515s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:281s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:504s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:504s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:484s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:472s
fi-elk-e7500 total:224 pass:163 dwarn:15 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:267s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:539s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:358s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:261s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:485s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:445s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:530s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:475s
fi-kbl-r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:537s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:586s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:456s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:541s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:572s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:515s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:498s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:548s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:411s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:600s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:616s
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:489s
0d0fe916f52ad8f05dddab384ae7c90bb62ebac4 drm-tip: 2017y-12m-05d-14h-52m-17s UTC integration manifest
f0ee3df4e66c e1000e: Taint a HW lockup
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7417/
* ✓ Fi.CI.IGT: success for e1000e: Taint a HW lockup
From: Patchwork @ 2017-12-05 21:13 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: e1000e: Taint a HW lockup
URL : https://patchwork.freedesktop.org/series/34931/
State : success
== Summary ==
Test kms_flip:
Subgroup vblank-vs-modeset-suspend:
pass -> SKIP (shard-snb) fdo#102365
Subgroup modeset-vs-vblank-race-interruptible:
pass -> FAIL (shard-hsw) fdo#103060
Subgroup vblank-vs-modeset-suspend-interruptible:
skip -> PASS (shard-snb)
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-offscren-pri-shrfb-draw-blt:
fail -> PASS (shard-snb) fdo#101623
Subgroup fbc-rgb101010-draw-render:
skip -> PASS (shard-snb) fdo#103167 +1
Test drv_module_reload:
Subgroup basic-no-display:
dmesg-warn -> PASS (shard-hsw) fdo#102707
Test kms_chv_cursor_fail:
Subgroup pipe-b-128x128-top-edge:
incomplete -> PASS (shard-hsw)
Test prime_mmap_kms:
Subgroup buffer-sharing:
skip -> PASS (shard-snb)
fdo#102365 https://bugs.freedesktop.org/show_bug.cgi?id=102365
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707
shard-hsw total:2679 pass:1535 dwarn:1 dfail:0 fail:11 skip:1132 time:9438s
shard-snb total:2679 pass:1306 dwarn:2 dfail:0 fail:11 skip:1360 time:8041s
Blacklisted hosts:
shard-apl total:2636 pass:1636 dwarn:0 dfail:0 fail:23 skip:977 time:13356s
shard-kbl total:2545 pass:1694 dwarn:5 dfail:1 fail:22 skip:822 time:10261s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7417/shards.html
* Re: [PATCH] e1000e: Taint a HW lockup
From: Daniel Vetter @ 2017-12-06 9:47 UTC (permalink / raw)
To: Chris Wilson, Saarinen, Jani, Jeff Kirsher, intel-wired-lan
Cc: Tomi Sarvela, intel-gfx
On Tue, Dec 5, 2017 at 7:05 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Chris Wilson (2017-12-05 18:00:00)
>> When we see an e1000e HW lockup in CI, it is typically fatal with the
>> hang repeating until the host is forcibly rebooted. Speed up that
>> process by tainting the kernel, which CI can trivially detect (and is
>> being used to detect similarly fatal CI conditions) and reboot soon
>> after.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
>
> I'm not concerned on selling this to e1000e, but if it helps improving
> CI robustness, then topic/core-for-CI. Or maybe we should create a new
> topic, Daniel? topic/taints-for-CI?
Sounds like a usable idea for CI. Would be especially interesting
because despite applying the suggested w/a, we still hit lockups.
Before we do that though I think we should get an ack from the e1000e
team. Jani S., maybe something you can drive?
Adding more folks to cc.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH] e1000e: Taint a HW lockup
From: Jeff Kirsher @ 2017-12-06 19:27 UTC (permalink / raw)
To: Daniel Vetter, Chris Wilson, Saarinen, Jani, intel-wired-lan
Cc: Tomi Sarvela, intel-gfx
On Wed, 2017-12-06 at 10:47 +0100, Daniel Vetter wrote:
> On Tue, Dec 5, 2017 at 7:05 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Quoting Chris Wilson (2017-12-05 18:00:00)
> > > When we see an e1000e HW lockup in CI, it is typically fatal with the
> > > hang repeating until the host is forcibly rebooted. Speed up that
> > > process by tainting the kernel, which CI can trivially detect (and is
> > > being used to detect similarly fatal CI conditions) and reboot soon
> > > after.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
> >
> > I'm not concerned on selling this to e1000e, but if it helps improving
> > CI robustness, then topic/core-for-CI. Or maybe we should create a new
> > topic, Daniel? topic/taints-for-CI?
>
> Sounds like a usable idea for CI. Would be especially interesting
> because despite applying the suggested w/a, we still hit lockups.
> Before we do that though I think we should get an ack from the e1000e
> team. Jani S., maybe something you can drive?
>
> Adding more folks to cc.
> -Daniel
Please send any e1000e patches to the intel-wired-lan mailing list and
make sure to CC Sasha Neftin <sasha.neftin@intel.com>, since he is the
e1000e driver maintainer.